Storage & Entertainment Stories

Over the last two years, demand for data protection has risen while budgets have shrunk, forcing IT departments to do more with less. As a result, many IT shops run a backup plan du jour and give little thought to anything beyond cost and speeds and feeds, let alone to media life or to an evaluation of what the department's business needs actually require. The message is simple: back up the data and make sure archives are maintained.

Many IT managers regard backup and archive as one and the same, yet the two address different needs. A backup is a full reserve data set that can replace the current live transactional data set in the event of a data loss emergency. An archive, on the other hand, is a data set that is stored (in most cases) separately from the day-to-day transactional data set on less costly removable media, with on-line access available in seconds to minutes, depending on the archive strategy.

There has been much ado regarding backup windows and the amount of time it takes to do a system backup. The real questions should be: Can I do a successful restore 100 percent of the time, and how long does it take to get my system up and running if it goes down? A fast backup doesn’t amount to much if the backup set cannot be restored successfully, no matter how much time is allowed. In addition, if only a few files are corrupt, why take the time to do a full restore when a file-level backup would allow specific files to be accessed and restored, saving a significant amount of time and money?

Studies have shown that the majority of real-world restores are for individual files and not the complete data set. That said, is tape the most effective medium for data backup? Most companies have suffered through the use of tape for backup and archive because, until now, there hasn’t been a cost-effective alternative. Two changes have taken place in storage that now provide cost-effective and reliable alternatives for backup and archive.

First, high-capacity ATA disks have come down in cost to a point where it is now feasible to keep the most recent copy of data on disk for quick recovery. Disk provides for a very fast backup and restore. File-level backups on randomly accessible media make single-file restore very attractive. The ability to browse the backup volume set as a single drive allows for fast access to individual files. In addition, write-verify confirms that the backup job was in fact completed and that all the files were written without error, which gives strong assurance that a restore will succeed if one is ever needed. This same volume set can be moved off-line and stored for 100 years or more, depending on data storage life cycle requirements.
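The write-verify idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular backup product works: the function names `sha256_of` and `backup_with_verify` are hypothetical, and the check is done in software by re-reading each copied file and comparing checksums. A re-read may be served from the operating system's cache, so this is a weaker check than a drive-level write-verify.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_with_verify(source_root: Path, backup_root: Path) -> list:
    """Copy every file under source_root into backup_root (hypothetical
    helper), then re-read each copy and compare it with the source.

    Returns the relative paths that failed verification; an empty list
    means every copied file matched its source.
    """
    failed = []
    for src in sorted(source_root.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_root)
        dst = backup_root / rel
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copies data and file timestamps
        if sha256_of(src) != sha256_of(dst):
            failed.append(rel)
    return failed
```

A backup job would treat any non-empty result as a failed run and retry or raise an alert, rather than discovering the problem at restore time.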

Second, DVD-RAM media has come down in price to where it is almost at parity with high-end tape. In addition, DVD-RAM libraries are about one-half of the price of high-end tape autoloaders. When one adds the longevity of optical media and its random accessibility, some are finding this a very attractive option for backup and archive.

Sure, one should have a tape set with a complete operational snapshot, in the event of a total system disaster; however, what performance does tape offer if a storage administrator is looking for just a single DLL or other critical operational file lost due to an error? In this case, disk-based file-level backup makes a lot more sense.
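To make the single-file case concrete, here is a minimal sketch of a file-level restore, assuming the backup volume is mounted as an ordinary directory tree that mirrors the live system. The names `find_in_backup` and `restore_file` are hypothetical and do not belong to any backup product.

```python
import shutil
from pathlib import Path


def find_in_backup(backup_root: Path, filename: str) -> list:
    """List every copy of `filename` anywhere under the backup volume.

    Note: `filename` is passed to rglob, so glob characters such as
    '*' are treated as wildcards.
    """
    return sorted(p for p in backup_root.rglob(filename) if p.is_file())


def restore_file(backup_root: Path, relative_path: str, live_root: Path) -> Path:
    """Copy one file from the backup volume back to the live system,
    leaving every other file untouched."""
    src = backup_root / relative_path
    if not src.is_file():
        raise FileNotFoundError(f"{relative_path} is not in the backup set")
    dst = live_root / relative_path
    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(src, dst)
    return dst
```

Because the medium is randomly accessible, the cost of this operation depends on the size of the one file, not on the size of the whole backup set.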

A detailed approach to an organization’s secondary storage needs should include the following:

o Live or transactional data should be stored on immediately accessible storage, such as a hard drive or RAID.

o Live data should be backed up daily and backup sets should be rotated to extend the life of the media.

o Critical data that is accessed infrequently should be archived and kept near-line. Archiving infrequently accessed data frees up a good deal of disk space for live data and reduces the amount of storage required for live data backup. All the data (both archived and live) is still system accessible with the click of a mouse.

o Critical data that must be kept, but is unlikely to be accessed, should be archived and tracked off-line.

o Automated policies should be put in place to manage data through the life cycle, as prescribed by business and application needs, to ensure good business practices and to meet specific regulatory requirements that may apply.
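An automated policy of the kind listed above could, in its simplest form, assign each data set to a storage tier based on how long it has gone unaccessed. The sketch below is an illustration only: the `LifecyclePolicy` class, the tier names and the 90-day and 730-day thresholds are all assumptions, and a real policy would take its thresholds from the business and regulatory requirements that apply.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class LifecyclePolicy:
    """Hypothetical idle-time thresholds, in days, for each tier."""
    nearline_after_days: int = 90
    offline_after_days: int = 730


def storage_tier(last_accessed: date, today: date,
                 policy: LifecyclePolicy = LifecyclePolicy()) -> str:
    """Map a data set's idle time to one of three storage tiers."""
    idle_days = (today - last_accessed).days
    if idle_days >= policy.offline_after_days:
        return "offline-archive"   # kept and tracked, unlikely to be read
    if idle_days >= policy.nearline_after_days:
        return "nearline-archive"  # infrequently read, still reachable
    return "live"                  # stays on disk or RAID, backed up daily
```

Run on a schedule against file access dates, a function like this would produce the list of data sets to migrate from one tier to the next.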

One backup model that satisfies a very short backup window, while providing both very fast file-level restore and secure long-term data retention, uses an optical or DVD-RAM library with a dedicated backup server as a caching device. The backup is done in two stages. The first stage is disk-to-disk, using the backup server and low-cost ATA hard drives, which allows for a very fast backup job from the production environment with little to no interruption on the network. In the second stage, the backup server writes the backup data set to optical disk in the background overnight. This provides for 100 MB/s transfer rates from the primary disk to the backup server, and the optical disk provides a disk set that can be taken off-line for use in a disaster recovery situation. The optical disks can also be kept in a vault for a very long time. When a file-level restore is needed, the optical disks can be browsed at the file level and the data can be restored or copied, as necessary.
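The effect of the two-stage model on the backup window can be estimated with simple arithmetic. The sketch below assumes decimal units (1 GB = 1,000 MB) and a sustained transfer rate. The 100 MB/s figure comes from the text above, while the function names, the 500 GB data set and the optical write rate are hypothetical values chosen only for illustration.

```python
def transfer_seconds(data_gb: float, rate_mb_per_s: float) -> float:
    """Time to move data_gb gigabytes at a sustained rate in MB/s."""
    if rate_mb_per_s <= 0:
        raise ValueError("rate must be positive")
    return data_gb * 1000 / rate_mb_per_s


def fits_overnight(data_gb: float, optical_rate_mb_per_s: float,
                   window_hours: float) -> bool:
    """Check whether the background copy to optical disk (stage two)
    finishes within the overnight window."""
    return transfer_seconds(data_gb, optical_rate_mb_per_s) <= window_hours * 3600


# Stage one: 500 GB from production disk to the backup server at 100 MB/s.
stage_one = transfer_seconds(500, 100)  # 5000.0 seconds, about 1.4 hours
```

Only stage one touches the production environment, so it alone determines the backup window. Stage two can run at the slower speed of the optical library as long as it completes before the next backup begins.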