Advanced Backup Strategies

Backups are the single most important system maintenance task. They are so important that developing and implementing an effective backup strategy should be the first item on your checklist once you have a fully functioning environment. In this article I am not going to harp on how important scheduled backups are to your business data (you should already know that); instead I will show you how to develop an effective strategy. The focus will be on both businesses that outsource their hosting needs (which I will refer to as the "desktop" model) and those that run their own servers (which I will refer to as the "server" model).

No particular operating system or special software is required to implement anything discussed here; we are going to use reliable command line scripts to do everything. I will provide some basic script examples and external links, but for the most part the focus will be the strategy behind developing effective scripts. There are, of course, myriad commercial products that perform effective backups, but this article centers on the tools the operating system itself makes available.

Contents

  1. Developing a Backup Strategy
    1. General Considerations
    2. Desktop Data Specifics
    3. Server Data Specifics
    4. Storing Backups
    5. Backup Frequency
    6. Testing Backups
    7. Putting It All Together
  2. Introduction to Command Line Scripts
    1. Why Command Line Scripts?
    2. Common Service References
    3. Creating a Basic Script
      1. Desktop Backup
      2. Server Backup
    4. Advanced Techniques
    5. Alternatives
  3. Scheduling Tasks

Developing a Backup Strategy

Of course the first step to creating an effective backup process is determining what you need to backup. Most of what I cover here you might have already heard, but considering the importance of backups, this information is worth reviewing.

General Considerations

As a rule of thumb, if data is irreplaceable or you cannot easily obtain it from another source, it needs to be backed up. It is not necessary to back up your entire hard drive, as program installation files and the like can always be reinstalled from CD/DVD or re-downloaded. That said, if you are using an older version of a particular program and do not wish to upgrade, it is best to keep the installation package on some sort of CD/DVD media so you will always have it.

If you are loading a system (either reformatting or upgrading to a new computer) from just the barebones operating system, it is a good idea to take quick notes of everything you do. A simple list such as:

  1. Load Program 1
  2. Load Program 2
  3. Copy/restore data for Program 2
  4. Set preferences for Program 2
  5. Load Program 3
  6. etc.

Such a list is an excellent basis for developing a backup procedure: any step where you copy or restore data is a clear indicator of data that needs to be backed up. On the other hand, if you already have your system up and running and did not make a list like the one above (probably the vast majority of you), try to remember what you did to get your system to its current state and create the list now. It may sound like a silly waste of time, but remember you only have to do it once, and it can go a long way toward saving time in the future.

Drive imaging is also a form of backing up data. While some may argue otherwise, drive imaging is not intended for data recovery. Images are particularly useful in the event of complete hardware failure or for cloning machines, but not as a replacement for backups. Images are bulky, slow to create, require system downtime, and are all-or-nothing, meaning you cannot easily recover selective data from an image without restoring the entire image (it can be done, but only with the right software or another machine).

Desktop Data Specifics

For the most part, determining what to back up on a desktop is not difficult. Almost all of your data will be stored in your profile, which is located in the C:\Documents and Settings directory in Windows (although some legacy applications will store information in Program Files, Windows or some other oddball directory) and in your /home directory in Linux. Here is some common data to consider:

  • My Documents
  • Email
  • Financial (Quickbooks, Quicken, MS Money, GNU Cash etc.) database
  • Contact / CRM database
  • Electronic receipts
  • Bookmarks
  • Password database

Typically a simple file copy is sufficient to back up this information. The catch to a file-copy backup is that files in use typically cannot be copied correctly, so they need to be closed or skipped over. Some commercial applications can effectively back up files in use, but it is best to make sure your work is saved and closed nightly to avoid any complications.
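A minimal sketch of such a file-copy backup on Linux might look like the following. The destination here is a demo path under /tmp, and in practice it should be a mount point on a separate drive; the profile subdirectories are examples too, not a definitive list:

```shell
#!/bin/sh
# Demo destination; in practice point this at a separate drive,
# e.g. a mount point like /mnt/backup.
DEST="/tmp/backup-demo/$(date +%Y-%m-%d)"
mkdir -p "$DEST"

# Copy whichever of the common profile locations actually exist.
# (These paths are examples; add your own email, bookmark and
# financial data directories.)
for dir in "$HOME/Documents" "$HOME/.mozilla" "$HOME/.gnucash"; do
    if [ -d "$dir" ]; then
        cp -a "$dir" "$DEST/"
    fi
done

echo "Backup written to $DEST"
```

Because the loop skips directories that do not exist, the same script can be reused across machines with slightly different profiles.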

Server Data Specifics

From a backup standpoint, servers are actually easier to deal with than desktops. The vast majority of the time nobody is logged into the system, so only the running services are accessing data. This is ideal because, more often than not, these services offer the ability to create realtime backups without disrupting the running service, typically through a command line interface. The most notable exception is a file server where users access data through desktop applications (such as Microsoft Word), so the locked-file problem can occur if those files are open.

It is a good thing server data is easier to back up, because it is absolutely business critical. Imagine your host telling you their server went down and their most recent backup was a month old, or worse, that they had no backup. By the same token, if you provide hosting services, imagine having to tell one of your customers the same thing. Needless to say, that hosting company would not be in business much longer. With that in mind, here is a quick list of essential data to back up on a typical web server:

  • Customer data
    • Files / Directories
    • Databases (full backups, not just transactional)
    • Statistics (web traffic reports, etc.)
  • Services
    • Webserver entries (IP information, headers, etc.)
    • Email server entries (post offices, mail boxes, user passwords, etc.)
    • DNS entries
  • Login information
  • Custom scripts
  • Scheduled tasks

Of course there will be considerations unique to every server depending on the services installed, but overall a backup checklist should include most, if not all, of the above.
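A nightly server backup along the lines of the checklist above can be sketched as follows. Every path, service name and database option here is an assumption for illustration; substitute the ones your own server actually uses:

```shell
#!/bin/sh
# Nightly server backup sketch. All paths and service names are
# examples; adjust them to match your own configuration.
STAMP=$(date +%Y-%m-%d)
DEST="/tmp/server-backup/$STAMP"   # in practice, a separate disk
mkdir -p "$DEST"

# Full database backup (not just transaction logs), if MySQL is present.
if command -v mysqldump >/dev/null 2>&1; then
    mysqldump --all-databases --single-transaction \
        > "$DEST/mysql-full.sql" 2>/dev/null
fi

# Customer files and service configuration, whichever directories exist.
for dir in /var/www /etc/apache2 /etc/bind; do
    if [ -d "$dir" ]; then
        tar -czf "$DEST/$(basename "$dir").tar.gz" "$dir" 2>/dev/null
    fi
done

# Scheduled tasks for this user (the file is empty if no crontab is set).
crontab -l > "$DEST/crontab.txt" 2>/dev/null || true
```

Each piece degrades gracefully when a service is absent, which keeps one script usable across servers with different service mixes.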

Storing Backups

After you have put together a list of everything you need to back up, the next step is deciding where to store your backups. Considering the entire point of backing up is to be able to recover in the event of a catastrophe, the number one no-no is storing backups on the same physical disk as the live data. The only case where you could get away with this is a RAID 1 or RAID 5 configuration, and even then I would highly recommend placing backups on a completely separate logical and physical disk. My recommendation is always a permanently connected electronic medium, such as a separate hard drive; for desktop users, a USB flash drive may be sufficient.

Another consideration when storing backups is the actual disk space required. Although it is certainly not required, compressing your backups makes perfect sense: it drastically reduces the disk space needed, but at the cost of the system resources required to compress what is most likely a sizeable amount of data. This is typically not an issue on desktop machines, since you can perform the backup and compression at a time when you know the computer will not be in use; on servers that need to be constantly available, however, it can be.

The very nature of servers is to provide virtually 100% uptime. Simply put, this means keeping as much of the system's resources as possible available to the server's designated task. Compressing backups is most likely not part of that task, so the pros and cons need to be weighed.

The Pros and Cons of Compressing Backups on Servers

Pros:
  • Lower required storage space
  • Backups can be kept longer
  • More portability

Cons:
  • Intensive CPU / memory use
  • (Possibly) slower responses to critical requests
  • Decompression required to restore data
Building on the above, the server's specifications should also be taken into account.

  • Does the server have multiple CPUs / cores?
    If not, the compression process will need to run at a lower priority, which leads to a longer execution time. Running with a lower CPU priority ensures the backup process only uses the "left over" CPU cycles not consumed by your primary services.
  • How much memory does the system have?
    Keep in mind that while you can control CPU priority, memory usage will not respect it. In other words, the compression process will surrender its CPU time when asked, but it will not give up its memory; only when compression is complete will that memory become available again.
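The lower-priority compression described above can be done with the standard `nice` command; the sketch below creates its own sample data under /tmp purely for demonstration:

```shell
#!/bin/sh
# Create some sample data to compress (demo only).
SRC="/tmp/compress-demo/data"
mkdir -p "$SRC"
echo "sample data" > "$SRC/file.txt"

# nice -n 19 runs tar/gzip at the lowest CPU priority, so compression
# only consumes cycles left over by your primary services. Note that
# this does nothing about memory: the process keeps whatever it
# allocates until it finishes.
nice -n 19 tar -czf /tmp/compress-demo/data.tar.gz \
    -C /tmp/compress-demo data
```

On Linux, `ionice -c 3` can be prepended as well to deprioritize the disk I/O, which is often the bigger contention point on a busy server.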

Hopefully you will seldom need to recover data, but when you do, how quickly you need the data becomes a factor. Keep in mind that if your backups are compressed, you have to decompress them before use. Again, take the machine's specs into account: you cannot always control when you will need to recover data, and it may be during peak usage hours when system resources are at a premium.

Backup Frequency

Much like storage, the frequency of backups depends on your system's resources. The trade-offs here are obvious: frequent backups provide more data "snapshots", so precise recoveries are achievable (for example, recovering a particular email), but this comes at the cost of disk usage and system resources.

The rule of thumb for determining an appropriate frequency is to ask yourself how much data you can afford to lose. If the answer is half a day's worth, nightly and afternoon backups are the way to go. Be sure to consider server load at your planned backup time, as system resources should be, for the most part, allocated to the server's primary functions. Regardless, nightly backups at the very least should be performed on any system. Remember, the goal of having backups is to be able to recover in the event of a disaster.
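With cron on Linux, that half-day schedule might look like the following crontab fragment; the script path is hypothetical:

```shell
# Example crontab entries (edit with: crontab -e).
# Nightly backup at 1:00 AM and an afternoon backup at 1:00 PM,
# both chosen to avoid peak usage hours. /usr/local/bin/backup.sh
# is a hypothetical script name; substitute your own.
0 1  * * * /usr/local/bin/backup.sh
0 13 * * * /usr/local/bin/backup.sh
```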

Testing Backups

Even if you have a foolproof backup process in place, your backed-up data should be periodically tested to ensure its integrity in the event a restore is needed. After all, backup data is no good if you cannot recover from it. Typically this is not necessary for file-copy backups, but service backups (such as databases [MySQL, MS SQL, etc.] and email stores [MS Exchange]) always run the risk of corruption. Albeit extremely rare, it is still a possibility.

A good practice is to restore your backup data in a test environment. This accomplishes several things:

  • Ensures your restored data is usable
  • Helps catch and correct errors in your backup process
  • "Practice" for when an actual restore is needed

A frequency of once a month should be adequate to provide good peace of mind.
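For compressed archives, part of this check can even be scripted. The sketch below (demo paths, with a stand-in archive created on the spot) verifies an archive's integrity and lists its contents without extracting anything:

```shell
#!/bin/sh
# Create a small demo archive to stand in for a real backup.
mkdir -p /tmp/verify-demo
echo "payload" > /tmp/verify-demo/data.txt
tar -czf /tmp/verify-demo/backup.tar.gz -C /tmp/verify-demo data.txt

# gzip -t tests the archive's integrity without extracting it;
# tar -tzf lists the contents, again without extracting.
gzip -t /tmp/verify-demo/backup.tar.gz && echo "archive OK"
tar -tzf /tmp/verify-demo/backup.tar.gz
```

A listing check like this is no substitute for a full restore into a test environment, but it catches outright archive corruption cheaply and can run unattended after every backup.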

Putting It All Together

So, bottom line, what is a good strategy? Backup requirements will vary by situation, but in general a weekly full backup and daily incremental backups (an incremental backup covers only the files modified since the last full or incremental backup) should cover the needs of 99% of desktop users and most servers. Of course, if you host customer data on a server, you might consider performing additional incrementals throughout the day.
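The weekly-full-plus-daily-incremental scheme can be sketched with GNU tar's snapshot-file mechanism (`--listed-incremental`, `-g` for short); this relies on GNU tar specifically, and the paths are demo values:

```shell
#!/bin/sh
# Demo data directory standing in for real content.
BASE=/tmp/incremental-demo
mkdir -p "$BASE/data"
echo "one" > "$BASE/data/a.txt"

# Weekly full backup: delete the snapshot file first so that
# tar records everything from scratch.
rm -f "$BASE/snapshot"
tar -czf "$BASE/full.tar.gz" -g "$BASE/snapshot" -C "$BASE" data

# Daily incremental: with the snapshot file in place, tar archives
# only files added or modified since the previous run.
echo "two" > "$BASE/data/b.txt"
tar -czf "$BASE/incremental.tar.gz" -g "$BASE/snapshot" -C "$BASE" data
```

To restore, the full archive is extracted first and each incremental is applied on top of it in order, which is why keeping the full and its incrementals together matters.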

