RAID: Your Guide

Redundancy is the key to a RAID array, but whichever setup you employ, you will definitely use one or
more of the following:

Striping
This is a RAID configuration that can offer huge performance gains. Data in a striped array is interleaved across
all the drives in the array. In a two-drive array, data is read and written on both drives at the same time. A good
analogy would be this: Imagine having to write an essay on a sheet of paper. You can take a pen and write it. Now,
imagine for a second that you were a mythological god or something and could write with both hands, nice and neat,
at the SAME time. Imagine how fast you could write that paper now! The same idea applies to a RAID array using
striping. By splitting the data up and using both drives to read and write, it can come close to doubling the speed.

The performance of a striped array is governed by the stripe width and stripe size. The width is equal to the
number of drives in your array. To illustrate, assume you need to write a 1 MB Word file to your RAID array. If
you have two drives, then the stripe width is two. For the sake of clarity, assume you will be writing this data
in 50K chunks. That is roughly 20 write cycles to write the entire Word file, 10 write cycles per drive. So, the
first drive writes the first 50K, then the third, then the fifth, etc. At the same time, the other drive writes the
second, then the fourth, etc. You can see that this setup would write the entire 1 MB file in about half the time
of one drive. You can increase performance even more by adding another hard drive to the RAID array, thereby
increasing the stripe width to 3.
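
The two-drive example above can be sketched in a few lines of Python. This is only an illustration of the
round-robin idea, not how any particular controller works. The function name stripe_chunks is made up for this
sketch, and the 1 MB file is treated as 1000K so that it splits into exactly 20 chunks of 50K.

```python
def stripe_chunks(total_kb, chunk_kb, num_drives):
    """Assign each chunk of a file to a drive in round-robin order."""
    num_chunks = -(-total_kb // chunk_kb)  # ceiling division
    drives = [[] for _ in range(num_drives)]
    for chunk in range(num_chunks):
        # Chunk numbers are 1-based to match the "first, third, fifth" wording.
        drives[chunk % num_drives].append(chunk + 1)
    return drives


# Assumption: 1 MB is rounded to 1000K, giving 20 chunks of 50K.
layout = stripe_chunks(total_kb=1000, chunk_kb=50, num_drives=2)
for i, chunks in enumerate(layout):
    print(f"drive {i}: {len(chunks)} writes, chunks {chunks}")
```

Changing num_drives to 3 spreads the same 20 chunks over three drives, so no single drive has more than 7 writes
to do.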

The stripe size is basically the size of those chunks of data being written across the array. The default for an IDE
configuration is usually 64K. Contrary to what you might expect, increasing the stripe size can have a negative
impact on performance. If the data chunks are huge, then many times the parallel nature of RAID will not even be
employed, because the chunks may be larger than the files themselves. This would lead to no better performance
than a non-RAID setup. On the flip side, a stripe size that is too small will guarantee that your file is
broken up across the array (increasing performance) but increases the likelihood of many small random accesses to
the array, meaning your drives will likely be busier. As you can see, it’s a give-and-take thing.
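
To put rough numbers on that trade-off, here is a small Python sketch. It assumes a file that starts on a stripe
boundary, and the function name drives_touched and the 100K file size are chosen just for illustration. It counts
how many chunks a file breaks into and how many drives end up sharing the work.

```python
def drives_touched(file_kb, stripe_kb, num_drives):
    """Return (chunks, drives used) for a file that starts on a stripe boundary."""
    chunks = -(-file_kb // stripe_kb)  # ceiling division
    return chunks, min(chunks, num_drives)


# A hypothetical 100K file on a two-drive array, at three stripe sizes.
for stripe_kb in (4, 64, 1024):
    chunks, used = drives_touched(file_kb=100, stripe_kb=stripe_kb, num_drives=2)
    print(f"{stripe_kb}K stripe: {chunks} chunk(s) across {used} drive(s)")
```

With a 1024K stripe the whole file fits in one chunk on one drive, so nothing happens in parallel. With a 4K stripe
both drives are used, but the file turns into 25 separate small accesses.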

Mirroring
With striping alone, you do not get any redundancy. The data is all split up amongst the drives in the array, so
if you lose one of the drives, you lose the data on the whole array. Mirroring is the other feature of RAID that
comes to the rescue. The only problem is that with mirroring alone, you don’t get striping. Mirroring is a simple
concept: whatever you write to one drive, you write simultaneously to the other. Thus, you always have an exact
duplicate of your data on the second drive. The cool parts of this come with the controller you decide to use. For
example, most controllers will automatically sense a drive failure and instantly switch to the backup drive, meaning
virtually no downtime. This is great for servers and other mission-critical machines. If the controller doesn’t
support this, it will most likely at least automatically copy the data from the surviving drive to the replacement
drive once you install it.

Mirroring does give a small read performance benefit as well. Since both drives contain the same data, the
controller can read data from one drive while simultaneously requesting other data from the copy. Write speeds,
however, will slow down somewhat, because the controller must write all data twice.
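
As a toy model of that behavior, the Python sketch below keeps two dictionaries standing in for the two drives.
The class name MirroredPair and its methods are invented for this example and do not correspond to any real
controller interface. Every write goes to both drives, reads alternate between them, and a read falls back to the
surviving drive after a failure.

```python
class MirroredPair:
    """Toy two-drive mirror: each drive is a dict of block number -> data."""

    def __init__(self):
        self.drives = [{}, {}]
        self.failed = [False, False]
        self._next = 0  # index of the drive that serves the next read

    def write(self, block, data):
        # Every write is done twice, once per working drive.
        for i, drive in enumerate(self.drives):
            if not self.failed[i]:
                drive[block] = data

    def read(self, block):
        # Alternate between drives, skipping any drive that has failed.
        for _ in range(2):
            i = self._next
            self._next = 1 - self._next
            if not self.failed[i]:
                return self.drives[i][block]
        raise IOError("both drives have failed")

    def fail(self, i):
        # Simulate losing drive i and everything stored on it.
        self.failed[i] = True
        self.drives[i] = {}


array = MirroredPair()
array.write(0, b"hello")
array.fail(0)
print(array.read(0))  # still readable from the second drive
```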

Parity
Parity is another type of redundancy built into some RAID arrays. Instead of simply making copies of everything,
the RAID controller calculates parity information for the data being written to the array and stores it along with
that data. Basically, it’s just extra data derived from the actual data: each parity bit records whether the
matching data bits across the drives add up to an even or an odd number. By checking this value, the controller can
tell when the data no longer matches its parity. If a drive fails, the controller can rebuild the missing data
automatically from the parity and the data on the other drives.

Most parity setups use XOR to do their magic. This is a type of Boolean logic, the eXclusive OR. Basically, it
takes a series of 0s and 1s and returns either TRUE or FALSE (TRUE if the number of 1s is odd, FALSE if it is
even). By using this data, the controller can “fill in the blanks”. It’s like algebra. We know that 3 + 4 = 7. If
you see an equation like 3 + __ = 7, you know the blank is supposed to be a 4. The XOR logic is used in this way to
rebuild missing or corrupted data on the array, thus maintaining integrity.
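
Here is a minimal Python sketch of that “fill in the blanks” trick. The helper name xor_blocks and the sample
blocks are made up for illustration, and a real controller does this in hardware or in the driver. The parity
block is the XOR of the data blocks, and XOR-ing the parity with the surviving blocks gives back the one that was
lost.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three hypothetical data blocks, one per data drive.
data = [b"RAID", b"disk", b"test"]
parity = xor_blocks(data)  # stored as the parity block

# Pretend the drive holding data[1] has died.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt)  # b'disk'
```

This works because XOR-ing a value with itself cancels it out, so everything except the missing block drops away.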
