Basic High Availability - An Introduction To RAID


by Brian Stoffer - Date: 2007-03-18 - Word Count: 869

RAID stands for Redundant Array of Inexpensive Disks (definitions vary; the "I" is also commonly expanded as "Independent"). In practice it is two or more hard disk drives configured for use by a single computer system. The configuration varies by intended use, but the most common use for RAID arrays is in high-availability installations. High availability is a measure of how resistant to failure a system or infrastructure is. The most common approach to high availability in IT is redundancy, and RAID arrays are an easy way to make services more available by reducing the likelihood that a computer system will fail because of a hard disk drive failure. The net result is that if a single hard disk drive (hereafter referred to as a "hard drive" or "disk drive") that is part of a properly configured RAID array were to fail, the general operations of that computer system would be unaffected.

RAID configurations are referred to by numbers, or levels. The lowest level RAID configuration is called RAID 0, and is referred to as "striping." In this case data is divided over two (or more) hard drives to enhance performance. This is a special case, as it does not actually provide any redundancy. In fact, RAID 0 increases the possibility of failure (thus lowering availability) because a single set of data is split over two separate hard drives. If one of those drives fails, the data on the remaining drive is useless. Why even mention RAID 0? Because it is often used in extremely high performance applications, such as media editing. Also, as we will see later, RAID 0 can be used in combination with RAID 1 for a very potent, but expensive, high-availability configuration.
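To make the striping idea and its availability cost concrete, here is a minimal Python sketch. The function names, the chunk size, and the 5% per-drive failure chance are all assumptions chosen for illustration, and the failure estimate assumes drives fail independently. It is not how any particular RAID controller is implemented.

```python
def stripe(data: bytes, num_drives: int, chunk_size: int) -> list:
    """Split data into chunks and deal them round-robin across the drives."""
    drives = [[] for _ in range(num_drives)]
    for offset in range(0, len(data), chunk_size):
        chunk_index = offset // chunk_size
        drives[chunk_index % num_drives].append(data[offset:offset + chunk_size])
    return drives


def array_failure_probability(p_drive: float, num_drives: int) -> float:
    """Chance that at least one of num_drives independent drives fails.

    In RAID 0 the loss of any single drive loses the whole data set,
    so this is also the chance of losing the array.
    """
    return 1 - (1 - p_drive) ** num_drives


# Each drive ends up holding only every other chunk of the data.
print(stripe(b"ABCDEFGH", num_drives=2, chunk_size=2))
# [[b'AB', b'EF'], [b'CD', b'GH']]

# Assumed 5% chance that a given drive fails over some period.
print(round(array_failure_probability(0.05, 1), 4))  # 0.05
print(round(array_failure_probability(0.05, 2), 4))  # 0.0975
```

Under these assumptions a two-drive stripe is nearly twice as likely to lose data as a single drive, which is the trade-off described above.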

RAID 1 is referred to as "mirroring." Here, every time data is written it is duplicated onto a second hard drive. The net result is that the two drives are literal mirrors of each other, so if one fails the other can take its place. As far as performance goes, this method typically has the smallest overhead on reads and writes. Given a decent hardware RAID controller, RAID 1 should add no noticeable overhead in all but the most demanding disk-intensive applications. Any overhead at all is almost entirely eliminated by the use of a multi-channel RAID controller card, which can write to (and read from) two disks simultaneously.
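Mirroring can be sketched in a few lines of Python. The class below is a hypothetical toy model (the name MirroredPair and its methods are invented for this example, and each "drive" is just a dictionary), meant only to show that every write lands on both drives and that reads keep working after one drive is lost.

```python
class MirroredPair:
    """Toy model of a two-drive RAID 1 mirror; each drive is a dict."""

    def __init__(self):
        self.drives = [{}, {}]
        self.failed = [False, False]

    def write(self, block_number, data):
        # Every write is duplicated onto each working drive.
        for index, drive in enumerate(self.drives):
            if not self.failed[index]:
                drive[block_number] = data

    def fail_drive(self, index):
        # Simulate a drive failure: its contents are gone.
        self.failed[index] = True
        self.drives[index] = {}

    def read(self, block_number):
        # Read from the first drive that is still working.
        for index, drive in enumerate(self.drives):
            if not self.failed[index]:
                return drive[block_number]
        raise IOError("both drives in the mirror have failed")


mirror = MirroredPair()
mirror.write(0, b"payroll data")
mirror.fail_drive(0)
print(mirror.read(0))  # b'payroll data'
```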

RAID 5 is the next commonly used configuration (RAID 2 through 4 are rarely used in common computing). RAID 5 is sometimes referred to as "striping with parity." It has rapidly become the most common RAID configuration for large data stores that can tolerate its overhead, which falls mainly on writes. Rather than being duplicated outright, data is striped across multiple hard drives (a minimum of three distinct hard drives) together with extra recovery information, referred to as "parity," arranged so that the contents of any one failed drive can be rebuilt from the drives that remain. In a three-drive array, for example, any two drives can rebuild the contents of the third. The overhead comes from calculating and writing the parity data across the available drives. The key benefit of RAID 5 is how easy it is for a system to recover from a failed disk drive. Many RAID 5-capable hardware configurations include a "hot-swappable" disk drive chassis, which allows administrators to replace and rebuild failed drives while the system is running, with little to no effect on the operations of that system. The overhead makes RAID 5 a poor choice for highly disk-intensive applications, such as active mail servers or busy database systems.
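Parity is easiest to see with a small example. RAID 5 parity is commonly computed with a bitwise XOR of the data blocks, and the Python sketch below uses that operation on two made-up four-byte blocks. The helper name xor_blocks is an assumption for illustration, and the sketch leaves out details of a real array, such as rotating the parity block from drive to drive.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Two data drives plus one parity drive (the three-drive minimum).
drive_a = b"RAID"
drive_b = b"FIVE"
parity = xor_blocks([drive_a, drive_b])

# Simulate losing drive_a: XOR the surviving drives to rebuild it.
rebuilt_a = xor_blocks([drive_b, parity])
print(rebuilt_a)  # b'RAID'

# The same works if drive_b is the one that fails.
rebuilt_b = xor_blocks([drive_a, parity])
print(rebuilt_b)  # b'FIVE'
```

The extra XOR work on every write is where the parity overhead comes from, and the same calculation is what lets a replacement drive be rebuilt while the system keeps running.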

The final popular RAID configuration is referred to as RAID 0+1. As the designation implies, this is a combination of RAID 0 and RAID 1: a striped set of disks is mirrored for redundancy. This combines the high-availability benefits of mirroring and the high performance of striping into a highly available, high performance disk cluster. The limitation is typically cost, since the configuration requires a minimum of four distinct hard drives and hardware that can manage at least four hard disks.
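The cost trade-off between the levels can be estimated with simple arithmetic. The Python sketch below is a rough calculator, assuming all drives in an array are the same size. The function name and the 500 GB drive size are illustrative assumptions, and the minimum drive counts are the ones given in this article.

```python
MIN_DRIVES = {"0": 2, "1": 2, "5": 3, "0+1": 4}


def usable_capacity_gb(level: str, num_drives: int, drive_gb: int) -> int:
    """Estimate usable space for an array of identical drives."""
    if level not in MIN_DRIVES:
        raise ValueError(f"unsupported RAID level: {level}")
    if num_drives < MIN_DRIVES[level]:
        raise ValueError(f"RAID {level} needs at least {MIN_DRIVES[level]} drives")
    if level == "0":
        return num_drives * drive_gb          # no redundancy, all space usable
    if level == "1":
        return drive_gb                       # every drive holds the same data
    if level == "5":
        return (num_drives - 1) * drive_gb    # one drive's worth goes to parity
    if num_drives % 2 != 0:
        raise ValueError("RAID 0+1 needs an even number of drives")
    return (num_drives // 2) * drive_gb       # half the drives mirror the other half


for level in ("0", "5", "0+1"):
    print(level, usable_capacity_gb(level, num_drives=4, drive_gb=500))
# 0 2000
# 5 1500
# 0+1 1000
```

With four drives, RAID 0+1 yields half the raw capacity, which is the cost overhead mentioned above, while RAID 5 gives up only one drive's worth of space.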

Under most circumstances RAID 5 works well. For larger installations, where performance is a real issue, RAID 0+1 is not uncommon. For small business file servers, RAID 5 is a perfect fit. For most small web servers and utility servers (such as file repository, DNS, DHCP, etc.), RAID 5 is more than sufficient. For small-to-medium-sized databases that get minimal traffic, RAID 5 is fine. It is not uncommon for administrators to combine configurations in a single system. For example, with a multi-channel RAID controller it is possible to configure RAID 1 for the drives that contain the operating system, and then RAID 5 for a data array. This provides the operating system with high-performance redundancy, while granting a large array of data drives the added high availability of the live hard drive rebuilds inherent in RAID 5.

When evaluating RAID configurations for your office, consider the purpose of the system, whether the application will be disk-intensive (lots of reads and writes), whether or not performance is a factor (file server vs. application server), and how highly available the system needs to be (is it mission-critical?). When buying hardware, look for a multi-channel, RAID 5-capable RAID controller paired with a hot-swappable disk array. This will give you the most flexibility in terms of RAID configurations.
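As a rough summary, the rules of thumb in this article can be written as a small Python function. This is only a sketch: the function name and its two yes/no inputs are assumptions made for illustration, it presumes redundancy is wanted in the first place, and a real decision would weigh more factors than these.

```python
def suggest_raid_level(disk_intensive: bool, can_afford_four_drives: bool) -> str:
    """Suggest a redundant RAID level using this article's rules of thumb."""
    if not disk_intensive:
        # File servers, small web servers, utility servers, light databases.
        return "RAID 5"
    if can_afford_four_drives:
        # Striping performance plus mirroring redundancy, at a higher cost.
        return "RAID 0+1"
    # Mirroring has the smallest read/write overhead of the redundant levels.
    return "RAID 1"


print(suggest_raid_level(disk_intensive=False, can_afford_four_drives=False))  # RAID 5
print(suggest_raid_level(disk_intensive=True, can_afford_four_drives=True))    # RAID 0+1
print(suggest_raid_level(disk_intensive=True, can_afford_four_drives=False))   # RAID 1
```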



Brian Stoffer is a business and technology consultant with Pixel & Type, a web design and development firm focusing on the needs of financial services professionals. Visit Brian's website at http://www.pixelandtype.com


© The article above is copyrighted by its author. You're allowed to distribute this work according to the Creative Commons Attribution-NoDerivs license.
 
