July 3, 2015
(this is another in the series of crazy-long blogs … the bloke who approves these things is going to get sick of reading :-))
So most people know about RAID 5 – it’s the general purpose great all-rounder of RAID arrays – gets used for just about anything and everything. However when it comes to capacity there are limitations. First…there are physical limitations, then…there are common-sense limitations.
It’s true that you can put up to 32 drives in a RAID 5. You can also buy some pretty big drives these days (http://www.extremetech.com/computing/189813-western-digital-unveils-worlds-first-10tb-hard-drive-helium-filled-shingled-recording), so 32 x 10TB drives in a RAID 5 would yield (theoretically, if the drives really do give you 10TB of usable space) 310TB, remembering of course that RAID 5 loses 1 drive’s capacity to parity data.
Now not many people want 310TB on their home server (though some of my mates come close with movies), but there are more and more businesses out there with massive amounts of archival, static and large-scale data … so it’s not inconceivable that someone will want this monster one day.
Realistically, you may want to build a big server but use smaller drives because you can’t afford the largest ones. The principles of building this server remain the same no matter what drives you use.
So what are the problems with running a 32-drive array using 10TB drives? Plenty (imho).
Most of the issues come in the form of building and rebuilding the array. To see what a controller has to handle with disks and arrays of this size, let’s look at some dodgy math.
- Stripe size (per disk … what we call a minor stripe) – 256KB
- Number of stripes on a single disk – roughly 40 million (give or take a couple of million, depending on whether you count in decimal or binary units)
- Each major stripe (the stripe that runs across the entire array) is made up of 32 x 256KB pieces (8MB)
- 1 RAID parity calculation means reading 31 disks, calculating parity and writing it to 1 disk – per stripe
- Multiply that by roughly 40 million stripes
That’s going to wear out the abacus for sure
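For anyone who would rather not wear out the abacus, here is a minimal Python sketch of the arithmetic above. It is only a back-of-envelope check, and the function names are made up for illustration. It assumes 10TB means 10 x 10^12 bytes and 256KB means 256 x 1024 bytes; if you count the drive in binary units instead, the stripe count comes out nearer 42 million than 38 million, so treat the 40 million figure as a ballpark.

```python
# Back-of-envelope stripe arithmetic for a 32-drive RAID 5.
# Assumptions (not from any controller spec): decimal terabytes for the
# drive size, binary kilobytes for the minor stripe size.
DRIVE_BYTES = 10 * 10**12          # one 10TB drive
MINOR_STRIPE_BYTES = 256 * 1024    # 256KB per-disk (minor) stripe
DRIVES = 32


def stripes_per_disk(drive_bytes=DRIVE_BYTES, stripe_bytes=MINOR_STRIPE_BYTES):
    """Number of whole minor stripes that fit on one drive."""
    return drive_bytes // stripe_bytes


def major_stripe_bytes(drives=DRIVES, stripe_bytes=MINOR_STRIPE_BYTES):
    """Size of one major stripe: one minor stripe from every drive."""
    return drives * stripe_bytes


def reads_per_parity_calc(drives=DRIVES):
    """Drives read to compute parity for one major stripe (all but one)."""
    return drives - 1


if __name__ == "__main__":
    stripes = stripes_per_disk()
    print(f"minor stripes per disk: {stripes:,}")
    print(f"major stripe size: {major_stripe_bytes() // (1024 * 1024)} MB")
    print(f"minor-stripe reads for a full pass: {stripes * reads_per_parity_calc():,}")
```

The last line multiplies the stripe count by the 31 reads per parity calculation, which is where the sheer volume of work in a build or rebuild comes from.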
The problem with all these stripes and all this data is that every drive in the array is involved in any operation. Now that will mean some pretty good streaming speed, but it will also make for killer rebuild times. For example, to rebuild an array (best case, with no-one using it) in 24 hours, the drives would need to sustain at least 115MB/sec (10TB in 86,400 seconds). Now SATA and SAS drives might come close to that on reads, but they are nowhere near that on writes, so rebuilds on this monster will take a lot longer than 24 hours.
Since it’s a RAID 5, if another drive goes west (not south, I’m already in Australia!) during this rebuild process your data is toast and you have a major problem on your hands.
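The 24-hour figure is easy to check. The sketch below is a rough calculation only: it assumes decimal units (10TB = 10 x 10^12 bytes, 1MB = 10^6 bytes), a perfectly idle array, and a constant transfer rate, none of which you will get in real life. The 50MB/sec rate in the usage example is an arbitrary illustration, not a measured number.

```python
def required_rate_mb_s(drive_tb, hours):
    """Sustained MB/sec needed to rewrite one whole drive in the given time."""
    drive_bytes = drive_tb * 10**12
    seconds = hours * 3600
    return drive_bytes / seconds / 10**6


def rebuild_hours(drive_tb, rate_mb_s):
    """Hours needed to rewrite one whole drive at a sustained MB/sec rate."""
    drive_bytes = drive_tb * 10**12
    return drive_bytes / (rate_mb_s * 10**6) / 3600


if __name__ == "__main__":
    print(f"rate for a 24h rebuild of 10TB: {required_rate_mb_s(10, 24):.1f} MB/sec")
    # 50 MB/sec is a made-up sustained rate, used only to show the scaling.
    print(f"rebuild time at 50 MB/sec: {rebuild_hours(10, 50):.1f} hours")
```

Halve the sustained rate and the rebuild window doubles, which is why a busy array can take days rather than hours.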
So what is the alternative? Use RAID 50.
RAID 50 is a grouping of RAID 5 arrays into a single RAID 0. Don’t panic, I’m not telling you to use RAID 0 (which in itself has no redundancy), but let’s look at how it works. In a standard RAID 0 you use a bunch of single drives to make a fast RAID array. The individual drives would be called the “legs” of the array. However there is nothing protecting each drive – if one fails there is no redundancy or copy of data for that drive – and your whole array fails (which is why we don’t use RAID 0).
However in a RAID 50 array, each leg of the top-level RAID 0 is made up of a RAID 5. So if a drive fails, the leg does not collapse (as in the case of a RAID 0), it simply runs a bit slower because it is now “degraded”.
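To make the "legs" idea concrete, here is a small Python sketch of which drive failures a RAID 50 can ride out. It is a simplified model, not how any controller actually tracks state: drives are simply numbered from 0, each consecutive block of drives is assumed to form one RAID 5 leg, and the function name is hypothetical. The rule it encodes is that each RAID 5 leg tolerates one failed drive, and the RAID 0 on top survives only while every leg survives.

```python
from collections import Counter


def raid50_survives(failed_drives, legs, drives_per_leg):
    """Return True if the array still holds data after the given failures.

    Drives are numbered 0 .. legs * drives_per_leg - 1, and drive d is
    assumed to belong to leg d // drives_per_leg (an illustrative layout).
    """
    total = legs * drives_per_leg
    for d in failed_drives:
        if not 0 <= d < total:
            raise ValueError(f"drive {d} is not in a {total}-drive array")
    failures_per_leg = Counter(d // drives_per_leg for d in set(failed_drives))
    # Each RAID 5 leg can lose one drive; a second loss in the same leg
    # takes out that leg, and with it the RAID 0 across all legs.
    return all(count <= 1 for count in failures_per_leg.values())


if __name__ == "__main__":
    print(raid50_survives({3}, legs=2, drives_per_leg=16))      # one failure
    print(raid50_survives({3, 20}, legs=2, drives_per_leg=16))  # one per leg
    print(raid50_survives({3, 4}, legs=2, drives_per_leg=16))   # two in one leg
    print(raid50_survives({3, 20}, legs=1, drives_per_leg=32))  # plain RAID 5
```

A plain RAID 5 is just the one-leg case, which is why the same two failures that a two-leg RAID 50 may survive are fatal to the single 32-drive array.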
The maths above doesn’t change per drive, but the number of drives in the major stripe of each RAID 5 changes dramatically (it is at least halved), so the speeds in all areas go up accordingly.
A lot of people have heard of RAID 10, 50 and 60 (they are similar in nature), but picture them like humans, with 2 legs. However all these combination RAID levels can have multiple legs – 2, 3, 4 – however many you like. And more legs is generally better. But let’s look at our 32-drive configuration.
Instead of 32 drives in a single RAID 5, a RAID 50 could be 2 x RAID 5 of 16 drives each, with a single RAID 0 over the top. The capacity would be one drive less than the 32-drive RAID 5, but the performance and rebuild times will be greatly improved.
Why? In reality 32 drives is beyond the efficiency levels of RAID 5 algorithms, and it’s not as quick as an 8-16 drive RAID 5 (the sweet spot). So just on its own, a RAID 5 of 16 drives will generally be quicker than a RAID 5 of 32 drives. But now you have two RAID 5 arrays combined into a single array.
The benefits come to light in several ways. In RAID rebuilds (when that dreaded drive fails), only 16 of the drives are involved in the rebuild process. The other half of the RAID 50 (the 16 drives in the other RAID 5 leg) is not affected by the rebuild process. So the rebuild happens a lot faster, and performance of the overall array is not impacted anywhere near as badly as that of a rebuilding RAID 5 made up of 32 drives.
So what is the downside to the RAID 50 compared to the single large RAID 5? In this case, with 2 legs in the array, you would lose one additional drive’s capacity.
Mathematics (fitting things in boxes) always comes into play with RAID 50/60 … I want to make a RAID 50 of 3 legs over 32 drives – hmmm … the math doesn’t work, does it? It hardly ever does. If you have 32 drives then the best three-leg RAID 50 array you could make would be 30 drives (3 legs of 10 drives each). That would give 27 drives’ capacity, but it would be (a) faster and (b) rebuild much, much faster than anything described above.
So would you do a 4-leg RAID 50 with 32 drives? Yes you could. That would mean 4 drives lost to parity, giving a total of 28 drives’ capacity, but now each RAID 5 (each leg) is down to 8 drives and ripping along at a great rate of knots, and the overall system performance is fantastic. Rebuild times are awesome and have very little impact on the server. Downside? Cost is up and capacity is down slightly.
As you can see, there is always a trade-off in RAID 50 – the more legs, the more cost and less capacity, but the better performance. So back to the 32 drives … what could I do? Probably something like a 30-drive 3-leg RAID 50, with 2 hot spares sitting in the last two slots.
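The fitting-things-in-boxes arithmetic for the layouts discussed above can be sketched in a few lines of Python. This is an illustration only: the function name and the returned fields are invented for the example, it assumes every leg has the same number of drives, and it treats any slots that don’t divide evenly as free for hot spares.

```python
def raid50_layout(total_slots, legs):
    """Work out an equal-leg RAID 50 layout for a given number of drive slots.

    A single leg (legs=1) is just a plain RAID 5.
    """
    drives_per_leg = total_slots // legs
    if drives_per_leg < 3:
        raise ValueError("a RAID 5 leg needs at least 3 drives")
    used = drives_per_leg * legs
    return {
        "legs": legs,
        "drives_per_leg": drives_per_leg,
        "drives_used": used,
        "spare_slots": total_slots - used,   # left over, e.g. for hot spares
        "data_drives": used - legs,          # one drive's worth of parity per leg
    }


if __name__ == "__main__":
    for legs in (1, 2, 3, 4):
        layout = raid50_layout(32, legs)
        usable_tb = layout["data_drives"] * 10   # assuming 10TB drives
        print(layout, f"usable ~{usable_tb}TB")
```

For 32 slots this reproduces the numbers in the text: 31 data drives for the single RAID 5, 30 for two legs, 27 (with 2 slots left over) for three legs, and 28 for four legs.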
But what about your OS in this configuration? Where do you put it? Remember that you can have multiple RAID arrays on the same set of disks, so you can build 2 of these RAID 50 arrays on the same disks … one large enough for your OS (which will use very, very little disk space), and the rest for your data.
So should you consider using RAID 50? Absolutely – just have a think about the long-term usefulness of the system you are building and talk to us if you need advice before going down this path.