June 16, 2015
RAID storage configuration considerations (for the Channel System Builder)
SAS/SATA spinning media, SSD and RAID types – helping you make decisions
Some thoughts from the Storage Advisor
Note: I started writing this for other purposes – some sort of documentation update. But when I finished I realised it was nothing like the doc the user requested … and then “write blog” popped up on the screen (Outlook notification). So I took the easy way out and used my ramblings for this week’s update.
When designing and building a server to meet customer needs, there are many choices you need to consider: CPU, memory, network and (probably most importantly) storage.
We will take it as a given that we are discussing RAID here. RAID is an essential part of the majority of servers because it allows your system to survive a drive failure (HDD or SSD) and not lose data, along with the added benefits of increasing capacity and performance. While there are many components within your system that will happily run for the 3-5 year life of your server, disk drives tend not to be one of those items.
So you need to take a long-term approach to the problem of storage: what do you need now, what will you need in the future, and how will you survive mechanical and electronic failures during the life of the server?
What sort of drives do I need to meet my performance requirements?
Rather than looking at capacity first, it’s always a good idea to look at performance. While the number of devices has an impact on the overall performance of the system, you will not build a successful server if you start with the wrong disk type.
There are three basic types of disks on the market today:
- SATA spinning media
- SAS spinning media
- SSD (generally SATA but some SAS)
SATA spinning drives are big and cheap. They come in many different flavours, but you should really consider using only RAID-specific drives in your server. Desktop drives do not work very well with RAID cards as they do not implement some of the specific features of enterprise-level spinning media that help them co-operate with a RAID card in providing a stable storage platform.
The size of the drive needs to be taken into consideration. While drives are getting larger, they are not getting any faster. So a 500GB drive and a 6TB drive from the same family will have pretty much the same performance.
Note that this is not the case with SSDs. SSDs tend to be faster the larger they get, so check your specifications carefully to ensure you know the performance characteristics of the specific size of SSD you buy – not just what is on the promotional material.
The key to performance with spinning media is the number of spindles involved in the IO processes. So while it’s possible to build a 6TB array using 2 x 6TB drives in a mirror configuration, the performance will be low because only 2 spindles are in operation at any time. If the same array were built using 7 x 1TB drives in a RAID 5, it would be much quicker in both streaming and random data access due to the multiple spindles involved.
SAS spinning media generally spin faster than SATA drives (often 10,000 RPM or higher vs 5400/7200 RPM for SATA), and the SAS interface is slightly quicker than the SATA interface, so they outperform their SATA equivalents in certain areas. The advantage shows up mostly in random data access. When it comes to streaming data there is little to no difference between SATA and SAS spinning media.
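One reason spin speed matters for random access is rotational latency: on average the drive has to wait half a revolution for the right sector to arrive under the head. The sketch below only works out that one figure for the spin speeds mentioned above (plus 15,000 RPM as an example of the "or higher" case). It deliberately ignores seek time, caching and queueing, so treat it as a rough illustration rather than a performance prediction. The function name is made up for this example.

```python
def avg_rotational_latency_ms(rpm):
    """Average wait for a sector to reach the head: half of one revolution."""
    ms_per_revolution = 60_000 / rpm  # 60,000 ms in a minute
    return ms_per_revolution / 2


# Spin speeds from the text, plus 15,000 RPM as an assumed "or higher" example.
for rpm in (5400, 7200, 10_000, 15_000):
    print(f"{rpm:>6} RPM: {avg_rotational_latency_ms(rpm):.2f} ms average rotational latency")
```

Streaming workloads read long runs of consecutive sectors, so this per-request wait matters far less there, which fits the observation that SAS and SATA are close for streaming data.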
However, all performance calculations go out the window when SSDs are introduced into the equation. SSDs are dramatically faster than spinning media of any kind, especially when it comes to random data. Keeping in mind that random data storage systems tend to be smaller in capacity than streaming data environments, the SSD is rapidly overtaking SAS spinning media as the media of choice for random data environments. In fact, the SSD is so much faster than SAS or SATA spinning media for random reads and writes that it is the number one choice for this type of data.
So what about capacity calculations?
Capacity both confuses and complicates the performance question. With SATA spinning drives reaching upwards of 8TB, it’s pretty easy to look at a customer’s capacity requirements and think you can just use a small number of very large spinning drives to meet them.
And that is true. You can build very big servers with not many disks, but think back to the previous section on performance. With spinning media, it’s all about the number of spindles in the RAID array. Generally speaking, the more there are, the faster it will be. That applies to both SATA and SAS spinning media. The same cannot be said for SSDs.
So if you need to build an 8TB server you are faced with many options:
- 2 x 8TB drives in a RAID 1
- 4 x 4TB drives in a RAID 10
- 3 x 4TB drives in a RAID 5
- 5 x 2TB drives in a RAID 5
- 9 x 1TB drives in a RAID 5
So what is best with spinning drives? 2 x 8TB or 9 x 1TB? A good general answer is that the middle ground will give you the best combination of performance, cost and capacity. Note however that you need to think about the data type being used on this server, and the operating system requirements. If for example you are building a physical server running multiple virtual machines, all of which are some sort of database-intensive server, then you are wasting your time considering spinning drives at all, and should be moving straight to SSD.
If however this is a video surveillance server, where the data heavily leans towards streaming media, then 3 x 4TB SATA drives in a RAID 5 will be adequate for this machine.
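The 8TB options listed in this section can be checked with a few lines of arithmetic. This is a minimal sketch using nominal (label) capacities only: a mirror keeps one drive’s worth of data, RAID 10 keeps half the drives, RAID 5 gives up one drive to parity and RAID 6 gives up two. The function names and the minimum drive counts are assumptions for illustration; check your own controller’s documentation for its limits.

```python
def data_drives(level, n):
    """Number of drives' worth of capacity left after redundancy."""
    if level == "RAID 1" and n == 2:
        return 1                      # two-drive mirror
    if level == "RAID 10" and n >= 4 and n % 2 == 0:
        return n // 2                 # striped mirrors: half the drives hold copies
    if level == "RAID 5" and n >= 3:
        return n - 1                  # one drive's worth of parity
    if level == "RAID 6" and n >= 4:
        return n - 2                  # two drives' worth of parity
    raise ValueError(f"unsupported combination: {n} drives in {level}")


def usable_tb(level, n, drive_tb):
    """Nominal usable capacity in TB for n identical drives."""
    return data_drives(level, n) * drive_tb


# The five options from the list above: (RAID level, drive count, drive size in TB)
options = [
    ("RAID 1", 2, 8),
    ("RAID 10", 4, 4),
    ("RAID 5", 3, 4),
    ("RAID 5", 5, 2),
    ("RAID 5", 9, 1),
]

for level, n, size in options:
    print(f"{n} x {size}TB in {level}: {usable_tb(level, n, size)}TB usable, {n} spindles")
```

All five land on the same nominal capacity, so the real decision is about spindle count, cost and drive bays rather than capacity.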
What RAID controller type do I need?
This one is easier to determine. The RAID controller needs to have enough capacity to handle the IOPS capability of your drives, with sufficient ports to connect the number of drives you end up choosing. Since there are so many different ways of mounting drives in servers today, you will need to take into account whether the drives are directly attached to the server or whether they are sitting in a hot-swap backplane with specific cabling requirements.
What RAID level should I use?
There are two basic families of RAID:
- Non-Parity RAID
- Parity RAID
Non-parity RAID consists of RAID 1 and RAID 10. Parity RAID consists of RAID 5, 6, 50 and 60. Generally speaking, you should put random data on non-parity RAID, and general/streaming data on parity RAID. Of course things aren’t that simple, as many servers have a combination of both data types running through their storage at any given time. In this case you should lean towards non-parity RAID for performance reasons.
Note of course (there’s always a gotcha) that non-parity RAID tends to be more expensive because it uses more disks to achieve any given capacity than RAID 5 for example.
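To put a rough number on that gotcha, the sketch below counts how many drives of a given size each RAID level needs to reach a target capacity. It uses nominal capacities, and the minimum drive counts (4 for RAID 10 and RAID 6, 3 for RAID 5) are assumptions for illustration. The function name is hypothetical.

```python
import math


def drives_needed(level, target_tb, drive_tb):
    """Smallest drive count that reaches target_tb of nominal usable capacity."""
    data = math.ceil(target_tb / drive_tb)   # drives' worth of actual data
    if level == "RAID 10":
        return max(4, 2 * data)              # every data drive is mirrored
    if level == "RAID 5":
        return max(3, data + 1)              # one extra drive for parity
    if level == "RAID 6":
        return max(4, data + 2)              # two extra drives for parity
    raise ValueError(f"unknown RAID level: {level}")


# Example: 8TB usable from 2TB drives
for level in ("RAID 10", "RAID 5", "RAID 6"):
    print(f"{level}: {drives_needed(level, 8, 2)} drives")
```

For 8TB from 2TB drives this works out to 8 drives for RAID 10 against 5 for RAID 5 and 6 for RAID 6, which is where the extra cost of non-parity RAID comes from.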
Putting this all together …
By now you can see that designing the storage for a server is a combination of:
- Capacity requirement
- Performance requirement
- Disk type
- RAID controller type
- RAID level
Let’s look at some examples:
- General use fileserver for small to medium business
General Word, Excel and other office file types (including CAD files)
Performance requirements: medium
Disk type: spinning will be more than adequate
RAID controller type: Series 6, 7 or 8 with sufficient ports
RAID level: RAID 5 for best value for money
Options: should consider having a hot spare in the system
Should also consider having cache protection to protect writes in cache in event of power failure or system crash
Remember that you don’t get the total usable capacity that you expect from a drive. For example, a 4TB drive won’t give 4TB of usable capacity; it’s a little more like 3.75TB (I know, seems like a rip off!).
In this scenario we are going to recommend enterprise SATA spinning media. 4 x 3TB drives in a RAID 5 will give approximately 8TB capacity, with good performance from the 4 spindles. Since many server chassis support 6 or more drives, a 5th drive can be added as a hot spare, which will allow the RAID to rebuild immediately in the case of a drive failure.
With spinning drives a 6-series controller will be sufficient for performance, so the 6805 would be the best choice of controller. We would recommend attaching an AFM-600 to the controller to protect the cache in the event of a power failure etc.
- High-performance small-capacity database server
Windows 2012 stand-alone server running an industry-specific database with a large number of users
Performance requirements: high
Disk type: pure SSD to handle the large number of small reads and writes
RAID controller type: Series 7 (71605E)
RAID level: RAID 10 for best performance
Options: should consider having a hot spare in the system
In this scenario we are definitely going to use a pure SSD configuration. A database places a heavy load on the server with many small reads and writes, but the overall data throughput of the server is not great.
RAID 10 is the fastest RAID. When creating a RAID array from pure SSD drives, we recommend turning off the read and write cache on the controller. Therefore you (a) don’t need much cache on the controller and (b) don’t need cache protection. In this case we would recommend 6 x 1TB-class drives (e.g. 960GB SanDisk Extreme Pro), which would give approximately 2.7TB usable space in an extremely fast server.
When using SSDs you need to use a Series 7 or Series 8 controller. These controllers have a fast enough processor to keep up with the performance characteristics of the SSDs (the Series 6 is not fast enough).
Again, a hot spare would be advisable in such a heavily used server. This would make a total of 7 drives in a compact 2U server.
- Mixed-mode server with high-performance database and large data file storage requirements
Multiple user types within the organisation – some using a high-speed database and some general documentation. Organisation has requirement to store large volume of image files
Performance requirements: high for database, medium for rest of data
Disk type: mix of pure SSD to handle the database requirements and enterprise SATA for general image files
RAID controller type: Series 8 (81605Z)
RAID level: SSD in RAID 10 for operating system and database (2 separate RAID 10 arrays on the same disks). Enterprise SATA drives in RAID 6 because the large number of static image files will not be backed up
Options: definitely have a hot spare in the system
In this scenario (typically a printing company etc), the 4 x SSDs will handle the OS and database requirements. Using 4 x 512GB SSDs, we would make a RAID 10 of 200GB for Windows server, and a RAID 10 of 800GB (approx) for the database.
The enterprise SATA spinning media would be 8 x 4TB drives, with 7 in a RAID 6 (5 drives capacity) and 1 hot spare. In this scenario it would be advisable to implement a feature called “copyback hot spare” on the RAID card so the hot spare can protect both the SSD RAID array and spinning media RAID array.
This will give close to 20TB usable capacity in the data volumes.
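As a sanity check on the three examples, the sketch below converts the label capacity of the data-bearing drives into binary terabytes (TiB), which is how many operating systems report a "TB". This assumes the shortfall comes only from decimal versus binary units; formatting overhead and controller metadata are ignored, so the results are estimates and come out slightly lower than the round figures quoted in the text. The names used here are illustrative.

```python
GB = 10**9        # how drive vendors count a gigabyte
TB = 10**12       # how drive vendors count a terabyte
TIB = 2**40       # how many operating systems count a "TB"


def usable_tib(data_drives, drive_bytes):
    """Capacity of the data-bearing drives, expressed in binary terabytes."""
    return data_drives * drive_bytes / TIB


examples = {
    # 4 drives in RAID 5: 3 drives' worth of data
    "Example 1: 4 x 3TB in RAID 5": usable_tib(3, 3 * TB),
    # 6 drives in RAID 10: 3 drives' worth of data
    "Example 2: 6 x 960GB in RAID 10": usable_tib(3, 960 * GB),
    # 7 drives in RAID 6: 5 drives' worth of data
    "Example 3: 7 x 4TB in RAID 6": usable_tib(5, 4 * TB),
}

for name, tib in examples.items():
    print(f"{name}: about {tib:.2f} TiB")
```

The hot spares are left out of the sums on purpose, since a spare contributes no capacity until it replaces a failed drive.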
Some of the key features of RAID cards that need to be taken into consideration, which will allow for the best possible configuration, include:
- Multiple arrays on the same disks
It is possible to build up to 4 different RAID arrays (of differing or same RAID level) on the same set of disks. This means you don’t have to have (for example) 2 disks in a mirror for an operating system, and 2 disks in a mirror for a database, when you can meet both requirements on the same 2 disks.
- RAID 10 v RAID 5 v RAID 6
RAID 10 is for performance. RAID 5 is the best value-for-money RAID and is used in most general environments. Many people shy away from RAID 6 because they don’t understand it. But in a situation such as example 3 above, where a customer keeps a large amount of data as a near-line backup, or copies of archived data for easy reference, that data won’t be backed up, so you should use RAID 6 to ensure it is protected. Remember that the read speed of RAID 6 is similar to RAID 5, with the write speed being only very slightly slower.
- Copyback Hot Spare
When considering hot spares, especially when you have multiple drive types within the server, copyback hot spare makes a lot of sense. In example 3 above, the server has 4TB SATA spinning drives and 512GB SSDs. You don’t want to have 2 hot spares in the system as that wastes drive bays, so having 1 hot spare (4TB spinning media) will cover both arrays. In the event that an SSD fails, the 4TB SATA spinning drive will kick in and replace the SSD, meaning one mirrored pair in the SSD RAID 10 will be made up of an SSD and an HDD. This keeps your data safe but is not a long-term solution. With copyback hot spare enabled, when the SSD is replaced, the data sitting on the spare HDD will be copied to the new SSD (re-establishing the RAID), and the HDD will be turned back into a hot spare.
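The copyback sequence described above can be walked through with a toy model. This is only a sketch of the bookkeeping, not how any controller’s firmware or management tools actually work: the class, its methods and the drive labels are all invented for illustration, and the behaviour with copyback turned off (the replacement drive simply becomes the new spare) is an assumption about the usual non-copyback case.

```python
class ToyController:
    """Toy model of one global hot spare shared by several arrays."""

    def __init__(self, arrays, hot_spare, copyback=True):
        self.arrays = {name: list(members) for name, members in arrays.items()}
        self.hot_spare = hot_spare
        self.copyback = copyback
        self.standing_in = None  # (array name, spare label) while the spare is in use

    def fail_drive(self, array, drive):
        """A member fails; the hot spare takes its place and the array rebuilds."""
        if self.hot_spare is None:
            raise RuntimeError("no hot spare available")
        members = self.arrays[array]
        members[members.index(drive)] = self.hot_spare
        self.standing_in = (array, self.hot_spare)
        self.hot_spare = None

    def replace_drive(self, new_drive):
        """The failed drive is physically swapped for new_drive."""
        if self.standing_in is None:
            raise RuntimeError("no spare is currently standing in for a failed drive")
        array, spare = self.standing_in
        if self.copyback:
            # Copy the data back to the new drive and release the spare.
            members = self.arrays[array]
            members[members.index(spare)] = new_drive
            self.hot_spare = spare
        else:
            # Assumed non-copyback behaviour: the old spare stays in the array.
            self.hot_spare = new_drive
        self.standing_in = None


ctl = ToyController(
    {
        "ssd_raid10": ["ssd0", "ssd1", "ssd2", "ssd3"],
        "sata_raid6": [f"hdd{i}" for i in range(7)],
    },
    hot_spare="hdd_spare",
)
ctl.fail_drive("ssd_raid10", "ssd2")
print(ctl.arrays["ssd_raid10"])   # the HDD spare is now a member of the SSD array
ctl.replace_drive("ssd_new")
print(ctl.arrays["ssd_raid10"], "spare:", ctl.hot_spare)
```

With copyback enabled the SSD array ends up all-SSD again and the HDD returns to spare duty. With it disabled, the HDD would stay in the SSD array, which is exactly the long-term situation you want to avoid.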
As you can see, there are many factors to weigh when designing server storage, and all of those listed above need to be considered to get the right mix of performance and capacity at the best possible price.
Using a combination of the right drive type, RAID level, controller model and quantity of drives will give a system builder an advantage over the brand-name “one-model-fits-all” design mentality of competitors.
If you have questions you’d like answered then reply to this post and I’ll see what I can do to help you design your server to suit your, or your customer’s, needs.