The SAS (r)evolution …

January 16, 2014

When someone within the company started discussing the evolution of SAS a couple of weeks ago, I thought I’d misheard them … was that “revolution” or “evolution”? Turns out it’s been a bit of both. In its early days it was a revolution, but now that it’s a well-matured technology we’re in the “evolution” phase of its lifecycle.

At the core of all of this technology is the venerable SCSI command set … which has turned out to be just about the most long-standing and solid technology the storage industry has developed. Conceived over 30 years ago, it is still going strong today across many and varied delivery mechanisms.

So since we are talking about SAS, and not the SCSI command set (which is used in a lot of places in storage today), let’s look at yesterday, today and tomorrow to see where we’ve been, where we are, and where we’re going.

When serial first came along, my first thought was … “thank goodness for simpler cabling”. It didn’t really turn out that way: yes, we no longer needed termination like on the old parallel bus, but we ended up with a truckload of new cable types, probably even more than we had in the old SCSI days. However, 1.5Gb down the pipe sounded pretty good, and indeed early performance was great right out of the box.

Adaptec made a decision back in those days to build both SAS and SATA controllers, but we quickly worked out that we could do it all with one controller (SAS), because SATA is a subset of the SAS protocol, so a single SAS controller can talk to both types of drives at the same time. While SAS jumped straight into the performance end of town, early SATA implementations were pretty dodgy and somewhat slow – and really held things back until SATA II and SATA III came along.

So 1.5Gb … wow, that’s pretty fast. In fact it’s faster than most spinning disks can go even today. Hmmm, don’t think we’ll need an upgrade for a while. No, as usual we had to have an upgrade, and so we got 3Gb. Well that had to be it, surely … spinning media will never catch up with this (and indeed it hasn’t to this day), but wait, what are those funny little things called SSDs? OK, let’s go to 6Gb … the old “double it and they will come” principle. Great, now we’re cooking. Overkill for spinning drives but what the heck, we’re keeping up with the SSDs … or so we thought. OK, so your SSD can do more than 600MB/sec … then we’ll go to 12Gb and gloat about our performance (for a while).

And so here we are – 12Gb SAS. Interestingly we haven’t had the complications or growing pains of the old parallel SCSI – shorter cables, different connectors (oh yes, that’s right, we did do different connectors for 12Gb) – but generally this evolution has seen less pain, and therefore quicker uptake, than the previous parallel regime. That said, 12Gb is very new and we don’t really know what its uptake will be yet. In fact the 12Gb standard has brought some very nice negotiation processes with it for device handling, so in that respect it’s a step forward in both functionality and speed over 6Gb SAS.

But is it enough?

For the moment it will have to be … and for a couple of reasons. The PCI Express bus is capable of somewhere in the vicinity of a theoretical 8000MB/sec, while our 24-port 6Gb SAS chips can hit a theoretical 9600MB/sec. Note the word “theoretical” – the fastest I’ve ever achieved out of our controllers is 6,600MB/sec (wow!). And there’s no real advance beyond PCIe 3 on the horizon, so we’re not going to get any faster on that side.
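To see why the “theoretical” numbers never match reality, it helps to remember that SAS line rates are raw bit rates on the wire, and (assuming the usual 8b/10b encoding on 6Gb and 12Gb SAS links) only 8 of every 10 bits carry payload. A quick back-of-the-envelope sketch of the per-lane conversion:

```python
# Rough per-lane payload bandwidth for a SAS link.
# Assumes 8b/10b line encoding (8 payload bits per 10 wire bits)
# and ignores protocol overhead -- a sketch, not a vendor spec.

def sas_usable_mb_per_sec(line_rate_gbps: float, ports: int = 1) -> float:
    """Usable MB/sec: raw Gb/s, minus 8b/10b overhead, divided into bytes."""
    per_port = line_rate_gbps * 1000 * (8 / 10) / 8  # MB/sec per lane
    return per_port * ports

print(sas_usable_mb_per_sec(6))   # 600.0  -> a 6Gb lane carries ~600MB/sec
print(sas_usable_mb_per_sec(12))  # 1200.0 -> a 12Gb lane, ~1200MB/sec
```

Multiply that per-lane figure across a controller’s ports and you quickly exceed what the host’s PCIe slot can move, which is why real-world controller throughput tops out well below the aggregate of its lanes.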

There are also some new kids on the block – SCSI Express and NVM Express – which drive the storage device across a PCIe bus rather than a SAS bus, and this “may” be the way forward instead of ever-increasing SAS speeds (the jury is still out on exactly what the future holds in the interface market).

The bit I really find ironic is that no-one I’ve heard of is really asking for more MB/sec than they can get today with a bunch of 6Gb SSDs. It’s all about IOPs and latency. If we are talking 4KB blocks, then it takes an awful lot of those to saturate a bus capable of handling 6000MB/sec … it will generally be the processor that floods and bottlenecks before the bus does in this scenario. Latency, however, is proving a problem child for the SAS world, and lower latency is one of the claims to fame of the SCSI Express/NVM Express camp … since they are PCIe on both sides of the controller, they claim lower latency – and that is very, very appealing to datacenters today. It’s all about latency and IOPs.
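“An awful lot of those” is easy to put a number on. The arithmetic in the paragraph above, sketched out:

```python
# How many small-block IOPs does it take to saturate a given bus?
# Plain arithmetic on the figures quoted above; nothing vendor-specific.

def iops_to_saturate(bus_mb_per_sec: float, block_kb: float) -> float:
    """IOPs needed so that block-sized transfers fill the bus."""
    return bus_mb_per_sec * 1024 / block_kb

print(iops_to_saturate(6000, 4))  # 1536000.0 -> ~1.5 million 4KB IOPs
```

Very few workloads (or host CPUs) can drive 1.5 million 4KB operations per second, which is exactly why the bus itself is rarely the bottleneck for small-block work.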

So, on this “evolutionary path” of SAS, have we made it to the end? I might have been thinking so until I received a phone call this morning from a bloke looking for a SCSI card … and I was reminded that SCSI/SAS, in all its formats and variations over the years, has outlived more than one computer tech :)

Don’t know about you, but I’m looking forward to learning a new interface, a new technology and a new way of putting together yet another complex solution for a customer!




I want a rubber/lego/meccano hard disk …

January 6, 2014

Stop sniggering. This is not about being able to drop a disk from a great height and have it survive … it’s about me making my own disk. Now that sounds as dumb as anything I’ve put in a blog before, but let’s look at it. What exactly do I want from the disk in my server?

  1. It has to be big – as big as I want to make it (512TB should be enough)
  2. It has to be redundant – and I want lots of options on how redundant I make it (because I’m a gambler at heart, like most IT pros, and want to be able to determine just how close I sail to disaster :)
  3. It has to be fast (SSD or faster is what I mean) – but I want to be able to determine how much of it is fast and how much is “other” (other being big, fat, cheap, SATA)
  4. It has to be cheap (SATA) … this works with points 2 and 3 above – the price will come down with less redundancy but will go up with more speed … but I want to be able to configure this disk the way I want it with any combination of the above
  5. It has to manage itself … because like all IT pros I like playing with things for a while, but then get bored of such menial tasks pretty quickly
  6. So it has to look and feel like one large hard drive with all of the above characteristics

So let’s summarise what I’m after here …

“a large, redundant, flexible, configurable, fast, cheap, self-managing disk”

Hmmm …

Not asking much, am I? Seagate make “hybrid” hard drives … spinning drives with some NAND flash built in … and they use some fancy algorithms to work out what should go where. The only problem is that they are a bit limited in size – I want a 20TB disk (or larger), and it will be a while before I see something that large in a single disk.

Enter maxCache Plus. This clever technology lets me take any storage in the server and combine it into a single self-managing disk. While that sounds groovy, let’s look at a little more practical solution.

You need big capacity, so you purchase 16 large enterprise SATA hard drives (4TB each). However, there is going to be data on this disk that needs to be fast (random, database-type material), as well as large amounts of nothingness (i.e. it’s a VMware machine). So instead of wasting ports on your RAID controller, you purchase a flash drive (a NAND flash device such as Fusion IO) to accelerate certain data.

Grab yourself an 81605ZQ, plug all the drives in (it doesn’t matter if they are 6Gb or 12Gb drives), make a RAID 5 with a hot spare and you’re ready to go. The only problem is speed – it’s not fast enough. So stuff the RAID 5 into a “pool” in maxCache (it will be Tier 1 – the slower of the two storage pools). Then grab your Fusion IO or other flash drive and stuff it into Tier 0 (the faster of the two pools).

Now grab the storage from both pools (Tier 0 and Tier 1), and combine them together into a single disk (Virtual Volume) that lets you see the capacity of both storage devices, moulds them into one disk and manages the data positioning for you.

So maxCache, in this environment, will move blocks of data around within the virtual volume, repositioning hot data onto the Fusion IO portion of the storage (Tier 0), and moving the not-so-hot stuff onto the RAID 5 array. And what management do you need to do? Sit back, grab a cold one from the fridge (after 5 of course) and put your feet up – there is no user management of the data volume required.
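The hot/cold placement idea above can be boiled down to a toy sketch. To be clear, this is not Adaptec’s actual maxCache algorithm – just a minimal illustration of the tiering concept, assuming a simple access-count measure of “hotness”:

```python
# Toy tiered-storage placement sketch (NOT the real maxCache algorithm):
# track accesses per block and keep the hottest blocks on the fast tier
# (Tier 0); everything else stays on the slow tier (Tier 1).

from collections import Counter

class TieredVolume:
    def __init__(self, fast_tier_blocks: int):
        self.fast_capacity = fast_tier_blocks  # how many blocks Tier 0 holds
        self.access_counts = Counter()         # block id -> access count

    def record_access(self, block: int) -> None:
        """Note one read/write of a block; hotness is just hit count here."""
        self.access_counts[block] += 1

    def fast_tier(self) -> set:
        """The blocks currently promoted to Tier 0 (the hottest ones)."""
        hottest = self.access_counts.most_common(self.fast_capacity)
        return {block for block, _count in hottest}

vol = TieredVolume(fast_tier_blocks=2)
for block in [7, 7, 7, 3, 3, 9, 1]:
    vol.record_access(block)
print(vol.fast_tier())  # {3, 7} -- the two most-accessed blocks
```

A real implementation works on coarser data chunks, decays old statistics, and moves data in the background, but the principle is the same: the volume, not the administrator, decides what lives on the flash.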

Sounds a bit too good to be true … a disk that is made up of a Fusion IO-like card and a large RAID array of whatever redundancy level I want, built out of whatever storage I have in my system, not just what’s attached to my RAID card, all managing itself and constantly optimizing the customer data onto the best storage medium for the data type.

What I didn’t mention is that I can actually chop up the flash drive and RAID array into different pieces so I can make multiple disks – maybe one with a lot of flash and a reasonable amount of SATA, and one with very little flash acceleration and mostly SATA – either way the choice is mine. In other words, it is “flexible” (to a crazy degree).

So I end up with a “large, redundant, flexible, configurable, fast, cheap, self-managing disk” (or disks). But wait … I don’t want to buy a Fusion IO card … I have a 71605E (entry card) sitting on the desk, and good fast SSDs are cheap as chips these days. No worries – plug the SSDs into the 7 series, stuff it in next to the 8 series, and make a RAID 10 of SSDs (which doesn’t need cache on the controller or ZMCP cache protection on the 7 series). Then you can connect up to 16 SSDs and use that as Tier 0 storage instead of a flash drive.

So what limits the configuration possibilities? The grey matter between your ears – pretty much your imagination. In other words this really is a flexible and highly configurable technology.