Changing the default task priority …

August 16, 2011

Have you ever wondered why rebuilds take so long? Maybe it’s because you have 24 x 3TB drives in a RAID 6 with 2 failed drives, or maybe it’s because the default task priority is set to low.

You can set the default priority by right-clicking on the controller in ASM and changing the value. Note that this won’t affect tasks already running – you can change one of those by right-clicking on the array and adjusting the task there – but I recommend setting the default priority to high.

This brings up lots of questions … won’t it impact performance? Will my users be affected? Possibly yes, but the real question is … what is important here? Getting the server back to optimal as quickly as possible, or not having people complain? Since people complain all the time I don’t take much notice of that. Their complaints will be much louder if the server goes down because something else fails before the array is optimal again, so I tend to go for the lesser of two evils and get things working correctly as quickly as possible.

It all depends on your priorities, but it’s worth considering setting the default to high – it makes a considerable difference in such things as rebuilding RAID 6 arrays.
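
To put some rough numbers on the difference, here’s a back-of-the-envelope sketch. The rebuild rates are assumptions for illustration only, not measured figures – real rates depend on the controller, the drives and the host workload:

```python
# Back-of-the-envelope rebuild time for a single 3TB drive.
# The sustained rebuild rates below are illustrative assumptions only.

DRIVE_SIZE_GB = 3000

def rebuild_hours(rate_mb_per_s):
    """Hours to rewrite the whole drive at a given sustained rate."""
    return DRIVE_SIZE_GB * 1000 / rate_mb_per_s / 3600

print(f"Low priority  (~30 MB/s assumed):  {rebuild_hours(30):.1f} hours")   # ~27.8
print(f"High priority (~100 MB/s assumed): {rebuild_hours(100):.1f} hours")  # ~8.3
```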

Food for thought.

Ciao
Neil


You did what with a mirror?

August 11, 2011

In days gone by, if someone mixed a SAS (fast) drive with a SATA (slow) drive in a mirror, we tech guys would generally smile to ourselves, pat the customer comfortingly on the head and make for the nearest exit as quickly as possible, muttering under our breath.

It did not make sense to mix fast and slow drives in a RAID 1 because both reads and writes were pretty much limited to the performance of the slower drive, so you had seriously wasted your money on the expensive one.

So along comes SSD (and big ones at that). Now in MLC format they are getting bigger and somewhat cheaper, and they generally have very good read speeds. Their write speeds are nothing to “write” home about (pun intended) but in a workstation environment or small server you can live with that.

Now Adaptec’s engineers have either had an epiphany or a complete brain fade … I thought the latter until I looked a little closer at a new innovation in the latest firmware for our Series 2, 5 and 6 controllers.

A hybrid mirror of an SSD and a SATA hard drive … (or a SAS hard drive, for that matter).

Now how does that work?

Pretty simple really (in fact most good ideas are simple ones). Data is written to both drives as per a normal mirror. Because the SSD is at least as fast as its partner SATA drive, the user loses nothing in write performance compared to the standard SATA mirror he was generally intending to run. However, and this is the big innovation, all reads come from the SSD.

Now in mirrors gone by, the theory was that reads came evenly from both drives, but the reality is that reads more often come from only one drive. So why not just force the mirror to read only from the fastest drive? When dealing with SAS/SATA combinations the performance improvement would have been marginal to say the least, but with SSD it’s a different story.

SSD read speeds are way up and above those of SATA drives, and it makes sense to take advantage of that.

So that’s the way it works. Writes to both disks, reads only from the SSD. Of course if the SSD fails (heaven forbid I just said that), then the reads will switch over to the SATA drive and all will continue to work as normally expected in a mirrored environment.
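
For the programmers among us, the routing policy boils down to something like this minimal sketch. It illustrates the concept only – the real logic lives in the controller firmware:

```python
# Minimal sketch of a hybrid mirror's I/O routing.

class Disk:
    """Stub standing in for a physical drive."""
    def __init__(self):
        self.blocks = {}

    def write(self, lba, data):
        self.blocks[lba] = data

    def read(self, lba):
        return self.blocks.get(lba)

class HybridMirror:
    def __init__(self, ssd, sata):
        self.ssd = ssd          # fast member
        self.sata = sata        # slow member
        self.ssd_failed = False

    def write(self, lba, data):
        # Writes go to both members, exactly like a normal RAID 1.
        self.ssd.write(lba, data)
        self.sata.write(lba, data)

    def read(self, lba):
        # All reads are served from the SSD while it is healthy;
        # if the SSD fails, the mirror degrades to reading from SATA.
        member = self.sata if self.ssd_failed else self.ssd
        return member.read(lba)

mirror = HybridMirror(Disk(), Disk())
mirror.write(0, b"data")
print(mirror.read(0))  # served from the SSD
```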

Now that’s a pretty cool use of SSD … all the read performance, plus the safety of a mirror, without having to buy two SSDs.

Bring on those ever-increasing drive sizes in SSD!

Ciao
Neil


Which drive to use?

August 11, 2011

Question to the Storage Advisors: We are setting up a small RAID 6 on a dedicated server with 5 drives initially (it may be expanded later). The array will store mostly video content and will serve a maximum of 2-3 clients at the same time, reading those videos for playback/editing. The question is what disks to use – we have to choose between lower-power 1.5TB drives with decent read/write performance but modest IOPS, and higher-speed drives with comparable read/write performance but nearly double the IOPS. The question is really one of cost (the high-speed drives are over 60% more expensive) – will the higher I/O performance make a big difference to this type of file-serving workload under RAID 6? Or is the money going to be wasted? Thanks for any response you can offer – Dimitri.

Dimitri … good question and one that will have a major impact on the overall cost of your server. Since you haven’t mentioned drive models I’ll be generic, but your mention of 1.5TB drives is a giveaway for Seagate desktop (AS) drives … one of the cheapest drives on the market today.

Firstly, SAS is a waste of money and time for your intended scenario. SAS drives have very high rotational speeds and very low seek times (and hence high IOPS), making them great for database servers and transactional workloads where small amounts of data are required on a regular basis. However, their MB/sec throughput is not that much greater than that of SATA drives (roughly speaking), so you won’t get much of a performance improvement for your dollar by using these drives in a video environment.

SATA drives come in two flavours from almost all drive vendors … desktop and enterprise versions. Technically speaking, the drive vendors will tell you there is a great deal of difference between desktop and enterprise drives, and as far as we are concerned there are some major advantages to enterprise drives (TLER for one) which make them suitable for RAID environments. For this reason, and for the purported reliability benefits, I’m going to say go for enterprise-level SATA drives for your RAID server.

Why?

In your question you were concerned as to whether the higher IOPS of (I presume) SAS drives will help your server. Basically … no. IOPS (Input/Output Operations Per Second) is how we measure small random data loads. That is not the type of workload you described for your server. A video server does “streaming” data … large amounts of data written to or read from the drives in a single stream. SATA drives are just as quick as SAS drives in this environment. While SATA drives can’t match SAS drives in the IOPS department (slower seek, rotation etc.), once they start delivering data they can keep chugging along with the best of them.

Therefore, for video applications, use SATA drives. Their MB/sec combined with the cache on a RAID card will allow for very high sustained speeds at a fraction of the cost of SAS drives. They are also much larger, so you need a lot less of them to get up to the capacities required for video storage.
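
To make that concrete, here’s a rough throughput sketch for a 5-drive RAID 6. The per-drive streaming rates are assumptions for illustration, not benchmark results:

```python
# Rough sequential-throughput comparison for a 5-drive RAID 6.
# Per-drive streaming rates are illustrative assumptions only.

DRIVES = 5
PARITY = 2  # RAID 6 spends two drives' worth of capacity on parity

def array_stream_mb_s(per_drive_mb_s):
    # Streaming throughput scales with the number of data drives.
    return (DRIVES - PARITY) * per_drive_mb_s

print(f"SATA RAID 6: ~{array_stream_mb_s(100)} MB/s")  # assuming ~100 MB/s per drive
print(f"SAS  RAID 6: ~{array_stream_mb_s(120)} MB/s")  # assuming ~120 MB/s per drive
# The streaming gap is small; the big SAS advantage (IOPS) never comes into play.
```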

So the choice is simple … SATA over SAS for video. That just brings us back to the question of “desktop” drives vs “enterprise” drives (as defined by the hard drive vendors). If your question is really whether to opt for one of these two types of drives, I’m going to opt for the enterprise drives in a RAID environment no matter what the cost difference over the desktop drive. While there are a million studies out there, and much anecdotal evidence, arguing that one is or is not “better” or “more reliable” than the other, it is our general experience that as desktop-level drives get bigger, compatibility and reliability become more of an issue. So while there really won’t be any speed difference between the desktop and enterprise drives in your scenario, there will be (I believe) far fewer problems using enterprise SATA in a RAID 6 video server.

With only 5 drives in the server the overall cost difference is a small portion of the overall price of the server, and I believe you will be better served by using drives that drive vendors intended to work with RAID, not the cheapest of the cheap desktop units.

Now, while we are on this tack … you’ll notice I mentioned cache on the card a little further up this blog. Yes it’s there and yes, you need to protect it. While you can put a battery on one of our 5 series controllers, why not look at the 5Z range of controllers … no batteries, no hassles (zero maintenance as my American bosses call it), no wuckers as an Australian would say. The 5805Z teamed with enterprise SATA drives will give you a reliable, fast server that will protect your data for years to come (there, I’ve done the sales bit the boss is always asking for).

Spend wisely, but don’t skimp on the bits that really count.

Ciao
Neil


What the spare?

August 11, 2011

Had a question the other day from a customer panicking about his hot spare situation. He was running a RAID 1E on 3 drives with a hot spare in a small database server – very sensible.

As is the way of hard drives, one went west. The hot spare did its thing and kicked in. The array rebuilt and all was hunky dory. However, the bit that confused him was that the drive which had originally been the hot spare, but was now part of the array, still had a “hot spare” icon in Adaptec Storage Manager.

Did it work? Was the array optimal? Was there something wrong with the card? Yes, yes and no.

When this situation occurs we make a very subtle change to the hot spare icon, but leave it there so that a system admin can see that this particular drive was a hot spare but has now been built into an array. It’s perfectly safe to right-click on that drive and “delete” the hot spare. All you are in fact doing is removing the hot spare designation … the drive itself is now an array member and no longer a hot spare.

Of course you need to replace the failed drive and make it a hot spare to put your world back in order, but all should then be good.

Tricks for young players …

Ciao
Neil


This is going to have someone trembling …

August 11, 2011

In their boots, that is. I’m talking about USB3. This new technology must have the tape drive vendors of the world a little concerned about the ongoing viability of tape. Of course libraries are still going to be viable due to their massive scalability, and some small business servers will still need multi-terabyte backups, but there are a lot of people out there who must be looking at the cost of replacing their tape drive and wondering about disk.

Disk is now up to 3TB, which is pretty large and relatively cheap. If you consider a small business running a backup every day of the week, one at the end of the month and the occasional offsite backup, that’s normally about 7 tapes plus your tape drive.

OR

You could use 7 single external USB drives.

I recently had an old friend call me up to discuss his tape backup. The unit (an LTO2) had died and he needed to replace both it and his tapes due to their age. The cost was pretty dramatic. So we looked at some alternatives. Their server is a small business unit with a full backup capacity of just under 300GB. Some simple experimentation with USB2 external 2.5″ hard drives proved (a) that the backups were quicker (I think the tape drive was getting very tired in its old age), (b) that the scheme was simple to manage (a DYMO label on each drive), and (c) that the cost of implementing this backup scenario was radically cheaper than replacing a tape drive with 7 new cartridges and a cleaning tape.

My only real concern in this scenario was the speed of USB2. As it turned out, for the size of this server and the backup window that the work practices of this organisation allowed, USB2 speed was more than sufficient. However, if this were a bigger server we’d be in trouble on time with USB2. Capacity could be handled by much larger USB drives, but USB2 would limit the speed of the backup – a problem solved instantly by USB3.
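
The arithmetic behind that is simple enough. Here’s a quick sketch – the effective throughput figures are rough assumptions, not spec-sheet numbers:

```python
# Approximate backup windows for a 300GB full backup.
# Effective throughputs are rough assumptions for illustration.

BACKUP_GB = 300

def backup_hours(mb_per_s):
    return BACKUP_GB * 1000 / mb_per_s / 3600

print(f"USB2 (~30 MB/s effective):  {backup_hours(30):.1f} hours")   # ~2.8
print(f"USB3 (~150 MB/s effective): {backup_hours(150):.1f} hours")  # ~0.6
```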

Ironically, the tape drive (now 4 years old) and SCSI card were my recommendation in the first place … because that was the technology of choice back then. However, the current crop of new technologies and their ridiculously low prices have forced me to change my tune.

I’d been hearing for a while from customers that the latest versions of SBS did not support tape natively, and they wanted to use eSATA. While this is a very good technology, it always seemed to cause problems getting cards working in servers or hot-swapping drives. I don’t expect USB3 to have these problems (USB2 has been pretty stable in this regard for quite some time).

So while the tape drive vendors might be a little concerned, the external hard drive vendors (like WD and Seagate) must be sitting back and waiting for server motherboards to take off with USB3 – it will most likely mean a lot of sales for their external drives into SBS servers.

Food for thought.

Ciao
Neil


Thinking Broadband …

August 11, 2011

All the hype in Australia at the moment is about the National Broadband Network … basically fibre to the premises everywhere (fibre to all homes and businesses). Now, no-one believes it’s going to happen in the next 6 months, but the current government has a firm commitment to get it going.

So what implications will that have for users?

As a venerable road warrior I live out of my laptop. Like most mobile workers I have everything I need in the laptop, and back up regularly when I get near a link that’s fast enough not to bore me to death. Currently that means LAN. ADSL, which is the most common broadband technology in this country, is nowhere near fast enough to pump large amounts of data across in a short period of time.

But what if the internet link I’m using wherever I am in the country runs at LAN speeds?

I could work from home with corporate storage access speeds that rival my current LAN access speeds when I occasionally make it to the office. All my colleagues could also work from home. Organisations that have offices in multiple locations across the country would benefit dramatically. Imagine having only one server (or cluster) for your entire organisation, rather than a complete server environment (including backups etc) in each city. Now that will impact storage dramatically.

Cloud computing should benefit dramatically as well. Instead of sitting in the one organisation’s head office, that server could actually reside on a service provider’s network somewhere in the cloud, meaning every user, whether a road warrior or an office-bound worker, would have high-speed access to their data through the broadband network.

Fast data access, consistent and reliable backups, a common experience for all users regardless of location (in or out of the current corporate network), centralised storage and organisation of data, easy sharing of data across corporate employees … it all sounds too good to be true.

So what impact will it have on storage? Larger, faster, centralised repositories of data, a greater need for tiered hierarchies of data, increased reliance on disk2disk backups as a wider time-range of people access the same data (shrinking backup windows) … the implications are many and varied.

For the moment all I can really be sure of is that there’s going to be plenty of work for people willing to dig trenches and lay cables for a few years to come. Now that’s a thought … the wife is always telling me I need to get fitter :-)

As my daughter says … bring it on!

Ciao
Neil


There has to be a better way …

August 11, 2011

I, like many of the people who work in this industry, am “the guy who knows about computers” to my friends and family. This means I get the job of cleaning viruses, reinstalling Windows, and, much more sadly, trying to recover lost data – mainly family photos – cherished items gone into the ether with the crumbling surface of an old 200GB SATA hard drive.

It’s not really the hardware vendors’ fault, and I’m not having a go at hard disks here, but people now put their lives (their history, finances, memories and education) in the hands of mechanical devices that they purchased expecting them to be like the fridge … reliable, sturdy and long-lasting … even to the point where we, the consumers, get to decide when to send it to meet its maker … rather than having some electronic component force the decision upon us, often with catastrophic consequences.

So all this pain (thankfully at the moment not mine) caused me to think about my own home system. Yes, I back up my data (mostly sporting administration documents), but over in the corner is the “family” computer – which seems to have a life of its own because I can’t be bothered spending a great deal of time looking after it.

When I do get a pang of conscience, or the wife berates me because she can’t get the CD burner to work, I always find myself surprised at what I find on it (all good, of course). Now that the kids have grown up they mostly just do Facebook on the various netbooks lying around the house (purchased cheaply in China) – no data stored, just virtual communication somewhere up in the cloud – about which I don’t need to worry.

But on the home machine, thankfully in the well-ordered file structure I imposed on the children and wife from a very young age, are thousands (yes, literally thousands) of photos and videos – mostly of sporting moments good and bad. They show that my wife has spent countless hours beside a BMX track videoing the kids while I’ve toiled away at sports administration and watched them … recording their special moments in my grey matter, not on video tape, then memory stick, and finally flash memory in the many different kinds of cameras (still and video) that have cluttered my office over the years.

Hmmm … better back this lot up. A simple arrangement of a batch file and a NAS box handles that behind the scenes, with a quick check occasionally to make sure my precious DOS skills have not failed me amid the ever-changing command-line structure of various versions of Windows.
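
For the curious, the behind-the-scenes job amounts to something like the sketch below – written here in Python rather than my actual batch file, and with made-up paths:

```python
# Sketch of the nightly photo/video sync to the NAS.
# My real version is a DOS batch file; this is a rough Python equivalent.
# The source and destination paths are hypothetical.

import shutil
from pathlib import Path

SOURCE = Path(r"C:\Family\Photos")
DEST = Path(r"\\nasbox\backup\Photos")

for src in SOURCE.rglob("*"):
    if not src.is_file():
        continue
    dst = DEST / src.relative_to(SOURCE)
    # Copy only new or changed files, preserving timestamps.
    if not dst.exists() or src.stat().st_mtime > dst.stat().st_mtime:
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)
```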

But wait, it’s all in the same room. In fact it’s all in the same house … and with all the natural disasters I see inflicting themselves on the Australian landscape, I’m feeling that this solution is not quite good enough. I could of course put it all on a USB drive and drop it in at mum’s (being 30 or so kilometres away), but that then becomes difficult to manage – I’ll have to go and get it to add the next instalment of photos and videos – not that I mind going to see mum … it’s just another job to do (the photo management, that is).

So what else can I do? Why not the cloud? I hear nothing but the dreaded “cloud” these days, so I may as well jump on the bandwagon. A small investment later and I’ve purchased myself a piece of virtual real estate in a land far, far away (for all I know) … I feel very proud of my smart 21st-century solution to my dilemma.

But now the fun begins. Most of the videos are over 3GB in size (my wife knows how to take videos but not how to adjust a camera’s resolution). Hmmm, can’t upload more than a 1GB file. OK, that’s fixable – WinRAR to the rescue, compressing folders into CD-ROM-sized segments of 700MB each. OK, I now only have about 130 of these to upload to the ether to ease my concerns about any potential data loss (and the consequent divorce). Easy – let’s just sit down one night, select a certain number of files and push them up to the cloud via the nice little interface my provider gives me.
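
WinRAR does the heavy lifting, but the chunking idea itself is simple. Here’s the same splitting sketched in Python (the file name is hypothetical):

```python
# Split a large video into 700MB chunks - the same idea as WinRAR's
# volume option, minus the compression. The file name is hypothetical.

CHUNK = 700 * 1024 * 1024  # 700MB, the classic CD-ROM size

def split_file(path):
    with open(path, "rb") as src:
        part = 0
        while True:
            data = src.read(CHUNK)  # one whole chunk per part (fine for a sketch)
            if not data:
                break
            part += 1
            with open(f"{path}.part{part:03d}", "wb") as dst:
                dst.write(data)

split_file("bmx_finals.avi")
```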

As you can imagine, the job is not finished yet. In fact it’s turning into a whole new dimension of problems. Had to increase the upload/download limit and speed on the internet connection so this would finish before next year’s lot had to go up, then had to sit and manage pushing large files up to the cloud over an excruciatingly slow connection. Here in Australia we use ADSL – fast to bring stuff down to me, but hopeless when it comes to pushing stuff up to the cloud.

So the whole grand idea has devolved into a long, time-consuming, miserable task that never seems to end.

There has to be a better way.

Ciao
Neil


The perils of quick init …

August 11, 2011

When building an array using Adaptec RAID cards (hardware RAID 5, that is), you have various build options … clear, build/verify, quick init and (on newer firmware) skip init.

These options give you choices in how you build your RAID 5 array. While choice is good, and the marketing team love to spout the line “flexible initialisation options” (because it’s a sexy thing to put on tech sheets), there are issues that users should be aware of when using these options.

As a tech I don’t like giving users all these options. At least, not visible on the main screen of the array-build process. Personally I’d like to hide a couple of them from the user and force them to do things in what I regard as the “correct” manner.

My pet hate is “Quick Init”. Skip Init is even worse, but we give plenty of warnings that you should not be using that option unless the sky is falling and you are talking to an Adaptec Support professional. Quick Init, however, is a favourite amongst system builders because it sounds like a good idea.

System builders are always in a hurry. I find this a little frustrating, as they are generally building a server that is destined for many years of service … and they want reliability, performance and flexibility during those years. Yet they are not willing (in general) to take the time to do things properly in the first place (a personal opinion that is bound to slight many builders, but what the heck).

Did Michelangelo rush the painting of the Sistine Chapel? No. He took his time and got it right the first time. The result of his initial care was a product that lasted for more than just a few years and has stood the test of time quite well.

So back to building RAID 5 arrays. When you use Quick Init you effectively lay out the array structure on the disks, but do not create parity across the stripes. You also, without most people knowing, lock the array into what we call “full stripe writes”. This means that the entire stripe is written in one hit each time anything is written to that stripe. Whether it be a small amount of data or a large sequential write, the whole stripe is written (with parity calculated for that stripe in the process). This causes a major performance hit for small writes. While RAID 5 is not fantastically good at small writes in the first place, it is very, very poor at doing them in full-stripe-write mode.
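
A worked example makes the penalty concrete. For a single small (one-chunk) write, here’s how the disk I/O counts compare – the array geometry is an assumption for illustration:

```python
# Disk I/Os needed for one small (single-chunk) write on an N-drive RAID 5.
# The geometry is an assumption for illustration.

N = 8  # drives in the array

# Normal read-modify-write: read old data chunk + old parity,
# then write new data chunk + new parity.
rmw_ios = 2 + 2

# Forced full-stripe write: read the other N-2 data chunks to rebuild
# the stripe, then write N-1 data chunks plus 1 parity chunk.
fsw_ios = (N - 2) + N

print(f"Read-modify-write: {rmw_ios} disk I/Os")   # 4
print(f"Full-stripe write: {fsw_ios} disk I/Os")   # 14 on an 8-drive array
```

That’s 14 disk I/Os instead of 4 for every small write on an 8-drive array, and it only gets worse as you add drives.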

We can go into more technical detail at a later date if anyone is interested, but the moral of this story is … use CLEAR or BUILD/VERIFY (clear is my favourite) when building your array. It will take a bit longer to build your system, but like Michelangelo you will create a product that performs correctly from day one, not one that runs like a dog for your customer. Of course there are many builders out there who either don’t know or don’t care, and just foist the box on the customer as quickly as they can, but for those who are interested in doing things correctly … don’t use Quick Init.

Here endeth the ancient painting (and RAID 5) lesson.

Ciao
Neil


The confusing case of cache protection …

August 11, 2011

Once upon a time we had batteries to protect the cache on a RAID card. Protecting the cache on the card is an important issue, especially in enterprise servers. When the OS writes data to the disks, it actually goes to the RAID card. If write cache is turned on, the card takes the data into its cache and reports back to the OS that the data has been written. The card then writes the data out to the disks when they are available (usually only a short time later).

The end result of all this is that there is almost always data in the cache that the OS thinks has been written to disk, but which is still waiting to be written. If the power goes out at that point, the data will be lost, because RAID card cache is DDR (it needs power to retain its contents).

The old-school way of handling this was to put a lithium-ion battery on the RAID card to keep power to the DDR. The battery would keep the data alive for as long as it had charge.

Quite some time ago Adaptec decided there was a better way of handling this. For lots of reasons, batteries are a pain in the proverbial. They go flat, they don’t last, they have limited shelf-life, short warranties and long charge times, but they were all we had, so everyone told you they were a good thing.

So, we developed what we call ZMCP, or “Zero Maintenance Cache Protection”. This basically comes in the form of a small circuit board attached to the card and a tethered supercapacitor. The daughter board has 4GB of NAND flash onboard. Basically, when the power goes out, the data in the DDR is copied to the NAND flash, where it is safe for years (the power to do this is provided by the supercap). When the system comes back up, the data is copied from the NAND flash to the drives where it was meant to go.
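
Reduced to its essence, the flow looks like this conceptual sketch – a model of the idea only, since the real behaviour lives in the controller firmware:

```python
# Conceptual sketch of write-back cache plus ZMCP-style protection.

class RaidCache:
    def __init__(self):
        self.ddr = {}    # volatile write-back cache
        self.nand = {}   # non-volatile flash on the daughter board
        self.disk = {}

    def os_write(self, lba, data):
        # Data lands in DDR and the OS is immediately told "written".
        self.ddr[lba] = data
        return "ack"

    def flush(self):
        # Background: cached writes drain to disk when drives are free.
        self.disk.update(self.ddr)
        self.ddr.clear()

    def power_loss(self):
        # The supercap keeps the card alive just long enough to copy
        # the DDR contents into NAND, where they are safe for years.
        self.nand.update(self.ddr)
        self.ddr.clear()

    def power_restore(self):
        # On restart, the preserved writes are replayed to the drives.
        self.disk.update(self.nand)
        self.nand.clear()
```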

Now this post is not about that technology, but more about how we have marketed and implemented it. This technology was developed when the 5 series controller was our flagship, so we made a new model of card (due to connector constraints) and called it a “Z” card … in other words, the 5 series controller has a “Z” on the end (e.g. 5805Z).

Then along came the 6 series. Adaptec made a conscious decision that ZMCP was a better way to go than batteries for lots of reasons (number one being it has a 3-year warranty compared to 1 year for batteries) and decided that this technology would be the only option we offer for cache protection on our 6 series cards.

Therefore, if all cards use this ZMCP technology, there is no need to put a “Z” on the end of the model name, right? Logically you’d think so. So you look at the 6 series product lineup and you don’t find a “Z” anywhere. If you look at the pricelists you won’t find a battery anywhere either. What you will find is a thing called an AFM600 (which stands for Adaptec Flash Module). This is the ZMCP cache protection that fits natively on every 6 series card.

Sounds simple to me (or at least it did to our marketing people), but the world doesn’t get it.

Therefore the bottom line is …

If you have an Adaptec 6 series RAID controller and want to protect the cache, there is no battery option.
The only option you have is to put an AFM600 on the card, which gives you our “Zero Maintenance Cache Protection”.

Confused? If not great, if yes then join the rest of the world :-)

Ciao
Neil


Stumbling around Adaptec …

August 11, 2011

(in the BIOS, that is) …

I was madly making RAID arrays the other day to do some testing, when a message popped up on the screen … “The selected configuration allows for the creation of a logical device with Enclosure Level Redundancy. This will override any second-level devices selection that you have made. Do you want to configure Enclosure Level Redundancy? Y/N”

Now I have a bad habit of just ignoring pop-ups (which causes me some pain occasionally), but this one had me intrigued. Either I was asleep during some training session (not uncommon) or someone left me out of the loop.

So what does this mean (the message, not the sleeping bit)? I said yes and nothing exciting happened (very disappointing). That really got me intrigued, so I looked at the properties of the array I had created.

My old Supermicro 815TQ is an 8-drive 2U system. Even though it looks like one backplane, it’s actually treated by Supermicro (and our cards) as two backplanes … a row of 4 drives above a row of 4 drives, in separate backplanes. Since I was making a RAID 10 using all 8 drives, the card saw something I had not considered (and did not have control of anyway) … that it could build each mirror pair in my RAID 10 with one drive on each backplane.

Simply put, if one backplane fell over, the system would keep running. The card is smart enough to see an opportunity to add an extra level of protection, simply because of the configuration of my system and the RAID level I was using. Cool!
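
Conceptually, the controller is doing something like this when laying out the RAID 10 (a sketch of the idea, not the actual firmware algorithm):

```python
# Sketch of enclosure-level redundancy on an 8-drive RAID 10: pair each
# mirror across the two backplanes, so losing an entire backplane still
# leaves one healthy side of every mirror.

backplane_top = ["slot0", "slot1", "slot2", "slot3"]
backplane_bottom = ["slot4", "slot5", "slot6", "slot7"]

mirror_pairs = list(zip(backplane_top, backplane_bottom))
print(mirror_pairs)
# [('slot0', 'slot4'), ('slot1', 'slot5'), ('slot2', 'slot6'), ('slot3', 'slot7')]
```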

The morals of the story are many … read pop-ups, stay awake during engineering briefings, and when prompted to do “enclosure level redundancy” … do it.

Ciao
Neil
