(though I think I’ve been called that a few times in my youth).
A colleague sent me a link to a report on the web recently, and while I found it mildly interesting, I think the writers may have missed part of the point (just a little). In general everything they say is correct, but there is one factor they haven’t taken into account … “stuff”.
So what is “stuff”?
Well, I have a lot of “stuff” on my laptop. There is “stuff” on CDs lying around the place, “stuff” on my NAS and “stuff” in the Mac on the other end of the desk. To me “stuff” is old data. I rarely, if ever, use it, but I sure as heck want it kept close and immediately accessible. In my business my old data is my historical library and a great backup to my slowly fading memory.
So what is living out in datacenter land? Lots of useful information, and lots and lots of “stuff”. It has become evident when dealing with users over the past decade that people are reluctant, if not downright unwilling, to remove, delete, consolidate or even manage old data – they just want it kept forever, “just in case”.
So while there are strategies out there to minimize its footprint, there is no strategy out there for changing people’s mindsets on how long they keep data. So datacenter land is, and always will be, awash with “stuff” … which means more and more “comatose” storage. I don’t disagree with the web link article on server compute – that needs to be managed and centralized into newer, faster and more power-efficient servers. It’s just the data (storage) side of the equation that I have issues with.
If we take it as a given that no one is going to delete anything, then what do datacenters do about it? Larger and larger storage devices are coming out all the time (e.g. high-density storage boxes using 10TB disks), but while one of these can hold the capacity of probably ten old storage boxes, the datacenter is faced with moving all of the data off the old storage onto the new storage just to free the old boxes up. By the time a large datacenter gets through that little process, 10TB drives will have been replaced by 20TB drives and the whole thing starts over again – meaning datacenters will have data in motion almost continuously, with tremendous management, network and cost overheads to go along with it … exactly the sort of thing that datacenter operators don’t want to hear about.
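To put rough numbers on that migration treadmill, here is a back-of-envelope sketch. Every figure in it (box count, capacities, sustained transfer rate) is an illustrative assumption of mine, not a measurement from any real datacenter:

```python
# Back-of-envelope sketch: how long does it take to drain ten old
# storage boxes onto one new high-density box? All numbers below are
# illustrative assumptions, not vendor figures.

TB = 1e12  # bytes

old_boxes = 10
capacity_per_old_box_tb = 100  # assumed usable data per old box
total_bytes = old_boxes * capacity_per_old_box_tb * TB

# Assume the migration sustains 2 Gbit/s end to end (source reads,
# network, and destination writes all taken together).
sustained_gbit_per_s = 2
bytes_per_s = sustained_gbit_per_s * 1e9 / 8

seconds = total_bytes / bytes_per_s
days = seconds / 86400
print(f"{old_boxes * capacity_per_old_box_tb} TB at "
      f"{sustained_gbit_per_s} Gbit/s ≈ {days:.0f} days of continuous copying")
# → 1000 TB at 2 Gbit/s ≈ 46 days of continuous copying
```

A month and a half of continuous copying per consolidation round is in the same ballpark as the time between drive generations, which is the treadmill effect described above.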
I’m guessing that datacenter operators are looking at exactly this issue and are crunching numbers. Is it cheaper for us to keep upgrading our storage to handle all this “stuff”, with all of the management complications etc., or do we just buy more storage and keep filling it up without having to touch the old data? Or do we do both, and just try to keep costs down everywhere while doing so?
It would be very, very interesting to know how the spreadsheet crunches that little equation.
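A crude version of that spreadsheet might look like the sketch below. Every number in it (hardware prices, power costs, migration cost, time horizon) is a placeholder assumption chosen purely to show the shape of the comparison, not real pricing:

```python
# Toy cost comparison: consolidate onto one new dense box (and pay the
# migration) vs. leave old data where it is and just buy more capacity.
# All dollar figures are placeholder assumptions for illustration only.

def consolidate_cost(new_box_price, migration_cost, annual_power_new, years):
    """Buy one dense box, migrate everything, retire the old fleet."""
    return new_box_price + migration_cost + annual_power_new * years

def sprawl_cost(extra_box_price, annual_power_old_fleet, years):
    """Keep the old boxes powered up and add a box for new growth."""
    return extra_box_price + annual_power_old_fleet * years

years = 5
a = consolidate_cost(new_box_price=80_000, migration_cost=30_000,
                     annual_power_new=5_000, years=years)
b = sprawl_cost(extra_box_price=40_000,
                annual_power_old_fleet=25_000, years=years)
print(f"Consolidate: ${a:,}  vs  keep sprawling: ${b:,}")
# → Consolidate: $135,000  vs  keep sprawling: $165,000
```

With these made-up inputs, consolidation wins on power costs over five years but loses on a shorter horizon – which is exactly the sensitivity that makes the real spreadsheet interesting.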
“Stuff” for thought.