May 13, 2015

Been getting a few questions regarding building RAID arrays recently, and thought it warranted putting something down on paper.

Now I’m talking about RAID 5/6 and other redundant arrays (not RAID 0 – that’s not for “real” data imho). The questions are whether it is possible to restart a server during a RAID build or rebuild, and what happens when a drive fails during that process. So let’s take a look at exactly what our controllers actually do in these situations.

RAID Build

If you are building a redundant array using either the clear or build/verify method, then yes, you can power down the server (or the power can go out by any other means) and it won’t hurt your process. We continue building from where we left off, so if the build process gets to 50% and you need to reboot your server, then no worries, it will just continue to build from where it left off – it does not go back to the start again.
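To make the "continue from where it left off" idea concrete, here is a minimal Python sketch. It is not how any controller firmware actually works: the function name build_array, the JSON checkpoint file and the stop_after parameter (which simulates a power cut) are all assumptions made up for illustration. The only point is that progress is recorded as the build goes, so a restart picks up at the recorded position instead of at zero.

```python
import json
import os


def build_array(total_stripes, checkpoint_path, stop_after=None):
    """Initialise stripes in order, resuming from the last saved checkpoint.

    Returns (first_stripe_this_run, next_stripe_to_do).
    """
    start = 0
    if os.path.exists(checkpoint_path):
        # A previous run was interrupted: pick up where it stopped.
        with open(checkpoint_path) as f:
            start = json.load(f)["next_stripe"]

    done = 0
    for stripe in range(start, total_stripes):
        if stop_after is not None and done == stop_after:
            return start, stripe  # simulated power loss (hypothetical)
        # Real stripe initialisation would happen here.
        with open(checkpoint_path, "w") as f:
            json.dump({"next_stripe": stripe + 1}, f)
        done += 1
    return start, total_stripes
```

Calling build_array(100, path, stop_after=50) and then build_array(100, path) against the same checkpoint file makes the second run start at stripe 50 rather than stripe 0.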

If a drive fails during the build process of say a RAID 5, then no worries, the build will continue. When it’s finished, the array will be in degraded mode, and you’ll have to replace the drive, but that’s the normal process. Again, even if a drive fails, you can power down and restart the server during the build process and it will resume building from the point it left off, still finishing in a degraded array that needs fixing.

And if you think drives don’t fail during RAID builds, then think again … I’ve had it happen more than once.

RAID Rebuild

What happens when a drive fails during a rebuild is a bit dependent on the RAID type. Let’s take the example of a RAID 5. A drive fails, so you replace it and the controller starts a rebuild. During that process, another drive fails. You are toast. There is not enough data left for us to calculate the missing data from the first drive failure because now you are missing 2 drives in a RAID 5 and that’s fatal. You need to fix your drives, build a new array and restore from backup.
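If you want to see why the second failure is fatal, here is a small Python sketch of RAID 5 style XOR parity on a single stripe. The 4-drive layout, the block contents and the helper name xor_blocks are made up for illustration, and real controllers rotate parity across drives and work on much larger blocks, but the arithmetic is the same idea.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte position by byte position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe on a hypothetical 4-drive RAID 5: three data blocks plus parity.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# One drive missing (data[1]): XOR the survivors with parity to get it back.
recovered = xor_blocks([data[0], data[2], parity])

# Two drives missing (data[1] and data[2]): the survivors only give the XOR
# of the two missing blocks, which cannot be split back into the originals.
combined = xor_blocks([data[0], parity])
```

Here recovered matches the lost block exactly, while combined only equals the XOR of the two lost blocks, and many different pairs of blocks produce that same value. That is the "not enough data left" situation.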

In a RAID 6 environment it’s slightly different. RAID 6 can support 2 drive failures at the same time, so if a drive fails, you replace it and start a rebuild, and then another drive fails, no worries. The controller will continue to rebuild the array, but it will be impacted when finished because it’s still one drive short of a picnic. However, the data will be safe during this process, and you’ll just have to replace the second failed drive and let the array rebuild to completion.
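The RAID 5 versus RAID 6 difference boils down to a simple count. A rough Python sketch, where the table and the function name array_survives are illustrative rather than anything from a controller API:

```python
# Number of simultaneous missing drives each level can absorb.
FAULT_TOLERANCE = {"RAID 0": 0, "RAID 5": 1, "RAID 6": 2}


def array_survives(level, missing_drives):
    """True if the data is still recoverable with this many drives missing.

    A replaced drive that is still mid-rebuild counts as missing, because
    its contents are not complete yet.
    """
    return missing_drives <= FAULT_TOLERANCE[level]


# A second drive fails while the first replacement is still rebuilding:
raid5_ok = array_survives("RAID 5", 2)  # False: restore from backup
raid6_ok = array_survives("RAID 6", 2)  # True: keep rebuilding
```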

Of course, like any of the above, you can power down and restart the server at any time during any of these processes and things will just continue on from where they left off.

Hope that answers a few questions.



