
The case for RAID

Started by Grauniad, January 15, 2014, 01:56:54 PM


Grauniad

My home server failed and it made me want to look at suitable replacement options.

I did manage to recover my server, but in the process I found out that it has been unsupported since August of 2013. So I have to look into replacement alternatives anyway.

For now the really simple solution is to just upgrade my home server from the old version of Windows Home Server to the 2011 version. For $55, that gets me to a supported platform until sometime in 2016. Good enough. Since I have about 3TB worth of space on that, including all our family photos (irreplaceable), music, random data and backups of all our in-home computers (even some we don't own anymore), I will soon have to investigate expansion. A total review of how we store and back up our data is probably appropriate. That may be an interesting topic for another post.

However, in evaluating alternatives to my server, I got to look at NAS (Network Attached Storage) units. These units are essentially single-function computers that attach to one's network router and are accessible by all devices on the internal network, as well as optionally through the Internet.

However, they're quite pricey - the Synology unit I'm looking at hovers around $500 - and that's without hard drives!

One of the "big things" on a NAS is that they support RAID. Now I know the theory of RAID, and have known it for more years than I care to mention. But over time the mists of forgetfulness set in, and then I have to go on a refresher course to investigate anew and to keep myself from being led astray by marketing buzzwords.

So, briefly: What is RAID and why would I want it?

The TL;DR take-away is that I'd *never* want it in my home setup.

The elevator pitch for why I don't want it: It's expensive, complicated, error-prone and it is intended to boost reliability and availability of data - none of those are critical factors for a NAS that serves as a large data store of important, but not time-critical data. The caveat is that NAS (as I intend to use it) and RAID are not backup mechanisms - one still has to back the data up, and the backup *has* to be in a different location than the data.

If you want to read up on RAID, Wikipedia has very good coverage of the various RAID levels, from RAID 0 to RAID 6, and then the hybrid (or nested) RAID 01, 10, 50 and 60.

Also, here's an excellent summary from a Tom's Hardware contributor: http://www.tomshardware.com/forum/257159-32-raid-substitute-backup

All that is left now is to grit my teeth, defer purchase of my SteamOS PC and pay the $600 to get myself a Synology 4-bay NAS station and a 4TB NAS drive to go into it.  Once that is installed and up and running, I'll start migrating the data off my Home Server and free up the hard drives there to install into the Synology - as a JBOD array! :)
A goodnight to all and to all a good night - Goodnight Moon

asmussen

You don't need to buy a dedicated NAS unit to take advantage of RAID. You can do both hardware and software RAID on a regular Windows box (or, in my case, on the Linux server I use as my file server here at home). If you're willing to live with software RAID, the only real expense is the cost of the drives themselves. As for why you'd want it, it gives you redundancy for your important data, so that you can have a drive fail and still not lose any of your data. If you have data that you really don't want to lose, then going with some sort of RAID, or using a good backup solution (or even both), is a necessity.

I used to run a 4 drive RAID 5 setup, but when drives got bigger, I converted to a 2 disk RAID 1. I've been running this way for probably close to 15 years, and having RAID has saved my data more than once when a drive failed. Although the hardware has been upgraded and replaced over the years, I'm still running the original filesystem I set up all those years ago. I've migrated it onto new disks, expanded it, upgraded from ext2 to ext3, and then to ext4, etc., but the only time I backed up the data and rebuilt the filesystem from scratch was when I converted from RAID 5 to RAID 1. Also, it serves the same filesystem to my other Linux boxes and OS X box via NFS and to my Windows systems via CIFS (with Samba).
Shawn Asmussen

Grauniad

As I said (or intended to say), RAID is good for redundant data and for high or continuous availability of data. But it is not a backup. If your house gets consumed in a fire (heaven forbid - but it did happen to a guy a few houses down the street once during a thunderstorm), or if burglars break in and carry off all your electronics, your data will be lost unless you have it backed up somewhere.

Sure, if you're a developer hammering out 2 KLOC a day, you wouldn't want to lose that day's output (or the time it takes to get a replacement drive in there), and RAID is ideal for that.

For my situation, why would I want the extra drives in a RAID when I put relatively stable data on there and back it up to a remote location within hours of placing it there? My data cost on Amazon Glacier is under $5/month for the slow backup I have there.

If I buy the dedicated Synology NAS, I get all the "goodies" that Synology offers, I don't have to re-learn Linux (which I've largely forgotten again - it happens when one doesn't use knowledge) and I can stick up to 16TB in there. Should last me a good long time. 

Contrast that with the cost of building the Linux box (and then having to maintain it) and the price differential isn't that big.

If you want to make a thread extolling the virtues of Linux and FreeNAS, please do so. :)

knucracker

The case for raid 1 or raid 5 is recovery time... which you basically said.  Naturally for anything really important you need off site storage.  Given that you have off site storage, the questions become: how long does it take to restore from that off site location, do you really have everything there, and how long can you tolerate waiting for the data to be restored?

In many cases (like mine), I don't have every bit of data in off site storage.  I have the really important stuff there.  My source tree, pictures and movies of my better half and little boy.... things that have great meaning or financial worth.  Some things I don't store off site, though, because they take up too much space and a 5 Mbps upload won't cut it.  I can carry on without these things, but given a choice I'd rather keep them.
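
To put a rough number on why a 5 Mbps upload "won't cut it", here is a minimal Python sketch. The 2 TB size is an assumption picked purely for illustration, the helper name transfer_hours is made up, decimal units are assumed, and protocol overhead is ignored, so a real transfer would be somewhat slower.

```python
def transfer_hours(size_tb: float, link_mbps: float) -> float:
    """Hours needed to move size_tb terabytes over a link of link_mbps megabits/s."""
    bits = size_tb * 1e12 * 8            # decimal terabytes -> bits
    seconds = bits / (link_mbps * 1e6)   # megabits per second -> bits per second
    return seconds / 3600


# Assumed example: 2 TB pushed through a 5 Mbps uplink.
hours = transfer_hours(2, 5)
print(f"{hours:.0f} hours, about {hours / 24:.0f} days")  # 889 hours, about 37 days
```

The same arithmetic applies in the other direction: the restore time from off site storage is set by the download speed instead.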

In the case of a Synology box with 4 bays loaded with 2TB hard drives and no concern of local redundancy you would probably make each HD a separate volume and load them all up with everything you have over the course of time.  Eventually one of the hard drives will fail and you will lose the contents of that drive.  If the entire drive has been backed up offsite and you can recover the 2TB quickly enough for your satisfaction, then everything is fine.  You'll just replace the HD with the 8TB model that is available at that time and carry on restoring from off site backup.

Here's an alternate scenario with raid 5.  You only get 3 of the hard drives' worth of storage.  The 4th drive's worth goes to parity (strictly speaking, raid 5 spreads the parity across all four drives rather than keeping it on one dedicated drive).  The entire collection of drives looks like one volume of 6TB, assuming the same four 2TB drives.  You load it up full of data and eventually one of the drives fails.  When that happens, you need to replace the drive.  When you do, nothing will have been lost.  It will take the box anywhere from minutes to hours to rebuild the lost drive, but no restoration from off site storage will be required.  For the fastest recovery, you just have that extra hard drive sitting on the shelf waiting for this purpose.
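
The reason nothing is lost comes down to parity. As a rough illustration, assuming the simplest textbook form of raid 5 parity (a byte-wise XOR across the data blocks in a stripe), the Python sketch below rebuilds one lost block from the survivors. The block contents and the helper name xor_blocks are made up for the example; a real array does this stripe by stripe across whole drives and rotates which drive holds the parity.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Made-up data blocks, one per data drive in a single stripe.
data = [b"AAAA", b"BBBB", b"CCCC"]

# Parity block for this stripe, stored on the remaining drive.
parity = xor_blocks(data)

# Pretend the drive holding data[1] died: XOR the survivors with the parity.
rebuilt = xor_blocks([data[0], data[2], parity])

print(rebuilt == data[1])  # True
```

Losing a second drive before the rebuild finishes leaves too little information to do this, which is one more reason the off site copy still matters.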

In both models, you need off site storage for things you must keep.  That's a given for both models.  That covers the < 1% case of fire, theft, natural disaster.  The 99% failure case is that a hard drive just dies one day and the heads start hitting the side of the metal case like a prisoner in solitary trying to get out.  When that happens (not if), the question to think about is what procedure you want to go through to restore the data.

Now I'll grant you this, and I think it is a good argument against raid 5 setups.  Chances are your hard drives will last around 5 years or so.  In 5 years you won't be able to buy the same model hard drives nor would you want to.  When one hard drive fails you will need to replace it but you will be replacing it with some newer thing.  But you won't be able to take advantage of the increased space unless you replace every drive.  And of course replacing every drive is something you might want to do because well... they will be failing soon anyway.
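
A quick way to see the wasted space, as a sketch only: in a classic raid 5 array every member is treated as if it were the size of the smallest drive, and one drive's worth of capacity goes to parity. The function name raid5_usable_tb and the drive sizes below are assumptions for illustration, not figures from any particular NAS.

```python
def raid5_usable_tb(drive_sizes_tb):
    """Usable capacity of a classic RAID 5 array built from the given drives."""
    if len(drive_sizes_tb) < 3:
        raise ValueError("RAID 5 needs at least three drives")
    # Every member counts as the smallest drive; one drive's worth is parity.
    return (len(drive_sizes_tb) - 1) * min(drive_sizes_tb)


print(raid5_usable_tb([2, 2, 2, 2]))  # 6
print(raid5_usable_tb([2, 2, 2, 8]))  # 6, the extra 6TB on the new drive sits idle
print(raid5_usable_tb([8, 8, 8, 8]))  # 24, only after every drive is replaced
```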

In my case I have a 2 bay Synology NAS.  It runs raid 1.  The sole reason it does that is recovery convenience and not everything on it is in offsite storage (though maybe I should start working on that :)).  When one of the drives fails, I can still read the data from the NAS since there is another drive in the box that is a mirror.  The box will beep and have red lights all over it, but I will be able to read the data.  My plan when that happens?  Copy everything off to a temporary storage location, then replace the failed drive AND its sibling (even though it is still working) with new hard drives.  I will then recreate a new raid 1 setup using the new and no doubt larger hard drives, then copy stuff back on to them.

In the case of everything being backed up off site and being able to be recovered quickly enough, raid 1 wouldn't really be necessary to accomplish the same thing.

Flabort

While a bit off topic, I see talk about RAID.
I'm curious as to how a RAID 6+1 system might look (I know it's not commonly used) or work. In theory. Or 1+6.

Also, in theory, if you had a drive that could consider parts of itself to be separate drives, for the purposes of storage, and then use RAID protocols in those "drives", what would this, er, "internal RAID" actually be called?

I'm just curious about the theoretical ideas. They just have to do with a theoretical drive that will (probably) never see the light of day, since I've got no way to actually MAKE it...