I have a 2 bay NAS, and I was planning on using 2x 18tb HDDs in raid 1. I was planning on purchasing 3 of these drives so when one fails I have the replacement. (I am aware that you should purchase at different times to reduce risk of them all failing at the same time)
Then I set up restic.
It makes backups so easy that I am wondering if I should even bother with raid.
Currently I have ~1TB of backups, and with restic’s snapshots, it won’t grow to be that big anyway.
Either way, I will be storing the backups in aws S3. So is it still worth it to use raid? (I also will be storing backups at my parents)
I always do some level of RAID. If for no other reason, I’m not out of commission if a disk fails. When you’re working with multi TB, restoring from a backup can take a while. If rapid recovery from a disk failure is not a high priority for you, then you could probably do without RAID.
Either way, make sure you test your backups occasionally.
Another way to put it: With RAID, a disk failure is like your Check Engine light coming on. You can still drive, but you should address the problem as soon as you can. Without RAID, it’s like your engine has seized up and you have to tow it for repair and are without your car until it’s fixed.
Hmm that’s a good point.
AWS can also cost a good chunk if you restore suboptimally.
Keep in mind that if you set up raid using zfs or btrfs (idk how it works with other systems but that’s what I’ve used) then you also get scrubs which detect and fix bit rot and unrecoverable read errors. Without that or a similar system, those errors will go undetected and your backup system will backup those corrupted files as well.
Personally one of the main reasons I used zfs and now btrfs with redundancy is to protect irreplaceable files (family memories and stuff) from those kinds of errors, as I used to just keep stuff on a hard drive until I discovered loads of my irreplaceable vacation photos to be corrupted, including the backups which backed up the corruption.
If your files can be reacquired, then I don’t think it’s a big deal. But if they aren’t, then I think having scrubs or integrity checks with redundancy so that issues can be repaired, as well as backups with snapshots to prevent errors or mistakes from messing up your backups, is a necessity. But it just depends on how much you value your files.
Note that you do not need any sort of redundancy to detect corruption.
Redundancy only gains you the ability to have that corruption immediately and automatically repaired.
While this sounds nice in theory, you have no use for such auto repair if you have backups handy because you can simply restore that data manually using your backups in the 2 times in your lifetime that such corruption actually occurs.
(If you do not have backups handy, you should fix that before even thinking about RAID.)
It’s incredibly costly to have such redundancy at a disk level and you’re almost always better off using those resources on more backups instead if data security is your primary concern.
Downtime mitigation is another story but IMHO it’s hardly relevant for most home users.
Can you explain this to me better?
I need to work on my data storage solution, and I knew about bit rot but thought the only solution was something like a zfs pool.
How do I go about manually detecting bit rot? Assuming I had perfect backups to replace the rotted files.
Is a zfs pool really that inefficient space wise?
Sure :)
I knew about bit rot but thought the only solution was something like a zfs pool.
Right. There are other ways of doing this, but a checksumming filesystem such as ZFS or btrfs (or bcachefs if you’re feeling adventurous) is the best way to do that generically and can also be used in combination with other methods.
What you generally need in order to detect corruption on an abstract level is some sort of “integrity record” which can determine whether some set of data is in an expected state or an unexpected state. The difficulty here is to keep that record up to date with the actually expected changes to the data.
The filesystem sits at a very good place to implement this because it handles all such “expected changes”; executing those on behalf of the running processes is its purpose.
Filesystems like ZFS and btrfs implement this integrity record in the form of hashes of smaller portions of each file’s data (“extents”). The hash for each extent is stored in the filesystem metadata. When any part of a file is read, the extents that make up that part of the file are each hashed and the results are compared with the hashes stored in the metadata. If the hash is the same, all is good and the read succeeds, but if it doesn’t match, the read fails and the application reading that portion of the file gets an IO error that it needs to handle.
Note how there was never any second disk involved in this. You can do all of this on a single disk.
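To make the idea concrete, here is a minimal sketch of such an integrity record on a single "disk". It is a toy model, not how ZFS or btrfs are actually implemented: the class name `ChecksummedStore`, the 4096-byte block size and the use of SHA-256 are all assumptions made for illustration.

```python
import hashlib

BLOCK_SIZE = 4096  # illustrative block size, not taken from any real filesystem


class ChecksummedStore:
    """Toy single-disk store that keeps one hash per block of data."""

    def __init__(self):
        self.blocks = []     # raw data blocks (stands in for the disk)
        self.checksums = []  # one hash per block (stands in for the metadata)

    def write(self, data: bytes) -> None:
        # An expected change: data and integrity record are updated together.
        self.blocks = [
            bytearray(data[i:i + BLOCK_SIZE])
            for i in range(0, len(data), BLOCK_SIZE)
        ]
        self.checksums = [hashlib.sha256(b).digest() for b in self.blocks]

    def read(self) -> bytes:
        out = bytearray()
        for index, (block, expected) in enumerate(zip(self.blocks, self.checksums)):
            # Re-hash on every read and compare against the stored record.
            if hashlib.sha256(block).digest() != expected:
                raise OSError(f"checksum mismatch in block {index}")
            out += block
        return bytes(out)
```

Flipping a single bit in `store.blocks[1]` behind the store's back makes the next `read()` raise an error naming block 1. No second copy of the data is needed to notice the problem, only to repair it.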
Now to your next question:
How do I go about manually detecting bit rot?
In order to detect whether any given file is corrupted, you simply read back that file’s content. If you get an error due to a hash mismatch, it’s bad, if you don’t, it’s good. It’s quite simple really.
You can then simply expand that process to all the files in your filesystem to see whether any of them have gotten corrupted. You could do this manually by just reading every file in your filesystem once and reporting errors but those filesystems usually provide a ready-made tool for that with tighter integrations in the filesystem code. The conventional name for this process is to “scrub”.
How do I go about manually detecting bit rot? Assuming I had perfect backups to replace the rotted files.
You let the filesystem-specific scrub run and it will report every file that contains corrupted data.
Now that you know which files are corrupted, you simply replace those files from your backup.
Done; no more corrupted files.
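The "read every file once and report errors" approach could look roughly like the sketch below. It assumes the files live on a checksumming filesystem that turns a hash mismatch into an IO error on read. The function name `read_all_files` is made up for this example, and a real scrub tool is still preferable because, among other things, a plain read may be served from cache instead of the disk.

```python
import os


def read_all_files(root: str, chunk_size: int = 1 << 20):
    """Read every file under root once; return (files_read_ok, failures)."""
    checked = 0
    failed = []  # list of (path, error message)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as handle:
                    # Read and discard the content; we only care whether
                    # the filesystem reports an error while reading.
                    while handle.read(chunk_size):
                        pass
                checked += 1
            except OSError as error:
                # Also catches unrelated problems such as permission errors
                # or dangling symlinks, so inspect the message before restoring.
                failed.append((path, str(error)))
    return checked, failed
```

Every path in the second return value is a candidate for restoring from backup.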
Is a zfs pool really that inefficient space wise?
Not a ZFS pool per-se but redundant RAID in general. And by “incredibly costly” I mean costly for the purpose of immediately restoring data rather than doing it manually.
There actually are use-cases for automatic immediate repair but, in a home lab setting, it’s usually totally acceptable for e.g. a service to be down for a few hours until you e.g. get back from work to restore some file from backup.
It should also be noted that corruption is exceedingly rare. You will encounter it at some point, which is why you should protect yourself against it, but it’s not like this will happen every few months; it’s closer to once every few decades.
To answer your original question directly: No, ZFS pools themselves are not inefficient as they can also be used on a single disk or in a non-redundant striping manner (similar to RAID0). They’re just the abstraction layer at which you have the choice of whether to make use of redundancy or not and it’s redundancy that can be wasteful depending on your purpose.
Thanks for the write-up!
I see now I was conflating zfs with RAID in general. It makes sense that you could have the benefits of a checksumming filesystem without the need for RAID, by simply restoring from backups.
This is a great start for me to finally get some local backups going.
@Atemu @beastlykings Every few decades seems optimistic. I have an archive of photos/videos from cameras and phones spanning from early 2000s to mid-2010s. There’s not a lot, maybe 6gb; a few thousand files. At some point around the end of that time period, I noticed corruption in some random photos.
Likewise, I have a (3tb) flac archive, which is about 15-20 years old. Nightly ‘flac -t’ checks are done on 1/60th of the archive, essentially a scrub. Bitrot has struck a dozen times so far.
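A rotating check like this can be sketched as follows. This is only a guess at how such a schedule might be built, not the poster's actual script: the helper `nightly_slice` is hypothetical, and it only picks tonight's files. Each selected file would then be handed to a verifier such as `flac -t`.

```python
import zlib


def nightly_slice(paths, day_number, slices=60):
    """Return the paths that fall into today's 1/slices share of the archive."""
    today = day_number % slices
    # crc32 of the path gives a stable bucket, so a file is always checked
    # on the same day of the cycle, even as other files come and go.
    return [p for p in paths if zlib.crc32(p.encode("utf-8")) % slices == today]
```

Over 60 consecutive nights every file is selected exactly once, so the whole archive is verified once per cycle.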
Interesting. I suspect you must either have had really bad luck or be using faulty hardware.
In my broad summarising estimate, I only accounted for relatively modern disks like something made in the past 5 years or so. Drives from the 2000s or early 2010s could be significantly worse and I wouldn’t be surprised. It sounds to me like your experience was with drives that are well over a decade old at this point.
backups in the 2 times in your lifetime that such corruption actually occurs.
What are you even talking about here? This line invalidates everything else you’ve said.
I was thinking whether I should elaborate on this when I wrote the previous reply.
At the scale of most home users (~dozens of TiBs), corruption is actually quite unlikely to happen. It’ll happen maybe a handful of times in your lifetime if you’re unlucky.
Disk failure is actually also not all that likely (maybe once every decade or so, maybe) but still quite a bit more likely than corruption.
Just because it’s rare doesn’t mean it never happens or that you shouldn’t protect yourself against it though. You don’t want to be caught with your pants down when it does actually happen.
My primary point is however that backups are sufficient to protect against this hazard and also protect you against quite a few other hazards. There are many other such hazards and a hard drive failing isn’t even the most likely among them (that’d be user error).
If you care about data security first and foremost, you should therefore prioritise more backups over downtime mitigation technologies such as RAID.
RAID 1 is mirroring. If you accidentally delete a file, or it becomes corrupt (for reasons other than drive failure), RAID 1 will faithfully replicate that delete/corruption to both drives. RAID 1 only protects you from drive failure.
Implement backups before RAID. If you have an extra drive, use it for backups first.
There is only one case when it’s smart to use RAID on a machine with no backups, and that’s RAID 0 on a read-only server where the data is being replicated in from somewhere else. All other RAID levels only protect against drive failure, and not against the far more common causes of data loss: user- or application-caused data corruption.
I know it’s not totally relevant but I once convinced a company to run their log aggregators with 75 servers and 15 disks in raid0 each.
We relied on the app layer to make sure there were at least 3 copies of the data and if a node’s array shat the bed the rest of the cluster would heal and replicate what was lost. Once the DC people swapped the disk we had automation to rebuild the disks and add the host back into the cluster.
It was glorious - 75 servers each splitting the read/write operations 1/75th and then each server splitting that further between 15 disks. Each query had the potential to have ~1100 disks respond in concert, each with a tiny slice of the data you asked for. It was SO fast.
And that, kids, is a great use of RAID: under some other form of data redundancy.
Great story!
Big elk stack?
It depends on your uptime requirements.
According to Backblaze stats on similarly modern drives, you can expect about a 9% probability that at least one of those drives has died after 6 years. Assuming 1 week recovery time if any one of them dies, that’d be roughly 99.97% uptime (0.09 × 1 week of expected downtime over ~313 weeks).
If that’s too high of a probability for needing to run a (in case of AWS potentially very costly) restore, you should invest in RAID. Otherwise, that money is better spent on more backups.
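As a rough check of that kind of estimate, here is an expected-value calculation using the figures quoted above as inputs (9% failure probability over 6 years, 1 week of recovery). The function name `expected_uptime` and the simplification to at most one outage in the period are assumptions of this sketch.

```python
def expected_uptime(p_failure: float, period_days: float, recovery_days: float) -> float:
    """Expected fraction of the period the system is up, assuming at most one outage."""
    expected_downtime_days = p_failure * recovery_days
    return 1.0 - expected_downtime_days / period_days


uptime = expected_uptime(p_failure=0.09, period_days=6 * 365, recovery_days=7)
print(f"{uptime:.3%}")  # prints 99.971%
```

With these inputs the result is about 99.97%, or roughly two and a half hours of expected downtime per year.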
RAID is 100% about uptime, not backups. If you want less downtime then RAID is your friend.
Having said that, RAID in modern systems is broken and you should use ZFS instead: https://www.youtube.com/watch?v=l55GfAwa8RI
Raid 1 has saved my server a couple of times over from disaster. I make weekly cold backups, but I didn’t have to worry about it when my alert came in notifying me which drive went dead - just swap, rebuild, move along. So yeah I’d say it’s definitely worth it. Just don’t treat raid as a backup solution - and yes, continue to use an external cold storage backup solution as you mentioned. Fires, exploding power supplies, ransomware, etc don’t care if you’re using raid or not.
It is also useful to stop silent corruption
RAID means that if a drive fails you don’t have some downtime while your backups restore. It depends on how you feel about waiting for that.
I want my personal system down until it is back in proper condition tbh
Also it is easier to hit replace
It’s up to you. Things to consider:
- Size of data
- Recovery speed (Internet speed)
- Recovery time objective
- Recovery point objective (If you’re backing up once per day, is it okay to lose 23 hours of data when a disk fails?)
If your recovery objectives can be met with the anticipated data size and recovery speed, then you could do RAID 0 instead of RAID 1 to get higher speeds and capacity. Just know that if you do that, you better be on top of your backups because they will be needed eventually.
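Whether a restore fits your recovery time objective is mostly arithmetic on data size and link speed. A minimal sketch, assuming decimal terabytes, a link speed in Mbit/s and a made-up `efficiency` factor for protocol overhead (the function name `restore_hours` is hypothetical too):

```python
def restore_hours(data_tb: float, link_mbit_s: float, efficiency: float = 1.0) -> float:
    """Hours needed to download data_tb terabytes over a link_mbit_s link."""
    bits = data_tb * 1e12 * 8
    seconds = bits / (link_mbit_s * 1e6 * efficiency)
    return seconds / 3600


# 1 TB over a 100 Mbit/s line takes a bit over 22 hours at full line rate.
print(round(restore_hours(1, 100), 1))
```

If the result is larger than the downtime you are willing to accept, that is the case for RAID 1 (or a local backup copy) rather than relying on a remote restore alone.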
Yes yes yes yes yes
Raid1 that thing and sleep easier. Good on you for having a cold spare, and knowing to buy your drives at different locations/times to get different batches. Your head is in the right place! No reason to leave that data unprotected if you have the underlying tech and hardware.
I absolutely would, for a few reasons:
- restoring from backup is a last resort and involves downtime; swapping a disk is comparatively easier and less disruptive
- it’s possible your backup solution fails, so having some redundancy is always good
- read performance - not a major factor, but saturating a gigabit link is always nice
Read perf would be the same or better if you didn’t add redundancy as you’d obviously use RAID0.
RAID is never in any way something that can replace a backup. If the backup cannot be restored, you didn’t have a backup in the first place. Test your backups.
If you don’t trust 1 backup, you should make a second backup rather than using RAID.
The one and only thing RAID has going for it is minimising downtime. For most home use-cases though, the 3rd 9 which this would provide is hardly relevant IMHO.
Read perf would be the same or better if you didn’t add redundancy
RAID 1 can absolutely be faster than a single disk for read perf, and on Linux it is tuned to be faster. It’s not why you’d use it, but it is a feature of RAID. Intuitively, since both disks have exactly the same data, each disk could read different things. Likewise, for writes, you don’t have to write at the same time, as long as they’re always correct (e.g. don’t flip the metadata segment until both have written the data), so you can even get a write boost.
If performance is all you care about, then yeah, go ahead and use RAID 0. But you do get a performance boost with mirroring as well.
Yes, a backup should be tested, but it shouldn’t be relied on. Internet can go down, services can have maintenance, etc, so it’s a lot better to never need it. If you can afford a mirror, it’s worth having.
You’re missing the point entirely. I never said to use a single disk, I explicitly compared it to RAID0.
As far as data security is concerned, JBOD/linear combination and RAID0 are the same, so you’d obviously use RAID0 if you didn’t need redundancy.
No, JBOD is not the same as RAID0. With RAID0, you always need the disks in sync because reads need to alternate. With JBOD, as long as your reads are distributed, only one disk at a time needs to be active for a given read and you can benefit from simultaneous reads on different disks. RAID0 will probably give the biggest speedup in a single user scenario, whereas I’d expect JBOD to potentially outperform in a multiuser scenario assuming your OS and filesystem is tuned for it.
RAID0 is pretty much never the solution, and I’d much rather have JBOD than RAID0 in almost every scenario.
RAID1 gives you redundancy while preserving the ability for disks to independently seek, so on competent systems (e.g. Linux and BSD), you’ll get a performance speedup over a single disk and get something that rivals RAID0 in practice. You wouldn’t use it for performance because JBOD is probably just as fast in practice without the storage overhead penalty (again, assuming you properly distribute reads across disks), but you do get some performance benefits, which is nice.
JBOD is not the same as RAID0
As far as data security is concerned, JBOD/linear combination and RAID0 are the same
With RAID0, you always need the disks in sync because reads need to alternate. With JBOD, as long as your reads are distributed, only one disk at a time needs to be active for a given read and you can benefit from simultaneous reads on different disks
RAID0 will always have the performance characteristics of the slowest disk times the stripe width.
JBOD will have performance depending on the disk currently used. With sufficient load, it could theoretically max out all disks at once but that’s extremely unlikely and, with that kind of load, you’d necessarily have a queue so deep that latency shoots to the moon; resulting in an unusable system.
Most importantly of all however is that you cannot control which device is used. This means you cannot rely on getting better perf than the slowest device because, with any IO operation, you might just hit the slowest device instead of the more performant drives and there’s no way to predict which you’ll get.
It goes further too because any given application is unlikely to have a workload that even distributes over all disks. In a classical JBOD, you’d need a working set of data that is greater than the size of the individual disks (which is highly unlikely) or lots of fragmentation (you really don’t want that). This means the perf that you can actually rely on getting in a JBOD is the perf of the slowest disk, regardless of how many disks there are.
Perf of slowest disk * number of disks > Perf of slowest disk.
QED.
You also assume that disk speeds are somehow vastly different whereas in reality, most modern hard drives perform very similarly.
Also nobody in their right mind would design a system that groups together disks with vastly different performance characteristics when performance is of any importance.
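The comparison argued above can be written down as a tiny model. It only encodes the claims made in this reply (RAID0 scales with the slowest disk times the stripe width, JBOD can only be relied on to deliver the slowest disk). The function names and the MB/s figures are invented for illustration.

```python
def raid0_throughput(disk_mb_s):
    """Striped reads run at the pace of the slowest member, times the stripe width."""
    return min(disk_mb_s) * len(disk_mb_s)


def jbod_guaranteed_throughput(disk_mb_s):
    """Worst case for JBOD: the IO happens to land on the slowest disk."""
    return min(disk_mb_s)


disks = [250, 200]  # hypothetical sequential speeds in MB/s
print(raid0_throughput(disks), jbod_guaranteed_throughput(disks))  # prints 400 200
```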
Yes, I would still do raid. Because a disk fail will not cause a blackout. Much better than have your server offline waiting to replace disk and restore backup.
And there’s no way you can back up 18TB in 1TB, restic or no restic.
i was also thinking like this, then i had to restore everything from a backup when the ssd suddenly died. I wasted so much time setting everything back as before
If you needed to spend any time “setting everything back as before”, you didn’t have a full backup.
the reason OP was thinking of doing this, was saving disk space and avoiding buying another hdd. So if it’s a 1:1 full disk image, then there’s almost no difference with the costs of raid1. Setting exclusions, avoiding certain big files, and so on. In this case he’s talking about restic, which can restore data but very hard to do a full bootable linux system - stuff needs to be reinstalled
if it’s a 1:1 full disk image, then there’s almost no difference with the costs of raid1
The problem with that statement is that you’re likening a redundant but dependent copy to a backup, which is a redundant independent copy. RAID is not a backup.
As an easy example to illustrate this point: if you delete all of your files, they will still be present in a backup while RAID will happily delete the data on all drives at the same time.
Additionally, backup tools such as restic offer compression and deduplication which saves quite a bit of space; allowing you to store multiple revisions of your data while requiring less space than the original data in most cases.
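Why multiple revisions can fit in less space than you might expect is easiest to see with a toy content-addressed store. This is a simplified sketch and not restic's actual design: restic splits data with content-defined chunking and also compresses and encrypts, whereas this example uses tiny fixed-size chunks and a hypothetical `ChunkStore` class.

```python
import hashlib

CHUNK_SIZE = 4  # tiny fixed-size chunks so the example is easy to follow


class ChunkStore:
    """Toy deduplicating store: each distinct chunk is kept only once."""

    def __init__(self):
        self.chunks = {}     # chunk hash -> chunk bytes
        self.snapshots = {}  # snapshot name -> ordered list of chunk hashes

    def snapshot(self, name: str, data: bytes) -> None:
        ids = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            chunk_id = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(chunk_id, chunk)  # store only if unseen
            ids.append(chunk_id)
        self.snapshots[name] = ids

    def restore(self, name: str) -> bytes:
        return b"".join(self.chunks[c] for c in self.snapshots[name])

    def stored_bytes(self) -> int:
        return sum(len(c) for c in self.chunks.values())
```

Taking a snapshot of `b"AAAABBBBCCCC"` and then of `b"AAAABBBBDDDD"` stores 16 bytes instead of 24, and both revisions can still be restored independently. Deleting the live data does not touch either snapshot, which is the difference from a mirror.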
In this case he’s talking about restic, which can restore data but very hard to do a full bootable linux system - stuff needs to be reinstalled
It’s totally possible to make a backup of the root filesystem tree and restore a full system from that if you know what you’re doing. It’s not even that hard: Format disks, extract backup, adjust fstab, reinstall bootloader, kernels and initrd into the boot/ESP partition(s).
There’s also the wasteful but dead simple method of backing up your whole system with all its configuration, which is full-disk backups. The only thing this will not back up are EFI vars but those are easy to simply set again or would just remain set as long as you don’t switch motherboards.
I’m used to Borgbackup, which fulfils a very similar purpose to restic, so I didn’t know this: restic doesn’t appear to have first-class support for backing up whole block devices, but it appears this can be made to work too: https://github.com/restic/restic/issues/949
I must admit that I also didn’t think of this as a huge issue because declarative system configuration is a thing. If you’re used to it, you have a very different view on the importance of system configuration state.
If my server died, it’d be a few minutes of setting up the disk format and then waiting for a ~3.5GiB download after which everything would work exactly as it did before modulo user data. (The disk format step could also be automatic but I didn’t bother implementing that yet because of https://xkcd.com/1205/.)
Depends, how much do you value your data? Is it all DVD rips where you still have the DVDs? Nah you don’t really need raid. Are they precious family photos where your only backup copy is S3? Yeah I’d use raid for that, plus having a second copy stored elsewhere.
Plus as others have mentioned there’s checks on your data for bitrot, which absolutely does happen.
RAID does not protect your data, it protects data uptime.
RAID cannot ensure integrity (i.e. bitrot protection). Its one and only purpose is to mitigate downtime.
ZFS or other software RAIDs can though. Does anyone still use hardware raid anyways?
ZFS and BTRFS’ integrity checks are entirely independent of whether you have redundancy or not. You don’t need any sort of RAID to get that; it also works on a single disk.
The only thing that redundancy provides you here is immediate automatic repair if corruption is found. I’ve written about why that isn’t as great as it sounds in another reply already.
Most other software RAID can not and does not protect integrity. It couldn’t; there’s no hashing. Data verification is extremely annoying to implement on the block level and has massive performance gotchas, so you wouldn’t want that even if you could have it.
RAID is a great backup alternative.
/s