
I have a 4TB drive that is nearing max capacity and would like to move the data to a RAID or ZFS (the host OS is Proxmox 7.1).

I have 3 other identical (WD Blue) 4TB drives (I might be able to get 1 more drive soon, but Amazon has a cap on how many you can purchase right now).

Ideally the ZFS or RAID configuration would afford some level of whole drive redundancy/fault tolerance for convenience.

But here's the (maybe?) tricky part: I'd like to utilize all 4 drives and in the end have ~8TB or more available for storage.

I don't yet know enough about RAID and ZFS to understand the complexities of this, and have read online that RAID 5 is bad, and maybe so is RAID 6, and that ZFS just can't do what I'm asking as the pool will become unbalanced. So I'm wondering if this data dance is even possible without a 5th drive to temporarily store the data. This seems to me like it should be possible (maybe with some partition dancing), but my own searches just left me with more questions and doubts that it can work.

As a bonus, and perhaps a bit of a stretch, I'd love to be able to (easily/simply) further expand the capacity of the array with additional 4TB drives in the future (maybe once every few years).

Is it possible to do this or do I absolutely need to offload all the data to a separate location first?

  • "RAID 5 is bad": That's a generalization that does not hold. "So is RAID 6": let's discuss this when you have a use case for that kind of RAID. Not everything fits your use case, but it doesn't make it "bad". "ZFS just can't do what I'm asking": um, far as I can tell, it does? And, 4×4 TB = 16 TB, not 8 TB, so I think you might be making assumptions on your wanted storage features that you just forget to tell us! – Marcus Müller Mar 07 '22 at 20:05
  • @Marcus at larger capacities (multiples of TB), RAID 5 becomes a liability. Even RAID 6 starts to become questionable with 10 or 12 TB disks – roaima Mar 07 '22 at 20:38
  • Possible duplicate - [Migrate Single Disk to RAID](https://unix.stackexchange.com/q/557525/100397) – roaima Mar 07 '22 at 22:13
  • Possible duplicate - [How to create a 3x3TB RAID 5 array without losing data from 2 of the drives?](https://unix.stackexchange.com/a/348991/100397) – roaima Mar 07 '22 at 22:14
  • Thanks for the comments. @Marcus, I mentioned "~8TB or more", not just 8TB. I know at least some of my times tables :-) – David Murdoch Mar 07 '22 at 22:52
  • @roaima, When do you think a RAID5 (or 6) reaches a point where it becomes a liability? I know this stuff isn't black and white, but on the old forums I read, no one gave any sort of concrete numbers or stats on these things. – David Murdoch Mar 07 '22 at 22:57
  • Oh, and @MarcusMüller, I read somewhere that ZFS won't rebuild its parity (or whatever it may be called) when you add a new drive. – David Murdoch Mar 07 '22 at 22:58
  • I don't have enough drives (or the time and inclination) to try and get figures, so straight up this is (hopefully informed) opinion. The issue is that when rebuilding RAID5 (or 6) there is so much disk activity that the remaining disks are stressed. If one of those is also at a similar age and quality to the one that has died and been replaced you have a reasonable probability that you'll lose part of another drive. Consider a single read failure on a degraded RAID5: you have now lost that block of data. In many cases this is enough to stop the resync and blam you've lost everything – roaima Mar 07 '22 at 23:41
  • As with mdadm, you can create a zfs vdev with missing devices (but you have to fake it with sparse files of the same size as the drives you're adding), e.g. see https://blog.chaospixel.com/linux/2017/08/zfs-create-pool-with-missing-devices.html. So you could create a pool with a 4 disk raidz-1 vdev with a missing drive (for a total of 12TB, 3x4TB), or pool with two 2-drive mirrors (total 8TB) one with a missing device, then copy your data to the pool, then unmount the original drive and add it to the vdev. Redundancy will be compromised and the pool degraded until you add that last drive. – cas Mar 08 '22 at 00:11
  • BTW, if your data contains a lot of text or other compressible data, it will take a lot less space on the zpool due to transparent compression. ZFS currently defaults to using lz4 which is high performance and moderate compression ratio. You may want to change that to the newer `zstd`, which is high performance and better compression. Note that compression type is set per dataset on a pool so you can have compression=off for a dataset(s) containing already compressed files like videos or music, and datasets with compression=zstd for other files. – cas Mar 08 '22 at 00:18
  • Finally, `btrfs` offers similar features to zfs but makes it easier to add and remove drives (of same or even different sizes) and has a great `btrfs balance` feature to rebalance the data across all the drives. Unfortunately, that's the one thing btrfs has that zfs doesn't. In pretty much all other ways, zfs is superior. Still, that rebalance feature is very useful in a home/low-budget setting, so btrfs is worth considering. – cas Mar 08 '22 at 00:24
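The sparse-file trick cas describes above can be sketched like this. Everything here is an illustration under assumptions: the pool name `tank`, the device paths, and the file path are all hypothetical; verify your actual device names with `lsblk` before running anything as root.

```shell
# Step 1: create a sparse placeholder file with the same apparent size as
# a 4 TB drive. It occupies almost no real disk space.
truncate -s 4TB /tmp/fake-disk.img
stat -c '%s' /tmp/fake-disk.img   # apparent size in bytes

# Step 2 (as root, with ZFS installed): build a raidz1 vdev from the three
# empty drives plus the placeholder, then take the placeholder offline so
# the pool runs degraded and the file is never actually written to:
#   zpool create tank raidz1 /dev/sdb /dev/sdc /dev/sdd /tmp/fake-disk.img
#   zpool offline tank /tmp/fake-disk.img

# Step 3: copy the data from the full drive into the pool, then hand that
# drive over to ZFS in place of the placeholder to restore full redundancy:
#   zpool replace tank /tmp/fake-disk.img /dev/sda
```

Until the `zpool replace` resilver completes, the pool has no redundancy, so a single read error during the copy can cost data.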

1 Answer


You can do this with ZFS. ZFS is available on systems such as FreeNAS and Solaris, and Proxmox supports it natively, so it fits your configuration.

If you choose RAID-Z (one parity disk's worth of capacity), the minimum is 3 disks, and you can grow the pool later by adding more vdevs. Such an array can survive the failure of one disk. The traditional equivalent is RAID 5, which you can build on Linux with the md tools (mdadm).
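As a sketch (the pool name `tank` and the device paths are hypothetical placeholders, not a tested recipe), creating a four-disk RAID-Z pool looks like this, and its usable capacity is roughly (n − 1) × disk size:

```shell
# As root, after checking your real device names with lsblk:
#   zpool create tank raidz1 /dev/sdb /dev/sdc /dev/sdd /dev/sde
#   zpool status tank

# Rough usable capacity of an n-disk raidz1 (one disk's worth of parity):
DISKS=4; SIZE_TB=4
echo "raidz1 usable: ~$(( (DISKS - 1) * SIZE_TB )) TB"
```

So four 4 TB disks in raidz1 give roughly 12 TB usable, comfortably above the ~8 TB target; real figures are a bit lower after metadata and reservation overhead.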

If you need a more reliable array, you can use RAID-Z2, which requires a minimum of 4 disks and keeps two parity blocks per stripe, so it can survive the failure of any 2 disks. Its traditional equivalent is RAID 6.

In ZFS you can grow the pool by adding new vdev(s). You can also speed the pool up by adding separate cache devices (L2ARC) and separate log devices (SLOG), for example on SSDs.
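A sketch of those expansion options; the pool name `tank` and all device paths are hypothetical, and each command must be run as root after double-checking device names with `lsblk`:

```shell
# Grow the pool by adding a whole new vdev
# (pool capacity is the sum of its vdevs):
#   zpool add tank raidz1 /dev/sdf /dev/sdg /dev/sdh

# Add an SSD partition as a read cache (L2ARC):
#   zpool add tank cache /dev/nvme0n1p1

# Add a separate log device (SLOG); note this only helps
# synchronous-write workloads such as NFS or databases:
#   zpool add tank log /dev/nvme0n1p2
```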

Romeo Ninov
  • I can't remember where, but I _thought_ I read that when you add a new drive to RAID-Z it doesn't have a way to rebuild its parity/redundancy (or whatever it may be called in ZFS). Do you know anything about that? – David Murdoch Mar 07 '22 at 22:56
  • @DavidMurdoch you are correct. Resizing in ZFS is a little complicated. A pool is made up of one-or-more vdevs, which each consist of one-or-more drives. The vdevs can be single drives (no redundancy!), mirrors (two or more drives, similar to RAID-1), or RAID-Z (minimum of 3 vdevs). You can add a vdev to a pool, but you cannot remove one. You can always attach an extra drive of the same size (or larger) to a single-drive or mirror vdev to increase redundancy. To increase the size of a vdev (mirror or raidz) you have to replace all of the drives in that vdev with larger drives. – cas Mar 08 '22 at 00:01
  • oops. s/minimum of 3 **vdevs**/minimum of 3 **drives**/ – cas Mar 08 '22 at 00:33
  • I ended up going with a pool consisting of 2 vdevs of 2x4TB mirrors. I'll add another 2x4TB mirror in the near future once I get an additional drive. The balance will be off, but it looks like I can sort of "rebalance" things by rewriting all the data a few times, which should bring read performance on par with how it would have been had I just started with all 3 vdevs in the first place. – David Murdoch Mar 11 '22 at 19:17
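The growth paths from the comments can be sketched as follows; again, the pool name `tank` and every device path are hypothetical stand-ins, to be run as root only after confirming device names with `lsblk`:

```shell
# Add redundancy: attach a disk to an existing single-drive or mirror vdev,
# turning it into a (deeper) mirror:
#   zpool attach tank /dev/sdb /dev/sde

# Grow a vdev in place: replace each member with a larger disk, one
# resilver at a time; with autoexpand on, the vdev grows automatically
# once the last member has been replaced:
#   zpool set autoexpand=on tank
#   zpool replace tank /dev/sdb /dev/sdX   # repeat for every disk in the vdev

# Rough usable space of striped mirrors = number of mirrors * disk size:
MIRRORS=3; SIZE_TB=4
echo "mirror pool usable: ~$(( MIRRORS * SIZE_TB )) TB"
```

With mirrors you trade capacity (50% usable) for much simpler expansion: each new 2-drive mirror vdev adds one disk's worth of usable space.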