X-Git-Url: https://git.proxmox.com/?a=blobdiff_plain;f=local-zfs.adoc;h=34eb06b9f08d8a8fee36a5ccb6a7fbe7ca3a5a30;hb=750d4f04c60f9f2a1de50672aa8373739d4300f7;hp=a80b7d8310409db47c70eec7cd389ef8f3781d66;hpb=acb4a8998f61408120eb4fb1acfd97b4d7f036a0;p=pve-docs.git

diff --git a/local-zfs.adoc b/local-zfs.adoc
index a80b7d8..34eb06b 100644
--- a/local-zfs.adoc
+++ b/local-zfs.adoc
@@ -32,7 +32,8 @@ management.
 
 * Copy-on-write clone
 
-* Various raid levels: RAID0, RAID1, RAID10, RAIDZ-1, RAIDZ-2 and RAIDZ-3
+* Various RAID levels: RAID0, RAID1, RAID10, RAIDZ-1, RAIDZ-2, RAIDZ-3,
+dRAID, dRAID2, dRAID3
 
 * Can use SSD for cache
 
@@ -59,7 +60,7 @@ practice, use as much as you can get for your hardware/budget. To prevent
 data corruption, we recommend the use of high quality ECC RAM.
 
 If you use a dedicated cache and/or log disk, you should use an
-enterprise class SSD (e.g. Intel SSD DC S3700 Series). This can
+enterprise class SSD. This can
 increase the overall performance significantly.
 
 IMPORTANT: Do not use ZFS on top of a hardware RAID controller which has its
@@ -166,17 +167,17 @@ Each `vdev` type has different performance behaviors. The two
 parameters of interest are the IOPS (Input/Output Operations per Second) and
 the bandwidth with which data can be written or read.
 
-A 'mirror' vdev (RAID1) will approximately behave like a single disk in regards
-to both parameters when writing data. When reading data if will behave like the
-number of disks in the mirror.
+A 'mirror' vdev (RAID1) will approximately behave like a single disk in regard
+to both parameters when writing data. When reading data, the performance will
+scale linearly with the number of disks in the mirror.
 
 A common situation is to have 4 disks. When setting it up as 2 mirror vdevs
 (RAID10) the pool will have the write characteristics as two single disks in
-regard of IOPS and bandwidth. For read operations it will resemble 4 single
+regard to IOPS and bandwidth. For read operations it will resemble 4 single
 disks.
 
 A 'RAIDZ' of any redundancy level will approximately behave like a single disk
-in regard of IOPS with a lot of bandwidth. How much bandwidth depends on the
+in regard to IOPS with a lot of bandwidth. How much bandwidth depends on the
 size of the RAIDZ vdev and the redundancy level.
 
 For running VMs, IOPS is the more important metric in most situations.
@@ -234,7 +235,7 @@ There are a few options to counter the increased use of space:
 The `volblocksize` property can only be set when creating a ZVOL. The default
 value can be changed in the storage configuration. When doing this, the guest
 needs to be tuned accordingly and depending on the use case, the problem of
-write amplification if just moved from the ZFS layer up to the guest.
+write amplification is just moved from the ZFS layer up to the guest.
 
 Using `ashift=9` when creating the pool can lead to bad performance, depending
 on the disks underneath, and cannot be changed later on.
@@ -244,6 +245,47 @@ them, unless your environment has specific needs and characteristics where
 RAIDZ performance characteristics are acceptable.
 
 
+ZFS dRAID
+~~~~~~~~~
+
+In a ZFS dRAID (declustered RAID), the hot spare drive(s) participate in the RAID.
+Their spare capacity is reserved and used for rebuilding when one drive fails.
+This provides, depending on the configuration, faster rebuilding compared to a
+RAIDZ in case of drive failure. More information can be found in the official
+OpenZFS documentation. footnote:[OpenZFS dRAID
+https://openzfs.github.io/openzfs-docs/Basic%20Concepts/dRAID%20Howto.html]
+
+NOTE: dRAID is intended for setups with more than 10-15 disks. A RAIDZ
+setup should be better for a smaller number of disks in most use cases.
+
+NOTE: The GUI requires one more disk than the minimum (e.g. dRAID1 needs 3),
+because it expects that a spare disk is added as well.
+
+ * `dRAID1` or `dRAID`: requires at least 2 disks; one can fail before data is
+lost
+ * `dRAID2`: requires at least 3 disks; two can fail before data is lost
+ * `dRAID3`: requires at least 4 disks; three can fail before data is lost
+
+
+Additional information can be found in the manual page:
+
+----
+# man zpoolconcepts
+----
+
+Spares and Data
+^^^^^^^^^^^^^^^
+The number of `spares` tells the system how many disks it should keep ready in
+case of a disk failure. The default value is 0. Without spares, rebuilding
+won't get any speed benefits.
+
+`data` defines the number of devices in a redundancy group. The default value
+is 8. If `disks - parity - spares` is less than 8, that lower value is used
+instead. In general, a smaller number of `data` devices leads to higher IOPS,
+better compression ratios and faster resilvering, but defining fewer data
+devices reduces the available storage capacity of the pool.
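+
+As an illustrative sketch only (the pool name, the disk names and the exact
+geometry are placeholders, see `man zpoolconcepts` for the full vdev syntax),
+a dRAID2 over 12 disks with 4 `data` devices per redundancy group and one
+distributed spare could be created like this:
+
+----
+# zpool create -f -o ashift=12 <pool> draid2:4d:12c:1s /dev/sd[a-l]
+----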
+
+
 Bootloader
 ~~~~~~~~~~
 
@@ -364,8 +406,8 @@ As `<device>` it is possible to use more devices, like it's shown in
 
 Add cache and log to an existing pool
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
-If you have a pool without cache and log. First partition the SSD in
-2 partition with `parted` or `gdisk`
+If you have a pool without cache and log, first create 2 partitions on the SSD
+with `parted` or `gdisk`.
 
 IMPORTANT: Always use GPT partition tables.
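+
+A rough sketch of the next step, assuming the pool is called `<pool>` and the
+two freshly created partitions are `<device-part1>` and `<device-part2>` (all
+placeholders, adjust them to your setup): the first partition is added as log
+device, the second one as cache.
+
+----
+# zpool add -f <pool> log <device-part1> cache <device-part2>
+----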