ZFS on Linux
------------
ZFS is a combined file system and logical volume manager, designed by
Sun Microsystems. There is no need to manually compile ZFS modules - all
packages are included.
By using ZFS, it's possible to achieve maximum enterprise features with
low budget hardware, and also high performance systems by leveraging
SSD caching or even SSD only setups. ZFS can replace expensive
hardware raid cards with moderate CPU and memory load, combined with easy
management.
General advantages of ZFS:
* Easy configuration and management with GUI and CLI.
* Reliable.
Hardware
~~~~~~~~~
ZFS depends heavily on memory, so it's recommended to have at least 8GB to
start. In practice, use as much you can get for your hardware/budget. To prevent
data corruption, we recommend the use of high quality ECC RAM.
If you use a dedicated cache and/or log disk, you should use an
enterprise class SSD (for example, Intel SSD DC S3700 Series). This can
increase the overall performance significantly.
IMPORTANT: Do not use ZFS on top of a hardware controller which has its
own cache management. ZFS needs to directly communicate with disks. An
HBA adapter or something like an LSI controller flashed in ``IT`` mode is
recommended.
ZFS Administration
~~~~~~~~~~~~~~~~~~
This section gives you some usage examples for common tasks. ZFS
itself is really powerful and provides many options. The main commands
to manage ZFS are `zfs` and `zpool`. Both commands come with extensive
manual pages, which can be read with:
.. code-block:: console

  # man zpool
  # man zfs
It is possible to use a dedicated cache drive partition to increase
the performance (use SSD).
For `<device>`, you can use multiple devices, as is shown in
"Create a new pool with RAID*".
.. code-block:: console

  # zpool create -f -o ashift=12 <pool> <device> cache <cache_device>
It is possible to use a dedicated cache drive partition to increase
the performance (SSD).
For `<device>`, you can use multiple devices, as is shown in
"Create a new pool with RAID*".
.. code-block:: console

  # zpool create -f -o ashift=12 <pool> <device> log <log_device>
Add cache and log to an existing pool
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
You can add cache and log devices to a pool after its creation. In this example,
we will use a single drive for both cache and log. First, you need to create
2 partitions on the SSD with `parted` or `gdisk`.
.. important:: Always use GPT partition tables.
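
As a sketch, assuming the SSD appears as `/dev/sdX` (a placeholder device name)
and that a small first partition is used for the log and the rest for the
cache, the partitioning and subsequent `zpool add` could look like this:

.. code-block:: console

  # sgdisk -n 1:0:+4G /dev/sdX
  # sgdisk -n 2:0:0 /dev/sdX
  # zpool add -f <pool> log /dev/sdX1 cache /dev/sdX2

Adjust the partition sizes and device names to match your hardware.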
Changing a failed bootable device
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Depending on how Proxmox Backup was installed, it is either using `grub` or
`systemd-boot` as a bootloader.
In either case, the first steps of copying the partition table, reissuing GUIDs
and replacing the ZFS partition are the same. To make the system bootable from
the new disk, different steps are needed which depend on the bootloader in use.
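
The shared first steps can be sketched as follows (all device and pool names
are placeholders):

.. code-block:: console

  # sgdisk <healthy bootable device> -R <new device>
  # sgdisk -G <new device>
  # zpool replace -f <pool> <old zfs partition> <new zfs partition>

The first command copies the partition table to the new device, the second
randomizes the GUIDs so the two disks do not conflict, and the third lets ZFS
resilver the data onto the new partition.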
.. code-block:: console
  # grub-mkconfig -o /path/to/grub.cfg
Activate e-mail notification
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ZFS comes with an event daemon, which monitors events generated by the
# apt-get install zfs-zed
To activate the daemon, it is necessary to uncomment the `ZED_EMAIL_ADDR`
setting in the file `/etc/zfs/zed.d/zed.rc`:
.. code-block:: console
  ZED_EMAIL_ADDR="root"
Please note that Proxmox Backup forwards mails to `root` to the email address
configured for the root user.
IMPORTANT: The only setting that is required is `ZED_EMAIL_ADDR`. All
other settings are optional.
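
After editing `zed.rc`, the daemon needs to pick up the change. Restarting the
service (assuming the standard systemd unit name shipped with the package) is
enough:

.. code-block:: console

  # systemctl restart zfs-zed.service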
Limit ZFS memory usage
^^^^^^^^^^^^^^^^^^^^^^
It is good to use at most 50 percent (which is the default) of the
system memory for ZFS ARC, to prevent performance degradation of the
host. Use your preferred editor to change the configuration in
`/etc/modprobe.d/zfs.conf` and insert:
.. code-block:: console

  options zfs zfs_arc_max=8589934592
The above example limits the usage to 8 GiB ('8 * 2^30^').
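
The value is given in bytes; you can compute it in the shell to avoid typos:

.. code-block:: console

  # echo $((8 * 1024 * 1024 * 1024))
  8589934592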
.. IMPORTANT:: In case your desired `zfs_arc_max` value is lower than or equal
   to `zfs_arc_min` (which defaults to 1/32 of the system memory), `zfs_arc_max`
   will be ignored. Thus, for it to work in this case, you must set
   `zfs_arc_min` to at most `zfs_arc_max - 1`. This would require updating the
   configuration in `/etc/modprobe.d/zfs.conf`, with:

.. code-block:: console

  options zfs zfs_arc_min=8589934591
  options zfs zfs_arc_max=8589934592

This example setting limits the usage to 8 GiB ('8 * 2^30^') on
systems with more than 256 GiB of total memory, where simply setting
`zfs_arc_max` alone would not work.

.. IMPORTANT:: If your root file system is ZFS, you must update your initramfs
   every time this value changes.
.. code-block:: console
# update-initramfs -u
Swap on ZFS
^^^^^^^^^^^
Swap-space created on a zvol may cause some issues, such as blocking the
server or generating a high IO load, often seen when starting a Backup
to an external Storage.
We strongly recommend using enough memory, so that you normally do not
run into low memory situations. Should you need or want to add swap, it is
preferred to create a partition on a physical disk and use it as a swap device.
You can leave some space free for this purpose in the advanced options of the
installer. Additionally, you can lower the `swappiness` value.
A good value for servers is 10:
.. code-block:: console
vm.swappiness = 100  The kernel will swap aggressively.
==================== ===============================================================
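
To apply a value (10 is used here, as suggested above) both immediately and
persistently across reboots, one common approach is:

.. code-block:: console

  # sysctl -w vm.swappiness=10
  # echo 'vm.swappiness = 10' >> /etc/sysctl.conf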
ZFS compression
^^^^^^^^^^^^^^^
To activate compression:
.. code-block:: console

  # zpool set compression=lz4 <pool>
We recommend using the `lz4` algorithm, since it adds very little CPU overhead.
Other algorithms such as `lzjb` and `gzip-N` (where `N` is an integer from `1-9`
representing the compression ratio, where 1 is fastest and 9 is best
compression) are also available. Depending on the algorithm and how
compressible the data is, having compression enabled can even increase I/O
performance.
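
To verify what compression actually achieves on data already written, the
compression ratio can be queried on the pool or on any dataset:

.. code-block:: console

  # zfs get compressratio <pool>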
You can disable compression at any time with:
.. code-block:: console

  # zfs set compression=off <dataset>
.. _local_zfs_special_device:
ZFS special device
^^^^^^^^^^^^^^^^^^
Since version 0.8.0, ZFS supports `special` devices. A `special` device in a
pool is used to store metadata, deduplication tables, and optionally small
file blocks.
A `special` device can improve the speed of a pool consisting of slow spinning
hard disks with a lot of metadata changes. For example, workloads that involve
creating, updating or deleting a large number of files will benefit from the
presence of a `special` device. ZFS datasets can also be configured to store
small files on the `special` device, which can further improve the
performance. Use fast SSDs for the `special` device.
.. IMPORTANT:: The redundancy of the `special` device should match the one of the
   pool, since the `special` device is a point of failure for the entire pool.
.. WARNING:: Adding a `special` device to a pool cannot be undone!
To add a `special` device to an existing pool with RAID-1:
.. code-block:: console
  # zpool add <pool> special mirror <device1> <device2>
ZFS datasets expose the `special_small_blocks=<size>` property. `size` can be
`0` to disable storing small file blocks on the `special` device, or a power of
two in the range between `512B` to `128K`. After setting this property, new file
blocks smaller than `size` will be allocated on the `special` device.
.. IMPORTANT:: If the value for `special_small_blocks` is greater than or equal to
   the `recordsize` (default `128K`) of the dataset, *all* data will be written to
   the `special` device, so be careful!
Setting the `special_small_blocks` property on a pool will change the default
value of that property for all child ZFS datasets (for example, all containers
in the pool will opt in for small file blocks).
Opt in for all files smaller than 4K-blocks pool-wide:
.. code-block:: console

  # zfs set special_small_blocks=4K <pool>
Troubleshooting
^^^^^^^^^^^^^^^
Corrupt cache file
""""""""""""""""""

`zfs-import-cache.service` imports ZFS pools using the ZFS cache file. If this
file becomes corrupted, the service won't be able to import the pools that it's
unable to read from it.

As a result, in case of a corrupted ZFS cache file, some volumes may not be
mounted during boot and must be mounted manually later.
For each pool, run:
.. code-block:: console

  # zpool set cachefile=/etc/zfs/zpool.cache POOLNAME
then, update the `initramfs` by running:
.. code-block:: console
# update-initramfs -u -k all
and finally, reboot the node.
Another workaround to this problem is enabling the `zfs-import-scan.service`,
which searches and imports pools via device scanning (usually slower).
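
To enable that service (assuming the standard unit name shipped with ZFS on
Linux):

.. code-block:: console

  # systemctl enable zfs-import-scan.service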