ZFS on Linux
------------

ZFS is a combined file system and logical volume manager, designed by
Sun Microsystems. There is no need to manually compile ZFS modules - all
packages are included.

By using ZFS, it is possible to achieve maximum enterprise features with
low budget hardware, but also high performance systems by leveraging
SSD caching or even SSD-only setups. ZFS can replace expensive
hardware RAID cards with moderate CPU and memory load, combined with easy
management.

General ZFS advantages:

* Easy configuration and management with GUI and CLI.
* Reliable
* Protection against data corruption
* Data compression on file system level
* Snapshots
* Copy-on-write clone
* Various raid levels: RAID0, RAID1, RAID10, RAIDZ-1, RAIDZ-2 and RAIDZ-3
* Can use SSD for cache
* Self healing
* Continuous integrity checking
* Designed for high storage capacities
* Asynchronous replication over network
* Open Source
* Encryption

Hardware
~~~~~~~~

ZFS depends heavily on memory, so you need at least 8GB to start. In
practice, use as much as you can get for your hardware/budget. To prevent
data corruption, we recommend the use of high quality ECC RAM.

If you use a dedicated cache and/or log disk, you should use an
enterprise class SSD (e.g. Intel SSD DC S3700 Series). This can
increase the overall performance significantly.

.. IMPORTANT:: Do not use ZFS on top of a hardware RAID controller which has
   its own cache management. ZFS needs to communicate directly with the disks.
   An HBA adapter, or something like an LSI controller flashed in ``IT`` mode,
   is the way to go.


ZFS Administration
~~~~~~~~~~~~~~~~~~

This section gives you some usage examples for common tasks. ZFS
itself is really powerful and provides many options. The main commands
to manage ZFS are `zfs` and `zpool`. Both commands come with great
manual pages, which can be read with:

.. code-block:: console

  # man zpool
  # man zfs

Create a new zpool
^^^^^^^^^^^^^^^^^^

To create a new pool, at least one disk is needed. The `ashift` value should
match or exceed the sector size of the underlying disk (the sector size is
2 to the power of `ashift`).

.. code-block:: console

  # zpool create -f -o ashift=12 <pool> <device>

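If you are unsure which `ashift` to use, you can query the physical and logical
sector size of a disk first (the device path below is just a placeholder); an
`ashift` of `12` corresponds to 4096-byte sectors:

.. code-block:: console

  # lsblk -o NAME,PHY-SEC,LOG-SEC /dev/sdX
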
Create a new pool with RAID-0
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Minimum 1 disk

.. code-block:: console

  # zpool create -f -o ashift=12 <pool> <device1> <device2>

Create a new pool with RAID-1
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Minimum 2 disks

.. code-block:: console

  # zpool create -f -o ashift=12 <pool> mirror <device1> <device2>

Create a new pool with RAID-10
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Minimum 4 disks

.. code-block:: console

  # zpool create -f -o ashift=12 <pool> mirror <device1> <device2> mirror <device3> <device4>

Create a new pool with RAIDZ-1
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Minimum 3 disks

.. code-block:: console

  # zpool create -f -o ashift=12 <pool> raidz1 <device1> <device2> <device3>

Create a new pool with RAIDZ-2
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Minimum 4 disks

.. code-block:: console

  # zpool create -f -o ashift=12 <pool> raidz2 <device1> <device2> <device3> <device4>

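After creating a pool with any of the layouts above, it is a good idea to check
that all devices show up as expected and that the pool is healthy:

.. code-block:: console

  # zpool status <pool>
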
Create a new pool with cache (L2ARC)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

It is possible to use a dedicated cache drive partition to increase
the performance (use SSD).

As `<device>`, it is possible to use more devices, as shown in
"Create a new pool with RAID*".

.. code-block:: console

  # zpool create -f -o ashift=12 <pool> <device> cache <cache_device>

Create a new pool with log (ZIL)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

It is possible to use a dedicated drive partition as log device to increase
the performance (use SSD).

As `<device>`, it is possible to use more devices, as shown in
"Create a new pool with RAID*".

.. code-block:: console

  # zpool create -f -o ashift=12 <pool> <device> log <log_device>

Add cache and log to an existing pool
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If you have a pool without cache and log, first partition the SSD into
2 partitions with `parted` or `gdisk`.

.. important:: Always use GPT partition tables.

The maximum size of a log device should be about half the size of
physical memory, so this is usually quite small. The rest of the SSD
can be used as cache.

.. code-block:: console

  # zpool add -f <pool> log <device-part1> cache <device-part2>


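As a sketch of the partitioning step (the device path and the log partition
size are placeholders; size the log partition to roughly half of your physical
memory), the two partitions could be created with `sgdisk`:

.. code-block:: console

  # sgdisk -n1:0:+16G /dev/sdX   # partition 1 for the log device
  # sgdisk -n2:0:0 /dev/sdX      # partition 2 (rest of the SSD) for the cache
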
Changing a failed device
^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: console

  # zpool replace -f <pool> <old device> <new device>


Changing a failed bootable device
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Depending on how Proxmox Backup was installed, it uses either `grub` or
`systemd-boot` as the bootloader.

The first steps of copying the partition table, reissuing GUIDs and replacing
the ZFS partition are the same. To make the system bootable from the new disk,
different steps are needed which depend on the bootloader in use.

.. code-block:: console

  # sgdisk <healthy bootable device> -R <new device>
  # sgdisk -G <new device>
  # zpool replace -f <pool> <old zfs partition> <new zfs partition>

.. NOTE:: Use the `zpool status -v` command to monitor how far the resilvering
   process of the new disk has progressed.

With `systemd-boot`:

.. code-block:: console

  # pve-efiboot-tool format <new disk's ESP>
  # pve-efiboot-tool init <new disk's ESP>

.. NOTE:: `ESP` stands for EFI System Partition, which is set up as partition #2 on
   bootable disks set up by the {pve} installer since version 5.4. For details, see
   xref:sysboot_systemd_boot_setup[Setting up a new partition for use as synced ESP].

With `grub`:

Usually `grub.cfg` is located in `/boot/grub/grub.cfg`:

.. code-block:: console

  # grub-install <new disk>
  # grub-mkconfig -o /path/to/grub.cfg


Activate E-Mail Notification
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

ZFS comes with an event daemon, which monitors events generated by the
ZFS kernel module. The daemon can also send emails on ZFS events like
pool errors. Newer ZFS packages ship the daemon in a separate package,
and you can install it using `apt-get`:

.. code-block:: console

  # apt-get install zfs-zed

To activate the daemon, it is necessary to edit `/etc/zfs/zed.d/zed.rc` with your
favourite editor, and uncomment the `ZED_EMAIL_ADDR` setting:

.. code-block:: console

  ZED_EMAIL_ADDR="root"

Please note that Proxmox Backup forwards mails sent to `root` to the email
address configured for the root user.

.. IMPORTANT:: The only setting that is required is `ZED_EMAIL_ADDR`. All
   other settings are optional.

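To verify that the event daemon picks up the change, you can restart it and
check its status (assuming the systemd unit name `zfs-zed` shipped with the
`zfs-zed` package):

.. code-block:: console

  # systemctl restart zfs-zed
  # systemctl status zfs-zed
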
Limit ZFS Memory Usage
^^^^^^^^^^^^^^^^^^^^^^

It is good to use at most 50 percent (which is the default) of the
system memory for the ZFS ARC, to prevent performance degradation of the
host. Use your preferred editor to change the configuration in
`/etc/modprobe.d/zfs.conf` and insert:

.. code-block:: console

  options zfs zfs_arc_max=8589934592

This example setting limits the usage to 8 GiB (``8 * 1024 * 1024 * 1024`` bytes).

.. IMPORTANT:: If your root file system is ZFS, you must update your initramfs
   every time this value changes:

.. code-block:: console

  # update-initramfs -u


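After a reboot, you can check the limit that is actually in effect by reading
the ARC statistics exposed by the kernel module (the `c_max` field is the
maximum ARC size in bytes):

.. code-block:: console

  # grep c_max /proc/spl/kstat/zfs/arcstats
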
SWAP on ZFS
^^^^^^^^^^^

Swap space created on a zvol may cause some problems, such as blocking the
server or generating a high IO load, often seen when starting a backup
to an external storage.

We strongly recommend using enough memory, so that you normally do not
run into low memory situations. Should you need or want to add swap, it is
preferred to create a partition on a physical disk and use it as a swap device.
You can leave some space free for this purpose in the advanced options of the
installer. Additionally, you can lower the `swappiness` value.
A good value for servers is 10:

.. code-block:: console

  # sysctl -w vm.swappiness=10

To make the swappiness persistent, open `/etc/sysctl.conf` with
an editor of your choice and add the following line:

.. code-block:: console

  vm.swappiness = 10

.. table:: Linux kernel `swappiness` parameter values
  :widths: auto

  ==================== ===============================================================
  Value                Strategy
  ==================== ===============================================================
  vm.swappiness = 0    The kernel will swap only to avoid an 'out of memory' condition
  vm.swappiness = 1    Minimum amount of swapping without disabling it entirely.
  vm.swappiness = 10   Sometimes recommended to improve performance when sufficient memory exists in a system.
  vm.swappiness = 60   The default value.
  vm.swappiness = 100  The kernel will swap aggressively.
  ==================== ===============================================================

ZFS Compression
^^^^^^^^^^^^^^^

To activate compression:

.. code-block:: console

  # zfs set compression=lz4 <pool>

We recommend using the `lz4` algorithm, because it adds very little CPU overhead.
Other algorithms, such as `lzjb` and `gzip-N` (where `N` is an integer from `1-9`
representing the compression level, where 1 is fastest and 9 compresses best), are
also available. Depending on the algorithm and how compressible the data is,
having compression enabled can even increase I/O performance.

You can disable compression at any time with:

.. code-block:: console

  # zfs set compression=off <dataset>

Only new blocks will be affected by this change.

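To see how effective compression is on an existing dataset, you can query the
`compressratio` property alongside the configured algorithm:

.. code-block:: console

  # zfs get compression,compressratio <pool>/<filesystem>
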
ZFS Special Device
^^^^^^^^^^^^^^^^^^

Since version 0.8.0, ZFS supports `special` devices. A `special` device in a
pool is used to store metadata, deduplication tables, and optionally small
file blocks.

A `special` device can improve the speed of a pool consisting of slow spinning
hard disks with a lot of metadata changes. For example, workloads that involve
creating, updating or deleting a large number of files will benefit from the
presence of a `special` device. ZFS datasets can also be configured to store
whole small files on the `special` device, which can further improve the
performance. Use fast SSDs for the `special` device.

.. IMPORTANT:: The redundancy of the `special` device should match that of the
   pool, since the `special` device is a point of failure for the whole pool.

.. WARNING:: Adding a `special` device to a pool cannot be undone!

Create a pool with `special` device and RAID-1:

.. code-block:: console

  # zpool create -f -o ashift=12 <pool> mirror <device1> <device2> special mirror <device3> <device4>

Adding a `special` device to an existing pool with RAID-1:

.. code-block:: console

  # zpool add <pool> special mirror <device1> <device2>

ZFS datasets expose the `special_small_blocks=<size>` property. `size` can be
`0` to disable storing small file blocks on the `special` device, or a power of
two in the range between `512B` and `128K`. After setting the property, new file
blocks smaller than `size` will be allocated on the `special` device.

.. IMPORTANT:: If the value for `special_small_blocks` is greater than or equal to
   the `recordsize` (default `128K`) of the dataset, *all* data will be written to
   the `special` device, so be careful!

Setting the `special_small_blocks` property on a pool will change the default
value of that property for all child ZFS datasets (for example, all containers
in the pool will opt in for small file blocks).

Opt in for all files smaller than 4K-blocks pool-wide:

.. code-block:: console

  # zfs set special_small_blocks=4K <pool>

Opt in for small file blocks for a single dataset:

.. code-block:: console

  # zfs set special_small_blocks=4K <pool>/<filesystem>

Opt out from small file blocks for a single dataset:

.. code-block:: console

  # zfs set special_small_blocks=0 <pool>/<filesystem>

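To check which datasets currently store small file blocks on the `special`
device, you can query the property recursively:

.. code-block:: console

  # zfs get -r special_small_blocks <pool>
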
Troubleshooting
^^^^^^^^^^^^^^^

Corrupted cachefile

In case of a corrupted ZFS cachefile, some volumes may not be mounted during
boot and have to be mounted manually later.

For each pool, run:

.. code-block:: console

  # zpool set cachefile=/etc/zfs/zpool.cache POOLNAME

and afterwards update the `initramfs` by running:

.. code-block:: console

  # update-initramfs -u -k all

and finally reboot your node.

Sometimes the ZFS cachefile can get corrupted, and `zfs-import-cache.service`
doesn't import the pools that aren't present in the cachefile.

Another workaround to this problem is enabling the `zfs-import-scan.service`,
which searches for and imports pools via device scanning (usually slower).
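
As a sketch (the unit is shipped with the ZFS packages, but may be disabled by
default), the scan-based import can be enabled with:

.. code-block:: console

  # systemctl enable zfs-import-scan.service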