]> git.proxmox.com Git - mirror_zfs.git/log
mirror_zfs.git
2 years agolibzfs: Prevent overridding of error code
ixhamza [Wed, 15 Jun 2022 21:26:12 +0000 (02:26 +0500)]
libzfs: Prevent overridding of error code

zfs_send_cb_impl fails to report error for some flags.

Use second error variable for send_conclusion_record.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes #13558

2 years agoReduce ZIO io_lock contention on sorted scrub
Alexander Motin [Wed, 15 Jun 2022 21:25:08 +0000 (17:25 -0400)]
Reduce ZIO io_lock contention on sorted scrub

During sorted scrub multiple threads (one per vdev) are issuing many
ZIOs same time, all using the same scn->scn_zio_root ZIO as parent.
It causes huge lock contention on the single global lock on that ZIO.
Improve it by introducing per-queue null ZIOs, children to that one,
and using them instead as proxy.

For 12 SSD pool storing 1.5TB of 4KB blocks on 80-core system this
dramatically reduces lock contention and reduces scrub time from 21
minutes down to 12.5, while actual read stages (not scan) are about
3x faster, reaching 100K blocks per second per vdev.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #13553

2 years agoAdd support for ARCH=um for x86 sub-architectures
crass [Wed, 15 Jun 2022 21:22:52 +0000 (16:22 -0500)]
Add support for ARCH=um for x86 sub-architectures

When building modules (as well as the kernel) with ARCH=um, the options
-Dsetjmp=kernel_setjmp and -Dlongjmp=kernel_longjmp are passed to the C
preprocessor for C files. This causes the setjmp and longjmp used in
module/lua/ldo.c to be kernel_setjmp and kernel_longjmp respectively in
the object file. However, the setjmp and longjmp that is intended to be
called is defined in an architecture dependent assembly file under the
directory module/lua/setjmp. Since it is an assembly and not a C file,
the preprocessor define is not given and the names do not change. This
becomes an issue when modpost is trying to create the Module.symvers
and sees no defined symbol for kernel_setjmp and kernel_longjmp. To fix
this, if the macro CONFIG_UML is defined, then setjmp and longjmp
macros are undefined.

When building with ARCH=um for x86 sub-architectures, CONFIG_X86 is not
defined. Instead, CONFIG_UML_X86 is defined. Despite this, the UML x86
sub-architecture can use the same object files as the x86 architectures
because the x86 sub-architecture UML kernel is running with the same
instruction set as CONFIG_X86. So the modules/Kbuild build file is
updated to add the same object files that CONFIG_X86 would add when
CONFIG_UML_X86 is defined.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Glenn Washburn <development@efficientek.com>
Closes #13547

2 years agoFix clang 13 compilation errors
Damian Szuberski [Wed, 15 Jun 2022 21:20:28 +0000 (07:20 +1000)]
Fix clang 13 compilation errors

```
os/linux/zfs/zvol_os.c:1111:3: error: ignoring return value of function
  declared with 'warn_unused_result' attribute [-Werror,-Wunused-result]
                add_disk(zv->zv_zso->zvo_disk);
                ^~~~~~~~ ~~~~~~~~~~~~~~~~~~~~

zpl_xattr.c:1579:1: warning: no previous prototype for function
  'zpl_posix_acl_release_impl' [-Wmissing-prototypes]
```

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: szubersk <szuberskidamian@gmail.com>
Closes #13551

2 years agoReplace ZPROP_INVAL with ZPROP_USERPROP where it means a user property
Allan Jude [Tue, 14 Jun 2022 18:27:53 +0000 (14:27 -0400)]
Replace ZPROP_INVAL with ZPROP_USERPROP where it means a user property

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Allan Jude <allan@klarasystems.com>
Sponsored-by: Klara Inc.
Closes #12676

2 years agospl: Use a clearer name for the user namespace fd
Ryan Moeller [Mon, 13 Jun 2022 20:30:34 +0000 (20:30 +0000)]
spl: Use a clearer name for the user namespace fd

This fd has nothing to do with cleanup, that's just the name of the
field in zfs_cmd_t that was used to pass it to the kernel.

Call it what it is, an fd for a user namespace.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org>
Closes #13554

2 years agolibzfs: zfs_userns: Don't leak the namespace fd
Ryan Moeller [Mon, 13 Jun 2022 20:24:23 +0000 (20:24 +0000)]
libzfs: zfs_userns: Don't leak the namespace fd

zfs_userns opens a file descriptor for the kernel to look up a
namespace, but does not close it.

Close the fd when we're done with it.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org>
Closes #13554

2 years agoAdd weekly and monthly systemd timers for trimming
Julian Brunner [Sat, 11 Jun 2022 01:22:14 +0000 (03:22 +0200)]
Add weekly and monthly systemd timers for trimming

On machines using systemd, trim timers can be enabled on a per-pool
basis. Weekly and monthly timer units are provided. Timers can be
enabled as follows:

systemctl enable zfs-trim-weekly@rpool.timer --now
systemctl enable zfs-trim-monthly@datapool.timer --now

Each timer will pull in zfs-trim@${poolname}.service, which is not
schedule-specific.

The manpage zpool-trim has been updated accordingly.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Julian Brunner <julian.brunner@gmail.com>
Closes #13544

2 years agoImprove sorted scan memory accounting
Alexander Motin [Fri, 10 Jun 2022 17:01:46 +0000 (13:01 -0400)]
Improve sorted scan memory accounting

Since we use two B-trees q_exts_by_size and q_exts_by_addr, we should
count 2x sizeof (range_seg_gap_t) per node.  And since average B-tree
memory efficiency is about 75%, we should increase it to 3x.

Previous code under-counted up to 30% of the memory usage.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #13537

2 years agoAdd Linux namespace delegation support
Will Andrews [Sun, 21 Feb 2021 16:19:43 +0000 (10:19 -0600)]
Add Linux namespace delegation support

This allows ZFS datasets to be delegated to a user/mount namespace
Within that namespace, only the delegated datasets are visible
Works very similarly to Zones/Jailes on other ZFS OSes

As a user:
```
 $ unshare -Um
 $ zfs list
no datasets available
 $ echo $$
1234
```

As root:
```
 # zfs list
NAME                            ZONED  MOUNTPOINT
containers                      off    /containers
containers/host                 off    /containers/host
containers/host/child           off    /containers/host/child
containers/host/child/gchild    off    /containers/host/child/gchild
containers/unpriv               on     /unpriv
containers/unpriv/child         on     /unpriv/child
containers/unpriv/child/gchild  on     /unpriv/child/gchild

 # zfs zone /proc/1234/ns/user containers/unpriv
```

Back to the user namespace:
```
 $ zfs list
NAME                             USED  AVAIL     REFER  MOUNTPOINT
containers                       129M  47.8G       24K  /containers
containers/unpriv                128M  47.8G       24K  /unpriv
containers/unpriv/child          128M  47.8G      128M  /unpriv/child
```

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Will Andrews <will.andrews@klarasystems.com>
Signed-off-by: Allan Jude <allan@klarasystems.com>
Signed-off-by: Mateusz Piotrowski <mateusz.piotrowski@klarasystems.com>
Co-authored-by: Allan Jude <allan@klarasystems.com>
Co-authored-by: Mateusz Piotrowski <mateusz.piotrowski@klarasystems.com>
Sponsored-by: Buddy <https://buddy.works>
Closes #12263

2 years agoRevert parts of 938cfeb0f27303721081223816d4f251ffeb1767
Allan Jude [Fri, 2 Jul 2021 19:16:58 +0000 (19:16 +0000)]
Revert parts of 938cfeb0f27303721081223816d4f251ffeb1767

When read and writing the UID/GID, we always want the value
relative to the root user namespace, the kernel will take care
of remapping this to the user namespace for us.

Calling from_kuid(user_ns, uid) with a unmapped uid will return -1
as that uid is outside of the scope of that namespace, and will result
in the files inside the namespace all being owned by 'nobody' and not
being allowed to call chmod or chown on them.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Allan Jude <allan@klarasystems.com>
Closes #12263

2 years agoAVL: Remove obsolete branching optimizations
Alexander Motin [Thu, 9 Jun 2022 22:27:36 +0000 (18:27 -0400)]
AVL: Remove obsolete branching optimizations

Modern Clang and GCC can successfully implement simple conditions
without branching with math and flag operations.  Use of arrays for
translation no longer helps as much as it was 14+ years ago.

Disassemble of the code generated by Clang 13.0.0 on FreeBSD 13.1,
Clang 14.0.4 on FreeBSD 14 and GCC 10.2.1 on Debian 11 with this
change still shows no branching instructions.

Profiling of CPU-bound scan stage of sorted scrub shows reproducible
reduction of time spent inside avl_find() from 6.52% to 4.58%.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #13540

2 years agolibzfs: Rename msg bufs to errbuf for consistency
Ryan Moeller [Wed, 8 Jun 2022 16:32:38 +0000 (16:32 +0000)]
libzfs: Rename msg bufs to errbuf for consistency

`libzfs_pool.c` uses the name `msg` where everywhere else in libzfs uses
`errbuf` for the error message buffer.

Use the name consistent with the rest of libzfs and use ERRBUFLEN
instead of 1024.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org>
Closes #13539

2 years agolibzfs: Define the defecto standard errbuf size
Ryan Moeller [Wed, 8 Jun 2022 13:08:10 +0000 (13:08 +0000)]
libzfs: Define the defecto standard errbuf size

Every errbuf array in libzfs is 1024 chars.

Define ERRBUFLEN in a shared header, and use it.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org>
Closes #13539

2 years agozvol: Support blk-mq for better performance
Tony Hutter [Thu, 9 Jun 2022 14:10:38 +0000 (07:10 -0700)]
zvol: Support blk-mq for better performance

Add support for the kernel's block multiqueue (blk-mq) interface in
the zvol block driver.  blk-mq creates multiple request queues on
different CPUs rather than having a single request queue.  This can
improve zvol performance with multithreaded reads/writes.

This implementation uses the blk-mq interfaces on 4.13 or newer
kernels.  Building against older kernels will fall back to the
older BIO interfaces.

Note that you must set the `zvol_use_blk_mq` module param to
enable the blk-mq API.  It is disabled by default.

In addition, this commit lets the zvol blk-mq layer process whole
`struct request` IOs at a time, rather than breaking them down
into their individual BIOs.  This reduces dbuf lock contention
and overhead versus the legacy zvol submit_bio() codepath.

sequential dd to one zvol, 8k volblocksize, no O_DIRECT:

legacy submit_bio()     292MB/s write  453MB/s read
this commit             453MB/s write  885MB/s read

It also introduces a new `zvol_blk_mq_chunks_per_thread` module
parameter. This parameter represents how many volblocksize'd chunks
to process per each zvol thread.  It can be used to tune your zvols
for better read vs write performance (higher values favor write,
lower favor read).

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #13148
Issue #12483

2 years agoIntroduce BLAKE3 checksums as an OpenZFS feature
Tino Reichardt [Wed, 8 Jun 2022 22:55:57 +0000 (00:55 +0200)]
Introduce BLAKE3 checksums as an OpenZFS feature

This commit adds BLAKE3 checksums to OpenZFS, it has similar
performance to Edon-R, but without the caveats around the latter.

Homepage of BLAKE3: https://github.com/BLAKE3-team/BLAKE3
Wikipedia: https://en.wikipedia.org/wiki/BLAKE_(hash_function)#BLAKE3

Short description of Wikipedia:

  BLAKE3 is a cryptographic hash function based on Bao and BLAKE2,
  created by Jack O'Connor, Jean-Philippe Aumasson, Samuel Neves, and
  Zooko Wilcox-O'Hearn. It was announced on January 9, 2020, at Real
  World Crypto. BLAKE3 is a single algorithm with many desirable
  features (parallelism, XOF, KDF, PRF and MAC), in contrast to BLAKE
  and BLAKE2, which are algorithm families with multiple variants.
  BLAKE3 has a binary tree structure, so it supports a practically
  unlimited degree of parallelism (both SIMD and multithreading) given
  enough input. The official Rust and C implementations are
  dual-licensed as public domain (CC0) and the Apache License.

Along with adding the BLAKE3 hash into the OpenZFS infrastructure a
new benchmarking file called chksum_bench was introduced.  When read
it reports the speed of the available checksum functions.

On Linux: cat /proc/spl/kstat/zfs/chksum_bench
On FreeBSD: sysctl kstat.zfs.misc.chksum_bench

This is an example output of an i3-1005G1 test system with Debian 11:

implementation      1k      4k     16k     64k    256k      1m      4m
edonr-generic     1196    1602    1761    1749    1762    1759    1751
skein-generic      546     591     608     615     619     612     616
sha256-generic     240     300     316     314     304     285     276
sha512-generic     353     441     467     476     472     467     426
blake3-generic     308     313     313     313     312     313     312
blake3-sse2        402    1289    1423    1446    1432    1458    1413
blake3-sse41       427    1470    1625    1704    1679    1607    1629
blake3-avx2        428    1920    3095    3343    3356    3318    3204
blake3-avx512      473    2687    4905    5836    5844    5643    5374

Output on Debian 5.10.0-10-amd64 system: (Ryzen 7 5800X)

implementation      1k      4k     16k     64k    256k      1m      4m
edonr-generic     1840    2458    2665    2719    2711    2723    2693
skein-generic      870     966     996     992    1003    1005    1009
sha256-generic     415     442     453     455     457     457     457
sha512-generic     608     690     711     718     719     720     721
blake3-generic     301     313     311     309     309     310     310
blake3-sse2        343    1865    2124    2188    2180    2181    2186
blake3-sse41       364    2091    2396    2509    2463    2482    2488
blake3-avx2        365    2590    4399    4971    4915    4802    4764

Output on Debian 5.10.0-9-powerpc64le system: (POWER 9)

implementation      1k      4k     16k     64k    256k      1m      4m
edonr-generic     1213    1703    1889    1918    1957    1902    1907
skein-generic      434     492     520     522     511     525     525
sha256-generic     167     183     187     188     188     187     188
sha512-generic     186     216     222     221     225     224     224
blake3-generic     153     152     154     153     151     153     153
blake3-sse2        391    1170    1366    1406    1428    1426    1414
blake3-sse41       352    1049    1212    1174    1262    1258    1259

Output on Debian 5.10.0-11-arm64 system: (Pi400)

implementation      1k      4k     16k     64k    256k      1m      4m
edonr-generic      487     603     629     639     643     641     641
skein-generic      271     299     303     308     309     309     307
sha256-generic     117     127     128     130     130     129     130
sha512-generic     145     165     170     172     173     174     175
blake3-generic      81      29      71      89      89      89      89
blake3-sse2        112     323     368     379     380     371     374
blake3-sse41       101     315     357     368     369     364     360

Structurally, the new code is mainly split into these parts:
- 1x cross platform generic c variant: blake3_generic.c
- 4x assembly for X86-64 (SSE2, SSE4.1, AVX2, AVX512)
- 2x assembly for ARMv8 (NEON converted from SSE2)
- 2x assembly for PPC64-LE (POWER8 converted from SSE2)
- one file for switching between the implementations

Note the PPC64 assembly requires the VSX instruction set and the
kfpu_begin() / kfpu_end() calls on PowerPC were updated accordingly.

Reviewed-by: Felix Dörre <felix@dogcraft.de>
Reviewed-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Co-authored-by: Rich Ercolani <rincebrain@gmail.com>
Closes #10058
Closes #12918

2 years agoautoconf: AC_MSG_CHECKING consistency
Brian Behlendorf [Tue, 31 May 2022 23:42:49 +0000 (16:42 -0700)]
autoconf: AC_MSG_CHECKING consistency

Make the wording more consistent for the kernel AC_MSG_CHECKING
output (e.g. "checking whether ...".).  Additionally, group some
of the VFS interface checks with the others.  No functional change.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Attila Fülöp <attila@fueloep.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13529

2 years agoLinux 5.19 compat: asm/fpu/internal.h
Brian Behlendorf [Tue, 31 May 2022 23:30:59 +0000 (16:30 -0700)]
Linux 5.19 compat: asm/fpu/internal.h

As of the Linux 5.19 kernel the asm/fpu/internal.h header was
entirely removed.  It has been effectively empty since the 5.16
kernel and provides no required functionality.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Attila Fülöp <attila@fueloep.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13529

2 years agoRemove wrong assertion in log spacemap
Alexander Motin [Wed, 1 Jun 2022 16:54:35 +0000 (12:54 -0400)]
Remove wrong assertion in log spacemap

It is typical, but not generally true that if log summary has more
blocks it must also have unflushed metaslabs.  Normally with metaslabs
flushed in order it works, but there are known exceptions, such as
device removal or metaslab being loaded during its flush attempt.

Before 600a02b8844 if spa_flush_metaslabs() hit loading metaslab it
usually stopped (unless memlimit is also exceeded), but now it may
flush more metaslabs, just skipping that particular one.  This
increased chances of assertion to fire when the skipped metaslab is
flushed on next iteration if all other metaslabs in that summary
entry are already flushed out of order.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #13486
Closes #13513

2 years agoCorrected parameters for zstd early abort
Rich Ercolani [Tue, 31 May 2022 22:41:33 +0000 (18:41 -0400)]
Corrected parameters for zstd early abort

That'll teach me to try and recall them from the definition.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #13519

2 years agoFix typo in zil_commit() comment block
Allan Jude [Tue, 31 May 2022 22:37:46 +0000 (18:37 -0400)]
Fix typo in zil_commit() comment block

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Allan Jude <allan@klarasystems.com>
Closes #13518

2 years agoLinux 5.18 compat: META
Brian Behlendorf [Tue, 31 May 2022 21:38:00 +0000 (14:38 -0700)]
Linux 5.18 compat: META

Update the META file to reflect compatibility with the 5.18 kernel.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13527

2 years agoLinux 5.19 compat: zap_flags_t conflict
Brian Behlendorf [Fri, 27 May 2022 22:56:05 +0000 (15:56 -0700)]
Linux 5.19 compat: zap_flags_t conflict

As of the Linux 5.19 kernel an identically named zap_flags_t typedef
is declared in the include/linux/mm_types.h linux header.  Sadly,
the inclusion of this header cannot be easily avoided.  To resolve
the conflict a #define is used to remap the name in the OpenZFS
sources when building against the Linux kernel.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13515

2 years agoLinux 5.19 compat: bdev_start_io_acct() / bdev_end_io_acct()
Brian Behlendorf [Fri, 27 May 2022 21:31:03 +0000 (21:31 +0000)]
Linux 5.19 compat: bdev_start_io_acct() / bdev_end_io_acct()

As of the Linux 5.19 kernel the disk_*_io_acct() helper functions
have been replaced by the bdev_*_io_acct() functions.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13515

2 years agoLinux 5.19 compat: aops->read_folio()
Brian Behlendorf [Fri, 27 May 2022 20:44:43 +0000 (20:44 +0000)]
Linux 5.19 compat: aops->read_folio()

As of the Linux 5.19 kernel the readpage() address space operation
has been replaced by read_folio().

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13515

2 years agoLinux 5.19 compat: blkdev_issue_secure_erase()
Brian Behlendorf [Fri, 27 May 2022 19:40:22 +0000 (19:40 +0000)]
Linux 5.19 compat: blkdev_issue_secure_erase()

Linux 5.19 commit torvalds/linux@44abff2c0 splits the secure
erase functionality from the blkdev_issue_discard() function.
The blkdev_issue_secure_erase() must now be issued to issue
a secure erase.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13515

2 years agoLinux 5.19 compat: bdev_max_secure_erase_sectors()
Brian Behlendorf [Fri, 27 May 2022 18:20:04 +0000 (18:20 +0000)]
Linux 5.19 compat: bdev_max_secure_erase_sectors()

Linux 5.19 commit torvalds/linux@44abff2c0 removed the
blk_queue_secure_erase() helper function.  The preferred
interface is to now use the bdev_max_secure_erase_sectors()
function to check for discard support.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13515

2 years agoLinux 5.19 compat: bdev_max_discard_sectors()
Brian Behlendorf [Fri, 27 May 2022 17:51:55 +0000 (17:51 +0000)]
Linux 5.19 compat: bdev_max_discard_sectors()

Linux 5.19 commit torvalds/linux@70200574cc removed the
blk_queue_discard() helper function.  The preferred interface
is to now use the bdev_max_discard_sectors() function to check
for discard support.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13515

2 years agoLinux 5.18 compat: bio_alloc()
Brian Behlendorf [Fri, 27 May 2022 20:28:51 +0000 (20:28 +0000)]
Linux 5.18 compat: bio_alloc()

As for the Linux 5.18 kernel bio_alloc() expects a block_device struct
as an argument.  This removes the need for the bio_set_dev() compatibility
code for 5.18 and newer kernels.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13515

2 years agoFix inflated quiesce time caused by lwb_tx during zil_commit()
Kevin Jin [Thu, 26 May 2022 16:36:14 +0000 (12:36 -0400)]
Fix inflated quiesce time caused by lwb_tx during zil_commit()

In current zil_commit() process, transaction lwb_tx is assigned in
zil_lwb_write_issue(), and is committed in zil_lwb_flush_vdevs_done().
Thus, during lwb write out process, the txg is held in open or quiesing
state, until zil_lwb_flush_vdevs_done() is called. If the zil's zio
latency is high, it will cause txg_sync_thread() to starve.

The goal here is to defer waiting for zil_lwb_flush_vdevs_done to the
'syncing' txg state. That is, in zil_sync().

In this patch, it achieves the goal without holding transaction.
A new function zil_lwb_flush_wait_all() is introduced. It waits for
the completion of all the zil_lwb_flush_vdevs_done() by given txg.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Prakash Surya <prakash.surya@delphix.com>
Signed-off-by: jxdking <lostking2008@hotmail.com>
Closes #12321

2 years agoReplace EXTRA_DIST with dist_noinst_DATA
Brian Behlendorf [Thu, 26 May 2022 16:24:50 +0000 (09:24 -0700)]
Replace EXTRA_DIST with dist_noinst_DATA

The EXTRA_DIST variable is ignored when used in the FALSE conditional
of a Makefile.am.  This results in the `make dist` target omitting
these files from the generated tarball unless CONFIG_USER is defined.
This issue can be avoided by switching to use the dist_noinst_DATA
variable which is handled as expected by autoconf.

This change also adds support for --with-config=dist as an alias
for --with-config=srpm and updates the GitHub workflows to use it.

Reviewed-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13459
Closes #13505

2 years agoSilence unused-but-set-variable warning
Ryan Moeller [Thu, 26 May 2022 00:26:59 +0000 (20:26 -0400)]
Silence unused-but-set-variable warning

This was breaking the kmod port build on FreeBSD with Clang 13.

Use the same trick as we do for ASSERT() to make DNODE_VERIFY() use
its parameter at compile time without actually using it at run time
in non-debug builds.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #13507

2 years agoMore speculative prefetcher improvements
Alexander Motin [Wed, 25 May 2022 17:12:52 +0000 (13:12 -0400)]
More speculative prefetcher improvements

- Make prefetch distance adaptive: up to 4MB prefetch doubles for
every, hit same as before, but after that it grows by 1/8 every time
the prefetch read does not complete in time to satisfy the demand.
My tests show that 4MB is sufficient for wide NVMe pool to saturate
single reader thread at 2.5GB/s, while new 64MB maximum allows the
same thread to reach 1.5GB/s on wide HDD pool.  Further distance
increase may increase speed even more, but less dramatic and with
higher latency.

 - Allow early reuse of inactive prefetch streams: streams that never
saw hits can be reused immediately if there is a demand, while others
can be reused after 1s of inactivity, starting with the oldest.  After
2s of inactivity streams are deleted to free resources same as before.
This allows by several times increase strided read performance on HDD
pool in presence of simultaneous random reads, previously filling the
zfetch_max_streams limit for seconds and so blocking most of prefetch.

 - Always issue intermediate indirect block reads with SYNC priority.
Each of those reads if delayed for longer may delay up to 1024 other
block prefetches, that may be not good for wide pools.

Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #13452

2 years agoautomake: don't install /e/d/zfs or /e/z/zfs-functions +x
наб [Wed, 25 May 2022 16:29:47 +0000 (18:29 +0200)]
automake: don't install /e/d/zfs or /e/z/zfs-functions +x

_SCRIPTS means it's made +x when installing; _DATA is made -x.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13496
Closes #13503

2 years agoCancel in-progress rebuilds when we finish removal
Paul Dagnelie [Wed, 25 May 2022 16:25:13 +0000 (09:25 -0700)]
Cancel in-progress rebuilds when we finish removal

This issue was discovered by zloop runs. When a mirror or other
redundant top-level vdev has a disk failure, and the disk is replaced,
the rebuild process occurs. A removal can happen while this is in
progress. If the removal completes before the rebuild does, the
removal process will try to free the vdev that is still in use.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
Closes #13498

2 years agorpm: Keep debug symbols if configured with '--enable-debuginfo'
Umer Saleem [Wed, 25 May 2022 16:22:11 +0000 (21:22 +0500)]
rpm: Keep debug symbols if configured with '--enable-debuginfo'

Do not strip debug information from packages if '--enable-debuginfo' is
configured.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Umer Saleem <usaleem@ixsystems.com>
Closes #13500

2 years agoStandardize RHEL version check in packages
Brian Behlendorf [Wed, 25 May 2022 16:20:17 +0000 (09:20 -0700)]
Standardize RHEL version check in packages

This is a follow up to 3c356622994 which standardizes how the RHEL
version check is done.  This simpler "0%{?rhel}" check is used
elsewhere in the packages so we do the same here.

Reviewed-by: Neal Gompa <ngompa@datto.com>
Reviewed-by: Rich Ercolani <rincebrain@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13501

2 years agoUnbreak zstd build on sparc64
Rich Ercolani [Wed, 25 May 2022 16:18:49 +0000 (12:18 -0400)]
Unbreak zstd build on sparc64

It turns out that wrapping the atomic macro in () breaks build
on Linux/SPARC64. Oops.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #13506

2 years agoSwitch sed -E to -r for better portability
Brian Behlendorf [Wed, 25 May 2022 16:13:51 +0000 (09:13 -0700)]
Switch sed -E to -r for better portability

GNU sed 4.1.2 does not support the -E flag and this version is used by
some cross-compiling tool chains.  Switch -E to -r which is understood.

Reviewed-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13502

2 years agorpm: Use the correct version-release information in dependencies
Neal Gompa (ニール・ゴンパ) [Tue, 24 May 2022 21:07:01 +0000 (17:07 -0400)]
rpm: Use the correct version-release information in dependencies

This tightly links the subpackages together and ensures that everything
is upgraded together.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Neal Gompa <ngompa@datto.com>
Closes #13489

2 years agoRefactor Log Size Limit
Alexander Motin [Tue, 24 May 2022 16:46:35 +0000 (12:46 -0400)]
Refactor Log Size Limit

Original Log Size Limit implementation blocked all writes in case of
limit reached until the TXG is committed and the log is freed.  It
caused huge delays and following speed spikes in application writes.

This implementation instead smoothly throttles writes, using exactly
the same mechanism as used for dirty data.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: jxdking <lostking2008@hotmail.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Issue #12284
Closes #13476

2 years agoTiered early abort, zstd edition
Rich Ercolani [Tue, 24 May 2022 16:43:22 +0000 (12:43 -0400)]
Tiered early abort, zstd edition

It turns out that "do LZ4 and zstd-1 both fail" is a great heuristic
for "don't even bother trying higher zstd tiers".

By way of illustration:
$ cat /incompress | mbuffer | zfs recv -o compression=zstd-12 evenfaster/lowcomp_1M_zstd12_normal
summary: 39.8 GiByte in  3min 40.2sec - average of  185 MiB/s
$ echo 3 | sudo tee /sys/module/zzstd/parameters/zstd_lz4_pass
3
$ cat /incompress | mbuffer -m 4G | zfs recv -o compression=zstd-12 evenfaster/lowcomp_1M_zstd12_patched
summary: 39.8 GiByte in 48.6sec - average of  839 MiB/s
$ sudo zfs list -p -o name,used,lused,ratio evenfaster/lowcomp_1M_zstd12_normal evenfaster/lowcomp_1M_zstd12_patched
NAME                                         USED        LUSED  RATIO
evenfaster/lowcomp_1M_zstd12_normal   39549931520  42721221632   1.08
evenfaster/lowcomp_1M_zstd12_patched  39626399744  42721217536   1.07
$ python3 -c "print(39626399744 - 39549931520)"
76468224
$

I'll take 76 MB out of 42 GB for > 4x speedup.

Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Kjeld Schouten <kjeld@schouten-lebbing.nl>
Reviewed-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #13244

2 years agoFreeBSD: libspl: Add locking around statfs globals
Ryan Moeller [Tue, 24 May 2022 16:40:20 +0000 (12:40 -0400)]
FreeBSD: libspl: Add locking around statfs globals

Makes getmntent and getmntany thread-safe for external consumers of
libzfs zpool_disable_datasets, zfs_iter_mounted, libzfs_mnttab_update,
libzfs_mnttab_find.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org>
Closes #13484

2 years agoModified ncompress requirement in RPM to exclude RHEL9
Rich Ercolani [Tue, 24 May 2022 16:39:32 +0000 (12:39 -0400)]
Modified ncompress requirement in RPM to exclude RHEL9

The bug this was working around is no longer present.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #13480
Closes #13490

2 years agozed: Take no action on scrub/resilver checksum errors
Brian Behlendorf [Tue, 24 May 2022 16:36:07 +0000 (09:36 -0700)]
zed: Take no action on scrub/resilver checksum errors

When scrubbing/resilvering a pool it can be counter productive to
cancel the scan and kick of a replace operation to a hot spare
when encountering checksum errors.  In this case, the best course
of action is to allow the scrub/resilver to complete as quickly
as possible and to keep the vdevs fully online if possible.

Realistically, this is less of an issue for a RAIDZ since a
traditional resilver must be used and checksums will be verified.
However, this is not the case for a mirror or dRAID pool which is
sequentially resilvered and checksum verification is deferred
until after the replace operation completes.

Regardless, we apply this policy to all pool types since it's
a good idea for all vdevs.  Degrading additional vdevs has the
potential to make a bad situation worse.  Note the checksum
errors will still be reported as both an event and by
`zpool status`.  This change only prevents the ZED from
proactively taking any action.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13499

2 years agoVerify BPs in spa_load_verify_cb() and dsl_scan_visitbp()
Brian Behlendorf [Fri, 20 May 2022 17:36:14 +0000 (10:36 -0700)]
Verify BPs in spa_load_verify_cb() and dsl_scan_visitbp()

We want `zpool import` to be highly robust and never panic, even
when encountering corrupt metadata.  This is already handled in the
arc_read() code path, which covers most cases, but spa_load_verify_cb()
relies on zio_read() and is responsible for verifying the block pointer.

During import it is also possible to encounter blocks pointers which
contain ZIO_COMPRESS_INHERIT and ZIO_CHECKSUM_INHERIT values.  Relax
the verification function slightly to allow this.

Futhermore, extend dsl_scan_recurse() to verify the block pointer
contents of level zero blocks which are not of type DMU_OT_DNODE or
DMU_OT_OBJSET.  This is handled by arc_read() in the other cases.

Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13124
Closes #13360

2 years agozdb: Fix handling of nul termination in symlink targets
Mark Johnston [Fri, 20 May 2022 17:32:49 +0000 (13:32 -0400)]
zdb: Fix handling of nul termination in symlink targets

The SA attribute containing the symlink target does not include a nul
terminator, so when printing the target zdb would sometimes include
garbage at the end of the string.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Closes #13482

2 years agolinux: libshare: smb: don't swallow net(1) errors
наб [Wed, 18 May 2022 22:56:38 +0000 (00:56 +0200)]
linux: libshare: smb: don't swallow net(1) errors

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13191
Closes #13470

2 years agolibzfs: return (allocated) strings instead of filling buffers
наб [Thu, 14 Apr 2022 22:00:02 +0000 (00:00 +0200)]
libzfs: return (allocated) strings instead of filling buffers

This also expands the zfs version output from 127 characters to However
Many Are Actually Set

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13330

2 years agolinux: libzfs: simplify module-loaded check
наб [Thu, 14 Apr 2022 21:30:41 +0000 (23:30 +0200)]
linux: libzfs: simplify module-loaded check

The short-path is now one access() call,
we always modprobe zfs (ZFS_MODULE_LOADING which doesn't use the libzfs
boolean parsing is gone),
and we use a simple inotify IN_CREATE loop with a timerfd timeout
rather than 10ms kernel-style polling

There's one substantial difference: ZFS_MODULE_TIMEOUT=-1
now means "never give up", rather than "wait 10 minutes"

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13330

2 years agoRemove final K&R definitions
наб [Tue, 10 May 2022 21:28:02 +0000 (23:28 +0200)]
Remove final K&R definitions

Clang trunk now warns -Wstrict-prototypes on this, and they're removed
in C2x

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13447

2 years agolibspl/include: remove unused/empty headers
наб [Tue, 10 May 2022 21:25:43 +0000 (23:25 +0200)]
libspl/include: remove unused/empty headers

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13447

2 years agokmodtool: cleanup
наб [Tue, 10 May 2022 20:11:30 +0000 (22:11 +0200)]
kmodtool: cleanup

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13447

2 years agorpm: don't spec obsolete_name/version anymore
наб [Tue, 10 May 2022 20:10:57 +0000 (22:10 +0200)]
rpm: don't spec obsolete_name/version anymore

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13447

2 years agoAdd make regen-tests to regenerate the test bundle
наб [Tue, 10 May 2022 20:05:20 +0000 (22:05 +0200)]
Add make regen-tests to regenerate the test bundle

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13447

2 years agozed: support subject as header in zed_notify_email()
heeplr [Wed, 18 May 2022 17:27:53 +0000 (19:27 +0200)]
zed: support subject as header in zed_notify_email()

Some minimal MUAs don't support passing the subjects as cmdline option.
This commit checks if "@SUBJECT@" is missing in ZED_EMAIL_OPTS and then
prepends a subject header to the notification message.
Also set a default for ${subject}.

Reviewed-by: Ahelenia Ziemia<C5><84>ska <nabijaczleweli@nabijaczleweli.xyz>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Daniel Hiepler <d-git@coderdu.de>
Closes #13440

2 years agoExpose zpool guids through kstats
Andrew [Wed, 18 May 2022 17:25:33 +0000 (12:25 -0500)]
Expose zpool guids through kstats

There are times when end-users may wish to have
a fast and convenient method to get zpool guid
without having to use libzfs. This commit
exposes the zpool guid via kstats in similar
manner to the zpool state.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Andrew Walker <awalker@ixsystems.com>
Closes #13466

2 years agoFix compiler warnings about zero-length arrays in inline bitops
Coleman Kane [Tue, 17 May 2022 20:07:39 +0000 (16:07 -0400)]
Fix compiler warnings about zero-length arrays in inline bitops

The compiler appears to be expanding the unused NULL pointer into a
zero-length array via the inline bitops code. When -Werror=array-bounds
is used, this causes a build failure. Recommended solution is allocate
temporary structures, fill with zeros (to avoid uninitialized data use
warnings), and pass the pointer to those to the inline calls.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #13463
Closes #13465

2 years agolinux: libzutil: zfs_strip_path: only strip known prefixes
наб [Tue, 3 May 2022 18:13:22 +0000 (20:13 +0200)]
linux: libzutil: zfs_strip_path: only strip known prefixes

This mirrors FreeBSD:
  # zpool create -o cachefile= testpsko media/testpsko
  # zpool create -o cachefile= testpsko2 $PWD/testpsko2
  $ ./zpool list -v
  NAME                                              SIZE  ALLOC   FREE
  filling                                          25.5T  6.85T  18.6T
    mirror-0                                       3.64T   500G  3.15T
      ata-HGST_HUS726T4TALE6L4_V6K2L4RR                -      -      -
      ata-HGST_HUS726T4TALE6L4_V6K2MHYR                -      -      -
    raidz1-1                                       21.8T  6.36T  15.5T
      ata-HGST_HUS728T8TALE6L4_VDKT237K                -      -      -
      ata-HGST_HUS728T8TALE6L4_VDGY075D                -      -      -
      ata-HGST_HUS728T8TALE6L4_VDKVRRJK                -      -      -
  cache                                                -      -      -
    nvme0n1p4                                      63.0G  12.8G  50.2G
  tarta-boot                                        240M  50.0M   190M
    mirror-0                                        240M  50.0M   190M
      tarta-boot                                       -      -      -
      tarta-boot-nvme                                  -      -      -
  tarta-zoot                                       55.5G  6.96G  48.5G
    mirror-0                                       55.5G  6.96G  48.5G
      tarta-zoot                                       -      -      -
      tarta-zoot-nvme                                  -      -      -
  testpsko                                         39.5G   744K  39.5G
    media/testpsko1                                39.5G   744K  39.5G
  testpsko2                                        39.5G   130K  39.5G
    /home/nabijaczleweli/store/code/zfs/testpsko2  39.5G   130K  39.5G

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13413
Closes #9771

2 years agolibzfs: constify zfs_strip_partition(), zfs_strip_path()
наб [Tue, 3 May 2022 17:56:12 +0000 (19:56 +0200)]
libzfs: constify zfs_strip_partition(), zfs_strip_path()

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13413

2 years agolibzfs: pool: zpool_vdev_name: use libzfs_envvar_is_set
наб [Tue, 3 May 2022 17:38:15 +0000 (19:38 +0200)]
libzfs: pool: zpool_vdev_name: use libzfs_envvar_is_set

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13413

2 years agozpool: max_width: monomorphise subtype iteration
наб [Tue, 3 May 2022 17:33:31 +0000 (19:33 +0200)]
zpool: max_width: monomorphise subtype iteration

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13413

2 years agolinux: spl: generic: ddi_strto*: match solaris ddi_strto*(9)
наб [Sat, 7 May 2022 17:54:29 +0000 (19:54 +0200)]
linux: spl: generic: ddi_strto*: match solaris ddi_strto*(9)

Recognise initial whitespace, + in both cases,
and - also in unsigneds

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13434

2 years agolinux: spl: generic: ddi_strtou##type: elide unused flag
наб [Sat, 7 May 2022 17:23:28 +0000 (19:23 +0200)]
linux: spl: generic: ddi_strtou##type: elide unused flag

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13434

2 years agoRemove hw_serial, ddi_strtoul()
наб [Sat, 7 May 2022 17:18:41 +0000 (19:18 +0200)]
Remove hw_serial, ddi_strtoul()

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13434

2 years agoFix typos in zfs-bookmark examples
Mateusz Piotrowski [Thu, 12 May 2022 16:34:24 +0000 (18:34 +0200)]
Fix typos in zfs-bookmark examples

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Mateusz Piotrowski <0mp@FreeBSD.org>
Closes #13456

2 years agolinux: libshare/nfs: don't do anything unless exportfs is available
наб [Sun, 17 Apr 2022 13:03:03 +0000 (15:03 +0200)]
linux: libshare/nfs: don't do anything unless exportfs is available

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13165
Closes #13324

2 years agolinux: libshare/smb: cache smb_available
наб [Sun, 17 Apr 2022 13:00:15 +0000 (15:00 +0200)]
linux: libshare/smb: cache smb_available

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13165

2 years agolibzfs: zfs_unshare: minor cleanup
наб [Fri, 18 Mar 2022 17:40:13 +0000 (18:40 +0100)]
libzfs: zfs_unshare: minor cleanup

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13165

2 years agotests: add zfs_unshare_008_pos checking whitespace escaping
наб [Sun, 6 Mar 2022 00:39:54 +0000 (01:39 +0100)]
tests: add zfs_unshare_008_pos checking whitespace escaping

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13165

2 years agolibshare/nfs: escape mount points when needed
наб [Mon, 28 Feb 2022 19:42:22 +0000 (20:42 +0100)]
libshare/nfs: escape mount points when needed

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13165
Closes #13153

2 years agolinux: libshare/nfs: bsearch() over valid keys
наб [Mon, 28 Feb 2022 16:44:06 +0000 (17:44 +0100)]
linux: libshare/nfs: bsearch() over valid keys

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13165

2 years agolibzfs: mount: zfs_unshare: don't reallocate mountpoint
наб [Mon, 28 Feb 2022 15:55:16 +0000 (16:55 +0100)]
libzfs: mount: zfs_unshare: don't reallocate mountpoint

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13165

2 years agoReplace libzfs sharing _nfs() and _smb() APIs with protocol lists
наб [Mon, 28 Feb 2022 15:52:07 +0000 (16:52 +0100)]
Replace libzfs sharing _nfs() and _smb() APIs with protocol lists

With the additional benefit of removing all the _all() functions and
treating a NULL list as "all" ‒ the remaining all function is for all
/datasets/, which is consistent with the rest of the API

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13165

2 years agoPublish libshare protocols, use enum-based API
наб [Mon, 28 Feb 2022 14:46:25 +0000 (15:46 +0100)]
Publish libshare protocols, use enum-based API

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13165

2 years agolibshare: delineate obsolete errors
наб [Mon, 28 Feb 2022 14:00:49 +0000 (15:00 +0100)]
libshare: delineate obsolete errors

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13165

2 years agolibshare: use AVL tree with static data, pass all data in arguments
наб [Mon, 28 Feb 2022 13:50:28 +0000 (14:50 +0100)]
libshare: use AVL tree with static data, pass all data in arguments

This makes it so we don't leak a consistent 64 bytes anymore,
makes the searches simpler and faster, removes /all allocations/
from the driver (quite trivially, since they were absolutely needless),
and makes libshare thread-safe (except, maybe, linux/smb, but that only
does pointer-width loads/stores so it's also mostly fine, except for
leaking smb_shares)

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13165

2 years agolibshare: interface: {=> const} char *
наб [Mon, 28 Feb 2022 12:37:06 +0000 (13:37 +0100)]
libshare: interface: {=> const} char *

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13165

2 years agolibshare/smb: cleanup
наб [Mon, 28 Feb 2022 12:13:10 +0000 (13:13 +0100)]
libshare/smb: cleanup

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13165

2 years agolibshare/nfs: destaticify nfs_lock_fd
наб [Mon, 28 Feb 2022 11:57:47 +0000 (12:57 +0100)]
libshare/nfs: destaticify nfs_lock_fd

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13165

2 years agofreebsd: libshare/nfs: write directly in translate_opts()
наб [Mon, 28 Feb 2022 11:55:07 +0000 (12:55 +0100)]
freebsd: libshare/nfs: write directly in translate_opts()

This renders it thread-safe

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13165

2 years agofreebsd: libshare/nfs: constify static const data
наб [Mon, 28 Feb 2022 11:40:14 +0000 (12:40 +0100)]
freebsd: libshare/nfs: constify static const data

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13165

2 years agoAdd missing AC_MSG_RESULT(no) to configure
Brian Behlendorf [Thu, 12 May 2022 16:12:32 +0000 (09:12 -0700)]
Add missing AC_MSG_RESULT(no) to configure

When the HAVE_IOPS_MKDIR_USERNS check fails output result
as required.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13454

2 years agoztest: reduce runtile of zloop.sh in CI
Brian Behlendorf [Thu, 12 May 2022 16:11:29 +0000 (09:11 -0700)]
ztest: reduce runtile of zloop.sh in CI

The zloop.sh script is primarily designed to randomly stress
the DMU and SPA layers.  This can result in some unrealistic
(or even impossible) scenarios being tested which then fail.

Since the longer we run zloop.sh the more likely this is to occur
this commit reduces the runtime.  The intention being that normally
this will result in a clean CI run unless the PR does introduce
serious breaking change.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13453

2 years agoAdded a workaround for Linux KASAN builds
Rich Ercolani [Wed, 11 May 2022 20:26:55 +0000 (16:26 -0400)]
Added a workaround for Linux KASAN builds

Linux passes -Wframe-larger-than=1024, which breaks
our build in a number of places with -Werror.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #13450

2 years agoudev: zvol_id: simplify/modernise
наб [Wed, 11 May 2022 17:58:19 +0000 (19:58 +0200)]
udev: zvol_id: simplify/modernise

zero-alloc, sensibler errors, don't close (or free) before exit.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13337

2 years agoztest: O_CLOEXEC ztest_fd_rand
наб [Tue, 3 May 2022 14:58:40 +0000 (16:58 +0200)]
ztest: O_CLOEXEC ztest_fd_rand

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13411

2 years agoztest: take -B ./path/to/ztest, LD_LIBRARY_PATH=./path/lib:$L_L_P
наб [Tue, 3 May 2022 14:55:14 +0000 (16:55 +0200)]
ztest: take -B ./path/to/ztest, LD_LIBRARY_PATH=./path/lib:$L_L_P

This changes the behaviour of -B from the illumos one which would,
in the example in the manual, take just ./chroots/lenny;
this, however, is more versatile, and scales much better for systems
with ZFS in /usr/local, for example

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13411
Closes #1770

2 years agotests: many_fds: simplify, modernise
наб [Tue, 3 May 2022 13:40:34 +0000 (15:40 +0200)]
tests: many_fds: simplify, modernise

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13411

2 years agoRemove enable_extended_FILE_stdio()
наб [Tue, 3 May 2022 12:52:53 +0000 (14:52 +0200)]
Remove enable_extended_FILE_stdio()

Even on Illumos it's only available in the 32-bit programming
environment, and, quoth enable_extended_FILE_stdio(3C):
> Historically, 32-bit Solaris applications have been limited to using
> only the file descriptors 0 through 255 with the standard I/O
> functions (see stdio(3C)) in the C library. The extended FILE
> facility allows well-behaved 32-bit applications to use any
> valid file descriptor with the standard I/O functions.
where "well-behaved" means that it
> does not directly access any fields in the FILE structure pointed
> to by the FILE pointer associated with any standard I/O stream,

And the stdio/flush.c implementation reads:
  /*
   * if this is not an internal extended FILE then check
   * if _file is being changed from underneath us.
   * It should not be because if
   * it is then then we lose our ability to guard against
   * silent data corruption.
   */
  if (!iop->__xf_nocheck && bad_fd > -1 && iop->_magic != bad_fd) {
      (void) fprintf(stderr,
          "Application violated extended FILE safety mechanism.\n"
          "Please read the man page for extendedFILE.\nAborting\n");
      abort();
  }

This appears to be an insane workaround for broken implementation with
exposed FILE internals and _file being an u8, both only on non-LP64;
it's shimmed out on all LP64 targets in Illumos,
and we shim it out as well: just get rid of it

This appears to've been originally fixed in illumos-gate
a5f69788de7ac07553de47f7fec8c05a9a94c105 ("PSARC 2006/162 Extended FILE
space for 32-bit Solaris processes", "1085341 32-bit stdio routines
should support file descriptors >255"), which also bears extendedFILE
and enable_extended_FILE_stdio(3C):
  -       unsigned char   _file;  /* UNIX System file descriptor */
  +       unsigned char   _magic; /* Old home of the file descriptor */
  +                               /* Only fileno(3C) can retrieve the
   value now */
and
  +/*
  + * Macros to aid the extended fd FILE work.
  + * This helps isolate the changes to only the 32-bit code
  + * since 64-bit Solaris is not affected by this.
  + */
  +#ifdef  _LP64
  +#define        GET_FD(iop)             ((iop)->_file)
  +#define        SET_FILE(iop, fd)       ((iop)->_file = (fd))
  +#else
  +#define        GET_FD(iop)             \
  +               (((iop)->__extendedfd) ? _file_get(iop) : (iop)->_magic)
  +#define        SET_FILE(iop, fd)       (iop)->_magic = (fd); (iop)->__extendedfd = 0
  +#endif

Also remove the 1k setrlimit(NOFILE) calls: that's the default on Linux,
with 64k on Illumos and 171k on FreeBSD

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13411

2 years agoautoconf: Fail when __copy_from_user_inatomic is a non-GPL symbol
szubersk [Sat, 7 May 2022 00:53:42 +0000 (00:53 +0000)]
autoconf: Fail when __copy_from_user_inatomic is a non-GPL symbol

A followup to 849c14e04844a2f0e1f7e42886c2cef083563f35
Fix https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1009242

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: szubersk <szuberskidamian@gmail.com>
Closes #13389

2 years agoAdding ZTS test for O_APPEND
Brian Atkinson [Wed, 11 May 2022 15:38:16 +0000 (11:38 -0400)]
Adding ZTS test for O_APPEND

Commit 63b18e4 fixed an issue in zpl_aio_write() to make sure that
kiocb->ki_pos was updated correctly when opening a file with O_APPEND.
Adding a test to verify O_APPEND functionality with lseek can make
sure that all other distros/kernel versions also have the correct
behavior.

Also moved the threadappends_001_pos test into this append test
directory in functional ZTS directory. This way the two append tests
are together for organization purposes.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Brian Atkinson <batkinson@lanl.gov>
Closes #13424

2 years agoRemove constrained path on clean
наб [Tue, 3 May 2022 11:17:50 +0000 (13:17 +0200)]
Remove constrained path on clean

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13316

2 years agoztest: fix in-tree detection for automatic zdb path
наб [Tue, 3 May 2022 10:22:30 +0000 (12:22 +0200)]
ztest: fix in-tree detection for automatic zdb path

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13316

2 years agoztest: use $ZDB instead of $ZDB_PATH for zdb
наб [Tue, 3 May 2022 10:15:46 +0000 (12:15 +0200)]
ztest: use $ZDB instead of $ZDB_PATH for zdb

Which actually gets zdb as set in common.sh

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13316

2 years agoscripts: zpool.sh: cleanup
наб [Fri, 29 Apr 2022 23:21:16 +0000 (01:21 +0200)]
scripts: zpool.sh: cleanup

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13316

2 years agocppcheck: explicitly exclude kernel code from userspace checks
наб [Mon, 25 Apr 2022 21:27:03 +0000 (23:27 +0200)]
cppcheck: explicitly exclude kernel code from userspace checks

Thus extracting the final shred of utility

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13316

2 years agoMakefile: respect V=1 for checks
наб [Mon, 11 Apr 2022 22:56:32 +0000 (00:56 +0200)]
Makefile: respect V=1 for checks

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13316

2 years agoautogen.sh: paper over automake <1.14's lack of %reldir% support
наб [Mon, 11 Apr 2022 21:41:14 +0000 (23:41 +0200)]
autogen.sh: paper over automake <1.14's lack of %reldir% support

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13316

2 years agocontrib: bash_completion.d: install, fix dist
наб [Sun, 10 Apr 2022 21:10:02 +0000 (23:10 +0200)]
contrib: bash_completion.d: install, fix dist

dist diff:
  -zfs-2.1.99/contrib/bash_completion.d/zfs

install diff:
  +destdir/usr/local/etc/bash_completion.d
  +destdir/usr/local/etc/bash_completion.d/zfs

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13316