]> git.proxmox.com Git - mirror_zfs.git/log
mirror_zfs.git
9 months agoMove nodes into correct subgraphs
Andrew Innes [Mon, 29 Jan 2024 17:16:02 +0000 (01:16 +0800)]
Move nodes into correct subgraphs

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Signed-off-by: Andrew Innes <andrew.c12@gmail.com>
Closes #15828

9 months agoRemove list_size struct member from list implementation
MigeljanImeri [Fri, 26 Jan 2024 22:46:42 +0000 (15:46 -0700)]
Remove list_size struct member from list implementation

Removed the list_size struct member as it was only used in a single
assertion, as mentioned in PR #15478.

Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: MigeljanImeri <imerimigel@gmail.com>
Closes #15812

9 months agozpool wait: print timestamp before the header
Rob N [Fri, 26 Jan 2024 22:41:31 +0000 (09:41 +1100)]
zpool wait: print timestamp before the header

list, status and iostat all display the -T timestamp before the header,
but wait showed it after. Make it be like the others.

Reported-by: Kyle Evans <kevans@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Closes #15825

9 months agoUpdate vdev devid and physpath if changed between imports
Ameer Hamza [Fri, 26 Jan 2024 22:24:35 +0000 (03:24 +0500)]
Update vdev devid and physpath if changed between imports

If devid or physpath for a vdev changes between imports, ensure it is
updated to the new value.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes #15816

9 months agoZTS: Update deprecated Github Action version numbers
Tino Reichardt [Fri, 26 Jan 2024 22:22:26 +0000 (23:22 +0100)]
ZTS: Update deprecated Github Action version numbers

GitHub Actions is transitioning from Node 16 to Node 20.

So we need to update these:
- actions/checkout@v3 -> v4
- actions/download-artifact@v3 -> v4
- actions/upload-artifact@v3 -> v4 and some minor changes

Update also the documentation of the testings workflow.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Andrew Innes <andrew.c12@gmail.com>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Closes #15820

9 months agoSwitch to CodeQL to detect prohibited function use
Richard Yao [Fri, 26 Jan 2024 22:11:33 +0000 (17:11 -0500)]
Switch to CodeQL to detect prohibited function use

The LLVM/Clang developers pointed out that using the CPP to detect use
of functions that our QA policies prohibit risks invoking undefined
behavior. To resolve this, we configure CodeQL to detect forbidden
function usage.

Note that cpp in the context of CodeQL refers to C/C++, rather than the
C PreProcessor, which C++ also uses. It really should have been written
cxx, but that ship sailed a long time ago. This misuse of the term cpp
is retained in the CodeQL configuration for consistency with upstream
CodeQL.

As a side benefit, verbose make no longer is a wall of text showing a
bunch of CPP macros, which can make debugging slightly easier.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #15819
Closes #14134

9 months agoZTS: Apply small changes for speeding up the tests
Tino Reichardt [Fri, 26 Jan 2024 21:36:59 +0000 (22:36 +0100)]
ZTS: Apply small changes for speeding up the tests

The Github Action Runner got some new hardware metrics.  We should use
the provided and empty disk which is pre-mounted at /mnt now.

Disk1: 89GiB -> rootfs + bootfs with ~80MB/s -> don't care
Disk2: 64GiB -> /mnt with 420MB/s -> new testing ssd

This commit will mount the new disk to /var/tmp and provide hopefully
some speedups within our testings.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Andrew Innes <andrew.c12@gmail.com>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Closes #15811

10 months agoFix file descriptor leak on pool import.
Pawel Jakub Dawidek [Tue, 23 Jan 2024 23:03:48 +0000 (15:03 -0800)]
Fix file descriptor leak on pool import.

Descriptor leak can be easily reproduced by doing:

# zpool import tank
# sysctl kern.openfiles
# zpool export tank; zpool import tank
# sysctl kern.openfiles

We were leaking four file descriptors on every import.

Similar leak most likely existed when using file-based VDEVs.

External-issue: https://reviews.freebsd.org/D43529
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Pawel Jakub Dawidek <pawel@dawidek.net>
Closes #15630

10 months agoZTS: Apply zfs_bclone_enabled to bclone tests
Brian Behlendorf [Tue, 23 Jan 2024 00:14:08 +0000 (16:14 -0800)]
ZTS: Apply zfs_bclone_enabled to bclone tests

If block cloning is disabled by default then enable it when running
the bclone tests.  Follow up to #15529.

Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #15796

10 months agoFreeBSD: Fix bootstrapping tools under Linux/musl
Val Packett [Fri, 19 Jan 2024 21:01:26 +0000 (18:01 -0300)]
FreeBSD: Fix bootstrapping tools under Linux/musl

musl libc has deprecated LFS64 aliases, so bootstrapping FreeBSD tools
under musl distros has been failing with stat64 errors.

Apply the aliases under non-glibc Linux to fix this problem.

Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Val Packett <val@packett.cool>
Closes #15780

10 months agofix: variable type with zfs-tests/cmd/clonefile.c
Tino Reichardt [Wed, 17 Jan 2024 17:06:14 +0000 (18:06 +0100)]
fix: variable type with zfs-tests/cmd/clonefile.c

Compiling on arm64 freebsd-13.2 and arm64 almalinux-8 brings currently
this error:

```
  CC       tests/zfs-tests/cmd/clonefile.o
tests/zfs-tests/cmd/clonefile.c:166:43: error: result of comparison of \
constant -1 with expression of type 'char' is always true \
[-Werror,-Wtautological-constant-out-of-range-compare]
        while ((c = getopt(argc, argv, "crfdq")) != -1) {
               ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^  ~~
1 error generated.
gmake[2]: *** [Makefile:8675: tests/zfs-tests/cmd/clonefile.o] Error 1
```

Fix: use correct variable type `int`.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Closes #15783

10 months agolinux spl: fix typo in top comment of spl-condvar.c
Tino Reichardt [Wed, 17 Jan 2024 17:05:12 +0000 (18:05 +0100)]
linux spl: fix typo in top comment of spl-condvar.c

Credential Implementation -> Condition Variables Implementation

Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Closes #15782

10 months agoAutotrim High Load Average Fix
Kevin Jin [Wed, 17 Jan 2024 17:03:58 +0000 (12:03 -0500)]
Autotrim High Load Average Fix

Switch from cv_wait() to cv_wait_idle() in vdev_autotrim_wait_kick(),
which should mitigate the high load average while waiting.

Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: jxdking <lostking2008@hotmail.com>
Closes #15781

10 months agoFix cloning into mmaped and cached file.
Pawel Jakub Dawidek [Wed, 17 Jan 2024 16:51:07 +0000 (08:51 -0800)]
Fix cloning into mmaped and cached file.

If the destination file is mmaped and the mmaped region was already
read, so it is cached, we need to update mmaped pages after successful
clone using update_pages().

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Pointed out by: Ka Ho Ng <khng@freebsd.org>
Signed-off-by: Pawel Jakub Dawidek <pawel@dawidek.net>
Closes #15772

10 months agoLinux 6.7 compat: zfs_setattr fix atime update
Rob N [Tue, 16 Jan 2024 22:01:17 +0000 (09:01 +1100)]
Linux 6.7 compat: zfs_setattr fix atime update

In db4fc559c I messed up and changed this bit of code to set the inode
atime to an uninitialised value, when actually it was just supposed to
loading the atime from the inode to be stored in the SA. This changes it
to what it should have been.

Ensure times change by the right amount Previously, we only checked
if the times changed at all, which missed a bug where the atime was
being set to an undefined value.

Now ensure the times change by two seconds (or thereabouts), ensuring
we catch cases where we set the time to something bonkers

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Sponsored-by: https://despairlabs.com/sponsor/
Closes #15762
Closes #15773

10 months agoMake sure all necessary RPM path macros are defined
Lalufu [Tue, 16 Jan 2024 21:32:59 +0000 (22:32 +0100)]
Make sure all necessary RPM path macros are defined

When building (s)rpm files through the Makefile, a directory structure
is created in /tmp to hold the various files.

In case the user running the command has overridden some of the RPM path
settings through their user profile (for example in `~/.rpmmacros`),
these paths do not line up with the configuration, and the build fails.

Make sure all paths used are properly defined.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ralf Ertzinger <ralf@skytale.net>
Closes #15756

10 months agoMake spl_kmem_cache size check consistent
youzhongyang [Tue, 16 Jan 2024 21:30:58 +0000 (16:30 -0500)]
Make spl_kmem_cache size check consistent

On Linux x86_64, kmem cache can have size up to 4M,
however increasing spl_kmem_cache_slab_limit can lead
to crash due to the size check inconsistency.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Youzhong Yang <yyang@mathworks.com>
Closes #15757

10 months agoAdd path handling for aux vdevs in `label_path`
Ameer Hamza [Thu, 4 Jan 2024 14:35:04 +0000 (19:35 +0500)]
Add path handling for aux vdevs in `label_path`

If the AUX vdev is added using UUID, importing the pool falls back AUX
vdev to open it with disk name instead of UUID due to the absence of
path information for AUX vdevs. Since AUX label now have path
information, this PR adds path handling for it in `label_path`.

Reviewed-by: Umer Saleem <usaleem@ixsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes #15737

10 months agoExtend aux label to add path information
Ameer Hamza [Thu, 4 Jan 2024 14:32:53 +0000 (19:32 +0500)]
Extend aux label to add path information

Pool import logic uses vdev paths, so it makes sense to add path
information on AUX vdev as well.

Reviewed-by: Umer Saleem <usaleem@ixsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes #15737

10 months agofix: Uber block label not always found for aux vdevs
Ameer Hamza [Thu, 4 Jan 2024 14:02:50 +0000 (19:02 +0500)]
fix: Uber block label not always found for aux vdevs

When spare or l2cache (aux) vdev is added during pool creation,
spa->spa_uberblock is not dumped until that point. Subsequently,
the aux label is never synchronized after its initial creation,
resulting in the uberblock label remaining undumped. The uberblock
is crucial for lib_blkid in identifying the ZFS partition type. To
address this issue, we now ensure sync of the uberblock label once
if it's not dumped initially.

Reviewed-by: Umer Saleem <usaleem@ixsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes #15737

10 months agoMake zdb -R a little more sane.
Rich Ercolani [Tue, 16 Jan 2024 21:16:08 +0000 (16:16 -0500)]
Make zdb -R a little more sane.

zdb -R has a minor flaw in which it will not always print the full
output of a decompressed block. Oops.

While I was in there, I also reworked the logic so it won't try
ZLE unless everything else fails, which will hopefully avoid the
problem ZDB_NO_ZLE was intended to mitigate of reporting a lot of
false positives of ZLE compressed blocks...

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #15723

10 months agoZTS: Test for clone, mmap and write for block cloning
Umer Saleem [Tue, 16 Jan 2024 21:15:10 +0000 (02:15 +0500)]
ZTS: Test for clone, mmap and write for block cloning

For block cloning, if we mmap the cloned file and write from the
map into the file, it triggers a panic in dbuf_redirty() on Linux.

The same scenario causes data corruption on FreeBSD. Both these
issues are fixed under PR#15656 and PR#15665.

It would be good to add a test for this scenario in ZTS. The test
program and issue was produced by @robn.

Reviewed-by: Pawel Jakub Dawidek <pawel@dawidek.net>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ameer Hamza <ahamza@ixsystems.com>
Signed-off-by: Umer Saleem <usaleem@ixsystems.com>
Closes #15717

10 months agoFix "out of memory" error
Brian Behlendorf [Fri, 12 Jan 2024 20:35:29 +0000 (12:35 -0800)]
Fix "out of memory" error

Drop the no_memory() call from zpool_in_use() when reading the
label fails and instead return the error to the caller.  This
prevents a misleading "internal error: out of memory" error
when the label can't be read.  This will result in is_spare()
returning B_FALSE instead of aborting, which is already safely
handled.

Furthermore, on Linux it's possible for EREMOTEIO to returned
by an NVMe device if the device has been low-level formatted
and not rescanned.  In this case we want to fallback to the
legacy scanning method and read any of the labels we can.

Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #13538
Closes #15747

10 months agofix: preserve linux kmod signature in zfs-kmod rpm spec
Benjamin Sherman [Fri, 12 Jan 2024 20:33:41 +0000 (14:33 -0600)]
fix: preserve linux kmod signature in zfs-kmod rpm spec

This change provides rpm spec macros to sign the zfs and spl kmods as
the final step after the %install scriptlet. This is needed since the
find-debuginfo.sh script strips out debug symbols plus signatures.

Kernel module signing only occurs when the required files are present
as typically required in the Linux source tree:
- certs/signing_key.pem
- certs/signing_key.x509

The method for overriding the default __spec_install_post macro is
inspired by (and largely copied from) the Fedora kernel.spec.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Signed-off-by: Benjamin Sherman <benjamin@holyarmy.org>
Closes #15744

10 months agospa: Let spa_taskq_param_get()'s addition of a newline be optional
Mark Johnston [Fri, 29 Dec 2023 17:56:35 +0000 (12:56 -0500)]
spa: Let spa_taskq_param_get()'s addition of a newline be optional

For FreeBSD sysctls, we don't want the extra newline, since the
sysctl(8) utility will format strings appropriately.

Reviewed-by: Rob Norris <robn@despairlabs.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reported-by: Peter Holm <pho@FreeBSD.org>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Closes #15719

10 months agospa: Fix FreeBSD sysctl handlers
Mark Johnston [Fri, 29 Dec 2023 15:22:58 +0000 (10:22 -0500)]
spa: Fix FreeBSD sysctl handlers

sbuf_cpy() resets the sbuf state, which is wrong for sbufs allocated by
sbuf_new_for_sysctl().  In particular, this code triggers an assertion
failure in sbuf_clear().

Simplify by just using sysctl_handle_string() for both reading and
setting the tunable.

Fixes: 6930ecbb7 ("spa: make read/write queues configurable")
Reviewed-by: Rob Norris <robn@despairlabs.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reported-by: Peter Holm <pho@FreeBSD.org>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Closes #15719

10 months agoStop wasting time on malloc in snprintf_zstd_header
Rich Ercolani [Fri, 12 Jan 2024 20:17:26 +0000 (15:17 -0500)]
Stop wasting time on malloc in snprintf_zstd_header

Profiling zdb -vvvvv on datasets with a lot of zstd blocks, we find
ourselves spending quite a lot of time on malloc/free, because we
allocate a 16M abd each call, and never free it, so we're leaking
16M per call as well.

This seems sub-optimal. So let's just keep the buffer around and
reuse it.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #15721

10 months agofix(mount): do not truncate shares not zfs mount
Stefan Lendl [Fri, 12 Jan 2024 20:05:11 +0000 (21:05 +0100)]
fix(mount): do not truncate shares not zfs mount

When running zfs share -a resetting the exports.d/zfs.exports makes
sense the get a clean state.
Truncating was also called with zfs mount which would not populate the
file again.
Add test to verify shares persist after mount -a.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Stefan Lendl <s.lendl@proxmox.com>
Closes #15607
Closes #15660

10 months agoEnable block_cloning tests on FreeBSD
Brian Behlendorf [Fri, 12 Jan 2024 19:57:13 +0000 (11:57 -0800)]
Enable block_cloning tests on FreeBSD

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Pawel Jakub Dawidek <pawel@dawidek.net>
Closes #15749

10 months agoMake zdb -R scale less poorly
Rich Ercolani [Fri, 12 Jan 2024 19:55:17 +0000 (14:55 -0500)]
Make zdb -R scale less poorly

zdb -R with :d tries to use gzip decompression 9 times per size.
There's absolutely no reason for that, they're all the same
decompressor.

Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #15726

10 months agoFix a potential use-after-free in zfs_setsecattr()
Mark Johnston [Tue, 9 Jan 2024 23:57:09 +0000 (18:57 -0500)]
Fix a potential use-after-free in zfs_setsecattr()

In general, VOPs must not load the "z_log" field until having called
zfs_enter_verify_zp().

Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Closes #15752

10 months agoLinux: Defer loading the object set in zfs_setattr()
Mark Johnston [Tue, 9 Jan 2024 15:57:29 +0000 (10:57 -0500)]
Linux: Defer loading the object set in zfs_setattr()

We need to wait until after having done a zfs_enter() to load some
fields from the zfsvfs structure.  Otherwise a use-after-free is
possible in the face of a concurrent rollback.

Other functions in this file are careful to avoid this bug, I believe
this is the only instance.

Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Closes #15752

10 months agoAdd Gotify notification support to ZED
gofaster [Tue, 9 Jan 2024 17:49:30 +0000 (12:49 -0500)]
Add Gotify notification support to ZED

This commit adds the zed_notify_gotify() function and hooks it
into zed_notify(). This will allow ZED to send notifications
to a self-hosted Gotify service, which can be received
on a desktop or mobile device. It is configured with ZED_GOTIFY_URL,
ZED_GOTIFY_APPTOKEN and ZED_GOTIFY_PRIORITY variables in zed.rc.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: gofaster <felix.gofaster@gmail.com>
Closes #15693

10 months agoFix livelist assertions for dedup and cloning
Alexander Motin [Tue, 9 Jan 2024 17:48:40 +0000 (12:48 -0500)]
Fix livelist assertions for dedup and cloning

Two block pointers in livelist pointing to the same location may
be caused not only by dedup, but also by block cloning. We should
not assert D bit set in them.

Two block pointers in livelist pointing to the same location may
have different logical birth time in case of dedup or cloning. We
should assert identical physical birth time instead.

Assert identical physical block size between pointers in addition
to checksum, since that is what checksums are calculated on.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes #15732

10 months agoImprove block sizes checks during cloning
Alexander Motin [Tue, 9 Jan 2024 17:46:43 +0000 (12:46 -0500)]
Improve block sizes checks during cloning

- Fail if source block is smaller than destination.  We can only
grow blocks, not shrink them.
 - Fail if we do not have full znode range lock.  In that case grow
is not even called.  We should improve zfs_rangelock_cb() somehow
to know when cloning needs to grow the block size unlike write.
 - Fail of we tried to resize, but failed.  There are many reasons
for it to fail that we can not predict at this level, so be ready
for them.  Unlike write, that may proceed after growth failure,
block cloning can't and must return error.

This fixes assertion inside dmu_brt_clone() when it sees different
number of blocks held in destination than it got block pointers.
Builds without ZFS_DEBUG returned EXDEV, so are not affected much.

Reviewed-by: Pawel Jakub Dawidek <pawel@dawidek.net>
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes #15724
Closes #15735

10 months agomake zdb_decompress_block check decompression reliably
Kent Ross [Tue, 9 Jan 2024 17:13:52 +0000 (09:13 -0800)]
make zdb_decompress_block check decompression reliably

This function decompresses to two buffers and then compares them to
check whether the (opaque) decompression process filled the whole
buffer. Previously it began with lbuf uninitialized and lbuf2 filled
with pseudorandom data. This neither guarantees that any bytes not
written by the compressor would be different, nor seems incredibly
sound otherwise!

After these changes, instead of filling one buffer with generated
pseudorandom data we overwrite each buffer with completely different
data. This should remove the possibility of low-probability failures,
as well as make the process simpler and cheaper.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rich Ercolani <rincebrain@gmail.com>
Signed-off-by: Kent Ross <k@mad.cash>
Closes #15733

10 months agozpoolprops.7: Remove unnecessary .Ns
Jose Luis Duran [Tue, 9 Jan 2024 01:03:15 +0000 (22:03 -0300)]
zpoolprops.7: Remove unnecessary .Ns

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Jose Luis Duran <jlduran@gmail.com>
Closes #15727

10 months agoZIL: Update Linux tracing after #15635
Alexander Motin [Tue, 9 Jan 2024 00:49:39 +0000 (19:49 -0500)]
ZIL: Update Linux tracing after #15635

While picking parts from #14909 I've missed Linux tracing specific
ones, that went unnoticed in default configurations, but breaks the
build in some.

Reviewed-by: Ameer Hamza <ahamza@ixsystems.com>
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes #15730

10 months agoLinux 6.2 compat: add check for kernel_neon_* availability
Shengqi Chen [Tue, 9 Jan 2024 00:05:24 +0000 (08:05 +0800)]
Linux 6.2 compat: add check for kernel_neon_* availability

This patch adds check for `kernel_neon_*` symbols on arm and arm64
platforms to address the following issues:

1. Linux 6.2+ on arm64 has exported them with `EXPORT_SYMBOL_GPL`, so
   license compatibility must be checked before use.
2. On both arm and arm64, the definitions of these symbols are guarded
   by `CONFIG_KERNEL_MODE_NEON`, but their declarations are still
   present. Checking in configuration phase only leads to MODPOST
   errors (undefined references).

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
Closes #15711
Closes #14555
Closes: #15401
10 months agoFix the FreeBSD userspace build (#15716)
Mark Johnston [Wed, 27 Dec 2023 20:17:53 +0000 (15:17 -0500)]
Fix the FreeBSD userspace build (#15716)

- Mark some parameters to zpool_power*() as unused.
- Add a stub zpool_disk_wait().

Fixes: a9520e6e5 ("zpool: Add slot power control, print power status")
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
10 months agoBlock cloning tests.
Pawel Jakub Dawidek [Tue, 26 Dec 2023 20:01:53 +0000 (12:01 -0800)]
Block cloning tests.

The test mostly focus on testing various corner cases.
The tests take a long time to run, so for the common.run runfile
we randomly select a hundred tests.
To run all the bclone tests, bclone.run runfile should be used.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Pawel Jakub Dawidek <pawel@dawidek.net>
Closes #15631

11 months agoLinux 6.5 compat: check BLK_OPEN_EXCL is defined
Brian Behlendorf [Thu, 21 Dec 2023 19:22:56 +0000 (11:22 -0800)]
Linux 6.5 compat: check BLK_OPEN_EXCL is defined

On some systems we already have blkdev_get_by_path() with 4 args
but still the old FMODE_EXCL and not BLK_OPEN_EXCL defined.
The vdev_bdev_mode() function was added to handle this case
but there was no generic way to specify exclusive access.

Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #15692

11 months agoDon't panic on unencrypted block in encrypted dataset
chrisperedun [Thu, 21 Dec 2023 19:12:30 +0000 (14:12 -0500)]
Don't panic on unencrypted block in encrypted dataset

While 763ca47 closes the situation of block cloning creating
unencrypted records in encrypted datasets, existing data still causes
panic on read. Setting zfs_recover bypasses this but at the cost of
potentially ignoring more serious issues.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chris Peredun <chris.peredun@ixsystems.com>
Closes #15677

11 months agoZIL: Improve next log block size prediction
Alexander Motin [Thu, 21 Dec 2023 18:54:44 +0000 (13:54 -0500)]
ZIL: Improve next log block size prediction

Track history in context of bursts, not individual log blocks. It
allows to not blow away all the history by single large burst of
many block, and same time allows optimizations covering multiple
blocks in a burst and even predicted following burst.  For each
burst account its optimal block size and minimal first block size.
Use that statistics from the last 8 bursts to predict first block
size of the next burst.

Remove predefined set of block sizes. Allocate any size we see fit,
multiple of 4KB, as required by ZIL now.  With compression enabled
by default, ZFS already writes pretty random block sizes, so this
should not surprise space allocator any more.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes #15635

11 months agozpool: Add slot power control, print power status
Tony Hutter [Thu, 21 Dec 2023 18:53:16 +0000 (10:53 -0800)]
zpool: Add slot power control, print power status

Add `zpool` flags to control the slot power to drives.  This assumes
your SAS or NVMe enclosure supports slot power control via sysfs.

The new `--power` flag is added to `zpool offline|online|clear`:

    zpool offline --power <pool> <device>    Turn off device slot power
    zpool online --power <pool> <device>     Turn on device slot power
    zpool clear --power <pool> [device]      Turn on device slot power

If the ZPOOL_AUTO_POWER_ON_SLOT env var is set, then the '--power'
option is automatically implied for `zpool online` and `zpool clear`
and does not need to be passed.

zpool status also gets a --power option to print the slot power status.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Mart Frauenlob <AllKind@fastest.cc>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #15662

11 months agospa: make read/write queues configurable
Rob N [Wed, 20 Dec 2023 22:17:14 +0000 (09:17 +1100)]
spa: make read/write queues configurable

We are finding that as customers get larger and faster machines
(hundreds of cores, large NVMe-backed pools) they keep hitting
relatively low performance ceilings. Our profiling work almost always
finds that they're running into bottlenecks on the SPA IO taskqs.
Unfortunately there's often little we can advise at that point, because
there's very few ways to change behaviour without patching.

This commit adds two load-time parameters `zio_taskq_read` and
`zio_taskq_write` that can configure the READ and WRITE IO taskqs
directly.

This achieves two goals: it gives operators (and those that support
them) a way to tune things without requiring a custom build of OpenZFS,
which is often not possible, and it lets us easily try different config
variations in a variety of environments to inform the development of
better defaults for these kind of systems.

Because tuning the IO taskqs really requires a fairly deep understanding
of how IO in ZFS works, and generally isn't needed without a pretty
serious workload and an ability to identify bottlenecks, only minimal
documentation is provided. Its expected that anyone using this is going
to have the source code there as well.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Closes #15675

11 months agoLinux 6.7 compat: rework shrinker setup for heap allocations
Rob Norris [Sat, 16 Dec 2023 13:36:21 +0000 (00:36 +1100)]
Linux 6.7 compat: rework shrinker setup for heap allocations

6.7 changes the shrinker API such that shrinkers must be allocated
dynamically by the kernel. To accomodate this, this commit reworks
spl_register_shrinker() to do something similar against earlier kernels.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Sponsored-by: https://github.com/sponsors/robn
Closes #15681

11 months agoLinux 6.7 compat: handle superblock shrinker member change
Rob Norris [Sat, 16 Dec 2023 06:39:07 +0000 (17:39 +1100)]
Linux 6.7 compat: handle superblock shrinker member change

In 6.7 the superblock shrinker member s_shrink has changed from being an
embedded struct to a pointer. Detect this, and don't take a reference if
it already is one.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Sponsored-by: https://github.com/sponsors/robn
Closes #15681

11 months agoLinux 6.7 compat: use inode atime/mtime accessors
Rob Norris [Sat, 16 Dec 2023 11:31:32 +0000 (22:31 +1100)]
Linux 6.7 compat: use inode atime/mtime accessors

6.6 made i_ctime inaccessible; 6.7 has done the same for i_atime and
i_mtime. This extends the method used for ctime in b37f29341 to atime
and mtime as well.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Sponsored-by: https://github.com/sponsors/robn
Closes #15681

11 months agoLinux 6.7 compat: simplify current_time() check
Rob Norris [Sat, 16 Dec 2023 07:01:45 +0000 (18:01 +1100)]
Linux 6.7 compat: simplify current_time() check

6.7 changed the names of the time members in struct inode, so we can't
assign back to it because we don't know its name. In practice this
doesn't matter though - if we're missing current_time(), then we must be
on <4.9, and we know our fallback will need to return timespec.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Sponsored-by: https://github.com/sponsors/robn
Closes #15681

11 months agoTest LWB buffer overflow for block cloning
Umer Saleem [Fri, 15 Dec 2023 22:18:27 +0000 (03:18 +0500)]
Test LWB buffer overflow for block cloning

PR#15634 removes 128K into 2x68K LWB split optimization, since it
was found to cause LWB buffer overflow while trying to write 128KB
TX_CLONE_RANGE record with 1022 block pointers into 68KB buffer,
with multiple VDEVs ZIL.

This commit adds a test for this particular scenario by writing
maximum sizes TX_CLONE_RANE record with 1022 block pointers into
68KB buffer, with two SLOG devices.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ameer Hamza <ahamza@ixsystems.com>
Signed-off-by: Umer Saleem <usaleem@ixsystems.com>
Closes #15672

11 months agodmu: Allow buffer fills to fail
Alexander Motin [Fri, 15 Dec 2023 17:51:41 +0000 (12:51 -0500)]
dmu: Allow buffer fills to fail

When ZFS overwrites a whole block, it does not bother to read the
old content from disk. It is a good optimization, but if the buffer
fill fails due to page fault or something else, the buffer ends up
corrupted, neither keeping old content, nor getting the new one.

On FreeBSD this is additionally complicated by page faults being
blocked by VFS layer, always returning EFAULT on attempt to write
from mmap()'ed but not yet cached address range.  Normally it is
not a big problem, since after original failure VFS will retry the
write after reading the required data.  The problem becomes worse
in specific case when somebody tries to write into a file its own
mmap()'ed content from the same location.  In that situation the
only copy of the data is getting corrupted on the page fault and
the following retries only fixate the status quo.  Block cloning
makes this issue easier to reproduce, since it does not read the
old data, unlike traditional file copy, that may work by chance.

This patch provides the fill status to dmu_buf_fill_done(), that
in case of error can destroy the corrupted buffer as if no write
happened.  One more complication in case of block cloning is that
if error is possible during fill, dmu_buf_will_fill() must read
the data via fall-back to dmu_buf_will_dirty().  It is required
to allow in case of error restoring the buffer to a state after
the cloning, not not before it, that would happen if we just call
dbuf_undirty().

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes #15665

11 months agodbuf: Set dr_data when unoverriding after clone
Alexander Motin [Tue, 12 Dec 2023 20:59:24 +0000 (15:59 -0500)]
dbuf: Set dr_data when unoverriding after clone

Block cloning normally creates dirty record without dr_data.  But if
the block is read after cloning, it is moved into DB_CACHED state and
receives the data buffer.  If after that we call dbuf_unoverride()
to convert the dirty record into normal write, we should give it the
data buffer from dbuf and release one.

Reviewed-by: Kay Pedersen <mail@mkwg.de>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes #15654
Closes #15656

11 months agodbuf: Handle arcbuf assignment after block cloning
Alexander Motin [Tue, 12 Dec 2023 20:53:59 +0000 (15:53 -0500)]
dbuf: Handle arcbuf assignment after block cloning

In some cases dbuf_assign_arcbuf() may be called on a block that
was recently cloned.  If it happened in current TXG we must undo
the block cloning first, since the only one dirty record per TXG
can't and shouldn't mean both cloning and overwrite same time.

Reviewed-by: Kay Pedersen <mail@mkwg.de>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes #15653

11 months agoZTS: Update raidz_expand_005_pos.ksh
Brian Behlendorf [Tue, 12 Dec 2023 17:56:19 +0000 (09:56 -0800)]
ZTS: Update raidz_expand_005_pos.ksh

Align the raidz_expand_005_pos test with the raidz_expand_004_pos test
and only verify no errors were reported.  Allow scrub repair IO.

Reviewed-by: Don Brady <don.brady@klarasystems.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #15663

11 months agoFor db_marker inherit the db pointer for AVL comparision.
Chunwei Chen [Mon, 11 Dec 2023 22:42:06 +0000 (14:42 -0800)]
For db_marker inherit the db pointer for AVL comparision.

While evicting dbufs of a dnode, a marker node is added to the AVL.
The marker node should be inserted in AVL tree ahead of the dbuf its
trying to delete. The blkid and level is used to ensure this. However,
this could go wrong there's another dbufs with the same blkid and level
in DB_EVICTING state but not yet removed from AVL tree. dbuf_compare()
could fail to give the right location or could cause confusion and
trigger ASSERTs.

To ensure that the marker is inserted before the deleting dbuf, use
the pointer value of the original dbuf for comparision.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Co-authored-by: Sanjeev Bagewadi <sanjeev.bagewadi@nutanix.com>
Signed-off-by: Chunwei Chen <david.chen@nutanix.com>
Closes #12482
Closes #15643

11 months agoZTS: Add dirty dnode stress test
Tony Hutter [Mon, 11 Dec 2023 17:59:59 +0000 (09:59 -0800)]
ZTS: Add dirty dnode stress test

Add a test for the dirty dnode SEEK_HOLE/SEEK_DATA bug described in
https://github.com/openzfs/zfs/issues/15526

The bug was fixed in https://github.com/openzfs/zfs/pull/15571 and
was backported to 2.2.2 and 2.1.14.  This test case is just to
make sure it does not come back.

seekflood.c originally written by Rob Norris.

Reviewed-by: Graham Perrin <grahamperrin@freebsd.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #15608

11 months agoZTS: Add zpool_import_status.ksh to Makefile.am
Brian Behlendorf [Mon, 11 Dec 2023 17:40:32 +0000 (09:40 -0800)]
ZTS: Add zpool_import_status.ksh to Makefile.am

The zpool_import_status.ksh test case was not being run because
it was not included in the Makefile.am.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #15655

11 months agoZTS: Disable io_uring test on CentOS 9
Brian Behlendorf [Sat, 9 Dec 2023 01:31:31 +0000 (17:31 -0800)]
ZTS: Disable io_uring test on CentOS 9

The io_uring test fails on CentOS 9 with the following fio error.
Disable the test for the benefit of the CI until this can be fully
investigated.  This basic test passes as expected on newer kernels.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #15636

11 months agoDMU: Fix lock leak on dbuf_hold() error
Alexander Motin [Sat, 9 Dec 2023 00:43:39 +0000 (19:43 -0500)]
DMU: Fix lock leak on dbuf_hold() error

dmu_assign_arcbuf_by_dnode() should drop dn_struct_rwlock lock in
case dbuf_hold() failed.  I don't have reproduction for this, but
it looks inconsistent with dmu_buf_hold_noread_by_dnode() and co.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes #15644

11 months agoZTS: Add test cases for block cloning replay
Ameer Hamza [Thu, 30 Nov 2023 20:14:56 +0000 (01:14 +0500)]
ZTS: Add test cases for block cloning replay

Reviewed-by: Kay Pedersen <mail@mkwg.de>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes #15614

11 months agoZTS: block_cloning: Use numeric sort for get_same_blocks
Ameer Hamza [Wed, 6 Dec 2023 20:18:43 +0000 (01:18 +0500)]
ZTS: block_cloning: Use numeric sort for get_same_blocks

Reviewed-by: Kay Pedersen <mail@mkwg.de>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes #15614

11 months agozed: fix typo in variable ZED_POWER_OFF_ENCLO*US*RE_SLOT_ON_FAULT
Mauricio Faria de Oliveira [Sat, 9 Dec 2023 00:32:35 +0000 (21:32 -0300)]
zed: fix typo in variable ZED_POWER_OFF_ENCLO*US*RE_SLOT_ON_FAULT

Replace ENCLO_US_RE with ENCLO_SU_RE in the name of the variable.

Note this changes the user-visible string in zed.rc, thus might
break current users with the wrong string, but it's ~2 months
since zfs-2.2.0 tag is out, thus should not be widespread yet.

Mechanical change:

    $ grep -rl ZED_POWER_OFF_ENCLOUSRE_SLOT_ON_FAULT
    cmd/zed/zed.d/zed.rc
    cmd/zed/zed.d/statechange-slot_off.sh

    $ sed -i 's/ZED_POWER_OFF_ENCLOUSRE_SLOT_ON_FAULT/<linebreak>
                ZED_POWER_OFF_ENCLOSURE_SLOT_ON_FAULT/g' \
      cmd/zed/zed.d/zed.rc \
      cmd/zed/zed.d/statechange-slot_off.sh

    $ grep -rl ZED_POWER_OFF_ENCLOUSRE_SLOT_ON_FAULT
    $

Fixes 11fbcacf37d1a66c7a40bb8920c70ce9a87270ea
("zed: Add zedlet to power off slot when drive is faulted")

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>
Closes #15651

11 months agoimport: ignore return on hostid lookups
Rob N [Thu, 7 Dec 2023 16:41:54 +0000 (03:41 +1100)]
import: ignore return on hostid lookups

Just silencing a warning. Its totally fine for a hostid to not be there.

Reported-by: Coverity (CID-1573336)
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ameer Hamza <ahamza@ixsystems.com>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Closes #15650

11 months agosetproctitle: fix ununitialised variable
Rob N [Thu, 7 Dec 2023 16:23:16 +0000 (03:23 +1100)]
setproctitle: fix ununitialised variable

Reported-by: Coverity (CID-1573333)
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ameer Hamza <ahamza@ixsystems.com>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Closes #15649

11 months agozfs_refcount_remove: explictly ignore returns
Rob N [Thu, 7 Dec 2023 16:21:38 +0000 (03:21 +1100)]
zfs_refcount_remove: explictly ignore returns

Coverity noticed that sometimes we ignore the return, and sometimes we
don't. Its not wrong, and I like consistent style, so here we are.

Reported-by: Coverity (CID-1564584)
Reported-by: Coverity (CID-1564585)
Reported-by: Coverity (CID-1564586)
Reported-by: Coverity (CID-1564587)
Reported-by: Coverity (CID-1564588)
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Closes #15647

11 months agoFreeBSD: Ensure that zfs_getattr() initializes the va_rdev field
Mark Johnston [Thu, 7 Dec 2023 16:20:11 +0000 (11:20 -0500)]
FreeBSD: Ensure that zfs_getattr() initializes the va_rdev field

Otherwise the field is left uninitialized, leading to a possible kernel
memory disclosure to userspace or to the network.  Use the same
initialization value we use in zfsctl_common_getattr().

Reported-by: KMSAN
Sponsored-by: The FreeBSD Foundation
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ed Maste <emaste@FreeBSD.org>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Closes #15639

11 months agoBRT: Limit brt_vdev_dump() to only one vdev
Alexander Motin [Wed, 6 Dec 2023 23:37:27 +0000 (18:37 -0500)]
BRT: Limit brt_vdev_dump() to only one vdev

Without this patch on pool of 60 vdevs with ZFS_DEBUG enabled clone
takes much more time than copy, while heavily trashing dbgmsg for
no good reason, repeatedly dumping all vdevs BRTs again and again,
even unmodified ones.

I am generally not sure this dumping is not excessive, but decided
to keep it for now, just restricting its scope to more reasonable.

Reviewed-by: Kay Pedersen <mail@mkwg.de>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes #15625

11 months agoZIL: Remove 128K into 2x68K LWB split optimization
Alexander Motin [Wed, 6 Dec 2023 23:02:05 +0000 (18:02 -0500)]
ZIL: Remove 128K into 2x68K LWB split optimization

To improve 128KB block write performance in case of multiple VDEVs
ZIL used to spit those writes into two 64KB ones.  Unfortunately it
was found to cause LWB buffer overflow, trying to write maximum-
sizes 128KB TX_CLONE_RANGE record with 1022 block pointers into
68KB buffer, since unlike TX_WRITE ZIL code can't split it.

This is a minimally-invasive temporary block cloning fix until the
following more invasive prediction code refactoring.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ameer Hamza <ahamza@ixsystems.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes #15634

11 months agozdb: Dump encrypted write and clone ZIL records
Alexander Motin [Wed, 6 Dec 2023 20:39:12 +0000 (15:39 -0500)]
zdb: Dump encrypted write and clone ZIL records

Block pointers are not encrypted in TX_WRITE and TX_CLONE_RANGE
records, so we can dump them, that may be useful for debugging.

Related to #15543.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes #15629

11 months agocompact: workaround for GPL-only symbols on riscv from Linux 6.2
Shengqi Chen [Wed, 6 Dec 2023 20:37:50 +0000 (04:37 +0800)]
compact: workaround for GPL-only symbols on riscv from Linux 6.2

Since Linux 6.2, the implementation of flush_dcache_page on riscv
references GPL-only symbol `PageHuge`, breaking the build of zfs.

This patch uses existing mechanism to override flush_dcache_page,
removing the call to `PageHuge`. According to comments in kernel,
it is only used to do some check against HugeTLB pages, which only
exist in userspace. ZFS uses flush_dcache_page only on kernel pages,
thus this patch will not introduce any behaviour change.

See also: torvalds/linux@d33deda, openzfs/zfs@589f59b

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
Closes #14974
Closes #15627

11 months agoExtend import_progress kstat with a notes field
Don Brady [Tue, 5 Dec 2023 22:27:56 +0000 (15:27 -0700)]
Extend import_progress kstat with a notes field

Detail the import progress of log spacemaps as they can take a very
long time.  Also grab the spa_note() messages to, as they provide
insight into what is happening

Sponsored-By: OpenDrives Inc.
Sponsored-By: Klara Inc.
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Don Brady <don.brady@klarasystems.com>
Co-authored-by: Allan Jude <allan@klarasystems.com>
Closes #15539

11 months agomodule/icp/asm-arm/sha2: enable non-SIMD asm kernels on armv5/6
Shengqi Chen [Tue, 5 Dec 2023 20:01:09 +0000 (04:01 +0800)]
module/icp/asm-arm/sha2: enable non-SIMD asm kernels on armv5/6

My merged pull request #15557 fixes compilation of sha2 kernels on arm
v5/6. However, the compiler guards only allows sha256/512_armv7_impl to
be used when __ARM_ARCH > 6. This patch enables these ASM kernels on all
arm architectures. Some compiler guards are adjusted accordingly to
avoid the unnecessary compilation of SIMD (e.g., neon, armv8ce) kernels
on old architectures.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
Closes #15623

11 months agozpool: flush output before sleeping
Rob N [Tue, 5 Dec 2023 19:53:14 +0000 (06:53 +1100)]
zpool: flush output before sleeping

Several zpool commands (status, list, iostat) have modes that present
some information, sleep a while, present the current state, sleep, etc.
Some of those had ways to invoke them that when piped would appear to do
nothing for a while, because non-terminals are block-buffered, not
line-buffered, by default.  Fix this by forcing a flush before sleeping.

In particular, all of these buffered:
- zpool status <pool> <interval>
- zpool iostat -y<m> <pool> <interval>
- zpool list <pool> <interval>

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Closes #15593

11 months agoAllow block cloning across encrypted datasets
oromenahar [Tue, 5 Dec 2023 19:03:48 +0000 (20:03 +0100)]
Allow block cloning across encrypted datasets

When two datasets share the same master encryption key, it is safe
to clone encrypted blocks. Currently only snapshots and clones
of a dataset share with it the same encryption key.

Added a test for:
- Clone from encrypted sibling to encrypted sibling with
  non encrypted parent
- Clone from encrypted parent to inherited encrypted child
- Clone from child to sibling with encrypted parent
- Clone from snapshot to the original datasets
- Clone from foreign snapshot to a foreign dataset
- Cloning from non-encrypted to encrypted datasets
- Cloning from encrypted to non-encrypted datasets

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Original-patch-by: Pawel Jakub Dawidek <pawel@dawidek.net>
Signed-off-by: Kay Pedersen <mail@mkwg.de>
Closes #15544

11 months agoZIL: Do not clone blocks from the future
Alexander Motin [Tue, 5 Dec 2023 18:58:11 +0000 (13:58 -0500)]
ZIL: Do not clone blocks from the future

ZIL claim can not handle block pointers cloned from the future,
since they are not yet allocated at that point.  It may happen
either if the block was just written when it was cloned, or if
the pool was frozen or somehow else rewound on import.

Handle it from two sides: prevent cloning of blocks with physical
birth time from not yet synced or frozen TXG, and abort ZIL claim
if we still detect such blocks due to rewind or something else.

While there, assert that any cloned blocks we claim are really
allocated by calling metaslab_check_free().

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes #15617

11 months agoAdd Ntfy notification support to ZED
Dex Wood [Fri, 1 Dec 2023 23:25:17 +0000 (17:25 -0600)]
Add Ntfy notification support to ZED

This commit adds the zed_notify_ntfy() function and hooks it
into zed_notify(). This will allow ZED to send notifications
to ntfy.sh or a self-hosted Ntfy service, which can be received
on a desktop or mobile device. It is configured with ZED_NTFY_TOPIC,
ZED_NTFY_URL, and ZED_NTFY_ACCESS_TOKEN variables in zed.rc.

Reviewed-by: @classabbyamp
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Dex Wood <slash2314@gmail.com>
Closes #15584

11 months agoZIL: Remove TX_CLONE_RANGE replay for ZVOLs.
Alexander Motin [Fri, 1 Dec 2023 23:23:20 +0000 (18:23 -0500)]
ZIL: Remove TX_CLONE_RANGE replay for ZVOLs.

zil_claim_clone_range() takes references on cloned blocks before ZIL
replay.  Later zil_free_clone_range() drops them after replay or on
dataset destroy.  The total balance is neutral.  It means we do not
need to do anything (drop the references) for not implemented yet
TX_CLONE_RANGE replay for ZVOLs.

This is a logical follow up to #15603.

Reviewed-by: Kay Pedersen <mail@mkwg.de>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes #15612

11 months agoZIO: Add overflow checks for linear buffers
Alexander Motin [Fri, 1 Dec 2023 19:50:10 +0000 (14:50 -0500)]
ZIO: Add overflow checks for linear buffers

Since we use a limited set of kmem caches, quite often we have unused
memory after the end of the buffer.  Put there up to a 512-byte canary
when built with debug to detect buffer overflows at the free time.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes #15553

11 months agoOnly provide execvpe(3) when needed
Brooks Davis [Fri, 1 Dec 2023 00:04:18 +0000 (16:04 -0800)]
Only provide execvpe(3) when needed

Check for the existence of execvpe(3) and only provide the FreeBSD
compat version if required.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Brooks Davis <brooks.davis@sri.com>
Closes #15609

11 months agoUse uint64_t instead of u_int64_t
Yuri Pankov [Thu, 30 Nov 2023 18:36:33 +0000 (01:36 +0700)]
Use uint64_t instead of u_int64_t

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Yuri Pankov <ypankov@tintri.com>
Closes #15610

11 months agoZIL: Call brt_pending_add() replaying TX_CLONE_RANGE
Alexander Motin [Wed, 29 Nov 2023 18:51:34 +0000 (13:51 -0500)]
ZIL: Call brt_pending_add() replaying TX_CLONE_RANGE

zil_claim_clone_range() takes references on cloned blocks before ZIL
replay.  Later zil_free_clone_range() drops them after replay or on
dataset destroy.  The total balance is neutral.  It means on actual
replay we must take additional references, which would stay in BRT.

Without this blocks could be freed prematurely when either original
file or its clone are destroyed.  I've observed BRT being emptied
and the feature being deactivated after ZIL replay completion, which
should not have happened.  With the patch I see expected stats.

Reviewed-by: Kay Pedersen <mail@mkwg.de>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes #15603

11 months agoFix zoneid when USER_NS is disabled
Wraithh [Wed, 29 Nov 2023 17:55:17 +0000 (19:55 +0200)]
Fix zoneid when USER_NS is disabled

getzoneid() should return GLOBAL_ZONEID instead of 0 when USER_NS is disabled.

Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ilkka Sovanto <github@ilkka.kapsi.fi>
Closes #15560

11 months agoZTS: get_persistent_disk_name can return truncated names
VaibhavB [Wed, 29 Nov 2023 17:34:29 +0000 (23:04 +0530)]
ZTS: get_persistent_disk_name can return truncated names

Instead of using only the 3rd element return the entire string after
the split to handle device names with dashes.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Vaibhav Bhanawat <vaibhav.bhanawat@delphix.com>
Closes #15567

11 months agozdb: fix printf() length for uint64_t devid
Martin Matuška [Wed, 29 Nov 2023 17:18:30 +0000 (18:18 +0100)]
zdb: fix printf() length for uint64_t devid

Bug introduced in 213d6829673.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Warner Losh <imp@FreeBSD.org>
Signed-off-by: Martin Matuska <mm@FreeBSD.org>
Closes #15606

11 months agoZIL: Assert record sizes in different places
Alexander Motin [Tue, 28 Nov 2023 21:35:14 +0000 (16:35 -0500)]
ZIL: Assert record sizes in different places

This should make sure we have log written without overflows.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes #15517

11 months agomodule/icp/asm-arm/sha2: fix compiling on armv5/6
Shengqi Chen [Wed, 22 Nov 2023 14:27:24 +0000 (22:27 +0800)]
module/icp/asm-arm/sha2: fix compiling on armv5/6

The `adr` insn in neon kernel generates an compiling
error on armv5/6 target. Fix that by using `ldr`.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
Closes #15557

11 months agomodule/icp/asm-arm/sha2: auto detect __ARM_ARCH
Shengqi Chen [Wed, 22 Nov 2023 13:58:47 +0000 (21:58 +0800)]
module/icp/asm-arm/sha2: auto detect __ARM_ARCH

This patch uses __ARM_ARCH set by compiler (both
GCC and Clang have this) whenever possible instead
of hardcoding it to 7. This change allows code to
compile on earlier ARM architectures such as armv5te.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
Closes #15557

11 months agoLinux 6.6 compat: fix configure error with clang (#15558)
Jaron Kent-Dobias [Tue, 28 Nov 2023 19:34:40 +0000 (20:34 +0100)]
Linux 6.6 compat: fix configure error with clang (#15558)

With Linux v6.6.x and clang 16, a configure step fails on a warning that
later results in an error while building, due to 'ts' being
uninitialized. Add a trivial initialization to silence the warning.

Signed-off-by: Jaron Kent-Dobias <jaron@kent-dobias.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
11 months agodmu_buf_will_clone: fix race in transition back to NOFILL
Rob N [Tue, 28 Nov 2023 17:53:04 +0000 (04:53 +1100)]
dmu_buf_will_clone: fix race in transition back to NOFILL

Previously, dmu_buf_will_clone() would roll back any dirty record, but
would not clean out the modified data nor reset the state before
releasing the lock. That leaves the last-written data in db_data, but
the dbuf in the wrong state.

This is eventually corrected when the dbuf state is made NOFILL, and
dbuf_noread() called (which clears out the old data), but at this point
its too late, because the lock was already dropped with that invalid
state.

Any caller acquiring the lock before the call into
dmu_buf_will_not_fill() can find what appears to be a clean, readable
buffer, and would take the wrong state from it: it should be getting the
data from the cloned block, not from earlier (unwritten) dirty data.

Even after the state was switched to NOFILL, the old data was still not
cleaned out until dbuf_noread(), which is another gap for a caller to
take the lock and read the wrong data.

This commit fixes all this by properly cleaning up the previous state
and then setting the new state before dropping the lock. The
DBUF_VERIFY() calls confirm that the dbuf is in a valid state when the
lock is down.

Sponsored-by: Klara, Inc.
Sponsored-By: OpenDrives Inc.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Pawel Jakub Dawidek <pawel@dawidek.net>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Closes #15566
Closes #15526

11 months agounnecessary alloc/free in dsl_scan_visitbp()
Matthew Ahrens [Tue, 28 Nov 2023 17:20:48 +0000 (09:20 -0800)]
unnecessary alloc/free in dsl_scan_visitbp()

Clean up code in dsl_scan_visitbp() by removing an unnecessary
alloc/free and `goto`.  This has the side benefit of reducing CPU usage,
which is only really noticeable if we are not doing i/o for the leaf
blocks, like when `zfs_no_scrub_io` is set.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes #15549

11 months agodnode_is_dirty: check dnode and its data for dirtiness
Rob N [Tue, 28 Nov 2023 17:07:57 +0000 (04:07 +1100)]
dnode_is_dirty: check dnode and its data for dirtiness

Over its history this the dirty dnode test has been changed between
checking for a dnodes being on `os_dirty_dnodes` (`dn_dirty_link`) and
`dn_dirty_record`.

  de198f2d9 Fix lseek(SEEK_DATA/SEEK_HOLE) mmap consistency
  2531ce372 Revert "Report holes when there are only metadata changes"
  ec4f9b8f3 Report holes when there are only metadata changes
  454365bba Fix dirty check in dmu_offset_next()
  66aca2473 SEEK_HOLE should not block on txg_wait_synced()

Also illumos/illumos-gate@c543ec060d illumos/illumos-gate@2bcf0248e9

It turns out both are actually required.

In the case of appending data to a newly created file, the dnode proper
is dirtied (at least to change the blocksize) and dirty records are
added.  Thus, a single logical operation is represented by separate
dirty indicators, and must not be separated.

The incorrect dirty check becomes a problem when the first block of a
file is being appended to while another process is calling lseek to skip
holes. There is a small window where the dnode part is undirtied while
there are still dirty records. In this case, `lseek(fd, 0, SEEK_DATA)`
would not know that the file is dirty, and would go to
`dnode_next_offset()`. Since the object has no data blocks yet, it
returns `ESRCH`, indicating no data found, which results in `ENXIO`
being returned to `lseek()`'s caller.

Since coreutils 9.2, `cp` performs sparse copies by default, that is, it
uses `SEEK_DATA` and `SEEK_HOLE` against the source file and attempts to
replicate the holes in the target. When it hits the bug, its initial
search for data fails, and it goes on to call `fallocate()` to create a
hole over the entire destination file.

This has come up more recently as users upgrade their systems, getting
OpenZFS 2.2 as well as a newer coreutils. However, this problem has been
reproduced against 2.1, as well as on FreeBSD 13 and 14.

This change simply updates the dirty check to check both types of dirty.
If there's anything dirty at all, we immediately go to the "wait for
sync" stage, It doesn't really matter after that; both changes are on
disk, so the dirty fields should be correct.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Rich Ercolani <rincebrain@gmail.com>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Closes #15571
Closes #15526

11 months agoFreeBSD: Fix ZFS so that snapshots under .zfs/snapshot are NFS visible
rmacklem [Tue, 28 Nov 2023 00:31:03 +0000 (16:31 -0800)]
FreeBSD: Fix ZFS so that snapshots under .zfs/snapshot are NFS visible

Call vfs_exjail_clone() for mounts created under .zfs/snapshot
to fill in the mnt_exjail field for the mount.  If this is not
done, the snapshots under .zfs/snapshot with not be accessible
over NFS.

This version has the argument name in vfs.h fixed to match that
of the name in spl_vfs.c, although it really does not matter.

External-issue: https://reviews.freebsd.org/D42672
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Rick Macklem <rmacklem@uoguelph.ca>
Closes #15563

11 months agozdb: Fix zdb '-O|-r' options with -e/exported zpool
Akash B [Mon, 27 Nov 2023 21:41:58 +0000 (03:11 +0530)]
zdb: Fix zdb '-O|-r' options with -e/exported zpool

zdb with '-e' or exported zpool doesn't work along with
'-O' and '-r' options as we process them before '-e' has
been processed.

Below errors are seen:

~> zdb -e pool-mds65/mdt65 -O oi.9/0x200000009:0x0:0x0
failed to hold dataset 'pool-mds65/mdt65': No such file or directory

~> zdb -e pool-oss0/ost0 -r file1 /tmp/filecopy1 -p.
failed to hold dataset 'pool-oss0/ost0': No such file or directory
zdb: internal error: No such file or directory

We need to make sure to process '-O|-r' options after the
'-e' option has been processed, which imports the pool to
the namespace if it's not in the cachefile.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Akash B <akash-b@hpe.com>
Closes #15532

11 months agozdb: show BRT statistics and dump its contents
Rob Norris [Sat, 18 Nov 2023 10:33:45 +0000 (21:33 +1100)]
zdb: show BRT statistics and dump its contents

Same idea as the dedup stats, but for block cloning.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Kay Pedersen <mail@mkwg.de>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Closes #15541

11 months agobrt: lift internal definitions into _impl header
Rob Norris [Sat, 18 Nov 2023 10:32:16 +0000 (21:32 +1100)]
brt: lift internal definitions into _impl header

So that zdb (and others!) can get at the BRT on-disk structures.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Kay Pedersen <mail@mkwg.de>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Closes #15541

11 months agoZTS: Fix zfs_load-key failures on F39
Tony Hutter [Mon, 27 Nov 2023 21:24:37 +0000 (13:24 -0800)]
ZTS: Fix zfs_load-key failures on F39

The zfs_load-key tests were failing on F39 due to their use of the
deprecated ssl.wrap_socket function.  This commit updates the test to
instead use ssl.SSLContext() as described in
https://stackoverflow.com/a/65194957.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #15534
Closes #15550

11 months agozfs-dkms: fix shell-init error message
AllKind [Mon, 27 Nov 2023 21:17:48 +0000 (22:17 +0100)]
zfs-dkms: fix shell-init error message

If all zfs dkms modules have been removed, a shell-init error message
may appear, because /var/lib/dkms/zfs does no longer exist.
Resolve this by leaving the directory earlier on.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Mart Frauenlob <AllKind@fastest.cc>
Closes #15576

11 months agoZVOL: Minor code cleanup
Alexander Motin [Mon, 27 Nov 2023 21:16:59 +0000 (16:16 -0500)]
ZVOL: Minor code cleanup

- Remove zsda_tx field, it is used only once.
 - Remove unneeded string lengths checks, all names are terminated.
 - Replace few explicit MAXNAMELEN usages with sizeof().
 - Change dsname from MAXNAMELEN to ZFS_MAX_DATASET_NAME_LEN, as
expected by dsl_dataset_name().  Both are 256 bytes now, but it is
better to be safe.

This should have no functional difference.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes #15535

11 months agoFreeBSD: Fix the build on FreeBSD 12
Alan Somers [Mon, 27 Nov 2023 20:58:03 +0000 (13:58 -0700)]
FreeBSD: Fix the build on FreeBSD 12

It was broken for several reasons:
* VOP_UNLOCK lost an argument in 13.0.  So OpenZFS should be using
  VOP_UNLOCK1, but a few direct calls to VOP_UNLOCK snuck in.
* The location of the zlib header moved in 13.0 and 12.1.  We can drop
  support for building on 12.0, which is EoL.
* knlist_init lost an argument in 13.0.  OpenZFS change 9d0887402ba
  assumed 13.0 or later.
* FreeBSD 13.0 added copy_file_range, and OpenZFS change 67a1b037915
  assumed 13.0 or later.

Sponsored-by: Axcient
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Alan Somers <asomers@gmail.com>
Closes #15551