git.proxmox.com Git - mirror_zfs.git/log
mirror_zfs.git
3 years agoSuppress cppcheck invalidSyntax warnings
Brian Behlendorf [Sat, 6 Mar 2021 01:56:35 +0000 (17:56 -0800)]
Suppress cppcheck invalidSyntax warnings

For some reason cppcheck 1.90 is generating an invalidSyntax warning
when the BF64_SET macro is used in the zstream source.  The same
warning is not reported by cppcheck 2.3, nor is there any evident
problem with the expanded macro.  This appears to be an issue with
this version of cppcheck.  This commit annotates the source to suppress
the warning.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #11700

3 years agoInitialize ZIL buffers
Brian Behlendorf [Fri, 5 Mar 2021 22:45:13 +0000 (14:45 -0800)]
Initialize ZIL buffers

When populating a ZIL destination buffer ensure it is always
zeroed before its contents are constructed.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Tom Caputi <caputit1@tcnj.edu>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #11687
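
The idea of zeroing a destination buffer before building its contents can be sketched as follows. This is a minimal user-space illustration, not the ZIL code: `demo_record_t` and `demo_fill_record()` are hypothetical names, and the record layout is invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record layout; this is not the real ZIL format. */
typedef struct demo_record {
	unsigned char		dr_type;
	unsigned long long	dr_seq;
} demo_record_t;

/*
 * Zero the whole destination first, then construct the contents, so
 * the unused tail of the buffer never carries stale bytes.
 */
static void
demo_fill_record(void *dst, size_t dst_size, unsigned char type,
    unsigned long long seq)
{
	demo_record_t rec;

	assert(dst_size >= sizeof (rec));
	memset(dst, 0, dst_size);

	memset(&rec, 0, sizeof (rec));
	rec.dr_type = type;
	rec.dr_seq = seq;
	memcpy(dst, &rec, sizeof (rec));
}
```

Without the first `memset()`, any bytes of the destination past the end of the record would keep whatever the buffer held before.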

3 years agozpool: use tab to indent continuation from removal status
Thomas Lamprecht [Fri, 5 Mar 2021 20:15:35 +0000 (21:15 +0100)]
zpool: use tab to indent continuation from removal status

Bring the output of the removal status in line with the other
"fields" that zpool status outputs, which allows a parser to
more easily detect this as a continuation of the 'remove:' output.

Before:
remove: Removal of vdev 0 copied 282G in 0h9m, completed on [...]
    776K memory used for removed device mappings

Now:
remove: Removal of vdev 0 copied 282G in 0h9m, completed on [...]
	776K memory used for removed device mappings

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Closes #11674

3 years agoDon't bomb out when using keylocation=file://
James Wah [Wed, 3 Mar 2021 16:28:49 +0000 (03:28 +1100)]
Don't bomb out when using keylocation=file://

Avoid following the error path when the operation in fact succeeded.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: James Wah <james@laird-wah.net>
Closes #11651

3 years agoAdd "zstd-fast" to help options for "compression" property
Jake Howard [Wed, 3 Mar 2021 16:14:19 +0000 (16:14 +0000)]
Add "zstd-fast" to help options for "compression" property

This value does work as expected, and is documented in the manpage.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Jake Howard <git@theorangeone.net>
Closes #11670

3 years agoFix assert in FreeBSD-specific dmu_read_pages
Andriy Gapon [Sun, 28 Feb 2021 01:23:09 +0000 (03:23 +0200)]
Fix assert in FreeBSD-specific dmu_read_pages

The function has three similar pieces of code: for read-behind pages,
requested pages and read-ahead pages.  All three pieces had an
assert to ensure that the page is not mapped.  Later the assert was
relaxed to require that the page is not mapped for writing.  But that
was done in two places out of three.  This change fixes the third piece,
read-ahead.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Andriy Gapon <avg@FreeBSD.org>
Closes #11654

3 years agoLinux 5.12 compat: replace bio_*_io_acct with disk_*_io_acct
Coleman Kane [Tue, 23 Feb 2021 02:18:41 +0000 (21:18 -0500)]
Linux 5.12 compat: replace bio_*_io_acct with disk_*_io_acct

The bio_*_io_acct functions became GPL exports, which causes the
kernel modules to fail to compile. This replaces the code with
alternate function calls to the disk_*_io_acct interfaces, which
are not GPL exports. This change was added in kernel commit
99dfc43ecbf67f12a06512918aaba61d55863efc.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #11639

3 years agoLinux 5.12 compat: bio->bi_disk member moved
Coleman Kane [Tue, 23 Feb 2021 02:07:51 +0000 (21:07 -0500)]
Linux 5.12 compat: bio->bi_disk member moved

The struct bio member bi_disk was moved underneath a new member named
bi_bdev. So all attempts to reference bio->bi_disk need to now become
bio->bi_bdev->bd_disk.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #11639
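
A compatibility accessor for the moved member might look roughly like the sketch below. It uses stand-in structures so it compiles in user space; the `demo_` types, the `DEMO_BIO_DISK()` macro and the `DEMO_HAVE_BIO_BI_DISK` switch are all hypothetical and are not the names used in the actual patch.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the kernel structures, reduced to the fields discussed. */
struct demo_gendisk { int gd_id; };
struct demo_block_device { struct demo_gendisk *bd_disk; };

#ifdef DEMO_HAVE_BIO_BI_DISK	/* hypothetical: pre-5.12 layout */
struct demo_bio { struct demo_gendisk *bi_disk; };
#define	DEMO_BIO_DISK(b)	((b)->bi_disk)
#else				/* 5.12-style layout: reach the disk via bi_bdev */
struct demo_bio { struct demo_block_device *bi_bdev; };
#define	DEMO_BIO_DISK(b)	((b)->bi_bdev->bd_disk)
#endif

/* Callers use the accessor and never touch the member directly. */
static int
demo_bio_disk_id(struct demo_bio *b)
{
	assert(b != NULL);
	return (DEMO_BIO_DISK(b)->gd_id);
}
```

Keeping the layout difference behind one macro means the rest of the code reads the same on both kernel generations.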

3 years agoLinux: increase max nvlist_src size
Brian Behlendorf [Wed, 24 Feb 2021 17:57:18 +0000 (09:57 -0800)]
Linux: increase max nvlist_src size

On Linux increase the maximum allowed size of the src nvlist which
can be passed to the /dev/zfs ioctl.  Originally, this was set
to a maximum of KMALLOC_MAX_SIZE (4M) because it was kmalloc'd.
Since that time it's been converted to a vmalloc so that's no
longer a hard limit, and it's desirable for `zfs send/recv` to
allow larger nvlists so more snapshots can be sent at once.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6572
Closes #11638

3 years agosend_iterate_snap : doall send without fromsnap
Cedric Maunoury [Wed, 24 Feb 2021 17:48:58 +0000 (18:48 +0100)]
send_iterate_snap : doall send without fromsnap

The behavior of a NULL fromsnap was inadvertently changed for a doall
send when the send/recv logic in libzfs was updated.  Restore the
previous behavior by correcting send_iterate_snap() to include all
the snapshots in the nvlist for this case.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Cedric Maunoury <cedric.maunoury@gmail.com>
Closes #11608

3 years agovdev_ops: don't try to call vdev_op_hold or vdev_op_rele when NULL
fbynite [Sun, 21 Feb 2021 04:19:20 +0000 (19:19 -0900)]
vdev_ops: don't try to call vdev_op_hold or vdev_op_rele when NULL

This prevents a panic after a SLOG add/removal on the root pool followed
by a zpool scrub.

When a SLOG is removed, a hole takes its place - the vdev_ops for a hole
is vdev_hole_ops, which defines the handler functions of vdev_op_hold
and vdev_op_rele as NULL.

This bug has been reported in both illumos and FreeBSD, although the
FreeBSD report describes a different trigger.

Credit for this patch goes to Patrick Mooney <pmooney@pfmooney.com>

Obtained from: illumos-gate commit: c65bd18728f34725
External-issue: https://www.illumos.org/issues/12981
External-issue: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=252396
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Wing <rob.fx907@gmail.com>
Closes #11623
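
The guard described above can be illustrated with a small ops table whose handlers may be NULL. This is a simplified sketch: the `demo_` types and functions are hypothetical and only mirror the shape of the `vdev_op_hold` / `vdev_op_rele` members named in the message.

```c
#include <assert.h>
#include <stddef.h>

typedef struct demo_vdev demo_vdev_t;
typedef void demo_vdev_hold_func_t(demo_vdev_t *vd);

typedef struct demo_vdev_ops {
	demo_vdev_hold_func_t	*vdev_op_hold;	/* may be NULL */
	demo_vdev_hold_func_t	*vdev_op_rele;	/* may be NULL */
} demo_vdev_ops_t;

struct demo_vdev {
	const demo_vdev_ops_t	*vdev_ops;
	int			vdev_holds;
};

static void demo_disk_hold(demo_vdev_t *vd) { vd->vdev_holds++; }
static void demo_disk_rele(demo_vdev_t *vd) { vd->vdev_holds--; }

/* A disk-like vdev implements both handlers; a hole implements neither. */
static const demo_vdev_ops_t demo_disk_ops = { demo_disk_hold, demo_disk_rele };
static const demo_vdev_ops_t demo_hole_ops = { NULL, NULL };

static void
demo_vdev_hold(demo_vdev_t *vd)
{
	assert(vd != NULL && vd->vdev_ops != NULL);
	/* Only call through the pointer when a handler is defined. */
	if (vd->vdev_ops->vdev_op_hold != NULL)
		vd->vdev_ops->vdev_op_hold(vd);
}

static void
demo_vdev_rele(demo_vdev_t *vd)
{
	assert(vd != NULL && vd->vdev_ops != NULL);
	if (vd->vdev_ops->vdev_op_rele != NULL)
		vd->vdev_ops->vdev_op_rele(vd);
}
```

Calling `demo_vdev_hold()` on a vdev that uses `demo_hole_ops` is then a no-op instead of a jump through a NULL function pointer.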

3 years agolibzpool: set_global_var: fix endianness handling (fixes zdb -o )
Christian Schwarz [Mon, 15 Feb 2021 12:02:32 +0000 (13:02 +0100)]
libzpool: set_global_var: fix endianness handling (fixes zdb -o )

Without this patch I get the error

  Setting global variables is only supported on little-endian systems

when using `zdb -o` on my amd64 machine.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Signed-off-by: Christian Schwarz <me@cschwarz.com>
Closes #11602
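
The message does not say what was wrong with the original check, so the following is only a general illustration of an endianness test that does not depend on platform macros. `demo_is_little_endian()` and `demo_set_u32()` are hypothetical helpers, not functions from libzpool.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 on a little-endian host, 0 otherwise. */
static int
demo_is_little_endian(void)
{
	const uint32_t probe = 1;
	unsigned char first_byte;

	/* On little-endian hosts the least significant byte comes first. */
	memcpy(&first_byte, &probe, sizeof (first_byte));
	return (first_byte == 1);
}

/*
 * Hypothetical setter that refuses to run on big-endian hosts.
 * Returns 0 on success, -1 if the host is not little-endian.
 */
static int
demo_set_u32(uint32_t *var, uint32_t value)
{
	assert(var != NULL);
	if (!demo_is_little_endian())
		return (-1);
	*var = value;
	return (0);
}
```

On an amd64 machine such a check reports little-endian, so a setter guarded this way would proceed rather than fail with the error quoted above.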

3 years agoRestore FreeBSD resource usage accounting
Ryan Moeller [Sat, 20 Feb 2021 06:34:33 +0000 (01:34 -0500)]
Restore FreeBSD resource usage accounting

Add zfs_racct_* interfaces for platform-dependent read/write accounting.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11613

3 years agoFreeBSD: disable the use of hardware crypto offload drivers for now
Mark Johnston [Thu, 18 Feb 2021 23:51:20 +0000 (18:51 -0500)]
FreeBSD: disable the use of hardware crypto offload drivers for now

First, the crypto request completion handler contains a bug in that it
fails to reset fs_done correctly after the request is completed.  This
is only a problem for asynchronous drivers.  Second, some hardware
drivers have input constraints which ZFS does not satisfy.  For
instance, ccp(4) apparently requires the AAD length for AES-GCM to be a
multiple of the cipher block size, and with qat(4) the AES-GCM AAD
length may not be longer than 240 bytes.  FreeBSD's generic crypto
framework doesn't have a mechanism to automatically fall back to a
software implementation if a hardware driver cannot process a request,
and ZFS does not tolerate such errors.

The plan is to implement such a fallback mechanism, but with FreeBSD
13.0 approaching we should simply disable the use of hardware drivers
for now.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Closes #11612

3 years agoFix report_mount_progress never calling set_progress_header
Andriy Gapon [Thu, 18 Feb 2021 21:53:05 +0000 (23:53 +0200)]
Fix report_mount_progress never calling set_progress_header

That happens because of an off-by-one mistake.
share_mount_one_cb() calls report_mount_progress(current=sm_done) after
having incremented sm_done by one.  Then report_mount_progress()
increments the parameter again.  It appears that that logic became
obsolete after commit a10d50f999511, parallel zfs mount.

On FreeBSD I observe that zfs mount -a -v prints, for example,
    (null): (201/248)
That happens because set_progress_header() is never called.

With this change the output becomes correct:
    Mounting ZFS filesystems: (209/248)

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Andriy Gapon <avg@FreeBSD.org>
Closes #11607
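
The off-by-one can be modelled in a few lines. In this sketch the caller increments its counter before reporting, as the message describes, and the reporting function uses the value as-is; the `demo_` names and the output buffer parameter are assumptions made for the illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the progress header state. */
static const char *demo_header = NULL;

static void
demo_set_progress_header(const char *header)
{
	demo_header = header;
}

/*
 * 'current' is the 1-based count of filesystems handled so far; the
 * caller has already incremented it.  Incrementing it again here would
 * mean it is never 1, so the header would never be set.
 */
static void
demo_report_mount_progress(int current, int total, char *out, size_t outlen)
{
	assert(current >= 1 && current <= total);
	if (current == 1)
		demo_set_progress_header("Mounting ZFS filesystems");
	(void) snprintf(out, outlen, "%s: (%d/%d)",
	    demo_header != NULL ? demo_header : "(null)", current, total);
}
```

If the function bumped `current` a second time, the first call would see 2, the `current == 1` branch would be skipped, and the output would start with `(null)` as in the report.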

3 years agoSupport uClibc for the tests compilations
José Luis Salvador Rufo [Wed, 17 Feb 2021 05:51:46 +0000 (06:51 +0100)]
Support uClibc for the tests compilations

There are two issues that prevent ZFS from being compiled with uClibc:
`backtrace()`, and `program_invocation_short_name` as a `const`.
This patch adds uClibc to the conditionals in the same way as is
already done for Glibc for `backtrace()`, and removes the external param
`program_invocation_short_name` because it is only used here in the
whole project.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: José Luis Salvador Rufo <salvador.joseluis@gmail.com>
Closes #11600

3 years agoLinux 5.11 compat: META
Brian Behlendorf [Wed, 10 Feb 2021 18:11:21 +0000 (10:11 -0800)]
Linux 5.11 compat: META

Increase the Linux-Maximum version in the META file to 5.11.
All of the required compatibility patches have been merged.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #11586

3 years agoTag zfs-2.0.3
Tony Hutter [Wed, 10 Feb 2021 20:15:17 +0000 (12:15 -0800)]
Tag zfs-2.0.3

META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>

3 years agoSet file mode during zfs_write
Antonio Russo [Mon, 8 Feb 2021 17:15:05 +0000 (10:15 -0700)]
Set file mode during zfs_write

3d40b65 refactored zfs_vnops.c, which shared much code verbatim between
Linux and BSD.  After a successful write, the suid/sgid bits are reset,
and the mode to be written is stored in newmode.  On Linux, this was
propagated to both the in-memory inode and znode, which is then updated
with sa_update.

3d40b65 accidentally removed the initialization of newmode, which
happened to occur on the same line as the inode update (which has been
moved out of the function).

The uninitialized newmode can be saved to disk, leading to a crash on
stat() of that file, in addition to a merely incorrect file mode.

Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Antonio Russo <aerusso@aerusso.net>
Closes #11474
Closes #11576
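
The importance of initializing the mode before it is stored can be shown with a small helper. This is a sketch under simplifying assumptions: `demo_mode_after_write()` is a hypothetical function, and it ignores the privilege checks that a real filesystem would apply before clearing the bits.

```c
#include <assert.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Compute the mode to store after a successful write: start from the
 * current mode, then drop the suid/sgid bits if the file is executable.
 */
static mode_t
demo_mode_after_write(mode_t mode)
{
	/* Initialize first, so an unchanged mode is still a valid value. */
	mode_t newmode = mode;

	if ((mode & (S_IXUSR | S_IXGRP | S_IXOTH)) != 0 &&
	    (mode & (S_ISUID | S_ISGID)) != 0)
		newmode &= (mode_t)~(S_ISUID | S_ISGID);

	return (newmode);
}
```

If `newmode` were left unassigned and then written out, whatever happened to be in that variable would become the file mode on disk.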

3 years agozfs-import-{cache,scan}: change condition to FileNotEmpty
наб [Fri, 5 Feb 2021 19:25:22 +0000 (20:25 +0100)]
zfs-import-{cache,scan}: change condition to FileNotEmpty

When all pools are exported ZFS will generate an empty cache file.
This will cause the import service to fail, which is sub-optimal,
since this means that dracut fails, and it is necessary to run
`zpool import -a` to boot, delete the file, and regenerate+reinstall
the initrd.

This resolves the issue by treating a zero-length cache file the
same as a missing cache file.  This aligns the behavior with that
of the `zpool` command itself.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11568

3 years agodracut: Fix race condition between load-key and import
Lorenz Hüdepohl [Tue, 26 Jan 2021 20:14:22 +0000 (21:14 +0100)]
dracut: Fix race condition between load-key and import

zfs-load-key.sh is called by the dracut-pre-mount.service unit, which has
no explicit 'After' dependency on zfs-import.target.  As a result, the
pool may not yet have been imported when zfs-load-key.sh runs, and the
script finishes without ever seeing the relevant pool.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Lorenz Hüdepohl <dev@stellardeath.org>
Closes #11500

3 years agozfs-list.8: clarify listing snapshots
Brian Behlendorf [Thu, 4 Feb 2021 17:56:28 +0000 (09:56 -0800)]
zfs-list.8: clarify listing snapshots

Clarify how to include snapshots in the `zfs list` output by
referencing the full name of the `listsnapshots` pool property,
and the `zfs list -t snapshot` option.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #11562
Closes #11565

3 years agozts-report.py: ignore some skipped tests in Github CI
George Melikov [Wed, 27 Jan 2021 12:18:01 +0000 (15:18 +0300)]
zts-report.py: ignore some skipped tests in Github CI

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Closes #11554

3 years agoCI: add ubuntu-* functional tests runner
George Melikov [Tue, 26 Jan 2021 12:01:44 +0000 (15:01 +0300)]
CI: add ubuntu-* functional tests runner

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Closes #11554

3 years agoCI: rename zfs-tests workflow
George Melikov [Tue, 26 Jan 2021 12:01:19 +0000 (15:01 +0300)]
CI: rename zfs-tests workflow

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Closes #11554

3 years agoTag 2.0.2 zfs-2.0.2
Brian Behlendorf [Mon, 1 Feb 2021 18:08:22 +0000 (10:08 -0800)]
Tag 2.0.2

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>

3 years agoAvoid updating the L2ARC device header unnecessarily
George Amanakis [Thu, 28 Jan 2021 17:20:03 +0000 (18:20 +0100)]
Avoid updating the L2ARC device header unnecessarily

If we do not write any buffers to the cache device and the evict hand
has not advanced do not update the cache device header.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Amanakis <gamanakis@gmail.com>
Closes #11522
Closes #11537

3 years agoFix zrele race in zrele_async that can cause hang
Paul Dagnelie [Thu, 28 Jan 2021 05:29:58 +0000 (21:29 -0800)]
Fix zrele race in zrele_async that can cause hang

There is a race condition in zfs_zrele_async when we are checking if
we would be the one to evict an inode. This can lead to a txg sync
deadlock.

Instead of calling into iput directly, we attempt to perform the atomic
decrement ourselves, unless that would set the i_count value to zero.
In that case, we dispatch a call to iput to run later, to prevent a
deadlock from occurring.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
Closes #11527
Closes #11530
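
The "decrement unless it would reach zero" step can be sketched with C11 atomics. This is a user-space analogue, not the kernel code: `demo_dec_unless_last()` is a hypothetical helper, and the deferred release is left to the caller.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Decrement *count by one unless that would take it to zero.
 * Returns true if the decrement happened.  Returns false if the caller
 * appears to hold the last reference, in which case the final release
 * should be handed to a deferred context instead of done inline.
 */
static bool
demo_dec_unless_last(atomic_int *count)
{
	int old = atomic_load(count);

	assert(old > 0);
	while (old > 1) {
		/* On failure 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(count, &old, old - 1))
			return (true);
	}
	return (false);
}
```

A caller would try `demo_dec_unless_last()` first and, only when it returns false, dispatch the final release to run later, which mirrors the approach described in the message.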

3 years agoFix a resource leak in uu_avl_pool_destroy
Alan Somers [Wed, 27 Jan 2021 03:39:28 +0000 (20:39 -0700)]
Fix a resource leak in uu_avl_pool_destroy

Need to destroy the pthread mutex created in uu_avl_pool_create.

https://svnweb.freebsd.org/base?view=revision&revision=262912

Obtained from: FreeBSD
Sponsored by: Spectra Logic Corporation
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alan Somers <asomers@gmail.com>
Closes #11528
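
Pairing mutex creation with destruction can be illustrated as below. This is a generic sketch of the pattern with hypothetical `demo_pool_*` names; it does not reproduce the libuutil data structures.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

typedef struct demo_pool {
	pthread_mutex_t	dp_lock;
} demo_pool_t;

/* Allocate a pool and initialize its mutex; returns NULL on failure. */
static demo_pool_t *
demo_pool_create(void)
{
	demo_pool_t *pool = calloc(1, sizeof (*pool));

	if (pool == NULL)
		return (NULL);
	if (pthread_mutex_init(&pool->dp_lock, NULL) != 0) {
		free(pool);
		return (NULL);
	}
	return (pool);
}

/*
 * Destroy the mutex before freeing the memory that contains it, so any
 * resources the implementation attached to it are released as well.
 * Returns the result of pthread_mutex_destroy().
 */
static int
demo_pool_destroy(demo_pool_t *pool)
{
	int err;

	assert(pool != NULL);
	err = pthread_mutex_destroy(&pool->dp_lock);
	free(pool);
	return (err);
}
```

Freeing the structure without calling `pthread_mutex_destroy()` leaks whatever the threading library allocated for the mutex on platforms where it allocates anything.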

3 years agoFix a man page link in zfs-program.8
Alan Somers [Wed, 27 Jan 2021 00:17:11 +0000 (17:17 -0700)]
Fix a man page link in zfs-program.8

zfs-program.8 has an orphan link, fix it.

https://svnweb.freebsd.org/base?view=revision&revision=360080

Obtained from: FreeBSD
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alan Somers <asomers@gmail.com>
Closes #11529

3 years agozfsprops.8: fix mispluralisation in "Default values is"
наб [Sun, 24 Jan 2021 23:57:51 +0000 (00:57 +0100)]
zfsprops.8: fix mispluralisation in "Default values is"

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11509

3 years agoZTS: Use swapctl to list swap devices on FreeBSD
Ryan Moeller [Sun, 24 Jan 2021 23:56:59 +0000 (18:56 -0500)]
ZTS: Use swapctl to list swap devices on FreeBSD

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11503

3 years agovdev_id: Add error message when $CONFIG is missing
Arshad Hussain [Sat, 23 Jan 2021 23:52:29 +0000 (05:22 +0530)]
vdev_id: Add error message when $CONFIG is missing

It was observed that vdev_id exits silently when
the $CONFIG file is missing.

This patch adds an error message for the case where vdev_id is
called without the default $CONFIG or '-c'. This makes
the reason for the exit easier for the end user to see.

Before Patch:
~~~~~~~~~~~~~
$ ./cmd/vdev_id/vdev_id
$

After Patch:
~~~~~~~~~~~~
$ ./cmd/vdev_id/vdev_id
Error: Config file "/etc/zfs/vdev_id.conf" not found
$

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Closes #11498

3 years agoFix two minor lint errors (cppcheck)
Colm [Sat, 23 Jan 2021 23:49:32 +0000 (23:49 +0000)]
Fix two minor lint errors (cppcheck)

Fix two minor errors reported by cppcheck:

In module/zfs/abd.c (abd_get_offset_impl), add non-NULL
assertion to prevent NULL dereference warning.

In module/zfs/arc.c (l2arc_write_buffers), change 'try'
variable to 'pass' to avoid C++ reserved word.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Colm Buckley <colm@tuatha.org>
Closes #11507

3 years agoRelax special_small_blocks assertion.
Alexander Motin [Sat, 23 Jan 2021 23:45:27 +0000 (18:45 -0500)]
Relax special_small_blocks assertion.

Follow up for commit 624222a, value asserted <= SPA_OLD_MAXBLOCKSIZE
instead of SPA_MAXBLOCKSIZE as it should be after the previous change.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes #11501

3 years agoFreeBSD: upstream changes to VFS interface
Ryan Moeller [Thu, 21 Jan 2021 23:20:14 +0000 (23:20 +0000)]
FreeBSD: upstream changes to VFS interface

Set VIRF_MOUNTPOINT flag on snapshot mountpoint.

Authored-by: Mateusz Guzik <mjg@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11458

3 years agoFreeBSD: fix HEAD build, conditionally remove FDSYNC defines
Matt Macy [Wed, 13 Jan 2021 00:22:29 +0000 (16:22 -0800)]
FreeBSD: fix HEAD build, conditionally remove FDSYNC defines

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #11458

3 years agodracut: Support /usr/bin as 'systemctl' path
Lorenz Hüdepohl [Thu, 21 Jan 2021 20:59:24 +0000 (21:59 +0100)]
dracut: Support /usr/bin as 'systemctl' path

On openSUSE the initrd has systemctl in /usr/bin, check this path as
well.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Signed-off-by: Lorenz Hüdepohl <dev@stellardeath.org>
Closes #11487

3 years agoInstall zgenhostid to sbindir
Antonio Russo [Thu, 21 Jan 2021 20:58:24 +0000 (13:58 -0700)]
Install zgenhostid to sbindir

zgenhostid(8) is used to modify or create /etc/hostid.  This
administrative tool is currently installed to bindir.  System utilities
are typically placed in sbin.

Modify the installation directory for zgenhostid.  Additionally, track
this change in its use in dracut and the rpm installation.

Authored-by: наб <nabijaczleweli@nabijaczleweli.xyz>
Authored-by: Antonio Russo <aerusso@aerusso.net>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Antonio Russo <aerusso@aerusso.net>
Closes #11485

3 years agoRe-apply path sanitizer, as mount(8) still mangles it
sterlingjensen [Tue, 19 Jan 2021 19:57:31 +0000 (13:57 -0600)]
Re-apply path sanitizer, as mount(8) still mangles it

Prior to util-linux 2.36.2, if a file or directory in the
current working directory was named 'dataset' then mount(8)
would prepend the current working directory to the dataset.

Eventually, we should be able to drop this workaround.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Sterling Jensen <sterlingjensen@users.noreply.github.com>
Closes #11295
Closes #11462

3 years agoZTS: avoid piping to special devices
Antonio Russo [Tue, 19 Jan 2021 19:53:35 +0000 (12:53 -0700)]
ZTS: avoid piping to special devices

As described in #11445, the kernel interface kernel_{read,write} no
longer act on special devices.  In the ZTS, zfs send and receive are
tested by piping to these devices, leading to spurious failures (for
positive tests) and may mask errors (for negative tests).

Until a more permanent mechanism to address this deficiency is
developed, clean up the output from the ZTS by avoiding directly piping
to or from /dev/null and /dev/zero.

For /dev/zero input, simply use a pipe: `cat </dev/zero |` .

However, for /dev/null output, the shell semantics for pipe failures
mean that zfs send error codes will be masked by the successful
`| cat >/dev/null` command execution.  In that case, use a temporary
file under $TEST_BASE_DIR for output instead.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Attila Fülöp <attila@fueloep.org>
Signed-off-by: Antonio Russo <aerusso@aerusso.net>
Closes #11478

3 years agoZTS: avoid race to unmount in zfs_rollback_001
Antonio Russo [Wed, 13 Jan 2021 01:20:02 +0000 (18:20 -0700)]
ZTS: avoid race to unmount in zfs_rollback_001

The zfs_rollback_001 test repeatedly modifies files in a temporary test
dataset.  Before each iteration, any preexisting dataset is removed,
after first being unmounted with umount -f if necessary.

Add a short delay after the forced unmount, avoiding a race that can
prevent zfs destroy from succeeding, leading to a test failure.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Antonio Russo <aerusso@aerusso.net>
Closes #11451

3 years agoassertion failed in arc_wait_for_eviction()
Matthew Ahrens [Fri, 8 Jan 2021 04:06:32 +0000 (20:06 -0800)]
assertion failed in arc_wait_for_eviction()

If the system is very low on memory (specifically,
`arc_free_memory() < arc_sys_free/2`, i.e. less than 1/16th of RAM
free), `arc_evict_state_impl()` will defer wakeups.  In this case, the
arc_evict_waiter_t's remain on the list, even though `arc_evict_count`
has been incremented past their `aew_count`.

The problem is that `arc_wait_for_eviction()` assumes that if there are
waiters on the list, the count they are waiting for has not yet been
reached.  However, the deferred wakeups may violate this, causing
`ASSERT(last->aew_count > arc_evict_count)` to fail.

This commit resolves the issue by having new waiters use the greater of
`arc_evict_count` and the last `aew_count`.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Wilson <gwilson@delphix.com>
Reviewed-by: George Amanakis <gamanakis@gmail.com>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes #11285
Closes #11397
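
The rule for choosing a new waiter's target count can be written as a small pure function. This is a sketch of the arithmetic only: `demo_new_waiter_count()` and its parameters are hypothetical, and the waiter list is reduced to a flag plus the last waiter's count.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Compute the eviction count a new waiter should wait for.
 * 'have_waiters' is nonzero when the waiter list is not empty, in which
 * case 'last_aew_count' is the target of the last waiter on the list.
 */
static uint64_t
demo_new_waiter_count(uint64_t arc_evict_count, int have_waiters,
    uint64_t last_aew_count, uint64_t amount)
{
	uint64_t base = arc_evict_count;

	/* Use the greater of the global count and the last waiter's count. */
	if (have_waiters && last_aew_count > base)
		base = last_aew_count;

	assert(base >= arc_evict_count);
	return (base + amount);
}
```

When wakeups were deferred, the last waiter's count may already be behind `arc_evict_count`; taking the maximum keeps the new target ahead of the global count in that case too.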

3 years agoFreeBSD: minor_t needs to be signed so that -1 is recognized as such
Matthew Macy [Thu, 7 Jan 2021 18:41:27 +0000 (10:41 -0800)]
FreeBSD: minor_t needs to be signed so that -1 is recognized as such

zfsdev_close sets zs_minor to -1 to avoid duplicate calls to
destroy. This doesn't mix well with the current u_int used.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #11437

3 years agoLinux 5.10 compat: restore custom uio_prefaultpages()
Brian Behlendorf [Fri, 22 Jan 2021 17:58:49 +0000 (09:58 -0800)]
Linux 5.10 compat: restore custom uio_prefaultpages()

As part of commit 1c2358c1 the custom uio_prefaultpages() code
was removed in favor of using the generic kernel provided
iov_iter_fault_in_readable() interface.  Unfortunately, it
turns out that up until the Linux 4.7 kernel the function would
only ever fault in the first iovec of the iov_iter.  The result
being uiomove_iov() may hang waiting for the page.

This commit effectively restores the custom uio_prefaultpages()
code for Linux 4.9 and earlier kernels which contain the
troublesome version of iov_iter_fault_in_readable().

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #11463
Closes #11484

3 years agoZTS: three small follow up fixes for #11167
Attila Fülöp [Thu, 10 Dec 2020 05:27:12 +0000 (06:27 +0100)]
ZTS: three small follow up fixes for #11167

Follow up fix for 0cb40fa3. Remove unused variables, don't source
unused libs and add missed cleanup.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Attila Fülöp <attila@fueloep.org>
Closes #11311

3 years agozpool: Dryrun fails to list some devices
Attila Fülöp [Fri, 4 Dec 2020 22:04:39 +0000 (23:04 +0100)]
zpool: Dryrun fails to list some devices

`zpool create -n` fails to list cache and spare vdevs.
`zpool add -n` fails to list spare devices.
`zpool split -n` fails to list `special` and `dedup` labels.
`zpool add -n` and `zpool split -n` shouldn't list hole devices.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Attila Fülöp <attila@fueloep.org>
Closes #11122
Closes #11167

3 years agoTag 2.0.1 zfs-2.0.1
Brian Behlendorf [Tue, 5 Jan 2021 22:16:53 +0000 (14:16 -0800)]
Tag 2.0.1

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>

3 years agoAutoconf 2.70 compatibility
Brian Behlendorf [Sun, 3 Jan 2021 00:55:55 +0000 (16:55 -0800)]
Autoconf 2.70 compatibility

Several m4 macros have been retired in autoconf 2.70.  Update the
build system to use the new macros provided to replace them.

* Replaced AC_HELP_STRING with AS_HELP_STRING.

* Replaced AC_TRY_COMPILE with AC_COMPILE_IFELSE/AC_LANG_PROGRAM.

* Replaced AC_CANONICAL_SYSTEM with AC_CANONICAL_TARGET

* Replaced AC_PROG_LIBTOOL with LT_INIT

* $CPP is not defined in ZFS_AC_KERNEL and really shouldn't be
  directly used like this.  Replace it with an $AWK command
  to extract the kernel source version.

Reviewed-by: Eli Schwartz <eschwartz@archlinux.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #11413
Closes #11419

3 years agozfs_mount_all_mountpoints: cleanup_all should leave pool root mounted
Toomas Soome [Sun, 3 Jan 2021 00:54:53 +0000 (02:54 +0200)]
zfs_mount_all_mountpoints: cleanup_all should leave pool root mounted

If the pool root is not mounted, then zpool umount in the next test will
leave the dataset mountpoint directory around and the next zfs mount -a
will fail with the error: cannot mount '/testpool': directory is not empty

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Toomas Soome <tsoome@me.com>
Closes #11417

3 years agoVZ 7 kernel compat: introduce ITER-enabled .direct_IO() via IOVECs
Konstantin Khorenko [Wed, 30 Dec 2020 22:18:29 +0000 (01:18 +0300)]
VZ 7 kernel compat: introduce ITER-enabled .direct_IO() via IOVECs

Virtuozzo 7 kernels starting 3.10.0-1127.18.2.vz7.163.46
have the following configuration:

  * no HAVE_VFS_RW_ITERATE
  * HAVE_VFS_DIRECT_IO_ITER_RW_OFFSET

=> let's add implementation of zpl_direct_IO() via
zpl_aio_{read,write}() in this case.

https://bugs.openvz.org/browse/OVZ-7243

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Konstantin Khorenko <khorenko@virtuozzo.com>
Closes #11410
Closes #11411

3 years agoMemory leak in zdb:import_checkpointed_state()
Matthew Ahrens [Fri, 25 Dec 2020 05:07:24 +0000 (21:07 -0800)]
Memory leak in zdb:import_checkpointed_state()

Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes #11396

3 years agoMemory leak in ztest_dmu_objset_own()
Matthew Ahrens [Fri, 25 Dec 2020 04:58:17 +0000 (20:58 -0800)]
Memory leak in ztest_dmu_objset_own()

Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes #11396

3 years agoMemory leak in ztest_vdev_attach_detach()
Matthew Ahrens [Fri, 25 Dec 2020 04:51:13 +0000 (20:51 -0800)]
Memory leak in ztest_vdev_attach_detach()

Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes #11396

3 years agonvlist leaked in zpool_find_config()
Matthew Ahrens [Wed, 23 Dec 2020 17:52:24 +0000 (09:52 -0800)]
nvlist leaked in zpool_find_config()

In `zpool_find_config()`, the `pools` nvlist is leaked.  Part of it (a
sub-nvlist) is returned in `*configp`, but the callers also leak that.

Additionally, in `zdb.c:main()`, the `searchdirs` is leaked.

The leaks were detected by ASAN (`configure --enable-asan`).

This commit resolves the leaks.

Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes #11396

3 years agoimplicit conversion from 'boolean_t' to 'ds_hold_flags_t'
Toomas Soome [Mon, 28 Dec 2020 00:31:02 +0000 (02:31 +0200)]
implicit conversion from 'boolean_t' to 'ds_hold_flags_t'

Build error on illumos with gcc 10 did reveal:

In function 'dmu_objset_refresh_ownership':
../../common/fs/zfs/dmu_objset.c:857:25: error: implicit conversion
from 'boolean_t' to 'ds_hold_flags_t' {aka 'enum ds_hold_flags'}
[-Werror=enum-conversion]
      857 |  dsl_dataset_disown(ds, decrypt, tag);
          |                         ^~~~~~~
cc1: all warnings being treated as errors

libzfs_input_check.c: In function 'zfs_ioc_input_tests':
libzfs_input_check.c:754:28: error: implicit conversion from
'enum dmu_objset_type' to 'enum lzc_dataset_type'
[-Werror=enum-conversion]
  754 |  err = lzc_create(dataset, DMU_OST_ZFS, NULL, NULL, 0);
      |                            ^~~~~~~~~~~
cc1: all warnings being treated as errors

The same issue is present in openzfs, and also the same issue about
ds_hold_flags_t, which currently defines exactly one valid value.

Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Toomas Soome <tsoome@me.com>
Closes #11406
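
One way to avoid this class of warning is to translate the boolean into the flags type explicitly. The sketch below uses hypothetical `demo_` enums and a made-up "none" value; it illustrates the pattern and is not taken from the patch itself.

```c
#include <assert.h>

typedef enum { DEMO_B_FALSE, DEMO_B_TRUE } demo_boolean_t;

typedef enum demo_hold_flags {
	DEMO_HOLD_FLAG_NONE	= 0,		/* hypothetical "no flags" value */
	DEMO_HOLD_FLAG_DECRYPT	= 1 << 0
} demo_hold_flags_t;

/*
 * Map the boolean onto the flags enum explicitly, so no value of one
 * enumerated type is ever passed where the other is expected.
 */
static demo_hold_flags_t
demo_hold_flags_from_decrypt(demo_boolean_t decrypt)
{
	assert(decrypt == DEMO_B_FALSE || decrypt == DEMO_B_TRUE);
	return (decrypt == DEMO_B_TRUE ?
	    DEMO_HOLD_FLAG_DECRYPT : DEMO_HOLD_FLAG_NONE);
}
```

With an explicit mapping in place, the compiler sees matching enum types at every call site and `-Werror=enum-conversion` has nothing to complain about.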

3 years agoLinux 5.11 compat: blk_{un}register_region()
Brian Behlendorf [Tue, 22 Dec 2020 22:12:31 +0000 (14:12 -0800)]
Linux 5.11 compat: blk_{un}register_region()

As of 5.11 the blk_register_region() and blk_unregister_region()
functions have been retired. This isn't a problem since add_disk()
has implicitly allocated minor numbers for a very long time.

Reviewed-by: Rafael Kitover <rkitover@gmail.com>
Reviewed-by: Coleman Kane <ckane@colemankane.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #11387
Closes #11390

3 years agoLinux 5.11 compat: revalidate_disk_size()
Brian Behlendorf [Tue, 22 Dec 2020 21:53:25 +0000 (13:53 -0800)]
Linux 5.11 compat: revalidate_disk_size()

Both revalidate_disk_size() and revalidate_disk() have been removed.
Functionally this isn't a problem because we only relied on these
functions to call zvol_revalidate_disk() for us and to perform any
additional handling which might be needed for that kernel version.
When neither is available we know there's no additional handling
needed and we can directly call zvol_revalidate_disk().

Reviewed-by: Rafael Kitover <rkitover@gmail.com>
Reviewed-by: Coleman Kane <ckane@colemankane.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #11387
Closes #11390

3 years agoLinux 5.11 compat: bdev_whole()
Brian Behlendorf [Tue, 22 Dec 2020 21:02:59 +0000 (13:02 -0800)]
Linux 5.11 compat: bdev_whole()

The bd_contains member was removed from the block_device structure.
Callers needing to determine if a vdev is a whole block device should
use the new bdev_whole() wrapper.  For older kernels we provide our
own bdev_whole() wrapper which relies on bd_contains for compatibility.

Reviewed-by: Rafael Kitover <rkitover@gmail.com>
Reviewed-by: Coleman Kane <ckane@colemankane.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #11387
Closes #11390

3 years agoLinux 5.11 compat: bio_start_io_acct() / bio_end_io_acct()
Brian Behlendorf [Tue, 22 Dec 2020 20:17:13 +0000 (12:17 -0800)]
Linux 5.11 compat: bio_start_io_acct() / bio_end_io_acct()

The generic IO accounting functions have been removed in favor of the
bio_start_io_acct() and bio_end_io_acct() functions which provide a
better interface.  These new functions were introduced in the 5.8
kernels but it wasn't until the 5.11 kernel that the previous generic
IO accounting interfaces were removed.

This commit updates the blk_generic_*_io_acct() wrappers to provide
an interface similar to the updated kernel interface.  It's slightly
different because for older kernels we need to pass the request queue
as well as the bio.

Reviewed-by: Rafael Kitover <rkitover@gmail.com>
Reviewed-by: Coleman Kane <ckane@colemankane.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #11387
Closes #11390

3 years agoLinux 5.11 compat: lookup_bdev()
Brian Behlendorf [Tue, 22 Dec 2020 18:26:45 +0000 (10:26 -0800)]
Linux 5.11 compat: lookup_bdev()

The lookup_bdev() function has been updated to require a dev_t
be passed as the second argument. This is actually pretty nice
since the major number stored in the dev_t was the only part we
were interested in. This allows us to avoid handling the bdev
entirely.  The vdev_lookup_bdev() wrapper was updated to emulate
the behavior of the new lookup_bdev() for all supported kernels.

Reviewed-by: Rafael Kitover <rkitover@gmail.com>
Reviewed-by: Coleman Kane <ckane@colemankane.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #11387
Closes #11390

3 years agoLinux 5.11 compat: conftest
Brian Behlendorf [Wed, 23 Dec 2020 19:40:02 +0000 (11:40 -0800)]
Linux 5.11 compat: conftest

Update the ZFS_LINUX_TEST_PROGRAM macro to always set the module
license.  As of the 5.11 kernel not setting a license has been
converted from a warning to an error.

Reviewed-by: Rafael Kitover <rkitover@gmail.com>
Reviewed-by: Coleman Kane <ckane@colemankane.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #11387
Closes #11390

3 years agodbufstat: Fix warnings with Python 3.8
Ryan Moeller [Wed, 23 Dec 2020 23:10:35 +0000 (18:10 -0500)]
dbufstat: Fix warnings with Python 3.8

Replace "is" with "==" and "is not" with "!=".

Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11394

3 years agoLinux 5.10 compat: META
Brian Behlendorf [Wed, 23 Dec 2020 16:55:02 +0000 (08:55 -0800)]
Linux 5.10 compat: META

Increase the Linux-Maximum version in the META file to 5.10.
All of the required compatibility patches have been merged.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #11391

3 years agozfs-kmods: install to /lib/modules instead of /usr/lib/modules
Christian Schwarz [Tue, 22 Dec 2020 04:14:32 +0000 (04:14 +0000)]
zfs-kmods: install to /lib/modules instead of /usr/lib/modules

Before this patch, dracut wouldn't find zfs.ko for inclusion in
initramfs. This was caused by the packages installing into
/usr/lib/modules instead of /lib/modules.  Correcting this allows
dracut to do the right thing, even without

    # /etc/dracut.conf
    add_drivers+=" zfs "

Notably, rpm/redhat/zfs-kmod.spec.in does not contain the definition of
the `prefix` macro that this commit removes in the generic kmod spec.
And https://rpmfusion.org/Packaging/KernelModules/Kmods2 does not
mention `prefix` at all.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Christian Schwarz <me@cschwarz.com>
Closes #11381

3 years agoDangling reference from dmu_objset_upgrade
Andy Fiddaman [Mon, 21 Dec 2020 18:13:23 +0000 (18:13 +0000)]
Dangling reference from dmu_objset_upgrade

After porting the fix for https://github.com/openzfs/zfs/issues/5295
over to illumos, we started hitting an assertion failure when running
the testsuite:

assertion failed: rc->rc_count == number, file: .../refcount.c

and the unexpected hold has this stack:

dsl_dataset_long_hold+0x59 dmu_objset_upgrade+0x73
dmu_objset_id_quota_upgrade+0x15 dmu_objset_own+0x14f

The simplest reproducer for this in illumos is

    zpool create -f -O version=1 testpool c3t0d0; zpool destroy testpool

which is run as part of the zpool_create_tempname test, but I can't get
this to trigger on FreeBSD. This appears to be because the call to
txg_wait_synced() in dmu_objset_upgrade_stop() (which was missing in
illumos) slows down dmu_objset_disown() enough to avoid the condition.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Andy Fiddaman <andy@omnios.org>
Closes #11368

3 years agoLinux 4.18.0-257.el8 compat: blk_alloc_queue()
Brian Behlendorf [Mon, 21 Dec 2020 18:11:56 +0000 (10:11 -0800)]
Linux 4.18.0-257.el8 compat: blk_alloc_queue()

The CentOS stream 4.18.0-257 kernel appears to have backported
the Linux 5.9 change to make_request_fn and the associated API.
To maintain weak modules compatibility the original symbol was
retained and the new interface blk_alloc_queue_rh() was added.

Unfortunately, blk_alloc_queue() was replaced in the blkdev.h
header by blk_alloc_queue_rh() so there doesn't seem to be a way
to build new kmods against the old interfaces, even though they
appear to still be available for weak module binding.

To accommodate this a configure check is added for the new _rh()
variant of the function and used if available.  If compatibility
code gets added to the kernel for the original blk_alloc_queue()
interface this should be fine.  OpenZFS will simply continue to
prefer the new interface and only fallback to blk_alloc_queue()
when blk_alloc_queue_rh() isn't available.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #11374

3 years agoLinux 5.10 compat: also zvol_revalidate_disk()
Michael D Labriola [Fri, 18 Dec 2020 17:36:19 +0000 (12:36 -0500)]
Linux 5.10 compat: also zvol_revalidate_disk()

Commit 59b68723 added a configure check for 5.10, which removed
revalidate_disk(), and conditionally replaced its usage with a call to
the new revalidate_disk_size() function.  However, the old function also
invoked the device's registered callback, in our case
zvol_revalidate_disk().  This commit adds a call to zvol_revalidate_disk()
in zvol_update_volsize() to make sure the code path stays the same.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Michael D Labriola <michael.d.labriola@gmail.com>
Closes #11358

3 years agoFix maybe uninitialized variable warning
Brian Behlendorf [Sun, 20 Dec 2020 17:50:13 +0000 (09:50 -0800)]
Fix maybe uninitialized variable warning

Commit 1c2358c12 restructured this code and introduced a warning
about the variable maybe not being initialized.  This cannot happen
with the updated code but we should initialize the variable anyway
to silence the warning.

    zpl_file.c: In function ‘zpl_iter_write’:
    zpl_file.c:324:9: warning: ‘count’ may be used uninitialized
        in this function [-Wmaybe-uninitialized]
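
A minimal sketch of the pattern, not the actual zpl_file.c code: the
variable is only written through a helper on the success path, which gcc
cannot always prove, so it is given an initial value up front. The helper
get_length() and its return values are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: stores a length in *out and returns 0, else -1. */
int
get_length(int ok, size_t *out)
{
	assert(out != NULL);
	if (!ok)
		return (-1);
	*out = 42;
	return (0);
}

/* Return the length reported by get_length(), or 0 if it failed. */
size_t
length_or_zero(int ok)
{
	size_t count = 0;	/* initialized up front to silence the warning */

	if (get_length(ok, &count) != 0)
		return (0);
	return (count);
}
```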

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #11373

3 years agoRemove iov_iter_advance() from iter_read
Brian Behlendorf [Sun, 20 Dec 2020 17:49:29 +0000 (09:49 -0800)]
Remove iov_iter_advance() from iter_read

There's no need to call iov_iter_advance() in zpl_iter_read().
This was preserved from the previous code where it wasn't needed
but also didn't cause any problems.  Now that the iter functions
also handle pipes that's no longer the case.  When fully reading a
pipe buffer iov_iter_advance() may result in the pipe buf release
function being called which will not be registered resulting in
a NULL dereference.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #11375
Closes #11378

3 years agoLinux 5.10 compat: use iov_iter in uio structure
Brian Behlendorf [Fri, 18 Dec 2020 16:48:26 +0000 (08:48 -0800)]
Linux 5.10 compat: use iov_iter in uio structure

As of the 5.10 kernel the generic splice compatibility code has been
removed.  All filesystems are now responsible for registering a
->splice_read and ->splice_write callback to support this operation.

The good news is the VFS provided generic_file_splice_read() and
iter_file_splice_write() callbacks can be used provided the ->iter_read
and ->iter_write callbacks support pipes.  However, this is currently
not the case and only iovecs and bvecs (not pipes) are ever attached
to the uio structure.

This commit changes that by allowing full iov_iter structures to be
attached to uios.  Ever since the 4.9 kernel the iov_iter structure
has supported iovecs, kvecs, bvecs, and pipes so it's desirable to
pass the entire thing when possible.  In conjunction with this the
uio helper functions (e.g. uiomove(), uiocopy(), etc.) have been
updated to understand the new UIO_ITER type.

Note that using the kernel provided uio_iter interfaces allowed the
existing Linux specific uio handling code to be simplified.  When
there's no longer a need to support kernels older than 4.9, then
it will be possible to remove the iovec and bvec members from the
uio structure and always use a uio_iter.  Until then we need to
maintain all of the existing types for older kernels.

Some additional refactoring and cleanup was included in this change:

- Added checks to configure to detect available iov_iter interfaces.
  Some are available all the way back to the 3.10 kernel and are used
  when available.  In particular, uio_prefaultpages() now always uses
  iov_iter_fault_in_readable() which is available for all supported
  kernels.

- The unused UIO_USERISPACE type has been removed.  It is no longer
  needed now that the uio_seg enum is platform specific.

- Moved zfs_uio.c from the zcommon.ko module to the Linux specific
  platform code for the zfs.ko module.  This gets it out of libzfs
  where it was never needed and keeps this Linux specific code out
  of the common sources.

- Removed unnecessary O_APPEND handling from zfs_iter_write().  This
  is redundant since O_APPEND is already handled in zfs_write().

NOTE: Cleanly applying this kernel compatibility change required
applying the following commits.  This makes the change larger than
it absolutely needs to be, but the resulting code matches what's in
the master branch.  This is both more tested and makes it easier to
apply any future backports in this area.

7cf4cd824 Remove incorrect assertion
783be694f Reduce confusion in zfs_write
af5626ac2 Return EFAULT at the end of zfs_write() when set
cc1f85be8 Simplify offset and length limit in zfs_write
9585538d0 Const some unchanging variables in zfs_write
86e74dc16 Remove redundant oid parameter to update_pages
b3d723fb0 Factor uid, gid, and projid out of loop in zfs_write
3d40b6554 Share zfs_fsync, zfs_read, zfs_write, et al between Linux and FreeBSD

Reviewed-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #11351

3 years agoRemove incorrect assertion
Brian Behlendorf [Tue, 24 Nov 2020 17:28:42 +0000 (09:28 -0800)]
Remove incorrect assertion

Commit 85703f6 added a new ASSERT to zfs_write() as part of the
cleanup which isn't correct in the case where multiple processes
are concurrently extending a file.  The `zp->z_size` is updated
atomically while holding a range lock on only a portion of the
file.  Therefore, it's possible for the file size to increase
after the same check is performed earlier in the loop causing this
ASSERT to fail.  The code itself handles this case correctly so
only the invalid ASSERT needs to be removed.

Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #11235

3 years agoReduce confusion in zfs_write
Ryan Moeller [Wed, 18 Nov 2020 23:06:59 +0000 (18:06 -0500)]
Reduce confusion in zfs_write

Is this block when abuf != NULL ever reached? Yes, it is.

Add asserts and comments to prove that when we get here, we have a full
block write at an aligned offset extending past EOF.

Simplify by removing the check that tx_bytes == max_blksz, since we can
assert that it is always true.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11191

3 years agoReturn EFAULT at the end of zfs_write() when set
Ryan Moeller [Sat, 14 Nov 2020 18:16:26 +0000 (13:16 -0500)]
Return EFAULT at the end of zfs_write() when set

FreeBSD's VFS expects EFAULT from zfs_write() if we didn't complete
the full write so it can retry the operation.  Add some missing
SET_ERRORs in zfs_write().

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11193

3 years agoSimplify offset and length limit in zfs_write
Ryan Moeller [Mon, 9 Nov 2020 21:01:56 +0000 (16:01 -0500)]
Simplify offset and length limit in zfs_write

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matt Macy <mmacy@FreeBSD.org>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11176

3 years agoConst some unchanging variables in zfs_write
Ryan Moeller [Wed, 4 Nov 2020 23:10:12 +0000 (23:10 +0000)]
Const some unchanging variables in zfs_write

Show that these values will not be changing later.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matt Macy <mmacy@FreeBSD.org>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11176

3 years agoRemove redundant oid parameter to update_pages
Ryan Moeller [Wed, 4 Nov 2020 21:47:14 +0000 (21:47 +0000)]
Remove redundant oid parameter to update_pages

The oid comes from the znode we are already passing.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matt Macy <mmacy@FreeBSD.org>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11176

3 years agoFactor uid, gid, and projid out of loop in zfs_write
Ryan Moeller [Wed, 4 Nov 2020 22:10:13 +0000 (22:10 +0000)]
Factor uid, gid, and projid out of loop in zfs_write

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matt Macy <mmacy@FreeBSD.org>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11176

3 years agoShare zfs_fsync, zfs_read, zfs_write, et al between Linux and FreeBSD
Matthew Macy [Wed, 21 Oct 2020 21:08:06 +0000 (14:08 -0700)]
Share zfs_fsync, zfs_read, zfs_write, et al between Linux and FreeBSD

The zfs_fsync, zfs_read, and zfs_write functions are almost identical
between Linux and FreeBSD.  With a little refactoring they can be
moved to the common code which is what is done by this commit.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #11078

3 years agoZTS: Simplify zpool_initialize_verify_initialized
Brian Behlendorf [Fri, 18 Dec 2020 16:42:59 +0000 (08:42 -0800)]
ZTS: Simplify zpool_initialize_verify_initialized

Consider the test to be a success as long as the initializing pattern
is found at least once per metaslab.  This indicates that at least
part of the free space was initialized.  Ideally we'd check that the
pattern was written to all free space but that's much trickier so this
check is a reasonable compromise.

Using a here-string to feed the loop in this test causes an empty
string to still trigger the loop so we miss the `spacemaps=0` case.
Pipe into the loop instead.

While here, we can use `zpool wait -t initialize $TESTPOOL` to wait for
the pool to initialize.

Co-authored-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11365

3 years agospecial device removal space accounting fixes
Matthew Ahrens [Thu, 17 Dec 2020 20:11:56 +0000 (12:11 -0800)]
special device removal space accounting fixes

The space in special devices is not included in spa_dspace (or
dsl_pool_adjustedsize(), or the zfs `available` property).  Therefore
there is always at least as much free space in the normal class, as
there is allocated in the special class(es).  And therefore, there is
always enough free space to remove a special device.

However, the checks for free space when removing special devices did not
take this into account.  This commit corrects that.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Don Brady <don.brady@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes #11329

3 years agoUse the correct return type for getopt
sterlingjensen [Thu, 17 Dec 2020 18:19:30 +0000 (12:19 -0600)]
Use the correct return type for getopt

Use the correct return type for getopt otherwise clang complains
about tautological-constant-out-of-range-compare.
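
For context, getopt() returns an int, with -1 marking the end of the
options. The sketch below assumes the warning came from storing that
result in a narrower type such as char, which the commit message does not
state. The option string "ab:" and both function names are made up for the
example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <unistd.h>

/* Count the options in argv; "ab:" is an arbitrary option string. */
int
count_options(int argc, char *argv[])
{
	int c;		/* int, not char: getopt() returns int, -1 at the end */
	int seen = 0;

	assert(argv != NULL);
	optind = 1;	/* start parsing at the first argument */
	while ((c = getopt(argc, argv, "ab:")) != -1) {
		if (c == 'a' || c == 'b')
			seen++;
	}
	return (seen);
}

/* Parse a fixed command line equivalent to: prog -a -b x */
int
demo_count(void)
{
	char arg0[] = "prog", arg1[] = "-a", arg2[] = "-b", arg3[] = "x";
	char *argv[] = { arg0, arg1, arg2, arg3, NULL };

	return (count_options(4, argv));
}
```

With an int the comparison against -1 is meaningful on every platform,
including those where plain char is unsigned.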

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Sterling Jensen <sterlingjensen@users.noreply.github.com>
Closes #11359

3 years agoDKMS: Disable weak modules
gregory-lee-bartholomew [Tue, 15 Dec 2020 17:22:30 +0000 (11:22 -0600)]
DKMS: Disable weak modules

Fedora does not guarantee a stable kABI, so weak modules should be
disabled. See the dkms man page for a more detailed explanation of the
weak module feature.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Gregory Bartholomew <gregory.lee.bartholomew@gmail.com>
Closes #9891
Closes #11128
Closes #11242
Closes #11335

3 years agolua: avoid gcc -Wreturn-local-addr bug
Ryan Libby [Tue, 15 Dec 2020 17:20:48 +0000 (09:20 -0800)]
lua: avoid gcc -Wreturn-local-addr bug

Avoid a bug with gcc's -Wreturn-local-addr warning with some
obfuscation.  In buggy versions of gcc, if a return value is an
expression that involves the address of a local variable, and even if
that address is legally converted to a non-pointer type, a warning may
be emitted and the value of the address may be replaced with zero.
However, buggy versions don't emit the warning or replace the value
when simply returning a local variable of non-pointer type.

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90737

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Libby <rlibby@FreeBSD.org>
Closes #11337

3 years agospa: avoid type narrowing warning
Ryan Libby [Tue, 15 Dec 2020 17:20:06 +0000 (09:20 -0800)]
spa: avoid type narrowing warning

Building the spa module for i386 caused gcc to emit
-Wint-to-pointer-cast "cast to pointer from integer of different size"
because spa.spa_did was uint64_t but pthread_join (via thread_join in
spa_deactivate) takes a pointer (32-bit on i386).  Define spa_did to be
pointer-size instead.  For now spa_did is in fact never non-zero and the
thread_join could instead be ifdef'd out, but changing the size of
spa_did may be more useful for the future.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Libby <rlibby@FreeBSD.org>
Closes #11336

3 years agoFreeBSD libzfs: gcc requires __thread after static
Ryan Libby [Mon, 14 Dec 2020 17:28:24 +0000 (09:28 -0800)]
FreeBSD libzfs: gcc requires __thread after static

Building libzfs with gcc on FreeBSD failed because gcc is picky about
the order of keywords in declarations with __thread, whereas clang is
more relaxed.

https://gcc.gnu.org/onlinedocs/gcc/Thread-Local.html
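
A small sketch of the ordering gcc accepts, with a made-up variable and
function name: when __thread is combined with static or extern, it has to
come immediately after that storage class specifier.

```c
/* gcc accepts "static __thread"; it rejects "__thread static". */
static __thread int call_count;

/* Return how many times the calling thread has invoked this function. */
int
bump_call_count(void)
{
	return (++call_count);
}
```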

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ryan Libby <rlibby@FreeBSD.org>
Closes #11331

3 years agoFix reporting of CKSUM errors in indirect vdevs
George Amanakis [Fri, 11 Dec 2020 20:15:37 +0000 (21:15 +0100)]
Fix reporting of CKSUM errors in indirect vdevs

When removing and subsequently reattaching a vdev, CKSUM errors may
occur as vdev_indirect_read_all() reads from all children of a mirror
in case of a resilver.

Fix this by checking whether a child is missing the data and setting a
flag (ic_error) which is then checked in vdev_indirect_repair() and
suppresses incrementing the checksum counter.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Amanakis <gamanakis@gmail.com>
Closes #11277

3 years agoarc_summary3: Handle overflowing value width
Ryan Moeller [Tue, 8 Dec 2020 20:20:25 +0000 (20:20 +0000)]
arc_summary3: Handle overflowing value width

Some tunables shown by arc_summary3 have string values that may exceed
the normal line length, leaving a negative offset between the name and
value fields.  The negative space is of course not valid and Python
rightly barfs up an exception traceback.

Handle an overflowing value field width by ignoring the line length
and separating the name from the value by a single space instead.
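
The approach can be sketched in C as below. This is only an illustration
of the padding logic under assumed names (format_line() and its
parameters); the real change lives in the Python script and may differ in
detail.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format "name<padding>value" so the value ends at line_len when it fits. */
const char *
format_line(const char *name, const char *value, int line_len)
{
	static char buf[256];	/* not thread-safe; fine for a sketch */
	int pad;

	assert(name != NULL && value != NULL);
	pad = line_len - (int)strlen(name) - (int)strlen(value);
	if (pad < 1)
		pad = 1;	/* overflow: fall back to a single space */
	(void) snprintf(buf, sizeof (buf), "%s%*s%s", name, pad, "", value);
	return (buf);
}
```

Clamping the padding to one space keeps the name and value readable even
when together they are longer than the line.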

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11270

3 years agoFreeBSD: Implement sysctl for fletcher4 impl
Ryan Moeller [Wed, 2 Dec 2020 21:45:08 +0000 (21:45 +0000)]
FreeBSD: Implement sysctl for fletcher4 impl

There is a tunable to select the fletcher 4 checksum implementation on
Linux but it was not present in FreeBSD.

Implement the sysctl handler for FreeBSD and use ZFS_MODULE_PARAM_CALL
to provide the tunable on both platforms.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11270

3 years agoFix kernel panic induced by redacted send
Paul Dagnelie [Fri, 11 Dec 2020 18:22:29 +0000 (10:22 -0800)]
Fix kernel panic induced by redacted send

In the redaction list traversal code, there is a bug in the binary search
logic when looking for the resume point. Maxbufid can be decremented to -1,
causing us to read the last possible block of the object instead of the one we
wanted. This can cause incorrect resume behavior, or possibly even a hang in
some cases. In addition, when examining non-last blocks, we can treat the
block as being the same size as the last block, causing us to miss entries in
the redaction list when determining where to resume. Finally, we were ignoring
the case where the resume point was found in the buffer being searched, and
resuming from minbufid. All these issues have been corrected, and the code has
been significantly simplified to make future issues less likely.
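
To illustrate the class of bug only (this is not the redaction list code),
here is a generic binary search over a sorted array that keeps a half-open
range, so the upper bound is never decremented below zero. The function
name, the sample array, and the "first entry at or after the target"
semantics are all assumptions chosen for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sample sorted data for the example. */
const uint64_t demo_entries[3] = { 10, 20, 30 };

/* Return the index of the first entry >= target, or n if there is none. */
size_t
first_at_or_after(const uint64_t *entries, size_t n, uint64_t target)
{
	size_t lo = 0, hi = n;	/* search the half-open range [lo, hi) */

	assert(entries != NULL || n == 0);
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (entries[mid] < target)
			lo = mid + 1;
		else
			hi = mid;	/* never mid - 1, so hi cannot wrap */
	}
	return (lo);
}
```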

Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
Closes #11297

3 years agoFreeBSD: Fix format of vfs.zfs.arc_no_grow_shift
Ryan Moeller [Tue, 8 Dec 2020 17:21:36 +0000 (17:21 +0000)]
FreeBSD: Fix format of vfs.zfs.arc_no_grow_shift

vfs.zfs.arc_no_grow_shift has an invalid type (15), which causes
py-sysctl to format it as a bytearray when it should be an integer.

"U" is not a valid format; it should be "I", and the type should match
the variable type, int.  We can return EINVAL if the value is set below
zero.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11318

3 years agoFreeBSD: Update usage of py-sysctl
Ryan Moeller [Tue, 8 Dec 2020 17:02:16 +0000 (17:02 +0000)]
FreeBSD: Update usage of py-sysctl

py-sysctl now includes the CTLTYPE_NODE type nodes in the list returned
by sysctl.filter() on FreeBSD head.  It also provides descriptions now.

Eliminate the subprocess call to get descriptions, and filter out the
nodes so we only deal with values.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11318

3 years agoFix possibly uninitialized 'root_inode' variable warning
Brian Behlendorf [Thu, 10 Dec 2020 23:23:26 +0000 (15:23 -0800)]
Fix possibly uninitialized 'root_inode' variable warning

Resolve an uninitialized variable warning when compiling.

    In function ‘zfs_domount’:
    warning: ‘root_inode’ may be used uninitialized in this
        function [-Wmaybe-uninitialized]
    sb->s_root = d_make_root(root_inode);

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #11306

3 years agoCI: add zloop workflow
George Melikov [Tue, 8 Dec 2020 18:40:44 +0000 (21:40 +0300)]
CI: add zloop workflow

Run ztest via zloop for 20 minutes, total run time is ~30 minutes.

Signed-off-by: George Melikov <mail@gmelikov.ru>

3 years agoFreeBSD: Do zcommon_init sooner to avoid FPU panic
Ryan Moeller [Thu, 10 Dec 2020 05:29:00 +0000 (00:29 -0500)]
FreeBSD: Do zcommon_init sooner to avoid FPU panic

There has been a panic affecting some system configurations where the
thread FPU context is disturbed during the fletcher 4 benchmarks,
leading to a panic at boot.

module_init() registers zcommon_init to run in the last subsystem
(SI_SUB_LAST).  Running it as soon as interrupts have been configured
(SI_SUB_INT_CONFIG_HOOKS) makes sure we have finished the benchmarks
before we start doing other things.

While it's not clear *how* the FPU context was being disturbed, this
does seem to avoid it.

Add a module_init_early() macro to run zcommon_init() at this earlier
point on FreeBSD.  On Linux this is defined as module_init().

Authored by: Konstantin Belousov <kib@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11302

3 years agomount_zfs: print strerror instead of errno for error reporting
Érico Nogueira Rolim [Thu, 10 Dec 2020 05:24:59 +0000 (02:24 -0300)]
mount_zfs: print strerror instead of errno for error reporting

Tracking down an error message with the errno value can be difficult;
using strerror() makes the error message clearer.
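
A minimal sketch of the difference, using a hypothetical helper rather than
the mount_zfs code itself: the message carries the text from strerror()
instead of the bare number.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Build "what: <strerror text>" for the given errno value. */
const char *
describe_error(const char *what, int err)
{
	static char buf[256];	/* not thread-safe; fine for a sketch */

	assert(what != NULL);
	(void) snprintf(buf, sizeof (buf), "%s: %s", what, strerror(err));
	return (buf);
}
```

Calling describe_error("mount failed", ENOENT) yields the prefix followed
by the platform's description of ENOENT rather than its numeric value.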

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Érico Rolim <erico.erc@gmail.com>
Closes #11303

3 years agoDrop path prefix workaround
sterlingjensen [Thu, 10 Dec 2020 05:24:26 +0000 (23:24 -0600)]
Drop path prefix workaround

Canonicalization, the source of the trouble, was disabled in 9000a9f.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Sterling Jensen <sterlingjensen@users.noreply.github.com>
Closes #11295

3 years agoDelete rw_semaphore.wait_lock configure check
Orivej Desh [Thu, 10 Dec 2020 05:22:54 +0000 (05:22 +0000)]
Delete rw_semaphore.wait_lock configure check

Last use of wait_lock was removed in "Linux 5.3 compat: retire
rw_tryupgrade()" (e7a99dab2b065ac2f8736a65d1b226d21754d771).

Fixes the issue reported in
https://github.com/openzfs/zfs/issues/11097#issuecomment-714532367

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Orivej Desh <orivej@gmx.fr>
Closes #11309

3 years agoFix optional "force" arg handling in zfs_ioc_pool_sync()
Brian Behlendorf [Wed, 9 Dec 2020 22:52:45 +0000 (14:52 -0800)]
Fix optional "force" arg handling in zfs_ioc_pool_sync()

The fnvlist_lookup_boolean_value() function should not be used
to check the force argument since it's optional.  It may not be
provided or may have been created with the wrong flags.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #11281
Closes #11284

3 years agoCI: add new zfs-tests-sanity workflow
George Melikov [Tue, 8 Dec 2020 17:53:45 +0000 (20:53 +0300)]
CI: add new zfs-tests-sanity workflow

Run zfs-tests with sanity.run for brief results.  Timeouts
are rare, so minimize false positives by increasing the
default from 60 to 180 seconds.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Closes #11304