git.proxmox.com Git - mirror_zfs.git/log
George Wilson [Fri, 25 Sep 2020 20:09:40 +0000 (15:09 -0500)]
zpool command complains about /etc/exports.d

If the /etc/exports.d directory does not exist, then we should only
create it when we're performing an action which already requires root
privileges.

This commit moves the directory creation to the enable/disable code
path which ensures that we have the appropriate privileges.

Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Wilson <gwilson@delphix.com>
Closes #10785
Closes #10934

Christian Schwarz [Fri, 25 Sep 2020 20:06:34 +0000 (22:06 +0200)]
zfs_log_write: simplify data copying code for WR_COPIED records

lr_write_t records that are WR_COPIED have the record data directly
appended to them (see lr_write_t type definition).

The data is copied from the dbuf using dmu_read_by_dnode.

This function was called, only for WR_COPIED records, as part of the
condition of a short-circuiting if statement.

I found this side-effecting call to dmu_read_by_dnode pretty hard to
spot. This patch improves readability by moving the call to its own line.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Wilson <gwilson@delphix.com>
Signed-off-by: Christian Schwarz <me@cschwarz.com>
Closes #10956

Matthew Macy [Wed, 23 Sep 2020 23:43:51 +0000 (16:43 -0700)]
FreeBSD: Add support for procfs_list

The procfs_list interface is required by several kstats. Implement
this functionality for FreeBSD to provide access to these kstats.

Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #10890

Matthew Macy [Wed, 23 Sep 2020 18:09:48 +0000 (11:09 -0700)]
FreeBSD: Don't save user FPU context in kernel threads

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <freqlabs@FreeBSD.org>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #10899

Kjeld Schouten-Lebbing [Wed, 23 Sep 2020 16:53:26 +0000 (18:53 +0200)]
Update issue templates, commitcheck and Contributing.md

- Removes OpenZFS ports from commit check
- Removes OpenZFS ports from CONTRIBUTING.md
- Adds mailing lists and IRC to issue template selector
- Removes blank issue option from issue creator

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Co-authored-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Kjeld Schouten-Lebbing <kjeld@schouten-lebbing.nl>
Closes #10965

Paul Dagnelie [Tue, 22 Sep 2020 23:16:07 +0000 (16:16 -0700)]
Don't set numobjs to UINT64_MAX or near it

Resolves an issue with `zfs send` streams from 0.8.4 which
prevents them from being received by versions < 0.7.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Paul Zuchowski <pzuchowski@datto.com>
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
Closes #10911
Closes #10916

наб [Tue, 22 Sep 2020 23:10:09 +0000 (01:10 +0200)]
contrib/initramfs: fix shellcheck and checkbashisms errors with shebang

Reviewed-by: Gabriel A. Devenyi <gdevenyi@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #10908
Closes #10917

George Amanakis [Tue, 22 Sep 2020 23:08:05 +0000 (19:08 -0400)]
Restore clearing of L2CACHE flag in arc_read_done()

Commit 45152dc removed the clearing of the L2CACHE flag in arc_read_done()
and moved the related code into l2arc_write_eligible(). Careful code
inspection shows that arc_read_done() is not bypassed in the case of
prefetches. Thus restore the old behavior.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: adam moss <c@yotes.com>
Signed-off-by: George Amanakis <gamanakis@gmail.com>
Closes #10951

Mark Johnston [Tue, 22 Sep 2020 23:05:52 +0000 (19:05 -0400)]
Fix a logic bug in the FreeBSD getpages VOP

In commit cd32b4f5b79c ("Fix a deadlock in the FreeBSD getpages VOP") I
introduced a bug while porting the patch originally committed to
FreeBSD: the rangelock pointer may be NULL if the try operation failed,
so we must avoid calling zfs_rangelock_unlock() in that case.

Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Matt Macy <mmacy@FreeBSD.org>
Reported-by: Steve Wills <swills@FreeBSD.org>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Closes #10519
Closes #10960

Ryan Moeller [Tue, 22 Sep 2020 23:03:11 +0000 (19:03 -0400)]
FreeBSD: Reduce stack usage of Lua

Use the same reduced buffer size for lauxlib that is used on Linux.

Fixes panic on HEAD in lua gsub test designed to exhaust stack space.

With this we can remove the special case to reserve more stack space
on FreeBSD.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Kyle Evans <kevans@FreeBSD.org>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #10959

Mark Johnston [Fri, 18 Sep 2020 12:45:54 +0000 (08:45 -0400)]
Annotate FreeBSD sysctls with CTLFLAG_MPSAFE

Without this, the sysctl system calls will acquire a global lock before
invoking the handler.  This is noticeable in some situations when
running top(1).  The global lock is mostly vestigial but continues to see
some use and so contention is still a problem; until the default sense
of the MPSAFE flag changes, we have to annotate each and every handler.

Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Closes #10836

Mark Johnston [Fri, 18 Sep 2020 12:41:28 +0000 (08:41 -0400)]
Fix switch statement indentation in the FreeBSD kstat code

This is in preparation for some functional changes.

Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Closes #10950

George Amanakis [Mon, 21 Sep 2020 16:26:24 +0000 (12:26 -0400)]
Update documentation of l2arc_mfuonly

Update the documentation of l2arc_mfuonly with regard to
evict_l2_eligible_mru. Even if l2arc_mfuonly is enabled, this is not
reflected in evict_l2_eligible_mru, as this information is useful for
deciding whether to toggle l2arc_mfuonly depending on the current
workload.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Amanakis <gamanakis@gmail.com>
Closes #10945

George Wilson [Fri, 18 Sep 2020 19:13:47 +0000 (14:13 -0500)]
vdev_ashift should only be set once

== Motivation and Context

The new vdev ashift optimization prevents the removal of devices when
a zfs configuration is comprised of disks which have different logical
and physical block sizes. This happens because we set 'spa_min_ashift'
in vdev_open and then later call 'vdev_ashift_optimize'. This would
result in an inconsistency between spa's ashift calculations and that
of the top-level vdev.

In addition, the optimization logic ignores the overridden ashift
value that would be provided by '-o ashift=<val>'.

== Description

This change reworks the vdev ashift optimization so that it's only
set the first time the device is configured. It still allows the
physical and logical ashift values to be set every time the device
is opened but those values are only consulted on first open.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Cedric Berger <cedric@precidata.com>
Signed-off-by: George Wilson <gwilson@delphix.com>
External-Issue: DLPX-71831
Closes #10932

Allan Jude [Fri, 18 Sep 2020 17:23:29 +0000 (13:23 -0400)]
libzfs: Don't leak buf if nvlist is too large

Resolves FreeBSD Coverity defect:
CID 1432398:  Resource leaks  (RESOURCE_LEAK)

libzfs: don't leak hdl if there is an error reading env var

Resolves FreeBSD Coverity defect:
CID 1432395:  Resource leaks  (RESOURCE_LEAK)

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Allan Jude <allanjude@freebsd.org>
Closes #10882

George Wilson [Fri, 18 Sep 2020 03:03:10 +0000 (22:03 -0500)]
pool may become suspended during device expansion

When expanding a device zfs needs to rescan the partition table to
get the correct size. This can only happen when we're in the kernel
and requires the device to be closed. As part of the rescan, udev is
notified and the device links are removed and recreated. This leaves a
window where the vdev code may try to reopen the device before udev
has recreated the link. If that happens, then the pool may end up in
a suspended state.

To correct this, we leverage the BLKPG_RESIZE_PARTITION ioctl which
allows the partition information to be modified even while it's in use.
This ioctl also does not remove the device link associated with the zfs
data partition so it eliminates the race condition that can occur in
the kernel.

Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Wilson <gwilson@delphix.com>
Closes #10897

Matthew Ahrens [Thu, 17 Sep 2020 17:55:30 +0000 (10:55 -0700)]
zdb leak detection fails with in-progress device removal

When a device removal is in progress, there are 2 locations for the data
that's already been moved: the original location, on the device that's
being removed; and the new location, which is pointed to by the indirect
mapping.  When doing leak detection, zdb needs to know about both
locations.  To determine what's already been copied, we load the
spacemaps of the removing vdev, omit the blocks that are yet to be
copied, and then use the vdev's remap op to find the new location.

The problem is with an optimization to the spacemap-loading code in zdb.
When processing the log spacemaps, we ignore entries that are not
relevant because they are past the point that's been copied.  However,
entries which span the point that's been copied (i.e. they are partly
relevant and partly irrelevant) are processed normally.  This can lead
to an illegal spacemap operation, for example if offsets up to 100KB
have been copied, and the spacemap log has the following entries:

ALLOC 50KB-150KB (partly relevant)
FREE 50KB-100KB (entirely relevant)
FREE 100KB-150KB (entirely irrelevant - ignored)
ALLOC 50KB-150KB (partly relevant)

Because the entirely irrelevant entry was ignored, its space remains in
the spacemap.  When the last entry is processed, we attempt to add it to
the spacemap, but it partially overlaps with the 100-150KB entry that
was left over.

This problem was discovered by ztest/zloop.

One solution would be to also ignore the irrelevant parts of
partially-irrelevant entries (i.e. when processing the ALLOC 50-150, to
only add 50-100 to the spacemap).  However, this commit implements a
simpler solution, which is to remove this optimization entirely.  I.e.
to process the entire spacemap log, without regard for the point that's
been copied.  After reconstructing the entire allocatable range tree,
there's already code to remove the parts that have not yet been copied.
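
The failure mode described above can be reproduced with a toy model. This is only a sketch under simplifying assumptions: the "spacemap" is a per-KB boolean array rather than a range tree, and every toy_* name is hypothetical. An ALLOC over space that is already marked, or a FREE over space that is not, counts as an illegal operation.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAP_KB 256

typedef enum { TOY_ALLOC, TOY_FREE } toy_op_t;

typedef struct toy_entry {
    toy_op_t op;
    int start_kb;   /* inclusive */
    int end_kb;     /* exclusive */
} toy_entry_t;

/* Apply one log entry; returns false if the operation is illegal. */
static bool
toy_apply(bool *allocated, const toy_entry_t *e)
{
    /* A FREE needs the range allocated; an ALLOC needs it free. */
    bool before = (e->op == TOY_FREE);

    for (int kb = e->start_kb; kb < e->end_kb; kb++) {
        if (allocated[kb] != before)
            return (false);
    }
    for (int kb = e->start_kb; kb < e->end_kb; kb++)
        allocated[kb] = !before;
    return (true);
}

/*
 * Replay a log. With skip_irrelevant set, entries that start at or past
 * copied_kb are ignored, mimicking the optimization that was removed.
 */
bool
toy_replay(const toy_entry_t *log, size_t n, int copied_kb,
    bool skip_irrelevant)
{
    bool allocated[TOY_MAP_KB] = { false };

    for (size_t i = 0; i < n; i++) {
        if (skip_irrelevant && log[i].start_kb >= copied_kb)
            continue;
        if (!toy_apply(allocated, &log[i]))
            return (false);
    }
    return (true);
}

/* The four entries from the commit message, with 100KB already copied. */
bool
toy_replay_example(bool skip_irrelevant)
{
    static const toy_entry_t log[] = {
        { TOY_ALLOC, 50, 150 },
        { TOY_FREE, 50, 100 },
        { TOY_FREE, 100, 150 },
        { TOY_ALLOC, 50, 150 },
    };

    return (toy_replay(log, sizeof (log) / sizeof (log[0]), 100,
        skip_irrelevant));
}
```

In this model, skipping the third entry leaves 100-150KB marked, so the final ALLOC is rejected; replaying the whole log succeeds.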

Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
External-issue: DLPX-71820
Closes #10920

Ryan Moeller [Thu, 17 Sep 2020 17:54:14 +0000 (13:54 -0400)]
FreeBSD: Do not copy vp into f_data for DTYPE_VNODE files

https://reviews.freebsd.org/D26346

Do not copy vp into f_data for DTYPE_VNODE files.  The vnode pointer is
already stored in f_vnode.  Use that so f_data can be reused.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #10929

John Poduska [Thu, 17 Sep 2020 17:53:02 +0000 (13:53 -0400)]
Need a long hold in zpl_mount_impl

In zpl_mount_impl, there is:
    dmu_objset_hold ; returns with pool & ds held
    dsl_pool_rele

    sget

    dsl_dataset_rele

As spelled out in the "DSL Pool Configuration Lock" in dsl_pool.c,
this requires a long hold.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Paul Zuchowski <pzuchowski@datto.com>
Signed-off-by: John Poduska <jpoduska@datto.com>
Closes #10936

Toomas Soome [Thu, 17 Sep 2020 17:51:09 +0000 (20:51 +0300)]
libzfsbootenv: lzbe_nvlist_set needs to store bootenv version VB_NVLIST

A small bug slipped into the initial libzfsbootenv: when storing an nvlist
in an nvlist, we should make sure the bootenv is using the VB_NVLIST format.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Toomas Soome <tsoome@me.com>
Closes #10937

Ryan Moeller [Wed, 16 Sep 2020 19:26:06 +0000 (15:26 -0400)]
Rename acltype=posixacl to acltype=posix

Prefer acltype=off|posix, retaining the old names as aliases.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #10918

Georgy Yakovlev [Wed, 16 Sep 2020 19:25:12 +0000 (12:25 -0700)]
cmd/zgenhostid: replace with simple c implementation

It was discovered that dracut scripts and zgenhostid
always generate little-endian /etc/hostid.

This commit provides a simple endianness-aware binary
and updates the scripts to use it.

New features include:
 -f flag to force overwrite.
 -o flag to write to different file (for dracut)
 accepting both 0x01234567 and 01234567 values as input
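
As an illustration of the input handling and byte-order concern, here is a minimal sketch. It is not the zgenhostid source: the toy_* functions are hypothetical, and the sketch only shows that strtoul() with base 16 accepts both spellings and that copying the value with memcpy() yields the host's native byte order rather than a fixed little-endian layout.

```c
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse a hostid given as "0x01234567" or "01234567".
 * Returns 0 on success, -1 on malformed or out-of-range input.
 */
int
toy_parse_hostid(const char *s, uint32_t *out)
{
    char *end;
    unsigned long v;

    /* Reject empty input, signs, and leading whitespace up front. */
    if (s == NULL || !isxdigit((unsigned char)s[0]))
        return (-1);

    errno = 0;
    v = strtoul(s, &end, 16);   /* base 16 allows an optional 0x prefix */
    if (errno != 0 || *end != '\0' || v > 0xffffffffUL)
        return (-1);

    *out = (uint32_t)v;
    return (0);
}

/* Produce the 4 bytes to store, in the byte order of the running host. */
void
toy_hostid_bytes(uint32_t hostid, unsigned char out[4])
{
    memcpy(out, &hostid, sizeof (hostid));
}
```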

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Signed-off-by: Georgy Yakovlev <gyakovlev@gentoo.org>
Closes #10887
Closes #10925

Pavel Snajdr [Mon, 7 Sep 2020 15:33:34 +0000 (17:33 +0200)]
Fix stack frame size: dnode_dirty_l1range()

Reviewed-by: Ryan Moeller <freqlabs@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Pavel Snajdr <snajpa@snajpa.net>
Closes #10879

Pavel Snajdr [Mon, 7 Sep 2020 15:27:51 +0000 (17:27 +0200)]
dmu_redact_snap: fix possible memleak

Reviewed-by: Ryan Moeller <freqlabs@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Pavel Snajdr <snajpa@snajpa.net>
Closes #10879

Pavel Snajdr [Mon, 7 Sep 2020 15:12:17 +0000 (17:12 +0200)]
Fix stack frame size: dmu_redact_snap()

Reviewed-by: Ryan Moeller <freqlabs@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Pavel Snajdr <snajpa@snajpa.net>
Closes #10879

Pavel Snajdr [Thu, 3 Sep 2020 15:38:16 +0000 (17:38 +0200)]
Fix stack frame size: spa_livelist_delete_cb()

Reviewed-by: Ryan Moeller <freqlabs@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Pavel Snajdr <snajpa@snajpa.net>
Closes #10879

наб [Tue, 15 Sep 2020 22:43:42 +0000 (00:43 +0200)]
zpoolprops.8: fix raidz par[i]ty typo

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #10923

Toomas Soome [Tue, 15 Sep 2020 22:42:27 +0000 (01:42 +0300)]
zfs label bootenv should store data as nvlist

nvlist does allow us to support different data types and systems.

To encapsulate user data to/from nvlist, the libzfsbootenv library is
provided.

Reviewed-by: Arvind Sankar <nivedita@alum.mit.edu>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Signed-off-by: Toomas Soome <tsoome@me.com>
Closes #10774

Ryan Moeller [Tue, 15 Sep 2020 22:40:03 +0000 (18:40 -0400)]
Linux: Prevent destruction while showing mount devname

Use ZFS_ENTER and ZFS_EXIT to protect datasets while their mount
devname is being retrieved.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #10892
Closes #10927

George Amanakis [Mon, 14 Sep 2020 17:10:44 +0000 (13:10 -0400)]
Add L2ARC arcstats for MFU/MRU buffers and buffer content type

Currently the ARC state (MFU/MRU) of cached L2ARC buffers and their
content type are unknown. Knowing this information may prove beneficial
in adjusting the L2ARC caching policy.

This commit adds L2ARC arcstats that display the aligned size
(in bytes) of L2ARC buffers according to their content type
(data/metadata) and according to their ARC state (MRU/MFU or
prefetch). It also expands the existing evict_l2_eligible arcstat to
differentiate between MFU and MRU buffers.

L2ARC caches buffers from the MRU and MFU lists of ARC. Upon caching a
buffer, its ARC state (MRU/MFU) is stored in the L2 header
(b_arcs_state). The l2_m{f,r}u_asize arcstats reflect the aligned size
(in bytes) of L2ARC buffers according to their ARC state (based on
b_arcs_state). We also account for the case where an L2ARC and ARC
cached MRU or MRU_ghost buffer transitions to MFU. The l2_prefetch_asize
reflects the aligned size (in bytes) of L2ARC buffers that were cached
while they had the prefetch flag set in ARC. This is dynamically updated
as the prefetch flag of L2ARC buffers changes.

When buffers are evicted from ARC, if they are determined to be L2ARC
eligible then their logical size is recorded in
evict_l2_eligible_m{r,f}u arcstats according to their ARC state upon
eviction.

Persistent L2ARC:
When committing an L2ARC buffer to a log block (L2ARC metadata) its
b_arcs_state and prefetch flag are also stored. If the buffer changes
its arcstate or prefetch flag this is reflected in the above arcstats.
However, the L2ARC metadata cannot currently be updated to reflect this
change.
Example: L2ARC caches an MRU buffer. L2ARC metadata and arcstats count
this as an MRU buffer. The buffer transitions to MFU. The arcstats are
updated to reflect this. Upon pool re-import or on/offlining the L2ARC
device the arcstats are cleared and the buffer will now be counted as an
MRU buffer, as the L2ARC metadata were not updated.

Bug fix:
- If l2arc_noprefetch is set, arc_read_done clears the L2CACHE flag of
  an ARC buffer. However, prefetches may be issued in a way that
  arc_read_done() is bypassed. Instead, move the related code into
  l2arc_write_eligible() to account for those cases too.

Also add a test and update manpages for l2arc_mfuonly module parameter,
and update the manpages and code comments for l2arc_noprefetch.
Move persist_l2arc tests to l2arc.

Reviewed-by: Ryan Moeller <freqlabs@FreeBSD.org>
Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Amanakis <gamanakis@gmail.com>
Closes #10743

Harald van Dijk [Sat, 12 Sep 2020 15:22:07 +0000 (16:22 +0100)]
config/zfs-build.m4: never define _initramfs in RPM_DEFINE_UTIL

The zfs-initramfs package has never worked as no RPM-based distribution
uses initramfs-tools, which is listed as a dependency of zfs-initramfs.

This would not ordinarily be a problem, as it is only enabled when
/usr/share/initramfs-tools is present, which should not normally be the
case on RPM-based distributions. However, other packages may install
unused files there even if initramfs-tools is not used, so remove this
auto-detection for the rpm-utils target.

This does not fully remove the logic for the zfs-initramfs package. This
splits it out into a separate rpm-utils-initramfs target so that the
Debian builds can still use it.

Reviewed-by: Kjeld Schouten <kjeld@schouten-lebbing.nl>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Harald van Dijk <harald@gigawatt.nl>
Closes #10898

Matthew Ahrens [Sat, 12 Sep 2020 15:19:48 +0000 (08:19 -0700)]
libzutil depends on libnvpair

libzutil depends on libnvpair, but this dependency is undeclared in the
build system.  Therefore it isn't possible to make a new command that
depends on libzutil, but does not (directly) depend on libnvpair.

This commit makes this dependency explicit.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <freqlabs@FreeBSD.org>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes #10915

Mateusz Guzik [Wed, 9 Sep 2020 17:15:52 +0000 (19:15 +0200)]
FreeBSD: convert teardown inactive lock to a read-mostly sleepable lock

The lock is taken all the time and as a regular read-write lock
avoidably serves as a mount point-wide contention point.

This forward ports FreeBSD revision r357322.

To quote the aforementioned commit:

Sample result doing an incremental -j 40 build:
before: 173.30s user 458.97s system 2595% cpu 24.358 total
after:  168.58s user 254.92s system 2211% cpu 19.147 total

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <freqlabs@FreeBSD.org>
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Closes #10896

xdch47 [Wed, 9 Sep 2020 17:14:04 +0000 (19:14 +0200)]
Force the use of '.' as decimal separator.

This solves issues occurring with a different decimal separator and
keeps the command line interface consistent for all locales.
E.g. `zfs set quota=0.5T`

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Felix Neumärker <xdch47@posteo.de>
Closes #10878

Olaf Faaland [Wed, 9 Sep 2020 17:12:54 +0000 (10:12 -0700)]
Initialize mmp_last_write when the mmp thread starts

A great deal of time may go by between when mmp_init() is called and
the MMP thread starts, particularly if there are bad devices, because
there is I/O checking configs etc.  If this time is too long,

    (gethrtime() - mmp_last_write) > mmp_fail_ns

at the time the MMP thread starts.  If MMP is configured to suspend
the pool, the pool will be suspended immediately.

This can be seen in issue #10838

The value of mmp_last_write doesn't matter before the mmp thread
starts.  To give the MMP thread time to issue and land MMP writes,
initialize mmp_last_write when the MMP thread starts.
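
The timing problem can be sketched with a toy model. The struct and toy_* functions below are hypothetical and are not the MMP implementation; the sketch assumes that the last-write timestamp was previously set only at init time, as the description above implies, and shows why resetting it at thread start avoids an immediate failure.

```c
#include <stdbool.h>
#include <stdint.h>

typedef int64_t toy_hrtime_t;   /* nanoseconds, in the spirit of hrtime_t */

typedef struct toy_mmp {
    toy_hrtime_t last_write;    /* time of the last successful MMP write */
    toy_hrtime_t fail_ns;       /* allowed gap before declaring failure */
} toy_mmp_t;

/* Assumed old behaviour: the timestamp is set when init is called. */
void
toy_mmp_init(toy_mmp_t *mmp, toy_hrtime_t now, toy_hrtime_t fail_ns)
{
    mmp->last_write = now;
    mmp->fail_ns = fail_ns;
}

/* The fix in spirit: restart the clock when the thread actually starts. */
void
toy_mmp_thread_start(toy_mmp_t *mmp, toy_hrtime_t now)
{
    mmp->last_write = now;
}

/* The check quoted in the commit message. */
bool
toy_mmp_write_overdue(const toy_mmp_t *mmp, toy_hrtime_t now)
{
    return ((now - mmp->last_write) > mmp->fail_ns);
}
```

If init happens at time 0 with a failure window of 10 and the thread only starts at time 25, the check already fires before any write could be issued; resetting the timestamp at thread start gives the thread a full window.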

Reviewed-by: Giuseppe Di Natale <guss80@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #10873

Ryan Moeller [Wed, 9 Sep 2020 17:10:32 +0000 (13:10 -0400)]
FreeBSD: drop dependency on cryptodev module

We only need the kernel interfaces in crypto, not the device node in
cryptodev.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #10901

George Amanakis [Tue, 8 Sep 2020 18:44:37 +0000 (14:44 -0400)]
Introduce ZFS module parameter l2arc_mfuonly

In certain workloads it may be beneficial to reduce wear of L2ARC
devices by not caching MRU metadata and data into L2ARC. This commit
introduces a new tunable l2arc_mfuonly for this purpose.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Amanakis <gamanakis@gmail.com>
Closes #10710

Ryan Moeller [Tue, 8 Sep 2020 18:39:16 +0000 (14:39 -0400)]
Avoid possibility of division by zero

When hz > 1000, msec / (1000 / hz) results in division by zero.

I found code in FreeBSD that uses howmany(msec * hz, 1000) to convert
ms to ticks, avoiding the potential for a zero in the divisor.
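
A minimal sketch of the arithmetic, assuming integer types throughout. The toy_* functions are hypothetical, and TOY_HOWMANY is a local copy of the usual round-up division idiom rather than the system macro. The naive variant returns -1 where the real expression would divide by zero.

```c
#include <assert.h>
#include <stdint.h>

/* Round-up integer division, the same idea as the BSD howmany() macro. */
#define TOY_HOWMANY(x, y) (((x) + ((y) - 1)) / (y))

/*
 * Naive conversion: 1000 / hz is an integer division, so it truncates
 * to 0 whenever hz > 1000. Returns -1 instead of dividing by zero.
 */
int64_t
toy_msec_to_ticks_naive(int64_t msec, int64_t hz)
{
    assert(hz > 0);
    int64_t msec_per_tick = 1000 / hz;

    if (msec_per_tick == 0)
        return (-1);
    return (msec / msec_per_tick);
}

/* Safe conversion: the divisor is the constant 1000, never zero. */
int64_t
toy_msec_to_ticks(int64_t msec, int64_t hz)
{
    assert(hz > 0 && msec >= 0);
    return (TOY_HOWMANY(msec * hz, 1000));
}
```

Note that the round-up form also never returns 0 ticks for a nonzero delay, which the truncating form can.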

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org>
Closes #10894

Toomas Soome [Tue, 8 Sep 2020 18:36:52 +0000 (21:36 +0300)]
dnode_special_open() error: unchecked function return 'zrl_tryenter'

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Toomas Soome <tsoome@me.com>
Closes #10876

Peter Dave Hello [Tue, 8 Sep 2020 16:04:36 +0000 (00:04 +0800)]
Add a missing option prefix `-` in zfs-tests.sh usage()

Reviewed-by: Giuseppe Di Natale <guss80@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Peter Dave Hello <hsu@peterdavehello.org>
Closes #10893

Fabio Buso [Tue, 8 Sep 2020 15:49:55 +0000 (17:49 +0200)]
Display pbkdf2iters property as plain number

The pbkdf2iters property is an iteration counter
and should be displayed as a plain number rather
than in binary units.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Fabio Buso <buso.fabio@gmail.com>
Closes #10871

alaviss [Fri, 4 Sep 2020 19:03:57 +0000 (19:03 +0000)]
libshare: Add missing headers for nfs.c

On musl libc, zfs failed to compile due to the missing <fcntl.h>
include, which is required for `open()` per POSIX.

This commit adds the missing <fcntl.h> include.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Hiếu Lê <leorize+oss@disroot.org>
Closes #10880

Matthew Macy [Fri, 4 Sep 2020 18:13:27 +0000 (11:13 -0700)]
FreeBSD: reduce priority of ZIO_TASKQ_ISSUE writes by a larger value

On FreeBSD, if priorities divided by four (RQ_PPQ) are equal then
a difference between them is insignificant. In other words,
incrementing pri by only one as on Linux is insufficient.
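
A rough sketch of the grouping described above, assuming that priorities which are equal after integer division by four are treated alike. TOY_RQ_PPQ and the toy_* helpers are hypothetical names for the example, not the scheduler's code.

```c
#include <stdbool.h>

#define TOY_RQ_PPQ 4    /* priorities per group, per the text above */

/* Index of the group a priority falls into. */
int
toy_pri_group(int pri)
{
    return (pri / TOY_RQ_PPQ);
}

/* Two priorities differ meaningfully only if their groups differ. */
bool
toy_pri_distinct(int a, int b)
{
    return (toy_pri_group(a) != toy_pri_group(b));
}
```

Under this model a bump of one usually stays inside the same group, whereas a bump of at least four always moves to a different one.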

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #10872

Ryan Moeller [Fri, 4 Sep 2020 18:11:18 +0000 (14:11 -0400)]
Spruce up pkg-config files for libzfs/libzfs_core

Several of the listed library dependencies are not relevant on FreeBSD.
Have ./configure save libraries that are found via pkg-config as
${LIB}_PC and use the configured automake variables instead of hard
coded names so we only get what was actually needed.

While here, update the URL to point at the OpenZFS Github repo.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #10869

Ryan Moeller [Tue, 1 Sep 2020 18:06:22 +0000 (14:06 -0400)]
man: Cross-reference zfs-load-key(8) for ENCRYPTION mention

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Submitted-by: Harry Schmalzbauer
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #10866

Ryan Moeller [Tue, 1 Sep 2020 17:49:35 +0000 (13:49 -0400)]
man: Add `zfs rename -r` to zfs-rename(8) SYNOPSIS

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #10866

Brian Behlendorf [Fri, 4 Sep 2020 17:39:58 +0000 (10:39 -0700)]
Sequential scrub and resilver updated comments

Commit d4a72f2 which introduced multi-phase scrubs and resilvers
continued the work presented by Nexenta at the 2016 ZFS developer
summit.  Update the source to reflect their contribution.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>

Don Brady [Fri, 4 Sep 2020 17:34:28 +0000 (11:34 -0600)]
Avoid posting duplicate zpool events

Duplicate io and checksum ereport events can make things look
worse than they are. Ideally the zpool events and the
corresponding vdev stat error counts in a zpool status should be
for unique errors -- not the same error being counted over and over.
This can be demonstrated in a simple example. With a single bad
block in a datafile and just 5 reads of the file we end up with a
degraded vdev, even though there is only one unique error in the pool.

The proposed solution to the above issue is to eliminate duplicates
when posting events and when updating vdev error stats. We now save
recent error events of interest when posting events so that we can
easily check for duplicates when posting an error.
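
The idea of remembering recent events can be sketched as follows. This is a toy model, not the code from the commit: the event fields, the fixed-size ring of recent events, and all toy_* names are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_RECENT_MAX 8

typedef struct toy_event {
    uint64_t vdev_id;
    uint64_t offset;
    int type;           /* e.g. 0 = io, 1 = checksum (hypothetical) */
} toy_event_t;

typedef struct toy_poster {
    toy_event_t recent[TOY_RECENT_MAX];
    int nrecent;        /* number of valid slots */
    int next;           /* slot to overwrite next */
    int posted;         /* stand-in for the vdev error counter */
} toy_poster_t;

static bool
toy_same(const toy_event_t *a, const toy_event_t *b)
{
    return (a->vdev_id == b->vdev_id && a->offset == b->offset &&
        a->type == b->type);
}

/* Post an event unless an identical one was posted recently. */
bool
toy_post_event(toy_poster_t *p, toy_event_t ev)
{
    for (int i = 0; i < p->nrecent; i++) {
        if (toy_same(&p->recent[i], &ev))
            return (false);     /* duplicate: no event, no count */
    }

    p->recent[p->next] = ev;
    p->next = (p->next + 1) % TOY_RECENT_MAX;
    if (p->nrecent < TOY_RECENT_MAX)
        p->nrecent++;
    p->posted++;
    return (true);
}
```

With this model, five reads of the same bad block raise the counter once rather than five times.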

Reviewed-by: Brad Lewis <brad.lewis@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Don Brady <don.brady@delphix.com>
Closes #10861

Matthew Ahrens [Fri, 4 Sep 2020 17:29:39 +0000 (10:29 -0700)]
nowait synctask must succeed

If a `zfs_space_check_t` other than `ZFS_SPACE_CHECK_NONE` is used with
`dsl_sync_task_nowait()`, the sync task may fail due to ENOSPC.
However, there is no way to notice or communicate this failure, so it's
extremely difficult to use this functionality correctly, and in fact
almost all callers use `ZFS_SPACE_CHECK_NONE`.

This commit removes the `zfs_space_check_t` argument from
`dsl_sync_task_nowait()`, and always uses `ZFS_SPACE_CHECK_NONE`.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes #10855

Ryan Moeller [Fri, 4 Sep 2020 03:09:52 +0000 (23:09 -0400)]
Retain thread name when resuming a zthr

When created, a zthr is given a name to identify it by.  This name is
lost when a cancelled zthr is resumed.

Retain the name of a zthr so it can be used when resuming.

Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #10881

Alexander Richardson [Fri, 4 Sep 2020 03:06:03 +0000 (04:06 +0100)]
Fixes for running FreeBSD buildworld on Linux/macOS hosts

Adding an #ifdef __FreeBSD__ to a FreeBSD-specific header may seem odd,
but these headers are used on non-FreeBSD systems during the bootstrap
tools phase.
Originally submitted downstream as https://reviews.freebsd.org/D26193

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alex Richardson <Alexander.Richardson@cl.cam.ac.uk>
Closes #10863

3 years agoReplace cv_{timed}wait_sig with cv_{timed}wait_idle where appropriate
Matthew Macy [Fri, 4 Sep 2020 03:04:09 +0000 (20:04 -0700)]
Replace cv_{timed}wait_sig with cv_{timed}wait_idle where appropriate

There are a number of places where cv_?_sig is used simply for
accounting purposes but the surrounding code has no ability to
cope with actually receiving a signal. On FreeBSD it is possible
to send signals to individual kernel threads so this could
enable undesirable behavior.

This patch adds routines on Linux that will do the same idle
accounting as _sig without making the task interruptible. On
FreeBSD, cv_*_idle are all aliases for cv_*.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #10843

3 years agoRemove 'ZFS on Linux' references from PR Template
Garrett Fields [Thu, 3 Sep 2020 23:31:05 +0000 (19:31 -0400)]
Remove 'ZFS on Linux' references from PR Template

As mentioned in the #OpenZFS IRC channel (thanks "Toomas Soome"):
The OpenZFS PR Template still mentions "ZFS on Linux".
This changes that reference and updates the URLs.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Garrett Fields <ghfields@gmail.com>
Closes #10868

3 years agoLinks in Source Files
Spencer Kinny [Wed, 2 Sep 2020 16:42:12 +0000 (22:12 +0530)]
Links in Source Files

Added comments in the following files
with links to Illumos manual pages:

./module/avl/avl.c
./module/nvpair/nvpair.c
./module/os/linux/spl/spl-kstat.c
./module/os/freebsd/spl/spl_kstat.c

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Spencer Kinny <spencerkinny1995@gmail.com>
Closes #5113
Closes #10859

3 years agozvol: unsigned off can not be less than zero
Toomas Soome [Wed, 2 Sep 2020 16:30:29 +0000 (19:30 +0300)]
zvol: unsigned off can not be less than zero
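
For illustration only, a sketch with a hypothetical helper (not the
zvol code itself): an unsigned offset can never compare below zero, so
only the upper bound is worth checking.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical range check.  `off` is unsigned, so a test such as
 * `off < 0` is always false; a "negative" input simply shows up as a
 * very large value and is caught by the upper bound.
 */
bool
demo_offset_valid(uint64_t off, uint64_t volsize)
{
	return (off < volsize);
}
```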

Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Toomas Soome <tsoome@me.com>
Closes #10867

3 years agoFix -Werror,-Wmacro-redefined in limits.h
Alexander Richardson [Tue, 1 Sep 2020 23:22:09 +0000 (00:22 +0100)]
Fix -Werror,-Wmacro-redefined in limits.h

Those macros are also defined by the compiler-provided float.h which
will be included later on (at least in the FreeBSD buildworld case) and
triggers these -Werror warnings. Including <float.h> first and only
defining the macros when DBL_DIG/FLT_DIG is missing fixes this problem.
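
The pattern described above can be sketched as follows.  The fallback
values are assumptions (typical for IEEE 754 float and double), not
necessarily the ones used in limits.h.

```c
#include <assert.h>
/* Pull in the compiler-provided definitions first. */
#include <float.h>

/* Define fallbacks only when <float.h> did not supply the macros. */
#ifndef DBL_DIG
#define	DBL_DIG	15
#endif
#ifndef FLT_DIG
#define	FLT_DIG	6
#endif

/* ISO C requires at least 10 and 6 decimal digits respectively. */
static_assert(DBL_DIG >= 10 && FLT_DIG >= 6, "unexpected digit limits");
```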

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alex Richardson <Alexander.Richardson@cl.cam.ac.uk>
Closes #10864

3 years agoMake spa_stats.c tunables visible on FreeBSD
Ryan Moeller [Tue, 1 Sep 2020 23:19:19 +0000 (19:19 -0400)]
Make spa_stats.c tunables visible on FreeBSD

Use ZFS_MODULE_PARAM for cross-platform tunables in spa_stats.c, and
update tunables.cfg in tests for the newly supported ones.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #10858

3 years agoFreeBSD: Fix up after spa_stats.c move
Matthew Macy [Tue, 1 Sep 2020 23:16:56 +0000 (16:16 -0700)]
FreeBSD: Fix up after spa_stats.c move

Moving spa_stats added the additional burden of supporting
KSTAT_TYPE_IO.

spa_state_addr will always return a valid value regardless of
the value of 'n'. On FreeBSD this will cause an infinite loop
as it relies on the raw ops addr routine to indicate that there
is no more data.
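
A toy model of the problem, with hypothetical demo_* names: a reader
that stops only when the addr routine returns NULL needs that routine
to return NULL for every index past the last record.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical raw-ops style addr callback: returns the record for
 * index n, or NULL once there is no more data.
 */
void *
demo_state_addr(void *priv, long n)
{
	if (n == 0)
		return (priv);	/* the single record */
	return (NULL);		/* ends the reader's loop */
}

/*
 * Hypothetical reader that stops only on NULL.  max_iter is a safety
 * cap for this sketch so a misbehaving callback cannot spin forever.
 */
int
demo_count_records(void *priv, int max_iter)
{
	int n = 0;

	assert(priv != NULL);
	while (n < max_iter && demo_state_addr(priv, n) != NULL)
		n++;
	return (n);
}
```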

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <freqlabs@FreeBSD.org>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #10860

3 years agoAdd 'zfs rename -u' to rename without remounting
Ryan Moeller [Tue, 1 Sep 2020 23:14:16 +0000 (19:14 -0400)]
Add 'zfs rename -u' to rename without remounting

Allow renaming file systems without remounting when possible.
This is possible for file systems with the 'mountpoint' property set
to 'legacy' or 'none', since we don't have to change the mount
directory for them.  Currently such file systems are unmounted on
rename and not even mounted back.

This introduces a layering violation, as we need to update the
'f_mntfromname' field in the statfs structure related to the
mountpoint (for the dataset we are renaming and all its children).

In my opinion it is worth it, as it allows updating FreeBSD in an even
cleaner way.  In a ZFS-only configuration the root file system is a ZFS
file system with the 'mountpoint' property set to 'legacy'.  If the
root dataset is named system/rootfs, we can snapshot it
(system/rootfs@upgrade), clone it (system/oldrootfs), update FreeBSD
and, if it doesn't boot, boot back from system/oldrootfs and rename it
back to system/rootfs while it is mounted as /.  Before, this was not
possible, because unmounting / was not possible.

Authored by: Pawel Jakub Dawidek <pjd@FreeBSD.org>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Ported by: Matt Macy <mmacy@freebsd.org>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #10839

3 years agoFreeBSD: Remove unused SECLABEL code
Ryan Moeller [Tue, 1 Sep 2020 02:52:46 +0000 (22:52 -0400)]
FreeBSD: Remove unused SECLABEL code

SECLABEL is undefined on FreeBSD and should be pruned.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org>
Closes #10847

3 years agolibspl: Provide platform-specific zone implementations
Ryan Moeller [Sun, 30 Aug 2020 17:37:44 +0000 (17:37 +0000)]
libspl: Provide platform-specific zone implementations

FreeBSD has the concept of jails, a precursor to Solaris's zones, which
can be mapped to the required zones interface with relative ease.  The
previous ZFS implementation in FreeBSD did so, and we should continue
to provide an appropriate implementation in OpenZFS as well.

Move lib/libspl/zone.c into platform code and adopt the correct
implementation for FreeBSD.

While here, prune unused code.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org>
Closes #10851

3 years agoFreeBSD: Simplify INGLOBALZONE
Ryan Moeller [Sat, 29 Aug 2020 18:43:26 +0000 (18:43 +0000)]
FreeBSD: Simplify INGLOBALZONE

FreeBSD's previous ZFS implemented INGLOBALZONE(thread) as
(!jailed((thread)->td_ucred)) and passed curthread to INGLOBALZONE.

We pass curproc instead of curthread, so we can achieve the same effect
with (!jailed((proc)->p_ucred)).  The implementation is trivial enough
to fit on a single line in a define.  We don't really need a whole
separate function for something that's already macros all the way down.

Eliminate in_globalzone.
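
A self-contained sketch of the single-line define.  The demo_* types
and the demo_jailed() stub are hypothetical stand-ins for the kernel's
proc, ucred and jailed(); only the shape of the macro follows the text
above.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel credential and process types. */
struct demo_ucred {
	bool cr_jailed;
};
struct demo_proc {
	struct demo_ucred *p_ucred;
};

/* Stub for jailed(); in this sketch it just reads a flag. */
int
demo_jailed(struct demo_ucred *cred)
{
	return (cred->cr_jailed ? 1 : 0);
}

/* Same shape as the define described above, on the stub types. */
#define	DEMO_INGLOBALZONE(proc)	(!demo_jailed((proc)->p_ucred))
```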

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org>
Closes #10851

3 years agoFreeBSD: Define crgetzoneid appropriately
Ryan Moeller [Sat, 29 Aug 2020 18:25:56 +0000 (18:25 +0000)]
FreeBSD: Define crgetzoneid appropriately

The previous ZFS implementation on FreeBSD had ifdefs to use jailed()
instead of crgetzoneid() in dsl_dir.c, however we can simply provide an
appropriate definition of crgetzoneid for the same effect.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org>
Closes #10851

3 years agozio_ereport_post() and zio_ereport_start() return values are ignored
Toomas Soome [Tue, 1 Sep 2020 02:35:11 +0000 (05:35 +0300)]
zio_ereport_post() and zio_ereport_start() return values are ignored

Use (void) to silence analyzers.
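
For illustration, with a hypothetical stand-in for the reporting
function: the cast makes it explicit that the return value is dropped
on purpose.

```c
/* Counts calls so the effect of the sketch can be observed. */
int demo_posted = 0;

/* Hypothetical stand-in for a function whose result may be ignored. */
int
demo_ereport_post(int error)
{
	demo_posted++;
	return (error);
}

void
demo_report(int error)
{
	/* The (void) cast documents that the result is ignored. */
	(void) demo_ereport_post(error);
}
```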

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Toomas Soome <tsoome@me.com>
Closes #10857

3 years agoTypo Correction
Spencer Kinny [Sun, 30 Aug 2020 21:14:32 +0000 (02:44 +0530)]
Typo Correction

Corrected the typo in zfs/cmd/zfs/zfs_main.c
at line 404: pbkfd2iters to pbkdf2iters.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Spencer Kinny <spencerkinny1995@gmail.com>
Closes #10850

3 years agoMove spa_stats.c to common code
Matthew Macy [Sun, 30 Aug 2020 21:12:46 +0000 (14:12 -0700)]
Move spa_stats.c to common code

Initially it was considered simplest to stub out all
of the functions on FreeBSD. Now that FreeBSD supports
KSTAT_TYPE_RAW at least some of the functionality should
be made available.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #10842

3 years agoFreeBSD: Fix spurious failure in zvol_geom_open
Matthew Macy [Sun, 30 Aug 2020 21:11:33 +0000 (14:11 -0700)]
FreeBSD: Fix spurious failure in zvol_geom_open

In zvol_geom_open on first open we need to guarantee
that the namespace lock is held to avoid spurious
failures in zvol_first_open.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <freqlabs@FreeBSD.org>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #10841

3 years agoAuto close "Status: Feedback requested" after a month
Kjeld Schouten-Lebbing [Sun, 30 Aug 2020 21:09:54 +0000 (23:09 +0200)]
Auto close "Status: Feedback requested" after a month

This commit closes issues labeled with:
"Status: Feedback requested" after 1 month, if the
label is not removed or the author has not responded

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Signed-off-by: Kjeld Schouten-Lebbing <kjeld@schouten-lebbing.nl>
Closes #10807
Closes #10808

3 years agoFreeBSD: add support for KSTAT_TYPE_RAW
Matthew Macy [Sun, 30 Aug 2020 03:59:50 +0000 (20:59 -0700)]
FreeBSD: add support for KSTAT_TYPE_RAW

A few kstats use KSTAT_TYPE_RAW to provide a string generated on
demand.  Implementing these as sysctls was punted until now.

Reviewed-by: Toomas Soome <tsoome@me.com>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #10836

3 years agoLinux 5.9 compat: NR_SLAB_RECLAIMABLE
Brian Behlendorf [Sun, 30 Aug 2020 03:57:45 +0000 (20:57 -0700)]
Linux 5.9 compat: NR_SLAB_RECLAIMABLE

Commit dcdc12e added compatibility code to treat NR_SLAB_RECLAIMABLE_B
as if it were the same as NR_SLAB_RECLAIMABLE.  However, the new value
is in bytes while the old value was in pages which means they are not
interchangeable.
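
To make the unit mismatch concrete, a small sketch with a hypothetical
helper.  The page size is passed in explicitly; 4096 in the usage note
is only a common value, not a given.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: convert a byte count to whole pages.  Using a
 * byte counter where pages are expected overstates the amount by a
 * factor of page_size.
 */
uint64_t
demo_bytes_to_pages(uint64_t bytes, uint64_t page_size)
{
	assert(page_size != 0);
	return (bytes / page_size);
}
```

With a 4096-byte page, demo_bytes_to_pages(8192, 4096) is 2 pages,
whereas reading the raw value as pages would claim 8192 of them.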

The only place the reclaimable slab size is used is as a component of
the calculation done by arc_free_memory().  This function returns the
amount of memory the ARC considers to be free or reclaimable at little
cost.  Rather than switch to a new interface to get this value, it has
been removed from the calculation.  It is normally a minor component
compared to the number of inactive or free pages, and removing it
aligns the behavior with the FreeBSD version of arc_free_memory().

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Coleman Kane <ckane@colemankane.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #10834

3 years agoFix another dependency loop
Richard Laager [Sun, 31 May 2020 01:39:31 +0000 (20:39 -0500)]
Fix another dependency loop

zfs-load-key-DATASET.service was gaining an
After=systemd-journald.socket due to its stdout/stderr going to the
journal (which is the default).  systemd-journald.socket has an After
(via RequiresMountsFor=/run/systemd/journal) on -.mount.  If the root
filesystem is encrypted, -.mount gets an After
zfs-load-key-DATASET.service.

By setting stdout and stderr to null on the key load services, we avoid
this loop.

Reviewed-by: Antonio Russo <antonio.e.russo@gmail.com>
Reviewed-by: InsanePrawn <insane.prawny@gmail.com>
Signed-off-by: Richard Laager <rlaager@wiktel.com>
Closes #10356
Closes #10388

3 years agoFix a dependency loop
Richard Laager [Sat, 30 May 2020 23:40:45 +0000 (18:40 -0500)]
Fix a dependency loop

When generating units with zfs-mount-generator, if the pool is already
imported, zfs-import.target is not needed.  This avoids a dependency
loop on root-on-ZFS systems:
  systemd-random-seed.service After (via RequiresMountsFor)
  var-lib.mount After
  zfs-import.target After
  zfs-import-{cache,scan}.service After
  cryptsetup.service After
  systemd-random-seed.service

Reviewed-by: Antonio Russo <antonio.e.russo@gmail.com>
Reviewed-by: InsanePrawn <insane.prawny@gmail.com>
Signed-off-by: Richard Laager <rlaager@wiktel.com>
Closes #10388

3 years agoconfig/zfs-build.m4: add --with-vendor flag
Georgy Yakovlev [Fri, 28 Aug 2020 16:43:44 +0000 (09:43 -0700)]
config/zfs-build.m4: add --with-vendor flag

This will allow an override of auto-detection of distribution, which
is based on checking presence of /etc/*-release files.

The build system makes a lot of file location assumptions based on
the detected distribution.

Some distributions (like gentoo) may prefer explicitly
setting --with-vendor=gentoo to avoid auto-detection.

Since auto-detection checks all files in order, the current script may
misdetect even on a gentoo system if an /etc/redhat-release file is
present.

Default behavior is unchanged and default is --with-vendor=check

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Georgy Yakovlev <gyakovlev@gentoo.org>
Closes #10835

3 years agoFix definition of BLKGETSIZE64 on FreeBSD
Alexander Richardson [Thu, 27 Aug 2020 23:09:26 +0000 (00:09 +0100)]
Fix definition of BLKGETSIZE64 on FreeBSD

The matching ioctl is DIOCGMEDIASIZE.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <freqlabs@FreeBSD.org>
Signed-off-by: Alex Richardson <Alexander.Richardson@cl.cam.ac.uk>
Closes #10818

3 years agomodule/zstd: pass -U__BMI__
Georgy Yakovlev [Thu, 27 Aug 2020 22:50:13 +0000 (15:50 -0700)]
module/zstd: pass -U__BMI__

If the kernel is compiled with -march=znver1 or -march=znver2, zstd
module compilation will fail due to SSE register return with SSE
disabled.  Interestingly, -march=skylake also implies -mbmi, which
defines __BMI__, but compilation succeeds.  This is probably due to
different BMI implementations on AMD and Intel processors and the
way the compiler uses the instructions.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Georgy Yakovlev <gyakovlev@gentoo.org>
Closes #10758
Closes #10829

3 years agoAdd the Xr's to the SEE ALSO as well
John-Mark Gurney [Thu, 27 Aug 2020 05:29:00 +0000 (22:29 -0700)]
Add the Xr's to the SEE ALSO as well

There are a ton of zfs-* and zpool-* man pages. This adds them to
the SEE ALSO section so that people can more quickly look through
what all the options are, now that the pages have been split.

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Signed-off-by: John-Mark Gurney <jmg@funkthat.com>
Closes #10589

3 years agodnode_sync is careless with range tree
Patrick Mooney [Thu, 27 Aug 2020 04:48:29 +0000 (23:48 -0500)]
dnode_sync is careless with range tree

Because dnode_sync_free_range() must drop dn_mtx during its processing,
using it as a callback to range_tree_vacate() is not safe.  No other
operations (besides destroy) are allowed once range_tree_vacate() has
begun, and dropping dn_mtx would leave a window open for another thread
to observe that invalid (and unsafe) state via dnode_block_freed().

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Signed-off-by: Patrick Mooney <pmooney@oxide.computer>
Closes #10708
Closes #10823

3 years agoFix NEWS file
Cédric Berger [Thu, 27 Aug 2020 04:44:41 +0000 (06:44 +0200)]
Fix NEWS file

Points to https://github.com/openzfs/zfs/releases

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Cédric Berger <cedric@precidata.com>
Closes #10824

3 years agozpool: Change base URL for ZFS messages to openzfs-docs
Ryan Moeller [Thu, 27 Aug 2020 04:43:06 +0000 (00:43 -0400)]
zpool: Change base URL for ZFS messages to openzfs-docs

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Kjeld Schouten <kjeld@schouten-lebbing.nl>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #10820

3 years agoRemove duplicate dnode.h include
Brian Behlendorf [Thu, 27 Aug 2020 04:41:09 +0000 (21:41 -0700)]
Remove duplicate dnode.h include

The zfs/sa.c source file accidentally includes sys/dnode.h twice.
Remove the second occurrence.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #10816
Closes #10819

3 years agoAlways track temporary fses and snapshots for accounting
Paul Dagnelie [Thu, 27 Aug 2020 04:38:27 +0000 (21:38 -0700)]
Always track temporary fses and snapshots for accounting

The root cause of the issue is that we only occasionally do as the
comments in the code suggest and actually ignore the %recv dataset when
it comes to filesystem limit tracking. Specifically, the only time we
ignore it is when initializing the filesystem and snapshot limit values;
when creating a new %recv dataset or deleting one, we always update
the bookkeeping. This causes a problem if you init the fs count on a
filesystem that already has a %recv dataset, since the bookkeeping
will be decremented but not incremented. This is resolved in this
patch by simply always tracking the %recv dataset as a child.
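
A toy model of the imbalance, with hypothetical demo_* names and a
simplified rule that temporary datasets start with '%'.  It is not the
dsl_dir code, only an illustration of why skipping %recv at init time
while always decrementing on delete leaves the count one too low.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy rule: temporary datasets are those whose name starts with '%'. */
bool
demo_is_temp(const char *name)
{
	return (name[0] == '%');
}

/*
 * Hypothetical initial count of child filesystems.  With skip_temp
 * set, temporary children are ignored, which models the old behavior.
 */
uint64_t
demo_init_count(const char *const *children, size_t n, bool skip_temp)
{
	uint64_t count = 0;

	assert(children != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (skip_temp && demo_is_temp(children[i]))
			continue;
		count++;
	}
	return (count);
}
```

With children "fs1" and "%recv", the old-style init yields 1; deleting
%recv then decrements to 0 even though "fs1" still exists.  Counting
%recv from the start yields 2, and the same delete leaves 1.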

Reviewed-by: Matt Ahrens <matt@delphix.com>
Reviewed-by: Jerry Jelinek <jerry.jelinek@joyent.com>
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
Closes #10791

3 years agoFix broken bug report form
Kjeld Schouten-Lebbing [Wed, 26 Aug 2020 17:49:51 +0000 (19:49 +0200)]
Fix broken bug report form

The previous PR accidentally broke the bug report form.
This commit fixes it
(and has actually been fully tested to work).

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Kjeld Schouten-Lebbing <kjeld@schouten-lebbing.nl>
Closes #10821

3 years agoRemove pragma ident lines
Toomas Soome [Wed, 26 Aug 2020 17:35:50 +0000 (20:35 +0300)]
Remove pragma ident lines

The #pragma ident is a historical relic and is not needed any more.
This pragma is actually unknown to common compilers and only causes
trouble.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matt Macy <mmacy@FreeBSD.org>
Signed-off-by: Toomas Soome <tsoome@me.com>
Closes #10810

3 years agoFreeBSD: disable neon usage
Matthew Macy [Wed, 26 Aug 2020 16:54:37 +0000 (09:54 -0700)]
FreeBSD: disable neon usage

The neon support code does not build on FreeBSD, so ifdef out
references to fix linker issues on arm64.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #10809

3 years agoGithub CI: Enable checkbashism
George Melikov [Wed, 26 Aug 2020 16:52:28 +0000 (19:52 +0300)]
Github CI: Enable checkbashism

Run checkbashisms on checkstyle too.

Reviewed-by: Kjeld Schouten <kjeld@schouten-lebbing.nl>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Closes #10811

3 years agoStaleBot Tweaks
Kjeld Schouten-Lebbing [Wed, 26 Aug 2020 16:49:58 +0000 (18:49 +0200)]
StaleBot Tweaks

- Add Status: Triage Needed to bug reports

Currently "Type: Defect" is auto added.
Adding a triage tag makes sure all issues are reviewed by a maintainer.
It also opens up some options to prioritize defects in the near future.

- Prevent future StaleBot Spam

StaleBot will limit itself to 6 actions per hour
This should prevent future floods of StaleBot activity
(aka Spam)

- StaleBot: Ignore issues that are being worked on

Ignore the following Issues:
- tagged: "Status: Work in Progress"
- Having a maintainer assigned
- Being part of a project
- Having a milestone tag

- Rename Ignore "Type: Understood" to "Bot: Not Stale"

This commit changes the general ignore tag for StaleBot from:
 "Type: Understood"
to
"Bot: Not Stale"

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Kjeld Schouten-Lebbing <kjeld@schouten-lebbing.nl>
Closes #10813

3 years agoIntroduce limit on size of L2ARC headers
Alexander Motin [Tue, 25 Aug 2020 21:33:36 +0000 (17:33 -0400)]
Introduce limit on size of L2ARC headers

Since L2ARC buffers are not evicted on memory pressure, too large an
amount of headers on a system with an irrationally large L2ARC can
render it slow or even unusable.  This change limits L2ARC writes and
rebuild if unevictable L2ARC-only headers reach a dangerous level.
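
A simplified sketch of such a limit check.  The function name, the
percentage parameter and the choice of the ARC target size as the base
are assumptions for illustration; the actual condition in arc.c may
combine more terms.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical check: true once L2ARC-only header bytes exceed `pct`
 * percent of the ARC target size.  Callers would skip further L2ARC
 * writes or stop a rebuild while this holds.
 */
bool
demo_l2arc_hdr_limit_reached(uint64_t hdr_bytes, uint64_t arc_target,
    unsigned pct)
{
	assert(pct <= 100);
	return (hdr_bytes > arc_target / 100 * pct);
}
```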

While there, call arc_adapt() on L2ARC rebuild, so that it could
properly grow arc_c, reflecting potentially significant ARC size
increase and avoiding slow growth with hopeless eviction attempts
later when "overflow" is detected.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reported-by: Richard Elling <Richard.Elling@RichardElling.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes #10765

3 years agoTag 2.0.0-rc1
Brian Behlendorf [Tue, 25 Aug 2020 18:48:28 +0000 (11:48 -0700)]
Tag 2.0.0-rc1

New features:
- Unified code base for Linux and FreeBSD
- Redacted 'zfs send/recv'
- Persistent L2ARC
- Sequential resilvering
- ZSTD Compression
- Log spacemaps
- Fast clone deletion
- Sectional zfs/zpool man pages
- Added 'zpool wait' subcommand
- Improved 'zfs share' scalability
- Improved AES-GCM encryption performance

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>

3 years agoDon't assert on nvlists larger than SPA_MAXBLOCKSIZE
Allan Jude [Tue, 25 Aug 2020 18:04:20 +0000 (14:04 -0400)]
Don't assert on nvlists larger than SPA_MAXBLOCKSIZE

Originally we asserted that all reads are less than SPA_MAXBLOCKSIZE.
However, nvlists are not ZFS records, and are not limited to
SPA_MAXBLOCKSIZE.

Add a new environment variable, ZFS_SENDRECV_MAX_NVLIST, to allow the
user to specify the maximum size of the nvlist that can be sent or
received.
Default value: 4 * SPA_MAXBLOCKSIZE (64 MB)
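
A hedged sketch of how such an override could be read.  The demo_*
names are hypothetical, and the fallback to the default on unparsable
or zero input is an assumption of this sketch, not a statement about
libzfs.  The 16 MB block size follows from the 64 MB default above.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define	DEMO_SPA_MAXBLOCKSIZE	(1ULL << 24)	/* 16 MB */
#define	DEMO_DEFAULT_MAX_NVLIST	(4 * DEMO_SPA_MAXBLOCKSIZE)

/*
 * Hypothetical parser.  `env` is the value of ZFS_SENDRECV_MAX_NVLIST
 * as returned by getenv(), or NULL when the variable is unset.
 * Unparsable or zero values fall back to the default.
 */
uint64_t
demo_max_nvlist(const char *env)
{
	char *end;
	unsigned long long v;

	if (env == NULL || *env == '\0')
		return (DEMO_DEFAULT_MAX_NVLIST);
	errno = 0;
	v = strtoull(env, &end, 10);
	if (errno != 0 || *end != '\0' || v == 0)
		return (DEMO_DEFAULT_MAX_NVLIST);
	return (v);
}
```

A caller would pass the environment value straight through, for
example demo_max_nvlist(getenv("ZFS_SENDRECV_MAX_NVLIST")).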

Modify libzfs send routines to return a useful error if the send stream
will generate an nvlist that is beyond the maximum size.

Modify libzfs recv routines to add an explicit error message if the
nvlist is too large, rather than abort()ing.

Change the assert() to only trigger on data records.

Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Kjeld Schouten <kjeld@schouten-lebbing.nl>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Signed-off-by: Allan Jude <allan@klarasystems.com>
Closes #9616

3 years agoMark lua setjmp/longjmp for powerpc weak
sterlingjensen [Tue, 25 Aug 2020 17:32:49 +0000 (12:32 -0500)]
Mark lua setjmp/longjmp for powerpc weak

Linux already defines setjmp/longjmp for powerpc, which leads to
duplicate symbols in a statically linked build.
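
The same idea expressed in C, as an illustration only: the function
below is hypothetical and uses the GCC/Clang weak attribute, whereas
the lua setjmp/longjmp code itself may mark the symbols weak in a
different way.

```c
/*
 * Hypothetical function marked weak.  If another object in a static
 * link provides a strong demo_hook(), the linker uses that definition
 * instead of reporting a duplicate symbol.
 */
__attribute__((weak)) int
demo_hook(void)
{
	return (1);
}
```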

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Sterlng Jensen <sterlingjensen@users.noreply.github.com>
Closes #10795

3 years agoExport dmu_offset_next() symbol
Brian Behlendorf [Tue, 25 Aug 2020 15:34:41 +0000 (08:34 -0700)]
Export dmu_offset_next() symbol

Export the dmu_offset_next() symbol for use by Lustre.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #10796

3 years agoman: Canonicalize .TH usage
Ryan Moeller [Tue, 25 Aug 2020 04:25:28 +0000 (00:25 -0400)]
man: Canonicalize .TH usage

* Use all caps for document title.
* Remove section name as it can be inferred from the section number.
* Name "OpenZFS" as the document source.
* Bump modification date.

While here, fixed trailing whitespace reported by igor.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #10792

3 years agoFix inability to destroy snapshot used over NFS
youzhongyang [Tue, 25 Aug 2020 00:33:02 +0000 (20:33 -0400)]
Fix inability to destroy snapshot used over NFS

The cache of struct svc_export and struct svc_expkey by nfsd and
rpc.mountd for the snapshot holds references to the mount point.
We need to flush them out before unmounting, otherwise umount
would fail with EBUSY.

Reviewed-by: Don Brady <don.brady@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Youzhong Yang <yyang@mathworks.com>
Closes #6000
Closes #10783

3 years agoAvoid symbol collision with in-kernel zstdlib
Sebastian Gottschall [Mon, 24 Aug 2020 19:20:41 +0000 (21:20 +0200)]
Avoid symbol collision with in-kernel zstdlib

For Linux, when zfs is compiled as an in-kernel static variant
and the in-kernel zstd library is compiled statically into the kernel,
a symbol collision will occur.  This wrapper header renames all
of the relevant zstd functions to avoid this problem.
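
A minimal sketch of the renaming technique with a hypothetical
function.  The demo_* name, the zfs_ prefix and the placeholder
arithmetic are all assumptions for illustration, not the contents of
the wrapper header.

```c
#include <stddef.h>

/*
 * Rename the entry point before its definition is seen, so the
 * emitted symbol is zfs_demo_compress_bound and cannot collide with
 * another demo_compress_bound linked into the same image.
 */
#define	demo_compress_bound	zfs_demo_compress_bound

/* Placeholder body; the formula is arbitrary. */
size_t
demo_compress_bound(size_t src_size)
{
	return (src_size + (src_size >> 8) + 64);
}
```

Source code keeps calling demo_compress_bound(); only the symbol name
seen by the linker changes.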

Reviewed-by: Kjeld Schouten <kjeld@schouten-lebbing.nl>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Sebastian Gottschall <s.gottschall@dd-wrt.com>
Closes #10775

3 years agoAdd Stale-bot
Kjeld Schouten-Lebbing [Mon, 24 Aug 2020 19:04:38 +0000 (21:04 +0200)]
Add Stale-bot

This file configures the following stale-bot:
https://github.com/apps/stale

It is set to mark issues as "Stale" after 365 days.
It is also set to auto-close the issue 90 days after that.

Please be aware that this also requires the listed stale-bot to be
added to the repo.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Kjeld Schouten-Lebbing <kjeld@schouten-lebbing.nl>
Closes #10778

3 years agoAppease GCC sprintf warnings found on Fedora 32/GCC 10.0.1
Chris McDonough [Mon, 24 Aug 2020 17:32:59 +0000 (13:32 -0400)]
Appease GCC sprintf warnings found on Fedora 32/GCC 10.0.1

Increase the size of DDT_NAMELEN and MNT_LINE_MAX to appease GCC
snprintf truncation warnings.
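
For context, a sketch of what the warning is about, using a
hypothetical helper: snprintf() returns the length the full string
would have had, so a result at or beyond the buffer size means the
output was cut short.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical helper: format "<pool>/<name>" into buf and return
 * true only if the whole string fit, including the terminating NUL.
 */
bool
demo_format_name(char *buf, size_t len, const char *pool,
    const char *name)
{
	int n = snprintf(buf, len, "%s/%s", pool, name);

	return (n >= 0 && (size_t)n < len);
}
```

Enlarging the destination buffer, as this change does for the two
constants, removes the cases where the compiler can prove truncation
is possible.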

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chris McDonough <chrism@plope.com>
Closes #10712
Closes #10766

3 years agoZTS: Improve block_device_wait on FreeBSD
Ryan Moeller [Mon, 24 Aug 2020 15:50:15 +0000 (11:50 -0400)]
ZTS: Improve block_device_wait on FreeBSD

FreeBSD doesn't have an equivalent to udevadm settle, so we have been
resorting to a three second sleep to wait for device changes to take
effect.  This is far from ideal.

We are mainly waiting for volmode=geom zvols to appear in /dev, so as
a hack, reading the geom config will have the desired effect of
quiescing the geom state.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #10768

3 years agoImprove documentation of zpool import -d/-c vs -s
Chris McDonough [Mon, 24 Aug 2020 04:18:30 +0000 (00:18 -0400)]
Improve documentation of zpool import -d/-c vs -s

Specify that, by default, zpool import uses the libblkid
cache on Linux and geom on FreeBSD, and only scans when
-d/-s is provided.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <freqlabs@FreeBSD.org>
Signed-off-by: Chris McDonough <chrism@plope.com>
Closes #7656
Closes #10771

3 years agoCI checkstyle: add linter + rename job + install latest flake8
George Melikov [Mon, 24 Aug 2020 04:15:25 +0000 (07:15 +0300)]
CI checkstyle: add linter + rename job + install latest flake8

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Closes #10784

3 years agoZFS performance tests should clean up NFS mount
Tony Nguyen [Sun, 23 Aug 2020 22:14:22 +0000 (16:14 -0600)]
ZFS performance tests should clean up NFS mount

This change unmounts the client's NFS mount after each test so we can
avoid two sporadic issues:
1) a stale NFS mount on the client and
2) zpool export and zpool destroy failing due to a busy dataset

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Nguyen <tony.nguyen@delphix.com>
Closes #10767