From 85537f77a3574f265c5e87c3ee4b1858317530ce Mon Sep 17 00:00:00 2001
From: Serapheim Dimitropoulos
Date: Thu, 3 Nov 2022 15:02:46 -0700
Subject: [PATCH] Expose zfs_vdev_open_timeout_ms as a tunable

Some of our customers have been occasionally hitting zfs import failures
in Linux because udevd doesn't create the by-id symbolic links in time
for zpool import to use them. The main issue is that the
systemd-udev-settle.service that zfs-import-cache.service and other
services depend on is racy. There is also an openzfs issue filed (see
https://github.com/openzfs/zfs/issues/10891) outlining the problem and
potential solutions.

With the proper solutions being significant in terms of complexity and
the priority of the issue being low for the time being, this patch
exposes `zfs_vdev_open_timeout_ms` as a tunable so people that are
experiencing this issue often can increase it as a workaround.

Reviewed-by: Matthew Ahrens
Reviewed-by: Richard Yao
Reviewed-by: Alexander Motin
Reviewed-by: Don Brady
Reviewed-by: Brian Behlendorf
Signed-off-by: Serapheim Dimitropoulos
Closes #14133
---
 man/man4/zfs.4                  | 7 +++++++
 module/os/linux/zfs/vdev_disk.c | 5 ++++-
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/man/man4/zfs.4 b/man/man4/zfs.4
index b317941bb..ed8914276 100644
--- a/man/man4/zfs.4
+++ b/man/man4/zfs.4
@@ -1222,6 +1222,13 @@ Ideally, this will be at least the sum of each queue's
 .Sy max_active .
 .No See Sx ZFS I/O SCHEDULER .
 .
+.It Sy zfs_vdev_open_timeout_ms Ns = Ns Sy 1000 Pq uint
+Timeout value to wait before determining a device is missing
+during import.
+This is helpful for transient missing paths due
+to links being briefly removed and recreated in response to
+udev events.
+.
 .It Sy zfs_vdev_rebuild_max_active Ns = Ns Sy 3 Pq int
 Maximum sequential resilver I/O operations active to each device.
 .No See Sx ZFS I/O SCHEDULER .
diff --git a/module/os/linux/zfs/vdev_disk.c b/module/os/linux/zfs/vdev_disk.c
index d19595706..2f84792d8 100644
--- a/module/os/linux/zfs/vdev_disk.c
+++ b/module/os/linux/zfs/vdev_disk.c
@@ -56,7 +56,7 @@ static void *zfs_vdev_holder = VDEV_HOLDER;
  * device is missing. The missing path may be transient since the links
  * can be briefly removed and recreated in response to udev events.
  */
-static unsigned zfs_vdev_open_timeout_ms = 1000;
+static uint_t zfs_vdev_open_timeout_ms = 1000;
 
 /*
  * Size of the "reserved" partition, in blocks.
@@ -1020,3 +1020,6 @@ param_set_max_auto_ashift(const char *buf, zfs_kernel_param_t *kp)
 
 	return (0);
 }
+
+ZFS_MODULE_PARAM(zfs_vdev, zfs_vdev_, open_timeout_ms, UINT, ZMOD_RW,
+	"Timeout before determining that a device is missing");
-- 
2.39.5