]> git.proxmox.com Git - mirror_zfs.git/commitdiff
Don't assert mg_initialized due to device addition race
authorPaul Dagnelie <pcd@delphix.com>
Mon, 29 Jan 2024 18:36:42 +0000 (10:36 -0800)
committerGitHub <noreply@github.com>
Mon, 29 Jan 2024 18:36:42 +0000 (10:36 -0800)
During device removal stress tests, we noticed that we were tripping
the assertion that mg_initialized was true. After investigation, it was
determined that the mg in question was the embedded log metaslab
group for a newly added vdev; the normal mg had been initialized (by
metaslab_sync_reassess, via vdev_sync_done). However, because the spa
config alloc lock is not held as writer across both calls to
metaslab_sync_reassess, it is possible for an allocation to happen
between the two metaslab_groups being initialized. Because the metaslab
code doesn't check the group in question, just the vdev's main mg, it
is possible to get past the initial check in vdev_allocatable and
later fail due to the assertion.

We simply remove the assertions. We could also consider locking the
ALLOC lock around the reassess calls in vdev_sync_done, but that risks
deadlocks. We could check the actual target mg in vdev_allocatable,
but that risks racing with a passivation that comes in after that
check but before the assertion. We still won't be able to actually
allocate from the metaslab group if no metaslabs are ready, so this
change shouldn't break anything.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Wilson <george.wilson@delphix.com>
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
Closes #15818

module/zfs/metaslab.c

index 0983ba143a1d1e5688df07587a672d12f63c26f7..7237fa8eeb5917859cbc3c884ae44471731f3edd 100644 (file)
@@ -5105,7 +5105,6 @@ metaslab_group_alloc(metaslab_group_t *mg, zio_alloc_list_t *zal,
     int allocator, boolean_t try_hard)
 {
        uint64_t offset;
-       ASSERT(mg->mg_initialized);
 
        offset = metaslab_group_alloc_normal(mg, zal, asize, txg, want_unique,
            dva, d, allocator, try_hard);
@@ -5256,8 +5255,6 @@ top:
                        goto next;
                }
 
-               ASSERT(mg->mg_initialized);
-
                /*
                 * Avoid writing single-copy data to an unhealthy,
                 * non-redundant vdev, unless we've already tried all