Illumos #4101, #4102, #4103, #4105, #4106
author George Wilson <george.wilson@delphix.com>
Tue, 1 Oct 2013 21:25:53 +0000 (13:25 -0800)
committer Brian Behlendorf <behlendorf1@llnl.gov>
Tue, 22 Jul 2014 16:39:16 +0000 (09:39 -0700)
commit 93cf20764a1be64a603020f54b45200e37b3877e
tree b0db8d60368de34cdbd4eccc9ee98d1110beb15e
parent 1be627f5c28a355bcd49e4e097114c13fae7731b

4101 metaslab_debug should allow for fine-grained control
4102 space_maps should store more information about themselves
4103 space map object blocksize should be increased
4105 removing a mirrored log device results in a leaked object
4106 asynchronously load metaslab
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Adam Leventhal <ahl@delphix.com>
Reviewed by: Sebastien Roy <seb@delphix.com>
Approved by: Garrett D'Amore <garrett@damore.org>

Prior to this patch, space_maps were preferred solely based on the
amount of free space left in each. Unfortunately, this heuristic didn't
contain any information about the make-up of that free space, which
meant we could keep preferring and loading a highly fragmented space_map
that wouldn't actually have enough contiguous space to satisfy the
allocation, only to unload that space_map and repeat the process.

This change modifies the space_maps to store additional information
about the contiguous space in the space_map, so that we can use this
information to make a better decision about which space_map to load.
This requires reallocating all space_map objects to increase their
bonus buffer sizes enough to fit the new metadata.
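
As a rough illustration of why the extra information helps, the sketch
below compares the two selection heuristics. It is not the code from
this patch: the candidate_t type, the HIST_BUCKETS layout (bucket i
counts free segments of size [2^i, 2^(i+1))), and all function names
are hypothetical, chosen only to show how a histogram of contiguous
free space lets the selector skip a fragmented candidate.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: bucket i counts free segments of size [2^i, 2^(i+1)). */
#define HIST_BUCKETS 32

typedef struct candidate {
	uint64_t free_bytes;           /* total free space in this candidate */
	uint64_t hist[HIST_BUCKETS];   /* free-segment counts by power-of-two size */
} candidate_t;

/* floor(log2(x)) for x > 0. */
static int
floor_log2(uint64_t x)
{
	int i = 0;

	assert(x > 0);
	while (x >>= 1)
		i++;
	return (i);
}

/*
 * Nonzero if the histogram guarantees a free segment of at least `size`
 * bytes. Every segment in bucket i is at least 2^i bytes, so only buckets
 * with 2^i >= size give a guarantee.
 */
static int
can_satisfy(const candidate_t *c, uint64_t size)
{
	int first = floor_log2(size);

	if (((uint64_t)1 << first) < size)
		first++;               /* round up to ceil(log2(size)) */
	for (int i = first; i < HIST_BUCKETS; i++) {
		if (c->hist[i] != 0)
			return (1);
	}
	return (0);
}

/* Old heuristic: prefer whichever candidate has the most free bytes. */
static int
pick_by_free_space(const candidate_t *c, size_t n)
{
	int best = -1;

	for (size_t i = 0; i < n; i++) {
		if (best < 0 || c[i].free_bytes > c[best].free_bytes)
			best = (int)i;
	}
	return (best);
}

/*
 * Histogram-aware heuristic: same preference, but only among candidates
 * that can actually hold the allocation. Returns -1 if none can.
 */
static int
pick_by_histogram(const candidate_t *c, size_t n, uint64_t size)
{
	int best = -1;

	for (size_t i = 0; i < n; i++) {
		if (!can_satisfy(&c[i], size))
			continue;
		if (best < 0 || c[i].free_bytes > c[best].free_bytes)
			best = (int)i;
	}
	return (best);
}
```

With one candidate holding more free space split into 4K fragments and
another holding less free space but a single 256K segment, the
free-space heuristic keeps choosing the first for a 128K allocation,
while the histogram-aware one chooses the second.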

The above feature can be enabled via a new feature flag introduced by
this change: com.delphix:spacemap_histogram

In addition to the above, this patch allows the space_map block size to
be increased. Currently the block size is set to 4K, which has
certain implications including the following:

    * 4K sector devices will not see any compression benefit
    * large space_maps require more metadata on-disk
    * large space_maps require more time to load (typically random reads)

Now the space_map block size can adjust as needed up to the maximum size
set via the space_map_max_blksz variable.
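
One way to read the first bullet above: on-disk allocations occupy
whole sectors, so a 4K block on a 4K-sector device cannot shrink below
one sector no matter how well it compresses. The helper below is a
hypothetical illustration of that rounding, not a function from this
patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: bytes actually occupied on disk once a compressed
 * block is rounded up to a whole number of sectors.
 */
static uint64_t
allocated_size(uint64_t compressed_bytes, uint64_t sector_bytes)
{
	uint64_t sectors;

	assert(sector_bytes > 0);
	sectors = (compressed_bytes + sector_bytes - 1) / sector_bytes;
	return (sectors * sector_bytes);
}
```

Under this model a 4K block that compresses to 1K still occupies 4K on
a 4K-sector device, whereas a larger block that compresses to 40K
occupies only 40K, so larger space_map blocks leave room for
compression to pay off.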

A bug was fixed which resulted in potentially leaking an object when
removing a mirrored log device. The previous logic for vdev_remove() did
not correctly handle removing top-level vdevs that are interior vdevs
(i.e. mirrors). The problem would occur when removing a mirrored log
device, and resulted in the DTL space map object being leaked, because
top-level vdevs don't have DTL space map objects associated with them.

References:
  https://www.illumos.org/issues/4101
  https://www.illumos.org/issues/4102
  https://www.illumos.org/issues/4103
  https://www.illumos.org/issues/4105
  https://www.illumos.org/issues/4106
  https://github.com/illumos/illumos-gate/commit/0713e23

Porting notes:

A handful of kmem_alloc() calls were converted to kmem_zalloc(). Also,
the KM_PUSHPAGE and TQ_PUSHPAGE flags were used as necessary.

Ported-by: Tim Chase <tim@chase2k.com>
Signed-off-by: Prakash Surya <surya1@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #2488
24 files changed:
cmd/zdb/zdb.c
include/sys/Makefile.am
include/sys/metaslab.h
include/sys/metaslab_impl.h
include/sys/range_tree.h [new file with mode: 0644]
include/sys/space_map.h
include/sys/space_reftree.h [new file with mode: 0644]
include/sys/vdev_impl.h
include/sys/zfeature.h
include/zfeature_common.h
lib/libzpool/Makefile.am
man/man5/zpool-features.5
module/zfs/Makefile.in
module/zfs/dnode.c
module/zfs/metaslab.c
module/zfs/range_tree.c [new file with mode: 0644]
module/zfs/spa.c
module/zfs/spa_misc.c
module/zfs/space_map.c
module/zfs/space_reftree.c [new file with mode: 0644]
module/zfs/vdev.c
module/zfs/vdev_label.c
module/zfs/zfeature.c
module/zfs/zfeature_common.c