git.proxmox.com Git - mirror_ubuntu-jammy-kernel.git/commit

xfs: increase inode cluster size for v5 filesystems

v5 filesystems use 512 byte inodes as a minimum, so read inodes in
clusters that are effectively half the size of a v4 filesystem with
256 byte inodes. For v5 fielsystems, scale the inode cluster size
with the size of the inode so that we keep a constant 32 inodes per
cluster ratio for all inode IO.

This only works if mkfs.xfs sets the inode alignment appropriately
for larger inode clusters, so this functionality is made conditional
on mkfs doing the right thing. xfs_repair needs to know about
the inode alignment changes, too.

Wall time:
create bulkstat find+stat ls -R unlink
v4 237s 161s 173s 201s 299s
v5 235s 163s 205s 31s 356s
patched 234s 160s 182s 29s 317s

System time:
create bulkstat find+stat ls -R unlink
v4 2601s 2490s 1653s 1656s 2960s
v5 2637s 2497s 1681s   20s 3216s
patched 2613s 2451s 1658s   20s 3007s

So, wall time same or down across the board, system time same or
down across the board, and cache hit rates all improve except for
the ls -R case which is a pure cold cache directory read workload
on v5 filesystems...

So, this patch removes most of the performance and CPU usage
differential between v4 and v5 filesystems on traversal related
workloads.

Note: while this patch is currently for v5 filesystems only, there
is no reason it can't be ported back to v4 filesystems.  This hasn't
been done here because bringing the code back to v4 requires
forwards and backwards kernel compatibility testing.  i.e. to
deterine if older kernels(*) do the right thing with larger inode
alignments but still only using 8k inode cluster sizes. None of this
testing and validation on v4 filesystems has been done, so for the
moment larger inode clusters is limited to v5 superblocks.

(*) a current default config v4 filesystem should mount just fine on
2.6.23 (when lazy-count support was introduced), and so if we change
the alignment emitted by mkfs without a feature bit then we have to
make sure it works properly on all kernels since 2.6.23. And if we
allow it to be changed when the lazy-count bit is not set, then it's
all kernels since v2 logs were introduced that need to be tested for
compatibility...

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>

author	Dave Chinner <dchinner@redhat.com>
	Fri, 1 Nov 2013 04:27:20 +0000 (15:27 +1100)
committer	Ben Myers <bpm@sgi.com>
	Mon, 18 Nov 2013 15:29:36 +0000 (09:29 -0600)
commit	8f80587bacb6eb893df0ee4e35fefa0dfcfdf9f4
tree	0b1eb078933bbf6f76a240dd21aedc90b967fb21	tree
parent	9e3908e342eba6684621e616529669c17e271e7e	commit \| diff

fs/xfs/xfs_mount.c		diff \| blob \| blame \| history
fs/xfs/xfs_mount.h		diff \| blob \| blame \| history
fs/xfs/xfs_trans_resv.c		diff \| blob \| blame \| history