X-Git-Url: https://git.proxmox.com/?a=blobdiff_plain;f=man%2Fman5%2Fspl-module-parameters.5;h=3e7e877fbbbc86becc7221499702cb658b6911c0;hb=10946b0206deecf1f7f9df2f443079ddc53a7208;hp=9b351762cbc9281e0bba466a8541456489e765f9;hpb=9e4fb5c2f9158743c5e87456334c88008e2e2074;p=mirror_spl-debian.git

diff --git a/man/man5/spl-module-parameters.5 b/man/man5/spl-module-parameters.5
index 9b35176..3e7e877 100644
--- a/man/man5/spl-module-parameters.5
+++ b/man/man5/spl-module-parameters.5
@@ -17,113 +17,223 @@ Description of the different parameters to the SPL module.
 .sp
 .ne 2
 .na
-\fBspl_debug_subsys\fR (ulong)
+\fBspl_kmem_cache_expire\fR (uint)
 .ad
 .RS 12n
-Subsystem debugging level mask.
+Cache expiration is part of default Illumos cache behavior.  The idea is
+that objects in magazines which have not been recently accessed should be
+returned to the slabs periodically.  This is known as cache aging and
+when enabled, objects are typically returned after 15 seconds.
+.sp
+On the other hand, Linux slabs are designed to never move objects back to
+the slabs unless there is memory pressure.  This is possible because under
+Linux the cache will be notified when memory is low and objects can be
+released.
+.sp
+By default only the Linux method is enabled.  It has been shown to improve
+responsiveness on low memory systems and not negatively impact the performance
+of systems with more memory.  This policy may be changed by setting the
+\fBspl_kmem_cache_expire\fR bit mask as follows; both policies may be enabled
+concurrently.
+.sp
+0x01 - Aging (Illumos), 0x02 - Low memory (Linux)
 .sp
-Default value: \fB~0\fR.
+Default value: \fB0x02\fR
 .RE
 
 .sp
 .ne 2
 .na
-\fBspl_debug_mask\fR (ulong)
+\fBspl_kmem_cache_reclaim\fR (uint)
 .ad
 .RS 12n
-Debugging level mask.
+When this is set it prevents Linux from being able to rapidly reclaim all the
+memory held by the kmem caches.  This may be useful in circumstances where
+it's preferable that Linux reclaim memory from some other subsystem first.
+Setting this will increase the likelihood of out of memory events on a memory
+constrained system.
 .sp
-Default value: \fB8 | 10 | 4 | 20\fR (SD_ERROR | SD_EMERG | SD_WARNING | SD_CONSOLE).
+Default value: \fB0\fR
 .RE
 
 .sp
 .ne 2
 .na
-\fBspl_debug_printk\fR (ulong)
+\fBspl_kmem_cache_obj_per_slab\fR (uint)
 .ad
 .RS 12n
-Console printk level mask.
+The preferred number of objects per slab in the cache.  In general, a larger
+value will increase the cache's memory footprint while decreasing the time
+required to perform an allocation.  Conversely, a smaller value will minimize
+the footprint and improve cache reclaim time but individual allocations may
+take longer.
 .sp
-Default value: \fB8 | 10 | 4 | 20\fR (SD_ERROR | SD_EMERG | SD_WARNING | SD_CONSOLE).
+Default value: \fB8\fR
 .RE
 
 .sp
 .ne 2
 .na
-\fBspl_debug_mb\fR (int)
+\fBspl_kmem_cache_obj_per_slab_min\fR (uint)
 .ad
 .RS 12n
-Total debug buffer size.
+The minimum number of objects allowed per slab.  Normally slabs will contain
+\fBspl_kmem_cache_obj_per_slab\fR objects but for caches that contain very
+large objects it's desirable to only have a few, or even just one, object per
+slab.
 .sp
-Default value: \fB-1\fR.
+Default value: \fB1\fR
 .RE
 
 .sp
 .ne 2
 .na
-\fBspl_debug_panic_on_bug\fR (int)
+\fBspl_kmem_cache_max_size\fR (uint)
 .ad
 .RS 12n
-Panic on BUG
+The maximum size of a kmem cache slab in MiB.  This effectively limits
+the maximum cache object size to \fBspl_kmem_cache_max_size\fR /
+\fBspl_kmem_cache_obj_per_slab\fR.  Caches may not be created with
+objects sized larger than this limit.
 .sp
-Use \fB1\fR for yes and \fB0\fR for no (default).
+Default value: \fB32 (64-bit) or 4 (32-bit)\fR
 .RE
 
 .sp
 .ne 2
 .na
-\fBspl_kmem_cache_expire\fR (uint)
+\fBspl_kmem_cache_slab_limit\fR (uint)
 .ad
 .RS 12n
-By age (0x1) or low memory (0x2)
+For small objects the Linux slab allocator should be used to make the most
+efficient use of the memory.  However, large objects are not supported by
+the Linux slab and therefore the SPL implementation is preferred.  This
+value is used to determine the cutoff between a small and large object.
+.sp
+Objects of \fBspl_kmem_cache_slab_limit\fR or smaller will be allocated
+using the Linux slab allocator; larger objects use the SPL allocator.  A
+cutoff of 16K was determined to be optimal for architectures using 4K pages.
 .sp
-Default value: \fB0\fR.
+Default value: \fB16,384\fR
 .RE
 
 .sp
 .ne 2
 .na
-\fBspl_hostid\fR (ulong)
+\fBspl_kmem_cache_kmem_limit\fR (uint)
 .ad
 .RS 12n
-The system hostid.
-.sp
-Default value: \fB0xFFFFFFFF\fR (an invalid hostid!)
+Depending on the size of a cache object it may be backed by kmalloc()'d
+or vmalloc()'d memory.  This is because the size of the required allocation
+greatly impacts the best way to allocate the memory.
+.sp
+When objects are small and only a small number of memory pages need to be
+allocated, ideally just one, then kmalloc() is very efficient.  However,
+when allocating multiple pages with kmalloc() it gets increasingly expensive
+because the pages must be physically contiguous.
+.sp
+For this reason we shift to vmalloc() for slabs of large objects, which
+removes the need for contiguous pages.  We cannot use vmalloc() in
+all cases because there is significant locking overhead involved.  This
+function takes a single global lock over the entire virtual address range
+which serializes all allocations.  Using slightly different allocation
+functions for small and large objects allows us to handle a wide range of
+object sizes.
+.sp
+The \fBspl_kmem_cache_kmem_limit\fR value is used to determine this cutoff
+size.  One quarter the PAGE_SIZE is used as the default value because
+\fBspl_kmem_cache_obj_per_slab\fR defaults to 16.  This means that at
+most we will need to allocate four contiguous pages.
+.sp
+Default value: \fBPAGE_SIZE/4\fR
 .RE
 
 .sp
 .ne 2
 .na
-\fBspl_hostid_path\fR (charp)
+\fBspl_kmem_alloc_warn\fR (uint)
 .ad
 .RS 12n
-The system hostid file
-.sp
-Default value: \fB/etc/hostid\fR.
+As a general rule kmem_alloc() allocations should be small, preferably
+just a few pages, since they must be physically contiguous.  Therefore, a
+rate limited warning will be printed to the console for any kmem_alloc()
+which exceeds a reasonable threshold.
+.sp
+The default warning threshold is set to eight pages but capped at 32K to
+accommodate systems using large pages.  This value was selected to be small
+enough to ensure the largest allocations are quickly noticed and fixed,
+but large enough to avoid logging any warnings when an allocation size is
+larger than optimal but not a serious concern.  Since this value is tunable,
+developers are encouraged to set it lower when testing so any new largish
+allocations are quickly caught.  These warnings may be disabled by setting
+the threshold to zero.
+.sp
+Default value: \fB32,768\fR
 .RE
 
 .sp
 .ne 2
 .na
-\fBmutex_spin_max\fR (int)
+\fBspl_kmem_alloc_max\fR (uint)
 .ad
 .RS 12n
-Spin a maximum of N times to acquire lock
+Large kmem_alloc() allocations will fail if they exceed KMALLOC_MAX_SIZE.
+Allocations which are marginally smaller than this limit may succeed but
+should still be avoided due to the expense of locating a contiguous range
+of free pages.  Therefore, a maximum kmem size with a reasonable safety
+margin of 4x is set.  Any kmem_alloc() allocation larger than this maximum
+will quickly fail.  Any vmem_alloc() allocation less than or equal to this
+value will use kmalloc(); larger ones shift to vmalloc().
+.sp
+Default value: \fBKMALLOC_MAX_SIZE/4\fR
+.RE
+
 .sp
 .ne 2
 .na
-\fBPossible values:\fR
-.sp
+\fBspl_kmem_cache_magazine_size\fR (uint)
+.ad
 .RS 12n
- \fB0\fR		Never spin when trying to acquire lock
+Cache magazines are an optimization designed to minimize the cost of
+allocating memory.  They do this by keeping a per-CPU cache of recently
+freed objects, which can then be reallocated without taking a lock.  This
+can improve performance on highly contended caches.  However, because
+objects in magazines will prevent otherwise empty slabs from being
+immediately released, this may not be ideal for low memory machines.
+.sp
+For this reason \fBspl_kmem_cache_magazine_size\fR can be used to set a
+maximum magazine size.  When this value is set to 0 the magazine size will
+be automatically determined based on the object size.  Otherwise magazines
+will be limited to 2-256 objects per magazine (i.e. per CPU).  Magazines
+may never be entirely disabled in this implementation.
+.sp
+Default value: \fB0\fR
+.RE
+
 .sp
-\fB-1\fR		Spin until acquired or holder yields without dropping lock
+.ne 2
+.na
+\fBspl_hostid\fR (ulong)
+.ad
+.RS 12n
+The system hostid.  When set, this can be used to uniquely identify a system.
+By default this value is set to zero, which indicates the hostid is disabled.
+It can be explicitly enabled by placing a unique non-zero value in
+\fB/etc/hostid\fR.
 .sp
-\fB1-MAX_INT\fR	Spin for N attempts before sleeping for lock
+Default value: \fB0\fR
 .RE
+
+.sp
+.ne 2
+.na
+\fBspl_hostid_path\fR (charp)
+.ad
+.RS 12n
+The expected path to locate the system hostid when specified.  This value
+may be overridden for non-standard configurations.
 .sp
-.ne -4
-Default value: \fB0\fR.
+Default value: \fB/etc/hostid\fR
 .RE
 
 .sp
@@ -132,7 +242,10 @@ Default value: \fB0\fR.
 \fBspl_taskq_thread_bind\fR (int)
 .ad
 .RS 12n
-Bind taskq thread to CPU
+Bind taskq threads to specific CPUs.  When enabled all taskq threads will
+be distributed evenly over the available CPUs.  By default, this behavior
+is disabled to allow the Linux scheduler the maximum flexibility to determine
+where a thread should run.
 .sp
-Default value: \fB0\fR.
+Default value: \fB0\fR
 .RE