git.proxmox.com Git - mirror_zfs-debian.git/commit

author	Ned Bass <bass6@llnl.gov>
	Fri, 13 Jan 2012 21:51:39 +0000 (13:51 -0800)
committer	Brian Behlendorf <behlendorf1@llnl.gov>
	Tue, 17 Jan 2012 16:54:00 +0000 (08:54 -0800)
commit	08d08ebba2247ad404001785a890de4281d0a362
tree	145d8623fa70871c1612311a5ff7b23a7d2a324f
parent	a8783adf24a8c40dcae0fbfa90eb231212f26884

Reduce number of zio free threads

As described in Issues #458 and #258, unlinking large amounts of data
can cause the threads in the zio free wait queue to start spinning.
Reducing the number of z_fr_iss threads from a fixed value of 100 to 1
per CPU significantly reduces contention on the taskq spinlock and
improves throughput.
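
As a rough illustration of the sizing change, the sketch below contrasts
a fixed pool of 100 threads with one thread per online CPU.  It is a
user-space approximation only: free_issue_threads() and
FIXED_FREE_THREADS are hypothetical names, and
sysconf(_SC_NPROCESSORS_ONLN) is an assumed stand-in for however the
kernel code obtains the CPU count.

```c
#include <unistd.h>

/* Hypothetical constant: the old fixed z_fr_iss thread count. */
#define FIXED_FREE_THREADS 100

/*
 * Hypothetical helper: return how many free-issue threads to create.
 * A non-zero per_cpu selects one thread per online CPU; otherwise the
 * old fixed value is returned.  Falls back to 1 if the CPU count
 * cannot be determined.
 */
static long free_issue_threads(int per_cpu)
{
    long ncpus;

    if (!per_cpu)
        return FIXED_FREE_THREADS;

    /* Assumption: user-space query standing in for the kernel's CPU count. */
    ncpus = sysconf(_SC_NPROCESSORS_ONLN);
    return (ncpus > 0) ? ncpus : 1;
}
```

Under these assumptions, free_issue_threads(0) yields the old value of
100, while free_issue_threads(1) yields the online CPU count, which on
most machines is far smaller.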

Instrumenting the taskq code showed that __taskq_dispatch() can spend
a long time holding tq->tq_lock if there are a large number of threads
in the queue.  It turns out the time spent in wake_up() scales
linearly with the number of threads in the queue.  When a large number
of short work items are dispatched, as seems to be the case with
unlink, the worker threads drain the queue faster than the dispatcher
can fill it.  They then all pile into the work wait queue to wait for
new work items.  So if 100 threads are in the queue, wake_up() takes
about 100 times as long, and the woken threads have to spin until the
dispatcher releases the lock.

Reducing the number of threads helps with the symptoms, but doesn't
get to the root of the problem.  It would seem that wake_up()
shouldn't scale linearly in time with queue depth, particularly if we
are only trying to wake up one thread.  In that vein, I tried making
all of the waiting processes exclusive to prevent the scheduler from
iterating over the entire list, but I still saw the linear time
scaling.  So further investigation is needed, but in the meantime
reducing the thread count is an easy workaround.
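
The expectation behind the exclusive-waiter experiment can be
illustrated with a toy user-space model of a wait-queue walk.  This is
not the kernel implementation: struct waiter, fill_queue() and
model_wake_up() are hypothetical names, and the number of entries
visited is only a proxy for the time spent holding the lock.  The model
shows what one would expect to happen, not what was measured; as noted
above, the observed time stayed linear even with exclusive waiters.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a wait-queue entry; not the kernel's struct. */
struct waiter {
    int exclusive;  /* models an exclusive waiter */
    int awake;      /* set once the waiter has been woken */
};

/* Mark all n entries as sleeping, with the given exclusive setting. */
static void fill_queue(struct waiter *q, size_t n, int exclusive)
{
    for (size_t i = 0; i < n; i++) {
        q[i].exclusive = exclusive;
        q[i].awake = 0;
    }
}

/*
 * Walk the queue from the head, waking each entry, and stop once
 * nr_exclusive exclusive waiters have been woken.  Passing 0 for
 * nr_exclusive never triggers the stop, so every entry is woken.
 * Returns the number of entries visited.
 */
static size_t model_wake_up(struct waiter *q, size_t n, int nr_exclusive)
{
    size_t visited = 0;

    assert(q != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        visited++;
        q[i].awake = 1;
        if (q[i].exclusive && --nr_exclusive == 0)
            break;
    }
    return visited;
}
```

In this model, a queue of 100 non-exclusive waiters costs 100 visits
per wake-up, whereas a queue of 100 exclusive waiters costs one visit
when nr_exclusive is 1.  That gap is why the still-linear behavior
reported above points to a cause this simple model does not capture.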

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #258
Issue #458