git.proxmox.com Git - mirror_ubuntu-focal-kernel.git/commit
hugetlbfs: take read_lock on i_mmap for PMD sharing
author Waiman Long <longman@redhat.com>
Thu, 4 Jun 2020 09:39:00 +0000 (11:39 +0200)
committer Khalid Elmously <khalid.elmously@canonical.com>
Sat, 8 Aug 2020 05:53:12 +0000 (01:53 -0400)
commit 6f21bb93d87093e4581536cc71448882696f613e
tree 6075605b3bb0234d7bbcf202d243e5f7b984f090
parent 5a89117c8aff1d4ae03d3b84c87c8dbe4105f490

BugLink: https://bugs.launchpad.net/bugs/1882039

A customer with large SMP systems (up to 16 sockets) running an
application that uses a large amount of static hugepages (~500-1500GB)
is experiencing random multisecond delays.  These delays were caused by
the long time it took to scan the VMA interval tree with mmap_sem held.

The sharing of huge PMD does not require changes to the i_mmap at all.
Therefore, we can just take the read lock and let other threads
searching for the right VMA share it in parallel.  Once the right VMA is
found, either the PMD lock (2M huge page for x86-64) or the
mm->page_table_lock will be acquired to perform the actual PMD sharing.

Lock contention, if present, will happen in the spinlock.  That is much
better than contention in the rwsem where the time needed to scan the
interval tree is indeterminate.

With this patch applied, the customer is seeing significant performance
improvement over the unpatched kernel.

Link: http://lkml.kernel.org/r/20191107211809.9539-1-longman@redhat.com
Signed-off-by: Waiman Long <longman@redhat.com>
Suggested-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 930668c34408ba983049322e04f13f03b6f1fafa)
Signed-off-by: Gavin Guo <gavin.guo@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Juerg Haefliger <juerg.haefliger@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
mm/hugetlb.c