git.proxmox.com Git - mirror_ubuntu-zesty-kernel.git/commit

KVM: PPC: Book3S HV: Invalidate TLB on radix guest vcpu movement

BugLink: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1675806
With radix, the guest can do TLB invalidations itself using the tlbie
(global) and tlbiel (local) TLB invalidation instructions.  Linux guests
use local TLB invalidations for translations that have only ever been
accessed on one vcpu.  However, that doesn't mean that the translations
have only been accessed on one physical cpu (pcpu) since vcpus can move
around from one pcpu to another.  Thus a tlbiel might leave behind stale
TLB entries on a pcpu where the vcpu previously ran, and if that task
then moves back to that previous pcpu, it could see those stale TLB
entries and thus access memory incorrectly.  The usual symptom of this
is random segfaults in userspace programs in the guest.

To cope with this, we detect when a vcpu is about to start executing on
a thread in a different core from the one it last executed on.  If that
is the case, then we mark the old core as needing a
TLB flush and then send an interrupt to any thread in the core that is
currently running a vcpu from the same guest.  This will get those vcpus
out of the guest, and the first one to re-enter the guest will do the
TLB flush.  The reason for interrupting the vcpus executing on the old
core is to cope with the following scenario:

    CPU 0                   CPU 1                   CPU 4
    (core 0)                (core 0)                (core 1)

    VCPU 0 runs task X      VCPU 1 runs
    core 0 TLB gets
    entries from task X
    VCPU 0 moves to CPU 4
                                                    VCPU 0 runs task X
                                                    Unmap pages of task X
                                                    tlbiel

                            (still VCPU 1)          task X moves to VCPU 1
                            task X runs
                            task X sees stale TLB
                            entries

That is, as soon as the VCPU starts executing on the new core, it
could unmap and tlbiel some page table entries, and then the task
could migrate to one of the VCPUs running on the old core and
potentially see stale TLB entries.
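
The detection step described above can be sketched as a small user-space
model.  This is an illustration only, not the code from this patch: the
names guest_model, vcpu_model, prepare_vcpu and first_thread_of are
hypothetical, four threads per core is assumed to match the diagram
(CPU 0 and CPU 1 on core 0, CPU 4 on core 1), and the model kicks every
thread of the old core rather than only those running a vcpu of the same
guest.

```c
#include <assert.h>
#include <stdbool.h>

#define THREADS_PER_CORE 4   /* assumption: CPU 0-3 are core 0, CPU 4-7 are core 1 */
#define NR_PCPUS         8   /* assumption: two cores in this toy model */

/* Hypothetical per-guest state; this is not the kernel's struct kvm. */
struct guest_model {
    bool need_tlb_flush[NR_PCPUS]; /* only first-thread slots are ever set */
    int  kicks[NR_PCPUS];          /* simulated interrupts sent to each pcpu */
};

/* Hypothetical per-vcpu state. */
struct vcpu_model {
    struct guest_model *guest;
    int prev_pcpu;                 /* -1 until the vcpu has run once */
};

/* Index of the first thread of the core that contains pcpu. */
static int first_thread_of(int pcpu)
{
    return pcpu - (pcpu % THREADS_PER_CORE);
}

/* Called just before the vcpu starts executing on pcpu. */
static void prepare_vcpu(struct vcpu_model *v, int pcpu)
{
    assert(pcpu >= 0 && pcpu < NR_PCPUS);
    int prev = v->prev_pcpu;

    if (prev >= 0 && first_thread_of(prev) != first_thread_of(pcpu)) {
        int base = first_thread_of(prev);

        /* Mark the old core as needing a TLB flush... */
        v->guest->need_tlb_flush[base] = true;

        /* ...and kick its threads so any vcpu running there leaves the guest. */
        for (int i = 0; i < THREADS_PER_CORE; i++)
            v->guest->kicks[base + i]++;
    }
    v->prev_pcpu = pcpu;
}
```

Replaying the diagram in this model, running the vcpu on CPU 0 and then on
CPU 4 marks core 0 for a flush and kicks CPU 1, so VCPU 1 has to leave the
guest and flush before task X can run there.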

Since the TLB is shared between all the threads in a core, we only
use the bit of kvm->arch.need_tlb_flush corresponding to the first
thread in the core.  To ensure that we don't have a window where we
can miss a flush, this moves the clearing of the bit from before the
actual flush to after it.  This way, two threads might both do the
flush, but we prevent the situation where one thread can enter the
guest before the flush is finished.
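
The ordering described above can be sketched as follows.  Again this is a
simplified model and not the code from this patch: flush_model,
enter_guest and flush_core_tlb are hypothetical names, four threads per
core is an assumption, and plain variables stand in for the atomic bit
operations and memory barriers that real concurrent code would need.

```c
#include <assert.h>
#include <stdbool.h>

#define THREADS_PER_CORE 4   /* assumption for this model */
#define NR_PCPUS         8   /* assumption: two cores */

/* Hypothetical state; the flag is indexed by the first thread of each core. */
struct flush_model {
    bool need_tlb_flush[NR_PCPUS];
    int  flush_count[NR_PCPUS];
    bool bit_set_during_flush;     /* records what a concurrent thread would see */
};

/* Index of the first thread of the core that contains pcpu. */
static int first_thread_of(int pcpu)
{
    return pcpu - (pcpu % THREADS_PER_CORE);
}

/* Stand-in for the real flush; notes whether the flag was still set. */
static void flush_core_tlb(struct flush_model *m, int base)
{
    m->bit_set_during_flush = m->need_tlb_flush[base];
    m->flush_count[base]++;
}

/* Called on guest entry for any thread of a core. */
static void enter_guest(struct flush_model *m, int pcpu)
{
    assert(pcpu >= 0 && pcpu < NR_PCPUS);
    int base = first_thread_of(pcpu);   /* the TLB is shared per core */

    if (m->need_tlb_flush[base]) {
        flush_core_tlb(m, base);            /* flush first... */
        m->need_tlb_flush[base] = false;    /* ...clear the bit only afterwards */
    }
    /* guest entry would follow here */
}
```

Because the flag stays set for the whole duration of the flush, a sibling
thread checking it at that moment would flush as well instead of entering
the guest early; the cost is an occasional redundant flush.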

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from commit a29ebeaf5575d03eef178bb87c425a1e46cae1ca)
Signed-off-by: Breno Leitao <breno.leitao@gmail.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

author	Paul Mackerras <paulus@ozlabs.org>
	Tue, 28 Mar 2017 16:54:32 +0000 (13:54 -0300)
committer	Tim Gardner <tim.gardner@canonical.com>
	Tue, 28 Mar 2017 20:17:54 +0000 (14:17 -0600)
commit	0f76efd334974d5e9e339bd32d2c3b1b4c12b8b3
tree	e983e77104b7c8a33cc3a71745dc666a4c92a297
parent	7d5bd15988e0ac8c725b10b2adda7e3c75372004

arch/powerpc/include/asm/kvm_host.h
arch/powerpc/kvm/book3s_hv.c
arch/powerpc/kvm/book3s_hv_rm_mmu.c
arch/powerpc/kvm/book3s_hv_rmhandlers.S