git.proxmox.com Git - mirror_ubuntu-hirsute-kernel.git/log

KVM: x86 emulator: Use opcode::execute for Group 1, CMPS and SCAS

The following instructions are changed to use opcode::execute.

Group 1 (80-83)
ADD (00-05), OR (08-0D), ADC (10-15), SBB (18-1D), AND (20-25),
SUB (28-2D), XOR (30-35), CMP (38-3D)

CMPS (A6-A7), SCAS (AE-AF)

The last two do the same as CMP in the emulator, so em_cmp() is used.

Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: MMU: Optimize guest page table walk

This patch optimizes the guest page table walk by using get_user()
instead of copy_from_user().

With this patch applied, paging64_walk_addr_generic() has become
about 0.5us to 1.0us faster on my Phenom II machine with NPT on.

Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: SVM: Get rid of x86_intercept_map::valid

By reserving 0 as an invalid x86_intercept_stage, we no longer
need to store a valid flag in x86_intercept_map.

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: Use opcode::execute for 0F 01 opcode

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: Don't force #UD for 0F 01 /5

While it isn't defined, no need to force a #UD. If it becomes defined
in the future this can cause wierd problems for the guest.

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: move 0F 01 sub-opcodes into their own functions

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: fix const value warning on i386 in svm insn RAX check

arch/x86/kvm/emulate.c:2598: warning: integer constant is too large for 'long' type

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: avoid calling wbinvd() macro

Commit 0b56652e33c72092956c651ab6ceb9f0ad081153 fails to build:

CC [M] arch/x86/kvm/emulate.o
arch/x86/kvm/emulate.c: In function 'x86_emulate_insn':
arch/x86/kvm/emulate.c:4095:25: error: macro "wbinvd" passed 1 arguments, but takes just 0
arch/x86/kvm/emulate.c:4095:3: warning: statement with no effect
make[2]: *** [arch/x86/kvm/emulate.o] Error 1
make[1]: *** [arch/x86/kvm] Error 2
make: *** [arch/x86] Error 2

Work around this for now.

Signed-off-by: Clemens Noss <cnoss@gmx.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: ioapic: Fix an error field reference

Function ioapic_debug() in the ioapic_deliver() misnames
one filed by reference. This patch correct it.

Signed-off-by: Liu Yuan <tailai.ly@taobao.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: MMU: Make cmpxchg_gpte aware of nesting too

This patch makes the cmpxchg_gpte() function aware of the
difference between l1-gfns and l2-gfns when nested
virtualization is in use. This fixes a potential
data-corruption problem in the l1-guest and makes the code
work correct (at least as correct as the hardware which is
emulated in this code) again.

Cc: stable@kernel.org
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: drop x86_emulate_ctxt::vcpu

No longer used.

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: Avoid using x86_emulate_ctxt.vcpu

We can use container_of() instead.

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: add new ->wbinvd() callback

Instead of calling kvm_emulate_wbinvd() directly.

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: add ->fix_hypercall() callback

Artificial, but needed to remove direct calls to KVM.

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: add new ->halt() callback

Instead of reaching into vcpu internals.

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: make emulate_invlpg() an emulator callback

Removing direct calls to KVM.

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: emulate CLTS internally

Avoid using ctxt->vcpu; we can do everything with ->get_cr() and ->set_cr().

A side effect is that we no longer activate the fpu on emulated CLTS; but that
should be very rare.

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: Replace calls to is_pae() and is_paging with ->get_cr()

Avoid use of ctxt->vcpu.

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: drop use of is_long_mode()

Requires ctxt->vcpu, which is to be abolished. Replace with open calls
to get_msr().

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: add and use new callbacks set_idt(), set_gdt()

Replacing direct calls to realmode_lgdt(), realmode_lidt().

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: avoid using ctxt->vcpu in check_perm() callbacks

Unneeded for register access.

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: drop vcpu argument from intercept callback

Making the emulator caller agnostic.

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: drop vcpu argument from cr/dr/cpl/msr callbacks

Making the emulator caller agnostic.

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: drop vcpu argument from segment/gdt/idt callbacks

Making the emulator caller agnostic.

[Takuya Yoshikawa: fix typo leading to LDT failures]

Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: drop vcpu argument from pio callbacks

Making the emulator caller agnostic.

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: drop vcpu argument from memory read/write callbacks

Making the emulator caller agnostic.

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: whitespace cleanups

Clean up lines longer than 80 columns. No code changes.

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: emulator: Use linearize() when fetching instructions

Since segments need to be handled slightly differently when fetching
instructions, we add a __linearize helper that accepts a new 'fetch' boolean.

[avi: fix oops caused by wrong segmented_address initialization order]

Signed-off-by: Nelson Elhage <nelhage@ksplice.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: X86: Update last_guest_tsc in vcpu_put

The last_guest_tsc is used in vcpu_load to adjust the
tsc_offset since tsc-scaling is merged. So the
last_guest_tsc needs to be updated in vcpu_put instead of
the the last_host_tsc. This is fixed with this patch.

Reported-by: Jan Kiszka <jan.kiszka@web.de>
Tested-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: SVM: Fix nested sel_cr0 intercept path with decode-assists

This patch fixes a bug in the nested-svm path when
decode-assists is available on the machine. After a
selective-cr0 intercept is detected the rip is advanced
unconditionally. This causes the l1-guest to continue
running with an l2-rip.
This bug was with the sel_cr0 unit-test on decode-assists
capable hardware.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: Handle wraparound in (cs_base + offset) when fetching insns

Currently, setting a large (i.e. negative) base address for %cs does not work on
a 64-bit host. The "JOS" teaching operating system, used by MIT and other
universities, relies on such segments while bootstrapping its way to full
virtual memory management.

Signed-off-by: Nelson Elhage <nelhage@ksplice.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: remove useless function declaration kvm_inject_pit_timer_irqs()

Just remove useless function define kvm_inject_pit_timer_irqs() from
file arch/x86/kvm/i8254.h

Signed-off-by:Duan Jiong<djduanjiong@gmail.com>

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: remove useless function declarations from file arch/x86/kvm/irq.h

Just remove useless function define kvm_pic_clear_isr_ack() and
pit_has_pending_timer()

Signed-off-by: Duan Jiong<djduanjiong@gmail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: Fix off by one in kvm_for_each_vcpu iteration

This patch avoids gcc issuing the following warning when KVM_MAX_VCPUS=1:
warning: array subscript is above array bounds

kvm_for_each_vcpu currently checks to see if the index for the vcpu is
valid /after/ loading it. We don't run into problems because the address
is still inside the enclosing struct kvm and we never deference or write
to it, so this isn't a security issue.

The warning occurs when KVM_MAX_VCPUS=1 because the increment portion of
the loop will *always* cause the loop to load an invalid location since
++idx will always be > 0.

This patch moves the load so that the check occurs before the load and
we don't run into the compiler warning.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: fix push of wrong eip when doing softint

When doing a soft int, we need to bump eip before pushing it to
the stack. Otherwise we'll do the int a second time.

[apw@canonical.com: merged eip update as per Jan's recommendation.]
Signed-off-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: Use em_push() instead of emulate_push()

em_push() is a simple wrapper of emulate_push(). So this patch replaces
emulate_push() with em_push() and removes the unnecessary former.

In addition, the unused ops arguments are removed from emulate_pusha()
and emulate_grp45().

Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: Make emulate_push() store the value directly

PUSH emulation stores the value by calling writeback() after setting
the dst operand appropriately in emulate_push().

This writeback() using dst is not needed at all because we know the
target is the stack. So this patch makes emulate_push() call, newly
introduced, segmented_write() directly.

By this, many inlined writeback()'s are removed.

Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: Disable writeback for CMP emulation

This stops "CMP r/m, reg" to write back the data into memory.
Pointed out by Avi.

The writeback suppression now covers CMP, CMPS, SCAS.

Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: VMX: Ensure that vmx_create_vcpu always returns proper error

In case certain allocations fail, vmx_create_vcpu may return 0 as error
instead of a negative value encoded via ERR_PTR. This causes a NULL
pointer dereferencing later on in kvm_vm_ioctl_vcpu_create.

Reported-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: emulator: do not needlesly sync registers from emulator ctxt to vcpu

Currently we sync registers back and forth before/after exiting
to userspace for IO, but during IO device model shouldn't need to
read/write the registers, so we can as well skip those sync points. The
only exaception is broken vmware backdor interface. The new code sync
registers content during IO only if registers are read from/written to
by userspace in the middle of the IO operation and this almost never
happens in practise.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: x86 emulator: implement segment permission checks

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: move desc_limit_scaled()

For reuse later.

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: move linearize() downwards

So it can call emulate_gp() without forward declarations.

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: pass access size and read/write intent to linearize()

Needed for segment read/write checks.

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: change address linearization to return an error code

Preparing to add segment checks.

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: move invlpg emulation into a function

It's going to get more complicated soon.

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: Add helpers for memory access using segmented addresses

Will help later adding proper segment checks.

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: SVM: Fix fault-rip on vmsave/vmload emulation

When the emulation of vmload or vmsave fails because the
guest passed an unsupported physical address it gets an #GP
with rip pointing to the instruction after vmsave/vmload.
This is a bug and fixed by this patch.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: X86: Implement userspace interface to set virtual_tsc_khz

This patch implements two new vm-ioctls to get and set the
virtual_tsc_khz if the machine supports tsc-scaling. Setting
the tsc-frequency is only possible before userspace creates
any vcpu.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: X86: Delegate tsc-offset calculation to architecture code

With TSC scaling in SVM the tsc-offset needs to be
calculated differently. This patch propagates this
calculation into the architecture specific modules so that
this complexity can be handled there.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: X86: Implement call-back to propagate virtual_tsc_khz

This patch implements a call-back into the architecture code
to allow the propagation of changes to the virtual tsc_khz
of the vcpu.
On SVM it updates the tsc_ratio variable, on VMX it does
nothing.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: X86: Make tsc_delta calculation a function of guest tsc

The calculation of the tsc_delta value to ensure a
forward-going tsc for the guest is a function of the
host-tsc. This works as long as the guests tsc_khz is equal
to the hosts tsc_khz. With tsc-scaling hardware support this
is not longer true and the tsc_delta needs to be calculated
using guest_tsc values.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: X86: Let kvm-clock report the right tsc frequency

This patch changes the kvm_guest_time_update function to use
TSC frequency the guest actually has for updating its clock.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: SVM: Implement infrastructure for TSC_RATE_MSR

This patch enhances the kvm_amd module with functions to
support the TSC_RATE_MSR which can be used to set a given
tsc frequency for the guest vcpu.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: Drop EFER.SVME requirement from VMMCALL

VMMCALL requires EFER.SVME to be enabled in the host, not in the guest, which
is what check_svme() checks.

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: Re-add VendorSpecific tag to VMMCALL insn

VMMCALL needs the VendorSpecific tag so that #UD emulation
(called if a guest running on AMD was migrated to an Intel host)
is allowed to process the instruction.

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: PPC: Fix issue clearing exit timing counters

Following dump is observed on host when clearing the exit timing counters

[root@p1021mds kvm]# echo -n 'c' > vm1200_vcpu0_timing
INFO: task echo:1276 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
echo          D 0ff5bf94     0  1276   1190 0x00000000
Call Trace:
[c2157e40] [c0007908] __switch_to+0x9c/0xc4
[c2157e50] [c040293c] schedule+0x1b4/0x3bc
[c2157e90] [c04032dc] __mutex_lock_slowpath+0x74/0xc0
[c2157ec0] [c00369e4] kvmppc_init_timing_stats+0x20/0xb8
[c2157ed0] [c0036b00] kvmppc_exit_timing_write+0x84/0x98
[c2157ef0] [c00b9f90] vfs_write+0xc0/0x16c
[c2157f10] [c00ba284] sys_write+0x4c/0x90
[c2157f40] [c000e320] ret_from_syscall+0x0/0x3c

        The vcpu->mutex is used by kvm_ioctl_* (KVM_RUN etc) and same was
used when clearing the stats (in kvmppc_init_timing_stats()). What happens
is that when the guest is idle then it held the vcpu->mutx. While the
exiting timing process waits for guest to release the vcpu->mutex and
a hang state is reached.

        Now using seprate lock for exit timing stats.

Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: MMU: remove mmu_seq verification on pte update path

The mmu_seq verification can be removed since we get the pfn in the
protection of mmu_lock.

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: do not open code return values from the emulator

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: Remove base_addresss in kvm_pit since it is unused

The patch below removes unsigned long base_addresss; in i8254.h
since it is unused.

Signed-off-by: Justin P. Mattock <justinmattock@gmail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: SVM: Remove nested sel_cr0_write handling code

This patch removes all the old code which handled the nested
selective cr0 write intercepts. This code was only in place
as a work-around until the instruction emulator is capable
of doing the same. This is the case with this patch-set and
so the code can be removed.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: SVM: Add checks for IO instructions

This patch adds code to check for IOIO intercepts on
instructions decoded by the KVM instruction emulator.

[avi: fix build error due to missing #define D2bvIP]

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: SVM: Add intercept checks for one-byte instructions

This patch add intercept checks for emulated one-byte
instructions to the KVM instruction emulation path.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: SVM: Add intercept checks for remaining twobyte instructions

This patch adds intercepts checks for the remaining twobyte
instructions to the KVM instruction emulator.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: SVM: Add intercept checks for remaining group7 instructions

This patch implements the emulator intercept checks for the
RDTSCP, MONITOR, and MWAIT instructions.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: SVM: Add intercept checks for SVM instructions

This patch adds the necessary code changes in the
instruction emulator and the extensions to svm.c to
implement intercept checks for the svm instructions.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: SVM: Add intercept checks for descriptor table accesses

This patch add intercept checks into the KVM instruction
emulator to check for the 8 instructions that access the
descriptor table addresses.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: SVM: Add intercept check for accessing dr registers

This patch adds the intercept checks for instruction
accessing the debug registers.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: SVM: Add intercept check for emulated cr accesses

This patch adds all necessary intercept checks for
instructions that access the crX registers.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86: Add x86 callback for intercept check

This patch adds a callback into kvm_x86_ops so that svm and
vmx code can do intercept checks on emulated instructions.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: Add flag to check for protected mode instructions

This patch adds a flag for the opcoded to tag instruction
which are only recognized in protected mode. The necessary
check is added too.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: Add check_perm callback

This patch adds a check_perm callback for each opcode into
the instruction emulator. This will be used to do all
necessary permission checks on instructions before checking
whether they are intercepted or not.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: Don't write-back cpu-state on X86EMUL_INTERCEPTED

This patch prevents the changed CPU state to be written back
when the emulator detected that the instruction was
intercepted by the guest.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: add SVM intercepts

Add intercept codes for instructions defined by SVM as
interceptable.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: add framework for instruction intercepts

When running in guest mode, certain instructions can be intercepted by
hardware.  This also holds for nested guests running on emulated
virtualization hardware, in particular instructions emulated by kvm
itself.

This patch adds a framework for intercepting instructions.  If an
instruction is marked for interception, and if we're running in guest
mode, a callback is called to check whether an intercept is needed or
not.  The callback is called at three points in time: immediately after
beginning execution, after checking privilge exceptions, and after
checking memory exception.  This suits the different interception points
defined for different instructions and for the various virtualization
instruction sets.

In addition, a new X86EMUL_INTERCEPT is defined, which any callback or
memory access may define, allowing the more complicated intercepts to be
implemented in existing callbacks.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: implement movdqu instruction (f3 0f 6f, f3 0f 7f)

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: SSE support

Add support for marking an instruction as SSE, switching registers used
to the SSE register file.

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: Specialize decoding for insns with 66/f2/f3 prefixes

Most SIMD instructions use the 66/f2/f3 prefixes to distinguish between
different variants of the same instruction. Usually the encoding is quite
regular, but in some cases (including non-SIMD instructions) the prefixes
generate very different instructions. Examples include XCHG/PAUSE,
MOVQ/MOVDQA/MOVDQU, and MOVBE/CRC32.

Allow the emulator to handle these special cases by splitting such opcodes
into groups, with different decode flags and execution functions for different
prefixes.

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: define callbacks for using the guest fpu within the emulator

Needed for emulating fpu instructions.

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: do not munge rep prefix

Currently we store a rep prefix as 1 or 2 depending on whether it is a REPE or
REPNE. Since sse instructions depend on the prefix value, store it as the
original opcode to simplify things further on.

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: 16-byte mmio support

Since sse instructions can issue 16-byte mmios, we need to support them. We
can't increase the kvm_run mmio buffer size to 16 bytes without breaking
compatibility, so instead we break the large mmios into two smaller 8-byte
ones. Since the bus is 64-bit we aren't breaking any atomicity guarantees.

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: Split mmio completion into a function

Make room for sse mmio completions.

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: extend in-kernel mmio to handle >8 byte transactions

Needed for coalesced mmio using sse.

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86: better fix for race between nmi injection and enabling nmi window

Fix race between nmi injection and enabling nmi window in a simpler way.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

Revert "KVM: Fix race between nmi injection and enabling nmi window"

This reverts commit f86368493ec038218e8663cc1b6e5393cd8e008a.

Simpler fix to follow.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: expose async pf through our standard mechanism

As Avi recently mentioned, the new standard mechanism for exposing features
is KVM_GET_SUPPORTED_CPUID, not spamming CAPs. For some reason async pf
missed that.

So expose async_pf here.

Signed-off-by: Glauber Costa <glommer@redhat.com>
CC: Gleb Natapov <gleb@redhat.com>
CC: Avi Kivity <avi@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: VMX: simplify NMI mask management

Use vmx_set_nmi_mask() instead of open-coding management of
the hardware bit and the software hint (nmi_known_unmasked).

There's a slight change of behaviour when running without
hardware virtual NMI support - we now clear the NMI mask if
NMI delivery faulted in that case as well. This improves
emulation accuracy.

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: SVM: Remove unused svm_features

We use boot_cpu_has now.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: VMX: Use cached VM_EXIT_INTR_INFO in handle_exception

vmx_complete_atomic_exit() cached it for us, so we can use it here.

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: VMX: Don't VMREAD VM_EXIT_INTR_INFO unconditionally

Only read it if we're going to use it later.

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: VMX: Refactor vmx_complete_atomic_exit()

Move the exit reason checks to the front of the function, for early
exit in the common case.

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: VMX: Qualify check for host NMI

Check for the exit reason first; this allows us, later,
to avoid a VMREAD for VM_EXIT_INTR_INFO_FIELD.

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: VMX: Avoid vmx_recover_nmi_blocking() when unneeded

When we haven't injected an interrupt, we don't need to recover
the nmi blocking state (since the guest can't set it by itself).
This allows us to avoid a VMREAD later on.

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: VMX: Cache cpl

We may read the cpl quite often in the same vmexit (instruction privilege
check, memory access checks for instruction and operands), so we gain
a bit if we cache the value.

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: VMX: Optimize vmx_get_cpl()

In long mode, vm86 mode is disallowed, so we need not check for
it. Reading rflags.vm may require a VMREAD, so it is expensive.

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: VMX: Optimize vmx_get_rflags()

If called several times within the same exit, return cached results.

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: Use kvm_get_rflags() and kvm_set_rflags() instead of the raw versions

Some rflags bits are owned by the host, not guest, so we need to use
kvm_get_rflags() to strip those bits away or kvm_set_rflags() to add them
back.

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: cleanup memslot_id function

We can get memslot id from memslot->id directly

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (27 commits)
  slcan: fix ldisc->open retval
  net/usb: mark LG VL600 LTE modem ethernet interface as WWAN
  xfrm: Don't allow esn with disabled anti replay detection
  xfrm: Assign the inner mode output function to the dst entry
  net: dev_close() should check IFF_UP
  vlan: fix GVRP at dismantle time
  netfilter: revert a2361c8735e07322023aedc36e4938b35af31eb0
  netfilter: IPv6: fix DSCP mangle code
  netfilter: IPv6: initialize TOS field in REJECT target module
  IPVS: init and cleanup restructuring
  IPVS: Change of socket usage to enable name space exit.
  netfilter: ebtables: only call xt_compat_add_offset once per rule
  netfilter: fix ebtables compat support
  netfilter: ctnetlink: fix timestamp support for new conntracks
  pch_gbe: support ML7223 IOH
  PCH_GbE : Fixed the issue of checksum judgment
  PCH_GbE : Fixed the issue of collision detection
  NET: slip, fix ldisc->open retval
  be2net: Fixed bugs related to PVID.
  ehea: fix wrongly reported speed and port
  ...

slub: Revert "[PARISC] slub: fix panic with DISCONTIGMEM"

This reverts commit 4a5fa3590f09, which did not allow SLUB to be used
on architectures that use DISCONTIGMEM without compiling NUMA support
without CONFIG_BROKEN also set.

The slub panic that it was intended to prevent is addressed by
d9b41e0b54fd ("[PARISC] set memory ranges in N_NORMAL_MEMORY when
onlined") on parisc so there is no further slub issues with such a
configuration.

The reverts allows SLUB now to be used on such architectures since
there haven't been any reports of additional errors.

Cc: James Bottomley <James.Bottomley@suse.de>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>