Jan Kiszka [Thu, 30 Aug 2012 18:30:00 +0000 (20:30 +0200)]
kvm: i386: Add classic PCI device assignment
This adds PCI device assignment for i386 targets using the classic KVM
interfaces. This version is 100% identical to what is being maintained
in qemu-kvm for several years and is supported by libvirt as well. It is
expected to remain relevant for another couple of years until kernels
without full-features and performance-wise equivalent VFIO support are
obsolete.
A refactoring to-do that should be done in-tree is to model MSI and
MSI-X support via the generic PCI layer, similar to what VFIO is already
doing for MSI-X. This should improve the correctness and clean up the
code from duplicate logic.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Acked-by: Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
Jan Kiszka [Mon, 27 Aug 2012 06:28:40 +0000 (08:28 +0200)]
kvm: i386: Add services required for PCI device assignment
These helpers abstract the interaction of upcoming pci-assign with the
KVM kernel services. Put them under i386 only as other archs will
implement device pass-through via VFIO and not this classic interface.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Acked-by: Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
Jan Kiszka [Mon, 27 Aug 2012 06:28:38 +0000 (08:28 +0200)]
kvm: Introduce kvm_irqchip_update_msi_route
This service allows to update an MSI route without releasing/reacquiring
the associated VIRQ. Will be used by PCI device assignment, later on
likely also by virtio/vhost and VFIO.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Acked-by: Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
The load/store slow path has been broken in e141ab52d:
- We need to move 4 registers for store functions and 3 registers for
load functions and not the reverse.
- According to the s390x calling convention the arguments of a function
should be zero extended. This means that the register shift should be
done with TCG_TYPE_I64 to ensure the higher word is correctly zero
extended when needed.
I am aware that CONFIG_TCG_PASS_AREG0 is being removed and thus that
this patch can be improved, but doing so means it can also be applied to
the 1.1 and 1.2 stable branches.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Alexander Graf <agraf@suse.de>
Jan Kiszka [Mon, 20 Aug 2012 08:55:56 +0000 (10:55 +0200)]
kvm: Clean up irqfd API
No need to expose the fd-based interface, everyone will already be fine
with the more handy EventNotifier variant. Rename the latter to clarify
that we are still talking about irqfds here.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Acked-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
qemu: Use valgrind annotations to mark kvm guest memory as defined
valgrind with kvm produces a big amount of false positives regarding
"Conditional jump or move depends on uninitialised value(s)". This
happens because the guest memory is allocated with qemu_vmalloc which
boils down posix_memalign etc. This function is (correctly) considered
by valgrind as returning undefined memory.
Since valgrind is based on jitting code, it will not be able to see
changes made by the guest to guest memory if this is done by KVM_RUN,
thus keeping most of the guest memory undefined.
Now lots of places in qemu will then use guest memory to change behaviour.
To avoid the flood of these messages, lets declare the whole guest
memory as defined. This will reduce the noise and allows us to see real
problems.
In the future we might want to make this conditional, since there
is actually something that we can use those false positives for:
These messages will point to code that depends on guest memory, so
we can use these backtraces to actually make an audit that is focussed
only at those code places. For normal development we dont want to
see those messages, though.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
fcmp{s,d,q} instructions are supposed to ignore quiet NaN (contrary to
the fcmpe{s,d,q} instructions), but the current code is wrongly setting
the NV exception in that case. Moreover the current code is duplicated:
first the arguments are checked for NaN to generate an exception, and
later in case the comparison is unordered (which can only happens if one
of the argument is a NaN), the same check is done to generate an
exception.
Fix that by calling clear_float_exceptions() followed by
check_ieee_exceptions() as for the other floating point instructions.
Use the _compare_quiet functions for fcmp{s,d,q} and the _compare ones
for fcmpe{s,d,q}. Simplify the flag setting by not clearing a flag that
is set the line just below.
This fix allows the math glibc testsuite to pass.
Cc: Blue Swirl <blauwirbel@gmail.com> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Max Filippov [Thu, 6 Sep 2012 00:36:46 +0000 (04:36 +0400)]
target-xtensa: fix missing errno codes for mingw32
Put the following errno value mappings under #ifdef:
xtensa-semi.c: In function 'errno_h2g':
xtensa-semi.c:113: error: 'ENOTBLK' undeclared (first use in this function)
xtensa-semi.c:113: error: (Each undeclared identifier is reported only once
xtensa-semi.c:113: error: for each function it appears in.)
xtensa-semi.c:113: error: array index in initializer not of integer type
xtensa-semi.c:113: error: (near initialization for 'guest_errno')
xtensa-semi.c:124: error: 'ETXTBSY' undeclared (first use in this function)
xtensa-semi.c:124: error: array index in initializer not of integer type
xtensa-semi.c:124: error: (near initialization for 'guest_errno')
xtensa-semi.c:134: error: 'ELOOP' undeclared (first use in this function)
xtensa-semi.c:134: error: array index in initializer not of integer type
xtensa-semi.c:134: error: (near initialization for 'guest_errno')
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
This change updates the CPU reset sequence to use a common piece of code
that figures out CPU state flags, fixing the problem with MIPS_HFLAG_COP1X
not being set where applicable that causes floating-point MADD family
instructions (and other instructions from the MIPS IV FP subset) to trap.
As compute_hflags is now shared between op_helper.c and translate.c, the
function is now moved to a common header. There are no changes to this
function.
The problem was seen with the 24Kf MIPS32r2 processor in user emulation.
The new approach prevents system and user emulation from diverging -- all
the hflags state is initialized in one place now.
Signed-off-by: Maciej W. Rozycki <macro@codesourcery.com> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Stefan Hajnoczi [Tue, 21 Aug 2012 20:52:08 +0000 (20:52 +0000)]
vhost: Pass device path to vhost_dev_init()
The path to /dev/vhost-net is currently hardcoded in vhost_dev_init().
This needs to be changed so that /dev/vhost-scsi can be used. Pass in
the device path instead of hardcoding it.
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch renames+moves the net_handle_fd_param() caller used to
obtain a file descriptor from either qemu_parse_fd() (the normal case)
or from monitor_get_fd() (migration case) into a generically prefixed
monitor_handle_fd_param() to be used by vhost-scsi code.
Also update net/[socket,tap].c consumers to use the new prefix.
Reported-by: Michael S. Tsirkin <mst@redhat.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Jason Baron [Tue, 4 Sep 2012 20:22:46 +0000 (16:22 -0400)]
pcie_aer: clear cmask for Advanced Error Interrupt Message Number
The Advanced Error Interrupt Message Number (bits 31:27 of the Root
Error Status Register) is updated when the number of msi messages assigned to a
device changes. Migration of windows 7 on q35 chipset failed because the check
in get_pci_config_device() fails due to cmask being set on these bits. Its valid
to update these bits and we must restore this state across migration.
Signed-off-by: Jason Baron <jbaron@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The VMSTATE_PCIE_DEVICE() currently has a 'version_id' set to 2. However,
'version_id' in the above check is 1. And thus we fail to load the pcie device
field. Further the code returns to 'qemu_loadvm_state()' which produces the
error that I saw.
I'm proposing to fix this by simply dropping the 'version_id' field from
VMSTATE_PCIE_DEVICE(). VMSTATE_PCI_DEVICE() defines no such field and further
the vmstate_pcie_device that VMSTATE_PCI_DEVICE() refers to is already
versioned. Thus, any versioning issues could be detected at the vmsd level.
Taking a step back, I think that the 'field->version_id' should be compared
against a saved version number for the field not the 'version_id'. Futhermore,
once vmstate_load_state() is called recursively on another vmsd, the check of:
if (version_id > vmsd->version_id) {
return -EINVAL;
}
Will never fail since version_id is always equal to vmsd->version_id. So I'm
wondering why we aren't storing the vmsd version id of the source in the
migration stream?
This patch also renames the 'name' field of vmstate_pcie_device from:
PCIDevice -> PCIEDevice to differentiate it from vmstate_pci_device.
Signed-off-by: Jason Baron <jbaron@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
I've been using this to get correct indenting with vim
in qemu for a while, but it's a bit easier if we
put the settings in the central place.
Note that
1. you still need to enable 'exrc' and 'secure'
options in your vimrc for these settings to take effect.
2. you can create a .vimrc file if 'exrc' is on but there's
need to bypass this configuration.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Stefan Weil [Sat, 1 Sep 2012 07:30:39 +0000 (09:30 +0200)]
qapi: Fix potential NULL pointer segfault
Report from smatch:
qapi-visit.c:1640 visit_type_BlockdevAction(8) error:
we previously assumed 'obj' could be null (see line 1639)
qapi-visit.c:2432 visit_type_NetClientOptions(8) error:
we previously assumed 'obj' could be null (see line 2431)
Signed-off-by: Stefan Weil <sw@weilnetz.de> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Amos Kong [Fri, 31 Aug 2012 02:56:26 +0000 (10:56 +0800)]
qapi: convert sendkey
Convert 'sendkey' to use QAPI.
QAPI passes key's index of mapping table to qmp_send_key(),
not keycode. So we use help functions to convert key/code to
index of key_defs, and 'index' will be converted to 'keycode'
inside qmp_send_key().
For qmp, QAPI would check invalid key and raise error.
For hmp, invalid key is checked in hmp_send_key().
'send-key' of QMP doesn't support key in hexadecimal format.
Signed-off-by: Amos Kong <akong@redhat.com> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Amos Kong [Fri, 31 Aug 2012 02:56:25 +0000 (10:56 +0800)]
monitor: move key_defs[] table and introduce two help functions
This patch added two help functions to convert key/code to index of
mapping table, those functions will return Q_KEY_CODE_MAX if the
code/key is invalid.
Patch also moved key_defs[] to input.c, and removed useless KeyDef struct.
Key's index in QKeyCode enmu is same as keycode's index in new key_defs[].
Monitor functions were changed to access key_defs[] directly.
key_defs[] is used in do_send_key(), so export key_defs[]. It will be
changed to static in next patch.
Signed-off-by: Amos Kong <akong@redhat.com> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Amos Kong [Fri, 31 Aug 2012 02:56:24 +0000 (10:56 +0800)]
qapi: add the QKeyCode enum
key_defs[] in monitor.c is a mapping table of keys and keycodes,
this patch added a QKeyCode enum. Key's index in the enmu is same
as keycode's index in key_defs[].
Signed-off-by: Amos Kong <akong@redhat.com> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Amos Kong [Fri, 31 Aug 2012 02:56:21 +0000 (10:56 +0800)]
monitor: rename keyname '<' to 'less'
There are many maps of keycode 0x56 in pc-bios/keymaps/*
pc-bios/keymaps/common:less 0x56
pc-bios/keymaps/common:greater 0x56 shift
pc-bios/keymaps/common:bar 0x56 altgr
pc-bios/keymaps/common:brokenbar 0x56 shift altgr
This patch just renamed '<' to 'less', QAPI might add new
variable by adding a prefix to keyname, '$PREFIX_<' is not
available, '$PREFIX_less' is ok.
For compatibility, convert user inputted '<' to 'less'.
Signed-off-by: Amos Kong <akong@redhat.com> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
qxl: Add set_client_capabilities() interface to QXLInterface
This new interface lets spice server inform the guest whether
(a) a client is connected
(b) what capabilities the client has
There is a fixed number (464) of bits reserved for capabilities, and
when the capabilities bits change, the QXL_INTERRUPT_CLIENT interrupt
is generated.
Alon Levy [Wed, 22 Aug 2012 08:16:25 +0000 (11:16 +0300)]
qxl: add QXL_IO_MONITORS_CONFIG_ASYNC
Revision bumped to 4 for new IO support, enabled for spice-server >=
0.11.1. New io enabled if revision is 4. Revision can be set to 4.
[ kraxel: 3 continues to be the default revision. Once we have a new
stable spice-server release and the qemu patches to enable
the new bits merged we'll go flip the switch and make rev4
the default ]
This io calls the corresponding new spice api
spice_qxl_monitors_config_async to let spice-server read a new guest set
monitors config and notify the client.
On migration reissue spice_qxl_monitors_config_async.
RHBZ: 770842
Signed-off-by: Alon Levy <alevy@redhat.com> Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
fixup
Yonit Halperin [Tue, 21 Aug 2012 08:51:59 +0000 (11:51 +0300)]
spice: adding seamless-migration option to the command line
The seamless-migration flag is required in order to identify
whether libvirt supports the new QEVENT_SPICE_MIGRATE_COMPLETED or not
(by default the flag is off).
New libvirt versions that wait for QEVENT_SPICE_MIGRATE_COMPLETED should turn on this flag.
When this flag is off, spice fallbacks to its old migration method, which
can result in data loss.
Yonit Halperin [Tue, 21 Aug 2012 08:51:58 +0000 (11:51 +0300)]
spice: add 'migrated' flag to spice info
The flag is 'true' when spice migration has completed on the src side.
It is needed for a case where libvirt dies before migration completes
and it misses the event QEVENT_SPICE_MIGRATE_COMPLETED.
When libvirt is restored and queries the migration status, it also needs
to query spice and check if its migration has completed.
When migrating, libvirt queries the migration status, and upon migration
completions, it closes the migration src. On the other hand, when
migration is completed, spice transfers data from the src to destination
via the client. This data is required for keeping the spice session
after migration, without suffering from data loss and inconsistencies.
In order to allow this data transfer, we add QEVENT for signaling
libvirt that spice migration has completed, and libvirt needs to wait
for this event before quitting the src process.
Yonit Halperin [Tue, 21 Aug 2012 08:51:55 +0000 (11:51 +0300)]
spice: notify spice server on vm start/stop
Spice server needs to know about the vm state in order to prevent
attempts to write to devices when they are stopped, mainly during
the non-live stage of migration.
Instead, spice will take care of restoring this writes, on the migration
target side, after migration completes.
When parsing its command line parameters, spice aborts when it
finds unexpected values, except for the 'streaming-video' option.
This happens because the parsing of the parameters for this option
is done using the 'name2enum' helper, which does not error out
on unknown values. Using the 'parse_name' helper makes sure we
error out in this case. Looking at git history, the use of
'name2enum' instead of 'parse_name' seems to have been an oversight,
so let's change to that now.
The -net none is important otherwise it seems some events are generated
causing the things to work. When it doesn't work, the guest hangs when
measuring the CPU frequency, after the following line:
[ 0.000000] NR_IRQS:256
Pressing a key on the serial port unblocks it, hinting that the problem
is due to the recent elimination of the 1 second timeout in the main
loop.
The problem is that because init_timer_alarm sets the timer's pending
flag to true, the alarm timer is never armed until after the first time
through the main loop. Thus the bug started when QEMU started testing
the pending flag in qemu_mod_timer (commit 1828be3, more alarm timer
cleanup, 2010-03-10).
But actually, it isn't true at all that a timer is pending when the
alarm timer is created, and the real bug has been latent forever: the
fix is to remove the bogus setting of pending flag.
Reported-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com> Tested-by: Aurelien Jarno <aurelien@aurel32.net> Tested-by: Michael Tokarev <mjt@tls.msk.ru> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Anthony Liguori [Fri, 31 Aug 2012 15:04:54 +0000 (10:04 -0500)]
Merge remote-tracking branch 'kraxel/usb.61' into staging
* kraxel/usb.61:
uas: move transfer kickoff
ehci: Fix interrupt endpoints no longer working
ehci: handle TD deactivation of inflight packets
ehci: add ehci_cancel_queue()
ehci: simplify ehci_state_executing
ehci: Remove unnecessary ehci_flush_qh call
ehci: Schedule async-bh when IAAD bit gets set
ehci: Fix NULL ptr deref when unplugging an USB dev with an iso stream active
usb: unique packet ids
usb: Halt ep queue en cancel pending packets on a packet error
fix info qtree indention
Anthony Liguori [Fri, 31 Aug 2012 15:04:18 +0000 (10:04 -0500)]
Merge remote-tracking branch 'kwolf/for-anthony' into staging
* kwolf/for-anthony:
qemu-iotests: add backing file smaller than image test case
stream: complete early if end of backing file is reached
qed: refuse unaligned zero writes with a backing file
Gerd Hoffmann [Fri, 31 Aug 2012 12:34:19 +0000 (14:34 +0200)]
uas: move transfer kickoff
Kick next scsi transfer from request release callback instead of command
completion callback, otherwise we might get stuck in case scsi_req_unref()
doesn't release the request instantly due to someone else holding a
reference too.
Hans de Goede [Fri, 17 Aug 2012 09:39:17 +0000 (11:39 +0200)]
ehci: simplify ehci_state_executing
ehci_state_executing does not need to check for p->usb_status == USB_RET_ASYNC
or USB_RET_PROCERR, since ehci_execute_complete already does a similar check
and will trigger an assert if either value is encountered.
USB_RET_ASYNC should never be the packet status when execute_complete runs
for obvious reasons, and USB_RET_PROCERR is only used by ehci_state_execute /
ehci_execute not by ehci_state_executing / ehci_execute_complete.
Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Hans de Goede [Thu, 30 Aug 2012 07:55:19 +0000 (09:55 +0200)]
ehci: Schedule async-bh when IAAD bit gets set
After the "ehci: Print a warning when a queue unexpectedly contains packets
on cancel" commit. Under certain reproducable conditions I was getting the
following message: "EHCI: Warning queue not empty on queue reset".
After aprox. 8 hours of debugging I've finally found the cause. The Linux EHCI
driver has an IAAD watchdog, to work around certain EHCI hardware sometimes
not acknowledging the doorbell at all. This watchdog has a timeout of 10 ms,
which is less then the time between 2 runs through the async schedule when
async_stepdown is at its highest value.
Thus the watchdog can trigger, after which Linux clears the IAAD bit and
re-uses the QH. IOW we were not properly detecting the unlink of the qh, due
to us missing (ignoring for more then 10 ms) the IAAD command, which triggered
the warning.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Gerd Hoffmann [Thu, 23 Aug 2012 11:30:13 +0000 (13:30 +0200)]
usb: unique packet ids
This patch adds IDs to usb packets. Those IDs are (a) supposed to be
unique for the lifecycle of a packet (from packet setup until the packet
is either completed or canceled) and (b) stable across migration.
uhci, ohci, ehci and xhci use the guest physical address of the transfer
descriptor for this.
musb needs a different approach because there is no transfer descriptor.
But musb also doesn't support pipelining, so we have never more than one
packet per endpoint in flight. So we go create an ID based on endpoint
and device address.
Hans de Goede [Fri, 17 Aug 2012 13:24:49 +0000 (15:24 +0200)]
usb: Halt ep queue en cancel pending packets on a packet error
For controllers which queue up more then 1 packet at a time, we must halt the
ep queue, and inside the controller code cancel all pending packets on an
error.
There are multiple reasons for this:
1) Guests expect the controllers to halt ep queues on error, so that they
get the opportunity to cancel transfers which the scheduled after the failing
one, before processing continues
2) Not cancelling queued up packets after a failed transfer also messes up
the controller state machine, in the case of EHCI causing the following
assert to trigger: "assert(p->qtdaddr == q->qtdaddr)" at hcd-ehci.c:2075
3) For bulk endpoints with pipelining enabled (redirection to a real USB
device), we must cancel all the transfers after this a failed one so that:
a) If they've completed already, they are not processed further causing more
stalls to be reported, originating from the same failed transfer
b) If still in flight, they are cancelled before the guest does
a clear stall, otherwise the guest and device can loose sync!
Note this patch only touches the ehci and uhci controller changes, since AFAIK
no other controllers actually queue up multiple transfer. If I'm wrong on this
other controllers need to be updated too!
Also note that this patch was heavily tested with the ehci code, where I had
a reproducer for a device causing a transfer to fail. The uhci code is not
tested with actually failing transfers and could do with a thorough review!
Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
The facility to use/unuse vectors dynamically is helpful
for virtio but little else: everyone just seems to use
vectors in their init function.
Avoid clearing msix vector use info on reset and load.
For virtio, clear it explicitly.
This should fix regressions reported with ivshmem - though
I didn't test this, I verified that virtio keeps
working like it did.
Tested-by: Cam Macdonell <cam@cs.ualberta.ca> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Support get/set of new PV EOI MSR, for migration.
Add an optional section for MSR value - send it
out in case MSR was changed from the default value (0).
Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Anthony Liguori [Wed, 29 Aug 2012 14:32:41 +0000 (09:32 -0500)]
target-i386: disable pv eoi to fix migration across QEMU versions
We have a problem with how we handle migration with KVM paravirt features.
We unconditionally enable paravirt features regardless of whether we know how
to migrate them.
We also don't tie paravirt features to specific machine types so an old QEMU on
a new kernel would expose features that never existed.
The 1.2 cycle is over and as things stand, migration is broken. Michael has
another series that adds support for migrating PV EOI and attempts to make it
work correctly for different machine types.
After speaking with Michael on IRC, we agreed to take this patch plus 1 & 4
from his series. This makes sure QEMU can migrate PV EOI if it's enabled, but
does not enable it by default.
This also means that we won't unconditionally enable new features for guests
future proofing us from this happening again in the future.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>