Set the async_context_id field when queuing an async ioctl call
Signed-off-by: Andrew de Quincey <adq@lidskialf.net> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 34cf0081294513bc734896c9051c20ca6c19c3db)
Signed-off-by: Loïc Minier <loic.minier@linaro.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 2aa326be0d2039f51192707bdb2fc935d0e87c21)
Kevin Wolf [Tue, 17 Aug 2010 16:58:55 +0000 (18:58 +0200)]
qemu-img rebase: Open new backing file read-only
We never write to a backing file, so opening rw is useless. It just means that
you can't rebase on top of a file for which you don't have write permissions.
Kevin Wolf [Tue, 3 Aug 2010 14:57:02 +0000 (16:57 +0200)]
virtio-blk: Fix migration of queued requests
in_sg[].iovec and out_sg[].ioved are pointer to (source) host memory and
therefore invalid after migration. When loading the device state we must
create a new mapping on the destination host.
Andrea Arcangeli [Tue, 27 Jul 2010 19:04:36 +0000 (21:04 +0200)]
ide: Avoid canceling IDE DMA
The reason for not actually canceling the I/O is because with
virtualization and lots of VM running, a guest fs may mistake a
overload of the host, as an IDE timeout. So rather than canceling the
I/O, it's safer to wait I/O completion and simulate that the I/O has
completed just before the io cancellation was requested by the
guest. This way if ntfs or an app writes data without checking for
-EIO retval, and it thinks the write has succeeded, it's less likely
to run into troubles. Similar issues for reads.
Furthermore because the DMA operation is splitted into many synchronous
aio_read/write if there's more than one entry in the SG table, without this
patch the DMA would be cancelled in the middle, something we've no idea if it
happens on real hardware too or not. Overall this seems a great risk for zero
gain.
This approach is sure safer than previous code given we can't pretend all guest
fs code out there to check for errors and reply the DMA if it was completed
partially, given a timeout would never materialize on a real harddisk unless
there are defective blocks (and defective blocks are practically only an issue
for reads never for writes in any recent hardware as writing to blocks is the
way to fix them) or the harddisk breaks as a whole.
Signed-off-by: Izik Eidus <ieidus@redhat.com> Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 953844d102f5b682f0835f021f2ed2ad9fb7734c)
bdrv_eject() gets called when a device model opens or closes the tray.
If the block driver implements method bdrv_eject(), that method gets
called. Drivers host_cdrom implements it, and it opens and closes the
physical tray, and nothing else. When a device model opens, then
closes the tray, media changes only if the user actively changes the
physical media while the tray is open. This is matches how physical
hardware behaves.
If the block driver doesn't implement method bdrv_eject(), we do
something quite different: opening the tray severs the connection to
the image by calling bdrv_close(), and closing the tray does nothing.
When the device model opens, then closes the tray, media is gone,
unless the user actively inserts another one while the tray is open,
with a suitable change command in the monitor. This isn't how
physical hardware behaves. Rather inconvenient when programs
"helpfully" eject media to give you a chance to change it. The way
bdrv_eject() behaves here turns that chance into a must, which is not
what these programs or their users expect.
Change the default action not to call bdrv_close(). Instead, note the
tray status in new BlockDriverState member tray_open. Use it in
bdrv_is_inserted().
Arguably, the device models should keep track of tray status
themselves. But this is less invasive.
Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 4be9762adb0947a353e6efef2fed354f69218bfb)
Kevin Wolf [Wed, 28 Jul 2010 09:26:29 +0000 (11:26 +0200)]
block: Fix bdrv_has_zero_init
Assuming that any image on a block device is not properly zero-initialized is
actually wrong: Only raw images have this problem. Any other image format
shouldn't care about it, they initialize everything properly themselves.
linux-user: fix build on hosts not using guest base
Commit 68a1c816868b3e35a1da698af412b29e61b1948a broke qemu on hosts not
using guest base. It uses reserved_va unconditionally in mmap.c. To
avoid to many #ifdef #endif blocks, define RESERVED_VA as either
reserved_va or 0ul, and use it instead of reserved_va, similarly to what
has been done with guest_base/GUEST_BASE.
(cherry picked from commit 18e9ea8a3f36b0a3845e1ac6d8acd180063bed8f)
Blue Swirl [Sun, 25 Jul 2010 20:49:34 +0000 (20:49 +0000)]
Fix -snapshot deleting images on disk change
Block device change command did not copy BDRV_O_SNAPSHOT flag. Thus
the new image did not have this flag and the file got deleted during
opening.
Fix by copying BDRV_O_SNAPSHOT flag.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 199630b62ec7cc5efd6f860ff545b449c7b5cdb8)
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Stefan Weil [Wed, 21 Jul 2010 19:51:51 +0000 (21:51 +0200)]
block: Use error codes from lower levels for error message
"No such file or directory" is a misleading error message
when a user tries to open a file with wrong permissions.
Cc: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Weil <weil@mail.berlios.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit c98ac35d87fbd41618c1f02c64bcd4019e42513e)
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Currently we set them to 512 bytes unless manually specified. Unforuntaly
some brain-dead partitioning tools create unaligned partitions if they
get low enough optiomal I/O size values, so don't report any at all
unless explicitly set.
Bruce Rogers [Wed, 21 Jul 2010 20:32:28 +0000 (14:32 -0600)]
move 'unsafe' to end of caching modes in help
Libvirt parses qemu help output to determine qemu features. In particular
it probes for the following: "cache=writethrough|writeback|none". The
addition of the unsafe cache mode was inserted within this string, as
opposed to being added to the end, which impacted libvirt's probe.
Unbreak libvirt by keeping the existing cache modes intact and add
unsafe to the end.
This problem only manifests itself if a caching mode is explicitly
specified in the libvirt xml, in which case older syntax for caching is
passed to qemu, which it no longer understands.
Signed-off-by: Bruce Rogers <brogers@novell.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 6c6b6ba20a167a89f85606125ee1e10eafef5b33)
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Alex Williamson [Tue, 20 Jul 2010 17:14:22 +0000 (11:14 -0600)]
virtio-blk: Create exit function to unregister savevm
Otherwise we can't migrate after we've removed a virtio block device.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 9d0d3138590c26cee1b1c440db6bcdd1986a5a20)
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
block migration: propagate return value when bdrv_write() returns < 0
Currently block_load() doesn't check return value of bdrv_write(), and
even the destination weren't prepared to execute block migration, it
proceeds and guest boots on the target. This patch fix this issue.
Signed-off-by: Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit b02bea3a85cc939f09aa674a3f1e4f36d418c007)
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
ide/atapi: add support for GET EVENT STATUS NOTIFICATION
The GET EVENT STATUS NOTIFICATION is a mandatory command according
to MMC-3, even if event status notification is not supported.
This patch adds support for this command. It returns NEA ("No Event
Available") with an empty "Supported Event Classes" to show that it
doesn't event support status notification. If asychronous operation is
requested, which requires NCQ support, it returns an error according
to the specifications.
This fixes HAL support on FreeBSD and derivatives, which fill up the
logs every second with:
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 253cb7b9909806b83d73269afb9cf0ab3fa2ce2c)
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Implement a threaded VNC server using the producer-consumer model.
The main thread will push encoding jobs (a list a rectangles to update)
in a queue, and the VNC worker thread will consume that queue and send
framebuffer updates to the output buffer.
The threaded VNC server can be enabled with ./configure --enable-vnc-thread.
If you don't want it, just use ./configure --disable-vnc-thread and a syncrhonous
queue of job will be used (which as exactly the same behavior as the old queue).
If you disable the VNC thread, all thread related code will not be built and there will
be no overhead.
Signed-off-by: Corentin Chary <corentincj@iksaif.net> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
qemu-thread: add qemu_mutex/cond_destroy and qemu_mutex_exit
Add some missing functions in qemu-thread. Currently qemu-thread
is only used for io-thread but it will used by the vnc server soon
and we need those functions instead of calling pthread directly.
Signed-off-by: Corentin Chary <corentincj@iksaif.net> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Profiling with callgrind seems to show that a lot of time is spent
in the palette code (mostly due to memory allocation and qdict to int
conversion).
This patch adds a VncPalette implementation. The palette is stored
in a hash table, like qdict, but which does way less memory allocations,
and doesn't suffer from the QObject overhead.
Signed-off-by: Corentin Chary <corentincj@iksaif.net> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
vnc: tight: specific zlib level and filters for each compression level
Disable png filters for lower compression levels. This should lower
the CPU consumption and reduce encoding time.
This isn't in tight_conf because:
* tight_conf structure must not change, because it's shared with other
tight implementations (libvncserver, etc..).
* it'd exceed the 80 col limit.
* PNG_ macros are only defined if CONFIG_VNC_PNG is defined
Signed-off-by: Corentin Chary <corentincj@iksaif.net> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Introduce a new encoding: VNC_ENCODING_TIGHT_PNG [1] (-269) with a new
tight filter VNC_TIGHT_PNG (0x0A). When the client tells it supports the Tight PNG
encoding, the server will use tight, but will always send encoding pixels using
PNG instead of zlib. If the client also told it support JPEG, then the server can
send JPEG, because PNG will only be used in the cases zlib was used in normal tight.
This encoding was introduced to speed up HTML5 based VNC clients like noVNC [2], but
can also be used on devices like iPhone where PNG can be rendered in hardware.
vnc: tight: add JPEG and gradient subencoding with smooth image detection
Add gradient filter and JPEG compression with an heuristic to detect how
lossy the comppression will be. This code has been adapted from
libvncserver/tight.c.
JPEG support can be enabled/disabled at compile time with --enable-vnc-jpeg
and --disable-vnc-jpeg.
Signed-off-by: Corentin Chary <corentincj@iksaif.net> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Juan Quintela [Mon, 26 Jul 2010 19:38:43 +0000 (21:38 +0200)]
vmstate: add subsections code
This commit adds subsections for each device section.
Subsections is the way to handle information that don't need to be sent
to de destination of a migration because its values are not needed. It is
the way to handle optional information. Notice that only the source can
decide if the information is optional or not. The destination needs to
understand all subsections that it receives to have a sucessful load.
Signed-off-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
The conflicts are due to commit 4fc8d6711aff7a9c11e402c3d77b481609f9f486
that is a fix to the ide_drive_pre_save() function. It reverts both
(and both are reinstantiated later in the series)
Conflicts:
hw/ide/core.c
Signed-off-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Stefan Weil [Thu, 22 Jul 2010 20:15:24 +0000 (22:15 +0200)]
slirp: Remove declarations which are no longer needed
The previous patches replaced u_int8_t, u_int16_t, u_int32_t, u_int64_t
by standard int types from stdint.h,
so we can now remove their declarations which are no longer needed.
Signed-off-by: Stefan Weil <weil@mail.berlios.de> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
mips: Correct MIPS interrupt glue logic for icount
When hw interrupt pending bits in CP0_Cause are set, the CPU should
see the hw interrupt line as active. The CPU may or may not take the
interrupt based on internal state (global irq mask etc) but the glue
logic shouldn't care.
This fixes MIPS external hw interrupts in combination with -icount.
Jan Kiszka [Tue, 13 Jul 2010 12:13:45 +0000 (14:13 +0200)]
scsi: Dequeue requests before invoking completion callback
The request completion callback of the LSI controller may start the next
request that can use the same tag as the completed one. As the latter is
still enqueued at that point, scsi_send_command will complain about the
tag reuse and cancel the completed request. That will cause a double
free later on when the completion path cleans up as well.
Fix this by dequeuing the request before invoking the callback.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
e1000: Fix wrong microwire EEPROM state initialization
This change fixes initialization of e1000's microwire EEPROM internal
state values so that qemu's e1000 emulation works on NetBSD,
which doesn't use Intel's em driver but has its own wm driver
for the Intel i8254x Gigabit Ethernet.
Previously set_eecd() function in e1000.c clears EEPROM internal state
values on SK rising edge during CS==L, but according to FM93C06 EEPROM
(which is MicroWire compatible) data sheet, EEPROM internal status
should be cleared on CS rise edge regardless of SK input:
"... a rising edge on this (CS) signal is required to reset the internal
state-machine to accept a new cycle .."
and nothing should be changed during CS (chip select) is inactive.
Intel's em driver seems to explicitly raise SK output after CS is negated
in em_standby_eeprom() so many other OSes that use Intel's driver
don't have this problem even on the previous e1000.c implementation,
but I can't find any articles that say the MICROWIRE or EEPROM spec
requires such sequence, and actually hardware works fine without it
(i.e. real i82540EM has been working on NetBSD).
This fix also changes initialization to clear each state value in
struct eecd_state individually rather than using memset() against
the whole structre. The old_eecd member stores the last SK and CS
signal levels and it should be preserved even after reset of internal
EEPROM state to detect next signal edges for proper EEPROM emulation.
Jan Kiszka [Fri, 25 Jun 2010 14:56:56 +0000 (16:56 +0200)]
Rework debug exception processing for gdb use
Guest debugging is currently broken under CONFIG_IOTHREAD. The reason is
inconsistent or even lacking signaling the debug events from the source
VCPU to the main loop and the gdbstub.
This patch addresses the issue by pushing this signaling into a
CPUDebugExcpHandler: cpu_debug_handler is registered as first handler,
thus will be executed last after potential breakpoint emulation
handlers. It sets informs the gdbstub about the debug event source,
requests a debug exit of the main loop and stops the current VCPU. This
mechanism works both for TCG and KVM, with and without IO-thread.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Jan Kiszka [Fri, 25 Jun 2010 14:56:53 +0000 (16:56 +0200)]
Fix qemu_wait_io_event processing in io-thread mode
When checking for I/O events in the tcg CPU loop, make sure that we
call qemu_wait_io_event_common for all CPUs, not only the current one.
Otherwise pause_all_vcpus may lock up or run_on_cpu requests may starve.
Rename qemu_wait_io_event to qemu_tcg_wait_io_event at this chance and
purge its argument list as it has no use for it.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Jan Kiszka [Fri, 25 Jun 2010 14:56:52 +0000 (16:56 +0200)]
Fix cpu_exit for tcp_cpu_exec
If a cpu_exit request is pending, ensure that we leave the CPU loop
quickly. For this purpose, keep the global exit_request pending until
we are about to leave tcg_cpu_exec. Also, immediately break out of the
SMP loop if the request is set, do not run till the end of the chain.
This preserves the VCPU scheduling order in SMP mode.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Jan Kiszka [Fri, 25 Jun 2010 14:56:50 +0000 (16:56 +0200)]
Fix cpu_unlink_tb race
If a signal hit after the env->exit_request check but before cpu_exec
updated env->current_tb, cpu_unlink_tb called from the signal hander
will not unlink the current TB. This may leave us stuck in a guest loop
if no further unlink is invoked.
Fix this by reordering current_tb update and exit_request check,
additionally enforcing the correct order via a compiler barrier.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Amit Shah [Wed, 23 Jun 2010 17:19:20 +0000 (22:49 +0530)]
virtio-serial: Fix compat property name
Starting with qemu -M pc-0.12 -device virtio-serial
results in
-device virtio-serial: Property 'virtio-serial-pci.max_nr_ports' not found
The property name 'max_ports' is incorrectly named 'max_nr_ports'. Fix
that.
Also fix the ppc440 machine type bamboo-0.12 which has this typo.
Reported-by: Daniel P. Berrange <berrange@redhat.com> Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Bob Breuer [Tue, 13 Jul 2010 16:05:24 +0000 (11:05 -0500)]
Sparc32: reserve addresses for unimplemented devices on SS-20
Use empty_slot to reserve addresses for several unimplemented devices so they won't fault.
- BPP (parallel port), DBRI (audio), SX (pixel processor), and vsimms (framebuffer)
OBP for SS-20 either assumes these devices exist or probes without expecting faults.
Signed-off-by: Bob Breuer <breuerr@mc.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Anthony Liguori [Wed, 14 Jul 2010 15:58:00 +0000 (10:58 -0500)]
Make default invocation of block drivers safer (v3)
CVE-2008-2004 described a vulnerability in QEMU whereas a malicious user could
trick the block probing code into accessing arbitrary files in a guest. To
mitigate this, we added an explicit format parameter to -drive which disabling
block probing.
Fast forward to today, and the vast majority of users do not use this parameter.
libvirt does not use this by default nor does virt-manager.
Most users want block probing so we should try to make it safer.
This patch adds some logic to the raw device which attempts to detect a write
operation to the beginning of a raw device. If the first 4 bytes happen to
match an image file that has a backing file that we support, it scrubs the
signature to all zeros. If a user specifies an explicit format parameter, this
behavior is disabled.
I contend that while a legitimate guest could write such a signature to the
header, we would behave incorrectly anyway upon the next invocation of QEMU.
This simply changes the incorrect behavior to not involve a security
vulnerability.
I've tested this pretty extensively both in the positive and negative case. I'm
not 100% confident in the block layer's ability to deal with zero sized writes
particularly with respect to the aio functions so some additional eyes would be
appreciated.
Even in the case of a single sector write, we have to make sure to invoked the
completion from a bottom half so just removing the zero sized write is not an
option.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Move the check from virtio_blk_init_pci(), where it protects only
virtio-blk-pci, to virtio_blk_init(). Without that, virtio-blk-s390
initializes without a drive. I figure that can lead to null pointer
dereferences.
Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Stefan Weil [Fri, 9 Jul 2010 18:30:07 +0000 (20:30 +0200)]
qemu-img: Fix copy+paste bug in documentation
Replace rebase by resize in documentation of resize command.
Cc: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Cc: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Weil <weil@mail.berlios.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Amit Shah [Thu, 1 Jul 2010 09:28:17 +0000 (14:58 +0530)]
virtio-serial: Assert for virtio queue ready before virtqueue operations
In addition to the previous fix for calling do_flush_queued_data() only
when the virtqueue is ready, ensure do_flush_queued_data() gets a vq
that's suitably initialised.
Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Sripathi Kodi [Wed, 30 Jun 2010 11:02:00 +0000 (16:32 +0530)]
virtio-9p: Avoid SEGV when log file couldn't be opened
While running in debug mode if 9P server is unable to open the log file
it results in a SEGV deep down in glibc:
Program received signal SIGSEGV, Segmentation fault.
0x008fca8c in fwrite () from /lib/libc.so.6
(gdb) bt
#0 0x008fca8c in fwrite () from /lib/libc.so.6
#1 0x081eb87e in pprint_pdu (pdu=0x89a52e1c)
at /data/sripathi/code/qemu/new/qemu-next-upstream/hw/virtio-9p-debug.c:380
#2 0x0806dad8 in submit_pdu (s=0x897dc008, pdu=0x89a52e1c)
at /data/sripathi/code/qemu/new/qemu-next-upstream/hw/virtio-9p.c:3092
#3 0x0806dc63 in handle_9p_output (vdev=0x897dc008, vq=0x86d8218)
at /data/sripathi/code/qemu/new/qemu-next-upstream/hw/virtio-9p.c:3122
#4 0x081ac728 in virtio_queue_notify (vdev=0x897dc008, n=0)
at /data/sripathi/code/qemu/new/qemu-next-upstream/hw/virtio.c:563
#5 0x08063876 in virtio_ioport_write (opaque=0x86d7b98, addr=16, val=0)
at /data/sripathi/code/qemu/new/qemu-next-upstream/hw/virtio-pci.c:222
#6 0x08063e26 in virtio_pci_config_writew (opaque=0x86d7b98, addr=16, val=0)
at /data/sripathi/code/qemu/new/qemu-next-upstream/hw/virtio-pci.c:357
#7 0x080c881a in ioport_write (index=1, address=49296, data=0) at ioport.c:80
#8 0x080c8d4c in cpu_outw (addr=49296, val=0) at ioport.c:204
#9 0x08073010 in kvm_handle_io (port=49296, data=0xab393000, direction=1, size=2, count=1)
at /data/sripathi/code/qemu/new/qemu-next-upstream/kvm-all.c:735
...
...
This is ugly and misleading. The following patch adds a BUG_ON to catch this
error. With this patch we get an abort message like the following, which makes
it easier to analyze: