]> git.proxmox.com Git - mirror_qemu.git/log
mirror_qemu.git
9 years agoapic_common: improve readability of apic_reset_common
Denis V. Lunev [Tue, 7 Apr 2015 13:53:52 +0000 (16:53 +0300)]
apic_common: improve readability of apic_reset_common

Replace call of cpu_is_bsp(s->cpu) which really returns
    !!(s->apicbase & MSR_IA32_APICBASE_BSP)
with directly collected value. Due to this the tracepoint
  trace_cpu_get_apic_base((uint64_t)s->apicbase);
will not be hit anymore in apic_reset_common.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Andreas Färber <afaerber@suse.de>
CC: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1428414832-3104-1-git-send-email-den@openvz.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
9 years agokvm: Silence warning from valgrind
Thomas Huth [Mon, 27 Apr 2015 16:59:04 +0000 (18:59 +0200)]
kvm: Silence warning from valgrind

valgrind complains here about uninitialized bytes with the following message:

==17814== Syscall param ioctl(generic) points to uninitialised byte(s)
==17814==    at 0x466A780: ioctl (in /usr/lib64/power8/libc-2.17.so)
==17814==    by 0x100735B7: kvm_vm_ioctl (kvm-all.c:1920)
==17814==    by 0x10074583: kvm_set_ioeventfd_mmio (kvm-all.c:574)

Let's fix it by using a proper struct initializer in kvm_set_ioeventfd_mmio().

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1430153944-24368-1-git-send-email-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
9 years agoMerge remote-tracking branch 'remotes/cohuck/tags/s390x-20150430' into staging
Peter Maydell [Thu, 30 Apr 2015 13:15:56 +0000 (14:15 +0100)]
Merge remote-tracking branch 'remotes/cohuck/tags/s390x-20150430' into staging

First pile of s390x patches for 2.4, including:
- some cleanup patches
- sort most of the s390x devices into categories
- support for the new STSI post handler, used to insert vm name and
  friends
- support for the new MEM_OP ioctl (including access register mode)
  for accessing guest memory

# gpg: Signature made Thu Apr 30 12:56:58 2015 BST using RSA key ID C6F02FAF
# gpg: Good signature from "Cornelia Huck <huckc@linux.vnet.ibm.com>"
# gpg:                 aka "Cornelia Huck <cornelia.huck@de.ibm.com>"

* remotes/cohuck/tags/s390x-20150430:
  kvm: better advice for failed s390x startup
  s390x/kvm: Support access register mode for KVM_S390_MEM_OP ioctl
  s390x/mmu: Use ioctl for reading and writing from/to guest memory
  s390x/kvm: Put vm name, extended name and UUID into STSI322 SYSIB
  linux-headers: update
  s390x/mmu: Use access type definitions instead of magic values
  s390x/ipl: sort into categories
  sclp: sort into categories
  s390-virtio: sort into categories
  virtio-ccw: sort into categories

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9 years agokvm: better advice for failed s390x startup
Cornelia Huck [Thu, 23 Apr 2015 15:03:46 +0000 (17:03 +0200)]
kvm: better advice for failed s390x startup

If KVM_CREATE failed on s390x, we print a hint to enable the switch_amode
kernel parameter. This only applies to old kernels, and only if the
error was -EINVAL. Moreover, with new kernels, the most likely reason
for -EINVAL is that pgstes were not enabled.

Let's update the error message to give a better hint on where things
may need fixing.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
9 years agos390x/kvm: Support access register mode for KVM_S390_MEM_OP ioctl
Alexander Yarygin [Thu, 5 Mar 2015 09:36:48 +0000 (12:36 +0300)]
s390x/kvm: Support access register mode for KVM_S390_MEM_OP ioctl

Access register mode is one of the modes that control dynamic address
translation. In this mode the address space is specified by values of
the access registers. The effective address-space-control element is
obtained from the result of the access register translation. See
the "Access-Register Introduction" section of the chapter 5 "Program
Execution" in "Principles of Operations" for more details.

When the CPU is in AR mode, the s390_cpu_virt_mem_rw() function must
know which access register number to use for address translation.
This patch does several things:
- add new parameter 'uint8_t ar' to that function
- decode ar number from intercepted instructions
- pass the ar number to s390_cpu_virt_mem_rw(), which in turn passes it
to the KVM_S390_MEM_OP ioctl.

Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
9 years agos390x/mmu: Use ioctl for reading and writing from/to guest memory
Thomas Huth [Fri, 6 Feb 2015 14:54:58 +0000 (15:54 +0100)]
s390x/mmu: Use ioctl for reading and writing from/to guest memory

Add code to make use of the new ioctl for reading from / writing to
virtual guest memory. By using the ioctl, the memory accesses are now
protected with the so-called ipte-lock in the kernel.

[CH: moved error message into kvm_s390_mem_op()]
Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
9 years agos390x/kvm: Put vm name, extended name and UUID into STSI322 SYSIB
Ekaterina Tumanova [Tue, 3 Mar 2015 17:35:27 +0000 (18:35 +0100)]
s390x/kvm: Put vm name, extended name and UUID into STSI322 SYSIB

KVM prefills the SYSIB, returned by STSI 3.2.2. This patch allows
userspace to intercept execution, and fill in the values, that are
known to qemu: machine name (8 chars), extended machine name (256
chars), extended machine name encoding (equals 2 for UTF-8) and UUID.

STSI322 qemu handler also finds a highest virtualization level in
level-3 virtualization stack that doesn't support Extended Names
(Ext Name delimiter) and propagates zero Ext Name to all levels below,
because this level is not capable of managing Extended Names of lower
levels.

Signed-off-by: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
9 years agolinux-headers: update
Cornelia Huck [Tue, 31 Mar 2015 14:07:08 +0000 (16:07 +0200)]
linux-headers: update

This updates linux-headers against master 4.1-rc1 (commit
b787f68c36d49bb1d9236f403813641efa74a031).

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
9 years agos390x/mmu: Use access type definitions instead of magic values
Thomas Huth [Thu, 19 Mar 2015 14:04:50 +0000 (15:04 +0100)]
s390x/mmu: Use access type definitions instead of magic values

Since there are now proper definitions for the MMU access type,
let's use them in the s390x MMU code, too, instead of the
hard-to-understand magic values.

Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
9 years agos390x/ipl: sort into categories
Cornelia Huck [Fri, 20 Mar 2015 09:17:08 +0000 (10:17 +0100)]
s390x/ipl: sort into categories

The s390 ipl device has no real home (it's not really a storage device),
so let's sort it into the misc category.

Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
9 years agosclp: sort into categories
Cornelia Huck [Tue, 17 Mar 2015 12:44:39 +0000 (13:44 +0100)]
sclp: sort into categories

Sort the sclp consoles into the input category, just as virtio-serial.
Various other sclp devices don't have an obvious category, sort them
into misc.

Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
9 years agos390-virtio: sort into categories
Cornelia Huck [Tue, 17 Mar 2015 12:43:26 +0000 (13:43 +0100)]
s390-virtio: sort into categories

Sort the various s390-virtio devices into the same categories as their
virtio-pci counterparts.

Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
9 years agovirtio-ccw: sort into categories
Cornelia Huck [Tue, 17 Mar 2015 12:42:03 +0000 (13:42 +0100)]
virtio-ccw: sort into categories

Sort the various virtio-ccw devices into the same categories as their
virtio-pci counterparts.

Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
9 years agoMerge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
Peter Maydell [Thu, 30 Apr 2015 11:04:11 +0000 (12:04 +0100)]
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

- miscellaneous cleanups for TCG (Emilio) and NBD (Bogdan)
- next part in the thread-safe address_space_* saga: atomic access
  to the bounce buffer and the map_clients list, from Fam
- optional support for linking with tcmalloc, also from Fam
- reapplying Peter Crosthwaite's "Respect as_translate_internal
  length clamp" after fixing the SPARC fallout.
- build system fix from Wei Liu
- small acpi-build and ioport cleanup by myself

# gpg: Signature made Wed Apr 29 09:34:00 2015 BST using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream: (22 commits)
  nbd/trivial: fix type cast for ioctl
  translate-all: use bitmap helpers for PageDesc's bitmap
  target-i386: disable LINT0 after reset
  Makefile.target: prepend $libs_softmmu to $LIBS
  milkymist: do not modify libs-softmmu
  configure: Add support for tcmalloc
  exec: Respect as_translate_internal length clamp
  ioport: reserve the whole range of an I/O port in the AddressSpace
  ioport: loosen assertions on emulation of 16-bit ports
  ioport: remove wrong comment
  ide: there is only one data port
  gus: clean up MemoryRegionPortio
  sb16: remove useless mixer_write_indexw
  sun4m: fix slavio sysctrl and led register sizes
  acpi-build: remove dependency from ram_addr.h
  memory: add memory_region_ram_resize
  dma-helpers: Fix race condition of continue_after_map_failure and dma_aio_cancel
  exec: Notify cpu_register_map_client caller if the bounce buffer is available
  exec: Protect map_client_list with mutex
  linux-user, bsd-user: Remove two calls to cpu_exec_init_all
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9 years agoMerge remote-tracking branch 'remotes/jnsnow/tags/ide-pull-request' into staging
Peter Maydell [Thu, 30 Apr 2015 09:10:31 +0000 (10:10 +0100)]
Merge remote-tracking branch 'remotes/jnsnow/tags/ide-pull-request' into staging

# gpg: Signature made Wed Apr 29 00:03:44 2015 BST using RSA key ID AAFC390E
# gpg: Good signature from "John Snow (John Huston) <jsnow@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: FAEB 9711 A12C F475 812F  18F2 88A9 064D 1835 61EB
#      Subkey fingerprint: F9B7 ABDB BCAC DF95 BE76  CBD0 7DEF 8106 AAFC 390E

* remotes/jnsnow/tags/ide-pull-request:
  qtest: Add assertion that required environment variable is set
  qtest/ahci: add flush retry test
  libqos: add blkdebug_prepare_script
  libqtest: add qmp_async
  libqtest: add qmp_eventwait
  qtest/ahci: Allow override of default CLI options
  qtest/ahci: Add simple flush test
  qtest/ahci: test different disk sectors
  qtest/ahci: add qcow2 support to ahci-test
  fdc: remove sparc sun4m mutations

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9 years agonbd/trivial: fix type cast for ioctl
Bogdan Purcareata [Fri, 3 Apr 2015 11:01:54 +0000 (11:01 +0000)]
nbd/trivial: fix type cast for ioctl

This fixes ioctl behavior on powerpc e6500 platforms with 64bit kernel and 32bit
userspace. The current type cast has no effect there and the value passed to the
kernel is still 0. Probably an issue related to the compiler, since I'm assuming
the same configuration works on a similar setup on x86.

Also ensure consistency with previous type cast in TRACE message.

Signed-off-by: Bogdan Purcareata <bogdan.purcareata@freescale.com>
Message-Id: <1428058914-32050-1-git-send-email-bogdan.purcareata@freescale.com>
Cc: qemu-stable@nongnu.org
[Fix parens as noticed by Michael. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
9 years agotranslate-all: use bitmap helpers for PageDesc's bitmap
Emilio G. Cota [Wed, 22 Apr 2015 21:50:52 +0000 (17:50 -0400)]
translate-all: use bitmap helpers for PageDesc's bitmap

Here we have an open-coded byte-based bitmap implementation.
Get rid of it since there's a ulong-based implementation to be
used by all code.

Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
9 years agotarget-i386: disable LINT0 after reset
Nadav Amit [Sun, 12 Apr 2015 23:32:08 +0000 (02:32 +0300)]
target-i386: disable LINT0 after reset

Due to old Seabios bug, QEMU reenable LINT0 after reset. This bug is long gone
and therefore this hack is no longer needed.  Since it violates the
specifications, it is removed.

Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Message-Id: <1428881529-29459-2-git-send-email-namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
9 years agoMakefile.target: prepend $libs_softmmu to $LIBS
Wei Liu [Mon, 9 Mar 2015 14:54:33 +0000 (14:54 +0000)]
Makefile.target: prepend $libs_softmmu to $LIBS

I discovered a problem when trying to build QEMU statically with gcc.
libm is an element of LIBS while libpixman-1 is an element in
libs_softmmu. Libpixman references functions in libm, so the original
ordering makes linking fail.

This fix is to reorder $libs_softmmu and $LIBS to make -lm appear after
-lpixman-1. However I'm not quite sure if this is the right fix, hence
the RFC tag.

Normally QEMU is built with c++ compiler which happens to link in libm
(at least this is the case with g++), so building QEMU statically
normally just works and nobody notices this issue.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Message-Id: <1425912873-21215-1-git-send-email-wei.liu2@citrix.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
9 years agomilkymist: do not modify libs-softmmu
Paolo Bonzini [Tue, 10 Mar 2015 10:17:48 +0000 (11:17 +0100)]
milkymist: do not modify libs-softmmu

This is better and prepares for the next patch.  When we copy
libs_softmmu's value into LIBS with a := assignment, we cannot
anymore modify libs_softmmu in the Makefiles.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
9 years agoconfigure: Add support for tcmalloc
Fam Zheng [Thu, 26 Mar 2015 03:03:12 +0000 (11:03 +0800)]
configure: Add support for tcmalloc

This adds "--enable-tcmalloc" and "--disable-tcmalloc" to allow linking
to libtcmalloc from gperftools.

tcmalloc is a malloc implementation that works well with threads and is
fast, so it is good for performance.

It is disabled by default, because the MALLOC_PERTURB_ flag we use in
tests doesn't work with tcmalloc. However we can enable tcmalloc
specific heap checker and profilers later.

An IOPS gain can be observed with virtio-blk-dataplane, other parts of
QEMU will directly benefit from it as well:

==========================================================
                       glibc malloc
----------------------------------------------------------
rw         bs         iodepth    bw     iops       latency
read       4k         1          150    38511      24
----------------------------------------------------------

==========================================================
                         tcmalloc
----------------------------------------------------------
rw         bs         iodepth    bw     iops       latency
read       4k         1          156    39969      23
----------------------------------------------------------

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <1427338992-27057-1-git-send-email-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
9 years agoqtest: Add assertion that required environment variable is set
Ed Maste [Tue, 28 Apr 2015 19:27:51 +0000 (15:27 -0400)]
qtest: Add assertion that required environment variable is set

Signed-off-by: Ed Maste <emaste@freebsd.org>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1427911244-22565-1-git-send-email-emaste@freebsd.org
Signed-off-by: John Snow <jsnow@redhat.com>
9 years agoqtest/ahci: add flush retry test
John Snow [Tue, 28 Apr 2015 19:27:51 +0000 (15:27 -0400)]
qtest/ahci: add flush retry test

Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1426018503-821-7-git-send-email-jsnow@redhat.com

9 years agolibqos: add blkdebug_prepare_script
John Snow [Tue, 28 Apr 2015 19:27:51 +0000 (15:27 -0400)]
libqos: add blkdebug_prepare_script

Pull this helper out of ide-test and into libqos,
to be shared with ahci-test.

Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1426018503-821-6-git-send-email-jsnow@redhat.com

9 years agolibqtest: add qmp_async
John Snow [Tue, 28 Apr 2015 19:27:51 +0000 (15:27 -0400)]
libqtest: add qmp_async

Add qmp_async, which lets us send QMP commands asynchronously.
This is useful when we want to send commands that will trigger
event responses, but we don't know in what order to expect them.

Sometimes the event responses may arrive even before the command
confirmation will show up, so it is convenient to leave the responses
in the stream.

Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1426018503-821-5-git-send-email-jsnow@redhat.com

9 years agolibqtest: add qmp_eventwait
John Snow [Tue, 28 Apr 2015 19:27:51 +0000 (15:27 -0400)]
libqtest: add qmp_eventwait

Allow the user to poll until a desired interrupt occurs.

Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1426018503-821-4-git-send-email-jsnow@redhat.com

9 years agoqtest/ahci: Allow override of default CLI options
John Snow [Tue, 28 Apr 2015 19:27:51 +0000 (15:27 -0400)]
qtest/ahci: Allow override of default CLI options

Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1426018503-821-3-git-send-email-jsnow@redhat.com

9 years agoqtest/ahci: Add simple flush test
John Snow [Tue, 28 Apr 2015 19:27:51 +0000 (15:27 -0400)]
qtest/ahci: Add simple flush test

Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1426018503-821-2-git-send-email-jsnow@redhat.com

9 years agoqtest/ahci: test different disk sectors
John Snow [Tue, 28 Apr 2015 19:27:51 +0000 (15:27 -0400)]
qtest/ahci: test different disk sectors

Test sector offset 0, 1, and the last sector(s)
in LBA28 and LBA48 modes.

Signed-off-by: John Snow <jsnow@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1426274523-22661-3-git-send-email-jsnow@redhat.com

9 years agoqtest/ahci: add qcow2 support to ahci-test
John Snow [Tue, 28 Apr 2015 19:27:51 +0000 (15:27 -0400)]
qtest/ahci: add qcow2 support to ahci-test

This will enable the testing of high offsets without
wasting a lot of disk space, and does not impact the
previous tests.

mkimg and mkqcow2 are added to libqos for other tests.

Signed-off-by: John Snow <jsnow@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1426274523-22661-2-git-send-email-jsnow@redhat.com

9 years agofdc: remove sparc sun4m mutations
Hervé Poussineau [Tue, 28 Apr 2015 19:27:51 +0000 (15:27 -0400)]
fdc: remove sparc sun4m mutations

They were introduced in 6f7e9aec5eb5bdfa57a9e458e391b785c283a007 and
82407d1a4035e5bfefb53ffdcb270872f813b34c and lots of bug fixes were done after that.

This fixes (at least) the detection of the floppy controller on Debian 4.0r9/SPARC,
and SS-5's OBP initialization routine still works.

Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-id: 1426351846-6497-1-git-send-email-hpoussin@reactos.org
Signed-off-by: John Snow <jsnow@redhat.com>
9 years agoMerge remote-tracking branch 'remotes/awilliam/tags/vfio-update-20150428.0' into...
Peter Maydell [Tue, 28 Apr 2015 17:58:15 +0000 (18:58 +0100)]
Merge remote-tracking branch 'remotes/awilliam/tags/vfio-update-20150428.0' into staging

VFIO updates
 - Correction to BAR overflow
 - Fix error sign
 - Reset workaround for AMD Bonaire & Hawaii GPUs

# gpg: Signature made Tue Apr 28 18:26:43 2015 BST using RSA key ID 3BB08B22
# gpg: Good signature from "Alex Williamson <alex.williamson@redhat.com>"
# gpg:                 aka "Alex Williamson <alex@shazbot.org>"
# gpg:                 aka "Alex Williamson <alwillia@redhat.com>"
# gpg:                 aka "Alex Williamson <alex.l.williamson@gmail.com>"

* remotes/awilliam/tags/vfio-update-20150428.0:
  vfio-pci: Reset workaround for AMD Bonaire and Hawaii GPUs
  vfio-pci: Fix error path sign
  vfio-pci: Further fix BAR size overflow

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9 years agovfio-pci: Reset workaround for AMD Bonaire and Hawaii GPUs
Alex Williamson [Tue, 28 Apr 2015 17:14:02 +0000 (11:14 -0600)]
vfio-pci: Reset workaround for AMD Bonaire and Hawaii GPUs

Somehow these GPUs manage not to respond to a PCI bus reset, removing
our primary mechanism for resetting graphics cards.  The result is
that these devices typically work well for a single VM boot.  If the
VM is rebooted or restarted, the guest driver is not able to init the
card from the dirty state, resulting in a blue screen for Windows
guests.

The workaround is to use a device specific reset.  This is not 100%
reliable though since it depends on the incoming state of the device,
but it substantially improves the usability of these devices in a VM.

Credit to Alex Deucher <alexander.deucher@amd.com> for his guidance.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
9 years agovfio-pci: Fix error path sign
Alex Williamson [Tue, 28 Apr 2015 17:14:02 +0000 (11:14 -0600)]
vfio-pci: Fix error path sign

This is an impossible error path due to the fact that we're reading a
kernel provided, rather than user provided link, which will certainly
always fit in PATH_MAX.  Currently it returns a fixed 26 char path
plus %d group number, which typically maxes out at double digits.
However, the caller of the initfn certainly expects a less-than zero
return value on error, not just a non-zero value.  Therefore we
should correct the sign here.

Reported-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
9 years agovfio-pci: Further fix BAR size overflow
Alex Williamson [Tue, 28 Apr 2015 17:14:02 +0000 (11:14 -0600)]
vfio-pci: Further fix BAR size overflow

In an analysis by Laszlo, the resulting type of our calculation for
the end of the MSI-X table, and thus the start of memory after the
table, is uint32_t.  We're therefore not correctly preventing the
corner case overflow that we intended to fix here where a BAR >=4G
could place the MSI-X table to end exactly at the 4G boundary.  The
MSI-X table offset is defined by the hardware spec to 32bits, so we
simply use a cast rather than changing data structure types.  This
scenario is purely theoretically, typically the MSI-X table is located
at the front of the BAR.

Reported-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
9 years agoMerge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Peter Maydell [Tue, 28 Apr 2015 15:55:03 +0000 (16:55 +0100)]
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging

Block patches

# gpg: Signature made Tue Apr 28 15:35:05 2015 BST using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"

* remotes/kevin/tags/for-upstream: (76 commits)
  block: move I/O request processing to block/io.c
  block: extract bdrv_setup_io_funcs()
  block: add bdrv_set_dirty()/bdrv_reset_dirty() to block_int.h
  block: replace bdrv_states iteration with bdrv_next()
  vmdk: Widen before shifting 32 bit header field
  block/dmg: make it modular
  block/mirror: Always call block_job_sleep_ns()
  iotests: add incremental backup granularity tests
  iotests: add incremental backup failure recovery test
  iotests: add simple incremental backup case
  iotests: add QMP event waiting queue
  iotests: add invalid input incremental backup tests
  hbitmap: truncate tests
  block: Resize bitmaps on bdrv_truncate
  block: Ensure consistent bitmap function prototypes
  block: add BdrvDirtyBitmap documentation
  qmp: Add dirty bitmap status field in query-block
  qmp: add block-dirty-bitmap-clear
  qmp: Add support of "dirty-bitmap" sync mode for drive-backup
  block: Add bitmap successors
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9 years agoblock: move I/O request processing to block/io.c
Stefan Hajnoczi [Tue, 28 Apr 2015 13:27:52 +0000 (14:27 +0100)]
block: move I/O request processing to block/io.c

The block.c file has grown to over 6000 lines.  It is time to split this
file so there are fewer conflicts and the code is easier to maintain.

Extract I/O request processing code:
 * Read
 * Write
 * Zero writes and making the image empty
 * Flush
 * Discard
 * ioctl
 * Tracked requests and queuing
 * Throttling and copy-on-read
 * Block status and allocated functions
 * Refreshing block limits
 * Reading/writing vmstate
 * qemu_blockalign() and friends

The patch simply moves code from block.c into block/io.c.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoblock: extract bdrv_setup_io_funcs()
Stefan Hajnoczi [Tue, 28 Apr 2015 13:27:51 +0000 (14:27 +0100)]
block: extract bdrv_setup_io_funcs()

Move the code to install coroutine and aio emulation function pointers
in a BlockDriver to its own function.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoblock: add bdrv_set_dirty()/bdrv_reset_dirty() to block_int.h
Stefan Hajnoczi [Tue, 28 Apr 2015 13:27:50 +0000 (14:27 +0100)]
block: add bdrv_set_dirty()/bdrv_reset_dirty() to block_int.h

The dirty bitmap functions are called from the block I/O processing
code.  Make them visible to block_int.h users so they can be used
outside block.c.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoblock: replace bdrv_states iteration with bdrv_next()
Stefan Hajnoczi [Tue, 28 Apr 2015 13:27:49 +0000 (14:27 +0100)]
block: replace bdrv_states iteration with bdrv_next()

The bdrv_states list is a static variable in block.c.

bdrv_drain_all() and bdrv_flush_all() use this variable to iterate over
all drives.

The next patch will move bdrv_drain_all() and bdrv_flush_all() out of
block.c so it's necessary to switch to the public bdrv_next() interface.

Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agovmdk: Widen before shifting 32 bit header field
Fam Zheng [Mon, 27 Apr 2015 14:23:01 +0000 (22:23 +0800)]
vmdk: Widen before shifting 32 bit header field

Coverity spotted this.

The field is 32 bits, but if it's possible to overflow in 32 bit
left shift.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoblock/dmg: make it modular
Michael Tokarev [Mon, 27 Apr 2015 11:51:56 +0000 (14:51 +0300)]
block/dmg: make it modular

dmg can optionally utilize libbz2, make it modular

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoblock/mirror: Always call block_job_sleep_ns()
Max Reitz [Mon, 27 Apr 2015 11:07:31 +0000 (13:07 +0200)]
block/mirror: Always call block_job_sleep_ns()

The mirror block job is trying to take a clever shortcut if delay_ns is
0 and skips block_job_sleep_ns() in that case. But that function must be
called in every block job iteration, because otherwise it is for example
impossible to pause the job.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoiotests: add incremental backup granularity tests
John Snow [Fri, 17 Apr 2015 23:50:09 +0000 (19:50 -0400)]
iotests: add incremental backup granularity tests

Test what happens if you fiddle with the granularity.

Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1429314609-29776-22-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoiotests: add incremental backup failure recovery test
John Snow [Fri, 17 Apr 2015 23:50:08 +0000 (19:50 -0400)]
iotests: add incremental backup failure recovery test

Test the failure case for incremental backups.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1429314609-29776-21-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoiotests: add simple incremental backup case
John Snow [Fri, 17 Apr 2015 23:50:07 +0000 (19:50 -0400)]
iotests: add simple incremental backup case

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1429314609-29776-20-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoiotests: add QMP event waiting queue
John Snow [Fri, 17 Apr 2015 23:50:06 +0000 (19:50 -0400)]
iotests: add QMP event waiting queue

A filter is added to allow callers to request very specific
events to be pulled from the event queue, while leaving undesired
events still in the stream.

This allows us to poll for completion data for multiple asynchronous
events in any arbitrary order.

A new timeout context is added to the qmp pull_event method's
wait parameter to allow tests to fail if they do not complete
within some expected period of time.

Also fixed is a bug in qmp.pull_event where we try to retrieve an event
from an empty list if we attempt to retrieve an event with wait=False
but no events have occurred.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1429314609-29776-19-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoiotests: add invalid input incremental backup tests
John Snow [Fri, 17 Apr 2015 23:50:05 +0000 (19:50 -0400)]
iotests: add invalid input incremental backup tests

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1429314609-29776-18-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agohbitmap: truncate tests
John Snow [Fri, 17 Apr 2015 23:50:04 +0000 (19:50 -0400)]
hbitmap: truncate tests

The general approach is to set bits close to the boundaries of
where we are truncating and ensure that everything appears to
have gone OK.

We test growing and shrinking by different amounts:
- Less than the granularity
- Less than the granularity, but across a boundary
- Less than sizeof(unsigned long)
- Less than sizeof(unsigned long), but across a ulong boundary
- More than sizeof(unsigned long)

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1429314609-29776-17-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoblock: Resize bitmaps on bdrv_truncate
John Snow [Fri, 17 Apr 2015 23:50:03 +0000 (19:50 -0400)]
block: Resize bitmaps on bdrv_truncate

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1429314609-29776-16-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoblock: Ensure consistent bitmap function prototypes
John Snow [Fri, 17 Apr 2015 23:50:02 +0000 (19:50 -0400)]
block: Ensure consistent bitmap function prototypes

We often don't need the BlockDriverState for functions
that operate on bitmaps. Remove it.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1429314609-29776-15-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoblock: add BdrvDirtyBitmap documentation
John Snow [Fri, 17 Apr 2015 23:50:01 +0000 (19:50 -0400)]
block: add BdrvDirtyBitmap documentation

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1429314609-29776-14-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoqmp: Add dirty bitmap status field in query-block
John Snow [Fri, 17 Apr 2015 23:50:00 +0000 (19:50 -0400)]
qmp: Add dirty bitmap status field in query-block

Add the "frozen" status booleans, to inform clients
when a bitmap is occupied doing a task.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1429314609-29776-13-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoqmp: add block-dirty-bitmap-clear
John Snow [Fri, 17 Apr 2015 23:49:59 +0000 (19:49 -0400)]
qmp: add block-dirty-bitmap-clear

Add bdrv_clear_dirty_bitmap and a matching QMP command,
qmp_block_dirty_bitmap_clear that enables a user to reset
the bitmap attached to a drive.

This allows us to reset a bitmap in the event of a full
drive backup.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1429314609-29776-12-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoqmp: Add support of "dirty-bitmap" sync mode for drive-backup
John Snow [Fri, 17 Apr 2015 23:49:58 +0000 (19:49 -0400)]
qmp: Add support of "dirty-bitmap" sync mode for drive-backup

For "dirty-bitmap" sync mode, the block job will iterate through the
given dirty bitmap to decide if a sector needs backup (backup all the
dirty clusters and skip clean ones), just as allocation conditions of
"top" sync mode.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1429314609-29776-11-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoblock: Add bitmap successors
John Snow [Fri, 17 Apr 2015 23:49:57 +0000 (19:49 -0400)]
block: Add bitmap successors

A bitmap successor is an anonymous BdrvDirtyBitmap that is intended to
be created just prior to a sensitive operation (e.g. Incremental Backup)
that can either succeed or fail, but during the course of which we still
want a bitmap tracking writes.

On creating a successor, we "freeze" the parent bitmap which prevents
its deletion, enabling, anonymization, or creating a bitmap with the
same name.

On success, the parent bitmap can "abdicate" responsibility to the
successor, which will inherit its name. The successor will have been
tracking writes during the course of the backup operation. The parent
will be safely deleted.

On failure, we can "reclaim" the successor from the parent, unifying
them such that the resulting bitmap describes all writes occurring since
the last successful backup, for instance. Reclamation will thaw the
parent, but not explicitly re-enable it.

BdrvDirtyBitmap operations that target a single bitmap are protected
by assertions that the bitmap is not frozen and/or disabled.

BdrvDirtyBitmap operations that target a group of bitmaps, such as
bdrv_{set,reset}_dirty will ignore frozen/disabled drives with a
conditional instead.

Internal functions that enable/disable dirty bitmaps have assertions
added to them to prevent modifying frozen bitmaps.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1429314609-29776-10-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoblock: Add bitmap disabled status
John Snow [Fri, 17 Apr 2015 23:49:56 +0000 (19:49 -0400)]
block: Add bitmap disabled status

Add a status indicating the enabled/disabled state of the bitmap.
A bitmap is by default enabled, but you can lock the bitmap into
a read-only state by setting disabled = true.

A previous version of this patch added a QMP interface for changing
the state of the bitmap, but it has since been removed for now until
a use case emerges where this state must be revealed to the user.

The disabled state WILL be used internally for bitmap migration and
bitmap persistence.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1429314609-29776-9-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agohbitmap: add hbitmap_merge
John Snow [Fri, 17 Apr 2015 23:49:55 +0000 (19:49 -0400)]
hbitmap: add hbitmap_merge

We add a bitmap merge operation to assist in error cases
where we wish to combine two bitmaps together.

This is algorithmically O(bits) provided HBITMAP_LEVELS remains
constant. For a full bitmap on a 64bit machine:
sum(bits/64^k, k, 0, HBITMAP_LEVELS) ~= 1.01587 * bits

We may be able to improve running speed for particularly sparse
bitmaps by using iterators, but the running time for dense maps
will be worse.

We present the simpler solution first, and we can refine it later
if needed.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1429314609-29776-8-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agohbitmap: cache array lengths
John Snow [Fri, 17 Apr 2015 23:49:54 +0000 (19:49 -0400)]
hbitmap: cache array lengths

As a convenience: between incremental backups, bitmap migrations
and bitmap persistence we seem to need to recalculate these a lot.

Because the lengths are a little bit-twiddly, let's just solidly
cache them and be done with it.

Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1429314609-29776-7-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoblock: Introduce bdrv_dirty_bitmap_granularity()
John Snow [Fri, 17 Apr 2015 23:49:53 +0000 (19:49 -0400)]
block: Introduce bdrv_dirty_bitmap_granularity()

This returns the granularity (in bytes) of dirty bitmap,
which matches the QMP interface and the existing query
interface.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1429314609-29776-6-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoqmp: Add block-dirty-bitmap-add and block-dirty-bitmap-remove
John Snow [Fri, 17 Apr 2015 23:49:52 +0000 (19:49 -0400)]
qmp: Add block-dirty-bitmap-add and block-dirty-bitmap-remove

The new command pair is added to manage a user created dirty bitmap. The
dirty bitmap's name is mandatory and must be unique for the same device,
but different devices can have bitmaps with the same names.

The granularity is an optional field. If it is not specified, we will
choose a default granularity based on the cluster size if available,
clamped to between 4K and 64K to mirror how the 'mirror' code was
already choosing granularity. If we do not have cluster size info
available, we choose 64K. This code has been factored out into a helper
shared with block/mirror.

This patch also introduces the 'block_dirty_bitmap_lookup' helper,
which takes a device name and a dirty bitmap name and validates the
lookup, returning NULL and setting errp if there is a problem with
either field. This helper will be re-used in future patches in this
series.

The types added to block-core.json will be re-used in future patches
in this series, see:
'qapi: Add transaction support to block-dirty-bitmap-{add, enable, disable}'

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1429314609-29776-5-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoqmp: Ensure consistent granularity type
John Snow [Fri, 17 Apr 2015 23:49:51 +0000 (19:49 -0400)]
qmp: Ensure consistent granularity type

We treat this field with a variety of different types everywhere
in the code. Now it's just uint32_t.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1429314609-29776-4-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoqapi: Add optional field "name" to block dirty bitmap
Fam Zheng [Fri, 17 Apr 2015 23:49:50 +0000 (19:49 -0400)]
qapi: Add optional field "name" to block dirty bitmap

This field will be set for user created dirty bitmap. Also pass in an
error pointer to bdrv_create_dirty_bitmap, so when a name is already
taken on this BDS, it can report an error message. This is not global
check, two BDSes can have dirty bitmap with a common name.

Implemented bdrv_find_dirty_bitmap to find a dirty bitmap by name, will
be used later when other QMP commands want to reference dirty bitmap by
name.

Add bdrv_dirty_bitmap_make_anon. This unsets the name of dirty bitmap.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1429314609-29776-3-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agodocs: incremental backup documentation
John Snow [Fri, 17 Apr 2015 23:49:49 +0000 (19:49 -0400)]
docs: incremental backup documentation

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1429314609-29776-2-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoblock/iscsi: use the allocationmap also if cache.direct=on
Peter Lieven [Thu, 16 Apr 2015 14:08:33 +0000 (16:08 +0200)]
block/iscsi: use the allocationmap also if cache.direct=on

the allocationmap has only a hint character. The driver always
double checks that blocks marked unallocated in the cache are
still unallocated before taking the fast path and return zeroes.
So using the allocationmap is migration safe and can
also be enabled with cache.direct=on.

Signed-off-by: Peter Lieven <pl@kamp.de>
Message-id: 1429193313-4263-10-git-send-email-pl@kamp.de
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoblock/iscsi: bump year in copyright notice
Peter Lieven [Thu, 16 Apr 2015 14:08:32 +0000 (16:08 +0200)]
block/iscsi: bump year in copyright notice

Signed-off-by: Peter Lieven <pl@kamp.de>
Message-id: 1429193313-4263-9-git-send-email-pl@kamp.de
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoblock/iscsi: handle SCSI_STATUS_TASK_SET_FULL
Peter Lieven [Thu, 16 Apr 2015 14:08:31 +0000 (16:08 +0200)]
block/iscsi: handle SCSI_STATUS_TASK_SET_FULL

a target may issue a SCSI_STATUS_TASK_SET_FULL status
if there is more than one "BUSY" command queued already.

Signed-off-by: Peter Lieven <pl@kamp.de>
Message-id: 1429193313-4263-8-git-send-email-pl@kamp.de
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoblock/iscsi: increase retry count
Peter Lieven [Thu, 16 Apr 2015 14:08:30 +0000 (16:08 +0200)]
block/iscsi: increase retry count

The idea is that a command is retried in a BUSY condition
up a time of approx. 60 seconds before it is failed. This should
be far higher than any command timeout in the guest.

Signed-off-by: Peter Lieven <pl@kamp.de>
Message-id: 1429193313-4263-7-git-send-email-pl@kamp.de
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoblock/iscsi: optimize WRITE10/16 if cache.writeback is not set
Peter Lieven [Thu, 16 Apr 2015 14:08:29 +0000 (16:08 +0200)]
block/iscsi: optimize WRITE10/16 if cache.writeback is not set

SCSI allowes to tell the target to not return from a write command
if the date is not written to the disk. Use this so called FUA
bit if it is supported to optimize WRITE commands if writeback is
not allowed.

In this case qemu always issues a WRITE followed by a FLUSH. This
is 2 round trip times. If we set the FUA bit we can ignore the
following FLUSH.

Signed-off-by: Peter Lieven <pl@kamp.de>
Message-id: 1429193313-4263-6-git-send-email-pl@kamp.de
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoblock/iscsi: store DPOFUA bit from the modesense command
Peter Lieven [Thu, 16 Apr 2015 14:08:28 +0000 (16:08 +0200)]
block/iscsi: store DPOFUA bit from the modesense command

Signed-off-by: Peter Lieven <pl@kamp.de>
Message-id: 1429193313-4263-5-git-send-email-pl@kamp.de
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoblock/iscsi: rename iscsi_write_protected and let it return void
Peter Lieven [Thu, 16 Apr 2015 14:08:27 +0000 (16:08 +0200)]
block/iscsi: rename iscsi_write_protected and let it return void

Signed-off-by: Peter Lieven <pl@kamp.de>
Message-id: 1429193313-4263-4-git-send-email-pl@kamp.de
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoblock/iscsi: change all iscsilun properties from uint8_t to bool
Peter Lieven [Thu, 16 Apr 2015 14:08:26 +0000 (16:08 +0200)]
block/iscsi: change all iscsilun properties from uint8_t to bool

Signed-off-by: Peter Lieven <pl@kamp.de>
Message-id: 1429193313-4263-3-git-send-email-pl@kamp.de
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoblock/iscsi: do not forget to logout from target
Peter Lieven [Thu, 16 Apr 2015 14:08:25 +0000 (16:08 +0200)]
block/iscsi: do not forget to logout from target

We actually were always impolitely dropping the connection and
not cleanly logging out.

CC: qemu-stable@nongnu.org
Signed-off-by: Peter Lieven <pl@kamp.de>
Message-id: 1429193313-4263-2-git-send-email-pl@kamp.de
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoqmp: fill in the image field in BlockDeviceInfo
Alberto Garcia [Fri, 17 Apr 2015 11:52:43 +0000 (14:52 +0300)]
qmp: fill in the image field in BlockDeviceInfo

The image field in BlockDeviceInfo is supposed to contain an ImageInfo
object. However that is being filled in by bdrv_query_info(), not by
bdrv_block_device_info(), which is where BlockDeviceInfo is actually
created.

Anyone calling bdrv_block_device_info() directly will get a null image
field. As a consequence of this, the HMP command 'info block -n -v'
crashes QEMU.

This patch moves the code that fills in that field from
bdrv_query_info() to bdrv_block_device_info().

Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: 1429271563-3765-1-git-send-email-berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoRevert "hmp: fix crash in 'info block -n -v'"
Stefan Hajnoczi [Wed, 22 Apr 2015 10:15:10 +0000 (11:15 +0100)]
Revert "hmp: fix crash in 'info block -n -v'"

This reverts commit 638b8366200130cc7cf7a026630bc6bfb63b0c4c.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoblock: add 'node-name' field to BLOCK_IMAGE_CORRUPTED
Alberto Garcia [Wed, 8 Apr 2015 09:29:20 +0000 (12:29 +0300)]
block: add 'node-name' field to BLOCK_IMAGE_CORRUPTED

Since this event can occur in nodes that cannot have a device name
associated, include also a field with the node name.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 147cec5b3594f4bec0cb41c98afe5fcbfb67567c.1428485266.git.berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoblock: use bdrv_get_device_or_node_name() in error messages
Alberto Garcia [Wed, 8 Apr 2015 09:29:19 +0000 (12:29 +0300)]
block: use bdrv_get_device_or_node_name() in error messages

There are several error messages that identify a BlockDriverState by
its device name. However those errors can be produced in nodes that
don't have a device name associated.

In those cases we should use bdrv_get_device_or_node_name() to fall
back to the node name and produce a more meaningful message. The
messages are also updated to use the more generic term 'node' instead
of 'device'.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 9823a1f0514fdb0692e92868661c38a9e00a12d6.1428485266.git.berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoblock: add bdrv_get_device_or_node_name()
Alberto Garcia [Wed, 8 Apr 2015 09:29:18 +0000 (12:29 +0300)]
block: add bdrv_get_device_or_node_name()

This function gets the device name associated with a BlockDriverState,
or its node name if the device name is empty.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 4fa30aa8d61d9052ce266fd5429a59a14e941255.1428485266.git.berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoblock: document block-stream in qmp-commands.hx
Stefan Hajnoczi [Wed, 15 Apr 2015 10:43:42 +0000 (11:43 +0100)]
block: document block-stream in qmp-commands.hx

The 'block-stream' QMP command is documented in block-core.json but not
qmp-commands.hx.  Add a summary of the command to qmp-commands.hx
(similar to the documentation for 'block-commit').

Reported-by: Kashyap Chamarthy <kchamart@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1429094622-26218-1-git-send-email-stefanha@redhat.com
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agom25p80: fix s->blk usage before assignment
Stefan Hajnoczi [Wed, 15 Apr 2015 09:43:44 +0000 (10:43 +0100)]
m25p80: fix s->blk usage before assignment

Delay the call to blk_blockalign() until s->blk has been assigned.

This never caused a crash because blk_blockalign(NULL, size) defaults to
4096 alignment but it's technically incorrect.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1429091024-25098-1-git-send-email-stefanha@redhat.com
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agom25p80: add missing blk_attach_dev_nofail
Paolo Bonzini [Tue, 14 Apr 2015 15:29:47 +0000 (17:29 +0200)]
m25p80: add missing blk_attach_dev_nofail

Of the block devices that poked into -drive options via drive_get_next,
m25p80 was the only one who also did not attach itself to the BlockBackend.

Since sd does it, and all other devices go through a "drive" property,
with this change all block backends attached to the guest will have a
non-NULL result for blk_get_attached_dev().

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Message-id: 1429025387-11077-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agovirtio_blk: comment fix
Michael S. Tsirkin [Sun, 12 Apr 2015 15:55:17 +0000 (17:55 +0200)]
virtio_blk: comment fix

update virtio blk header from latest linux, include comment fixups.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 1428854036-12806-1-git-send-email-mst@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoblock: avoid unnecessary bottom halves
Paolo Bonzini [Sat, 28 Mar 2015 06:37:18 +0000 (07:37 +0100)]
block: avoid unnecessary bottom halves

bdrv_aio_* APIs can use coroutines to achieve asynchronicity.  However,
the coroutine may terminate without having yielded back to the caller
(for example because of something that invokes a nested event loop,
or because the coroutine is doing nothing at all).  In this case,
the bdrv_aio_* API must delay the completion to the next iteration
of the main loop, because bdrv_aio_* will never invoke the callback
before returning.

This can be done with a bottom half, and indeed bdrv_aio_* is always
using one for simplicity.  It is possible to gain some performance
(~3%) by avoiding this in the common case.  A new field in the
BlockAIOCBCoroutine struct is set to true until the first time the
corotine has yielded to its creator, and completion goes through a
new function bdrv_co_complete.  If the flag is false, bdrv_co_complete
invokes the callback immediately.  If it is true, the caller will
notice that the coroutine has completed and schedule the bottom
half itself.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1427524638-28157-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoblockjob: Update function name in comments
Fam Zheng [Fri, 3 Apr 2015 14:05:21 +0000 (22:05 +0800)]
blockjob: Update function name in comments

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 1428069921-2957-5-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoqemu-iotests: Test that "stop" doesn't drain block jobs
Fam Zheng [Fri, 3 Apr 2015 14:05:20 +0000 (22:05 +0800)]
qemu-iotests: Test that "stop" doesn't drain block jobs

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 1428069921-2957-4-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoblock: Pause block jobs in bdrv_drain_all
Fam Zheng [Fri, 3 Apr 2015 14:05:19 +0000 (22:05 +0800)]
block: Pause block jobs in bdrv_drain_all

This is necessary to suppress more IO requests from being generated from
block job coroutines.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 1428069921-2957-3-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoblockjob: Allow nested pause
Fam Zheng [Fri, 3 Apr 2015 14:05:18 +0000 (22:05 +0800)]
blockjob: Allow nested pause

This patch changes block_job_pause to increase the pause counter and
block_job_resume to decrease it.

The counter will allow calling block_job_pause/block_job_resume
unconditionally on a job when we need to suspend the IO temporarily.

From now on, each block_job_resume must be paired with a block_job_pause
to keep the counter balanced.

The user pause from QMP or HMP will only trigger block_job_pause once
until it's resumed, this is achieved by adding a user_paused flag in
BlockJob.

One occurrence of block_job_resume in mirror_complete is replaced with
block_job_enter which does what is necessary.

In block_job_cancel, the cancel flag is good enough to instruct
coroutines to quit loop, so use block_job_enter to replace the unpaired
block_job_resume.

Upon block job IO error, user is notified about the entering to the
pause state, so this pause belongs to user pause, set the flag
accordingly and expect a matching QMP resume.

[Extended doc comments as suggested by Paolo Bonzini
<pbonzini@redhat.com>.
--Stefan]

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 1428069921-2957-2-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoMAINTAINERS: Add Fam Zheng as Null block driver maintainer
Fam Zheng [Wed, 1 Apr 2015 01:45:40 +0000 (09:45 +0800)]
MAINTAINERS: Add Fam Zheng as Null block driver maintainer

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1427852740-24315-4-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoblock/null: Support reopen
Fam Zheng [Wed, 1 Apr 2015 01:45:39 +0000 (09:45 +0800)]
block/null: Support reopen

Reopen is used in block-commit. With this always-succeed operation, it
is now possible to test committing to a null drive, by specifying
"null-aio://" or "null-co://" as the backing image when creating the
qcow2 image.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1427852740-24315-3-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoblock/null: Latency simulation by adding new option "latency-ns"
Fam Zheng [Wed, 1 Apr 2015 01:45:38 +0000 (09:45 +0800)]
block/null: Latency simulation by adding new option "latency-ns"

Aio context switch should just work because the requests will be
drained, so the scheduled timer(s) on the old context will be freed.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1427852740-24315-2-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoscripts: add 'qemu coroutine' command to qemu-gdb.py
Stefan Hajnoczi [Thu, 26 Mar 2015 22:42:34 +0000 (22:42 +0000)]
scripts: add 'qemu coroutine' command to qemu-gdb.py

The 'qemu coroutine <coroutine-address>' GDB command prints the
backtrace for a CoroutineUContext.  This is useful for peeking inside
yielded coroutines that are waiting for file descriptor events, timers,
etc.

For example:

  $ gdb tests/test-coroutine
  (gdb) b test_yield
  (gdb) r
  (gdb) b qemu_coroutine_enter
  (gdb) c
  (gdb) c
  Continuing.

  Breakpoint 2, qemu_coroutine_enter (co=0x555555c66520, opaque=0x0) at qemu-coroutine.c:103
  103 {
  (gdb) source scripts/qemu-gdb.py
  (gdb) qemu coroutine 0x555555c66520
  #0  0x000055555557a740 in qemu_coroutine_switch (from_=<optimized out>, to_=0x7ffff7f90a70, action=COROUTINE_YIELD) at coroutine-ucontext.c:177
  #1  0x0000555555566af9 in yield_5_times (opaque=0x7fffffffdbb7) at tests/test-coroutine.c:107
  #2  0x000055555557a7aa in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at coroutine-ucontext.c:80
  #3  0x00007ffff08de000 in __start_context () at /lib64/libc.so.6

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1427409754-8556-1-git-send-email-stefanha@redhat.com
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agothread-pool: clean up thread_pool_completion_bh()
Stefan Hajnoczi [Thu, 2 Apr 2015 16:39:22 +0000 (17:39 +0100)]
thread-pool: clean up thread_pool_completion_bh()

This patch simplifies thread_pool_completion_bh().

The function first checks elem->state:

  if (elem->state != THREAD_DONE) {
      continue;
  }

It then goes on to check elem->state == THREAD_DONE although we already
know this must be the case.

The QLIST_REMOVE() is duplicated down both branches of an if-else
statement so that can be lifted out as well.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1427992762-10126-1-git-send-email-stefanha@redhat.com
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agovhdx: Fix zero-fill iov length
Kevin Wolf [Tue, 14 Apr 2015 14:36:16 +0000 (16:36 +0200)]
vhdx: Fix zero-fill iov length

Fix the length of the zero-fill for the back, which was accidentally
using the same value as for the front. This is caught by qemu-iotests
033.

For consistency, change the code for the front as well to use the length
stored in the iov (it is the same value, copied four lines above).

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Jeff Cody <jcody@redhat.com>
9 years agoblkdebug: Add bdrv_truncate()
Kevin Wolf [Tue, 14 Apr 2015 14:32:45 +0000 (16:32 +0200)]
blkdebug: Add bdrv_truncate()

This is, amongst others, required for qemu-iotests 033 to run as
intended on VHDX, which uses explicit bdrv_truncate() calls to bs->file
when allocating new blocks.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
9 years agoqemu-iotests: Some qemu-img convert tests
Kevin Wolf [Thu, 19 Mar 2015 12:33:33 +0000 (13:33 +0100)]
qemu-iotests: Some qemu-img convert tests

This adds a regression test for some problems that the qemu-img convert
rewrite just fixed.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
9 years agoqemu-img convert: Rewrite copying logic
Kevin Wolf [Thu, 19 Mar 2015 12:33:32 +0000 (13:33 +0100)]
qemu-img convert: Rewrite copying logic

The implementation of qemu-img convert is (a) messy, (b) buggy, and
(c) less efficient than possible. The changes required to beat some
sense into it are massive enough that incremental changes would only
make my and the reviewers' life harder. So throw it away and reimplement
it from scratch.

Let me give some examples what I mean by messy, buggy and inefficient:

(a) The copying logic of qemu-img convert has two separate branches for
    compressed and normal target images, which roughly do the same -
    except for a little code that handles actual differences between
    compressed and uncompressed images, and much more code that
    implements just a different set of optimisations and bugs. This is
    unnecessary code duplication, and makes the code for compressed
    output (unsurprisingly) suffer from bitrot.

    The code for uncompressed ouput is run twice to count the the total
    length for the progress bar. In the first run it just takes a
    shortcut and runs only half the loop, and when it's done, it toggles
    a boolean, jumps out of the loop with a backwards goto and starts
    over. Works, but pretty is something different.

(b) Converting while keeping a backing file (-B option) is broken in
    several ways. This includes not writing to the image file if the
    input has zero clusters or data filled with zeros (ignoring that the
    backing file will be visible instead).

    It also doesn't correctly limit every iteration of the copy loop to
    sectors of the same status so that too many sectors may be copied to
    in the target image. For -B this gives an unexpected result, for
    other images it just does more work than necessary.

    Conversion with a compressed target completely ignores any target
    backing file.

(c) qemu-img convert skips reading and writing an area if it knows from
    metadata that copying isn't needed (except for the bug mentioned
    above that ignores a status change in some cases). It does, however,
    read from the source even if it knows that it will read zeros, and
    then search for non-zero bytes in the read buffer, if it's possible
    that a write might be needed.

This reimplementation of the copying core reorganises the code to remove
the duplication and have a much more obvious code flow, by essentially
splitting the copy iteration loop into three parts:

1. Find the number of contiguous sectors of the same status at the
   current offset (This can also be called in a separate loop before the
   copying loop in order to determine the total sectors for the progress
   bar.)

2. Read sectors. If the status implies that there is no data there to
   read (zero or unallocated cluster), don't do anything.

3. Write sectors depending on the status. If it's data, write it. If
   we want the backing file to be visible (with -B), don't write it. If
   it's zeroed, skip it if you can, otherwise use bdrv_write_zeroes() to
   optimise the write at least where possible.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
9 years agoblock-backend: Expose bdrv_write_zeroes()
Kevin Wolf [Thu, 19 Mar 2015 12:33:31 +0000 (13:33 +0100)]
block-backend: Expose bdrv_write_zeroes()

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
9 years agoiothread: release iothread around aio_poll
Paolo Bonzini [Fri, 20 Feb 2015 16:26:52 +0000 (17:26 +0100)]
iothread: release iothread around aio_poll

This is the first step towards having fine-grained critical sections in
dataplane threads, which resolves lock ordering problems between
address_space_* functions (which need the BQL when doing MMIO, even
after we complete RCU-based dispatch) and the AioContext.

Because AioContext does not use contention callbacks anymore, the
unit test has to be changed.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1424449612-18215-4-git-send-email-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoAioContext: acquire/release AioContext during aio_poll
Paolo Bonzini [Fri, 20 Feb 2015 16:26:51 +0000 (17:26 +0100)]
AioContext: acquire/release AioContext during aio_poll

This is the first step in pushing down acquire/release, and will let
rfifolock drop the contention callback feature.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1424449612-18215-3-git-send-email-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoaio-posix: move pollfds to thread-local storage
Paolo Bonzini [Fri, 20 Feb 2015 16:26:50 +0000 (17:26 +0100)]
aio-posix: move pollfds to thread-local storage

By using thread-local storage, aio_poll can stop using global data during
g_poll_ns.  This will make it possible to drop callbacks from rfifolock.

[Moved npfd = 0 assignment to end of walking_handlers region as
suggested by Paolo.  This resolves the assert(npfd == 0) assertion
failure in pollfds_cleanup().
--Stefan]

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1424449612-18215-2-git-send-email-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>