]> git.proxmox.com Git - mirror_qemu.git/log
mirror_qemu.git
5 years agoqdev: rename typedef qdev_resetfn() -> DeviceReset()
Philippe Mathieu-Daudé [Sun, 14 Jan 2018 02:04:10 +0000 (23:04 -0300)]
qdev: rename typedef qdev_resetfn() -> DeviceReset()

following the DeviceRealize and DeviceUnrealize typedefs,
this unify a bit the new QOM API.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180114020412.26160-2-f4bug@amsat.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit b850f664a1dbbc1ea27bef12cd251ee5da0bfe05)
*prereq for 0c53057adb
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agopc-bios/s390-ccw: struct tpi_info must be declared as aligned(4)
Thomas Huth [Tue, 8 May 2018 10:17:52 +0000 (12:17 +0200)]
pc-bios/s390-ccw: struct tpi_info must be declared as aligned(4)

I've run into a compilation error today with the current version of GCC 8:

In file included from s390-ccw.h:49,
                 from main.c:12:
cio.h:128:1: error: alignment 1 of 'struct tpi_info' is less than 4 [-Werror=packed-not-aligned]
 } __attribute__ ((packed));
 ^
cc1: all warnings being treated as errors

Since the struct tpi_info contains an element ("struct subchannel_id schid")
which is marked as aligned(4), we've got to mark the struct tpi_info as
aligned(4), too.

CC: qemu-stable@nongnu.org
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1525774672-11913-1-git-send-email-thuth@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
(cherry picked from commit a6e4385dea94850d7b06b0542e7960c1063fdabd)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agos390x/css: disabled subchannels cannot be status pending
Cornelia Huck [Fri, 4 May 2018 12:53:16 +0000 (14:53 +0200)]
s390x/css: disabled subchannels cannot be status pending

The 3270 code will try to post an attention interrupt when the
3270 emulator (e.g. x3270) attaches. If the guest has not yet
enabled the subchannel for the 3270 device, we will present a spurious
cc 1 (status pending) when it uses msch on it later on, e.g. when
trying to enable the subchannel.

To fix this, just don't do anything in css_conditional_io_interrupt()
if the subchannel is not enabled. The 3270 code will work fine with
that, and the other user of this function (virtio-ccw) never
attempts to post an interrupt for a disabled device to begin with.

CC: qemu-stable@nongnu.org
Reported-by: Thomas Huth <thuth@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
(cherry picked from commit 6e9c893ecd00afd5344c35d0d0ded50eaa0938f6)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agoraw: Check byte range uniformly
Fam Zheng [Fri, 1 Jun 2018 09:26:40 +0000 (17:26 +0800)]
raw: Check byte range uniformly

We don't verify the request range against s->size in the I/O callbacks
except for raw_co_pwritev. This is inconsistent (especially for
raw_co_pwrite_zeroes and raw_co_pdiscard), so fix them, in the meanwhile
make the helper reusable by the coming new callbacks.

Note that in most cases the block layer already verifies the request
byte range against our reported image length, before invoking the driver
callbacks.  The exception is during image creating, after
blk_set_allow_write_beyond_eof(blk, true) is called. But in that case,
the requests are not directly from the user or guest. So there is no
visible behavior change in adding the check code.

The int64_t -> uint64_t inconsistency, as shown by the type casting, is
pre-existing due to the interface.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 20180601092648.24614-3-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 384455385248762e74a080978f18f0c8f74757fe)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agolm32: take BQL before writing IP/IM register
Michael Walle [Tue, 9 Jan 2018 17:01:13 +0000 (18:01 +0100)]
lm32: take BQL before writing IP/IM register

Writing to these registers may raise an interrupt request. Actually,
this prevents the milkymist board from starting.

Cc: qemu-stable@nongnu.org
Signed-off-by: Michael Walle <michael@walle.cc>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
(cherry picked from commit 81e9cbd0ca1131012b058df6804b1f626a6b730c)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agoiotests: Add test for -U/force-share conflicts
Max Reitz [Wed, 2 May 2018 20:20:51 +0000 (22:20 +0200)]
iotests: Add test for -U/force-share conflicts

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20180502202051.15493-4-mreitz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit 4e7d73c5fbd97e55ffe5af02f24d1f7dbe3bbf20)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agoqemu-img: Use only string options in img_open_opts
Max Reitz [Wed, 2 May 2018 20:20:50 +0000 (22:20 +0200)]
qemu-img: Use only string options in img_open_opts

img_open_opts() takes a QemuOpts and converts them to a QDict, so all
values therein are strings.  Then it may try to call qdict_get_bool(),
however, which will fail with a segmentation fault every time:

$ ./qemu-img info -U --image-opts \
    driver=file,filename=/dev/null,force-share=off
[1]    27869 segmentation fault (core dumped)  ./qemu-img info -U
--image-opts driver=file,filename=/dev/null,force-share=off

Fix this by using qdict_get_str() and comparing the value as a string.
Also, when adding a force-share value to the QDict, add it as a string
so it fits the rest of the dict.

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20180502202051.15493-3-mreitz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit 4615f87832d2fcb7a544bedeece2741bf8c21f94)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agoqemu-io: Use purely string blockdev options
Max Reitz [Wed, 2 May 2018 20:20:49 +0000 (22:20 +0200)]
qemu-io: Use purely string blockdev options

Currently, qemu-io only uses string-valued blockdev options (as all are
converted directly from QemuOpts) -- with one exception: -U adds the
force-share option as a boolean.  This in itself is already a bit
questionable, but a real issue is that it also assumes the value already
existing in the options QDict would be a boolean, which is wrong.

That has the following effect:

$ ./qemu-io -r -U --image-opts \
    driver=file,filename=/dev/null,force-share=off
[1]    15200 segmentation fault (core dumped)  ./qemu-io -r -U
--image-opts driver=file,filename=/dev/null,force-share=off

Since @opts is converted from QemuOpts, the value must be a string, and
we have to compare it as such.  Consequently, it makes sense to also set
it as a string instead of a boolean.

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20180502202051.15493-2-mreitz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit 2a01c01f9ecb43af4c0a85fe6adc429ffc9c31b5)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agoiotests: Add test for rebasing with relative paths
Max Reitz [Wed, 9 May 2018 18:20:02 +0000 (20:20 +0200)]
iotests: Add test for rebasing with relative paths

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20180509182002.8044-3-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit 28036a7f7044fddb79819e3c8fcb4ae5605c60e0)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agoqemu-img: Resolve relative backing paths in rebase
Max Reitz [Wed, 9 May 2018 18:20:01 +0000 (20:20 +0200)]
qemu-img: Resolve relative backing paths in rebase

Currently, rebase interprets a relative path for the new backing image
as follows:
(1) Open the new backing image with the given relative path (thus relative to
    qemu-img's working directory).
(2) Write it directly into the overlay's backing path field (thus
    relative to the overlay).

If the overlay is not in qemu-img's working directory, both will be
different interpretations, which may either lead to an error somewhere
(either rebase fails because it cannot open the new backing image, or
your overlay becomes unusable because its backing path does not point to
a file), or, even worse, it may result in your rebase being performed
for a different backing file than what your overlay will point to after
the rebase.

Fix this by interpreting the target backing path as relative to the
overlay, like qemu-img does everywhere else.

Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1569835
Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20180509182002.8044-2-mreitz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit d16699b64671466b42079c45b89127aeea1ca565)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agoconfigure: recognize more rpmbuild macros
Olaf Hering [Wed, 18 Apr 2018 07:50:44 +0000 (09:50 +0200)]
configure: recognize more rpmbuild macros

Extend the list of recognized, but ignored options from rpms %configure
macro. This fixes build on hosts running SUSE Linux.

Cc: qemu-stable@nongnu.org
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Message-Id: <20180418075045.27393-1-olaf@aepfle.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 181ce1d05c6d4f1c80f0e7ebb41e489c2b541edf)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agoqxl: fix local renderer crash
Gerd Hoffmann [Fri, 27 Apr 2018 11:55:28 +0000 (13:55 +0200)]
qxl: fix local renderer crash

Make sure we only ask the spice local renderer for display updates in
case we have a valid primary surface.  Without that spice is confused
and throws errors in case a display update request (triggered by
screendump for example) happens in parallel to a mode switch and hits
the race window where the old primary surface is gone and the new isn't
establisted yet.

Cc: qemu-stable@nongnu.org
Fixes: https://bugzilla.redhat.com//show_bug.cgi?id=1567733
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20180427115528.345-1-kraxel@redhat.com
(cherry picked from commit 5bd5c27c7d284d01477c5cc022ce22438c46bf9f)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agospapr: don't advertise radix GTSE if max-compat-cpu < power9
Greg Kurz [Thu, 3 May 2018 21:16:59 +0000 (23:16 +0200)]
spapr: don't advertise radix GTSE if max-compat-cpu < power9

On a POWER9 host, if a guest runs in pre POWER9 compat mode, it necessarily
uses the hash MMU mode. In this case, we shouldn't advertise radix GTSE in
the ibm,arch-vec-5-platform-support DT property as the current code does.
The first reason is that it doesn't make sense, and the second one is that
causes the CAS-negotiated options subsection to be migrated. This breaks
backward migration to QEMU 2.7 and older versions on POWER8 hosts:

qemu-system-ppc64: error while loading state for instance 0x0 of device
 'spapr'
qemu-system-ppc64: load of migration failed: No such file or directory

This patch hence initialize CPUs a bit earlier so that we can check the
requested compat mode, and don't set OV5_MMU_RADIX_GTSE for power8 and
older.

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit 0550b1206a91d66051a21441a02c4ff126b531fe)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agotarget/ppc: always set PPC_MEM_TLBIE in pre 2.8 migration hack
Greg Kurz [Thu, 3 May 2018 21:16:29 +0000 (23:16 +0200)]
target/ppc: always set PPC_MEM_TLBIE in pre 2.8 migration hack

The pseries-2.7 and older machine types require CPUPPCState::insns_flags
to be strictly equal between source and destination. This checking is
abusive and breaks migration of KVM guests when the host CPU models
are different, even if they are compatible enough to allow the guest
to run transparently. This buggy behaviour was fixed for pseries-2.8
and we added some hacks to allow backward migration of older machine
types. These hacks assume that the CPU belongs to the POWER8 family,
which was true for most KVM based setup we cared about at the time.
But now POWER9 systems are coming, and backward migration of pre 2.8
guests running in POWER8 architected mode from a POWER9 host to a
POWER8 host is broken:

qemu-system-ppc64: error while loading state for instance 0x0 of device
 'cpu'
qemu-system-ppc64: load of migration failed: Invalid argument

This happens because POWER9 doesn't set PPC_MEM_TLBIE in insns_flags,
while POWER8 does. Let's force PPC_MEM_TLBIE in the migration hack to
fix the issue. This is an acceptable hack because these old machine
types only support CPU models that do set PPC_MEM_TLBIE.

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit bce009645b9f1d59195518e35747c8ea30f985f7)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agotarget/arm: Implement v8M VLLDM and VLSTM
Peter Maydell [Fri, 4 May 2018 17:05:51 +0000 (18:05 +0100)]
target/arm: Implement v8M VLLDM and VLSTM

For v8M the instructions VLLDM and VLSTM support lazy saving
and restoring of the secure floating-point registers. Even
if the floating point extension is not implemented, these
instructions must act as NOPs in Secure state, so they can
be used as part of the secure-to-nonsecure call sequence.

Fixes: https://bugs.launchpad.net/qemu/+bug/1768295
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180503105730.5958-1-peter.maydell@linaro.org
(cherry picked from commit b1e5336a9899016c53d59eba53ebf6abcc21995c)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agotcg/arm: Fix memory barrier encoding
Henry Wertz [Tue, 17 Apr 2018 22:06:23 +0000 (12:06 -1000)]
tcg/arm: Fix memory barrier encoding

I found with qemu 2.11.x or newer that I would get an illegal instruction
error running some Intel binaries on my ARM chromebook.  On investigation,
I found it was quitting on memory barriers.

qemu instruction:
mb $0x31
was translating as:
0x604050cc:  5bf07ff5  blpl     #0x600250a8

After patch it gives:
0x604050cc:  f57ff05b  dmb      ish

In short, I found INSN_DMB_ISH (memory barrier for ARMv7) appeared to be
correct based on online docs, but due to some endian-related shenanigans it
had to be byte-swapped to suit qemu; it appears INSN_DMB_MCR (memory
barrier for ARMv6) also should be byte swapped  (and this patch does so).
I have not checked for correctness of aarch64's barrier instruction.

Cc: qemu-stable@nongnu.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Henry Wertz <hwertz10@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit 3f814b803797c007abfe5c4041de754e01723031)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agos390-ccw: force diag 308 subcode to unsigned long
Cornelia Huck [Wed, 2 May 2018 12:52:21 +0000 (14:52 +0200)]
s390-ccw: force diag 308 subcode to unsigned long

We currently pass an integer as the subcode parameter. However,
the upper bits of the register containing the subcode need to
be 0, which is not guaranteed unless we explicitly specify the
subcode to be an unsigned long value.

Fixes: d046c51dad3 ("pc-bios/s390-ccw: Get device address via diag 308/6")
Cc: qemu-stable@nongnu.org
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 63d8b5ace31c1e1f3996fe4cd551d6d377594d5a)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agos390: Do not pass inofficial IPL type to the guest
Viktor Mihajlovski [Thu, 5 Apr 2018 15:07:24 +0000 (17:07 +0200)]
s390: Do not pass inofficial IPL type to the guest

IPL over a virtio-scsi device requires special handling not
available in the real architecture. For this purpose the IPL
type 0xFF has been chosen as means of communication between
QEMU and the pc-bios. However, a guest OS could be confused
by seeing an unknown IPL type.

This change sets the IPL parameter type to 0x02 (CCW) to prevent
this. Pre-existing Linux has looked up the IPL parameters only in
the case of FCP IPL. This means that the behavior should stay
the same even if Linux checks for the IPL type unconditionally.

Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Message-Id: <1522940844-12336-4-git-send-email-mihajlov@linux.vnet.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
(cherry picked from commit e8c7ef288abb05b741a95418ee2de85c1071e0db)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agonbd/client: Fix error messages during NBD_INFO_BLOCK_SIZE
Eric Blake [Tue, 1 May 2018 15:46:53 +0000 (10:46 -0500)]
nbd/client: Fix error messages during NBD_INFO_BLOCK_SIZE

A missing space makes for poor error messages, and sizes can't
go negative.  Also, we missed diagnosing a server that sends
a maximum block size less than the minimum.

Fixes: 081dd1fe
CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180501154654.943782-1-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
(cherry picked from commit e475d108f1b3d3163f0affea67cdedbe5fc9752b)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agoccid: Fix dwProtocols advertisement of T=0
Jason Andryuk [Fri, 20 Apr 2018 18:32:19 +0000 (14:32 -0400)]
ccid: Fix dwProtocols advertisement of T=0

Commit d7d218ef02d87c637d20d64da8f575d434ff6f78 attempted to change
dwProtocols to only advertise support for T=0 and not T=1.  The change
was incorrect as it changed 0x00000003 to 0x00010000.

lsusb -v in a linux guest shows:
"dwProtocols         65536  (Invalid values detected)", though the
smart card could still be accessed.  Windows 7 does not detect inserted
smart cards and logs the the following Error in the Event Logs:

    Source: Smart Card Service
    Event ID: 610
    Smart Card Reader 'QEMU QEMU USB CCID 0' rejected IOCTL SET_PROTOCOL:
    Incorrect function. If this error persists, your smart card or reader
    may not be functioning correctly

    Command Header: 03 00 00 00

Setting to 0x00000001 fixes the Windows issue.

Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
Message-id: 20180420183219.20722-1-jandryuk@gmail.com
Cc: qemu-stable@nongnu.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 0ee86bb6c5beb6498488850104f7557c376d0bef)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agodevice_tree: Increase FDT_MAX_SIZE to 1 MiB
Geert Uytterhoeven [Thu, 26 Apr 2018 10:04:38 +0000 (11:04 +0100)]
device_tree: Increase FDT_MAX_SIZE to 1 MiB

It is not uncommon for a contemporary FDT to be larger than 64 KiB,
leading to failures loading the device tree from sysfs:

    qemu-system-aarch64: qemu_fdt_setprop: Couldn't set ...: FDT_ERR_NOSPACE

Hence increase the limit to 1 MiB, like on PPC.

For reference, the largest arm64 DTB created from the Linux sources is
ca. 75 KiB large (100 KiB when built with symbols/fixup support).

Cc: qemu-stable@nongnu.org
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Message-id: 1523541337-23919-1-git-send-email-geert+renesas@glider.be
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 14ec3cbd7c1e31dca4d23f028100c8f43e156573)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agohw/char/cmsdk-apb-uart.c: Correctly clear INTSTATUS bits on writes
Peter Maydell [Tue, 10 Apr 2018 13:42:03 +0000 (14:42 +0100)]
hw/char/cmsdk-apb-uart.c: Correctly clear INTSTATUS bits on writes

The CMSDK APB UART INTSTATUS register bits are all write-one-to-clear.
We were getting this correct for the TXO and RXO bits (which need
special casing because their state lives in the STATE register),
but had forgotten to handle the normal bits for RX and TX which
we do store in our s->intstatus field.

Perform the W1C operation on the bits in s->intstatus too.

Fixes: https://bugs.launchpad.net/qemu/+bug/1760262
Cc: qemu-stable@nongnu.org
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20180410134203.17552-1-peter.maydell@linaro.org
(cherry picked from commit 6670b494fdb23f74ecd9be3d952c007f64e268f1)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agotcg: Introduce tcg_set_insn_start_param
Richard Henderson [Tue, 10 Apr 2018 12:02:26 +0000 (13:02 +0100)]
tcg: Introduce tcg_set_insn_start_param

The parameters for tcg_gen_insn_start are target_ulong, which may be split
into two TCGArg parameters for storage in the opcode on 32-bit hosts.

Fixes the ARM target and its direct use of tcg_set_insn_param, which would
set the wrong argument in the 64-on-32 case.

Cc: qemu-stable@nongnu.org
Reported-by: alarson@ddci.com
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180410003558.2470-1-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 9743cd5736263e90d312b2c33bd739ffe1eae70d)
 Conflicts:
target/arm/translate.h
tcg/tcg.h
* rework to avoid functional dependency on 15fa08f

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agohw/block/pflash_cfi: fix off-by-one error
Philippe Mathieu-Daudé [Wed, 4 Apr 2018 23:32:38 +0000 (20:32 -0300)]
hw/block/pflash_cfi: fix off-by-one error

ASAN reported:

    hw/block/pflash_cfi02.c:245:33: runtime error: index 82 out of bounds for type 'uint8_t [82]'

Since the 'cfi_len' member is not used, remove it to keep the code safer.

Cc: qemu-stable@nongnu.org
Reported-by: AddressSanitizer
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 07c13a71721d9f8c690b66752964e254af247475)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agovfio-ccw: fix memory leaks in vfio_ccw_realize()
Greg Kurz [Sat, 7 Apr 2018 14:43:46 +0000 (16:43 +0200)]
vfio-ccw: fix memory leaks in vfio_ccw_realize()

If the subchannel is already attached or if vfio_get_device() fails, the
code jumps to the 'out_device_err' label and doesn't free the string it
has just allocated.

The code should be reworked so that vcdev->vdev.name only gets set when
the device has been attached, and freed when it is about to be detached.
This could be achieved  with the addition of a vfio_ccw_get_device()
function that would be the counterpart of vfio_put_device(). But this is
a more elaborate cleanup that should be done in a follow-up. For now,
let's just add calls to g_free() on the buggy error paths.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <152311222681.203086.8874800175539040298.stgit@bahia>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
(cherry picked from commit be4d026f645eb31078e08d431c93a898b895024e)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agocpus.c: ensure running CPU recalculates icount deadlines on timer expiry
Peter Maydell [Tue, 10 Apr 2018 12:02:25 +0000 (13:02 +0100)]
cpus.c: ensure running CPU recalculates icount deadlines on timer expiry

When we run in TCG icount mode, we calculate the number of instructions
to execute using tcg_get_icount_limit(), which ensures that we stop
execution at the next timer deadline. However there is a bug where
currently we do not recalculate that limit if the guest reprograms
a timer so that the next deadline moves closer, and so we will
continue execution until the original limit and fire the timer
later than we should.

Fix this bug in qemu_timer_notify_cb(): if we are currently running
a VCPU in icount mode, we simply need to kick it out of the main
loop and back to tcg_cpu_exec(), where it will recalculate the
icount limit. If we are not currently running a VCPU, then we
retain the existing logic for waking up a halted CPU.

Cc: qemu-stable@nongnu.org
Fixes: https://bugs.launchpad.net/qemu/+bug/1754038
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20180406123838.21249-1-peter.maydell@linaro.org
(cherry picked from commit c52e7132d7c885841500f5277f7305f62767fe1d)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agogluster: Fix blockdev-add with server.N.type=unix
Kevin Wolf [Tue, 3 Apr 2018 11:08:10 +0000 (13:08 +0200)]
gluster: Fix blockdev-add with server.N.type=unix

The legacy command line interface gets the socket path from an option
called 'socket'. QAPI in contract uses SocketAddress, where the
corresponding option is called 'path'.

Fix the gluster block driver to accept both 'socket' and 'path', with
'path' being the preferred syntax.

https://bugzilla.redhat.com/show_bug.cgi?id=1545155

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 20180403110810.25624-1-kwolf@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
(cherry picked from commit 9dae635afa98f83688806861cefe77ff1b4d76a8)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agoexec: fix memory leak in find_max_supported_pagesize()
Greg Kurz [Thu, 29 Mar 2018 09:09:46 +0000 (11:09 +0200)]
exec: fix memory leak in find_max_supported_pagesize()

The string returned by object_property_get_str() is dynamically allocated.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <152231458624.69730.1752893648612848392.stgit@bahia.lan>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
(cherry picked from commit 72a841d2a403b56ff894fa007b172dc9bcb3dae8)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agotarget/i386: Fix andn instruction
Alexandro Sanchez Bach [Thu, 5 Apr 2018 12:40:58 +0000 (14:40 +0200)]
target/i386: Fix andn instruction

In commit 7073fbada733c8d10992f00772c9b9299d740e9b, the `andn` instruction
was implemented via `tcg_gen_andc` but passes the operands in the wrong
order:
- X86 defines `andn dest,src1,src2` as: dest = ~src1 & src2
- TCG defines `andc dest,src1,src2` as: dest = src1 & ~src2

The following simple test shows the issue:

    #include <stdio.h>
    #include <stdint.h>

    int main(void) {
        uint32_t ret = 0;
        __asm (
            "mov $0xFF00, %%ecx\n"
            "mov $0x0F0F, %%eax\n"
            "andn %%ecx, %%eax, %%ecx\n"
            "mov %%ecx, %0\n"
          : "=r" (ret));
        printf("%08X\n", ret);
        return 0;
    }

This patch fixes the problem by simply swapping the order of the two last
arguments in `tcg_gen_andc_tl`.

Reported-by: Alexandro Sanchez Bach <alexandro@phi.nz>
Signed-off-by: Alexandro Sanchez Bach <alexandro@phi.nz>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 5cd10051c2e02b7a86eae49919d6c65a87dbea46)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agotcg: Mark muluh_i64 and mulsh_i64 as 64-bit ops
Richard Henderson [Tue, 27 Mar 2018 03:37:24 +0000 (20:37 -0700)]
tcg: Mark muluh_i64 and mulsh_i64 as 64-bit ops

Failure to do so results in the tcg optimizer sign-extending
any constant fold from 32-bits.  This turns out to be visible
in the RISC-V testsuite using a host that emits these opcodes
(e.g. any non-x86_64).

Reported-by: Michael Clark <mjc@sifive.com>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit f2f1dde75160cac6ede330f3db50dc817d01a2d6)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agoiotests: Test preallocated truncate of 2G image
Max Reitz [Wed, 28 Feb 2018 13:13:15 +0000 (14:13 +0100)]
iotests: Test preallocated truncate of 2G image

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20180228131315.30194-3-mreitz@redhat.com
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit 733d1dce0f3c8ab7b79a173f6482781d3718f844)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agoblock/file-posix: Fix fully preallocated truncate
Max Reitz [Wed, 28 Feb 2018 13:13:14 +0000 (14:13 +0100)]
block/file-posix: Fix fully preallocated truncate

Storing the lseek() result in an int results in it overflowing when the
file is at least 2 GB big.  Then, we have a 50 % chance of the result
being "negative" and thus thinking an error occurred when actually
everything went just fine.

So we should use the correct type for storing the result: off_t.

Reported-by: Daniel P. Berrange <berrange@redhat.com>
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1549231
Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20180228131315.30194-2-mreitz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit 82b45e0a0b824787bd79ce3f6453eaa2afddd138)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agoqemu-pr-helper: Actually allow users to specify pidfile
Michal Privoznik [Sat, 24 Mar 2018 05:14:49 +0000 (06:14 +0100)]
qemu-pr-helper: Actually allow users to specify pidfile

Due to wrong specification of arguments to getopt_long() any
attempt to set pidfile resulted in:

1) the default to be leaked
2) the @pidfile variable to be set to NULL (because optarg is
NULL without this patch).

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Message-Id: <6f10cd53d361a395aa0e85a9311ec4e9a8fc11e5.1521868451.git.mprivozn@redhat.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit f8e1a989644f22ba2f7afb0e13b6ce2309ea9503)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agovirtio_net: flush uncompleted TX on reset
Greg Kurz [Tue, 20 Mar 2018 10:44:56 +0000 (11:44 +0100)]
virtio_net: flush uncompleted TX on reset

If the backend could not transmit a packet right away for some reason,
the packet is queued for asynchronous sending. The corresponding vq
element is tracked in the async_tx.elem field of the VirtIONetQueue,
for later freeing when the transmission is complete.

If a reset happens before completion, virtio_net_tx_complete() will push
async_tx.elem back to the guest anyway, and we end up with the inuse flag
of the vq being equal to -1. The next call to virtqueue_pop() is then
likely to fail with "Virtqueue size exceeded".

This can be reproduced easily by starting a guest with an hubport backend
that is not connected to a functional network, eg,

 -device virtio-net-pci,netdev=hub0 -netdev hubport,id=hub0,hubid=0

and no other -netdev hubport,hubid=0 on the command line.

The appropriate fix is to ensure that such an asynchronous transmission
cannot survive a device reset. So for all queues, we first try to send
the packet again, and eventually we purge it if the backend still could
not deliver it.

CC: qemu-stable@nongnu.org
Reported-by: R. Nageswara Sastry <nasastry@in.ibm.com>
Buglink: https://github.com/open-power-host-os/qemu/issues/37
Signed-off-by: Greg Kurz <groug@kaod.org>
Tested-by: R. Nageswara Sastry <nasastry@in.ibm.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 94b52958b77a2a040564cf7ed716d3a9545d94e5)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agoarm/translate-a64: treat DISAS_UPDATE as variant of DISAS_EXIT
Victor Kamensky [Fri, 23 Mar 2018 18:26:45 +0000 (18:26 +0000)]
arm/translate-a64: treat DISAS_UPDATE as variant of DISAS_EXIT

In OE project 4.15 linux kernel boot hang was observed under
single cpu aarch64 qemu. Kernel code was in a loop waiting for
vtimer arrival, spinning in TC generated blocks, while interrupt
was pending unprocessed. This happened because when qemu tried to
handle vtimer interrupt target had interrupts disabled, as
result flag indicating TCG exit, cpu->icount_decr.u16.high,
was cleared but arm_cpu_exec_interrupt function did not call
arm_cpu_do_interrupt to process interrupt. Later when target
reenabled interrupts, it happened without exit into main loop, so
following code that waited for result of interrupt execution
run in infinite loop.

To solve the problem instructions that operate on CPU sys state
(i.e enable/disable interrupt), and marked as DISAS_UPDATE,
should be considered as DISAS_EXIT variant, and should be
forced to exit back to main loop so qemu will have a chance
processing pending CPU state updates, including pending
interrupts.

This change brings consistency with how DISAS_UPDATE is treated
in aarch32 case.

CC: Peter Maydell <peter.maydell@linaro.org>
CC: Alex Bennée <alex.bennee@linaro.org>
CC: qemu-stable@nongnu.org
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Victor Kamensky <kamensky@cisco.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1521526368-1996-1-git-send-email-kamensky@cisco.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit a75a52d62418dafe462be4fe30485501d1010bb9)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agotests/multiboot: Add .gitignore
Kevin Wolf [Wed, 14 Mar 2018 12:34:40 +0000 (13:34 +0100)]
tests/multiboot: Add .gitignore

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Jack Schwartz <jack.schwartz@oracle.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit e2679395d598bd40770c22a793c0152576ac211f)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agotests/multiboot: Add tests for the a.out kludge
Kevin Wolf [Wed, 14 Mar 2018 12:33:49 +0000 (13:33 +0100)]
tests/multiboot: Add tests for the a.out kludge

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Jack Schwartz <jack.schwartz@oracle.com>
(cherry picked from commit 1c8c426fb44bf5b3ffbcad1b00c7def4b89b03ec)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agotests/multiboot: Test exit code for every qemu run
Kevin Wolf [Wed, 14 Mar 2018 12:29:46 +0000 (13:29 +0100)]
tests/multiboot: Test exit code for every qemu run

Testing the exit code only once after a whole group of tests has
completed is not enough, it catches errors only in the very last qemu
invocation. We need to have the check after each qemu run.

The logging and diff with the reference output is still done once per
group to keep things more managable. This is not a problem because the
log file accumulates the output of all runs.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Jack Schwartz <jack.schwartz@oracle.com>
(cherry picked from commit 49713c413a65ab4b02124aabe83f8539cc6ece5e)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agomultiboot: Check validity of mh_header_addr
Kevin Wolf [Wed, 14 Mar 2018 16:57:45 +0000 (17:57 +0100)]
multiboot: Check validity of mh_header_addr

I couldn't find a case where this prevents something bad from happening
that isn't already caught by other checks, but let's err on the safe
side and check that mh_header_addr is as expected.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Jack Schwartz <jack.schwartz@oracle.com>
(cherry picked from commit dbf2dce7aabb7723542bd182175904846d70b0f9)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agomultiboot: Reject kernels exceeding the address space
Kevin Wolf [Wed, 14 Mar 2018 16:46:38 +0000 (17:46 +0100)]
multiboot: Reject kernels exceeding the address space

The code path where mh_load_end_addr is non-zero in the Multiboot
header checks that mh_load_end_addr >= mh_load_addr and so
mb_load_size is checked.  However, mb_load_size is not checked when
calculated from the file size, when mh_load_end_addr is 0.

If the kernel binary size is larger than can fit in the address space
after load_addr, we ended up with a kernel_size that is smaller than
load_size, which means that we read the file into a too small buffer.

Add a check to reject kernel files with such Multiboot headers.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Jack Schwartz <jack.schwartz@oracle.com>
(cherry picked from commit b17a9054a0652a1481be48a6729e972abf02412f)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agomultiboot: fprintf(stderr...) -> error_report()
Jack Schwartz [Thu, 21 Dec 2017 17:25:18 +0000 (09:25 -0800)]
multiboot: fprintf(stderr...) -> error_report()

Change all fprintf(stderr...) calls in hw/i386/multiboot.c to call
error_report() instead, including the mb_debug macro.  Remove the "\n"
from strings passed to all modified calls, since error_report() appends
one.

Signed-off-by: Jack Schwartz <jack.schwartz@oracle.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 4b9006a41ea8818f2385ae5228e07f211bb4a33d)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agomultiboot: Use header names when displaying fields
Jack Schwartz [Thu, 21 Dec 2017 17:25:17 +0000 (09:25 -0800)]
multiboot: Use header names when displaying fields

Refer to field names when displaying fields in printf and debug statements.

Signed-off-by: Jack Schwartz <jack.schwartz@oracle.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit ce5eb6dc4dc5652f7e360a1db817f1d5dafab90f)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agomultiboot: Remove unused variables from multiboot.c
Jack Schwartz [Thu, 21 Dec 2017 17:25:16 +0000 (09:25 -0800)]
multiboot: Remove unused variables from multiboot.c

Remove unused variables: mh_mode_type, mh_width, mh_height, mh_depth

Signed-off-by: Jack Schwartz <jack.schwartz@oracle.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
Reviewed-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 7a2e43cc96fd017883973caf9ee076ae23a3bebd)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agomultiboot: bss_end_addr can be zero
Jack Schwartz [Thu, 21 Dec 2017 17:25:15 +0000 (09:25 -0800)]
multiboot: bss_end_addr can be zero

The multiboot spec (https://www.gnu.org/software/grub/manual/multiboot/),
section 3.1.3, allows for bss_end_addr to be zero.

A zero bss_end_addr signifies there is no .bss section.

Suggested-by: Daniel Kiper <daniel.kiper@oracle.com>
Signed-off-by: Jack Schwartz <jack.schwartz@oracle.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
Reviewed-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 2a8fcd119eb7c6bb3837fc3669eb1b2dfb31daf8)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agomigration/block: reset dirty bitmap before read in bulk phase
Peter Lieven [Thu, 8 Mar 2018 11:18:25 +0000 (12:18 +0100)]
migration/block: reset dirty bitmap before read in bulk phase

Reset the dirty bitmap before reading to make sure we don't miss
any new data.

Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Lieven <pl@kamp.de>
Message-Id: <1520507908-16743-3-git-send-email-pl@kamp.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
(cherry picked from commit 86b124bc76bd7137d0fb20696c4e349571b8533d)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agomemory: fix flatview_access_valid RCU read lock/unlock imbalance
Paolo Bonzini [Wed, 7 Mar 2018 13:02:38 +0000 (14:02 +0100)]
memory: fix flatview_access_valid RCU read lock/unlock imbalance

Fixes: 11e732a5ed46903f997985bed4c3767ca28a7eb6
Reported-by: Cornelia Huck <cohuck@redhat.com>
Reported-by: luigi burdo <intermediadc@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Cornelia Huck <cohuck@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Message-id: 20180307130238.19358-1-pbonzini@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit b39b61e410022f96ceb53d4381d25cba5126ac44)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agoaddress_space_rw: address_space_to_flatview needs RCU lock
Paolo Bonzini [Mon, 5 Mar 2018 08:29:04 +0000 (09:29 +0100)]
address_space_rw: address_space_to_flatview needs RCU lock

address_space_rw is calling address_space_to_flatview but it can
be called outside the RCU lock.  To fix it, transform flatview_rw
into address_space_rw, since flatview_rw is otherwise unused.

Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit db84fd973eba3f1e121416dcab73a4e8a60f2526)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agoaddress_space_map: address_space_to_flatview needs RCU lock
Paolo Bonzini [Sun, 4 Mar 2018 23:23:26 +0000 (00:23 +0100)]
address_space_map: address_space_to_flatview needs RCU lock

address_space_map is calling address_space_to_flatview but it can
be called outside the RCU lock.  The function itself is calling
rcu_read_lock/rcu_read_unlock, just in the wrong place, so the
fix is easy.

Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit ad0c60fa572d4050255b698ecdb67294dd4c0125)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agoaddress_space_access_valid: address_space_to_flatview needs RCU lock
Paolo Bonzini [Sun, 4 Mar 2018 23:23:26 +0000 (00:23 +0100)]
address_space_access_valid: address_space_to_flatview needs RCU lock

address_space_access_valid is calling address_space_to_flatview but it can
be called outside the RCU lock.  To fix it, push the rcu_read_lock/unlock
pair up from flatview_access_valid to address_space_access_valid.

Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 11e732a5ed46903f997985bed4c3767ca28a7eb6)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agoaddress_space_read: address_space_to_flatview needs RCU lock
Paolo Bonzini [Sun, 4 Mar 2018 23:19:49 +0000 (00:19 +0100)]
address_space_read: address_space_to_flatview needs RCU lock

address_space_read is calling address_space_to_flatview but it can
be called outside the RCU lock.  To fix it, push the rcu_read_lock/unlock
pair up from flatview_read_full to address_space_read's constant size
fast path and address_space_read_full.

Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit b2a44fcad74f1cc7a6786d38eba7db12ab2352ba)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agoaddress_space_write: address_space_to_flatview needs RCU lock
Paolo Bonzini [Mon, 5 Mar 2018 08:23:56 +0000 (09:23 +0100)]
address_space_write: address_space_to_flatview needs RCU lock

address_space_write is calling address_space_to_flatview but it can
be called outside the RCU lock.  To fix it, push the rcu_read_lock/unlock
pair up from flatview_write to address_space_write.

Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 4c6ebbb364aa6f42c5d8e83e932e967eb83f0e44)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agomemory: inline some performance-sensitive accessors
Paolo Bonzini [Sun, 4 Mar 2018 23:31:20 +0000 (00:31 +0100)]
memory: inline some performance-sensitive accessors

These accessors are called from inlined functions, and the call sequence
is much more expensive than just inlining the access.  Move the
struct declaration to memory-internal.h so that exec.c and memory.c
can both use an inline function.

Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 785a507ec78bbda1c346f3d3593e5a58b62e73ef)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agoopenpic_kvm: drop address_space_to_flatview call
Paolo Bonzini [Mon, 5 Mar 2018 08:18:26 +0000 (09:18 +0100)]
openpic_kvm: drop address_space_to_flatview call

The MemoryListener is registered on address_space_memory, there is
not much to assert.  This currently works because the callback
is invoked only once when the listener is registered, but section->fv
is the _new_ FlatView, not the old one on later calls and that
would break.

This confines address_space_to_flatview to exec.c and memory.c.

Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 80d2b933f9fe3e53d4f76a45a1bc1a0175669468)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agosparc: fix leon3 casa instruction when MMU is disabled
KONRAD Frederic [Fri, 2 Mar 2018 09:59:25 +0000 (10:59 +0100)]
sparc: fix leon3 casa instruction when MMU is disabled

Since the commit af7a06bac7d3abb2da48ef3277d2a415772d2ae8:
`casa [..](10), .., ..` (and probably others alternate space instructions)
triggers a data access exception when the MMU is disabled.

When we enter get_asi(...) dc->mem_idx is set to MMU_PHYS_IDX when the MMU
is disabled. Just keep mem_idx unchanged in this case so we passthrough the
MMU when it is disabled.

Signed-off-by: KONRAD Frederic <frederic.konrad@adacore.com>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
(cherry picked from commit 6e10f37c86068e35151f982c976a85f1bec07ef2)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agolinux-user: fix target_mprotect/target_munmap error return values
Max Filippov [Wed, 28 Feb 2018 22:16:05 +0000 (14:16 -0800)]
linux-user: fix target_mprotect/target_munmap error return values

target_mprotect/target_munmap return value goes through get_errno at the
call site, thus the functions must either set errno to host error code
and return -1 or return negative guest error code. Do the latter.

Cc: qemu-stable@nongnu.org
Cc: Riku Voipio <riku.voipio@iki.fi>
Cc: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180228221609.11265-8-jcmvbkbc@gmail.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
(cherry picked from commit 78cf339039c325b336442f1d7f3ccc531b22c4a0)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agolinux-user: fix assertion in shmdt
Max Filippov [Wed, 28 Feb 2018 22:16:04 +0000 (14:16 -0800)]
linux-user: fix assertion in shmdt

shmdt fails to call mmap_lock/mmap_unlock around page_set_flags,
resulting in the following assertion:
  page_set_flags: Assertion `have_mmap_lock()' failed.

Wrap shmdt internals into mmap_lock/mmap_unlock.

Cc: qemu-stable@nongnu.org
Cc: Riku Voipio <riku.voipio@iki.fi>
Cc: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180228221609.11265-7-jcmvbkbc@gmail.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
(cherry picked from commit 3c5f6a5f888729f9fbc64211298f7c3e2fb42b64)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agolinux-user: fix mmap/munmap/mprotect/mremap/shmat
Max Filippov [Wed, 7 Mar 2018 21:50:10 +0000 (13:50 -0800)]
linux-user: fix mmap/munmap/mprotect/mremap/shmat

In linux-user QEMU that runs for a target with TARGET_ABI_BITS bigger
than L1_MAP_ADDR_SPACE_BITS an assertion in page_set_flags fires when
mmap, munmap, mprotect, mremap or shmat is called for an address outside
the guest address space. mmap and mprotect should return ENOMEM in such
case.

Change definition of GUEST_ADDR_MAX to always be the last valid guest
address. Account for this change in open_self_maps.
Add macro guest_addr_valid that verifies if the guest address is valid.
Add function guest_range_valid that verifies if address range is within
guest address space and does not wrap around. Use that macro in
mmap/munmap/mprotect/mremap/shmat for error checking.

Cc: qemu-stable@nongnu.org
Cc: Riku Voipio <riku.voipio@iki.fi>
Cc: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180307215010.30706-1-jcmvbkbc@gmail.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
(cherry picked from commit ebf9a3630c911d0cfc9c20f7cafe9ba4f88cf583)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agotarget/xtensa: dump correct physical registers
Max Filippov [Wed, 28 Feb 2018 19:48:04 +0000 (11:48 -0800)]
target/xtensa: dump correct physical registers

xtensa_cpu_dump_state outputs CPU physical registers as is, without
synchronization from current window. That may result in different values
printed for the current window and corresponding physical registers.
Synchronize physical registers from window before dumping.

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
(cherry picked from commit b55b1afda942306e4e40420aced1524bd83ba16d)
 Conflicts:
target/xtensa/translate.c
* drop context dependencies
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agoloader: don't perform overlapping address check for memory region ROM images
Mark Cave-Ayland [Fri, 23 Feb 2018 11:10:17 +0000 (11:10 +0000)]
loader: don't perform overlapping address check for memory region ROM images

All memory region ROM images have a base address of 0 which causes the overlapping
address check to fail if more than one memory region ROM image is present, or an
existing ROM image is loaded at address 0.

Make sure that we ignore the overlapping address check in
rom_check_and_register_reset() if this is a memory region ROM image. In particular
this fixes the "rom: requested regions overlap" error on startup when trying to
run qemu-system-sparc with a -kernel image since commit 7497638642: "tcx: switch to
load_image_mr() and remove prom_addr hack".

Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
(cherry picked from commit ca316c11526a1bc221fb542bdce6bac7238dde69)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agotpm: Set the flags of the CMD_INIT command to 0
Stefan Berger [Fri, 26 Jan 2018 18:49:24 +0000 (13:49 -0500)]
tpm: Set the flags of the CMD_INIT command to 0

The flags of the CMD_INIT control channel command were not
initialized properly. Fix this and set to 0.

Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
(cherry picked from commit 302705876492a39f568035ce346e2c9176f5665e)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agorbd: Fix use after free in qemu_rbd_set_keypairs() error path
Kevin Wolf [Fri, 16 Feb 2018 18:14:55 +0000 (19:14 +0100)]
rbd: Fix use after free in qemu_rbd_set_keypairs() error path

If we want to include the invalid option name in the error message, we
can't free the string earlier than that.

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit 71c87815f9e0386b6f3e22942adc956fd603c82f)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agospecs/qcow2: Fix documentation of the compressed cluster descriptor
Alberto Garcia [Wed, 21 Feb 2018 14:08:49 +0000 (16:08 +0200)]
specs/qcow2: Fix documentation of the compressed cluster descriptor

This patch fixes several mistakes in the documentation of the
compressed cluster descriptor:

1) the documentation claims that the cluster descriptor contains the
   number of sectors used to store the compressed data, but what it
   actually contains is the number of sectors *minus one* or, in other
   words, the number of additional sectors after the first one.

2) the width of the fields is incorrectly specified. The number of bits
   used by each field is

      x = 62 - (cluster_bits - 8)   for the offset field
      y = (cluster_bits - 8)        for the size field

   So the offset field's location is [0, x-1], not [0, x] as stated.

3) the size field does not contain the size of the compressed data,
   but rather the number of sectors where that data is stored. The
   compressed data starts at the exact point specified in the offset
   field and ends when there's enough data to produce a cluster of
   decompressed data. Both points can be in the middle of a sector,
   allowing several compressed clusters to be stored next to one
   another, sharing sectors if necessary.

Cc: qemu-stable@nongnu.org
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 156b46ded3853dfc6b34c5afae019ff61798491b)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agonbd: Honor server's advertised minimum block size
Eric Blake [Thu, 15 Feb 2018 03:29:05 +0000 (21:29 -0600)]
nbd: Honor server's advertised minimum block size

Commit 79ba8c98 (v2.7) changed the setting of request_alignment
to occur only during bdrv_refresh_limits(), rather than at at
bdrv_open() time; but at the time, NBD was unaffected, because
it still used sector-based callbacks, so the block layer
defaulted NBD to use 512 request_alignment.

Later, commit 70c4fb26 (also v2.7) changed NBD to use byte-based
callbacks, without setting request_alignment.  This resulted in
NBD using request_alignment of 1, which works great when the
server supports it (as is the case for qemu-nbd), but falls apart
miserably if the server requires alignment (but only if qemu
actually sends a sub-sector request; qemu-io can do it, but
most qemu operations still perform on sectors or larger).

Even later, the NBD protocol was updated to document that clients
should learn the server's minimum alignment during NBD_OPT_GO;
and recommended that clients should assume a minimum size of 512
unless the server understands NBD_OPT_GO and replied with a smaller
size.  Commit 081dd1fe (v2.10) attempted to do that, by assigning
request_alignment to whatever was learned from the server; but
it has two flaws: the assignment is done during bdrv_open() so
it gets unconditionally wiped out back to 1 during any later
bdrv_refresh_limits(); and the code is not using a default of 512
when the server did not report a minimum size.

Fix these issues by moving the assignment to request_alignment
to the right function, and by using a sane default when the
server does not advertise a minimum size.

CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180215032905.27146-1-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy<vsementsov@virtuozzo.com>
(cherry picked from commit fd8d372dd36e839568a718684914d9960d8b1ebd)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agospapr: make pseries-2.11 the default machine type
Greg Kurz [Wed, 20 Jun 2018 12:54:15 +0000 (14:54 +0200)]
spapr: make pseries-2.11 the default machine type

The spapr capability framework was introduced in QEMU 2.12. It allows
to have an explicit control on how host features are exposed to the
guest. This is especially needed to handle migration between hetero-
geneous hosts (eg, POWER8 to POWER9). It is also used to expose fixes/
workarounds against speculative execution vulnerabilities to guests.
The framework was hence backported to QEMU 2.11.1, especially these
commits:

0fac4aa93074 spapr: Add pseries-2.12 machine type
9070f408f491 spapr: Treat Hardware Transactional Memory (HTM) as an
 optional capability

0fac4aa93074 has the confusing effect of making pseries-2.12 the default
machine type for QEMU 2.11.1, instead of the expected pseries-2.11. This
patch changes the default machine back to pseries-2.11.

Unfortunately, 9070f408f491 enforces the HTM capability for pseries-2.11
to be enabled by default, ie, when not passing cap-htm on the command
line. This breaks several 'make check' testcases that run qemu-system-ppc64
with TCG.

The only sane way to fix this is to adapt the impacted testcases so that
they all pass cap-htm=off in this case. This patch does that as well.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agovirtio-balloon: unref the memory region before continuing
Tiwei Bie [Thu, 25 Jan 2018 07:12:43 +0000 (15:12 +0800)]
virtio-balloon: unref the memory region before continuing

Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit b86107ab43b804e899a226fe287e34ab8acef596)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agopci-bridge/i82801b11: clear bridge registers on platform reset
Laszlo Ersek [Wed, 7 Feb 2018 12:10:27 +0000 (13:10 +0100)]
pci-bridge/i82801b11: clear bridge registers on platform reset

The "i82801b11-bridge" device model is a descendant of "base-pci-bridge"
(TYPE_PCI_BRIDGE). However, unlike other similar devices, such as

- pci-bridge,
- pcie-pci-bridge,
- PCIE Root Port,
- xio3130 switch upstream and downstream ports,
- dec-21154-p2p-bridge,
- pbm-bridge,
- xilinx-pcie-root,

"i82801b11-bridge" does not clear the bridge specific registers at
platform reset.

This is a problem because devices on "i82801b11-bridge" continue to
respond to config space cycles after platform reset, when addressed with
the bus number that was previously programmed into the secondary bus
number register of "i82801b11-bridge". This error breaks OVMF's search for
extra (PXB) root buses, for example.

The device class reset method for "i82801b11-bridge" is currently NULL;
set it directly to pci_bridge_reset(), like the last three bridge models
in the above listing do.

Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Marcel Apfelbaum <marcel@redhat.com>
Cc: qemu-stable@nongnu.org
Ref: https://bugzilla.redhat.com/show_bug.cgi?id=1541839
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit ed247f40db84c8bd4bb7d10948702cd47cc4d5ae)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agoblock/ssh: fix possible segmentation fault when .desc is not null-terminated
Murilo Opsfelder Araujo [Fri, 5 Jan 2018 14:44:40 +0000 (12:44 -0200)]
block/ssh: fix possible segmentation fault when .desc is not null-terminated

This patch prevents a possible segmentation fault when .desc members are checked
against NULL.

The ssh_runtime_opts was added by commit
8a6a80896d6af03b8ee0c17cdf37219eca2588a7 ("block/ssh: Use QemuOpts for runtime
options").

This fix was inspired by
http://lists.nongnu.org/archive/html/qemu-devel/2018-01/msg00883.html.

Fixes: 8a6a80896d6af03b8ee0c17cdf37219eca2588a7 ("block/ssh: Use QemuOpts for runtime options")
Cc: Max Reitz <mreitz@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Signed-off-by: Murilo Opsfelder Araujo <muriloo@linux.vnet.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
(cherry picked from commit fbd5c4c0db47e578e3fdd88a0ebc4314a1ed3d42)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
5 years agospapr: register dummy ICPs later
Greg Kurz [Tue, 27 Feb 2018 15:22:58 +0000 (16:22 +0100)]
spapr: register dummy ICPs later

Some older machine types create more ICPs than needed. We hence
need to register up to xics_max_server_number() dummy ICPs to
accomodate the migration of these machine types.

Recent VSMT rework changed xics_max_server_number() to return

    DIV_ROUND_UP(max_cpus * spapr->vsmt, smp_threads)

instead of

    DIV_ROUND_UP(max_cpus * kvmppc_smt_threads(), smp_threads);

The change is okay but it requires spapr->vsmt to be set, which
isn't the case with the current code. This causes the formula to
return zero and we don't create dummy ICPs. This breaks migration
of older guests as reported here:

    https://bugzilla.redhat.com/show_bug.cgi?id=1549087

The dummy ICP workaround doesn't really have a dependency on XICS
itself. But it does depend on proper VCPU id numbering and it must
be applied before creating vCPUs (ie, creating real ICPs). So this
patch moves the workaround to spapr_init_cpus(), which already
assumes VSMT to be set.

Fixes: 72194664c8a1 ("spapr: use spapr->vsmt to compute VCPU ids")
Reported-by: Lukas Doktor <ldoktor@redhat.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit 72fdd4de8e5fdc1a6078e000835fb54592a3fe97)
Signed-off-by: Greg Kurz <groug@kaod.org>
5 years agospapr: fix missing CPU core nodes in DT when running with TCG
Greg Kurz [Fri, 16 Feb 2018 18:58:06 +0000 (19:58 +0100)]
spapr: fix missing CPU core nodes in DT when running with TCG

Commit 5d0fb1508e2d "spapr: consolidate the VCPU id numbering logic
in a single place" introduced a helper to detect thread0 of a virtual
core based on its VCPU id. This is used to create CPU core nodes in
the DT, but it is broken in TCG.

$ qemu-system-ppc64 -nographic -accel tcg -machine dumpdtb=dtb.bin \
                    -smp cores=16,maxcpus=16,threads=1
$ dtc -f -O dts dtb.bin | grep POWER8
                PowerPC,POWER8@0 {
                PowerPC,POWER8@8 {

instead of the expected 16 cores that we get with KVM:

$ dtc -f -O dts dtb.bin | grep POWER8
                PowerPC,POWER8@0 {
                PowerPC,POWER8@8 {
                PowerPC,POWER8@10 {
                PowerPC,POWER8@18 {
                PowerPC,POWER8@20 {
                PowerPC,POWER8@28 {
                PowerPC,POWER8@30 {
                PowerPC,POWER8@38 {
                PowerPC,POWER8@40 {
                PowerPC,POWER8@48 {
                PowerPC,POWER8@50 {
                PowerPC,POWER8@58 {
                PowerPC,POWER8@60 {
                PowerPC,POWER8@68 {
                PowerPC,POWER8@70 {
                PowerPC,POWER8@78 {

This happens because spapr_get_vcpu_id() maps VCPU ids to
cs->cpu_index in TCG mode. This confuses the code in
spapr_is_thread0_in_vcore(), since it assumes thread0 VCPU
ids to have a spapr->vsmt spacing.

    spapr_get_vcpu_id(cpu) % spapr->vsmt == 0

Actually, there's no real reason to expose cs->cpu_index instead
of the VCPU id, since we also generate it with TCG. Also we already
set it explicitly in spapr_set_vcpu_id(), so there's no real reason
either to call kvm_arch_vcpu_id() with KVM.

This patch unifies spapr_get_vcpu_id() to always return the computed
VCPU id both in TCG and KVM. This is one step forward towards KVM<->TCG
migration.

Fixes: 5d0fb1508e2d
Reported-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit b1a568c1c2192f090536b8ac76d135ce1f46a0ee)
Signed-off-by: Greg Kurz <groug@kaod.org>
5 years agospapr: consolidate the VCPU id numbering logic in a single place
Greg Kurz [Wed, 14 Feb 2018 19:40:53 +0000 (20:40 +0100)]
spapr: consolidate the VCPU id numbering logic in a single place

Several places in the code need to calculate a VCPU id:

    (cpu_index / smp_threads) * spapr->vsmt + cpu_index % smp_threads
    (core_id / smp_threads) * spapr->vsmt (1 user)
    index * spapr->vsmt (2 users)

or guess that the VCPU id of a given VCPU is the first thread of a virtual
core:

    index % spapr->vsmt != 0

Even if the numbering logic isn't that complex, it is rather fragile to
have these assumptions open-coded in several places. FWIW this was
proved with recent issues related to VSMT.

This patch moves the VCPU id formula to a single function to be called
everywhere the code needs to compute one. It also adds an helper to
guess if a VCPU is the first thread of a VCORE.

Signed-off-by: Greg Kurz <groug@kaod.org>
[dwg: Rename spapr_is_vcore() to spapr_is_thread0_in_vcore() for clarity]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit 5d0fb1508e2d279da74ef4a103e8def9b52c6304)
Signed-off-by: Greg Kurz <groug@kaod.org>
5 years agospapr: rename spapr_vcpu_id() to spapr_get_vcpu_id()
Greg Kurz [Wed, 14 Feb 2018 19:40:44 +0000 (20:40 +0100)]
spapr: rename spapr_vcpu_id() to spapr_get_vcpu_id()

The spapr_vcpu_id() function is an accessor actually. Let's rename it
for symmetry with the recently added spapr_set_vcpu_id() helper.

The motivation behind this is that a later patch will consolidate
the VCPU id formula in a function and spapr_vcpu_id looks like an
appropriate name.

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit 14bb4486c819ea797a151b3e0fe53d6f5c7b3fc5)
Signed-off-by: Greg Kurz <groug@kaod.org>
5 years agotarget/ppc: Clarify compat mode max_threads value
David Gibson [Mon, 15 Jan 2018 05:40:23 +0000 (16:40 +1100)]
target/ppc: Clarify compat mode max_threads value

We recently had some discussions that were sidetracked for a while, because
nearly everyone misapprehended the purpose of the 'max_threads' field in
the compatiblity modes table.  It's all about guest expectations, not host
expectations or support (that's handled elsewhere).

In an attempt to avoid a repeat of that confusion, rename the field to
'max_vthreads' and add an explanatory comment.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
(cherry picked from commit abbc124753896f72e3715813ea20dd1924202ff0)
Signed-off-by: Greg Kurz <groug@kaod.org>
5 years agospapr: move VCPU calculation to core machine code
Greg Kurz [Wed, 14 Feb 2018 19:40:35 +0000 (20:40 +0100)]
spapr: move VCPU calculation to core machine code

The VCPU ids are currently computed and assigned to each individual
CPU threads in spapr_cpu_core_realize(). But the numbering logic
of VCPU ids is actually a machine-level concept, and many places
in hw/ppc/spapr.c also have to compute VCPU ids out of CPU indexes.

The current formula used in spapr_cpu_core_realize() is:

    vcpu_id = (cc->core_id * spapr->vsmt / smp_threads) + i

where:

    cc->core_id is a multiple of smp_threads
    cpu_index = cc->core_id + i
    0 <= i < smp_threads

So we have:

    cpu_index % smp_threads == i
    cc->core_id / smp_threads == cpu_index / smp_threads

hence:

    vcpu_id =
        (cpu_index / smp_threads) * spapr->vsmt + cpu_index % smp_threads;

This formula was used before VSMT at the time VCPU ids where computed
at the target emulation level. It has the advantage of being useable
to derive a VPCU id out of a CPU index only. It is fitted for all the
places where the machine code has to compute a VCPU id.

This patch introduces an accessor to set the VCPU id in a PowerPCCPU object
using the above formula. It is a first step to consolidate all the VCPU id
logic in a single place.

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit 648edb64751ea0e550f36302fa66f9f11e480824)
Signed-off-by: Greg Kurz <groug@kaod.org>
5 years agospapr: use spapr->vsmt to compute VCPU ids
Greg Kurz [Wed, 14 Feb 2018 19:40:26 +0000 (20:40 +0100)]
spapr: use spapr->vsmt to compute VCPU ids

Since the introduction of VSMT in 2.11, the spacing of VCPU ids
between cores is controllable through a machine property instead
of being only dictated by the SMT mode of the host:

    cpu->vcpu_id = (cc->core_id * spapr->vsmt / smp_threads) + i

Until recently, the machine code would try to change the SMT mode
of the host to be equal to VSMT or exit. This allowed the rest of
the code to assume that kvmppc_smt_threads() == spapr->vsmt is
always true.

Recent commit "8904e5a75005 spapr: Adjust default VSMT value for
better migration compatibility" relaxed the rule. If the VSMT
mode cannot be set in KVM for some reasons, but the requested
CPU topology is compatible with the current SMT mode, then we
let the guest run with  kvmppc_smt_threads() != spapr->vsmt.

This breaks quite a few places in the code, in particular when
calculating DRC indexes.

This is what happens on a POWER host with subcores-per-core=2 (ie,
supports up to SMT4) when passing the following topology:

    -smp threads=4,maxcpus=16 \
    -device host-spapr-cpu-core,core-id=4,id=core1 \
    -device host-spapr-cpu-core,core-id=8,id=core2

qemu-system-ppc64: warning: Failed to set KVM's VSMT mode to 8 (errno -22)

This is expected since KVM is limited to SMT4, but the guest is started
anyway because this topology can run on SMT4 even with a VSMT8 spacing.

But when we look at the DT, things get nastier:

cpus {
        ...
        ibm,drc-indexes = <0x4 0x10000000 0x10000004 0x10000008 0x1000000c>;

This means that we have the following association:

 CPU core device |     DRC    | VCPU id
-----------------+------------+---------
   boot core     | 0x10000000 | 0
   core1         | 0x10000004 | 4
   core2         | 0x10000008 | 8
   core3         | 0x1000000c | 12

But since the spacing of VCPU ids is 8, the DRC for core1 points to a
VCPU that doesn't exist, the DRC for core2 points to the first VCPU of
core1 and and so on...

        ...

        PowerPC,POWER8@0 {
                ...
                ibm,my-drc-index = <0x10000000>;
                ...
        };

        PowerPC,POWER8@8 {
                ...
                ibm,my-drc-index = <0x10000008>;
                ...
        };

        PowerPC,POWER8@10 {
                ...

No ibm,my-drc-index property for this core since 0x10000010 doesn't
exist in ibm,drc-indexes above.

                ...
        };
};

...

interrupt-controller {
        ...
        ibm,interrupt-server-ranges = <0x0 0x10>;

With a spacing of 8, the highest VCPU id for the given topology should be:
        16 * 8 / 4 = 32 and not 16

        ...
        linux,phandle = <0x7e7323b8>;
        interrupt-controller;
};

And CPU hot-plug/unplug is broken:

(qemu) device_del core1
pseries-hotplug-cpu: Cannot find CPU (drc index 10000004) to remove

(qemu) device_del core2
cpu 4 (hwid 8) Ready to die...
cpu 5 (hwid 9) Ready to die...
cpu 6 (hwid 10) Ready to die...
cpu 7 (hwid 11) Ready to die...

These are the VCPU ids of core1 actually

(qemu) device_add host-spapr-cpu-core,core-id=12,id=core3
(qemu) device_del core3
pseries-hotplug-cpu: Cannot find CPU (drc index 1000000c) to remove

This patches all the code in hw/ppc/spapr.c to assume the VSMT
spacing when manipulating VCPU ids.

Fixes: 8904e5a75005
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit 72194664c8a16b67865eb95054f984dd169cfa86)

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
5 years agospapr: set vsmt to MAX(8, smp_threads)
Laurent Vivier [Fri, 9 Feb 2018 08:18:58 +0000 (09:18 +0100)]
spapr: set vsmt to MAX(8, smp_threads)

We ignore silently the value of smp_threads when we set
the default VSMT value, and if smp_threads is greater than VSMT
kernel is going into trouble later.

Fixes: 8904e5a750
("spapr: Adjust default VSMT value for better migration compatibility")

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit 4ad64cbd0c3f9df15be5f7d1c920285551e802ca)
Signed-off-by: Greg Kurz <groug@kaod.org>
5 years agospapr: Adjust default VSMT value for better migration compatibility
David Gibson [Mon, 15 Jan 2018 06:51:33 +0000 (17:51 +1100)]
spapr: Adjust default VSMT value for better migration compatibility

fa98fbfc "PC: KVM: Support machine option to set VSMT mode" introduced the
"vsmt" parameter for the pseries machine type, which controls the spacing
of the vcpu ids of thread 0 for each virtual core.  This was done to bring
some consistency and stability to how that was done, while still allowing
backwards compatibility for migration and otherwise.

The default value we used for vsmt was set to the max of the host's
advertised default number of threads and the number of vthreads per vcore
in the guest.  This was done to continue running without extra parameters
on older KVM versions which don't allow the VSMT value to be changed.

Unfortunately, even that smaller than before leakage of host configuration
into guest visible configuration still breaks things.  Specifically a guest
with 4 (or less) vthread/vcore will get a different vsmt value when
running on a POWER8 (vsmt==8) and POWER9 (vsmt==4) host.  That means the
vcpu ids don't line up so you can't migrate between them, though you should
be able to.

Long term we really want to make vsmt == smp_threads for sufficiently
new machine types.  However, that means that qemu will then require a
sufficiently recent KVM (one which supports changing VSMT) - that's still
not widely enough deployed to be really comfortable to do.

In the meantime we need some default that will work as often as
possible.  This patch changes that default to 8 in all circumstances.
This does change guest visible behaviour (including for existing
machine versions) for many cases - just not the most common/important
case.

Following is case by case justification for why this is still the least
worst option.  Note that any of the old behaviours can still be duplicated
after this patch, it's just that it requires manual intervention by
setting the vsmt property on the command line.

KVM HV on POWER8 host:
   This is the overwhelmingly common case in production setups, and is
   unchanged by design.  POWER8 hosts will advertise a default VSMT mode
   of 8, and > 8 vthreads/vcore isn't permitted

KVM HV on POWER7 host:
   Will break, but POWER7s allowing KVM were never released to the public.

KVM HV on POWER9 host:
   Not yet released to the public, breaking this now will reduce other
   breakage later.

KVM HV on PowerPC 970:
   Will theoretically break it, but it was barely supported to begin with
   and already required various user visible hacks to work.  Also so old
   that I just don't care.

TCG:
   This is the nastiest one; it means migration of TCG guests (without
   manual vsmt setting) will break.  Since TCG is rarely used in production
   I think this is worth it for the other benefits.  It does also remove
   one more barrier to TCG<->KVM migration which could be interesting for
   debugging applications.

KVM PR:
   As with TCG, this will break migration of existing configurations,
   without adding extra manual vsmt options.  As with TCG, it is rare in
   production so I think the benefits outweigh breakages.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 8904e5a75005fe579c28806003892d8ae4a27dfa)
Signed-off-by: Greg Kurz <groug@kaod.org>
5 years agospapr: Allow some cases where we can't set VSMT mode in the kernel
David Gibson [Tue, 16 Jan 2018 04:37:37 +0000 (15:37 +1100)]
spapr: Allow some cases where we can't set VSMT mode in the kernel

At present if we require a vsmt mode that's not equal to the kernel's
default, and the kernel doesn't let us change it (e.g. because it's an old
kernel without support) then we always fail.

But in fact we can cope with the kernel having a different vsmt as long as
  a) it's >= the actual number of vthreads/vcore (so that guest threads
     that are supposed to be on the same core act like it)
  b) it's a submultiple of the requested vsmt mode (so that guest threads
     spaced by the vsmt value will act like they're on different cores)

Allowing this case gives us a bit more freedom to adjust the vsmt behaviour
without breaking existing cases.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Tested-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 1f20f2e0ee61d91abff4e86ed1cda1b5244647d3)
Signed-off-by: Greg Kurz <groug@kaod.org>
5 years agosdl: workaround bug in sdl 2.0.8 headers
Gerd Hoffmann [Wed, 7 Mar 2018 15:42:57 +0000 (16:42 +0100)]
sdl: workaround bug in sdl 2.0.8 headers

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=892087

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: 20180307154258.9313-1-kraxel@redhat.com
(cherry picked from commit 2ca5c43091324a68772dc348cdf157c63888c168)
Signed-off-by: Greg Kurz <groug@kaod.org>
5 years agomemfd: fix configure test
Paolo Bonzini [Tue, 28 Nov 2017 10:51:27 +0000 (11:51 +0100)]
memfd: fix configure test

Recent glibc added memfd_create in sys/mman.h.  This conflicts with
the definition in util/memfd.c:

    /builddir/build/BUILD/qemu-2.11.0-rc1/util/memfd.c:40:12: error: static declaration of memfd_create follows non-static declaration

Fix the configure test, and remove the sys/memfd.h inclusion since the
file actually does not exist---it is a typo in the memfd_create(2) man
page.

Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 75e5b70e6b5dcc4f2219992d7cffa462aa406af0)
Signed-off-by: Greg Kurz <groug@kaod.org>
6 years agoUpdate version for 2.11.1 release v2.11.1
Michael Roth [Wed, 14 Feb 2018 20:41:05 +0000 (14:41 -0600)]
Update version for 2.11.1 release

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
6 years agospapr: add missing break in h_get_cpu_characteristics()
Greg Kurz [Thu, 1 Feb 2018 19:47:41 +0000 (20:47 +0100)]
spapr: add missing break in h_get_cpu_characteristics()

Detected by Coverity (CID 1385702). This fixes the recently added hypercall
to let guests properly apply Spectre and Meltdown workarounds.

Fixes: c59704b25473 "target/ppc/spapr: Add H-Call H_GET_CPU_CHARACTERISTICS"
Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit fa86f59234919b479b7e8da6b0dc2dad894a5eac)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
6 years agovga: check the validation of memory addr when draw text
linzhecheng [Thu, 11 Jan 2018 13:27:24 +0000 (21:27 +0800)]
vga: check the validation of memory addr when draw text

Start a vm with qemu-kvm -enable-kvm -vnc :66 -smp 1 -m 1024 -hda
redhat_5.11.qcow2  -device pcnet -vga cirrus,
then use VNC client to connect to VM, and excute the code below in guest
OS will lead to qemu crash:

int main()
 {
    iopl(3);
    srand(time(NULL));
    int a,b;
    while(1){
a = rand()%0x100;
b = 0x3c0 + (rand()%0x20);
        outb(a,b);
    }
    return 0;
}

The above code is writing the registers of VGA randomly.
We can write VGA CRT controller registers index 0x0C or 0x0D
(which is the start address register) to modify the
the display memory address of the upper left pixel
or character of the screen. The address may be out of the
range of vga ram. So we should check the validation of memory address
when reading or writing it to avoid segfault.

Signed-off-by: linzhecheng <linzhecheng@huawei.com>
Message-id: 20180111132724.13744-1-linzhecheng@huawei.com
Fixes: CVE-2018-5683
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 191f59dc17396bb5a8da50f8c59b6e0a430711a4)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
6 years agoinput: fix memory leak
linzhecheng [Mon, 25 Dec 2017 02:37:30 +0000 (10:37 +0800)]
input: fix memory leak

If kbd_queue is not empty and queue_count >= queue_limit,
we should free evt.

Change-Id: Ieeacf90d5e7e370a40452ec79031912d8b864d83
Signed-off-by: linzhecheng <linzhecheng@huawei.com>
Message-id: 20171225023730.5512-1-linzhecheng@huawei.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit fca4774a96843ba9d32a5d5d1c3826e1478facae)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
6 years agoui: correctly advance output buffer when writing SASL data
Daniel P. Berrangé [Thu, 1 Feb 2018 15:58:41 +0000 (15:58 +0000)]
ui: correctly advance output buffer when writing SASL data

In this previous commit:

  commit 8f61f1c5a6bc06438a1172efa80bc7606594fa07
  Author: Daniel P. Berrange <berrange@redhat.com>
  Date:   Mon Dec 18 19:12:20 2017 +0000

    ui: track how much decoded data we consumed when doing SASL encoding

I attempted to fix a flaw with tracking how much data had actually been
processed when encoding with SASL. With that flaw, the VNC server could
mistakenly discard queued data that had not been sent.

The fix was not quite right though, because it merely decremented the
vs->output.offset value. This is effectively discarding data from the
end of the pending output buffer. We actually need to discard data from
the start of the pending output buffer. We also want to free memory that
is no longer required. The correct way to handle this is to use the
buffer_advance() helper method instead of directly manipulating the
offset value.

Reported-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Message-id: 20180201155841.27509-1-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 627ebec208a8809818589e17f4fce55a59420ad2)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
6 years agoui: avoid sign extension using client width/height
Daniel P. Berrange [Thu, 18 Jan 2018 15:52:54 +0000 (15:52 +0000)]
ui: avoid sign extension using client width/height

Pixman returns a signed int for the image width/height, but the VNC
protocol only permits a unsigned int16. Effective framebuffer size
is determined by the guest, limited by the video RAM size, so the
dimensions are unlikely to exceed the range of an unsigned int16,
but this is not currently validated.

With the current use of 'int' for client width/height, the calculation
of offsets in vnc_update_throttle_offset() suffers from integer size
promotion and sign extension, causing coverity warnings

*** CID 1385147:  Integer handling issues  (SIGN_EXTENSION)
/ui/vnc.c: 979 in vnc_update_throttle_offset()
973      * than that the client would already suffering awful audio
974      * glitches, so dropping samples is no worse really).
975      */
976     static void vnc_update_throttle_offset(VncState *vs)
977     {
978         size_t offset =
>>>     CID 1385147:  Integer handling issues  (SIGN_EXTENSION)
>>>     Suspicious implicit sign extension:
    "vs->client_pf.bytes_per_pixel" with type "unsigned char" (8 bits,
    unsigned) is promoted in "vs->client_width * vs->client_height *
    vs->client_pf.bytes_per_pixel" to type "int" (32 bits, signed), then
    sign-extended to type "unsigned long" (64 bits, unsigned).  If
    "vs->client_width * vs->client_height * vs->client_pf.bytes_per_pixel"
    is greater than 0x7FFFFFFF, the upper bits of the result will all be 1.
979             vs->client_width * vs->client_height * vs->client_pf.bytes_per_pixel;

Change client_width / client_height to be a size_t to avoid sign
extension and integer promotion. Then validate that dimensions are in
range wrt the RFB protocol u16 limits.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20180118155254.17053-1-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 4c956bd81e2e16afd19d38d1fdeba6d9faa8a1ae)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
6 years agoui: mix misleading comments & return types of VNC I/O helper methods
Daniel P. Berrange [Mon, 18 Dec 2017 19:12:28 +0000 (19:12 +0000)]
ui: mix misleading comments & return types of VNC I/O helper methods

While the QIOChannel APIs for reading/writing data return ssize_t, with negative
value indicating an error, the VNC code passes this return value through the
vnc_client_io_error() method. This detects the error condition, disconnects the
client and returns 0 to indicate error. Thus all the VNC helper methods should
return size_t (unsigned), and misleading comments which refer to the possibility
of negative return values need fixing.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171218191228.31018-14-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 30b80fd5269257f55203b7072c505b4ebaab5115)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
6 years agoui: add trace events related to VNC client throttling
Daniel P. Berrange [Mon, 18 Dec 2017 19:12:27 +0000 (19:12 +0000)]
ui: add trace events related to VNC client throttling

The VNC client throttling is quite subtle so will benefit from having trace
points available for live debugging.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171218191228.31018-13-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 6aa22a29187e1908f5db738d27c64a9efc8d0bfa)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
6 years agoui: place a hard cap on VNC server output buffer size
Daniel P. Berrange [Mon, 18 Dec 2017 19:12:26 +0000 (19:12 +0000)]
ui: place a hard cap on VNC server output buffer size

The previous patches fix problems with throttling of forced framebuffer updates
and audio data capture that would cause the QEMU output buffer size to grow
without bound. Those fixes are graceful in that once the client catches up with
reading data from the server, everything continues operating normally.

There is some data which the server sends to the client that is impractical to
throttle. Specifically there are various pseudo framebuffer update encodings to
inform the client of things like desktop resizes, pointer changes, audio
playback start/stop, LED state and so on. These generally only involve sending
a very small amount of data to the client, but a malicious guest might be able
to do things that trigger these changes at a very high rate. Throttling them is
not practical as missed or delayed events would cause broken behaviour for the
client.

This patch thus takes a more forceful approach of setting an absolute upper
bound on the amount of data we permit to be present in the output buffer at
any time. The previous patch set a threshold for throttling the output buffer
by allowing an amount of data equivalent to one complete framebuffer update and
one seconds worth of audio data. On top of this it allowed for one further
forced framebuffer update to be queued.

To be conservative, we thus take that throttling threshold and multiply it by
5 to form an absolute upper bound. If this bound is hit during vnc_write() we
forceably disconnect the client, refusing to queue further data. This limit is
high enough that it should never be hit unless a malicious client is trying to
exploit the sever, or the network is completely saturated preventing any sending
of data on the socket.

This completes the fix for CVE-2017-15124 started in the previous patches.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171218191228.31018-12-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit f887cf165db20f405cb8805c716bd363aaadf815)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
6 years agoui: fix VNC client throttling when forced update is requested
Daniel P. Berrange [Mon, 18 Dec 2017 19:12:25 +0000 (19:12 +0000)]
ui: fix VNC client throttling when forced update is requested

The VNC server must throttle data sent to the client to prevent the 'output'
buffer size growing without bound, if the client stops reading data off the
socket (either maliciously or due to stalled/slow network connection).

The current throttling is very crude because it simply checks whether the
output buffer offset is zero. This check is disabled if the client has requested
a forced update, because we want to send these as soon as possible.

As a result, the VNC client can cause QEMU to allocate arbitrary amounts of RAM.
They can first start something in the guest that triggers lots of framebuffer
updates eg play a youtube video. Then repeatedly send full framebuffer update
requests, but never read data back from the server. This can easily make QEMU's
VNC server send buffer consume 100MB of RAM per second, until the OOM killer
starts reaping processes (hopefully the rogue QEMU process, but it might pick
others...).

To address this we make the throttling more intelligent, so we can throttle
full updates. When we get a forced update request, we keep track of exactly how
much data we put on the output buffer. We will not process a subsequent forced
update request until this data has been fully sent on the wire. We always allow
one forced update request to be in flight, regardless of what data is queued
for incremental updates or audio data. The slight complication is that we do
not initially know how much data an update will send, as this is done in the
background by the VNC job thread. So we must track the fact that the job thread
has an update pending, and not process any further updates until this job is
has been completed & put data on the output buffer.

This unbounded memory growth affects all VNC server configurations supported by
QEMU, with no workaround possible. The mitigating factor is that it can only be
triggered by a client that has authenticated with the VNC server, and who is
able to trigger a large quantity of framebuffer updates or audio samples from
the guest OS. Mostly they'll just succeed in getting the OOM killer to kill
their own QEMU process, but its possible other processes can get taken out as
collateral damage.

This is a more general variant of the similar unbounded memory usage flaw in
the websockets server, that was previously assigned CVE-2017-15268, and fixed
in 2.11 by:

  commit a7b20a8efa28e5f22c26c06cd06c2f12bc863493
  Author: Daniel P. Berrange <berrange@redhat.com>
  Date:   Mon Oct 9 14:43:42 2017 +0100

    io: monitor encoutput buffer size from websocket GSource

This new general memory usage flaw has been assigned CVE-2017-15124, and is
partially fixed by this patch.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171218191228.31018-11-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit ada8d2e4369ea49677d8672ac81bce73eefd5b54)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
6 years agoui: fix VNC client throttling when audio capture is active
Daniel P. Berrange [Mon, 18 Dec 2017 19:12:24 +0000 (19:12 +0000)]
ui: fix VNC client throttling when audio capture is active

The VNC server must throttle data sent to the client to prevent the 'output'
buffer size growing without bound, if the client stops reading data off the
socket (either maliciously or due to stalled/slow network connection).

The current throttling is very crude because it simply checks whether the
output buffer offset is zero. This check must be disabled if audio capture is
enabled, because when streaming audio the output buffer offset will rarely be
zero due to queued audio data, and so this would starve framebuffer updates.

As a result, the VNC client can cause QEMU to allocate arbitrary amounts of RAM.
They can first start something in the guest that triggers lots of framebuffer
updates eg play a youtube video. Then enable audio capture, and simply never
read data back from the server. This can easily make QEMU's VNC server send
buffer consume 100MB of RAM per second, until the OOM killer starts reaping
processes (hopefully the rogue QEMU process, but it might pick others...).

To address this we make the throttling more intelligent, so we can throttle
when audio capture is active too. To determine how to throttle incremental
updates or audio data, we calculate a size threshold. Normally the threshold is
the approximate number of bytes associated with a single complete framebuffer
update. ie width * height * bytes per pixel. We'll send incremental updates
until we hit this threshold, at which point we'll stop sending updates until
data has been written to the wire, causing the output buffer offset to fall
back below the threshold.

If audio capture is enabled, we increase the size of the threshold to also
allow for upto 1 seconds worth of audio data samples. ie nchannels * bytes
per sample * frequency. This allows the output buffer to have a mixture of
incremental framebuffer updates and audio data queued, but once the threshold
is exceeded, audio data will be dropped and incremental updates will be
throttled.

This unbounded memory growth affects all VNC server configurations supported by
QEMU, with no workaround possible. The mitigating factor is that it can only be
triggered by a client that has authenticated with the VNC server, and who is
able to trigger a large quantity of framebuffer updates or audio samples from
the guest OS. Mostly they'll just succeed in getting the OOM killer to kill
their own QEMU process, but its possible other processes can get taken out as
collateral damage.

This is a more general variant of the similar unbounded memory usage flaw in
the websockets server, that was previously assigned CVE-2017-15268, and fixed
in 2.11 by:

  commit a7b20a8efa28e5f22c26c06cd06c2f12bc863493
  Author: Daniel P. Berrange <berrange@redhat.com>
  Date:   Mon Oct 9 14:43:42 2017 +0100

    io: monitor encoutput buffer size from websocket GSource

This new general memory usage flaw has been assigned CVE-2017-15124, and is
partially fixed by this patch.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171218191228.31018-10-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit e2b72cb6e0443d90d7ab037858cb6834b6cca852)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
6 years agoui: refactor code for determining if an update should be sent to the client
Daniel P. Berrange [Mon, 18 Dec 2017 19:12:23 +0000 (19:12 +0000)]
ui: refactor code for determining if an update should be sent to the client

The logic for determining if it is possible to send an update to the client
will become more complicated shortly, so pull it out into a separate method
for easier extension later.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171218191228.31018-9-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 0bad834228b9ee63e4239108d02dcb94568254d0)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
6 years agoui: correctly reset framebuffer update state after processing dirty regions
Daniel P. Berrange [Mon, 18 Dec 2017 19:12:22 +0000 (19:12 +0000)]
ui: correctly reset framebuffer update state after processing dirty regions

According to the RFB protocol, a client sends one or more framebuffer update
requests to the server. The server can reply with a single framebuffer update
response, that covers all previously received requests. Once the client has
read this update from the server, it may send further framebuffer update
requests to monitor future changes. The client is free to delay sending the
framebuffer update request if it needs to throttle the amount of data it is
reading from the server.

The QEMU VNC server, however, has never correctly handled the framebuffer
update requests. Once QEMU has received an update request, it will continue to
send client updates forever, even if the client hasn't asked for further
updates. This prevents the client from throttling back data it gets from the
server. This change fixes the flawed logic such that after a set of updates are
sent out, QEMU waits for a further update request before sending more data.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171218191228.31018-8-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 728a7ac95484a7ba5e624ccbac4c1326571576b0)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
6 years agoui: introduce enum to track VNC client framebuffer update request state
Daniel P. Berrange [Mon, 18 Dec 2017 19:12:21 +0000 (19:12 +0000)]
ui: introduce enum to track VNC client framebuffer update request state

Currently the VNC servers tracks whether a client has requested an incremental
or forced update with two boolean flags. There are only really 3 distinct
states to track, so create an enum to more accurately reflect permitted states.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171218191228.31018-7-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit fef1bbadfb2c3027208eb3d14b43e1bdb51166ca)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
6 years agoui: track how much decoded data we consumed when doing SASL encoding
Daniel P. Berrange [Mon, 18 Dec 2017 19:12:20 +0000 (19:12 +0000)]
ui: track how much decoded data we consumed when doing SASL encoding

When we encode data for writing with SASL, we encode the entire pending output
buffer. The subsequent write, however, may not be able to send the full encoded
data in one go though, particularly with a slow network. So we delay setting the
output buffer offset back to zero until all the SASL encoded data is sent.

Between encoding the data and completing sending of the SASL encoded data,
however, more data might have been placed on the pending output buffer. So it
is not valid to set offset back to zero. Instead we must keep track of how much
data we consumed during encoding and subtract only that amount.

With the current bug we would be throwing away some pending data without having
sent it at all. By sheer luck this did not previously cause any serious problem
because appending data to the send buffer is always an atomic action, so we
only ever throw away complete RFB protocol messages. In the case of frame buffer
updates we'd catch up fairly quickly, so no obvious problem was visible.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171218191228.31018-6-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 8f61f1c5a6bc06438a1172efa80bc7606594fa07)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
6 years agoui: avoid pointless VNC updates if framebuffer isn't dirty
Daniel P. Berrange [Mon, 18 Dec 2017 19:12:19 +0000 (19:12 +0000)]
ui: avoid pointless VNC updates if framebuffer isn't dirty

The vnc_update_client() method checks the 'has_dirty' flag to see if there are
dirty regions that are pending to send to the client. Regardless of this flag,
if a forced update is requested, updates must be sent. For unknown reasons
though, the code also tries to sent updates if audio capture is enabled. This
makes no sense as audio capture state does not impact framebuffer contents, so
this check is removed.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171218191228.31018-5-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 3541b08475d51bddf8aded36576a0ff5a547a978)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
6 years agoui: remove redundant indentation in vnc_client_update
Daniel P. Berrange [Mon, 18 Dec 2017 19:12:18 +0000 (19:12 +0000)]
ui: remove redundant indentation in vnc_client_update

Now that previous dead / unreachable code has been removed, we can simplify
the indentation in the vnc_client_update method.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171218191228.31018-4-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit b939eb89b6f320544a9328fa908d881d0024c1ee)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
6 years agoui: remove unreachable code in vnc_update_client
Daniel P. Berrange [Mon, 18 Dec 2017 19:12:17 +0000 (19:12 +0000)]
ui: remove unreachable code in vnc_update_client

A previous commit:

  commit 5a8be0f73d6f60ff08746377eb09ca459f39deab
  Author: Gerd Hoffmann <kraxel@redhat.com>
  Date:   Wed Jul 13 12:21:20 2016 +0200

    vnc: make sure we finish disconnect

Added a check for vs->disconnecting at the very start of the
vnc_update_client method. This means that the very next "if"
statement check for !vs->disconnecting always evaluates true,
and is thus redundant. This in turn means the vs->disconnecting
check at the very end of the method never evaluates true, and
is thus unreachable code.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171218191228.31018-3-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit c53df961617736f94731d94b62c2954c261d2bae)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
6 years agoui: remove 'sync' parameter from vnc_update_client
Daniel P. Berrange [Mon, 18 Dec 2017 19:12:16 +0000 (19:12 +0000)]
ui: remove 'sync' parameter from vnc_update_client

There is only one caller of vnc_update_client and that always passes false
for the 'sync' parameter.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171218191228.31018-2-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 6af998db05aec9af95a06f84ad94f1b96785e667)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
6 years agomigration: incoming postcopy advise sanity checks
Greg Kurz [Tue, 6 Feb 2018 11:23:30 +0000 (12:23 +0100)]
migration: incoming postcopy advise sanity checks

If postcopy-ram was set on the source but not on the destination,
migration doesn't occur, the destination prints an error and boots
the guest:

qemu-system-ppc64: Expected vmdescription section, but got 0

We end up with two running instances.

This behaviour was introduced in 2.11 by commit 58110f0acb1a "migration:
split common postcopy out of ram postcopy" to prepare ground for the
upcoming dirty bitmap postcopy support. It adds a new case where the
source may send an empty postcopy advise because dirty bitmap doesn't
need to check page sizes like RAM postcopy does.

If the source has enabled postcopy-ram, then it sends an advise with
the page size values. If the destination hasn't enabled postcopy-ram,
then loadvm_postcopy_handle_advise() leaves the page size values on
the stream and returns. This confuses qemu_loadvm_state() later on
and causes the destination to start execution.

As discussed several times, postcopy-ram should be enabled both sides
to be functional. This patch changes the destination to perform some
extra checks on the advise length to ensure this is the case. Otherwise
an error is returned and migration is aborted.

Reported-by: Balamuruhan S <bala24@linux.vnet.ibm.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <151791621042.19120.3103118434734245776.stgit@bahia>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
(cherry picked from commit 875fcd013ab68c64802998b22f54f0184479d21b)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
6 years agotarget/sh4: add missing tcg_temp_free() in _decode_opc()
Philippe Mathieu-Daudé [Tue, 5 Dec 2017 17:00:13 +0000 (14:00 -0300)]
target/sh4: add missing tcg_temp_free() in _decode_opc()

missed in c55497ecb8c and 852d481faf7.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20171205170013.22337-3-f4bug@amsat.org>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
(cherry picked from commit e691e0ed135ac989221683ca9560c34d357edc57)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>