git.proxmox.com Git - mirror

Update version for 3.0.1 release

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

vhost-user: Fix userfaultfd leak

'fd' received from the vhost side is never freed.
Also, everything (including 'postcopy_listen' state) should be
cleaned up on vhost cleanup.

Fixes: 46343570c06e ("vhost+postcopy: Wire up POSTCOPY_END notify")
Fixes: f82c11165ffa ("vhost+postcopy: Register shared ufd with postcopy")
Cc: qemu-stable@nongnu.org
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Message-Id: <20181008160536.6332-3-i.maximets@samsung.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
(cherry picked from commit c4f753859ae6da1aeb93cad19c586fea1925e269)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

vhost-user: Don't ask for reply on postcopy mem table set

According to documentation, NEED_REPLY_MASK should not be set
for VHOST_USER_SET_MEM_TABLE request in postcopy mode.
This restriction was mistakenly applied to 'reply_supported'
variable, which is local and used only for non-postcopy case.

CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
Fixes: 9bb38019942c ("vhost+postcopy: Send address back to qemu")
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Message-Id: <20181002140947.4107-1-i.maximets@samsung.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
(cherry picked from commit 5ce43896e1679e706db1562d0e2d86ad428ed1e6)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

ppc/pnv: check size before data buffer access

While performing PowerNV memory r/w operations, the access length
'sz' could exceed the data[4] buffer size. Add check to avoid OOB
access.

Reported-by: Moguofang <moguofang@huawei.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit d07945e78eb6b593cd17a4640c1fc9eb35e3245d)
*CVE-2018-18954
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

slirp: check data length while emulating ident function

While emulating identification protocol, tcp_emu() does not check
available space in the 'sc_rcv->sb_data' buffer. It could lead to
heap buffer overflow issue. Add check to avoid it.

Reported-by: Kira <864786842@qq.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
(cherry picked from commit a7104eda7dab99d0cdbd3595c211864cba415905)
*CVE-2019-6778
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

lsi53c895a: check message length value is valid

While writing a message in 'lsi_do_msgin', message length value
in 'msg_len' could be invalid due to an invalid migration stream.
Add an assertion to avoid an out of bounds access, and reject
the incoming migration data if it contains an invalid message
length.

Discovered by Deja vu Security. Reported by Oracle.

Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <20181026194314.18663-1-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit e58ccf039650065a9442de43c9816f81e88f27f6)
*CVE-2018-18849
*avoid context dep. on c921370b22c
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

seccomp: set the seccomp filter to all threads

When using "-seccomp on", the seccomp policy is only applied to the
main thread, the vcpu worker thread and other worker threads created
after seccomp policy is applied; the seccomp policy is not applied to
e.g. the RCU thread because it is created before the seccomp policy is
applied and SECCOMP_FILTER_FLAG_TSYNC isn't used.

This can be verified with
for task in /proc/`pidof qemu`/task/*; do cat $task/status | grep Secc ; done
Seccomp: 2
Seccomp: 0
Seccomp: 0
Seccomp: 2
Seccomp: 2
Seccomp: 2

Starting with libseccomp 2.2.0 and kernel >= 3.17, we can use
seccomp_attr_set(ctx, > SCMP_FLTATR_CTL_TSYNC, 1) to update the policy
on all threads.

libseccomp requirement was bumped to 2.2.0 in previous patch.
libseccomp should fail to set the filter if it can't honour
SCMP_FLTATR_CTL_TSYNC (untested), and thus -sandbox will now fail on
kernel < 3.17.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Eduardo Otubo <otubo@redhat.com>
(cherry picked from commit 70dfabeaa79ba4d7a3b699abe1a047c8012db114)
*CVE-2018-15746
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

configure: require libseccomp 2.2.0

The following patch is going to require TSYNC, which is only available
since libseccomp 2.2.0.

libseccomp 2.2.0 was released February 12, 2015.

According to repology, libseccomp version in different distros:

  RHEL-7: 2.3.1
  Debian (Stretch): 2.3.1
  OpenSUSE Leap 15: 2.3.2
  Ubuntu (Xenial):  2.3.1

This will drop support for -sandbox on:

  Debian (Jessie): 2.1.1 (but 2.2.3 in backports)

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Eduardo Otubo <otubo@redhat.com>
(cherry picked from commit d0699bd37c48067cffbd80383172efc29da6d2f9)
*CVE-2018-15746
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

seccomp: prefer SCMP_ACT_KILL_PROCESS if available

The upcoming libseccomp release should have SCMP_ACT_KILL_PROCESS
action (https://github.com/seccomp/libseccomp/issues/96).

SCMP_ACT_KILL_PROCESS is preferable to immediately terminate the
offending process, rather than having the SIGSYS handler running.

Use SECCOMP_GET_ACTION_AVAIL to check availability of kernel support,
as libseccomp will fallback on SCMP_ACT_KILL otherwise, and we still
prefer SCMP_ACT_TRAP.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Acked-by: Eduardo Otubo <otubo@redhat.com>
(cherry picked from commit bda08a5764d470f101fa38635d30b41179a313e1)
*CVE-2018-15746
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

seccomp: use SIGSYS signal instead of killing the thread

The seccomp action SCMP_ACT_KILL results in immediate termination of
the thread that made the bad system call. However, qemu being
multi-threaded, it keeps running. There is no easy way for parent
process / management layer (libvirt) to know about that situation.

Instead, the default SIGSYS handler when invoked with SCMP_ACT_TRAP
will terminate the program and core dump.

This may not be the most secure solution, but probably better than
just killing the offending thread. SCMP_ACT_KILL_PROCESS has been
added in Linux 4.14 to improve the situation, which I propose to use
by default if available in the next patch.

Related to:
https://bugzilla.redhat.com/show_bug.cgi?id=1594456

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Acked-by: Eduardo Otubo <otubo@redhat.com>
(cherry picked from commit 6f2231e9b0931e1998d9ed0c509adf7aedc02db2)
*CVE-2018-15746
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

usb-mtp: use O_NOFOLLOW and O_CLOEXEC.

Open files and directories with O_NOFOLLOW to avoid symlinks attacks.
While being at it also add O_CLOEXEC.

usb-mtp only handles regular files and directories and ignores
everything else, so users should not see a difference.

Because qemu ignores symlinks, carrying out a successful symlink attack
requires swapping an existing file or directory below rootdir for a
symlink and winning the race against the inotify notification to qemu.

Fixes: CVE-2018-16872
Cc: Prasad J Pandit <ppandit@redhat.com>
Cc: Bandan Das <bsd@redhat.com>
Reported-by: Michael Hanselmann <public@hansmi.ch>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Michael Hanselmann <public@hansmi.ch>
Message-id: 20181213122511.13853-1-kraxel@redhat.com
(cherry picked from commit bab9df35ce73d1c8e19a37e2737717ea1c984dc1)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

usb-mtp: outlaw slashes in filenames

Slash is unix directory separator, so they are not allowed in filenames.
Note this also stops the classic escape via "../".

Fixes: CVE-2018-16867
Reported-by: Michael Hanselmann <public@hansmi.ch>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20181203101045.27976-3-kraxel@redhat.com
(cherry picked from commit c52d46e041b42bb1ee6f692e00a0abe37a9659f6)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

9p: fix QEMU crash when renaming files

When using the 9P2000.u version of the protocol, the following shell
command line in the guest can cause QEMU to crash:

    while true; do rm -rf aa; mkdir -p a/b & touch a/b/c & mv a aa; done

With 9P2000.u, file renaming is handled by the WSTAT command. The
v9fs_wstat() function calls v9fs_complete_rename(), which calls
v9fs_fix_path() for every fid whose path is affected by the change.
The involved calls to v9fs_path_copy() may race with any other access
to the fid path performed by some worker thread, causing a crash like
shown below:

Thread 12 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
0x0000555555a25da2 in local_open_nofollow (fs_ctx=0x555557d958b8, path=0x0,
flags=65536, mode=0) at hw/9pfs/9p-local.c:59
59          while (*path && fd != -1) {
(gdb) bt
#0  0x0000555555a25da2 in local_open_nofollow (fs_ctx=0x555557d958b8,
path=0x0, flags=65536, mode=0) at hw/9pfs/9p-local.c:59
#1  0x0000555555a25e0c in local_opendir_nofollow (fs_ctx=0x555557d958b8,
path=0x0) at hw/9pfs/9p-local.c:92
#2  0x0000555555a261b8 in local_lstat (fs_ctx=0x555557d958b8,
fs_path=0x555556b56858, stbuf=0x7fff84830ef0) at hw/9pfs/9p-local.c:185
#3  0x0000555555a2b367 in v9fs_co_lstat (pdu=0x555557d97498,
path=0x555556b56858, stbuf=0x7fff84830ef0) at hw/9pfs/cofile.c:53
#4  0x0000555555a1e9e2 in v9fs_stat (opaque=0x555557d97498)
at hw/9pfs/9p.c:1083
#5  0x0000555555e060a2 in coroutine_trampoline (i0=-669165424, i1=32767)
at util/coroutine-ucontext.c:116
#6  0x00007fffef4f5600 in __start_context () at /lib64/libc.so.6
#7  0x0000000000000000 in  ()
(gdb)

The fix is to take the path write lock when calling v9fs_complete_rename(),
like in v9fs_rename().

Impact:  DoS triggered by unprivileged guest users.

Fixes: CVE-2018-19489
Cc: P J P <ppandit@redhat.com>
Reported-by: zhibin hu <noirfate@gmail.com>
Reviewed-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 1d20398694a3b67a388d955b7a945ba4aa90a8a8)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

nvme: fix out-of-bounds access to the CMB

Because the CMB BAR has a min_access_size of 2, if you read the last
byte it will try to memcpy *2* bytes from n->cmbuf, causing an off-by-one
error.  This is CVE-2018-16847.

Another way to fix this might be to register the CMB as a RAM memory
region, which would also be more efficient.  However, that might be a
change for big-endian machines; I didn't think this through and I don't
know how real hardware works.  Add a basic testcase for the CMB in case
somebody does this change later on.

Cc: Keith Busch <keith.busch@intel.com>
Cc: qemu-block@nongnu.org
Reported-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Tested-by: Li Qiang <liq3ea@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 87ad860c622cc8f8916b5232bd8728c08f938fce)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

qga: update docs with systemd suspend support info

Commit 067927d62e ("qga: systemd hibernate/suspend/hybrid-sleep
support") failed to update qapi-schema.json after adding systemd
hibernate/suspend/hybrid-sleep capabilities to guest-suspend-* QGA
commands.

Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
(cherry picked from commit bb6c8d407e49d7b805ac52fe46abf4d8d5262046)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

bitmap: Update count after a merge

We need an accurate count of the number of bits set in a bitmap
after a merge. In particular, since the merge operation short-circuits
a merge from an empty source, if you have bitmaps A, B, and C where
B started empty, then merge C into B, and B into A, an inaccurate
count meant that A did not get the contents of C.

In the worst case, we may falsely regard the bitmap as empty when
it has had new writes merged into it.

Fixes: be58721db
CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20181002233314.30159-1-jsnow@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit d1dde7149e376d72b422a529ec4bf3ed47f3ba30)
*drop functional dep. on fa000f2f
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

tpm_tis: fix loop that cancels any seizure by a lower locality

In tpm_tis_mmio_write() if the requesting locality is seizing
access, any seizure by a lower locality is cancelled. However the
loop doing the seizure had an off-by-one error and the locality
immediately preceding the requesting locality was not being cleared.
This is fixed by adjusting the test in the for loop to check the
localities up to the requesting locality.

Signed-off-by: Liam Merwick <Liam.Merwick@oracle.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
(cherry picked from commit 37b55d67c0f001b20b7831db3f9f24f1d453e1de)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

slirp: check sscanf result when emulating ident

When emulating ident in tcp_emu, if the strchr checks passed but the
sscanf check failed, two uninitialized variables would be copied and
sent in the reply, so move this code inside the if(sscanf()) clause.

Signed-off-by: William Bowling <will@wbowling.info>
Cc: qemu-stable@nongnu.org
Cc: secalert@redhat.com
Message-Id: <1551476756-25749-1-git-send-email-will@wbowling.info>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
(cherry picked from commit d3222975c7d6cda9e25809dea05241188457b113)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

hw/rdma: another clang compilation fix

Configuring QEMU with:
   configure --target-list="x86_64-softmmu" --cc=clang --enable-pvrdma
Results in:
   qemu/hw/rdma/rdma_rm_defs.h:108:3: error: redefinition of typedef 'RdmaDeviceResources' is a C11 feature [-Werror,-Wtypedef-redefinition]
   } RdmaDeviceResources;
     ^
   qemu/hw/rdma/rdma_backend_defs.h:24:36: note: previous definition is here
   typedef struct RdmaDeviceResources RdmaDeviceResources;

Fix by removing one of the 'typedef' definitions.

Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Message-Id: <20190214154053.15050-1-marcel.apfelbaum@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
(cherry picked from commit 59f911938fbaa6a5eff1146c8a4d74e1c55ecc2b)
*drop context dep. on c2dd117b385
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

acpi: Make TPM 2.0 with TIS available as MSFT0101

This is a backport of rev 24cf5413aa0 to 3.0.x and 3.1.x.

This patch makes the a TPM 2.0 with TIS interface available under the
HID 'MSF0101'. This is supported by Linux and also Windows now
recognizes the TPM 2.0 with TIS interface. Leave the TPM 1.2 as before.

Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

block: Fix invalidate_cache error path for parent activation

bdrv_co_invalidate_cache() clears the BDRV_O_INACTIVE flag before
actually activating a node so that the correct permissions etc. are
taken. In case of errors, the flag must be restored so that the next
call to bdrv_co_invalidate_cache() retries activation.

Restoring the flag was missing in the error path for a failed
parent->role->activate() call. The consequence is that this attempt to
activate all images correctly fails because we still set errp, however
on the next attempt BDRV_O_INACTIVE is already clear, so we return
success without actually retrying the failed action.

An example where this is observable in practice is migration to a QEMU
instance that has a raw format block node attached to a guest device
with share-rw=off (the default) while another process holds
BLK_PERM_WRITE for the same image. In this case, all activation steps
before parent->role->activate() succeed because raw can tolerate other
writers to the image. Only the parent callback (in particular
blk_root_activate()) tries to implement the share-rw=on property and
requests exclusive write permissions. This fails when the migration
completes and correctly displays an error. However, a manual 'cont' will
incorrectly resume the VM without calling blk_root_activate() again.

This case is described in more detail in the following bug report:
https://bugzilla.redhat.com/show_bug.cgi?id=1531888

Fix this by correctly restoring the BDRV_O_INACTIVE flag in the error
path.

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Tested-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 78fc3b3a26c145eebcdee992988644974b243a74)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

tpm: Make sure the locality received from backend is valid

Make sure that the locality passed from the backend to
tpm_tis_request_completed() is valid.

Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
(cherry picked from commit a639f96111eadb3b8e3021fd3f27e2948ad1c640)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

tpm: Make sure new locality passed to tpm_tis_prep_abort() is valid

Make sure that the new locality passed to tpm_tis_prep_abort()
is valid.

Add a comment to aborting_locty that it may be any locality, including
TPM_TIS_NO_LOCALITY.

Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
(cherry picked from commit e92b63ea610201bd743343fc6b11e6c39c8d3515)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

tpm: use loop iterator to set sts data field

When TIS request is done, set 'sts' data field across all localities.

Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
(cherry picked from commit 6a50bb98f24929c9fc69e9197eb21c142e061fbd)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

tpm: Zero-init structure to avoid uninitialized variables in valgrind log

Zero-init the ptm_loc structure so that we don't have fields that
are not initialised.

Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
(cherry picked from commit eff1fe9fd0cebe2293eea9597616f792b6b5ad18)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

exec.c: Don't reallocate IOMMUNotifiers that are in use

The tcg_register_iommu_notifier() code has a GArray of
TCGIOMMUNotifier structs which it has registered by passing
memory_region_register_iommu_notifier() a pointer to the embedded
IOMMUNotifier field. Unfortunately, if we need to enlarge the
array via g_array_set_size() this can cause a realloc(), which
invalidates the pointer that memory_region_register_iommu_notifier()
put into the MemoryRegion's iommu_notify list. This can result
in segfaults.

Switch the GArray to holding pointers to the TCGIOMMUNotifier
structs, so that we can individually allocate and free them.

Cc: qemu-stable@nongnu.org
Fixes: 1f871c5e6b0f30644a60a ("exec.c: Handle IOMMUs in address_space_translate_for_iotlb()")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190128174241.5860-1-peter.maydell@linaro.org
(cherry picked from commit 5601be3b01d73e21c09331599e2ce62df016ff94)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

s390x: Return specification exception for unimplemented diag 308 subcodes

The architecture specifies specification exceptions for all
unavailable subcodes.

The presence of subcodes is indicated by checking some query subcode.
For example 6 will indicate that 3-6 are available. So future systems
might call new subcodes to check for new features. This should not
trigger a hw error, instead we return the architectured specification
exception.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Cc: qemu-stable@nongnu.org
Message-Id: <20190111113657.66195-3-frankja@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
(cherry picked from commit 37dbd1f4d4805edcd18d94eb202bb3461b3cd52d)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

linux-user: make pwrite64/pread64(fd, NULL, 0, offset) return 0

Linux returns success if pwrite64() or pread64() are called with a
zero length NULL buffer, but QEMU was returning -TARGET_EFAULT.

This is the same bug that we fixed in commit 58cfa6c2e6eb51b23cc9
for the write syscall, and long before that in 38d840e6790c29f59
for the read syscall.

Fixes: https://bugs.launchpad.net/qemu/+bug/1810433
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190108184900.9654-1-peter.maydell@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
(cherry picked from commit 2bd3f8998e1e7dcd9afc29fab252fb9936f9e956)
*drop functional dep. on 2852aafd9d
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

linux-user: write(fd, NULL, 0) parity with linux's treatment of same

Bring linux-user write(2) handling into line with linux for the case
of a 0-byte write with a NULL buffer. Based on a patch originally
written by Zhuowei Zhang.

Addresses https://bugs.launchpad.net/qemu/+bug/1716292.

>From Zhuowei Zhang's patch (https://lists.gnu.org/archive/html/qemu-devel/2017-09/msg08073.html):

    Linux returns success for the special case of calling write with a
    zero-length NULL buffer: compiling and running

    int main() {
       ssize_t ret = write(STDOUT_FILENO, NULL, 0);
       fprintf(stderr, "write returned %ld\n", ret);
       return 0;
    }

    gives "write returned 0" when run directly, but "write returned
    -1" in QEMU.

    This commit checks for this situation and returns success if
    found.

Subsequent discussion raised the following questions (and my answers):

- Q. Should TARGET_NR_read pass through to safe_read in this
      situation too?
   A. I'm wary of changing unrelated code to the specific problem I'm
      addressing. TARGET_NR_read is already consistent with Linux for
      this case.

- Q. Do pread64/pwrite64 need to be changed similarly?
   A. Experiment suggests not: both linux and linux-user yield -1 for
      NULL 0-length reads/writes.

Signed-off-by: Tony Garnock-Jones <tonygarnockjones@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180908182205.GB409@mornington.dcs.gla.ac.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
(cherry picked from commit 58cfa6c2e6eb51b23cc98f81d16136b3ca929b31)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

hw/s390x: Fix bad mask in time2tod()

Since "s390x/tcg: avoid overflows in time2tod/tod2time", the
time2tod() function tries to deal with the 9 uppermost bits in the
time value, but uses the wrong mask for this: 0xff80000000000000 should
be used instead of 0xff10000000000000 here.

Fixes: 14055ce53c2d901d826ffad7fb7d6bb8ab46bdfd
Cc: qemu-stable@nongnu.org
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1544792887-14575-1-git-send-email-thuth@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
[CH: tweaked commit message]
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
(cherry picked from commit aba7a5a2de3dba5917024df25441f715b9249e31)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

pc:piix4: Update smbus I/O space after a migration

Otherwise it won't be set up correctly and won't work after
miigration.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 2b4e573c7c7b9a698ba6931ba456bbd8d3d8c84c)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

pcie: set link state inactive/active after hot unplug/plug

When VM boots from the latest version of linux kernel, after
hot-unpluging virtio-blk disks which are hotplugged into
pcie-root-port, the VM's dmesg log shows:

[  151.046242] pciehp 0000:00:05.0:pcie004: pending interrupts 0x0001 from Slot Status
[  151.046365] pciehp 0000:00:05.0:pcie004: Slot(0-3): Attention button pressed
[  151.046369] pciehp 0000:00:05.0:pcie004: Slot(0-3): Powering off due to button press
[  151.046420] pciehp 0000:00:05.0:pcie004: pending interrupts 0x0010 from Slot Status
[  151.046425] pciehp 0000:00:05.0:pcie004: pciehp_green_led_blink: SLOTCTRL a8 write cmd 200
[  151.046464] pciehp 0000:00:05.0:pcie004: pending interrupts 0x0010 from Slot Status
[  151.046468] pciehp 0000:00:05.0:pcie004: pciehp_set_attention_status: SLOTCTRL a8 write cmd c0
[  156.163421] pciehp 0000:00:05.0:pcie004: pciehp_get_power_status: SLOTCTRL a8 value read 2f1
[  156.163427] pciehp 0000:00:05.0:pcie004: pciehp_unconfigure_device: domain:bus:dev = 0000:06:00
[  156.198736] pciehp 0000:00:05.0:pcie004: pending interrupts 0x0010 from Slot Status
[  156.198772] pciehp 0000:00:05.0:pcie004: pciehp_power_off_slot: SLOTCTRL a8 write cmd 400
[  157.224124] pciehp 0000:00:05.0:pcie004: pending interrupts 0x0018 from Slot Status
[  157.224194] pciehp 0000:00:05.0:pcie004: pciehp_green_led_off: SLOTCTRL a8 write cmd 300
[  157.224220] pciehp 0000:00:05.0:pcie004: pciehp_check_link_active: lnk_status = 2011
[  157.224223] pciehp 0000:00:05.0:pcie004: Slot(0-3): Link Up
[  157.224233] pciehp 0000:00:05.0:pcie004: pciehp_get_power_status: SLOTCTRL a8 value read 7f1
[  157.224281] pciehp 0000:00:05.0:pcie004: pending interrupts 0x0010 from Slot Status
[  157.224285] pciehp 0000:00:05.0:pcie004: pciehp_power_on_slot: SLOTCTRL a8 write cmd 0
[  157.224300] pciehp 0000:00:05.0:pcie004: __pciehp_link_set: lnk_ctrl = 0
[  157.224336] pciehp 0000:00:05.0:pcie004: pending interrupts 0x0010 from Slot Status
[  157.224339] pciehp 0000:00:05.0:pcie004: pciehp_green_led_blink: SLOTCTRL a8 write cmd 200
[  159.739294] pci 0000:06:00.0 id reading try 50 times with interval 20 ms to get ffffffff
[  159.739315] pciehp 0000:00:05.0:pcie004: pciehp_check_link_status: lnk_status = 2011
[  159.739318] pciehp 0000:00:05.0:pcie004: Failed to check link status
[  159.739371] pciehp 0000:00:05.0:pcie004: pending interrupts 0x0010 from Slot Status
[  159.739394] pciehp 0000:00:05.0:pcie004: pciehp_power_off_slot: SLOTCTRL a8 write cmd 400
[  160.771426] pciehp 0000:00:05.0:pcie004: pending interrupts 0x0010 from Slot Status
[  160.771452] pciehp 0000:00:05.0:pcie004: pciehp_green_led_off: SLOTCTRL a8 write cmd 300
[  160.771495] pciehp 0000:00:05.0:pcie004: pending interrupts 0x0010 from Slot Status
[  160.771499] pciehp 0000:00:05.0:pcie004: pciehp_set_attention_status: SLOTCTRL a8 write cmd 40
[  160.771535] pciehp 0000:00:05.0:pcie004: pending interrupts 0x0010 from Slot Status
[  160.771539] pciehp 0000:00:05.0:pcie004: pciehp_green_led_off: SLOTCTRL a8 write cmd 300

After analyzing the log information, it seems that qemu doesn't
change the Link Status from active to inactive after hot-unplug.
This results in the abnormal log after the linux kernel commit
d331710ea78fea merged.

Furthermore, If I hotplug the same virtio-blk disk after hot-unplug,
the virtio-blk would turn on and then back off.

So this patch set the Link Status inactive after hot-unplug and
active after hot-plug.

Signed-off-by: Zheng Xiang <zhengxiang9@huawei.com>
Signed-off-by: Zheng Xiang <xiang.zheng@linaro.org>
Cc: Wang Haibin <wanghaibin.wang@huawei.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 2f2b18f60bf17453b4c01197a9316615a3c1f1de)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

Changes requirement for "vsubsbs" instruction

Changes requirement for "vsubsbs" instruction, which has been supported
since ISA 2.03. (Please see section 5.9.1.2 of ISA 2.03)

Reported-by: Paul A. Clarke <pc@us.ibm.com>
Signed-off-by: Paul A. Clarke <pc@us.ibm.com>
Signed-off-by: Leonardo Bras <leonardo@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit fcfbc18d00b10335310c9665edd6e04f2d152be8)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

iotests: make 235 work on s390 (and others)

"-machine pc" will not work all architectures. Lets fall back to the
default machine by not specifying it.

In addition we also need to specify -no-shutdown on s390 as qemu will
exit otherwise.

Cc: qemu-stable@nongnu.org
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 2c26e648e4350079b0c86a6627b2d3566c3709c0)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

iotests: simple mirror test with kvm on 1G image

This test is broken without previous commit fixing dead-lock in mirror.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Acked-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit db5e8210adbafe9c6383d8364345a8c545d38e62)
*drop context deps from test groups not in 3.0
*modify expected QMP output to match encoding for 3.0
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

mirror: fix dead-lock

Let start from the beginning:

Commit b9e413dd375 (in 2.9)
"block: explicitly acquire aiocontext in aio callbacks that need it"
added pairs of aio_context_acquire/release to mirror_write_complete and
mirror_read_complete, when they were aio callbacks for blk_aio_* calls.

Then, commit 2e1990b26e5 (in 3.0) "block/mirror: Convert to coroutines"
dropped these blk_aio_* calls, than mirror_write_complete and
mirror_read_complete are not callbacks more, and don't need additional
aiocontext acquiring. Furthermore, mirror_read_complete calls
blk_co_pwritev inside these pair of aio_context_acquire/release, which
leads to the following dead-lock with mirror:

(gdb) info thr
   Id   Target Id         Frame
   3    Thread (LWP 145412) "qemu-system-x86" syscall ()
   2    Thread (LWP 145416) "qemu-system-x86" __lll_lock_wait ()
* 1    Thread (LWP 145411) "qemu-system-x86" __lll_lock_wait ()

(gdb) bt
#0  __lll_lock_wait ()
#1  _L_lock_812 ()
#2  __GI___pthread_mutex_lock
#3  qemu_mutex_lock_impl (mutex=0x561032dce420 <qemu_global_mutex>,
     file=0x5610327d8654 "util/main-loop.c", line=236) at
     util/qemu-thread-posix.c:66
#4  qemu_mutex_lock_iothread_impl
#5  os_host_main_loop_wait (timeout=480116000) at util/main-loop.c:236
#6  main_loop_wait (nonblocking=0) at util/main-loop.c:497
#7  main_loop () at vl.c:1892
#8  main

Printing contents of qemu_global_mutex, I see that "__owner = 145416",
so, thr1 is main loop, and now it wants BQL, which is owned by thr2.

(gdb) thr 2
(gdb) bt
#0  __lll_lock_wait ()
#1  _L_lock_870 ()
#2  __GI___pthread_mutex_lock
#3  qemu_mutex_lock_impl (mutex=0x561034d25dc0, ...
#4  aio_context_acquire (ctx=0x561034d25d60)
#5  dma_blk_cb
#6  dma_blk_io
#7  dma_blk_read
#8  ide_dma_cb
#9  bmdma_cmd_writeb
#10 bmdma_write
#11 memory_region_write_accessor
#12 access_with_adjusted_size
#15 flatview_write
#16 address_space_write
#17 address_space_rw
#18 kvm_handle_io
#19 kvm_cpu_exec
#20 qemu_kvm_cpu_thread_fn
#21 qemu_thread_start
#22 start_thread
#23 clone ()

Printing mutex in fr 2, I see "__owner = 145411", so thr2 wants aio
context mutex, which is owned by thr1. Classic dead-lock.

Then, let's check that aio context is hold by mirror coroutine: just
print coroutine stack of first tracked request in mirror job target:

(gdb) [...]
(gdb) qemu coroutine 0x561035dd0860
#0  qemu_coroutine_switch
#1  qemu_coroutine_yield
#2  qemu_co_mutex_lock_slowpath
#3  qemu_co_mutex_lock
#4  qcow2_co_pwritev
#5  bdrv_driver_pwritev
#6  bdrv_aligned_pwritev
#7  bdrv_co_pwritev
#8  blk_co_pwritev
#9  mirror_read_complete () at block/mirror.c:232
#10 mirror_co_read () at block/mirror.c:370
#11 coroutine_trampoline
#12 __start_context

Yes it is mirror_read_complete calling blk_co_pwritev after acquiring
aio context.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit d12ade5732c4d2d293735a39b4bd943da8d64db6)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

nbd/client: Send NBD_CMD_DISC if open fails after connect

If nbd_client_init() fails after we are already connected,
then the server will spam logs with:

Disconnect client, due to: Unexpected end-of-file before all bytes were read

unless we gracefully disconnect before closing the connection.

Ways to trigger this:

$ opts=driver=nbd,export=foo,server.type=inet,server.host=localhost,server.port=10809
$ qemu-img map --output=json --image-opts $opts,read-only=off
$ qemu-img map --output=json --image-opts $opts,x-dirty-bitmap=nosuch:

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20181130023232.3079982-4-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
(cherry picked from commit c688e6ca7b41a105241054853d250df64addbf8f)
*drop functional dep. on 6c2e581d4d7
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

nbd/client: Make x-dirty-bitmap more reliable

The implementation of x-dirty-bitmap in qemu 3.0 (commit 216ee365)
silently falls back to treating the server as not supporting
NBD_CMD_BLOCK_STATUS if a requested meta_context name was not
negotiated, which in turn means treating the _entire_ image as
data. Since our hack relied on using 'qemu-img map' to view
which portions of the image were dirty by seeing what the
redirected bdrv_block_status() treats as holes, this means
that our fallback treats the entire image as clean. Better
would have been to treat the entire image as dirty, or to fail
to connect because the user's request for a specific context
could not be honored. This patch goes with the latter.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20181130023232.3079982-3-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
(cherry picked from commit 47829c40794160debdb33b4a042d182e776876d4)
*avoid context dep. on 6c2e581d
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

nbd/server: Advertise all contexts in response to bare LIST

The NBD spec, and even our code comment, says that if the client
asks for NBD_OPT_LIST_META_CONTEXT with 0 queries, then we should
reply with (a possibly-compressed representation of) ALL contexts
that we are willing to let them try. But commit 3d068aff forgot
to advertise qemu:dirty-bitmap:FOO.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20181130023232.3079982-2-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
(cherry picked from commit e31d802479df9daff1994a7ed1e36bbc5bb19d03)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

i2c: Add a length check to the SMBus write handling

Avoid an overflow.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Cc: QEMU Stable <qemu-stable@nongnu.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 629457a13080052c575779e1fd9f5eb5ee6b8ad9)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

i2c: Move typedef of bitbang_i2c_interface to i2c.h

Clang 3.4 considers duplicate typedef in ppc4xx_i2c.h and
bitbang_i2c.h an error even if they are identical. Move it to a common
place to allow building with this clang version.

Reported-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 2b4c1125ac3db2734222ff43c25388a16aca4a99)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

vfio-helpers: Fix qemu_vfio_open_pci() crash

qemu_vfio_open_common() initializes s->lock only after passing s to
qemu_vfio_dma_map() via qemu_vfio_init_ramblock().
qemu_vfio_dma_map() tries to lock the uninitialized lock and crashes.

Fix by initializing s->lock first.

RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1645840
Fixes: 418026ca43bc2626db092d7558258f9594366f28
Cc: qemu-stable@nongnu.org
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20181127084143.1113-1-armbru@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 549b50a31d28f2687a47e827a1e17300784a2c44)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

fmops: fix off-by-one in AR_TABLE and DR_TABLE array size

Cc: P J P <ppandit@redhat.com>
Reported-by: Wangjunqing <wangjunqing@huawei.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20181030082340.17170-1-kraxel@redhat.com
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 57ac4a7a28fef81b80b547c64d26681edc4a2cda)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

qemu-img: Fix leak

create_opts was leaked here. This is not too bad since the process is
about to exit anyway, but relying on that does not make the code nicer
to read.

Fixes: d402b6a21a825a5c07aac9251990860723d49f5d
Reported-by: Kevin Wolf <kwolf@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 3ecd5a4f19fd9a497490a91aaa96e76a5edadd2c)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

qemu-img: Fix typo

Fixes: d402b6a21a825a5c07aac9251990860723d49f5d
Reported-by: Kevin Wolf <kwolf@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit f0998879e049dad19beed881a1c56643ce536384)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

fdc: fix segfault in fdctrl_stop_transfer() when DMA is disabled

Commit c8a35f1cf0f "fdc: use IsaDma interface instead of global DMA_*
functions" accidentally introduced a segfault in fdctrl_stop_transfer() for
non-DMA transfers.

If fdctrl->dma_chann has not been configured then the fdctrl->dma interface
reference isn't initialised during isabus_fdc_realize(). Unfortunately
fdctrl_stop_transfer() unconditionally references the DMA interface when
finishing the transfer causing a NULL pointer dereference.

Fix the issue by adding a check in fdctrl_stop_transfer() so that the DMA
interface reference and release method is only invoked if fdctrl->dma_chann
has been set.

(This issue was discovered by Martin testing a recent change in the NetBSD
installer under qemu-system-sparc)

Cc: qemu-stable@nongnu.org
Reported-by: Martin Husemann <martin@duskware.de>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Hervé Poussineau <hpoussin@reactos.org>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 441f6692ecc14859b77af2ac6d8f55e6f1354d3b)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

net: drop too large packet early

We try to detect and drop too large packet (>INT_MAX) in 1592a9947036
("net: ignore packet size greater than INT_MAX") during packet
delivering. Unfortunately, this is not sufficient as we may hit
another integer overflow when trying to queue such large packet in
qemu_net_queue_append_iov():

- size of the allocation may overflow on 32bit
- packet->size is integer which may overflow even on 64bit

Fixing this by moving the check to qemu_sendv_packet_async() which is
the entrance of all networking codes and reduce the limit to
NET_BUFSIZE to be more conservative. This works since:

- For the callers that call qemu_sendv_packet_async() directly, they
  only care about if zero is returned to determine whether to prevent
  the source from producing more packets. A callback will be triggered
  if peer can accept more then source could be enabled. This is
  usually used by high speed networking implementation like virtio-net
  or netmap.
- For the callers that call qemu_sendv_packet() that calls
  qemu_sendv_packet_async() indirectly, they often ignore the return
  value. In this case qemu will just the drop packets if peer can't
  receive.

Qemu will copy the packet if it was queued. So it was safe for both
kinds of the callers to assume the packet was sent.

Since we move the check from qemu_deliver_packet_iov() to
qemu_sendv_packet_async(), it would be safer to make
qemu_deliver_packet_iov() static to prevent any external user in the
future.

This is a revised patch of CVE-2018-17963.

Cc: qemu-stable@nongnu.org
Cc: Li Qiang <liq3ea@163.com>
Fixes: 1592a9947036 ("net: ignore packet size greater than INT_MAX")
Reported-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-id: 20181204035347.6148-2-jasowang@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 25c01bd19d0e4b66f357618aeefda1ef7a41e21a)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

make-release: add skiboot .version file

This is needed to build skiboot from tarball-distributed sources
since the git data the make_release.sh script relies on to generate
it is not available.

Cc: qemu-stable@nongnu.org
Reported-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20181109161352.29873-1-mdroth@linux.vnet.ibm.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 3fccd3f26ef3c0c76a06c138b17af6d55a5d9904)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

vhost-scsi: prevent using uninitialized vqs

There are 3 virtqueues (ctrl, event and cmd) for virtio scsi device,
but seabios will only set the physical address for the 3rd one (cmd).
Then in vhost_virtqueue_start(), virtio_queue_get_desc_addr()
will be 0 for ctrl and event vq.

In this case, ctrl and event vq are not initialized.
vhost_verify_ring_mappings may use uninitialized vhost_virtqueue
such that vhost_verify_ring_part_mapping returns ENOMEM.

When encountered this problem, we got the following logs:

qemu-system-x86_64: Unable to map available ring for ring 0
qemu-system-x86_64: Verify ring failure on region 0

Signed-off-by: Forrest Liu <forrestl@synology.com>
Signed-off-by: yuchenlin <yuchenlin@synology.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit e6cc11d64fc998c11a4dfcde8fda3fc33a74d844)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

migration: Stop postcopy fault thread before notifying

POSTCOPY_NOTIFY_INBOUND_END handlers will remove userfault fds
from the postcopy_remote_fds array which could be still in
use by the fault thread. Let's stop the thread before
notification to avoid possible accessing wrong memory.

Fixes: 46343570c06e ("vhost+postcopy: Wire up POSTCOPY_END notify")
Cc: qemu-stable@nongnu.org
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Message-Id: <20181008160536.6332-2-i.maximets@samsung.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
(cherry picked from commit 55d0fe8254984321a126efd8db358f754737aa63)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

nbd: fix NBD_FLAG_SEND_CACHE value

Commit bc37b06a5 added NBD_CMD_CACHE support, but used the wrong value
for NBD_FLAG_SEND_CACHE flag for negotiation. That commit picked bit 8,
which had already been assigned by the NBD specification to mean
NBD_FLAG_CAN_MULTI_CONN, and which was already implemented in the
Linux kernel as a part of stable userspace-kernel API since 4.10:

"bit 8, NBD_FLAG_CAN_MULTI_CONN: Indicates that the server operates
entirely without cache, or that the cache it uses is shared among all
connections to the given device. In particular, if this flag is
present, then the effects of NBD_CMD_FLUSH and NBD_CMD_FLAG_FUA
MUST be visible across all connections when the server sends its reply
to that command to the client. In the absense of this flag, clients
SHOULD NOT multiplex their commands over more than one connection to
the export.
...
bit 10, NBD_FLAG_SEND_CACHE: documents that the server understands
NBD_CMD_CACHE; however, note that server implementations exist
which support the command without advertising this bit, and
conversely that this bit does not guarantee that the command will
succeed or have an impact."

Consequences:
- a client trying to use NBD_CMD_CACHE per the NBD spec will not
see the feature as available from a qemu 3.0 server (not fatal,
clients already have to be prepared for caching to not exist)
- a client accidentally coded to the qemu 3.0 bit value instead
of following the spec may interpret NBD_CMD_CACHE as being available
when it is not (probably not fatal, the spec says the server should
gracefully fail unknown commands, and that clients of NBD_CMD_CACHE
should be prepared for failure even when the feature is advertised);
such clients are unlikely (perhaps only in unreleased Virtuozzo code),
and will disappear over time
- a client prepared to use multiple connections based on
NBD_FLAG_CAN_MULTI_CONN may cause data corruption when it assumes
that caching is consistent when in reality qemu 3.0 did not have
a consistent cache. Partially mitigated by using read-only
connections (where nothing needs to be flushed, so caching is
indeed consistent) or when using qemu-nbd with the default -e 1
(at most one client at a time); visible only when using -e 2 or
more for a writable export.

Thus the commit fixes negotiation flag in QEMU according to the
specification.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
CC: Valery Vdovin <valery.vdovin@acronis.com>
CC: Eric Blake <eblake@redhat.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: qemu-stable@nongnu.org
Message-Id: <20181004100313.4253-1-den@openvz.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: enhance commit message, add defines for unimplemented flags]
Signed-off-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit df91328adab8490367776d2b21b35d790a606120)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

intel_iommu: better handling of dmar state switch

QEMU is not handling the global DMAR switch well, especially when from
"on" to "off".

Let's first take the example of system reset.

Assuming that a guest has IOMMU enabled.  When it reboots, we will drop
all the existing DMAR mappings to handle the system reset, however we'll
still keep the existing memory layouts which has the IOMMU memory region
enabled.  So after the reboot and before the kernel reloads again, there
will be no mapping at all for the host device.  That's problematic since
any software (for example, SeaBIOS) that runs earlier than the kernel
after the reboot will assume the IOMMU is disabled, so any DMA from the
software will fail.

For example, a guest that boots on an assigned NVMe device might fail to
find the boot device after a system reboot/reset and we'll be able to
observe SeaBIOS errors if we capture the debugging log:

  WARNING - Timeout at nvme_wait:144!

Meanwhile, we should see DMAR errors on the host of that NVMe device.
It's the DMA fault that caused a NVMe driver timeout.

The correct fix should be that we do proper switching of device DMA
address spaces when system resets, which will setup correct memory
regions and notify the backend of the devices.  This might not affect
much on non-assigned devices since QEMU VT-d emulation will assume a
default passthrough mapping if DMAR is not enabled in the GCMD
register (please refer to vtd_iommu_translate).  However that's required
for an assigned devices, since that'll rebuild the correct GPA to HPA
mapping that is needed for any DMA operation during guest bootstrap.

Besides the system reset, we have some other places that might change
the global DMAR status and we'd better do the same thing there.  For
example, when we change the state of GCMD register, or the DMAR root
pointer.  Do the same refresh for all these places.  For these two
places we'll also need to explicitly invalidate the context entry cache
and iotlb cache.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1625173
CC: QEMU Stable <qemu-stable@nongnu.org>
Reported-by: Cong Li <coli@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
--
v2:
- do the same for GCMD write, or root pointer update [Alex]
- test is carried out by me this time, by observing the
  vtd_switch_address_space tracepoint after system reboot
v3:
- rewrite commit message as suggested by Alex
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 2cc9ddccebcaa48b3debfc279a83761fcbb7616c)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

intel_iommu: introduce vtd_reset_caches()

Provide the function and use it in vtd_init(). Used to reset both
context entry cache and iotlb cache for the whole IOMMU unit.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 06aba4ca52fd2c8718b8ba486f22f0aa7c99ed55)
*functional dep. for 2cc9ddcceb
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

nbd/server: fix NBD_CMD_CACHE

We should not go to structured-read branch on CACHE command, fix that.

Bug introduced in bc37b06a5cde24 "nbd/server: introduce NBD_CMD_CACHE"
with the whole feature and affects 3.0.0 release.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
CC: qemu-stable@nongnu.org
Message-Id: <20181003144738.70670-1-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: commit message typo fix]
Signed-off-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit 2f454defc23e1be78f2a96bad2877ce7829f61b4)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

target/arm: Correct condition for v8M callee stack push

In v7m_exception_taken() we were incorrectly using a
"LR bit EXCRET.ES is 1" check when it should be 0
(compare the pseudocode ExceptionTaken() function).
This meant we didn't stack the callee-saved registers
when tailchaining from a NonSecure to a Secure exception.

Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181002145940.30931-1-peter.maydell@linaro.org
(cherry picked from commit 7b73a1ca05b33d42278ce29cea4652e22d408165)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

block-backend: Set werror/rerror defaults in blk_new()

Currently, the default values for werror and rerror have to be set
explicitly with blk_set_on_error() by the callers of blk_new(). The only
caller actually doing this is blockdev_init(), which is called for
BlockBackends created using -drive.

In particular, anonymous BlockBackends created with
-device ...,drive=<node-name> didn't get the correct default set and
instead defaulted to the integer value 0 (= BLOCKDEV_ON_ERROR_REPORT).
This is the intended default for rerror anyway, but the default for
werror should be BLOCKDEV_ON_ERROR_ENOSPC.

Set the defaults in blk_new() instead so that they apply no matter what
way the BlockBackend was created.

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
(cherry picked from commit cb53460b708db3617ab73248374d071d5552c263)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

virtio: do not take address of packed members

The address of a packed member is not packed, which may cause accesses
to unaligned pointers. Avoid this by reading the packed value before
passing it to another function.

Cc: Jason Wang <jasowang@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit d41ca5afe3bc513ecf10b3ba5aa59523e3cd54aa)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

virt: Suppress external aborts on virt-2.10 and earlier

In commit c79c0a314c43b78 we enabled emulation of external aborts
when the guest attempts to access a physical address with no
mapped device. In commit 4672cbd7bed88dc6 we suppress this for
most legacy boards to prevent breakage of previously working
guests, but we didn't suppress it in the 'virt' board, with
the rationale "we know that guests won't try to prod devices
that we don't describe in the device tree or ACPI tables". This
is mostly true, but we've had a report of a Linux guest image
that this did break. The problem seems to be that the guest
is (incorrectly) configured with a DEBUG_UART_PHYS value that
tells it there is a uart at 0x10009000 (which is true for
vexpress but not for virt), so in early bootup the kernel
probes this bogus address.

This is a misconfigured guest, so we don't need to worry
about it too much, but we can arrange that guests that ran
on QEMU v2.10 (before c79c0a314c43b78) will still run on
the "virt-2.10" board model, by suppressing external aborts
only for that version and earlier. This seems a reasonable
compromise: "virt-2.10" is supposed to behave the same way
that "virt" did in the 2.10 release, and making it do that
provides a usable workaround for guests with bugs like this.

Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20180925144127.31965-1-peter.maydell@linaro.org
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
(cherry picked from commit 846690dee8ca6a4143d20b39e894fd1f24627561)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

net: ignore packet size greater than INT_MAX

There should not be a reason for passing a packet size greater than
INT_MAX. It's usually a hint of bug somewhere, so ignore packet size
greater than INT_MAX in qemu_deliver_packet_iov()

CC: qemu-stable@nongnu.org
Reported-by: Daniel Shapira <daniel@twistlock.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 1592a9947036d60dde5404204a5d45975133caf5)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

pcnet: fix possible buffer overflow

In pcnet_receive(), we try to assign size_ to size which converts from
size_t to integer. This will cause troubles when size_ is greater
INT_MAX, this will lead a negative value in size and it can then pass
the check of size < MIN_BUF_SIZE which may lead out of bound access
for both buf and buf1.

Fixing by converting the type of size to size_t.

CC: qemu-stable@nongnu.org
Reported-by: Daniel Shapira <daniel@twistlock.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit b1d80d12c5f7ff081bb80ab4f4241d4248691192)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

rtl8139: fix possible out of bound access

In rtl8139_do_receive(), we try to assign size_ to size which converts
from size_t to integer. This will cause troubles when size_ is greater
INT_MAX, this will lead a negative value in size and it can then pass
the check of size < MIN_BUF_SIZE which may lead out of bound access of
for both buf and buf1.

Fixing by converting the type of size to size_t.

CC: qemu-stable@nongnu.org
Reported-by: Daniel Shapira <daniel@twistlock.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 1a326646fef38782e5542280040ec3ea23e4a730)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

ne2000: fix possible out of bound access in ne2000_receive

In ne2000_receive(), we try to assign size_ to size which converts
from size_t to integer. This will cause troubles when size_ is greater
INT_MAX, this will lead a negative value in size and it can then pass
the check of size < MIN_BUF_SIZE which may lead out of bound access of
for both buf and buf1.

Fixing by converting the type of size to size_t.

CC: qemu-stable@nongnu.org
Reported-by: Daniel Shapira <daniel@twistlock.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit fdc89e90fac40c5ca2686733df17b6423fb8d8fb)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

target/arm: Fix cpu_get_tb_cpu_state() for non-SVE CPUs

Not only are the sve-related tb_flags fields unused when SVE is
disabled, but not all of the cpu registers are initialized properly
for computing same. This can corrupt other fields by ORing in -1,
which might result in QEMU crashing.

This bug was not present in 3.0, but this patch is cc'd to
stable because adf92eab90e3f5f34c285 where the bug was
introduced was marked for stable.

Fixes: adf92eab90e3f5f34c285
Cc: qemu-stable@nongnu.org (3.0.1)
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit e79b445d896deb61909be52b61b87c98a9ed96f7)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

block/rbd: add deprecation documentation for filename keyvalue pairs

Signed-off-by: Jeff Cody <jcody@redhat.com>
Message-id: 647f5b5ab7efd8bf567a504c832b1d2d6f719b23.1536704901.git.jcody@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
(cherry picked from commit 3bebd37e04f972775b1ece1bdda95451bc9fb14c)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

block/rbd: add iotest for rbd legacy keyvalue filename parsing

This is a small test that will check for the ability to parse
both legacy and modern options for rbd.

The way the test is set up is for failure to occur, but without
having to wait to timeout on a non-existent rbd server. The error
messages in the success path show that the arguments were parsed.

The failure behavior prior to the patch series that has this test, is
qemu-img complaining about mandatory options (e.g. 'pool') not being
provided.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Message-id: f830580e339b974a83ed4870d11adcdc17f49a47.1536704901.git.jcody@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
(cherry picked from commit 66e6a735e97450ac50fcaf40f78600c688534cae)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

block/rbd: Attempt to parse legacy filenames

When we converted rbd to get rid of the older key/value-centric
encoding format, we broke compatibility with image files with backing
file strings encoded in the old format.

This leaves a bit of an ugly conundrum, and a hacky solution.

If the initial attempt to parse the "proper" options fails, it assumes
that we may have an older key/value encoded filename. Fall back to
attempting to parse the filename, and extract the required options from
it. If that fails, pass along the original error message.

We do not support mixed modern usage alongside legacy keyvalue pair
usage.

A deprecation warning has been added, although care should be taken
when actually deprecating since the impact is not limited to
commandline or qapi usage, but also opening existing images.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Message-id: 15b332e5432ad069441f7275a46080f465d789a0.1536704901.git.jcody@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
(cherry picked from commit 084d1d13bdb753d558b991996e7686c077bd6d80)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

block/rbd: pull out qemu_rbd_convert_options

Code movement to pull the conversion from Qdict to BlockdevOptionsRbd
into a helper function.

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Message-id: 5b49a980f2cde6610ab1df41bb0277d00b5db893.1536704901.git.jcody@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
(cherry picked from commit f24b03b56cdb28d753b4ff9ae210d555f14cb0d8)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

clean up callback when del virtqueue

Before, we did not clear callback like handle_output when delete
the virtqueue which may result be segmentfault.
The scene is as follows:
1. Start a vm with multiqueue vhost-net,
2. then we write VIRTIO_PCI_GUEST_FEATURES in PCI configuration to
triger multiqueue disable in this vm which will delete the virtqueue.
In this step, the tx_bh is deleted but the callback virtio_net_handle_tx_bh
still exist.
3. Finally, we write VIRTIO_PCI_QUEUE_NOTIFY in PCI configuration to
notify the deleted virtqueue. In this way, virtio_net_handle_tx_bh
will be called and qemu will be crashed.

Although the way described above is uncommon, we had better reinforce it.

CC: qemu-stable@nongnu.org
Signed-off-by: liujunjie <liujunjie23@huawei.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 7da2d99fb9fbf30104125c061caaff330e362d74)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

nbd/server: fix bitmap export

bitmap_to_extents function is broken: it switches dirty variable after
every iteration, however it can process only part of dirty (or zero)
area during one iteration in case when this area is too large for one
extent.

Fortunately, the bug doesn't produce wrong extent flags: it just inserts
a zero-length extent between sequential extents representing large dirty
(or zero) area. However, zero-length extents are forbidden by the NBD
protocol. So, a careful client should consider such a reply as a server
fault, while a less-careful will likely ignore zero-length extents.

The bug can only be triggered by a client that requests block status
for nearly 4G at once (a request of 4G and larger is impossible per
the protocol, and requests smaller than 4G less the bitmap granularity
cause the loop to quit iterating rather than revisit the tail of the
large area); it also cannot trigger if the client used the
NBD_CMD_FLAG_REQ_ONE flag. Since qemu 3.0 as client (using the
x-dirty-bitmap extension) always passes the flag, it is immune; and
we are not aware of other open-source clients that know how to request
qemu:dirty-bitmap:FOO contexts. Clients that want to avoid the bug
could cap block status requests to a smaller length, such as 2G or 3G.

Fix this by more careful handling of dirty variable.

Bug was introduced in 3d068aff16
"nbd/server: implement dirty bitmap export", with the whole function.
and is present in v3.0.0 release.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180914165116.23182-1-vsementsov@virtuozzo.com>
CC: qemu-stable@nongnu.org
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: improved commit message]
Signed-off-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit 6545916d528de7a6b784f4d10e7b236b30bfaced)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

target/xtensa: fix s32c1i TCGMemOp flags

s32c1i must load and store value with target endianness, not host.
This results in an infinite loop in atomic cmpxchg sequences when target
endianness doesn't match host endianness.

Fixes: 9fb40342d4b3 ("target/xtensa: support MTTCG")
Cc: qemu-stable@nongnu.org
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit 7a54cfbcee8dd7aa87ce655a321b622107556326)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

job: Fix nested aio_poll() hanging in job_txn_apply

All callers have acquired ctx already. Doing that again results in
aio_poll() hang. This fixes the problem that a BDRV_POLL_WHILE() in the
callback cannot make progress because ctx is recursively locked, for
example, when drive-backup finishes.

There are two callers of job_finalize():

    fam@lemon:~/work/qemu [master]$ git grep -w -A1 '^\s*job_finalize'
    blockdev.c:    job_finalize(&job->job, errp);
    blockdev.c-    aio_context_release(aio_context);
    --
    job-qmp.c:    job_finalize(job, errp);
    job-qmp.c-    aio_context_release(aio_context);
    --
    tests/test-blockjob.c:    job_finalize(&job->job, &error_abort);
    tests/test-blockjob.c-    assert(job->job.status == JOB_STATUS_CONCLUDED);

Ignoring the test, it's easy to see both callers to job_finalize (and
job_do_finalize) have acquired the context.

Cc: qemu-stable@nongnu.org
Reported-by: Gu Nini <ngu@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 49880165a44f26dc84651858750facdee31f2513)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

block: Fix use after free error in bdrv_open_inherit()

When a block device is opened with BDRV_O_SNAPSHOT and the
bdrv_append_temp_snapshot() call fails then the error code path tries
to unref the already destroyed 'options' QDict.

This can be reproduced easily by setting TMPDIR to a location where
the QEMU process can't write:

$ TMPDIR=/nonexistent $QEMU -drive driver=null-co,snapshot=on

Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 8961be33e8ca7e809c603223803ea66ef7ea5be7)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

tests: update acpi expected files

Fixes: dbb6da8ba7e ("pc: acpi: revert back to 1 SRAT entry for hotpluggable area")
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit d2a1b1d602986a5f02658f6d4fc9ed422f8ddebf)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

pc: acpi: revert back to 1 SRAT entry for hotpluggable area

Commit
10efd7e108 "pc: acpi: fix memory hotplug regression by reducing stub SRAT entry size"
attemped to fix hotplug regression introduced by
848a1cc1e "hw/acpi-build: build SRAT memory affinity structures for DIMM devices"

fixed issue for Windows/3.0+ linux kernels, however it regressed 2.6 based
kernels (RHEL6) to the point where guest might crash at boot.
Reason is that 2.6 kernel discards SRAT table due too small last entry
which down the road leads to crashes. Hack I've tried in 10efd7e108 is also
not ACPI spec compliant according to which whole possible RAM should be
described in SRAT. Revert 10efd7e108 to fix regression for 2.6 based kernels.

With 10efd7e108 reverted, I've also tried splitting SRAT table statically
in different ways %/node and %/slot but Windows still fails to online
2nd pc-dimm hot-plugged into node 0 (as described in 10efd7e108) and
sometimes even coldplugged pc-dimms where affected with static SRAT
partitioning.
The only known so far way where Windows stays happy is when we have 1
SRAT entry in the last node covering all hotplug area.

Revert 848a1cc1e until we come up with a way to avoid regression
on Windows with hotplug area split in several entries.
Tested this with 2.6/3.0 based kernels (RHEL6/7) and WS20[08/12/12R2/16]).

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit dbb6da8ba7e02105bdbb33b527e088249c9843c8)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

vhost: fix invalid downcast

virtio_queue_get_desc_addr returns 64-bit hwaddr while int is usually 32-bit.
If returned hwaddr is not equal to 0 but least-significant 32 bits are
equal to 0 then this code will not actually stop running queue.

Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru>
Acked-by: Jia He <hejianet@gmail.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit fa4ae4be15fb08b37bec35139688ef563311d0b9)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

slirp: Add sanity check for str option length

When user provides a long domainname or hostname that doesn't fit in the
DHCP packet, we mustn't overflow the response packet buffer. Instead,
report errors, following the g_warning() in the slirp->vdnssearch
branch.

Also check the strlen against 256 when initializing slirp, which limit
is also from the protocol where one byte represents the string length.
This gives an early error before the warning which is harder to notice
or diagnose.

Reported-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Fam Zheng <famz@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
(cherry picked from commit 6e157a0339793bb081705f52318fc77afd10addf)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

kvm: add call to qemu_add_opts() for -overcommit option

qemu command fails to process -overcommit option. Add the missing
call to qemu_add_opts() in vl.c.

Signed-off-by: Prasad Singamsetty <prasad.singamsetty@oracle.com>
Message-Id: <20180815175704.105902-1-prasad.singamsetty@oracle.com>
Reviewed-by: Mark Kanda <mark.kanda@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 1fdd4748711a62d863744f42b958472509a6f202)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

target/xtensa: fix FPU2000 bugs

- FPU2000 defines rfr and wfr opcodes, not rfr.s and wfr.s;
- movcond.s uses incorrect operand in tcg_gen_movcond: in case the
condition is not satisfied it must not change its argument 0.

Fixes: c04e1692e3aa ("target/xtensa: extract FPU2000 opcode
translators")
Cc: qemu-stable@nongnu.org
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
(cherry picked from commit e8e05fd472cbe77650353eaa50d5a9703a91c1db)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

virtio: update MemoryRegionCaches when guest negotiates features

Because the cache is sized to include the rings and the event indices,
negotiating the VIRTIO_RING_F_EVENT_IDX feature will result in the size
of the cache changing. And because MemoryRegionCache accesses are
range-checked, if we skip this we end up with an assertion failure.
This happens with OpenBSD 6.3.

Reported-by: Fam Zheng <famz@redhat.com>
Fixes: 97cd965c070152bc626c7507df9fb356bbe1cd81
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit db812c4073c77c8a64db8d6663b3416a587c7b4a)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

block: iotest to catch abort on forced blockjob cancel

Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: df317f617fbe5affcf699cb8560e7b0c2e028a64.1534868459.git.jcody@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
(cherry picked from commit 26bf474ba92c76e61bea51726e22da6dfd185296)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

block: for jobs, do not clear user_paused until after the resume

The function job_cancel_async() will always cause an assert for blockjob
user resume. We set job->user_paused to false, and then call
job->driver->user_resume(). In the case of blockjobs, this is the
block_job_user_resume() function.

In that function, we assert that job.user_paused is set to true.
Unfortunately, right before calling this function, it has explicitly
been set to false.

The fix is pretty simple: set job->user_paused to false only after the
job user_resume() function has been called.

Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Message-id: bb183b77d8f2dd6bd67b8da559a90ac1e74b2052.1534868459.git.jcody@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
(cherry picked from commit e321c0597c7590499bacab239d7f86e257f96bcd)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

i386: Disable TOPOEXT by default on "-cpu host"

Enabling TOPOEXT is always allowed, but it can't be enabled
blindly by "-cpu host" because it may make guests crash if the
rest of the cache topology information isn't provided or isn't
consistent.

This addresses the bug reported at:
https://bugzilla.redhat.com/show_bug.cgi?id=1613277

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20180809221852.15285-1-ehabkost@redhat.com>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
Reviewed-by: Babu Moger <babu.moger@amd.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
(cherry picked from commit 7210a02c58572b2686a3a8d610c6628f87864aed)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

vnc: fix memleak of the "vnc-worker-output" name

Fixes repeated memory leaks of 18 bytes when using VNC:

    Direct leak of 831024 byte(s) in 46168 object(s) allocated from:
        ...
        #4 0x7f6d2f919bdd in g_strdup_vprintf glib/gstrfuncs.c:514
        #5 0x56085cdcf660 in buffer_init util/buffer.c:59
        #6 0x56085ca6a7ec in vnc_async_encoding_start ui/vnc-jobs.c:177
        #7 0x56085ca6b815 in vnc_worker_thread_loop ui/vnc-jobs.c:240

Fixes: 543b95801f98 ("vnc: attach names to buffers")
Cc: Gerd Hoffmann <kraxel@redhat.com>
CC: qemu-stable@nongnu.org
Signed-off-by: Peter Wu <peter@lekensteyn.nl>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20180807221830.3844-1-peter@lekensteyn.nl
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 0ae0b069aa8f23a138cc6d2d83edaa5c22f948a5)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

monitor: fix oob command leak

Spotted by ASAN, during make check...

Direct leak of 40 byte(s) in 1 object(s) allocated from:
    #0 0x7f8e27262c48 in malloc (/lib64/libasan.so.5+0xeec48)
    #1 0x7f8e26a5f3c5 in g_malloc (/lib64/libglib-2.0.so.0+0x523c5)
    #2 0x555ab67078a8 in qstring_from_str /home/elmarco/src/qq/qobject/qstring.c:67
    #3 0x555ab67071e4 in qstring_new /home/elmarco/src/qq/qobject/qstring.c:24
    #4 0x555ab6713fbf in qstring_from_escaped_str /home/elmarco/src/qq/qobject/json-parser.c:144
    #5 0x555ab671738c in parse_literal /home/elmarco/src/qq/qobject/json-parser.c:506
    #6 0x555ab67179c3 in parse_value /home/elmarco/src/qq/qobject/json-parser.c:569
    #7 0x555ab6715123 in parse_pair /home/elmarco/src/qq/qobject/json-parser.c:306
    #8 0x555ab6715483 in parse_object /home/elmarco/src/qq/qobject/json-parser.c:357
    #9 0x555ab671798b in parse_value /home/elmarco/src/qq/qobject/json-parser.c:561
    #10 0x555ab6717a6b in json_parser_parse_err /home/elmarco/src/qq/qobject/json-parser.c:592
    #11 0x555ab4fd4dcf in handle_qmp_command /home/elmarco/src/qq/monitor.c:4257
    #12 0x555ab6712c4d in json_message_process_token /home/elmarco/src/qq/qobject/json-streamer.c:105
    #13 0x555ab67e01e2 in json_lexer_feed_char /home/elmarco/src/qq/qobject/json-lexer.c:323
    #14 0x555ab67e0af6 in json_lexer_feed /home/elmarco/src/qq/qobject/json-lexer.c:373
    #15 0x555ab6713010 in json_message_parser_feed /home/elmarco/src/qq/qobject/json-streamer.c:124
    #16 0x555ab4fd58ec in monitor_qmp_read /home/elmarco/src/qq/monitor.c:4337
    #17 0x555ab6559df2 in qemu_chr_be_write_impl /home/elmarco/src/qq/chardev/char.c:175
    #18 0x555ab6559e95 in qemu_chr_be_write /home/elmarco/src/qq/chardev/char.c:187
    #19 0x555ab6560127 in fd_chr_read /home/elmarco/src/qq/chardev/char-fd.c:66
    #20 0x555ab65d9c73 in qio_channel_fd_source_dispatch /home/elmarco/src/qq/io/channel-watch.c:84
    #21 0x7f8e26a598ac in g_main_context_dispatch (/lib64/libglib-2.0.so.0+0x4c8ac)

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180809114417.28718-4-marcandre.lureau@redhat.com>
[Screwed up in commit b27314567d4]
Cc: qemu-stable@nongnu.org
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
(cherry picked from commit cb9ec42f33c07cd07d2e2971422bf7636c761202)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

aio: Do aio_notify_accept only during blocking aio_poll

An aio_notify() pairs with an aio_notify_accept(). The former should
happen in the main thread or a vCPU thread, and the latter should be
done in the IOThread.

There is one rare case that the main thread or vCPU thread may "steal"
the aio_notify() event just raised by itself, in bdrv_set_aio_context()
[1]. The sequence is like this:

    main thread                     IO Thread
    ===============================================================
    bdrv_drained_begin()
      aio_disable_external(ctx)
                                    aio_poll(ctx, true)
                                      ctx->notify_me += 2
    ...
    bdrv_drained_end()
      ...
        aio_notify()
    ...
    bdrv_set_aio_context()
      aio_poll(ctx, false)
[1]     aio_notify_accept(ctx)
                                      ppoll() /* Hang! */

[1] is problematic. It will clear the ctx->notifier event so that
the blocked ppoll() will not return.

(For the curious, this bug was noticed when booting a number of VMs
simultaneously in RHV.  One or two of the VMs will hit this race
condition, making the VIRTIO device unresponsive to I/O commands. When
it hangs, Seabios is busy waiting for a read request to complete (read
MBR), right after initializing the virtio-blk-pci device, using 100%
guest CPU. See also https://bugzilla.redhat.com/show_bug.cgi?id=1562750
for the original bug analysis.)

aio_notify() only injects an event when ctx->notify_me is set,
correspondingly aio_notify_accept() is only useful when ctx->notify_me
_was_ set. Move the call to it into the "blocking" branch. This will
effectively skip [1] and fix the hang.

Furthermore, blocking aio_poll is only allowed on home thread
(in_aio_context_home_thread), because otherwise two blocking
aio_poll()'s can steal each other's ctx->notifier event and cause
hanging just like described above.

Cc: qemu-stable@nongnu.org
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20180809132259.18402-3-famz@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
(cherry picked from commit b37548fcd1b8ac2e88e185a395bef851f3fc4e65)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

aio-posix: Don't count ctx->notifier as progress when polling

The same logic exists in fd polling. This change is especially important
to avoid busy loop once we limit aio_notify_accept() to blocking
aio_poll().

Cc: qemu-stable@nongnu.org
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20180809132259.18402-2-famz@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
(cherry picked from commit 70232b5253a3c4e03ed1ac47ef9246a8ac66c6fa)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

nvme: Fix nvme_init error handling

It is wrong to leave this field as 1, as nvme_close() called in the
error handling code in nvme_file_open() will use it and try to free
s->queues again.

Another problem is the cleaning ups are duplicated between the fail*
labels of nvme_init() and nvme_file_open(), which calls nvme_close().

A third problem is nvme_close() misses g_free() and
event_notifier_cleanup().

Fix all of them.

Cc: qemu-stable@nongnu.org
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20180712025420.4932-1-famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
(cherry picked from commit 9582f357bb6f6573c9a452743d8f3ab41ba2e3fa)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

qemu-img: fix regression copying secrets during convert

When the convert command is creating an output file that needs
secrets, we need to ensure those secrets are passed to both the
blk_new_open and bdrv_create API calls.

This is done by qemu-img extracting all opts matching the name
suffix "key-secret". Unfortunately the code doing this was run after the
call to bdrv_create(), which meant the QemuOpts it was extracting
secrets from was now empty.

Previously this worked by luks as a bug meant the "key-secret"
parameters were not purged from the QemuOpts. This bug was fixed in

  commit b76b4f604521e59f857d6177bc55f6f2e41fd392
  Author: Kevin Wolf <kwolf@redhat.com>
  Date:   Thu Jan 11 16:18:08 2018 +0100

    qcow2: Use visitor for options in qcow2_create()

Exposing the latent bug in qemu-img. This fix simply moves the copying
of secrets to before the bdrv_create() call.

Cc: qemu-stable@nongnu.org
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 8d65a3ccfd5db7f0436e095cd952f5d0c3a873ba)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

mirror: Fail gracefully for source == target

blockdev-mirror with the same node for source and target segfaults
today: A node is in its own backing chain, so mirror_start_job() decides
that this is an active commit. When adding the intermediate nodes with
block_job_add_bdrv(), it starts the iteration through the subchain with
the backing file of source, though, so it never reaches target and
instead runs into NULL at the base.

While we could fix that by starting with source itself, there is no
point in allowing mirroring a node into itself and I wouldn't be
surprised if this caused more problems later.

So just check for this scenario and error out.

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit 86fae10c64d642256cf019e6829929fa0d259c7a)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

block/qapi: Fix memory leak in qmp_query_blockstats()

For BlockBackends that are skipped in query-blockstats, we would leak
info since commit 567dcb31. Allocate info only later to avoid the memory
leak.

Fixes: CID 1394727
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
(cherry picked from commit f62492bb8d1ea7f7e156ffbdf411de46107072c5)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

target/arm: Use FZ not FZ16 for SVE FCVT single-half and double-half

We were using the wrong flush-to-zero bit for the non-half input.

Fixes: 46d33d1e3c9
Cc: qemu-stable@nongnu.org (3.0.1)
Reported-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Tested-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Message-id: 20180810193129.1556-5-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit e4ab5124a5c2e2291006b24bdc21c3dd8d087ff4)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

target/arm: Use fp_status_fp16 for do_fmpa_zpzzz_h

This makes float16_muladd correctly use FZ16 not FZ.

Fixes: 6ceabaad110
Cc: qemu-stable@nongnu.org (3.0.1)
Reported-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Tested-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Message-id: 20180810193129.1556-4-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 52a339b11d1719a6589de40606859939875fda9a)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

target/arm: Ignore float_flag_input_denormal from fp_status_f16

When FZ is set, input_denormal exceptions are recognized, but this does
not happen with FZ16. The softfloat code has no way to distinguish
these bits and will raise such exceptions into fp_status_f16.flags,
so ignore them when computing the accumulated flags.

Cc: qemu-stable@nongnu.org (3.0.1)
Reported-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Tested-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Message-id: 20180810193129.1556-3-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 19062c169e5bcdda3d60df9161228e107bf0f96e)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

target/arm: Adjust FPCR_MASK for FZ16

When support for FZ16 was added, we failed to include the bit
within FPCR_MASK, which means that it could never be set.
Continue to zero FZ16 when ARMv8.2-FP16 is not enabled.

Fixes: d81ce0ef2c4
Cc: qemu-stable@nongnu.org (3.0.1)
Reported-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Tested-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Message-id: 20180810193129.1556-2-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 0b62159be33d45d00dfa34a317c6d3da30ffb480)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

spapr_cpu_core: vmstate_[un]register per-CPU data from (un)realizefn

VMStateDescription vmstate_spapr_cpu_state was added by commit
b94020268e0b6 (spapr_cpu_core: migrate per-CPU data) to migrate per-CPU
data with the required vmstate registration and unregistration calls.
However the unregistration is being done only from vcpu creation error path
and not from CPU delete path.

This causes migration to fail with the following error if migration is
attempted after a CPU unplug like this:
Unknown savevm section or instance 'spapr_cpu' 16
Additionally this leaves the source VM unresponsive after migration failure.

Fix this by ensuring the vmstate_unregister happens during CPU removal.
Fixing this becomes easier when vmstate (un)registration calls are moved to
vcpu (un)realize functions which is what this patch does.

Fixes: https://bugs.launchpad.net/qemu/+bug/1785972
Reported-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit cc71c7760e263f808c4240a725425671eeeb7e4d)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

target/arm: Add sve-max-vq cpu property to -cpu max

This allows the default (and maximum) vector length to be set
from the command-line. Which is extraordinarily helpful in
debugging problems depending on vector length without having to
bake knowledge of PR_SET_SVE_VL into every guest binary.

Cc: qemu-stable@nongnu.org (3.0.1)
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit adf92eab90e3f5f34c285da6d14d48952b7a8e72)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

target/arm: Dump SVE state if enabled

Also fold the FPCR/FPSR state onto the same line as PSTATE,
and mention but do not dump disabled FPU state.

Cc: qemu-stable@nongnu.org (3.0.1)
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 2bf5f3f91bb4e3faa2a19aec042138a938afbf6a)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

target/arm: Reformat integer register dump

With PC, there are 33 registers. Three per line lines up nicely
without overflowing 80 columns.

Cc: qemu-stable@nongnu.org (3.0.1)
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 3cb506a399854c481c2fd2efabecda0654700c47)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

target/arm: Fix offset scaling for LD_zprr and ST_zprr

The scaling should be solely on the memory operation size; the number
of registers being loaded does not come in to the initial computation.

Cc: qemu-stable@nongnu.org (3.0.1)
Reported-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Reviewed-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 50ef1cbf31caad21019ae6fa8036ed6f29244ba5)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

target/arm: Fix offset for LD1R instructions

The immediate should be scaled by the size of the memory reference,
not the size of the elements into which it is loaded.

Cc: qemu-stable@nongnu.org (3.0.1)
Reported-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Reviewed-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit d0e372b0298f897993f831dbff7ad4f1c70f138e)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>