]> git.proxmox.com Git - mirror_qemu.git/log
mirror_qemu.git
4 years agoaspeed: add a RAM memory region container
Cédric Le Goater [Mon, 1 Jul 2019 16:26:17 +0000 (17:26 +0100)]
aspeed: add a RAM memory region container

The RAM memory region is defined after the SoC is realized when the
SDMC controller has checked that the defined RAM size for the machine
is correct. This is problematic for controller models requiring a link
on the RAM region, for DMA support in the SMC controller for instance.

Introduce a container memory region for the RAM that we can link into
the controllers early, before the SoC is realized. It will be
populated with the RAM region after the checks have be done.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Message-id: 20190618165311.27066-14-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4 years agoaspeed: remove the "ram" link
Cédric Le Goater [Mon, 1 Jul 2019 16:26:17 +0000 (17:26 +0100)]
aspeed: remove the "ram" link

It has never been used as far as I can tell from the git history.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Message-id: 20190618165311.27066-13-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4 years agoaspeed/timer: Ensure positive muldiv delta
Christian Svensson [Mon, 1 Jul 2019 16:26:17 +0000 (17:26 +0100)]
aspeed/timer: Ensure positive muldiv delta

If the host decrements the counter register that results in a negative
delta. This is then passed to muldiv64 which only handles unsigned
numbers resulting in bogus results.

This fix ensures the delta being operated on is positive.

Test case: kexec a kernel using aspeed_timer and it will freeze on the
second bootup when the kernel initializes the timer. With this patch
that no longer happens and the timer appears to run OK.

Signed-off-by: Christian Svensson <bluecmd@google.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Reviewed-by: Andrew Jeffery <andrew@aj.id.au>
Message-id: 20190618165311.27066-12-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4 years agoaspeed/timer: Fix match calculations
Andrew Jeffery [Mon, 1 Jul 2019 16:26:17 +0000 (17:26 +0100)]
aspeed/timer: Fix match calculations

If the match value exceeds reload then we don't want to include it in
calculations for the next event.

Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-id: 20190618165311.27066-10-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4 years agoaspeed/timer: Status register contains reload for stopped timer
Andrew Jeffery [Mon, 1 Jul 2019 16:26:16 +0000 (17:26 +0100)]
aspeed/timer: Status register contains reload for stopped timer

From the datasheet:

  This register stores the current status of counter #N. When timer
  enable bit TMC30[N * b] is disabled, the reload register will be
  loaded into this counter. When timer bit TMC30[N * b] is set, the
  counter will start to decrement. CPU can update this register value
  when enable bit is set.

Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Message-id: 20190618165311.27066-9-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4 years agoaspeed/timer: Fix behaviour running Linux
Joel Stanley [Mon, 1 Jul 2019 16:26:16 +0000 (17:26 +0100)]
aspeed/timer: Fix behaviour running Linux

The Linux kernel driver was updated in commit 4451d3f59f2a
("clocksource/drivers/fttmr010: Fix set_next_event handler) to fix an
issue observed on hardware:

 > RELOAD register is loaded into COUNT register when the aspeed timer
 > is enabled, which means the next event may be delayed because timer
 > interrupt won't be generated until <0xFFFFFFFF - current_count +
 > cycles>.

When running under Qemu, the system appeared "laggy". The guest is now
scheduling timer events too regularly, starving the host of CPU time.

This patch modifies the timer model to attempt to schedule the timer
expiry as the guest requests, but if we have missed the deadline we
re interrupt and try again, which allows the guest to catch up.

Provides expected behaviour with old and new guest code.

Fixes: c04bd47db6b9 ("hw/timer: Add ASPEED timer device model")
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-id: 20190618165311.27066-8-clg@kaod.org
[clg: - merged a fix from Andrew Jeffery <andrew@aj.id.au>
        "Fire interrupt on failure to meet deadline"
        https://lists.ozlabs.org/pipermail/openbmc/2019-January/014641.html
      - adapted commit log
      - checkpatch fixes ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4 years agoaspeed: add support for multiple NICs
Cédric Le Goater [Mon, 1 Jul 2019 16:26:16 +0000 (17:26 +0100)]
aspeed: add support for multiple NICs

The Aspeed SoCs have two MACs. Extend the Aspeed model to support a
second NIC.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Message-id: 20190618165311.27066-7-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4 years agoaspeed: introduce a configurable number of CPU per machine
Cédric Le Goater [Mon, 1 Jul 2019 16:26:16 +0000 (17:26 +0100)]
aspeed: introduce a configurable number of CPU per machine

The current models of the Aspeed SoCs only have one CPU but future
ones will support SMP. Introduce a new num_cpus field at the SoC class
level to define the number of available CPUs per SoC and also
introduce a 'num-cpus' property to activate the CPUs configured for
the machine.

The max_cpus limit of the machine should depend on the SoC definition
but, unfortunately, these values are not available when the machine
class is initialized. This is the reason why we add a check on
num_cpus in the AspeedSoC realize handler.

SMP support will be activated when models for such SoCs are implemented.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Message-id: 20190618165311.27066-6-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4 years agohw/arm/aspeed: Add RTC to SoC
Joel Stanley [Mon, 1 Jul 2019 16:26:16 +0000 (17:26 +0100)]
hw/arm/aspeed: Add RTC to SoC

All systems have an RTC.

The IRQ is hooked up but the model does not use it at this stage. There
is no guest code that uses it, so this limitation is acceptable.

Signed-off-by: Joel Stanley <joel@jms.id.au>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20190618165311.27066-5-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4 years agohw: timer: Add ASPEED RTC device
Joel Stanley [Mon, 1 Jul 2019 16:26:15 +0000 (17:26 +0100)]
hw: timer: Add ASPEED RTC device

The RTC is modeled to provide time and date functionality. It is
initialised at zero to match the hardware.

There is no modelling of the alarm functionality, which includes the IRQ
line. As there is no guest code to exercise this function that is
acceptable for now.

Signed-off-by: Joel Stanley <joel@jms.id.au>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20190618165311.27066-4-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4 years agoaspeed: add a per SoC mapping for the memory space
Cédric Le Goater [Mon, 1 Jul 2019 16:26:15 +0000 (17:26 +0100)]
aspeed: add a per SoC mapping for the memory space

This will simplify the definition of new SoCs, like the AST2600 which
should use a slightly different address space and have a different set
of controllers.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Message-id: 20190618165311.27066-3-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4 years agoaspeed: add a per SoC mapping for the interrupt space
Cédric Le Goater [Mon, 1 Jul 2019 16:26:15 +0000 (17:26 +0100)]
aspeed: add a per SoC mapping for the interrupt space

This will simplify the definition of new SoCs, like the AST2600 which
should use a different CPU and a different IRQ number layout.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Message-id: 20190618165311.27066-2-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4 years agoi.mx7d: pci: Update PCI IRQ mapping to match HW
Andrey Smirnov [Mon, 1 Jul 2019 16:26:15 +0000 (17:26 +0100)]
i.mx7d: pci: Update PCI IRQ mapping to match HW

Datasheet for i.MX7 is incorrect and i.MX7's PCI IRQ mapping matches
that of i.MX6:

    * INTD/MSI    122
    * INTC        123
    * INTB        124
    * INTA        125

Fix all of the relevant code to reflect that fact. Needed by latest
Linux kernels.

(Reference: Linux kernel commit 538d6e9d597584e80 from an
NXP employee confirming that the datasheet is incorrect and
with a report of a test against hardware.)

Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: added ref to kernel commit confirming the datasheet error]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4 years agopci: designware: Update MSI mapping when MSI address changes
Andrey Smirnov [Mon, 1 Jul 2019 16:26:15 +0000 (17:26 +0100)]
pci: designware: Update MSI mapping when MSI address changes

MSI mapping needs to be update when MSI address changes, so add the
code to do so.

Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4 years agopci: designware: Update MSI mapping unconditionally
Andrey Smirnov [Mon, 1 Jul 2019 16:26:15 +0000 (17:26 +0100)]
pci: designware: Update MSI mapping unconditionally

Expression to calculate update_msi_mapping in code handling writes to
DESIGNWARE_PCIE_MSI_INTR0_ENABLE is missing an ! operator and should
be:

    !!root->msi.intr[0].enable ^ !!val;

so that MSI mapping is updated when enabled transitions from either
"none" -> "any" or "any" -> "none". Since that register shouldn't be
written to very often, change the code to update MSI mapping
unconditionally instead of trying to fix the update_msi_mapping logic.

Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4 years agoi.mx7d: Add no-op/unimplemented PCIE PHY IP block
Andrey Smirnov [Mon, 1 Jul 2019 16:26:14 +0000 (17:26 +0100)]
i.mx7d: Add no-op/unimplemented PCIE PHY IP block

Add no-op/unimplemented PCIE PHY IP block. Needed by new kernels to
use PCIE.

Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4 years agoi.mx7d: Add no-op/unimplemented APBH DMA module
Andrey Smirnov [Mon, 1 Jul 2019 16:26:14 +0000 (17:26 +0100)]
i.mx7d: Add no-op/unimplemented APBH DMA module

Instantiate no-op APBH DMA module. Needed to boot latest Linux kernel.

Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4 years agohw/arm/virt: Add support for Cortex-A7
Jan Kiszka [Mon, 1 Jul 2019 16:26:14 +0000 (17:26 +0100)]
hw/arm/virt: Add support for Cortex-A7

Allow cortex-a7 to be used with the virt board; it supports
the v7VE features and there is no reason to deny this type.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: fc5404f7-4d1d-c28f-6e48-d8799c82acc0@web.de
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4 years agohw/arm/msf2-som: Exit when the cpu is not the expected one
Philippe Mathieu-Daudé [Mon, 1 Jul 2019 16:26:14 +0000 (17:26 +0100)]
hw/arm/msf2-som: Exit when the cpu is not the expected one

This machine correctly defines its default_cpu_type to cortex-m3
and report an error if the user requested another cpu_type,
however it does not exit, and this can confuse users trying
to use another core:

  $ qemu-system-arm -M emcraft-sf2 -cpu cortex-m4 -kernel test-m4.elf
  qemu-system-arm: This board can only be used with CPU cortex-m3-arm-cpu
  [output related to M3 core ...]

The CPU is indeed a M3 core:

  (qemu) info qom-tree
  /machine (emcraft-sf2-machine)
    /unattached (container)
      /device[0] (msf2-soc)
        /armv7m (armv7m)
          /cpu (cortex-m3-arm-cpu)

Add the missing exit() call to return to the shell.

Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-id: 20190617160136.29930-1-philmd@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4 years agohw/arm/boot: fix direct kernel boot with initrd
Andrew Jones [Mon, 1 Jul 2019 16:26:14 +0000 (17:26 +0100)]
hw/arm/boot: fix direct kernel boot with initrd

Fix the condition used to check whether the initrd fits
into RAM; in some cases if an initrd was also passed on
the command line we would get an error stating that it
was too big to fit into RAM after the kernel. Despite the
error the loader continued anyway, though, so also add an
exit(1) when the initrd is actually too big.

Fixes: 852dc64d665f ("hw/arm/boot: Diagnose layouts that put initrd or
DTB off the end of RAM")
Signed-off-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190618125844.4863-1-drjones@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4 years agoMerge remote-tracking branch 'remotes/vivier2/tags/linux-user-for-4.1-pull-request...
Peter Maydell [Mon, 1 Jul 2019 14:55:40 +0000 (15:55 +0100)]
Merge remote-tracking branch 'remotes/vivier2/tags/linux-user-for-4.1-pull-request' into staging

Update ppc64 feature and default CPU
next setsockops() options
Improve "-L" option
Another fix for 5.2-rc1 headers

# gpg: Signature made Wed 26 Jun 2019 13:11:04 BST
# gpg:                using RSA key CD2F75DDC8E3A4DC2E4F5173F30C38BD3F2FBE3C
# gpg:                issuer "laurent@vivier.eu"
# gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>" [full]
# gpg:                 aka "Laurent Vivier <laurent@vivier.eu>" [full]
# gpg:                 aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>" [full]
# Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F  5173 F30C 38BD 3F2F BE3C

* remotes/vivier2/tags/linux-user-for-4.1-pull-request:
  linux-user: set default PPC64 CPU
  linux-user: update PPC64 HWCAP2 feature list
  linux-user: Add support for setsockopt() options IPV6_<ADD|DROP>_MEMBERSHIP
  linux-user: Add support for setsockopt() option SOL_ALG
  linux-user: emulate msgsnd(), msgrcv() and semtimedop()
  util/path: Do not cache all filenames at startup

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4 years agoMerge remote-tracking branch 'remotes/amarkovic/tags/mips-queue-jun-26-2019' into...
Peter Maydell [Mon, 1 Jul 2019 13:39:45 +0000 (14:39 +0100)]
Merge remote-tracking branch 'remotes/amarkovic/tags/mips-queue-jun-26-2019' into staging

MIPS queue for June 2016th, 2019

# gpg: Signature made Wed 26 Jun 2019 12:38:58 BST
# gpg:                using RSA key D4972A8967F75A65
# gpg: Good signature from "Aleksandar Markovic <amarkovic@wavecomp.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 8526 FBF1 5DA3 811F 4A01  DD75 D497 2A89 67F7 5A65

* remotes/amarkovic/tags/mips-queue-jun-26-2019:
  target/mips: Fix big endian host behavior for interleave MSA instructions
  tests/tcg: target/mips: Fix some test cases for pack MSA instructions
  tests/tcg: target/mips: Add support for MSA MIPS32R6 testings
  tests/tcg: target/mips: Add support for MSA big-endian target testings
  tests/tcg: target/mips: Amend tests for MSA int multiply instructions
  tests/tcg: target/mips: Amend tests for MSA int dot product instructions
  tests/tcg: target/mips: Add tests for MSA move instructions
  tests/tcg: target/mips: Add tests for MSA bit move instructions
  dma/rc4030: Minor code style cleanup
  dma/rc4030: Fix off-by-one error in specified memory region size
  hw/mips/gt64xxx_pci: Align the pci0-mem size
  hw/mips/gt64xxx_pci: Convert debug printf()s to trace events
  hw/mips/gt64xxx_pci: Use qemu_log_mask() instead of debug printf()
  hw/mips/gt64xxx_pci: Fix 'spaces' coding style issues
  hw/mips/gt64xxx_pci: Fix 'braces' coding style issues
  hw/mips/gt64xxx_pci: Fix 'tabs' coding style issues
  hw/mips/gt64xxx_pci: Fix multiline comment syntax

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4 years agoMerge remote-tracking branch 'remotes/bkoppelmann2/tags/pull-tricore-20190625' into...
Peter Maydell [Mon, 1 Jul 2019 12:47:20 +0000 (13:47 +0100)]
Merge remote-tracking branch 'remotes/bkoppelmann2/tags/pull-tricore-20190625' into staging

* Add FTOIZ/UTOF/QSEED insns
* Fix sync of hflags and swapped args of RRPW_INSERT

# gpg: Signature made Tue 25 Jun 2019 14:05:03 BST
# gpg:                using RSA key 6E636A7E83F2DD0CFA6E6E370AD2C6396B69CA14
# gpg:                issuer "kbastian@mail.uni-paderborn.de"
# gpg: Good signature from "Bastian Koppelmann <kbastian@mail.uni-paderborn.de>" [full]
# Primary key fingerprint: 6E63 6A7E 83F2 DD0C FA6E  6E37 0AD2 C639 6B69 CA14

* remotes/bkoppelmann2/tags/pull-tricore-20190625:
  tricore: add QSEED instruction
  tricore: sync ctx.hflags with tb->flags
  tricore: fix RRPW_INSERT instruction
  tricore: add UTOF instruction
  tricore: add FTOIZ instruction

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4 years agoMerge remote-tracking branch 'remotes/aperard/tags/pull-xen-20190624' into staging
Peter Maydell [Mon, 1 Jul 2019 12:03:51 +0000 (13:03 +0100)]
Merge remote-tracking branch 'remotes/aperard/tags/pull-xen-20190624' into staging

Xen queue

* Fix build
* xen-block: support feature-large-sector-size
* xen-block: Support IOThread polling for PV shared rings
* Avoid usage of a VLA
* Cleanup Xen headers usage

# gpg: Signature made Mon 24 Jun 2019 16:30:32 BST
# gpg:                using RSA key F80C006308E22CFD8A92E7980CF5572FD7FB55AF
# gpg:                issuer "anthony.perard@citrix.com"
# gpg: Good signature from "Anthony PERARD <anthony.perard@gmail.com>" [marginal]
# gpg:                 aka "Anthony PERARD <anthony.perard@citrix.com>" [marginal]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 5379 2F71 024C 600F 778A  7161 D8D5 7199 DF83 42C8
#      Subkey fingerprint: F80C 0063 08E2 2CFD 8A92  E798 0CF5 572F D7FB 55AF

* remotes/aperard/tags/pull-xen-20190624:
  xen: Import other xen/io/*.h
  Revert xen/io/ring.h of "Clean up a few header guard symbols"
  xen: Drop includes of xen/hvm/params.h
  xen: Avoid VLA
  xen-bus / xen-block: add support for event channel polling
  xen-bus: allow AioContext to be specified for each event channel
  xen-bus: use a separate fd for each event channel
  xen-block: support feature-large-sector-size

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4 years agoMerge remote-tracking branch 'remotes/maxreitz/tags/pull-block-2019-06-24' into staging
Peter Maydell [Mon, 1 Jul 2019 10:28:28 +0000 (11:28 +0100)]
Merge remote-tracking branch 'remotes/maxreitz/tags/pull-block-2019-06-24' into staging

Block patches:
- The SSH block driver now uses libssh instead of libssh2
- The VMDK block driver gets read-only support for the seSparse
  subformat
- Various fixes

# gpg: Signature made Mon 24 Jun 2019 15:42:56 BST
# gpg:                using RSA key 91BEB60A30DB3E8857D11829F407DB0061D5CF40
# gpg:                issuer "mreitz@redhat.com"
# gpg: Good signature from "Max Reitz <mreitz@redhat.com>" [full]
# Primary key fingerprint: 91BE B60A 30DB 3E88 57D1  1829 F407 DB00 61D5 CF40

* remotes/maxreitz/tags/pull-block-2019-06-24:
  iotests: Fix 205 for concurrent runs
  ssh: switch from libssh2 to libssh
  vmdk: Add read-only support for seSparse snapshots
  vmdk: Reduce the max bound for L1 table size
  vmdk: Fix comment regarding max l1_size coverage
  iotest 134: test cluster-misaligned encrypted write
  blockdev: enable non-root nodes for transaction drive-backup source
  nvme: do not advertise support for unsupported arbitration mechanism

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4 years agotarget/mips: Fix big endian host behavior for interleave MSA instructions
Aleksandar Markovic [Wed, 26 Jun 2019 10:07:09 +0000 (12:07 +0200)]
target/mips: Fix big endian host behavior for interleave MSA instructions

Fix big endian host behavior for interleave MSA instructions. Previous
fix used TARGET_WORDS_BIGENDIAN instead of HOST_WORDS_BIGENDIAN, which
was a mistake.

Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Reviewed-by: Aleksandar Rikalo <arikalo@wavecomp.com>
Message-Id: <1561543629-20327-9-git-send-email-aleksandar.markovic@rt-rk.com>

4 years agotests/tcg: target/mips: Fix some test cases for pack MSA instructions
Aleksandar Markovic [Wed, 26 Jun 2019 10:07:08 +0000 (12:07 +0200)]
tests/tcg: target/mips: Fix some test cases for pack MSA instructions

Fix certian test cases for MSA pack instructions.

Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Reviewed-by: Aleksandar Rikalo <arikalo@wavecomp.com>
Message-Id: <1561543629-20327-8-git-send-email-aleksandar.markovic@rt-rk.com>

4 years agotests/tcg: target/mips: Add support for MSA MIPS32R6 testings
Aleksandar Markovic [Wed, 26 Jun 2019 10:07:07 +0000 (12:07 +0200)]
tests/tcg: target/mips: Add support for MSA MIPS32R6 testings

Add files for MSA MIPS32R6 target testings (copiling and running).

Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Reviewed-by: Aleksandar Rikalo <arikalo@wavecomp.com>
Message-Id: <1561543629-20327-7-git-send-email-aleksandar.markovic@rt-rk.com>

4 years agotests/tcg: target/mips: Add support for MSA big-endian target testings
Aleksandar Markovic [Wed, 26 Jun 2019 10:07:06 +0000 (12:07 +0200)]
tests/tcg: target/mips: Add support for MSA big-endian target testings

Add files for MSA big-endian target testings (copiling and running).

Little-endian files are renamed and ammended too.

Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Reviewed-by: Aleksandar Rikalo <arikalo@wavecomp.com>
Message-Id: <1561543629-20327-6-git-send-email-aleksandar.markovic@rt-rk.com>

4 years agotests/tcg: target/mips: Amend tests for MSA int multiply instructions
Aleksandar Markovic [Wed, 26 Jun 2019 10:07:05 +0000 (12:07 +0200)]
tests/tcg: target/mips: Amend tests for MSA int multiply instructions

Amend tests for MSA int multiply instructions.

Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Reviewed-by: Aleksandar Rikalo <arikalo@wavecomp.com>
Message-Id: <1561543629-20327-5-git-send-email-aleksandar.markovic@rt-rk.com>

4 years agotests/tcg: target/mips: Amend tests for MSA int dot product instructions
Aleksandar Markovic [Wed, 26 Jun 2019 10:07:04 +0000 (12:07 +0200)]
tests/tcg: target/mips: Amend tests for MSA int dot product instructions

Add tests for instructions whose result depends on the value in destination
register (prior to instruction execution).

Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Reviewed-by: Aleksandar Rikalo <arikalo@wavecomp.com>
Message-Id: <1561543629-20327-4-git-send-email-aleksandar.markovic@rt-rk.com>

4 years agotests/tcg: target/mips: Add tests for MSA move instructions
Aleksandar Markovic [Wed, 26 Jun 2019 10:07:03 +0000 (12:07 +0200)]
tests/tcg: target/mips: Add tests for MSA move instructions

Add tests for MSA move instructions.

Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Reviewed-by: Aleksandar Rikalo <arikalo@wavecomp.com>
Message-Id: <1561543629-20327-3-git-send-email-aleksandar.markovic@rt-rk.com>

4 years agotests/tcg: target/mips: Add tests for MSA bit move instructions
Aleksandar Markovic [Wed, 26 Jun 2019 10:07:02 +0000 (12:07 +0200)]
tests/tcg: target/mips: Add tests for MSA bit move instructions

Add tests for MSA bit move instructions.

Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Reviewed-by: Aleksandar Rikalo <arikalo@wavecomp.com>
Message-Id: <1561543629-20327-2-git-send-email-aleksandar.markovic@rt-rk.com>

4 years agodma/rc4030: Minor code style cleanup
Aleksandar Markovic [Tue, 25 Jun 2019 14:27:18 +0000 (16:27 +0200)]
dma/rc4030: Minor code style cleanup

Fix some simple checkpatch.pl warnings in rc4030.c.

Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Reviewed-by: Aleksandar Rikalo <arikalo@wavecomp.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <1561472838-32272-3-git-send-email-aleksandar.markovic@rt-rk.com>

4 years agodma/rc4030: Fix off-by-one error in specified memory region size
Aleksandar Markovic [Tue, 25 Jun 2019 14:27:17 +0000 (16:27 +0200)]
dma/rc4030: Fix off-by-one error in specified memory region size

The size is one byte less than it should be:

address-space: rc4030-dma
  0000000000000000-00000000fffffffe (prio 0, i/o): rc4030.dma

rc4030 is used in MIPS Jazz board context.

Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Reviewed-by: Aleksandar Rikalo <arikalo@wavecomp.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <1561472838-32272-2-git-send-email-aleksandar.markovic@rt-rk.com>

4 years agohw/mips/gt64xxx_pci: Align the pci0-mem size
Philippe Mathieu-Daudé [Mon, 24 Jun 2019 22:28:41 +0000 (00:28 +0200)]
hw/mips/gt64xxx_pci: Align the pci0-mem size

One byte is missing, use an aligned size.

    (qemu) info mtree
    memory-region: pci0-mem
      0000000000000000-00000000fffffffe (prio 0, i/o): pci0-mem
                                      ^

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Message-Id: <20190624222844.26584-8-f4bug@amsat.org>

4 years agohw/mips/gt64xxx_pci: Convert debug printf()s to trace events
Philippe Mathieu-Daudé [Mon, 24 Jun 2019 22:28:40 +0000 (00:28 +0200)]
hw/mips/gt64xxx_pci: Convert debug printf()s to trace events

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Message-Id: <20190624222844.26584-7-f4bug@amsat.org>

4 years agohw/mips/gt64xxx_pci: Use qemu_log_mask() instead of debug printf()
Philippe Mathieu-Daudé [Mon, 24 Jun 2019 22:28:39 +0000 (00:28 +0200)]
hw/mips/gt64xxx_pci: Use qemu_log_mask() instead of debug printf()

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Message-Id: <20190624222844.26584-6-f4bug@amsat.org>

4 years agohw/mips/gt64xxx_pci: Fix 'spaces' coding style issues
Philippe Mathieu-Daudé [Mon, 24 Jun 2019 22:28:38 +0000 (00:28 +0200)]
hw/mips/gt64xxx_pci: Fix 'spaces' coding style issues

Since we'll move this code around, fix its style first:

  ERROR: space prohibited between function name and open parenthesis
  ERROR: line over 90 characters

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Message-Id: <20190624222844.26584-5-f4bug@amsat.org>

4 years agohw/mips/gt64xxx_pci: Fix 'braces' coding style issues
Philippe Mathieu-Daudé [Mon, 24 Jun 2019 22:28:37 +0000 (00:28 +0200)]
hw/mips/gt64xxx_pci: Fix 'braces' coding style issues

Since we'll move this code around, fix its style first:

  ERROR: braces {} are necessary for all arms of this statement

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Message-Id: <20190624222844.26584-4-f4bug@amsat.org>

4 years agohw/mips/gt64xxx_pci: Fix 'tabs' coding style issues
Philippe Mathieu-Daudé [Mon, 24 Jun 2019 22:28:36 +0000 (00:28 +0200)]
hw/mips/gt64xxx_pci: Fix 'tabs' coding style issues

Since we'll move this code around, fix its style first:

  ERROR: code indent should never use tabs

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Message-Id: <20190624222844.26584-3-f4bug@amsat.org>

4 years agohw/mips/gt64xxx_pci: Fix multiline comment syntax
Philippe Mathieu-Daudé [Mon, 24 Jun 2019 22:28:35 +0000 (00:28 +0200)]
hw/mips/gt64xxx_pci: Fix multiline comment syntax

Since commit 8c06fbdf36b checkpatch.pl enforce a new multiline
comment syntax. Since we'll move this code around, fix its style
first.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Message-Id: <20190624222844.26584-2-f4bug@amsat.org>

4 years agotricore: add QSEED instruction
Andreas Konopik [Mon, 24 Jun 2019 07:03:39 +0000 (09:03 +0200)]
tricore: add QSEED instruction

Signed-off-by: Andreas Konopik <andreas.konopik@efs-auto.de>
Signed-off-by: David Brenken <david.brenken@efs-auto.de>
Signed-off-by: Georg Hofstetter <georg.hofstetter@efs-auto.de>
Signed-off-by: Robert Rasche <robert.rasche@efs-auto.de>
Signed-off-by: Lars Biermanski <lars.biermanski@efs-auto.de>
Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Message-Id: <20190624070339.4408-6-david.brenken@efs-auto.org>
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
[BK: Added fp_status arg to float32_is_signaling_nan()]

4 years agotricore: sync ctx.hflags with tb->flags
Georg Hofstetter [Mon, 24 Jun 2019 07:03:38 +0000 (09:03 +0200)]
tricore: sync ctx.hflags with tb->flags

Signed-off-by: Andreas Konopik <andreas.konopik@efs-auto.de>
Signed-off-by: David Brenken <david.brenken@efs-auto.de>
Signed-off-by: Georg Hofstetter <georg.hofstetter@efs-auto.de>
Signed-off-by: Robert Rasche <robert.rasche@efs-auto.de>
Signed-off-by: Lars Biermanski <lars.biermanski@efs-auto.de>
Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Message-Id: <20190624070339.4408-5-david.brenken@efs-auto.org>
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
4 years agotricore: fix RRPW_INSERT instruction
David Brenken [Mon, 24 Jun 2019 07:03:37 +0000 (09:03 +0200)]
tricore: fix RRPW_INSERT instruction

Signed-off-by: Andreas Konopik <andreas.konopik@efs-auto.de>
Signed-off-by: David Brenken <david.brenken@efs-auto.de>
Signed-off-by: Georg Hofstetter <georg.hofstetter@efs-auto.de>
Signed-off-by: Robert Rasche <robert.rasche@efs-auto.de>
Signed-off-by: Lars Biermanski <lars.biermanski@efs-auto.de>
Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Message-Id: <20190624070339.4408-4-david.brenken@efs-auto.org>
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
4 years agotricore: add UTOF instruction
David Brenken [Mon, 24 Jun 2019 07:03:36 +0000 (09:03 +0200)]
tricore: add UTOF instruction

Signed-off-by: Andreas Konopik <andreas.konopik@efs-auto.de>
Signed-off-by: David Brenken <david.brenken@efs-auto.de>
Signed-off-by: Georg Hofstetter <georg.hofstetter@efs-auto.de>
Signed-off-by: Robert Rasche <robert.rasche@efs-auto.de>
Signed-off-by: Lars Biermanski <lars.biermanski@efs-auto.de>
Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Message-Id: <20190624070339.4408-3-david.brenken@efs-auto.org>
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
4 years agotricore: add FTOIZ instruction
David Brenken [Mon, 24 Jun 2019 07:03:35 +0000 (09:03 +0200)]
tricore: add FTOIZ instruction

Signed-off-by: Andreas Konopik <andreas.konopik@efs-auto.de>
Signed-off-by: David Brenken <david.brenken@efs-auto.de>
Signed-off-by: Georg Hofstetter <georg.hofstetter@efs-auto.de>
Signed-off-by: Robert Rasche <robert.rasche@efs-auto.de>
Signed-off-by: Lars Biermanski <lars.biermanski@efs-auto.de>
Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Message-Id: <20190624070339.4408-2-david.brenken@efs-auto.org>
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
4 years agolinux-user: set default PPC64 CPU
Laurent Vivier [Sun, 9 Jun 2019 14:35:21 +0000 (16:35 +0200)]
linux-user: set default PPC64 CPU

The default CPU for pseries has been set to POWER9 by default.
We can use the same default for linux-user

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20190609143521.19374-2-laurent@vivier.eu>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
4 years agolinux-user: update PPC64 HWCAP2 feature list
Laurent Vivier [Sun, 9 Jun 2019 14:35:20 +0000 (16:35 +0200)]
linux-user: update PPC64 HWCAP2 feature list

QEMU_PPC_FEATURE2_VEC_CRYPTO enables the use
of VSX instructions in libcrypto that are accelerated
by the TCG vector instructions now.

QEMU_PPC_FEATURE2_DARN allows to use the new builtin
qemu_guest_getrandom() function.

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20190609143521.19374-1-laurent@vivier.eu>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
4 years agolinux-user: Add support for setsockopt() options IPV6_<ADD|DROP>_MEMBERSHIP
Neng Chen [Wed, 19 Jun 2019 14:17:10 +0000 (16:17 +0200)]
linux-user: Add support for setsockopt() options IPV6_<ADD|DROP>_MEMBERSHIP

Add support for the option IPV6_<ADD|DROP>_MEMBERSHIP of the syscall
setsockopt(). This option controls membership in multicast groups.
Argument is a pointer to a struct ipv6_mreq.

The glibc <netinet/in.h> header defines the ipv6_mreq structure,
which includes the following members:

  struct in6_addr  ipv6mr_multiaddr;
  unsigned int     ipv6mr_interface;

Whereas the kernel in its <linux/in6.h> header defines following
members of the same structure:

  struct in6_addr  ipv6mr_multiaddr;
  int              ipv6mr_ifindex;

POSIX defines ipv6mr_interface [1].

__UAPI_DEF_IVP6_MREQ appears in kernel headers with v3.12:

  cfd280c91253 net: sync some IP headers with glibc

Without __UAPI_DEF_IVP6_MREQ, kernel defines ipv6mr_ifindex, and
this is explained in cfd280c91253:

  "If you include the kernel headers first you get those,
  and if you include the glibc headers first you get those,
  and the following patch arranges a coordination and
  synchronization between the two."

So before 3.12, a program can't include both <netinet/in.h> and
<linux/in6.h>.

In linux-user/syscall.c, we only include <netinet/in.h> (glibc) and
not <linux/in6.h> (kernel headers), so ipv6mr_interface is the one
to use.

[1] http://pubs.opengroup.org/onlinepubs/009695399/basedefs/netinet/in.h.html

Signed-off-by: Neng Chen <nchen@wavecomp.com>
Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <1560953834-29584-2-git-send-email-aleksandar.markovic@rt-rk.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
4 years agolinux-user: Add support for setsockopt() option SOL_ALG
Yunqiang Su [Wed, 19 Jun 2019 14:17:11 +0000 (16:17 +0200)]
linux-user: Add support for setsockopt() option SOL_ALG

Add support for options SOL_ALG of the syscall setsockopt(). This
option is used in relation to Linux kernel Crypto API, and allows
a user to set additional information for the cipher operation via
syscall setsockopt(). The field "optname" must be one of the
following:

  - ALG_SET_KEY – seting the key
  - ALG_SET_AEAD_AUTHSIZE – set the authentication tag size

SOL_ALG is relatively newer setsockopt() option. Therefore, the
code that handles SOL_ALG is enclosed in "ifdef" so that the build
does not fail for older kernels that do not contain support for
SOL_ALG. "ifdef" also contains check if ALG_SET_KEY and
ALG_SET_AEAD_AUTHSIZE are defined.

Signed-off-by: Yunqiang Su <ysu@wavecomp.com>
Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <1560953834-29584-3-git-send-email-aleksandar.markovic@rt-rk.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
4 years agolinux-user: emulate msgsnd(), msgrcv() and semtimedop()
Laurent Vivier [Wed, 29 May 2019 08:48:04 +0000 (10:48 +0200)]
linux-user: emulate msgsnd(), msgrcv() and semtimedop()

When we have updated kernel headers to 5.2-rc1 we have introduced
new syscall numbers that can be not supported by older kernels
and fail with ENOSYS while the guest emulation succeeded before
because the syscalls were emulated with ipc().

This patch fixes the problem by using ipc() if the new syscall
returns ENOSYS.

Fixes: 86e636951ddc ("linux-user: fix __NR_semtimedop undeclared error")
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20190529084804.25950-1-laurent@vivier.eu>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
4 years agoutil/path: Do not cache all filenames at startup
Richard Henderson [Sun, 19 May 2019 20:19:41 +0000 (13:19 -0700)]
util/path: Do not cache all filenames at startup

If one uses -L $PATH to point to a full chroot, the startup time
is significant.  In addition, the existing probing algorithm fails
to handle symlink loops.

Instead, probe individual paths on demand.  Cache both positive
and negative results within $PATH, so that any one filename is
probed only once.

Use glib filename functions for clarity.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Tested-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20190519201953.20161-2-richard.henderson@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
4 years agoiotests: Fix 205 for concurrent runs
Max Reitz [Tue, 18 Jun 2019 21:02:38 +0000 (23:02 +0200)]
iotests: Fix 205 for concurrent runs

Tests should place their files into the test directory.  This includes
Unix sockets.  205 currently fails to do so, which prevents it from
being run concurrently.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20190618210238.9524-1-mreitz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
4 years agossh: switch from libssh2 to libssh
Pino Toscano [Thu, 20 Jun 2019 20:08:40 +0000 (22:08 +0200)]
ssh: switch from libssh2 to libssh

Rewrite the implementation of the ssh block driver to use libssh instead
of libssh2.  The libssh library has various advantages over libssh2:
- easier API for authentication (for example for using ssh-agent)
- easier API for known_hosts handling
- supports newer types of keys in known_hosts

Use APIs/features available in libssh 0.8 conditionally, to support
older versions (which are not recommended though).

Adjust the iotest 207 according to the different error message, and to
find the default key type for localhost (to properly compare the
fingerprint with).
Contributed-by: Max Reitz <mreitz@redhat.com>
Adjust the various Docker/Travis scripts to use libssh when available
instead of libssh2. The mingw/mxe testing is dropped for now, as there
are no packages for it.

Signed-off-by: Pino Toscano <ptoscano@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20190620200840.17655-1-ptoscano@redhat.com
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 5873173.t2JhDm7DL7@lindworm.usersys.redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
4 years agovmdk: Add read-only support for seSparse snapshots
Sam Eiderman [Thu, 20 Jun 2019 09:10:57 +0000 (12:10 +0300)]
vmdk: Add read-only support for seSparse snapshots

Until ESXi 6.5 VMware used the vmfsSparse format for snapshots (VMDK3 in
QEMU).

This format was lacking in the following:

    * Grain directory (L1) and grain table (L2) entries were 32-bit,
      allowing access to only 2TB (slightly less) of data.
    * The grain size (default) was 512 bytes - leading to data
      fragmentation and many grain tables.
    * For space reclamation purposes, it was necessary to find all the
      grains which are not pointed to by any grain table - so a reverse
      mapping of "offset of grain in vmdk" to "grain table" must be
      constructed - which takes large amounts of CPU/RAM.

The format specification can be found in VMware's documentation:
https://www.vmware.com/support/developer/vddk/vmdk_50_technote.pdf

In ESXi 6.5, to support snapshot files larger than 2TB, a new format was
introduced: SESparse (Space Efficient).

This format fixes the above issues:

    * All entries are now 64-bit.
    * The grain size (default) is 4KB.
    * Grain directory and grain tables are now located at the beginning
      of the file.
      + seSparse format reserves space for all grain tables.
      + Grain tables can be addressed using an index.
      + Grains are located in the end of the file and can also be
        addressed with an index.
      - seSparse vmdks of large disks (64TB) have huge preallocated
        headers - mainly due to L2 tables, even for empty snapshots.
    * The header contains a reverse mapping ("backmap") of "offset of
      grain in vmdk" to "grain table" and a bitmap ("free bitmap") which
      specifies for each grain - whether it is allocated or not.
      Using these data structures we can implement space reclamation
      efficiently.
    * Due to the fact that the header now maintains two mappings:
        * The regular one (grain directory & grain tables)
        * A reverse one (backmap and free bitmap)
      These data structures can lose consistency upon crash and result
      in a corrupted VMDK.
      Therefore, a journal is also added to the VMDK and is replayed
      when the VMware reopens the file after a crash.

Since ESXi 6.7 - SESparse is the only snapshot format available.

Unfortunately, VMware does not provide documentation regarding the new
seSparse format.

This commit is based on black-box research of the seSparse format.
Various in-guest block operations and their effect on the snapshot file
were tested.

The only VMware provided source of information (regarding the underlying
implementation) was a log file on the ESXi:

    /var/log/hostd.log

Whenever an seSparse snapshot is created - the log is being populated
with seSparse records.

Relevant log records are of the form:

[...] Const Header:
[...]  constMagic     = 0xcafebabe
[...]  version        = 2.1
[...]  capacity       = 204800
[...]  grainSize      = 8
[...]  grainTableSize = 64
[...]  flags          = 0
[...] Extents:
[...]  Header         : <1 : 1>
[...]  JournalHdr     : <2 : 2>
[...]  Journal        : <2048 : 2048>
[...]  GrainDirectory : <4096 : 2048>
[...]  GrainTables    : <6144 : 2048>
[...]  FreeBitmap     : <8192 : 2048>
[...]  BackMap        : <10240 : 2048>
[...]  Grain          : <12288 : 204800>
[...] Volatile Header:
[...] volatileMagic     = 0xcafecafe
[...] FreeGTNumber      = 0
[...] nextTxnSeqNumber  = 0
[...] replayJournal     = 0

The sizes that are seen in the log file are in sectors.
Extents are of the following format: <offset : size>

This commit is a strict implementation which enforces:
    * magics
    * version number 2.1
    * grain size of 8 sectors  (4KB)
    * grain table size of 64 sectors
    * zero flags
    * extent locations

Additionally, this commit proivdes only a subset of the functionality
offered by seSparse's format:
    * Read-only
    * No journal replay
    * No space reclamation
    * No unmap support

Hence, journal header, journal, free bitmap and backmap extents are
unused, only the "classic" (L1 -> L2 -> data) grain access is
implemented.

However there are several differences in the grain access itself.
Grain directory (L1):
    * Grain directory entries are indexes (not offsets) to grain
      tables.
    * Valid grain directory entries have their highest nibble set to
      0x1.
    * Since grain tables are always located in the beginning of the
      file - the index can fit into 32 bits - so we can use its low
      part if it's valid.
Grain table (L2):
    * Grain table entries are indexes (not offsets) to grains.
    * If the highest nibble of the entry is:
        0x0:
            The grain in not allocated.
            The rest of the bytes are 0.
        0x1:
            The grain is unmapped - guest sees a zero grain.
            The rest of the bits point to the previously mapped grain,
            see 0x3 case.
        0x2:
            The grain is zero.
        0x3:
            The grain is allocated - to get the index calculate:
            ((entry & 0x0fff000000000000) >> 48) |
            ((entry & 0x0000ffffffffffff) << 12)
    * The difference between 0x1 and 0x2 is that 0x1 is an unallocated
      grain which results from the guest using sg_unmap to unmap the
      grain - but the grain itself still exists in the grain extent - a
      space reclamation procedure should delete it.
      Unmapping a zero grain has no effect (0x2 will not change to 0x1)
      but unmapping an unallocated grain will (0x0 to 0x1) - naturally.

In order to implement seSparse some fields had to be changed to support
both 32-bit and 64-bit entry sizes.

Reviewed-by: Karl Heubaum <karl.heubaum@oracle.com>
Reviewed-by: Eyal Moscovici <eyal.moscovici@oracle.com>
Reviewed-by: Arbel Moshe <arbel.moshe@oracle.com>
Signed-off-by: Sam Eiderman <shmuel.eiderman@oracle.com>
Message-id: 20190620091057.47441-4-shmuel.eiderman@oracle.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
4 years agovmdk: Reduce the max bound for L1 table size
Sam Eiderman [Thu, 20 Jun 2019 09:10:56 +0000 (12:10 +0300)]
vmdk: Reduce the max bound for L1 table size

512M of L1 entries is a very loose bound, only 32M are required to store
the maximal supported VMDK file size of 2TB.

Fixed qemu-iotest 59# - now failure occures before on impossible L1
table size.

Reviewed-by: Karl Heubaum <karl.heubaum@oracle.com>
Reviewed-by: Eyal Moscovici <eyal.moscovici@oracle.com>
Reviewed-by: Liran Alon <liran.alon@oracle.com>
Reviewed-by: Arbel Moshe <arbel.moshe@oracle.com>
Signed-off-by: Sam Eiderman <shmuel.eiderman@oracle.com>
Message-id: 20190620091057.47441-3-shmuel.eiderman@oracle.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
4 years agovmdk: Fix comment regarding max l1_size coverage
Sam Eiderman [Thu, 20 Jun 2019 09:10:55 +0000 (12:10 +0300)]
vmdk: Fix comment regarding max l1_size coverage

Commit b0651b8c246d ("vmdk: Move l1_size check into vmdk_add_extent")
extended the l1_size check from VMDK4 to VMDK3 but did not update the
default coverage in the moved comment.

The previous vmdk4 calculation:

    (512 * 1024 * 1024) * 512(l2 entries) * 65536(grain) = 16PB

The added vmdk3 calculation:

    (512 * 1024 * 1024) * 4096(l2 entries) * 512(grain) = 1PB

Adding the calculation of vmdk3 to the comment.

In any case, VMware does not offer virtual disks more than 2TB for
vmdk4/vmdk3 or 64TB for the new undocumented seSparse format which is
not implemented yet in qemu.

Reviewed-by: Karl Heubaum <karl.heubaum@oracle.com>
Reviewed-by: Eyal Moscovici <eyal.moscovici@oracle.com>
Reviewed-by: Liran Alon <liran.alon@oracle.com>
Reviewed-by: Arbel Moshe <arbel.moshe@oracle.com>
Signed-off-by: Sam Eiderman <shmuel.eiderman@oracle.com>
Message-id: 20190620091057.47441-2-shmuel.eiderman@oracle.com
Reviewed-by: yuchenlin <yuchenlin@synology.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
4 years agoiotest 134: test cluster-misaligned encrypted write
Anton Nefedov [Thu, 16 May 2019 14:30:28 +0000 (17:30 +0300)]
iotest 134: test cluster-misaligned encrypted write

COW (even empty/zero) areas require encryption too

Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20190516143028.81155-1-anton.nefedov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
4 years agoblockdev: enable non-root nodes for transaction drive-backup source
Vladimir Sementsov-Ogievskiy [Tue, 18 Jun 2019 14:08:04 +0000 (17:08 +0300)]
blockdev: enable non-root nodes for transaction drive-backup source

We forget to enable it for transaction .prepare, while it is already
enabled in do_drive_backup since commit a2d665c1bc362
    "blockdev: loosen restrictions on drive-backup source node"

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20190618140804.59214-1-vsementsov@virtuozzo.com
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
4 years agonvme: do not advertise support for unsupported arbitration mechanism
Klaus Birkelund Jensen [Thu, 6 Jun 2019 09:25:30 +0000 (11:25 +0200)]
nvme: do not advertise support for unsupported arbitration mechanism

The device mistakenly reports that the Weighted Round Robin with Urgent
Priority Class arbitration mechanism is supported.

It is not.

Signed-off-by: Klaus Birkelund Jensen <klaus.jensen@cnexlabs.com>
Message-id: 20190606092530.14206-1-klaus@birkelund.eu
Acked-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
4 years agoxen: Import other xen/io/*.h
Anthony PERARD [Fri, 21 Jun 2019 10:54:41 +0000 (11:54 +0100)]
xen: Import other xen/io/*.h

A Xen public header have been imported into QEMU (by
f65eadb639 "xen: import ring.h from xen"), but there are other header
that depends on ring.h which come from the system when building QEMU.

This patch resolves the issue of having headers from the system
importing a different copie of ring.h.

This patch is prompt by the build issue described in the previous
patch: 'Revert xen/io/ring.h of "Clean up a few header guard symbols"'

ring.h and the new imported headers are moved to
"include/hw/xen/interface" as those describe interfaces with a guest.

The imported headers are cleaned up a bit while importing them: some
part of the file that QEMU doesn't use are removed (description
of how to make hypercall in grant_table.h have been removed).

Other cleanup:
- xen-mapcache.c and xen-legacy-backend.c don't need grant_table.h.
- xenfb.c doesn't need event_channel.h.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Message-Id: <20190621105441.3025-3-anthony.perard@citrix.com>

4 years agoRevert xen/io/ring.h of "Clean up a few header guard symbols"
Anthony PERARD [Fri, 21 Jun 2019 10:54:40 +0000 (11:54 +0100)]
Revert xen/io/ring.h of "Clean up a few header guard symbols"

This reverts changes to include/hw/xen/io/ring.h from commit
37677d7db39a3c250ad661d00fb7c3b59d047b1f.

Following 37677d7db3 "Clean up a few header guard symbols", QEMU start
to fail to build:

In file included from ~/xen/tools/../tools/include/xen/io/blkif.h:31:0,
                 from ~/xen/tools/qemu-xen-dir/hw/block/xen_blkif.h:5,
                 from ~/xen/tools/qemu-xen-dir/hw/block/xen-block.c:22:
~/xen/tools/../tools/include/xen/io/ring.h:68:0: error: "__CONST_RING_SIZE" redefined [-Werror]
 #define __CONST_RING_SIZE(_s, _sz) \

In file included from ~/xen/tools/qemu-xen-dir/hw/block/xen_blkif.h:4:0,
                 from ~/xen/tools/qemu-xen-dir/hw/block/xen-block.c:22:
~/xen/tools/qemu-xen-dir/include/hw/xen/io/ring.h:66:0: note: this is the location of the previous definition
 #define __CONST_RING_SIZE(_s, _sz) \

The issue is that some public xen headers have been imported (by
f65eadb639 "xen: import ring.h from xen") but not all. With the change
in the guards symbole, the ring.h header start to be imported twice.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Message-Id: <20190621105441.3025-2-anthony.perard@citrix.com>

4 years agoxen: Drop includes of xen/hvm/params.h
Anthony PERARD [Tue, 18 Jun 2019 11:23:40 +0000 (12:23 +0100)]
xen: Drop includes of xen/hvm/params.h

xen-mapcache.c doesn't needs params.h.

xen-hvm.c uses defines available in params.h but so is xen_common.h
which is included before. HVM_PARAM_* flags are only needed to make
xc_hvm_param_{get,set} calls so including only xenctrl.h, which is
where the definition the function is, should be enough.
(xenctrl.h does include params.h)

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Message-Id: <20190618112341.513-4-anthony.perard@citrix.com>

4 years agoxen: Avoid VLA
Anthony PERARD [Tue, 18 Jun 2019 11:23:41 +0000 (12:23 +0100)]
xen: Avoid VLA

Avoid using a variable length array.

We allocate the `dirty_bitmap' buffer only once when we start tracking
for dirty bits.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190618112341.513-5-anthony.perard@citrix.com>

4 years agoxen-bus / xen-block: add support for event channel polling
Paul Durrant [Mon, 8 Apr 2019 15:16:17 +0000 (16:16 +0100)]
xen-bus / xen-block: add support for event channel polling

This patch introduces a poll callback for event channel fd-s and uses
this to invoke the channel callback function.

To properly support polling, it is necessary for the event channel callback
function to return a boolean saying whether it has done any useful work or
not. Thus xen_block_dataplane_event() is modified to directly invoke
xen_block_handle_requests() and the latter only returns true if it actually
processes any requests. This also means that the call to qemu_bh_schedule()
is moved into xen_block_complete_aio(), which is more intuitive since the
only reason for doing a deferred poll of the shared ring should be because
there were previously insufficient resources to fully complete a previous
poll.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
Message-Id: <20190408151617.13025-4-paul.durrant@citrix.com>
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
4 years agoxen-bus: allow AioContext to be specified for each event channel
Paul Durrant [Mon, 8 Apr 2019 15:16:16 +0000 (16:16 +0100)]
xen-bus: allow AioContext to be specified for each event channel

This patch adds an AioContext parameter to xen_device_bind_event_channel()
and then uses aio_set_fd_handler() to set the callback rather than
qemu_set_fd_handler().

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
Message-Id: <20190408151617.13025-3-paul.durrant@citrix.com>
[Call aio_set_fd_handler() with is_external=true]
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
4 years agoxen-bus: use a separate fd for each event channel
Paul Durrant [Mon, 8 Apr 2019 15:16:15 +0000 (16:16 +0100)]
xen-bus: use a separate fd for each event channel

To better support use of IOThread-s it will be necessary to be able to set
the AioContext for each XenEventChannel and hence it is necessary to open a
separate handle to libxenevtchan for each channel.

This patch stops using NotifierList for event channel callbacks, replacing
that construct by a list of complete XenEventChannel structures. Each of
these now has a xenevtchn_handle pointer in place of the single pointer
previously held in the XenDevice structure. The individual handles are
opened/closed in xen_device_bind/unbind_event_channel(), replacing the
single open/close in xen_device_realize/unrealize().

NOTE: This patch does not add an AioContext parameter to
      xen_device_bind_event_channel(). That will be done in a subsequent
      patch.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
Message-Id: <20190408151617.13025-2-paul.durrant@citrix.com>
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
4 years agoxen-block: support feature-large-sector-size
Paul Durrant [Tue, 9 Apr 2019 16:40:38 +0000 (17:40 +0100)]
xen-block: support feature-large-sector-size

A recent Xen commit [1] clarified the semantics of sector based quantities
used in the blkif protocol such that it is now safe to create a xen-block
device with a logical_block_size != 512, as long as the device only
connects to a frontend advertizing 'feature-large-block-size'.

This patch modifies xen-block accordingly. It also uses a stack variable
for the BlockBackend in xen_block_realize() to avoid repeated dereferencing
of the BlockConf pointer, and changes the parameters of
xen_block_dataplane_create() so that the BlockBackend pointer and sector
size are passed expicitly rather than implicitly via the BlockConf.

These modifications have been tested against a recent Windows PV XENVBD
driver [2] using a xen-disk device with a 4kB logical block size.

[1] http://xenbits.xen.org/gitweb/?p=xen.git;a=commit;h=67e1c050e36b2c9900cca83618e56189effbad98
[2] https://winpvdrvbuild.xenproject.org:8080/job/XENVBD-master/126

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
Message-Id: <20190409164038.25484-1-paul.durrant@citrix.com>
[Edited error message]
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
4 years agoMerge remote-tracking branch 'remotes/amarkovic/tags/mips-queue-jun-21-2019' into...
Peter Maydell [Fri, 21 Jun 2019 14:40:50 +0000 (15:40 +0100)]
Merge remote-tracking branch 'remotes/amarkovic/tags/mips-queue-jun-21-2019' into staging

MIPS queue for June 21st, 2019

# gpg: Signature made Fri 21 Jun 2019 10:46:57 BST
# gpg:                using RSA key D4972A8967F75A65
# gpg: Good signature from "Aleksandar Markovic <amarkovic@wavecomp.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 8526 FBF1 5DA3 811F 4A01  DD75 D497 2A89 67F7 5A65

* remotes/amarkovic/tags/mips-queue-jun-21-2019:
  target/mips: Fix emulation of ILVR.<B|H|W> on big endian host
  target/mips: Fix emulation of ILVL.<B|H|W> on big endian host
  target/mips: Fix emulation of ILVOD.<B|H|W> on big endian host
  target/mips: Fix emulation of ILVEV.<B|H|W> on big endian host
  tests/tcg: target/mips: Amend tests for MSA pack instructions
  tests/tcg: target/mips: Include isa/ase and group name in test output
  target/mips: Fix if-else-switch-case arms checkpatch errors in translate.c
  target/mips: Fix some space checkpatch errors in translate.c
  MAINTAINERS: Consolidate MIPS disassembler-related items
  MAINTAINERS: Update file items for MIPS Malta board

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4 years agoMerge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
Peter Maydell [Fri, 21 Jun 2019 12:32:10 +0000 (13:32 +0100)]
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

* Nuke hw_compat_4_0_1 and pc_compat_4_0_1 (Greg)
* Static analysis fixes (Igor, Lidong)
* X86 Hyper-V CPUID improvements (Vitaly)
* X86 nested virt migration (Liran)
* New MSR-based features (Xiaoyao)

# gpg: Signature made Fri 21 Jun 2019 12:25:42 BST
# gpg:                using RSA key BFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream: (25 commits)
  hw: Nuke hw_compat_4_0_1 and pc_compat_4_0_1
  util/main-loop: Fix incorrect assertion
  sd: Fix out-of-bounds assertions
  target/i386: kvm: Add nested migration blocker only when kernel lacks required capabilities
  target/i386: kvm: Add support for KVM_CAP_EXCEPTION_PAYLOAD
  target/i386: kvm: Add support for save and restore nested state
  vmstate: Add support for kernel integer types
  linux-headers: sync with latest KVM headers from Linux 5.2
  target/i386: kvm: Block migration for vCPUs exposed with nested virtualization
  target/i386: kvm: Re-inject #DB to guest with updated DR6
  target/i386: kvm: Use symbolic constant for #DB/#BP exception constants
  KVM: Introduce kvm_arch_destroy_vcpu()
  target/i386: kvm: Delete VMX migration blocker on vCPU init failure
  target/i386: define a new MSR based feature word - FEAT_CORE_CAPABILITY
  i386/kvm: add support for Direct Mode for Hyper-V synthetic timers
  i386/kvm: hv-evmcs requires hv-vapic
  i386/kvm: hv-tlbflush/ipi require hv-vpindex
  i386/kvm: hv-stimer requires hv-time and hv-synic
  i386/kvm: implement 'hv-passthrough' mode
  i386/kvm: document existing Hyper-V enlightenments
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4 years agohw: Nuke hw_compat_4_0_1 and pc_compat_4_0_1
Greg Kurz [Fri, 14 Jun 2019 13:09:02 +0000 (15:09 +0200)]
hw: Nuke hw_compat_4_0_1 and pc_compat_4_0_1

Commit c87759ce876a fixed a regression affecting pc-q35 machines by
introducing a new pc-q35-4.0.1 machine version to be used instead
of pc-q35-4.0. The only purpose was to revert the default behaviour
of not using split irqchip, but the change also introduced the usual
hw_compat and pc_compat bits, and wired them for pc-q35 only.

This raises questions when it comes to add new compat properties for
4.0* machine versions of any architecture. Where to add them ? In
4.0, 4.0.1 or both ? Error prone. Another possibility would be to teach
all other architectures about 4.0.1. This solution isn't satisfying,
especially since this is a pc-q35 specific issue.

It turns out that the split irqchip default is handled in the machine
option function and doesn't involve compat lists at all.

Drop all the 4.0.1 compat lists and use the 4.0 ones instead in the 4.0.1
machine option function.

Move the compat props that were added to the 4.0.1 since c87759ce876a to
4.0.

Even if only hw_compat_4_0_1 had an impact on other architectures,
drop pc_compat_4_0_1 as well for consistency.

Fixes: c87759ce876a "q35: Revert to kernel irqchip"
Suggested-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <156051774276.244890.8660277280145466396.stgit@bahia.lan>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
4 years agoutil/main-loop: Fix incorrect assertion
Lidong Chen [Wed, 19 Jun 2019 19:14:47 +0000 (15:14 -0400)]
util/main-loop: Fix incorrect assertion

The check for poll_fds in g_assert() was incorrect. The correct assertion
should check "n_poll_fds + w->num <= ARRAY_SIZE(poll_fds)" because the
subsequent for-loop is doing access to poll_fds[n_poll_fds + i] where i
is in [0, w->num).  This could happen with a very high number of file
descriptors and/or wait objects.

Signed-off-by: Lidong Chen <lidong.chen@oracle.com>
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Suggested-by: Liam Merwick <liam.merwick@oracle.com>
Reviewed-by: Liran Alon <liran.alon@oracle.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <ded30967982811617ce7f0222d11228130c198b7.1560806687.git.lidong.chen@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
4 years agosd: Fix out-of-bounds assertions
Lidong Chen [Wed, 19 Jun 2019 19:14:46 +0000 (15:14 -0400)]
sd: Fix out-of-bounds assertions

Due to an off-by-one error, the assert statements allow an
out-of-bound array access.  This doesn't happen in practice,
but the static analyzer notices.

Signed-off-by: Lidong Chen <lidong.chen@oracle.com>
Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-Id: <6b19cb7359a10a6bedc3ea0fce22fed3ef93c102.1560806687.git.lidong.chen@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
4 years agotarget/i386: kvm: Add nested migration blocker only when kernel lacks required capabi...
Liran Alon [Wed, 19 Jun 2019 16:21:40 +0000 (19:21 +0300)]
target/i386: kvm: Add nested migration blocker only when kernel lacks required capabilities

Previous commits have added support for migration of nested virtualization
workloads. This was done by utilising two new KVM capabilities:
KVM_CAP_NESTED_STATE and KVM_CAP_EXCEPTION_PAYLOAD. Both which are
required in order to correctly migrate such workloads.

Therefore, change code to add a migration blocker for vCPUs exposed with
Intel VMX or AMD SVM in case one of these kernel capabilities is
missing.

Signed-off-by: Liran Alon <liran.alon@oracle.com>
Reviewed-by: Maran Wilson <maran.wilson@oracle.com>
Message-Id: <20190619162140.133674-11-liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
4 years agotarget/i386: kvm: Add support for KVM_CAP_EXCEPTION_PAYLOAD
Liran Alon [Wed, 19 Jun 2019 16:21:39 +0000 (19:21 +0300)]
target/i386: kvm: Add support for KVM_CAP_EXCEPTION_PAYLOAD

Kernel commit c4f55198c7c2 ("kvm: x86: Introduce KVM_CAP_EXCEPTION_PAYLOAD")
introduced a new KVM capability which allows userspace to correctly
distinguish between pending and injected exceptions.

This distinguish is important in case of nested virtualization scenarios
because a L2 pending exception can still be intercepted by the L1 hypervisor
while a L2 injected exception cannot.

Furthermore, when an exception is attempted to be injected by QEMU,
QEMU should specify the exception payload (CR2 in case of #PF or
DR6 in case of #DB) instead of having the payload already delivered in
the respective vCPU register. Because in case exception is injected to
L2 guest and is intercepted by L1 hypervisor, then payload needs to be
reported to L1 intercept (VMExit handler) while still preserving
respective vCPU register unchanged.

This commit adds support for QEMU to properly utilise this new KVM
capability (KVM_CAP_EXCEPTION_PAYLOAD).

Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Message-Id: <20190619162140.133674-10-liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
4 years agotarget/i386: kvm: Add support for save and restore nested state
Liran Alon [Wed, 19 Jun 2019 16:21:38 +0000 (19:21 +0300)]
target/i386: kvm: Add support for save and restore nested state

Kernel commit 8fcc4b5923af ("kvm: nVMX: Introduce KVM_CAP_NESTED_STATE")
introduced new IOCTLs to extract and restore vCPU state related to
Intel VMX & AMD SVM.

Utilize these IOCTLs to add support for migration of VMs which are
running nested hypervisors.

Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
Reviewed-by: Maran Wilson <maran.wilson@oracle.com>
Tested-by: Maran Wilson <maran.wilson@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Message-Id: <20190619162140.133674-9-liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
4 years agovmstate: Add support for kernel integer types
Liran Alon [Wed, 19 Jun 2019 16:21:37 +0000 (19:21 +0300)]
vmstate: Add support for kernel integer types

Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
Reviewed-by: Maran Wilson <maran.wilson@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20190619162140.133674-8-liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
4 years agolinux-headers: sync with latest KVM headers from Linux 5.2
Liran Alon [Wed, 19 Jun 2019 16:21:36 +0000 (19:21 +0300)]
linux-headers: sync with latest KVM headers from Linux 5.2

Improve the KVM_{GET,SET}_NESTED_STATE structs by detailing the format
of VMX nested state data in a struct.

In order to avoid changing the ioctl values of
KVM_{GET,SET}_NESTED_STATE, there is a need to preserve
sizeof(struct kvm_nested_state). This is done by defining the data
struct as "data.vmx[0]". It was the most elegant way I found to
preserve struct size while still keeping struct readable and easy to
maintain. It does have a misfortunate side-effect that now it has to be
accessed as "data.vmx[0]" rather than just "data.vmx".

Because we are already modifying these structs, I also modified the
following:
* Define the "format" field values as macros.
* Rename vmcs_pa to vmcs12_pa for better readability.

Signed-off-by: Liran Alon <liran.alon@oracle.com>
Reviewed-by: Maran Wilson <maran.wilson@oracle.com>
Message-Id: <20190619162140.133674-7-liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
4 years agotarget/i386: kvm: Block migration for vCPUs exposed with nested virtualization
Liran Alon [Wed, 19 Jun 2019 16:21:35 +0000 (19:21 +0300)]
target/i386: kvm: Block migration for vCPUs exposed with nested virtualization

Commit d98f26073beb ("target/i386: kvm: add VMX migration blocker")
added a migration blocker for vCPU exposed with Intel VMX.
However, migration should also be blocked for vCPU exposed with
AMD SVM.

Both cases should be blocked because QEMU should extract additional
vCPU state from KVM that should be migrated as part of vCPU VMState.
E.g. Whether vCPU is running in guest-mode or host-mode.

Fixes: d98f26073beb ("target/i386: kvm: add VMX migration blocker")
Reviewed-by: Maran Wilson <maran.wilson@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Message-Id: <20190619162140.133674-6-liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
4 years agotarget/mips: Fix emulation of ILVR.<B|H|W> on big endian host
Aleksandar Markovic [Thu, 20 Jun 2019 13:45:49 +0000 (15:45 +0200)]
target/mips: Fix emulation of ILVR.<B|H|W> on big endian host

Fix emulation of ILVR.<B|H|W> on big endian host by applying
mapping of data element indexes from one endian to another.

Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Reviewed-by: Aleksandar Rikalo <arikalo@wavecomp.com>
Message-Id: <1561038349-17105-5-git-send-email-aleksandar.markovic@rt-rk.com>

4 years agotarget/mips: Fix emulation of ILVL.<B|H|W> on big endian host
Aleksandar Markovic [Thu, 20 Jun 2019 13:45:48 +0000 (15:45 +0200)]
target/mips: Fix emulation of ILVL.<B|H|W> on big endian host

Fix emulation of ILVL.<B|H|W> on big endian host by applying
mapping of data element indexes from one endian to another.

Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Reviewed-by: Aleksandar Rikalo <arikalo@wavecomp.com>
Message-Id: <1561038349-17105-4-git-send-email-aleksandar.markovic@rt-rk.com>

4 years agotarget/mips: Fix emulation of ILVOD.<B|H|W> on big endian host
Aleksandar Markovic [Thu, 20 Jun 2019 13:45:47 +0000 (15:45 +0200)]
target/mips: Fix emulation of ILVOD.<B|H|W> on big endian host

Fix emulation of ILVOD.<B|H|W> on big endian host by applying
mapping of data element indexes from one endian to another.

Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Reviewed-by: Aleksandar Rikalo <arikalo@wavecomp.com>
Message-Id: <1561038349-17105-3-git-send-email-aleksandar.markovic@rt-rk.com>

4 years agotarget/mips: Fix emulation of ILVEV.<B|H|W> on big endian host
Aleksandar Markovic [Thu, 20 Jun 2019 13:45:46 +0000 (15:45 +0200)]
target/mips: Fix emulation of ILVEV.<B|H|W> on big endian host

Fix emulation of ILVEV.<B|H|W> on big endian host by applying
mapping of data element indexes from one endian to another.

Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Reviewed-by: Aleksandar Rikalo <arikalo@wavecomp.com>
Message-Id: <1561038349-17105-2-git-send-email-aleksandar.markovic@rt-rk.com>

4 years agotests/tcg: target/mips: Amend tests for MSA pack instructions
Aleksandar Markovic [Thu, 20 Jun 2019 11:49:18 +0000 (13:49 +0200)]
tests/tcg: target/mips: Amend tests for MSA pack instructions

Add tests for cases when destination register is the same as one
of source registers.

Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Reviewed-by: Aleksandar Rikalo <arikalo@wavecomp.com>
Message-Id: <1561031359-6727-3-git-send-email-aleksandar.markovic@rt-rk.com>

4 years agotests/tcg: target/mips: Include isa/ase and group name in test output
Aleksandar Markovic [Thu, 20 Jun 2019 11:49:17 +0000 (13:49 +0200)]
tests/tcg: target/mips: Include isa/ase and group name in test output

For better appearance and usefullnes, include ISA/ASE name and
instruction group name in the output of tests. For example, all
this data will be displayed for FMAX_A.W test:

| MSA       | Float Max Min       | FMAX_A.W    |
| PASS:  80 | FAIL:   0 | elapsed time: 0.16 ms |

(the data will be displayed in one row; they are presented here in two
rows not to exceed the width of the commit message)

Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Reviewed-by: Aleksandar Rikalo <arikalo@wavecomp.com>
Message-Id: <1561031359-6727-2-git-send-email-aleksandar.markovic@rt-rk.com>

4 years agotarget/mips: Fix if-else-switch-case arms checkpatch errors in translate.c
Aleksandar Markovic [Thu, 20 Jun 2019 13:33:15 +0000 (15:33 +0200)]
target/mips: Fix if-else-switch-case arms checkpatch errors in translate.c

Remove if-else-switch-case-arms-related checkpatch errors.

Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Reviewed-by: Aleksandar Rikalo <arikalo@wavecomp.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <1561037595-14413-5-git-send-email-aleksandar.markovic@rt-rk.com>

4 years agotarget/mips: Fix some space checkpatch errors in translate.c
Aleksandar Markovic [Thu, 20 Jun 2019 13:33:14 +0000 (15:33 +0200)]
target/mips: Fix some space checkpatch errors in translate.c

Remove some space-related checkpatch warning.

Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Reviewed-by: Aleksandar Rikalo <arikalo@wavecomp.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <1561037595-14413-4-git-send-email-aleksandar.markovic@rt-rk.com>

4 years agoMAINTAINERS: Consolidate MIPS disassembler-related items
Aleksandar Markovic [Thu, 20 Jun 2019 13:33:13 +0000 (15:33 +0200)]
MAINTAINERS: Consolidate MIPS disassembler-related items

Eliminate duplicate MIPS disassembler-related items in the
MAINTAINERS file, and use wildcards to shorten the list of
involved files.

Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Reviewed-by: Aleksandar Rikalo <arikalo@wavecomp.com>
Message-Id: <1561037595-14413-3-git-send-email-aleksandar.markovic@rt-rk.com>

4 years agoMAINTAINERS: Update file items for MIPS Malta board
Aleksandar Markovic [Thu, 20 Jun 2019 13:33:12 +0000 (15:33 +0200)]
MAINTAINERS: Update file items for MIPS Malta board

hw/mips/gt64xxx_pci.c is used for Malta only, so it is logical to
place this file in Malta board section of the MAINTAINERS file.

Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Aleksandar Rikalo <arikalo@wavecomp.com>
Message-Id: <1561037595-14413-2-git-send-email-aleksandar.markovic@rt-rk.com>

4 years agotarget/i386: kvm: Re-inject #DB to guest with updated DR6
Liran Alon [Wed, 19 Jun 2019 16:21:34 +0000 (19:21 +0300)]
target/i386: kvm: Re-inject #DB to guest with updated DR6

If userspace (QEMU) debug guest, when #DB is raised in guest and
intercepted by KVM, KVM forwards information on #DB to userspace
instead of injecting #DB to guest.
While doing so, KVM don't update vCPU DR6 but instead report the #DB DR6
value to userspace for further handling.
See KVM's handle_exception() DB_VECTOR handler.

QEMU handler for this case is kvm_handle_debug(). This handler basically
checks if #DB is related to one of user set hardware breakpoints and if
not, it re-inject #DB into guest.
The re-injection is done by setting env->exception_injected to #DB which
will later be passed as events.exception.nr to KVM_SET_VCPU_EVENTS ioctl
by kvm_put_vcpu_events().

However, in case userspace re-injects #DB, KVM expects userspace to set
vCPU DR6 as reported to userspace when #DB was intercepted! Otherwise,
KVM_REQ_EVENT handler will inject #DB with wrong DR6 to guest.

Fix this issue by updating vCPU DR6 appropriately when re-inject #DB to
guest.

Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Message-Id: <20190619162140.133674-5-liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
4 years agotarget/i386: kvm: Use symbolic constant for #DB/#BP exception constants
Liran Alon [Wed, 19 Jun 2019 16:21:33 +0000 (19:21 +0300)]
target/i386: kvm: Use symbolic constant for #DB/#BP exception constants

Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Message-Id: <20190619162140.133674-4-liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
4 years agoKVM: Introduce kvm_arch_destroy_vcpu()
Liran Alon [Wed, 19 Jun 2019 16:21:32 +0000 (19:21 +0300)]
KVM: Introduce kvm_arch_destroy_vcpu()

Simiar to how kvm_init_vcpu() calls kvm_arch_init_vcpu() to perform
arch-dependent initialisation, introduce kvm_arch_destroy_vcpu()
to be called from kvm_destroy_vcpu() to perform arch-dependent
destruction.

This was added because some architectures (Such as i386)
currently do not free memory that it have allocated in
kvm_arch_init_vcpu().

Suggested-by: Maran Wilson <maran.wilson@oracle.com>
Reviewed-by: Maran Wilson <maran.wilson@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Message-Id: <20190619162140.133674-3-liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
4 years agotarget/i386: kvm: Delete VMX migration blocker on vCPU init failure
Liran Alon [Wed, 19 Jun 2019 16:21:31 +0000 (19:21 +0300)]
target/i386: kvm: Delete VMX migration blocker on vCPU init failure

Commit d98f26073beb ("target/i386: kvm: add VMX migration blocker")
added migration blocker for vCPU exposed with Intel VMX because QEMU
doesn't yet contain code to support migration of nested virtualization
workloads.

However, that commit missed adding deletion of the migration blocker in
case init of vCPU failed. Similar to invtsc_mig_blocker. This commit fix
that issue.

Fixes: d98f26073beb ("target/i386: kvm: add VMX migration blocker")
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Reviewed-by: Maran Wilson <maran.wilson@oracle.com>
Message-Id: <20190619162140.133674-2-liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
4 years agotarget/i386: define a new MSR based feature word - FEAT_CORE_CAPABILITY
Xiaoyao Li [Mon, 17 Jun 2019 15:36:54 +0000 (23:36 +0800)]
target/i386: define a new MSR based feature word - FEAT_CORE_CAPABILITY

MSR IA32_CORE_CAPABILITY is a feature-enumerating MSR, which only
enumerates the feature split lock detection (via bit 5) by now.

The existence of MSR IA32_CORE_CAPABILITY is enumerated by CPUID.7_0:EDX[30].

The latest kernel patches about them can be found here:
https://lkml.org/lkml/2019/4/24/1909

Signed-off-by: Xiaoyao Li <xiaoyao.li@linux.intel.com>
Message-Id: <20190617153654.916-1-xiaoyao.li@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
4 years agoi386/kvm: add support for Direct Mode for Hyper-V synthetic timers
Vitaly Kuznetsov [Fri, 17 May 2019 14:19:24 +0000 (16:19 +0200)]
i386/kvm: add support for Direct Mode for Hyper-V synthetic timers

Hyper-V on KVM can only use Synthetic timers with Direct Mode (opting for
an interrupt instead of VMBus message). This new capability is only
announced in KVM_GET_SUPPORTED_HV_CPUID.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20190517141924.19024-10-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
4 years agoi386/kvm: hv-evmcs requires hv-vapic
Vitaly Kuznetsov [Fri, 17 May 2019 14:19:23 +0000 (16:19 +0200)]
i386/kvm: hv-evmcs requires hv-vapic

Enlightened VMCS is enabled by writing to a field in VP assist page and
these require virtual APIC.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20190517141924.19024-9-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
4 years agoi386/kvm: hv-tlbflush/ipi require hv-vpindex
Vitaly Kuznetsov [Fri, 17 May 2019 14:19:22 +0000 (16:19 +0200)]
i386/kvm: hv-tlbflush/ipi require hv-vpindex

The corresponding hypercalls require using VP indexes.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20190517141924.19024-8-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
4 years agoi386/kvm: hv-stimer requires hv-time and hv-synic
Vitaly Kuznetsov [Fri, 17 May 2019 14:19:21 +0000 (16:19 +0200)]
i386/kvm: hv-stimer requires hv-time and hv-synic

Synthetic timers operate in hv-time time and Windows won't use these
without SynIC.

Add .dependencies field to kvm_hyperv_properties[] and a generic mechanism
to check dependencies between features.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20190517141924.19024-7-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
4 years agoi386/kvm: implement 'hv-passthrough' mode
Vitaly Kuznetsov [Fri, 17 May 2019 14:19:20 +0000 (16:19 +0200)]
i386/kvm: implement 'hv-passthrough' mode

In many case we just want to give Windows guests all currently supported
Hyper-V enlightenments and that's where this new mode may come handy. We
pass through what was returned by KVM_GET_SUPPORTED_HV_CPUID.

hv_cpuid_check_and_set() is modified to also set cpu->hyperv_* flags as
we may want to check them later (and we actually do for hv_runtime,
hv_synic,...).

'hv-passthrough' is a development only feature, a migration blocker is
added to prevent issues while migrating between hosts with different
feature sets.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20190517141924.19024-6-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>