git.proxmox.com Git - mirror

aspeed/soc: Add ASPEED Video stub

Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-id: 20190925143248.10000-24-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

aspeed: add support for the Aspeed MII controller of the AST2600

The AST2600 SoC has an extra controller to set the PHY registers.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Message-id: 20190925143248.10000-23-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

aspeed: Parameterise number of MACs

To support the ast2600's four MACs allow SoCs to specify the number
they have, and create that many.

Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-id: 20190925143248.10000-22-clg@kaod.org
[clg: - included a check on sc->macs_num when realizing the macs
- included interrupt definitions for the AST2600 ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

m25p80: Add support for w25q512jv

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Message-id: 20190925143248.10000-20-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

aspeed/soc: Add AST2600 support

Initial definitions for a simple machine using an AST2600 SoC (Cortex
CPU).

The Cortex CPU and its interrupt controller are too complex to handle
in the common Aspeed SoC framework. We introduce a new Aspeed SoC
class with instance_init and realize handlers to handle the differences
with the AST2400 and the AST2500 SoCs. This will add extra work to
keep in sync both models with future extensions but it makes the code
clearer.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Message-id: 20190925143248.10000-19-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

aspeed: Introduce an object class per SoC

It prepares ground for the AST2600.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Message-id: 20190925143248.10000-18-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

aspeed/i2c: Add AST2600 support

The I2C controller of the AST2400 and AST2500 SoCs have one IRQ shared
by all I2C busses. The AST2600 SoC I2C controller has one IRQ per bus
and 16 busses.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Message-id: 20190925143248.10000-17-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

aspeed/i2c: Introduce an object class per SoC

It prepares ground for register differences between SoCs.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Message-id: 20190925143248.10000-16-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

hw/gpio: Add in AST2600 specific implementation

The AST2600 has the same sets of 3.6v gpios as the AST2400 plus an
addtional two sets of 1.8V gpios.

Signed-off-by: Rashmica Gupta <rashmica.g@gmail.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Acked-by: Joel Stanley <joel@jms.id.au>
Message-id: 20190925143248.10000-15-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

aspeed/smc: Add AST2600 support

The AST2600 SoC SMC controller is a SPI only controller now and has a
few extensions which we will need to take into account when SW
requires it. This is enough to support u-boot and Linux.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Acked-by: Joel Stanley <joel@jms.id.au>
Message-id: 20190925143248.10000-14-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

aspeed/smc: Introduce segment operations

AST2600 will use a different encoding for the addresses defined in the
Segment Register.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Acked-by: Joel Stanley <joel@jms.id.au>
Message-id: 20190925143248.10000-13-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

hw: wdt_aspeed: Add AST2600 support

The AST2600 has four watchdogs, and they each have a 0x40 of registers.

When running as part of an ast2600 system we must check a different
offset for the system reset control register in the SCU.

Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-id: 20190925143248.10000-12-clg@kaod.org
[clg: - reworked model integration into new object class ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

watchdog/aspeed: Introduce an object class per SoC

It cleanups the current models for the Aspeed AST2400 and AST2500 SoCs
and prepares ground for future SoCs.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Message-id: 20190925143248.10000-11-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

aspeed/sdmc: Add AST2600 support

The AST2600 SDMC controller is slightly different from its predecessor
(DRAM training). Max memory is now 2G on the AST2600.

Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-id: 20190925143248.10000-10-clg@kaod.org
[clg: - improved commit log
- reworked model integration into new object class ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

aspeed/sdmc: Introduce an object class per SoC

Use class handlers and class constants to differentiate the
characteristics of the memory controller and remove the 'silicon_rev'
property.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Message-id: 20190925143248.10000-9-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

aspeed/timer: Add support for IRQ status register on the AST2600

The AST2600 timer replaces control register 2 with a interrupt status
register. It is set by hardware when an IRQ occurs and cleared by
software.

Modify the vmstate version to take into account the new fields.

Based on previous work from Joel Stanley.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Message-id: 20190925143248.10000-8-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

aspeed/timer: Add AST2600 support

The AST2600 timer has a third control register that is used to
implement a set-to-clear feature for the main control register.

On the AST2600, it is not configurable via 0x38 (control register 3)
as it is on the AST2500.

Based on previous work from Joel Stanley.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Message-id: 20190925143248.10000-7-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

aspeed/timer: Add support for control register 3

The AST2500 timer has a third control register that is used to
implement a set-to-clear feature for the main control register.

This models the behaviour expected by the AST2500 while maintaining
the same behaviour for the AST2400.

The vmstate version is not increased yet because the structure is
modified again in the following patches.

Based on previous work from Joel Stanley.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Message-id: 20190925143248.10000-6-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

aspeed/timer: Introduce an object class per SoC

The most important changes will be on the register range 0x34 - 0x3C
memops. Introduce class read/write operations to handle the
differences between SoCs.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Message-id: 20190925143248.10000-5-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

hw: aspeed_scu: Add AST2600 support

The SCU controller on the AST2600 SoC has extra registers. Increase
the number of regs of the model and introduce a new field in the class
to customize the MemoryRegion operations depending on the SoC model.

Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-id: 20190925143248.10000-4-clg@kaod.org
[clg: - improved commit log
      - changed vmstate version
      - reworked model integration into new object class
      - included AST2600_HPLL_PARAM value ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

hw/sd/aspeed_sdhci: New device

The Aspeed SOCs have two SD/MMC controllers. Add a device that
encapsulates both of these controllers and models the Aspeed-specific
registers and behavior.

Tested by reading from mmcblk0 in Linux:
qemu-system-arm -machine romulus-bmc -nographic \
-drive file=flash-romulus,format=raw,if=mtd \
-device sd-card,drive=sd0 -drive file=_tmp/kernel,format=raw,if=sd,id=sd0

Signed-off-by: Eddie James <eajames@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-id: 20190925143248.10000-3-clg@kaod.org
[clg: - changed the controller MMIO window size to 0x1000
- moved the MMIO mapping of the SDHCI slots at the SoC level
- merged code to add SD drives on the SD buses at the machine level ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

aspeed/wdt: Check correct register for clock source

When WDT_RESTART is written, the data is not the contents
of the WDT_CTRL register. Hence ensure we are looking at
WDT_CTRL to check if bit WDT_CTRL_1MHZ_CLK is set or not.

Signed-off-by: Amithash Prasad <amithash@fb.com>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-id: 20190925143248.10000-2-clg@kaod.org
[clg: improved Suject prefix ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

target/arm/arm-semi: Implement SH_EXT_STDOUT_STDERR extension

SH_EXT_STDOUT_STDERR is a v2.0 semihosting extension: the guest
can open ":tt" with a file mode requesting append access in
order to open stderr, in addition to the existing "open for
read for stdin or write for stdout". Implement this and
report it via the :semihosting-features data.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20190916141544.17540-16-peter.maydell@linaro.org

target/arm/arm-semi: Implement SH_EXT_EXIT_EXTENDED extension

SH_EXT_EXIT_EXTENDED is a v2.0 semihosting extension: it
indicates that the implementation supports the SYS_EXIT_EXTENDED
function. This function allows both A64 and A32/T32 guests to
exit with a specified exit status, unlike the older SYS_EXIT
function which only allowed this for A64 guests. Implement
this extension.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20190916141544.17540-15-peter.maydell@linaro.org

target/arm/arm-semi: Implement support for semihosting feature detection

Version 2.0 of the semihosting specification added support for
allowing a guest to detect whether the implementation supported
particular features. This works by the guest opening a magic
file ":semihosting-features", which contains a fixed set of
data with some magic numbers followed by a sequence of bytes
with feature flags. The file is expected to behave sensibly
for the various semihosting calls which operate on files
(SYS_FLEN, SYS_SEEK, etc).

Implement this as another kind of guest FD using our function
table dispatch mechanism. Initially we report no extended
features, so we have just one feature flag byte which is zero.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20190916141544.17540-14-peter.maydell@linaro.org

target/arm/arm-semi: Factor out implementation of SYS_FLEN

Factor out the implementation of SYS_FLEN via the new
function tables.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190916141544.17540-13-peter.maydell@linaro.org

target/arm/arm-semi: Factor out implementation of SYS_SEEK

Factor out the implementation of SYS_SEEK via the new function
tables.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190916141544.17540-12-peter.maydell@linaro.org

target/arm/arm-semi: Factor out implementation of SYS_ISTTY

Factor out the implementation of SYS_ISTTY via the new function
tables.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190916141544.17540-11-peter.maydell@linaro.org

target/arm/arm-semi: Factor out implementation of SYS_READ

Factor out the implementation of SYS_READ via the
new function tables.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190916141544.17540-10-peter.maydell@linaro.org

target/arm/arm-semi: Factor out implementation of SYS_WRITE

Factor out the implementation of SYS_WRITE via the
new function tables.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190916141544.17540-9-peter.maydell@linaro.org

target/arm/arm-semi: Factor out implementation of SYS_CLOSE

Currently for the semihosting calls which take a file descriptor
(SYS_CLOSE, SYS_WRITE, SYS_READ, SYS_ISTTY, SYS_SEEK, SYS_FLEN)
we have effectively two implementations, one for real host files
and one for when we indirect via the gdbstub. We want to add a
third one to deal with the magic :semihosting-features file.

Instead of having a three-way if statement in each of these
cases, factor out the implementation of the calls to separate
functions which we dispatch to via function pointers selected
via the GuestFDType for the guest fd.

In this commit, we set up the framework for the dispatch,
and convert the SYS_CLOSE call to use it.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190916141544.17540-8-peter.maydell@linaro.org

target/arm/arm-semi: Use set_swi_errno() in gdbstub callback functions

When we are routing semihosting operations through the gdbstub, the
work of sorting out the return value and setting errno if necessary
is done by callback functions which are invoked by the gdbstub code.
Clean up some ifdeffery in those functions by having them call
set_swi_errno() to set the semihosting errno.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190916141544.17540-7-peter.maydell@linaro.org

target/arm/arm-semi: Restrict use of TaskState*

The semihosting code needs accuss to the linux-user only
TaskState pointer so it can set the semihosting errno per-thread
for linux-user mode. At the moment we do this by having some
ifdefs so that we define a 'ts' local in do_arm_semihosting()
which is either a real TaskState * or just a CPUARMState *,
depending on which mode we're compiling for.

This is awkward if we want to refactor do_arm_semihosting()
into other functions which might need to be passed the TaskState.
Restrict usage of the TaskState local by:
* making set_swi_errno() always take the CPUARMState pointer
   and (for the linux-user version) get TaskState from that
* creating a new get_swi_errno() which reads the errno
* having the two semihosting calls which need the TaskState
   for other purposes (SYS_GET_CMDLINE and SYS_HEAPINFO)
   define a variable with scope restricted to just that code

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190916141544.17540-6-peter.maydell@linaro.org

target/arm/arm-semi: Make semihosting code hand out its own file descriptors

Currently the Arm semihosting code returns the guest file descriptors
(handles) which are simply the fd values from the host OS or the
remote gdbstub. Part of the semihosting 2.0 specification requires
that we implement special handling of opening a ":semihosting-features"
filename. Guest fds which result from opening the special file
won't correspond to host fds, so to ensure that we don't end up
with duplicate fds we need to have QEMU code control the allocation
of the fd values we give the guest.

Add in an abstraction layer which lets us allocate new guest FD
values, and translate from a guest FD value back to the host one.
This also fixes an odd hole where a semihosting guest could
use the semihosting API to read, write or close file descriptors
that it had never allocated but which were being used by QEMU itself.
(This isn't a security hole, because enabling semihosting permits
the guest to do arbitrary file access to the whole host filesystem,
and so should only be done if the guest is completely trusted.)

Currently the only kind of guest fd is one which maps to a
host fd, but in a following commit we will add one which maps
to the :semihosting-features magic data.

If the guest is migrated with an open semihosting file descriptor
then subsequent attempts to use the fd will all fail; this is
not a change from the previous situation (where the host fd
being used on the source end would not be re-opened on the
destination end).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190916141544.17540-5-peter.maydell@linaro.org

target/arm/arm-semi: Correct comment about gdb syscall races

In arm_gdb_syscall() we have a comment suggesting a race
because the syscall completion callback might not happen
before the gdb_do_syscallv() call returns. The comment is
correct that the callback may not happen but incorrect about
the effects. Correct it and note the important caveat that
callers must never do any work of any kind after return from
arm_gdb_syscall() that depends on its return value.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190916141544.17540-4-peter.maydell@linaro.org

target/arm/arm-semi: Always set some kind of errno for failed calls

If we fail a semihosting call we should always set the
semihosting errno to something; we were failing to do
this for some of the "check inputs for sanity" cases.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190916141544.17540-3-peter.maydell@linaro.org

target/arm/arm-semi: Capture errno in softmmu version of set_swi_errno()

The set_swi_errno() function is called to capture the errno
from a host system call, so that we can return -1 from the
semihosting function and later allow the guest to get a more
specific error code with the SYS_ERRNO function. It comes in
two versions, one for user-only and one for softmmu. We forgot
to capture the errno in the softmmu version; fix the error.

(Semihosting calls directed to gdb are unaffected because
they go through a different code path that captures the
error return from the gdbstub call in arm_semi_cb() or
arm_semi_flen_cb().)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190916141544.17540-2-peter.maydell@linaro.org

hw/net/lan9118.c: Switch to transaction-based ptimer API

Switch the cmsdk-apb-watchdog code away from bottom-half based
ptimers to the new transaction-based ptimer API. This just requires
adding begin/commit calls around the various places that modify the
ptimer state, and using the new ptimer_init() function to create the
timer.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20191008171740.9679-22-peter.maydell@linaro.org

hw/watchdog/cmsdk-apb-watchdog.c: Switch to transaction-based ptimer API

Switch the cmsdk-apb-watchdog code away from bottom-half based
ptimers to the new transaction-based ptimer API. This just requires
adding begin/commit calls around the various places that modify the
ptimer state, and using the new ptimer_init() function to create the
timer.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20191008171740.9679-21-peter.maydell@linaro.org

hw/timer/mss-timerc: Switch to transaction-based ptimer API

Switch the mss-timer code away from bottom-half based ptimers to
the new transaction-based ptimer API. This just requires adding
begin/commit calls around the various places that modify the ptimer
state, and using the new ptimer_init() function to create the timer.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20191008171740.9679-20-peter.maydell@linaro.org

hw/timer/imx_gpt.c: Switch to transaction-based ptimer API

Switch the imx_epit.c code away from bottom-half based ptimers to
the new transaction-based ptimer API. This just requires adding
begin/commit calls around the various places that modify the ptimer
state, and using the new ptimer_init() function to create the timer.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20191008171740.9679-19-peter.maydell@linaro.org

hw/timer/imx_epit.c: Switch to transaction-based ptimer API

Switch the imx_epit.c code away from bottom-half based ptimers to
the new transaction-based ptimer API. This just requires adding
begin/commit calls around the various places that modify the ptimer
state, and using the new ptimer_init() function to create the timer.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20191008171740.9679-18-peter.maydell@linaro.org

hw/timer/exynos4210_rtc.c: Switch main ptimer to transaction-based API

Switch the exynos41210_rtc main ptimer over to the transaction-based
API, completing the transition for this device.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20191008171740.9679-17-peter.maydell@linaro.org

hw/timer/exynos4210_rtc.c: Switch 1Hz ptimer to transaction-based API

Switch the exynos41210_rtc 1Hz ptimer over to the transaction-based
API. (We will switch the other ptimer used by this device in a
separate commit.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20191008171740.9679-16-peter.maydell@linaro.org

hw/timer/exynos4210_pwm.c: Switch to transaction-based ptimer API

Switch the exynos4210_pwm code away from bottom-half based ptimers to
the new transaction-based ptimer API. This just requires adding
begin/commit calls around the various places that modify the ptimer
state, and using the new ptimer_init() function to create the timer.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20191008171740.9679-15-peter.maydell@linaro.org

hw/timer/exynos4210_mct.c: Switch ltick to transaction-based ptimer API

Switch the ltick ptimer over to the ptimer transaction API.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20191008171740.9679-14-peter.maydell@linaro.org

hw/timer/exynos4210_mct.c: Switch LFRC to transaction-based ptimer API

Switch the exynos MCT LFRC timers over to the ptimer transaction API.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20191008171740.9679-13-peter.maydell@linaro.org

hw/timer/exynos4210_mct.c: Switch GFRC to transaction-based ptimer API

We want to switch the exynos MCT code away from bottom-half based ptimers to
the new transaction-based ptimer API. The MCT is complicated
and uses multiple different ptimers, so it's clearer to switch
it a piece at a time. Here we change over only the GFRC.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20191008171740.9679-12-peter.maydell@linaro.org

hw/timer/digic-timer.c: Switch to transaction-based ptimer API

Switch the digic-timer.c code away from bottom-half based ptimers to
the new transaction-based ptimer API. This just requires adding
begin/commit calls around the various places that modify the ptimer
state, and using the new ptimer_init() function to create the timer.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20191008171740.9679-11-peter.maydell@linaro.org

hw/timer/cmsdk-apb-timer.c: Switch to transaction-based ptimer API

Switch the cmsdk-apb-timer code away from bottom-half based ptimers
to the new transaction-based ptimer API. This just requires adding
begin/commit calls around the various places that modify the ptimer
state, and using the new ptimer_init() function to create the timer.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20191008171740.9679-10-peter.maydell@linaro.org

hw/timer/cmsdk-apb-dualtimer.c: Switch to transaction-based ptimer API

Switch the cmsdk-apb-dualtimer code away from bottom-half based
ptimers to the new transaction-based ptimer API. This just requires
adding begin/commit calls around the various places that modify the
ptimer state, and using the new ptimer_init() function to create the
timer.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20191008171740.9679-9-peter.maydell@linaro.org

hw/timer/arm_mptimer.c: Switch to transaction-based ptimer API

Switch the arm_mptimer.c code away from bottom-half based ptimers to
the new transaction-based ptimer API. This just requires adding
begin/commit calls around the various places that modify the ptimer
state, and using the new ptimer_init() function to create the timer.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20191008171740.9679-8-peter.maydell@linaro.org

hw/timer/allwinner-a10-pit.c: Switch to transaction-based ptimer API

Switch the allwinner-a10-pit code away from bottom-half based ptimers to
the new transaction-based ptimer API. This just requires adding
begin/commit calls around the various places that modify the ptimer
state, and using the new ptimer_init() function to create the timer.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20191008171740.9679-7-peter.maydell@linaro.org

hw/arm/musicpal.c: Switch to transaction-based ptimer API

Switch the musicpal code away from bottom-half based ptimers to
the new transaction-based ptimer API. This just requires adding
begin/commit calls around the various places that modify the ptimer
state, and using the new ptimer_init() function to create the timer.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20191008171740.9679-6-peter.maydell@linaro.org

hw/timer/arm_timer.c: Switch to transaction-based ptimer API

Switch the arm_timer.c code away from bottom-half based ptimers
to the new transaction-based ptimer API. This just requires
adding begin/commit calls around the various arms of
arm_timer_write() that modify the ptimer state, and using the
new ptimer_init() function to create the timer.

Fixes: https://bugs.launchpad.net/qemu/+bug/1777777
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20191008171740.9679-5-peter.maydell@linaro.org

tests/ptimer-test: Switch to transaction-based ptimer API

Convert the ptimer test cases to the transaction-based ptimer API,
by changing to ptimer_init(), dropping the now-unused QEMUBH
variables, and surrounding each set of changes to the ptimer
state in ptimer_transaction_begin/commit calls.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20191008171740.9679-4-peter.maydell@linaro.org

ptimer: Provide new transaction-based API

Provide the new transaction-based API. If a ptimer is created
using ptimer_init() rather than ptimer_init_with_bh(), then
instead of providing a QEMUBH, it provides a pointer to the
callback function directly, and has opted into the transaction
API. All calls to functions which modify ptimer state:
- ptimer_set_period()
- ptimer_set_freq()
- ptimer_set_limit()
- ptimer_set_count()
- ptimer_run()
- ptimer_stop()
must be between matched calls to ptimer_transaction_begin()
and ptimer_transaction_commit(). When ptimer_transaction_commit()
is called it will evaluate the state of the timer after all the
changes in the transaction, and call the callback if necessary.

In the old API the individual update functions generally would
call ptimer_trigger() immediately, which would schedule the QEMUBH.
In the new API the update functions will instead defer the
"set s->next_event and call ptimer_reload()" work to
ptimer_transaction_commit().

Because ptimer_trigger() can now immediately call into the
device code which may then call other ptimer functions that
update ptimer_state fields, we must be more careful in
ptimer_reload() not to cache fields from ptimer_state across
the ptimer_trigger() call. (This was harmless with the QEMUBH
mechanism as the BH would not be invoked until much later.)

We use assertions to check that:
* the functions modifying ptimer state are not called outside
a transaction block
* ptimer_transaction_begin() and _commit() calls are paired
* the transaction API is not used with a QEMUBH ptimer

There is some slight repetition of code:
* most of the set functions have similar looking "if s->bh
call ptimer_reload, otherwise set s->need_reload" code
* ptimer_init() and ptimer_init_with_bh() have similar code
We deliberately don't try to avoid this repetition, because
it will all be deleted when the QEMUBH version of the API
is removed.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20191008171740.9679-3-peter.maydell@linaro.org

ptimer: Rename ptimer_init() to ptimer_init_with_bh()

Currently the ptimer design uses a QEMU bottom-half as its
mechanism for calling back into the device model using the
ptimer when the timer has expired. Unfortunately this design
is fatally flawed, because it means that there is a lag
between the ptimer updating its own state and the device
callback function updating device state, and guest accesses
to device registers between the two can return inconsistent
device state.

We want to replace the bottom-half design with one where
the guest device's callback is called either immediately
(when the ptimer triggers by timeout) or when the device
model code closes a transaction-begin/end section (when the
ptimer triggers because the device model changed the
ptimer's count value or other state). As the first step,
rename ptimer_init() to ptimer_init_with_bh(), to free up
the ptimer_init() name for the new API. We can then convert
all the ptimer users away from ptimer_init_with_bh() before
removing it entirely.

(Commit created with
git grep -l ptimer_init | xargs sed -i -e 's/ptimer_init/ptimer_init_with_bh/'
and three overlong lines folded by hand.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20191008171740.9679-2-peter.maydell@linaro.org

ARM: KVM: Check KVM_CAP_ARM_IRQ_LINE_LAYOUT_2 for smp_cpus > 256

Host kernel within [4.18, 5.3] report an erroneous KVM_MAX_VCPUS=512
for ARM. The actual capability to instantiate more than 256 vcpus
was fixed in 5.4 with the upgrade of the KVM_IRQ_LINE ABI to support
vcpu id encoded on 12 bits instead of 8 and a redistributor consuming
a single KVM IO device instead of 2.

So let's check this capability when attempting to use more than 256
vcpus within any ARM kvm accelerated machine.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Message-id: 20191003154640.22451-4-eric.auger@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

intc/arm_gic: Support IRQ injection for more than 256 vpus

Host kernels that expose the KVM_CAP_ARM_IRQ_LINE_LAYOUT_2 capability
allow injection of interrupts along with vcpu ids larger than 255.
Let's encode the vpcu id on 12 bits according to the upgraded KVM_IRQ_LINE
ABI when needed.

Given that we have two callsites that need to assemble
the value for kvm_set_irq(), a new helper routine, kvm_arm_set_irq
is introduced.

Without that patch qemu exits with "kvm_set_irq: Invalid argument"
message.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reported-by: Zenghui Yu <yuzenghui@huawei.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Message-id: 20191003154640.22451-3-eric.auger@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

linux headers: update against v5.4-rc1

Update the headers against commit:
0f1a7b3fac05 ("timer-of: don't use conditional expression
with mixed 'void' types")

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Message-id: 20191003154640.22451-2-eric.auger@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging

Block layer patches:

- block: Fix crash with qcow2 partial cluster COW with small cluster
  sizes (misaligned write requests with BDRV_REQ_NO_FALLBACK)
- qcow2: Fix integer overflow potentially causing corruption with huge
  requests
- vhdx: Detect truncated image files
- tools: Support help options for --object
- Various block-related replay improvements
- iotests/028: Fix for long $TEST_DIRs

# gpg: Signature made Mon 14 Oct 2019 17:02:54 BST
# gpg:                using RSA key 7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full]
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74  56FE 7F09 B272 C88F 2FD6

* remotes/kevin/tags/for-upstream:
  iotests: Test large write request to qcow2 file
  qcow2: Limit total allocation range to INT_MAX
  qemu-nbd: Support help options for --object
  qemu-img: Support help options for --object
  qemu-io: Support help options for --object
  vl: Split off user_creatable_print_help()
  iotests/028: Fix for long $TEST_DIRs
  block: Reject misaligned write requests with BDRV_REQ_NO_FALLBACK
  replay: add BH oneshot event for block layer
  replay: finish record/replay before closing the disks
  replay: don't drain/flush bdrv queue while RR is working
  replay: update docs for record/replay with block devices
  replay: disable default snapshot for record/replay
  block: implement bdrv_snapshot_goto for blkreplay
  block/vhdx: add check for truncated image files

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' into staging

Pull request

v2:
* Replaced "Launchpad:" tag with "Buglink:" as documented on the SubmitAPatch wiki page [Philippe]

# gpg: Signature made Tue 15 Oct 2019 09:49:05 BST
# gpg:                using RSA key 8695A8BFD3F97CDAAC35775A9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" [full]
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>" [full]
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* remotes/stefanha/tags/tracing-pull-request:
  trace: avoid "is" with a literal Python 3.8 warnings
  trace: add --group=all to tracing.txt

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging

Pull request

# gpg: Signature made Mon 14 Oct 2019 09:52:03 BST
# gpg:                using RSA key 8695A8BFD3F97CDAAC35775A9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" [full]
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>" [full]
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* remotes/stefanha/tags/block-pull-request:
  test-bdrv-drain: fix iothread_join() hang

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

trace: avoid "is" with a literal Python 3.8 warnings

The following statement produces a SyntaxWarning with Python 3.8:

if len(format) is 0:
scripts/tracetool/__init__.py:459: SyntaxWarning: "is" with a literal. Did you mean "=="?

Use the conventional len(x) == 0 syntax instead.

Reported-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20191010122154.10553-1-stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

trace: add --group=all to tracing.txt

tracetool needs to know the group name ("all", "root", or a specific
subdirectory).  Also remove the stdin redirection because tracetool.py
needs the path to the trace-events file.  Update the documentation.

Fixes: 2098c56a9bc5901e145fa5d4759f075808811685
       ("trace: move setting of group name into Makefiles")
Buglink: https://bugs.launchpad.net/bugs/1844814
Reported-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20191009135154.10970-1-stefanha@redhat.com>

Merge remote-tracking branch 'remotes/mcayland/tags/qemu-openbios-20191012' into staging

qemu-openbios queue

# gpg: Signature made Sat 12 Oct 2019 10:47:55 BST
# gpg:                using RSA key CC621AB98E82200D915CC9C45BC2C56FAE0F321F
# gpg:                issuer "mark.cave-ayland@ilande.co.uk"
# gpg: Good signature from "Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>" [full]
# Primary key fingerprint: CC62 1AB9 8E82 200D 915C  C9C4 5BC2 C56F AE0F 321F

* remotes/mcayland/tags/qemu-openbios-20191012:
  Update OpenBIOS images to f28e16f9 built from submodule.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

iotests: Test large write request to qcow2 file

Without HEAD^, the following happens when you attempt a large write
request to a qcow2 file such that the number of bytes covered by all
clusters involved in a single allocation will exceed INT_MAX:

(A) handle_alloc_space() decides to fill the whole area with zeroes and
    fails because bdrv_co_pwrite_zeroes() fails (the request is too
    large).

(B) If handle_alloc_space() does not do anything, but merge_cow()
    decides that the requests can be merged, it will create a too long
    IOV that later cannot be written.

(C) Otherwise, all parts will be written separately, so those requests
    will work.

In either B or C, though, qcow2_alloc_cluster_link_l2() will have an
overflow: We use an int (i) to iterate over nb_clusters, and then
calculate the L2 entry based on "i << s->cluster_bits" -- which will
overflow if the range covers more than INT_MAX bytes.  This then leads
to image corruption because the L2 entry will be wrong (it will be
recognized as a compressed cluster).

Even if that were not the case, the .cow_end area would be empty
(because handle_alloc() will cap avail_bytes and nb_bytes at INT_MAX, so
their difference (which is the .cow_end size) will be 0).

So this test checks that on such large requests, the image will not be
corrupted.  Unfortunately, we cannot check whether COW will be handled
correctly, because that data is discarded when it is written to null-co
(but we have to use null-co, because writing 2 GB of data in a test is
not quite reasonable).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

qcow2: Limit total allocation range to INT_MAX

When the COW areas are included, the size of an allocation can exceed
INT_MAX. This is kind of limited by handle_alloc() in that it already
caps avail_bytes at INT_MAX, but the number of clusters still reflects
the original length.

This can have all sorts of effects, ranging from the storage layer write
call failing to image corruption. (If there were no image corruption,
then I suppose there would be data loss because the .cow_end area is
forced to be empty, even though there might be something we need to
COW.)

Fix all of it by limiting nb_clusters so the equivalent number of bytes
will not exceed INT_MAX.

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

qemu-nbd: Support help options for --object

Instead of parsing help options as normal object properties and
returning an error, provide the same help functionality as the system
emulator in qemu-nbd, too.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>

qemu-img: Support help options for --object

Instead of parsing help options as normal object properties and
returning an error, provide the same help functionality as the system
emulator in qemu-img, too.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>

qemu-io: Support help options for --object

Instead of parsing help options as normal object properties and
returning an error, provide the same help functionality as the system
emulator in qemu-io, too.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>

vl: Split off user_creatable_print_help()

Printing help for --object is something that we not only want in the
system emulator, but also in tools that support --object. Move it into a
separate function in qom/object_interfaces.c to make the code accessible
for tools.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>

iotests/028: Fix for long $TEST_DIRs

For long test image paths, the order of the "Formatting" line and the
"(qemu)" prompt after a drive_backup HMP command may be reversed. In
fact, the interaction between the prompt and the line may lead to the
"Formatting" to being greppable at all after "read"-ing it (if the
prompt injects an IFS character into the "Formatting" string).

So just wait until we get a prompt. At that point, the block job must
have been started, so "info block-jobs" will only return "No active
jobs" once it is done.

Reported-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: Reject misaligned write requests with BDRV_REQ_NO_FALLBACK

The BDRV_REQ_NO_FALLBACK flag means that an operation should only be
performed if it can be offloaded or otherwise performed efficiently.

However a misaligned write request requires a RMW so we should return
an error and let the caller decide how to proceed.

This hits an assertion since commit c8bb23cbdb if the required
alignment is larger than the cluster size:

qemu-img create -f qcow2 -o cluster_size=2k img.qcow2 4G
qemu-io -c "open -o driver=qcow2,file.align=4k blkdebug::img.qcow2" \
-c 'write 0 512'
qemu-io: block/io.c:1127: bdrv_driver_pwritev: Assertion `!(flags & BDRV_REQ_NO_FALLBACK)' failed.
Aborted

The reason is that when writing to an unallocated cluster we try to
skip the copy-on-write part and zeroize it using BDRV_REQ_NO_FALLBACK
instead, resulting in a write request that is too small (2KB cluster
size vs 4KB required alignment).

Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

replay: add BH oneshot event for block layer

Replay is capable of recording normal BH events, but sometimes
there are single use callbacks scheduled with aio_bh_schedule_oneshot
function. This patch enables recording and replaying such callbacks.
Block layer uses these events for calling the completion function.
Replaying these calls makes the execution deterministic.

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Acked-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

replay: finish record/replay before closing the disks

After recent updates block devices cannot be closed on qemu exit.
This happens due to the block request polling when replay is not finished.
Therefore now we stop execution recording before closing the block devices.

Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

replay: don't drain/flush bdrv queue while RR is working

In record/replay mode bdrv queue is controlled by replay mechanism.
It does not allow saving or loading the snapshots
when bdrv queue is not empty. Stopping the VM is not blocked by nonempty
queue, but flushing the queue is still impossible there,
because it may cause deadlocks in replay mode.
This patch disables bdrv_drain_all and bdrv_flush_all in
record/replay mode.

Stopping the machine when the IO requests are not finished is needed
for the debugging. E.g., breakpoint may be set at the specified step,
and forcing the IO requests to finish may break the determinism
of the execution.

Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Acked-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

replay: update docs for record/replay with block devices

This patch updates the description of the command lines for using
record/replay with attached block devices.

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

replay: disable default snapshot for record/replay

This patch disables setting '-snapshot' option on by default
in record/replay mode. This is needed for creating vmstates in record
and replay modes.

Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Acked-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: implement bdrv_snapshot_goto for blkreplay

This patch enables making snapshots with blkreplay used in
block devices.
This function is required to make bdrv_snapshot_goto without
calling .bdrv_open which is not implemented.

Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Acked-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block/vhdx: add check for truncated image files

qemu is currently not able to detect truncated vhdx image files.
Add a basic check if all allocated blocks are reachable at open and
report all errors during bdrv_co_check.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

Merge remote-tracking branch 'remotes/dgilbert/tags/pull-migration-20191011a' into staging

Migration pull 2019-10-11

Mostly cleanups and minor fixes

[Note I'm seeing a hang on the aarch64 hosted x86-64 tcg migration
test in xbzrle; but I'm seeing that on current head as well]

# gpg: Signature made Fri 11 Oct 2019 20:14:31 BST
# gpg:                using RSA key 45F5C71B4A0CB7FB977A9FA90516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <dgilbert@redhat.com>" [full]
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A  9FA9 0516 331E BC5B FDE7

* remotes/dgilbert/tags/pull-migration-20191011a: (21 commits)
  migration: Support gtree migration
  migration/multifd: pages->used would be cleared when attach to multifd_send_state
  migration/multifd: initialize packet->magic/version once at setup stage
  migration/multifd: use pages->allocated instead of the static max
  migration/multifd: fix a typo in comment of multifd_recv_unfill_packet()
  migration/postcopy: check PostcopyState before setting to POSTCOPY_INCOMING_RUNNING
  migration/postcopy: rename postcopy_ram_enable_notify to postcopy_ram_incoming_setup
  migration/postcopy: postpone setting PostcopyState to END
  migration/postcopy: mis->have_listen_thread check will never be touched
  migration: report SaveStateEntry id and name on failure
  migration: pass in_postcopy instead of check state again
  migration/postcopy: fix typo in mark_postcopy_blocktime_begin's comment
  migration/postcopy: map large zero page in postcopy_ram_incoming_setup()
  migration/postcopy: allocate tmp_page in setup stage
  migration: Don't try and recover return path in non-postcopy
  rcu: Use automatic rc_read unlock in core memory/exec code
  migration: Use automatic rcu_read unlock in rdma.c
  migration: Use automatic rcu_read unlock in ram.c
  migration: Fix missing rcu_read_unlock
  rcu: Add automatically released rcu_read_lock variants
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/awilliam/tags/vfio-update-20191010.0' into staging

VFIO update 2019-10-10

- Fix MSI error path double free (Evgeny Yakovlev)

# gpg: Signature made Thu 10 Oct 2019 20:07:39 BST
# gpg:                using RSA key 239B9B6E3BB08B22
# gpg: Good signature from "Alex Williamson <alex.williamson@redhat.com>" [full]
# gpg:                 aka "Alex Williamson <alex@shazbot.org>" [full]
# gpg:                 aka "Alex Williamson <alwillia@redhat.com>" [full]
# gpg:                 aka "Alex Williamson <alex.l.williamson@gmail.com>" [full]
# Primary key fingerprint: 42F6 C04E 540B D1A9 9E7B  8A90 239B 9B6E 3BB0 8B22

* remotes/awilliam/tags/vfio-update-20191010.0:
  hw/vfio/pci: fix double free in vfio_msi_disable

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/gkurz/tags/9p-next-2019-10-10' into staging

The most notable change is that we now detect cross-device setups in the
host since it may cause inode number collision and mayhem in the guest.
A new fsdev property is added for the user to choose the appropriate
policy to handle that: either remap all inode numbers or fail I/Os to
another host device or just print out a warning (default behaviour).

This is also my last PR as _active_ maintainer of 9pfs.

# gpg: Signature made Thu 10 Oct 2019 12:14:07 BST
# gpg:                using RSA key B4828BAF943140CEF2A3491071D4D5E5822F73D6
# gpg: Good signature from "Greg Kurz <groug@kaod.org>" [full]
# gpg:                 aka "Gregory Kurz <gregory.kurz@free.fr>" [full]
# gpg:                 aka "[jpeg image of size 3330]" [full]
# Primary key fingerprint: B482 8BAF 9431 40CE F2A3  4910 71D4 D5E5 822F 73D6

* remotes/gkurz/tags/9p-next-2019-10-10:
  MAINTAINERS: Downgrade status of virtio-9p to "Odd Fixes"
  9p: Use variable length suffixes for inode remapping
  9p: stat_to_qid: implement slow path
  9p: Added virtfs option 'multidevs=remap|forbid|warn'
  9p: Treat multiple devices on one export as an error
  fsdev: Add return value to fsdev_throttle_parse_opts()
  9p: Simplify error path of v9fs_device_realize_common()
  9p: unsigned type for type, version, path

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/maxreitz/tags/pull-block-2019-10-10' into staging

Block patches:
- Parallelized request handling for qcow2
- Backup job refactoring to use a filter node instead of before-write
  notifiers
- Add discard accounting information to file-posix nodes
- Allow trivial reopening of nbd nodes
- Some iotest fixes

# gpg: Signature made Thu 10 Oct 2019 12:40:34 BST
# gpg:                using RSA key 91BEB60A30DB3E8857D11829F407DB0061D5CF40
# gpg:                issuer "mreitz@redhat.com"
# gpg: Good signature from "Max Reitz <mreitz@redhat.com>" [full]
# Primary key fingerprint: 91BE B60A 30DB 3E88 57D1  1829 F407 DB00 61D5 CF40

* remotes/maxreitz/tags/pull-block-2019-10-10: (36 commits)
  iotests/162: Fix for newer Linux 5.3+
  tests: fix I/O test for hosts defaulting to LUKSv2
  nbd: add empty .bdrv_reopen_prepare
  block/backup: use backup-top instead of write notifiers
  block: introduce backup-top filter driver
  block/block-copy: split block_copy_set_callbacks function
  block/backup: move write_flags calculation inside backup_job_create
  block/backup: move in-flight requests handling from backup to block-copy
  iotests: Use stat -c %b in 125
  iotests: Disable 125 on broken XFS versions
  iotests: Fix 125 for growth_mode = metadata
  qapi: query-blockstat: add driver specific file-posix stats
  file-posix: account discard operations
  scsi: account unmap operations
  scsi: move unmap error checking to the complete callback
  scsi: store unmap offset and nb_sectors in request struct
  ide: account UNMAP (TRIM) operations
  block: add empty account cookie type
  qapi: add unmap to BlockDeviceStats
  qapi: group BlockDeviceStats fields
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/davidhildenbrand/tags/s390x-tcg-2019-10-10' into staging

- MMU DAT translation rewrite and cleanup
- Implement more TCG CPU features related to the MMU (e.g., IEP)
- Add the current instruction length to unwind data and clean up
- Resolve one TODO for the MVCL instruction

# gpg: Signature made Thu 10 Oct 2019 12:25:06 BST
# gpg:                using RSA key 1BD9CAAD735C4C3A460DFCCA4DDE10F700FF835A
# gpg:                issuer "david@redhat.com"
# gpg: Good signature from "David Hildenbrand <david@redhat.com>" [unknown]
# gpg:                 aka "David Hildenbrand <davidhildenbrand@gmail.com>" [full]
# Primary key fingerprint: 1BD9 CAAD 735C 4C3A 460D  FCCA 4DDE 10F7 00FF 835A

* remotes/davidhildenbrand/tags/s390x-tcg-2019-10-10: (31 commits)
  s390x/tcg: MVCL: Exit to main loop if requested
  target/s390x: Remove ILEN_UNWIND
  target/s390x: Remove ilen argument from trigger_pgm_exception
  target/s390x: Remove ilen argument from trigger_access_exception
  target/s390x: Remove ILEN_AUTO
  target/s390x: Rely on unwinding in s390_cpu_virt_mem_rw
  target/s390x: Rely on unwinding in s390_cpu_tlb_fill
  target/s390x: Simplify helper_lra
  target/s390x: Remove fail variable from s390_cpu_tlb_fill
  target/s390x: Return exception from translate_pages
  target/s390x: Return exception from mmu_translate
  target/s390x: Remove exc argument to mmu_translate_asce
  target/s390x: Return exception from mmu_translate_real
  target/s390x: Handle tec in s390_cpu_tlb_fill
  target/s390x: Push trigger_pgm_exception lower in s390_cpu_tlb_fill
  target/s390x: Use tcg_s390_program_interrupt in TCG helpers
  target/s390x: Remove ilen parameter from s390_program_interrupt
  target/s390x: Remove ilen parameter from tcg_s390_program_interrupt
  target/s390x: Add ilen to unwind data
  s390x/cpumodel: Add new TCG features to QEMU cpu model
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

test-bdrv-drain: fix iothread_join() hang

tests/test-bdrv-drain can hang in tests/iothread.c:iothread_run():

  while (!atomic_read(&iothread->stopping)) {
      aio_poll(iothread->ctx, true);
  }

The iothread_join() function works as follows:

  void iothread_join(IOThread *iothread)
  {
      iothread->stopping = true;
      aio_notify(iothread->ctx);
      qemu_thread_join(&iothread->thread);

If iothread_run() checks iothread->stopping before the iothread_join()
thread sets stopping to true, then aio_notify() may be optimized away
and iothread_run() hangs forever in aio_poll().

The correct way to change iothread->stopping is from a BH that executes
within iothread_run().  This ensures that iothread->stopping is checked
after we set it to true.

This was already fixed for ./iothread.c (note this is a different source
file!) by commit 2362a28ea11c145e1a13ae79342d76dc118a72a6 ("iothread:
fix iothread_stop() race condition"), but not for tests/iothread.c.

Fixes: 0c330a734b51c177ab8488932ac3b0c4d63a718a
       ("aio: introduce aio_co_schedule and aio_co_wake")
Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20191003100103.331-1-stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

Update OpenBIOS images to f28e16f9 built from submodule.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>

migration: Support gtree migration

Introduce support for GTree migration. A custom save/restore
is implemented. Each item is made of a key and a data.

If the key is a pointer to an object, 2 VMSDs are passed into
the GTree VMStateField.

When putting the items, the tree is traversed in sorted order by
g_tree_foreach.

On the get() path, gtrees must be allocated using the proper
key compare, key destroy and value destroy. This must be handled
beforehand, for example in a pre_load method.

Tests are added to test save/dump of structs containing gtrees
including the virtio-iommu domain/mappings scenario.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20191011121724.433-1-eric.auger@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
uintptr_t fixup for test on 32bit

migration/multifd: pages->used would be cleared when attach to multifd_send_state

When we found an available channel in multifd_send_pages(), its
pages->used is cleared and then attached to multifd_send_state.

It is not necessary to do this twice.

Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Message-Id: <20191011085050.17622-5-richardw.yang@linux.intel.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

migration/multifd: initialize packet->magic/version once at setup stage

MultiFDPacket_t's magic and version field never changes during
migration, so move these two fields in setup stage.

Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Message-Id: <20191011085050.17622-4-richardw.yang@linux.intel.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

migration/multifd: use pages->allocated instead of the static max

multifd_send_fill_packet() prepares meta data for following pages to
transfer. It would be more proper to fill pages->allocated instead of
static max value, especially we want to support flexible packet size.

Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Message-Id: <20191011085050.17622-3-richardw.yang@linux.intel.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

migration/multifd: fix a typo in comment of multifd_recv_unfill_packet()

Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Message-Id: <20191011085050.17622-2-richardw.yang@linux.intel.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

migration/postcopy: check PostcopyState before setting to POSTCOPY_INCOMING_RUNNING

Currently, we set PostcopyState blindly to RUNNING, even we found the
previous state is not LISTENING. This will lead to a corner case.

First let's look at the code flow:

qemu_loadvm_state_main()
    ret = loadvm_process_command()
        loadvm_postcopy_handle_run()
            return -1;
    if (ret < 0) {
        if (postcopy_state_get() == POSTCOPY_INCOMING_RUNNING)
            ...
    }

>From above snippet, the corner case is loadvm_postcopy_handle_run()
always sets state to RUNNING. And then it checks the previous state. If
the previous state is not LISTENING, it will return -1. But at this
moment, PostcopyState is already been set to RUNNING.

Then ret is checked in qemu_loadvm_state_main(), when it is -1
PostcopyState is checked. Current logic would pause postcopy and retry
if PostcopyState is RUNNING. This is not what we expect, because
postcopy is not active yet.

This patch makes sure state is set to RUNNING only previous state is
LISTENING by checking the state first.

Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Suggested by: Peter Xu <peterx@redhat.com>
Message-Id: <20191010011316.31363-3-richardw.yang@linux.intel.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

migration/postcopy: rename postcopy_ram_enable_notify to postcopy_ram_incoming_setup

Function postcopy_ram_incoming_setup and postcopy_ram_incoming_cleanup
is a pair. Rename to make it clear for audience.

Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20191010011316.31363-2-richardw.yang@linux.intel.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

migration/postcopy: postpone setting PostcopyState to END

There are two places to call function postcopy_ram_incoming_cleanup()

    postcopy_ram_listen_thread on migration success
    loadvm_postcopy_handle_listen one setup failure

On success, the vm will never accept another migration. On failure,
PostcopyState is transited from LISTENING to END and would be checked in
qemu_loadvm_state_main(). If PostcopyState is RUNNING, migration would
be paused and retried.

Currently PostcopyState is set to END in function
postcopy_ram_incoming_cleanup(). With above analysis, we can take this
step out and postpone this till the end of listen thread to indicate the
listen thread is done.

This is a preparation patch for later cleanup.

Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Message-Id: <20191006000249.29926-3-richardw.yang@linux.intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
  Fixed up in merge to the 1 parameter postcopy_state_set

migration/postcopy: mis->have_listen_thread check will never be touched

If mis->have_listen_thread is true, this means current PostcopyState
must be LISTENING or RUNNING. While the check at the beginning of the
function makes sure the state transaction happens when its previous
PostcopyState is ADVISE or DISCARD.

This means we would never touch this check.

Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Message-Id: <20191006000249.29926-2-richardw.yang@linux.intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

migration: report SaveStateEntry id and name on failure

This provides helpful information on which entry failed.

Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Message-Id: <20191005220517.24029-5-richardw.yang@linux.intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

migration: pass in_postcopy instead of check state again

Not necessary to do the check again.

Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Message-Id: <20191005220517.24029-4-richardw.yang@linux.intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>