git.proxmox.com Git - mirror

target/ppc: Move no-execute and guarded page checking into new function

A pte entry has bit fields which can be used to make a page no-execute or
guarded, if either of these bits are set then an instruction access to this
page will fail. Currently these bits are checked with the pp_prot function
however the ISA specifies that the access authority controlled by the
key-pp value pair should only be checked on an instruction access after
the no-execute and guard bits have already been verified to permit the
access.

Move the no-execute and guard bit checking into a new separate function.
Note that we can remove the check for the no-execute bit in the slb entry
since this check was already performed above when we obtained the slb
entry.

In the event that the no-execute or guard bits are set, an ISI should be
generated with the SRR1_NOEXEC_GUARD (0x10000000) bit set in srr1. Add a
define for this for clarity.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
[dwg: Move constants to cpu.h since they're not MMUv3 specific]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

target/ppc: Add execute permission checking to access authority check

Basic storage protection defines various access authority permissions
based on a slb storage key and pte pp value pair. This access authority
defines read, write and execute permissions however currently we only
use this to control read and write permissions and ignore the execute
control.

Fix the code to allow execute permissions based on the key-pp value pair.
Execute is allowed under the same conditions which enable reads.
(i.e. read permission -> execute permission)

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

target/ppc: Add Instruction Authority Mask Register Check

The instruction authority mask register (IAMR) can be used to restrict
permissions for instruction fetch accesses on a per key basis for each
of 32 different key values. Access permissions are derived based on the
specific key value stored in the relevant page table entry.

The IAMR was introduced in, and is present in processors since, POWER8
(ISA v2.07). Thus introduce a function to check access permissions based
on the pte key value and the contents of the IAMR when handling a page
fault to ensure sufficient access permissions for an instruction fetch.

A hash pte contains a key value in bits 2:3|52:54 of the second double word
of the pte, this key value gives an index into the IAMR which contains 32
2-bit access masks. If the least significant bit of the 2-bit access mask
corresponding to the given key value is set (IAMR[key] & 0x1 == 0x1) then
the instruction fetch is not permitted and an ISI is generated accordingly.
While we're here, add defines for the srr1 bits to be set for the ISI for
clarity.

e.g.

pte:
dw0 [XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX]
dw1 [XX01XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX010XXXXXXXXX]
^^ ^^^
key = 01010 (0x0a)

IAMR: [XXXXXXXXXXXXXXXXXXXX01XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX]
^^
Access mask = 0b01

Test access mask: 0b01 & 0x1 == 0x1

Least significant bit of the access mask is set, thus the instruction fetch
is not permitted. We should generate an instruction storage interrupt (ISI)
with bit 42 of SRR1 set to indicate access precluded by virtual page class
key protection.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
[dwg: Move new constants to cpu.h, since they're not MMUv3 specific]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

hw/ppc/spapr: Add POWER9 to pseries cpu models

Add POWER9 cpu to list of spapr core models which allows it to be specified
as the cpu model for a pseries guest (e.g. -machine pseries -cpu POWER9).

This now allows a POWER9 cpu to boot to userspace in tcg emulation for a
pseries machine with a legacy kernel.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

target/ppc/POWER9: Add cpu_has_work function for POWER9

The cpu has work function is used to mask interrupts used to determine
if there is work for the cpu based on the LPCR. Add a function to do this
for POWER9 and add it to the POWER9 cpu definition. This is similar to that
for POWER8 except using the LPCR bits as defined for POWER9.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

target/ppc/POWER9: Add POWER9 pa-features definition

Add a pa-features definition which includes all of the new fields which
have been added, note we don't claim support for any of these new features
at this stage.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

target/ppc/POWER9: Add POWER9 mmu fault handler

Add a new mmu fault handler for the POWER9 cpu and add it as the handler
for the POWER9 cpu definition.

This handler checks if the guest is radix or hash based on the value in the
partition table entry and calls the correct fault handler accordingly.

The hash fault handling code has also been updated to check if the
partition is using segment tables.

Currently only legacy hash (no segment tables) is supported.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

target/ppc: Don't gen an SDR1 on POWER9 and rework register creation

POWER9 doesn't have a storage description register 1 (SDR1) which is used
to store the base and size of the hash table. Thus we don't need to
generate this register on the POWER9 cpu model. While we're here, the
register generation code for 970, POWER5+, POWER<7/8/9> in general is a
mess where we call a generic function from a model specific function which
then attempts to call model specific functions, so rework this for
readability.

We update ppc_cpu_dump_state so that "info registers" will only display
the value of sdr1 if the register has been generated.

As mentioned above the register generation for the pcc->init_proc
function for 970, POWER5+, POWER7, POWER8 and POWER9 has been reworked
for improved clarity. Instead of calling init_proc_book3s_64 which then
attempts to generate the correct registers through a mess of if statements,
we remove this function and instead call the appropriate register
generation functions directly. This follows the register generation model
used for earlier cpu models (pre-970) whereby cpu specific registers are
generated directly in the init_proc function and makes it easier to
add/remove specific registers for new cpu models.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

target/ppc: Add patb_entry to sPAPRMachineState

ISA v3.00 adds the idea of a partition table which is used to store the
address translation details for all partitions on the system. The partition
table consists of double word entries indexed by partition id where the second
double word contains the location of the process table in guest memory. The
process table is registered by the guest via a h-call.

We need somewhere to store the address of the process table so we add an entry
to the sPAPRMachineState struct called patb_entry to represent the second
doubleword of a single partition table entry corresponding to the current
guest. We need to store this value so we know if the guest is using radix or
hash translation and the location of the corresponding process table in guest
memory. Since we only have a single guest per qemu instance, we only need one
entry.

Since the partition table is technically a hypervisor resource we require that
access to it is abstracted by the virtual hypervisor through the get_patbe()
call. Currently the value of the entry is never set (and thus
defaults to 0 indicating hash), but it will be required to both implement
POWER9 kvm support and tcg radix support.

We also add this field to be migrated as part of the sPAPRMachineState as we
will need it on the receiving side as the guest will never tell us this
information again and we need it to perform translation.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

target/ppc/POWER9: Add POWERPC_MMU_V3 bit

For easier handling of future processors using the POWER9 or something
close to it, add a new bit in the MMU model. This was originally from a
revised version of 86cf1e9 "target/ppc/POWER9: Add ISAv3.00 MMU definition"
but the older version of the patch was already merged. This makes the
change on top of the original version.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

powernv: Don't test POWER9 CPU yet

A couple of tests for the work-in-progress 'powernv' machine type attempt
to test on POWER9 CPUs.  However the POWER9 CPU support is incomplete and
this doesn't really work.  In particular the firmware image we have
currently assumes the presence of the SDR1 register, which no longer exists
on POWER9.  We only got away with this so far, because of a different bug
which added SDR1 to POWER9 even though it shouldn't be there.

For now, remove POWER9 testing of powernv, POWER8 testing will do for now
until the POWER9 support is more complete.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

exec, kvm, target-ppc: Move getrampagesize() to common code

getrampagesize() returns the largest supported page size and mainly
used to know if huge pages are enabled.

However is implemented in target-ppc/kvm.c and not available
in TCG or other architectures.

This renames and moves gethugepagesize() to mmap-alloc.c where
fd-based analog of it is already implemented. This renames and moves
getrampagesize() to exec.c as it seems to be the common place for
helpers like this.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

target/ppc: Add POWER9/ISAv3.00 to compat_table

compat_table contains the list of logical pvr compat modes which a cpu can
operate in. It is a list of struct CompatInfo which contains the given pvr
value for a compat mode, the pcr bits which should be set to operate in
that compat mode, the pcr level which must be present in pcr_supported for
a processor to support that compat mode and the max threads possible in
that compat mode.

Add an entry for the POWER9/ISAv3.00 logical pvr which represents a
processor running with support for logical pvr 0x0f000005. A processor
running in this mode should have PCR_COMPAT_3_00 set in the pcr (if
available in pcr_mask) and should have PCR_COMPAT_3_00 in pcr_supported
to indicate that it is capable of running in this compat mode.

Also add PCR_COMPAT_3_00 to the bits which must be set for all previous
compat modes. Since no processor models contain this bit yet in pcr_mask
it will never be set, but this ensures we don't forget to in the future.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

Merge remote-tracking branch 'remotes/rth/tags/pull-tgt-20170302' into staging

Queued sparc patch

# gpg: Signature made Wed 01 Mar 2017 19:53:21 GMT
# gpg:                using RSA key 0xAD1270CC4DD0279B
# gpg: Good signature from "Richard Henderson <rth7680@gmail.com>"
# gpg:                 aka "Richard Henderson <rth@redhat.com>"
# gpg:                 aka "Richard Henderson <rth@twiddle.net>"
# Primary key fingerprint: 9CB1 8DDA F8E8 49AD 2AFC  16A4 AD12 70CC 4DD0 279B

* remotes/rth/tags/pull-tgt-20170302:
  target/sparc: Restore ldstub of odd asis

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/kraxel/tags/pull-audio-20170301-1' into staging

audio: replay support, sdl2 fix.

# gpg: Signature made Wed 01 Mar 2017 15:38:09 GMT
# gpg:                using RSA key 0x4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"
# Primary key fingerprint: A032 8CFF B93A 17A7 9901  FE7D 4CB6 D8EE D3E8 7138

* remotes/kraxel/tags/pull-audio-20170301-1:
  audio/sdlaudio: Allow audio playback with SDL2
  audio: make audio poll timer deterministic
  replay: add record/replay for audio passthrough

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/kraxel/tags/pull-docs-20170301-1' into staging

docs: update sample configuration files

# gpg: Signature made Wed 01 Mar 2017 13:43:34 GMT
# gpg:                using RSA key 0x4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"
# Primary key fingerprint: A032 8CFF B93A 17A7 9901  FE7D 4CB6 D8EE D3E8 7138

* remotes/kraxel/tags/pull-docs-20170301-1:
  mach-virt: Provide sample configuration files
  q35: Improve sample configuration files

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/dgilbert/tags/pull-migration-20170228a' into staging

Migration pull

Note: The 'postcopy: Update userfaultfd.h header' is part of
Paolo's header update and will disappear if applied after it.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
# gpg: Signature made Tue 28 Feb 2017 12:38:34 GMT
# gpg:                using RSA key 0x0516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <dgilbert@redhat.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A  9FA9 0516 331E BC5B FDE7

* remotes/dgilbert/tags/pull-migration-20170228a: (27 commits)
  postcopy: Add extra check for COPY function
  postcopy: Add doc about hugepages and postcopy
  postcopy: Check for userfault+hugepage feature
  postcopy: Update userfaultfd.h header
  postcopy: Allow hugepages
  postcopy: Send whole huge pages
  postcopy: Mask fault addresses to huge page boundary
  postcopy: Load huge pages in one go
  postcopy: Use temporary for placing zero huge pages
  postcopy: Plumb pagesize down into place helpers
  postcopy: Record largest page size
  postcopy: enhance ram_block_discard_range for hugepages
  exec: ram_block_discard_range
  postcopy: Chunk discards for hugepages
  postcopy: Transmit and compare individual page sizes
  postcopy: Transmit ram size summary word
  migration: fix use-after-free of to_dst_file
  migration: Update docs to discourage version bumps
  migration: fix id leak regression
  migrate: Introduce a 'dc->vmsd' check to avoid segfault for --only-migratable
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/elmarco/tags/leak-pull-request' into staging

# gpg: Signature made Wed 01 Mar 2017 09:02:53 GMT
# gpg:                using RSA key 0xDAE8E10975969CE5
# gpg: Good signature from "Marc-André Lureau <marcandre.lureau@redhat.com>"
# gpg:                 aka "Marc-André Lureau <marcandre.lureau@gmail.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 87A9 BD93 3F87 C606 D276  F62D DAE8 E109 7596 9CE5

* remotes/elmarco/tags/leak-pull-request: (28 commits)
  tests: fix virtio-blk-test leaks
  tests: add specialized device_find function
  tests: fix usb-test leaks
  tests: allows to run single test in usb-hcd-ehci-test
  usb: release the created buses
  bus: do not unref hotplug handler
  tests: fix virtio-9p-test leaks
  tests: fix virtio-scsi-test leak
  tests: fix e1000e leaks
  tests: fix i440fx-test leaks
  tests: fix e1000-test leak
  tests: fix tco-test leaks
  tests: fix eepro100-test leak
  pc: pcihp: avoid adding ACPI_PCIHP_PROP_BSEL twice
  tests: fix ipmi-bt-test leak
  tests: fix ipmi-kcs-test leak
  tests: fix bios-tables-test leak
  tests: fix hd-geo-test leaks
  tests: fix ide-test leaks
  tests: fix vhost-user-test leaks
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.9-20170301' into staging

ppc patch queue for 2017-03-01

I was hoping to get this pull request squeezed in before the soft
freeze, but I ran into some difficulties during testing.  Everything
here was at least posted before the soft freeze, so I'm hoping we can
still merge it for 2.9.

The biggest things here are:
    * Cleanups to handling of hashed page tables, that will make
      adding support for the POWER9 MMU easier
    * Cleanups to the XICS interrupt controller that will make
      implementing the powernv machine easier
    * TCG implementation of extended overflow and carry handling for
      POWER9

It also includes:
    * Increasing the CPU limit for pseries to 1024 vCPUs
    * Generating proper OF node names in qemu (making hotplug and
      coldplug logic closer together)

# gpg: Signature made Wed 01 Mar 2017 04:43:06 GMT
# gpg:                using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg:                 aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-2.9-20170301: (50 commits)
  Add PowerPC 32-bit guest memory dump support
  ppc/xics: rename 'ICPState *' variables to 'icp'
  ppc/xics: move InterruptStatsProvider to the sPAPR machine
  ppc/xics: move ics-simple post_load under the machine
  ppc/xics: remove the XICSState classes
  ppc/xics: export the XICS init routines
  ppc/xics: move the ICP array under the sPAPR machine
  ppc/xics: register the reset handler of ICP objects
  ppc/xics: simplify spapr_dt_xics() interface
  ppc/xics: use the QOM interface to grab an ICP
  ppc/xics: move the cpu_setup() handler under the ICPState class
  ppc/xics: simplify the cpu_setup() handler
  ppc/xics: move kernel_xics_fd out of KVMXICSState
  ppc/xics: extend the QOM interface to handle ICPs
  ppc/xics: remove the XICS list of ICS
  ppc/xics: register the reset handler of ICS objects
  ppc/xics: remove xics_find_source()
  ppc/xics: use the QOM interface to resend irqs
  ppc/xics: use the QOM interface to get irqs
  ppc/xics: use the QOM interface under the sPAPR machine
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/mcayland/tags/qemu-openbios-signed' into staging

Update OpenBIOS images

# gpg: Signature made Tue 28 Feb 2017 22:09:11 GMT
# gpg:                using RSA key 0x5BC2C56FAE0F321F
# gpg: Good signature from "Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>"
# Primary key fingerprint: CC62 1AB9 8E82 200D 915C  C9C4 5BC2 C56F AE0F 321F

* remotes/mcayland/tags/qemu-openbios-signed:
  Update OpenBIOS images to 0cd97cc built from submodule.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/ehabkost/tags/x86-pull-request' into staging

x86 queue, 2017-02-27

"-cpu max" and query-cpu-model-expansion support for x86. This
should be the last x86 pull request before 2.9 soft freeze.

# gpg: Signature made Mon 27 Feb 2017 16:24:15 GMT
# gpg:                using RSA key 0x2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF  D1AA 2807 936F 984D C5A6

* remotes/ehabkost/tags/x86-pull-request:
  i386: Improve query-cpu-model-expansion full mode
  i386: Implement query-cpu-model-expansion QMP command
  i386: Define static "base" CPU model
  i386: Don't set CPUClass::cpu_def on "max" model
  i386: Make "max" model not use any host CPUID info on TCG
  i386: Create "max" CPU model
  qapi-schema: Comment about full expansion of non-migration-safe models
  i386: Reorganize and document CPUID initialization steps
  i386: Rename X86CPU::host_features to X86CPU::max_features
  i386: Add ordering field to CPUClass
  i386: Unset cannot_destroy_with_object_finalize_yet on "host" model

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/kraxel/tags/pull-seabios-20170228-1' into staging

seabios: update to 1.10.2 release

# gpg: Signature made Tue 28 Feb 2017 08:57:57 GMT
# gpg:                using RSA key 0x4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"
# Primary key fingerprint: A032 8CFF B93A 17A7 9901  FE7D 4CB6 D8EE D3E8 7138

* remotes/kraxel/tags/pull-seabios-20170228-1:
  seabios: update to 1.10.2 release

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-20170301' into staging

Queued TCG patch

# gpg: Signature made Tue 28 Feb 2017 21:30:32 GMT
# gpg:                using RSA key 0xAD1270CC4DD0279B
# gpg: Good signature from "Richard Henderson <rth7680@gmail.com>"
# gpg:                 aka "Richard Henderson <rth@redhat.com>"
# gpg:                 aka "Richard Henderson <rth@twiddle.net>"
# Primary key fingerprint: 9CB1 8DDA F8E8 49AD 2AFC  16A4 AD12 70CC 4DD0 279B

* remotes/rth/tags/pull-tcg-20170301:
  aarch64: Change ext type to TCGType to fix warnings

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging

Block layer patches

# gpg: Signature made Tue 28 Feb 2017 20:35:32 GMT
# gpg:                using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74  56FE 7F09 B272 C88F 2FD6

* remotes/kevin/tags/for-upstream: (46 commits)
  block: Add Error parameter to bdrv_append()
  block: Add Error parameter to bdrv_set_backing_hd()
  block: Assertions for resize permission
  block: Assertions for write permissions
  block: Pass BdrvChild to bdrv_aligned_preadv/pwritev and copy-on-read
  tests: Remove FIXME comments
  nbd/server: Use real permissions for NBD exports
  migration/block: Use real permissions
  hmp: Request permissions in qemu-io
  commit: Add filter-node-name to block-commit
  mirror: Add filter-node-name to blockdev-mirror
  stream: Use real permissions in streaming block job
  mirror: Use real permissions in mirror/active commit block job
  blockjob: Factor out block_job_remove_all_bdrv()
  block: Allow backing file links in change_parent_backing_link()
  block: BdrvChildRole.attach/detach() callbacks
  block: Fix pending requests check in bdrv_append()
  backup: Use real permissions in backup block job
  commit: Use real permissions for HMP 'commit'
  commit: Use real permissions in commit block job
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/sstabellini/tags/xen-20170228-tag' into staging

Xen 2017/02/28

# gpg: Signature made Tue 28 Feb 2017 19:13:08 GMT
# gpg:                using RSA key 0x894F8F4870E1AE90
# gpg: Good signature from "Stefano Stabellini <sstabellini@kernel.org>"
# gpg:                 aka "Stefano Stabellini <stefano.stabellini@eu.citrix.com>"
# Primary key fingerprint: D04E 33AB A51F 67BA 07D3  0AEA 894F 8F48 70E1 AE90

* remotes/sstabellini/tags/xen-20170228-tag:
  Add a new qmp command to do checkpoint, query xen replication status
  Add a new qmp command to start/stop replication

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

target/sparc: Restore ldstub of odd asis

Fixes the booting of ss20 roms.

Cc: qemu-stable@nongnu.org
Reported-by: Michael Russo <mike@papersolve.com>
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Richard Henderson <rth@twiddle.net>

Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20170228-1' into staging

target-arm queue:
* raspi2: add gpio controller and sdhost controller, with
   the wiring so the guest can switch which controller the
   SD card is attached to
   (this is sufficient to get raspbian kernels to boot)
* GICv3: support state save/restore from KVM
* update Linux headers to 4.11
* refactor and QOMify the ARMv7M container object

# gpg: Signature made Tue 28 Feb 2017 17:11:49 GMT
# gpg:                using RSA key 0x3C2525ED14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg:                 aka "Peter Maydell <pmaydell@gmail.com>"
# gpg:                 aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83  15CF 3C25 25ED 1436 0CDE

* remotes/pmaydell/tags/pull-target-arm-20170228-1: (21 commits)
  bcm2835: add sdhost and gpio controllers
  bcm2835_gpio: add bcm2835 gpio controller
  hw/sd: add card-reparenting function
  qdev: Have qdev_set_parent_bus() handle devices already on a bus
  hw/intc/arm_gicv3_kvm: Reset GICv3 cpu interface registers
  target-arm: Add GICv3CPUState in CPUARMState struct
  hw/intc/arm_gicv3_kvm: Implement get/put functions
  hw/intc/arm_gicv3_kvm: Add ICC_SRE_EL1 register to vmstate
  update Linux headers to 4.11
  update-linux-headers: update for 4.11
  stm32f205: Rename 'nvic' local to 'armv7m'
  stm32f205: Create armv7m object without using armv7m_init()
  armv7m: Split systick out from NVIC
  armv7m: Don't put core v7M devices under CONFIG_STELLARIS
  armv7m: Make bitband device take the address space to access
  armv7m: Make NVIC expose a memory region rather than mapping itself
  armv7m: Make ARMv7M object take memory region link
  armv7m: Use QOMified armv7m object in armv7m_init()
  armv7m: QOMify the armv7m container
  armv7m: Move NVICState struct definition into header
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

audio/sdlaudio: Allow audio playback with SDL2

When compiling with SDL2, the semaphore trick used in sdlaudio.c
does not work - QEMU locks up completely in this case. To avoid
the hang and get at least some audio playback up and running (it's
a little bit crackling, but better than nothing), we can use the
SDL locking functions SDL_LockAudio() and SDL_UnlockAudio() to sync
with the sound playback thread instead.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-id: 1485852398-2327-1-git-send-email-thuth@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>

audio: make audio poll timer deterministic

This patch changes resetting strategy of the audio polling timer.
It does not change expiration time if the timer is already set.
This patch is needed to make this timer deterministic and to use execution
record/replay for audio devices.

audio_reset_timer is used in the function audio_vm_change_state_handler.
Therefore every time VM is stopped or restarted the timer will be reset
to new timeout. Virtual clock does not proceed while VM is stopped.
Therefore there is no need in resetting the timeout when VM restarts.

v2: updated commit message
v3: now using timer_mod_anticipate function (as suggested by Yurii Zubrytskyi)

Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Message-id: 20170214071510.6112.76764.stgit@PASHA-ISP
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>

replay: add record/replay for audio passthrough

This patch adds recording and replaying audio data. Is saves synchronization
information for audio out and inputs from the microphone.

v2: removed unneeded whitespace change

Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Message-id: 20170202055054.4848.94901.stgit@PASHA-ISP.lan02.inno

[ kraxel: add qemu/error-report.h include to fix osx build failure ]

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>

Merge remote-tracking branch 'remotes/gkurz/tags/cve-2016-9602-for-upstream' into staging

This pull request have all the fixes for CVE-2016-9602, so that it can
be easily picked up by downstreams, as suggested by Michel Tokarev.

# gpg: Signature made Tue 28 Feb 2017 10:21:32 GMT
# gpg:                using DSA key 0x02FC3AEB0101DBC2
# gpg: Good signature from "Greg Kurz <groug@kaod.org>"
# gpg:                 aka "Greg Kurz <groug@free.fr>"
# gpg:                 aka "Greg Kurz <gkurz@linux.vnet.ibm.com>"
# gpg:                 aka "Gregory Kurz (Groug) <groug@free.fr>"
# gpg:                 aka "[jpeg image of size 3330]"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 2BD4 3B44 535E C0A7 9894  DBA2 02FC 3AEB 0101 DBC2

* remotes/gkurz/tags/cve-2016-9602-for-upstream: (28 commits)
  9pfs: local: drop unused code
  9pfs: local: open2: don't follow symlinks
  9pfs: local: mkdir: don't follow symlinks
  9pfs: local: mknod: don't follow symlinks
  9pfs: local: symlink: don't follow symlinks
  9pfs: local: chown: don't follow symlinks
  9pfs: local: chmod: don't follow symlinks
  9pfs: local: link: don't follow symlinks
  9pfs: local: improve error handling in link op
  9pfs: local: rename: use renameat
  9pfs: local: renameat: don't follow symlinks
  9pfs: local: lstat: don't follow symlinks
  9pfs: local: readlink: don't follow symlinks
  9pfs: local: truncate: don't follow symlinks
  9pfs: local: statfs: don't follow symlinks
  9pfs: local: utimensat: don't follow symlinks
  9pfs: local: remove: don't follow symlinks
  9pfs: local: unlinkat: don't follow symlinks
  9pfs: local: lremovexattr: don't follow symlinks
  9pfs: local: lsetxattr: don't follow symlinks
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

mach-virt: Provide sample configuration files

These are very much like the sample configuration files
for q35, and can be used both as documentation and as
a starting point for creating your own guest.

Two sample configuration files are provided:

  * mach-virt-graphical.cfg can be used to start a
    fully-featured (USB, graphical console, etc.)
    guest that uses VirtIO devices;

  * mach-virt-serial.cfg is similar but has a minimal
    set of devices and uses the serial console.

All configuration files are fully commented and neatly
organized.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Message-id: 1487326479-8664-3-git-send-email-abologna@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>

q35: Improve sample configuration files

Instead of having a single sample configuration file,
we now have several:

  * q35-emulated.cfg documents the default devices QEMU
    adds to a q35 guest and the additional devices that
    are pretty much guaranteed to be present in a
    physical q35-based machine;

  * q35-virtio-graphical.cfg can be used to start a
    fully-featured (USB, graphical console, audio, etc.)
    guest that uses VirtIO instead of emulated devices;

  * q35-virtio-serial.cfg is similar but has a minimal
    set of devices and uses the serial console.

All configuration files are fully commented and neatly
organized.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Message-id: 1487326479-8664-2-git-send-email-abologna@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>

Merge remote-tracking branch 'remotes/famz/tags/docker-pull-request' into staging

# gpg: Signature made Tue 28 Feb 2017 12:40:00 GMT
# gpg:                using RSA key 0xCA35624C6A9171C6
# gpg: Good signature from "Fam Zheng <famz@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 5003 7CB7 9706 0F76 F021  AD56 CA35 624C 6A91 71C6

* remotes/famz/tags/docker-pull-request:
  .shippable: add s390x-cross target
  new: dockerfiles/debian-s390-cross

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

tests: fix virtio-blk-test leaks

Use qvirtio_pci_device_find_slot() to avoid leaking the non-hp
device. Add assert() to avoid further leaks in the future.

Use qvirtio_pci_device_free() to correctly free QVirtioPCIDevice.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>

tests: add specialized device_find function

Allow specifying which slot to look for the device.

This will be used in the following patch to avoid leaking when multiple
devices exists and we want to lookup the hotplug one.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>

tests: fix usb-test leaks

Fix the usb tests leaks.

Spotted by ASAN.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>

tests: allows to run single test in usb-hcd-ehci-test

pci_init() shouldn't be a test function, but instead called before any
test. This allows to run a single test with -p /x86_64/ehci/....

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>

usb: release the created buses

Leaks spotted by ASAN.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>

bus: do not unref hotplug handler

Apparently, none of the bus owner give a reference to the hotplug
handler property, do not unref it on bus release.

Furthermore, a bus is allowed to be its own hotplug handler, which can
be seen in qbus_set_bus_hotplug_handler() function. However, in this
case, the reference can't be given to the property, or this will create
a cyclic dependency and the bus will never be free.

Each bus owner should manage the lifecycle of the hotplug handler.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>

tests: fix virtio-9p-test leaks

Spotted by ASAN.

Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>

tests: fix virtio-scsi-test leak

Spotted by ASAN.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>

tests: fix e1000e leaks

Spotted by ASAN.

This hunk adds an assertion. It checks that we're finding no more than
one e1000e device: each hit allocates, but there is only one g_free().

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>

tests: fix i440fx-test leaks

Spotted by ASAN.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>

tests: fix e1000-test leak

Spotted by ASAN.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>

tests: fix tco-test leaks

Spotted by ASAN.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>

tests: fix eepro100-test leak

Spotted by ASAN.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Weil <sw@weilnetz.de>

pc: pcihp: avoid adding ACPI_PCIHP_PROP_BSEL twice

PCI hotplug for bridges was introduced only since 2.0 however
acpi_set_bsel()->object_property_add_uint32_ptr(bus, ACPI_PCIHP_PROP_BSEL)
didn't take in account that for legacy mode (1.7) when
PCI hotplug for bridges is unavailable and ACPI_PCIHP_PROP_BSEL property
the only bus "PCI.0' has been created earlier at acpi_pcihp_init() time.

We managed to live with it only because of error rised by adding
a duplicate property in acpi_set_bsel() has been ignored which
resulted in useless leaking of just allocated (int)bus_bsel.

Issue affects only 1.7 machine type as ACPI tables supported by
QEMU were introduced at that time, but there wasn't PCI hotplug
for bridges till the next release (2.0).

Fix it by removing duplicate ACPI_PCIHP_PROP_BSEL intialization
in acpi_pcihp_init() and doing it only in one place acpi_set_pci_info().

PS:
do not ignore error returned by object_property_add_uint32_ptr()
and abort QEMU since it's programming error which should be fixed
instead of being ignored.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reported-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1470211497-116801-1-git-send-email-imammedo@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
[ Marc-André - Remove now unused ACPI_PCIHP_LEGACY_SIZE ]
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>

tests: fix ipmi-bt-test leak

Spotted by ASAN.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>

tests: fix ipmi-kcs-test leak

Spotted by ASAN.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>

tests: fix bios-tables-test leak

The inside array should be free too.
Spotted by ASAN.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>

tests: fix hd-geo-test leaks

Spotted by ASAN.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>

tests: fix ide-test leaks

Spotted by ASAN.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>

tests: fix vhost-user-test leaks

Spotted by ASAN.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>

tests: fix q35-test leaks

Spotted by ASAN.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>

Add PowerPC 32-bit guest memory dump support

This patch extends support for the `dump-guest-memory` command to the
32-bit PowerPC architecture. It relies on the assumption that a 64-bit
guest will not dump a 32-bit core file (and vice versa).

[dwg: I suspect this patch won't cover all cases, in particular a
32-bit machine type on a 64-bit qemu build. However, it does strictly
more than what we had before, so might as well apply as a starting
point]

Signed-off-by: Mike Nawrocki <michael.nawrocki@gtri.gatech.edu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

ppc/xics: rename 'ICPState *' variables to 'icp'

'ICPState *' variables are currently named 'ss'. This is confusing, so
let's give them an appropriate name: 'icp'.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

ppc/xics: move InterruptStatsProvider to the sPAPR machine

It provides a better monitor output of the ICP and ICS objects, else
the objects are printed out of order.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

ppc/xics: move ics-simple post_load under the machine

The ICS object uses a post_load() handler which is implicitly relying
on the fact that the internal state of the ICS and ICP objects has
been restored but this is not guaranteed. So, let's move the code
under the post_load() handler of the machine where we know the objects
have been fully restored.

The icp_resend() handler of the XICSFabric QOM interface is also
removed as it is now obsolete.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

ppc/xics: remove the XICSState classes

The XICSState classes are not used anymore. They have now been fully
deprecated by the XICSFabric QOM interface. Do the cleanups.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

ppc/xics: export the XICS init routines

There is nothing left related to the XICS object in the realize
functions of the KVMXICSState and XICSState class. So adapt the
interfaces to call these routines directly from the sPAPR machine init
sequence.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

ppc/xics: move the ICP array under the sPAPR machine

This is the last step to remove the XICSState abstraction and have the
machine hold all the objects related to interrupts : ICSs and ICPs.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

ppc/xics: register the reset handler of ICP objects

The reset of the ICP objects is currently handled by XICS but this can
be done for each individual ICP.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

ppc/xics: simplify spapr_dt_xics() interface

spapr_dt_xics() only needs the number of servers to build the device
tree nodes. Let's change the routine interface to reflect that.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

ppc/xics: use the QOM interface to grab an ICP

Also introduce a xics_icp_get() helper to simplify the changes.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

ppc/xics: move the cpu_setup() handler under the ICPState class

The cpu_setup() handler is currently under the XICSState class but it
really belongs under ICPState as it is setting up an individual vCPU.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

ppc/xics: simplify the cpu_setup() handler

The cpu_setup() handler currently takes a 'XICSState *' argument to
grab the kernel ICP file descriptor. This interface can be simplified
by using the 'xics' backlink of the ICP object.

This change is also required by subsequent patches which makes use of
the QOM interface for XICS.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

ppc/xics: move kernel_xics_fd out of KVMXICSState

The kernel ICP file descriptor is the only reason behind the
KVMXICSState class and it's in the way of more cleanups. Let's make it
a static for the moment and move forward.

If this is problem, we could use an attribute under the sPAPR machine
later on.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

ppc/xics: extend the QOM interface to handle ICPs

Let's add two new handlers for ICPs. One is to get an ICP object from
a server number and a second is to resend the irqs when needed.

The icp_resend() handler is a temporary workaround needed by the
ics-simple post_load() handler. It will be removed when the post_load
portion can be done at the machine level.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

ppc/xics: remove the XICS list of ICS

This is not used anymore.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

ppc/xics: register the reset handler of ICS objects

The reset of the ICS objects is currently handled by XICS but this can
be done for each individual ICS. This also reduces the use of the XICS
list of ICS.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

ppc/xics: remove xics_find_source()

It is not used anymore now that we have the QOM interface for XICS.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

ppc/xics: use the QOM interface to resend irqs

Also change the ICPState 'xics' backlink to be a XICSFabric, this
removes the need of using qdev_get_machine() to get the QOM interface
in some of the routines.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

ppc/xics: use the QOM interface to get irqs

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

ppc/xics: use the QOM interface under the sPAPR machine

Add 'ics_get' and 'ics_resend' handlers to the sPAPR machine. These
are relatively simple for a single ICS.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

ppc/xics: introduce a XICSFabric QOM interface to handle ICSs

This interface provides two simple handlers. One is to get an ICS
(Interrupt Source Controller) object from an irq number and a second
to resend the irqs when needed.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

ppc/xics: add an InterruptStatsProvider interface to ICS and ICP objects

This is, again, to reduce the use of the list of ICS objects. Let's
make each individual ICS and ICP object an InterruptStatsProvider and
remove this same interface from XICSState.

The InterruptStatsProvider will be moved at the machine level after
the XICS cleanups are completed.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

ppc/xics: store the ICS object under the sPAPR machine

A list of ICS objects was introduced under the XICS object for the
PowerNV machine but, for the sPAPR machine, it brings extra complexity
as there is only a single ICS. To simplify the code, let's add the ICS
pointer under the sPAPR machine and try to reduce the use of this list
where possible.

Also, change the xics_spapr_*() routines to use an ICS object instead
of an XICSState and change their name to reflect that these are
specific to the sPAPR ICS object.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

ppc/xics: remove set_nr_servers() handler from XICSStateClass

Today, the ICP (Interrupt Controller Presenter) objects are created by
the 'nr_servers' property handler of the XICS object and a class
handler. They are realized in the XICS object realize routine.

Let's simplify the process by creating the ICP objects along with the
XICS object at the machine level.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

ppc/xics: remove set_nr_irqs() handler from XICSStateClass

Today, the ICS (Interrupt Controller Source) object is created and
realized by the init and realize routines of the XICS object, but some
of the parameters are only known at the machine level.

These parameters are passed from the sPAPR machine to the ICS object
in a rather convoluted way using property handlers and a class handler
of the XICS object. The number of irqs required to allocate the IRQ
state objects in the ICS realize routine is one of them.

Let's simplify the process by creating the ICS object along with the
XICS object at the machine level and link the ICS into the XICS list
of ICSs at this level also. In the sPAPR machine, there is only a
single ICS but that will change with the PowerNV machine.

Also, QOMify the creation of the objects and get rid of the
superfluous code.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

xics: XICS should not be a SysBusDevice

Currently xics - the component of the IBM POWER interrupt controller
representing the overall interrupt fabric / architecture is
represented as a descendent of SysBusDevice.  However, this is not
really correct - the xics presents nothing in MMIO space so it should
be an "unattached" device in the current QOM model.

Since this device will always be created by the machine type, not created
specifically from the command line, and because it has no migrated state
it should be safe to move it around the device composition tree.

Therefore this patch changes it to a descendent of TYPE_DEVICE, and
makes it an unattached device.  So that its reset handler still gets
called correctly, we add a qdev_set_parent_bus() to attach it to
sysbus.  It's not really clear that's correct (instead of using
register_reset()) but it appears to a common technique.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
[clg corrected problems with reset]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
[dwg folded together and updated commit message]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

spapr/pci: populate PCI DT in reverse order

Since commit 1d2d974244c6 "spapr_pci: enumerate and add PCI device tree", QEMU
populates the PCI device tree in the opposite order compared to SLOF.

Before 1d2d974244c6:

Populating /pci@800000020000000
                     00 0000 (D) : 1af4 1000    virtio [ net ]
                     00 0800 (D) : 1af4 1001    virtio [ block ]
                     00 1000 (D) : 1af4 1009    virtio [ network ]
Populating /pci@800000020000000/unknown-legacy-device@2

7e5294b8 :  /pci@800000020000000
7e52b998 :  |-- ethernet@0
7e52c0c8 :  |-- scsi@1
7e52c7e8 :  +-- unknown-legacy-device@2 ok

Since 1d2d974244c6:

Populating /pci@800000020000000
                     00 1000 (D) : 1af4 1009    virtio [ network ]
Populating /pci@800000020000000/unknown-legacy-device@2
                     00 0800 (D) : 1af4 1001    virtio [ block ]
                     00 0000 (D) : 1af4 1000    virtio [ net ]

7e5e8118 :  /pci@800000020000000
7e5ea6a0 :  |-- unknown-legacy-device@2
7e5eadb8 :  |-- scsi@1
7e5eb4d8 :  +-- ethernet@0 ok

This behaviour change is not actually a bug since no assumptions should be
made on DT ordering. But it has no real justification either, other than
being the consequence of the way fdt_add_subnode() inserts new elements
to the front of the FDT rather than adding them to the tail.

This patch reverts to the historical SLOF ordering by walking PCI devices
in reverse order. This reconciles pseries with x86 machine types behavior.
It is expected to make things easier when porting existing applications to
power.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
(slight update to the changelog)
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

target/ppc: add mcrxrx instruction

mcrxrx: Move to CR from XER Extended

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

target/ppc: add ov32 flag in divide operations

Add helper_div_compute_ov() in the int_helper for updating the overflow
flags.

For Divide Word:
SO, OV, and OV32 bits reflects overflow of the 32-bit result

For Divide DoubleWord:
SO, OV, and OV32 bits reflects overflow of the 64-bit result

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

target/ppc: add ov32 flag for multiply low insns

For Multiply Word:
SO, OV, and OV32 bits reflects overflow of the 32-bit result

For Multiply DoubleWord:
SO, OV, and OV32 bits reflects overflow of the 64-bit result

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

target/ppc: use tcg ops for neg instruction

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

target/ppc: update overflow flags for add/sub

* SO and OV reflects overflow of the 64-bit result in 64-bit mode and
overflow of the low-order 32-bit result in 32-bit mode

* OV32 reflects overflow of the low-order 32-bit independent of the mode

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

target/ppc: update ca32 in arithmetic substract

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

target/ppc: update ca32 in arithmetic add

Adds routine to compute ca32 - gen_op_arith_compute_ca32

For 64-bit mode use the compute ca32 routine. While for 32-bit mode, CA
and CA32 will have same value.

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

target/ppc: support for 32-bit carry and overflow

POWER ISA 3.0 adds CA32 and OV32 status in 64-bit mode. Add the flags
and corresponding defines.

Moreover, CA32 is updated when CA is updated and OV32 is updated when OV
is updated.

Arithmetic instructions:
    * Addition and Substractions:

        addic, addic., subfic, addc, subfc, adde, subfe, addme, subfme,
        addze, and subfze always updates CA and CA32.

        => CA reflects the carry out of bit 0 in 64-bit mode and out of
           bit 32 in 32-bit mode.
        => CA32 reflects the carry out of bit 32 independent of the
           mode.

        => SO and OV reflects overflow of the 64-bit result in 64-bit
           mode and overflow of the low-order 32-bit result in 32-bit
           mode
        => OV32 reflects overflow of the low-order 32-bit independent of
           the mode

    * Multiply Low and Divide:

        For mulld, divd, divde, divdu and divdeu: SO, OV, and OV32 bits
        reflects overflow of the 64-bit result

        For mullw, divw, divwe, divwu and divweu: SO, OV, and OV32 bits
        reflects overflow of the 32-bit result

     * Negate with OE=1 (nego)

       For 64-bit mode if the register RA contains
       0x8000_0000_0000_0000, OV and OV32 are set to 1.

       For 32-bit mode if the register RA contains 0x8000_0000, OV and
       OV32 are set to 1.

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

target/ppc: Correct SDR1 masking

SDR_64_HTABORG, which indicates the bits of the SDR1 register to use for
the base of a 64-bit machine's hashed page table (HPT) isn't correct.  It
includes the top 46 bits of the register, but in fact the top 4 bits must
be zero (according to the ISA v2.07).  No actual implementation has
supported close to 2^60 bytes of physical address space, so it's kind of
irrelevant, but we might as well correct this.

In addition, although we checked for bad size values in SDR1, we never
reported an error if entirely invalid bits were set there.  Add this check
to ppc_store_sdr1().

Reported-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

target/ppc: Remove the function ppc_hash64_set_sdr1()

The function ppc_hash64_set_sdr1 basically checked the htabsize and set an
error if it was too big, otherwise it just stored the value in SPR_SDR1.

Given that the only function which calls ppc_hash64_set_sdr1() is
ppc_store_sdr1(), why not handle the checking in ppc_store_sdr1() to avoid
the extra function call. Note that ppc_store_sdr1() already stores the
value in SPR_SDR1 anyway, so we were doing it twice.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
[dwg: Remove unnecessary error temporary]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

target/ppc: Manage external HPT via virtual hypervisor

The pseries machine type implements the behaviour of a PAPR compliant
hypervisor, without actually executing such a hypervisor on the virtual
CPU.  To do this we need some hooks in the CPU code to make hypervisor
facilities get redirected to the machine instead of emulated internally.

For hypercalls this is managed through the cpu->vhyp field, which points
to a QOM interface with a method implementing the hypercall.

For the hashed page table (HPT) - also a hypervisor resource - we use an
older hack.  CPUPPCState has an 'external_htab' field which when non-NULL
indicates that the HPT is stored in qemu memory, rather than within the
guest's address space.

For consistency - and to make some future extensions easier - this merges
the external HPT mechanism into the vhyp mechanism.  Methods are added
to vhyp for the basic operations the core hash MMU code needs: map_hptes()
and unmap_hptes() for reading the HPT, store_hpte() for updating it and
hpt_mask() to retrieve its size.

To match this, the pseries machine now sets these vhyp fields in its
existing vhyp class, rather than reaching into the cpu object to set the
external_htab field.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>

target/ppc: Eliminate htab_base and htab_mask variables

CPUPPCState includes fields htab_base and htab_mask which store the base
address (GPA) and size (as a mask) of the guest's hashed page table (HPT).
These are set when the SDR1 register is updated.

Keeping these in sync with the SDR1 is actually a little bit fiddly, and
probably not useful for performance, since keeping them expands the size of
CPUPPCState.  It also makes some upcoming changes harder to implement.

This patch removes these fields, in favour of calculating them directly
from the SDR1 contents when necessary.

This does make a change to the behaviour of attempting to write a bad value
(invalid HPT size) to the SDR1 with an mtspr instruction.  Previously, the
bad value would be stored in SDR1 and could be retrieved with a later
mfspr, but the HPT size as used by the softmmu would be, clamped to the
allowed values.  Now, writing a bad value is treated as a no-op.  An error
message is printed in both new and old versions.

I'm not sure which behaviour, if either, matches real hardware.  I don't
think it matters that much, since it's pretty clear that if an OS writes
a bad value to SDR1, it's not going to boot.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>

target/ppc: Cleanup HPTE accessors for 64-bit hash MMU

Accesses to the hashed page table (HPT) are complicated by the fact that
the HPT could be in one of three places:
   1) Within guest memory - when we're emulating a full guest CPU at the
      hardware level (e.g. powernv, mac99, g3beige)
   2) Within qemu, but outside guest memory - when we're emulating user and
      supervisor instructions within TCG, but instead of emulating
      the CPU's hypervisor mode, we just emulate a hypervisor's behaviour
      (pseries in TCG or KVM-PR)
   3) Within the host kernel - a pseries machine using KVM-HV
      acceleration.  Mostly accesses to the HPT are handled by KVM,
      but there are a few cases where qemu needs to access it via a
      special fd for the purpose.

In order to batch accesses to the fd in case (3), we use a somewhat awkward
ppc_hash64_start_access() / ppc_hash64_stop_access() pair, which for case
(3) reads / releases several HPTEs from the kernel as a batch (usually a
whole PTEG).  For cases (1) & (2) it just returns an address value.  The
actual HPTE load helpers then need to interpret the returned token
differently in the 3 cases.

This patch keeps the same basic structure, but simplfiies the details.
First start_access() / stop_access() are renamed to map_hptes() and
unmap_hptes() to make their operation more obvious.  Second, map_hptes()
now always returns a qemu pointer, which can always be used in the same way
by the load_hpte() helpers.  In case (1) it comes from address_space_map()
in case (2) directly from qemu's HPT buffer and in case (3) from a
temporary buffer read from the KVM fd.

While we're at it, make things a bit more consistent in terms of types and
variable names: avoid variables named 'index' (it shadows index(3) which
can lead to confusing results), use 'hwaddr ptex' for HPTE indices and
uint64_t for each of the HPTE words, use ptex throughout the call stack
instead of pte_offset in some places (we still need that at the bottom
layer, but nowhere else).

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

target/ppc: SDR1 is a hypervisor resource

At present the SDR1 register - the base of the system's hashed page table
(HPT) - is represented as an SPR with supervisor read and write permission.
However, on CPUs which have a hypervisor mode, the SDR1 is a hypervisor
only resource. Change the permission checking on the SPR to reflect this.

Now that this is done, we don't need to check for an external HPT executing
mtsdr1: an external HPT only applies when we're emulating the behaviour of
a hypervisor, rather than modelling the CPU's hypervisor mode internally,
so if we're permitted to execute mtsdr1, we don't have an external HPT.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>

target/ppc: Merge cpu_ppc_set_vhyp() with cpu_ppc_set_papr()

cpu_ppc_set_papr() sets up various aspects of CPU state for use with PAPR
paravirtualized guests. However, it doesn't set the virtual hypervisor,
so callers must also call cpu_ppc_set_vhyp() so that PAPR hypercalls are
handled properly. This is a bit silly, so fold setting the virtual
hypervisor into cpu_ppc_set_papr().

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>

pseries: Minor cleanups to HPT management hypercalls

* Standardize on 'ptex' instead of 'pte_index' for HPTE index variables
   for consistency and brevity
* Avoid variables named 'index'; shadowing index(3) from libc can lead to
   surprising bugs if the variable is removed, because compiler errors
   might not appear for remaining references
* Clarify index calculations in h_enter() - we have two cases, H_EXACT
   where the exact HPTE slot is given, and !H_EXACT where we search for
   an empty slot within the hash bucket.  Make the calculation more
   consistent between the cases.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>

target/ppc: Fix KVM-HV HPTE accessors

When a 'pseries' guest is running with KVM-HV, the guest's hashed page
table (HPT) is stored within the host kernel, so it is not directly
accessible to qemu.  Most of the time, qemu doesn't need to access it:
we're using the hardware MMU, and KVM itself implements the guest
hypercalls for manipulating the HPT.

However, qemu does need access to the in-KVM HPT to implement
get_phys_page_debug() for the benefit of the gdbstub, and maybe for
other debug operations.

To allow this, 7c43bca "target-ppc: Fix page table lookup with kvm
enabled" added kvmppc_hash64_read_pteg() to target/ppc/kvm.c to read
in a batch of HPTEs from the KVM table.  Unfortunately, there are a
couple of problems with this:

First, the name of the function implies it always reads a whole PTEG
from the HPT, but in fact in some cases it's used to grab individual
HPTEs (which ends up pulling 8 HPTEs, not aligned to a PTEG from the
kernel).

Second, and more importantly, the code to read the HPTEs from KVM is
simply wrong, in general.  The data from the fd that KVM provides is
designed mostly for compact migration rather than this sort of one-off
access, and so needs some decoding for this purpose.  The current code
will work in some cases, but if there are invalid HPTEs then it will
not get sane results.

This patch rewrite the HPTE reading function to have a simpler
interface (just read n HPTEs into a caller provided buffer), and to
correctly decode the stream from the kernel.

For consistency we also clean up the similar function for altering
HPTEs within KVM (introduced in c138593 "target-ppc: Update
ppc_hash64_store_hpte to support updating in-kernel htab").

Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

sysemu: support up to 1024 vCPUs

Some systems can already provide more than 255 hardware threads.

Bumping the QEMU limit to 1024 seems reasonable:
- it has no visible overhead in top;
- the limit itself has no effect on hot paths.

Cc: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>