target/ppc: Move no-execute and guarded page checking into new function
A pte entry has bit fields which can be used to make a page no-execute or
guarded, if either of these bits are set then an instruction access to this
page will fail. Currently these bits are checked with the pp_prot function
however the ISA specifies that the access authority controlled by the
key-pp value pair should only be checked on an instruction access after
the no-execute and guard bits have already been verified to permit the
access.
Move the no-execute and guard bit checking into a new separate function.
Note that we can remove the check for the no-execute bit in the slb entry
since this check was already performed above when we obtained the slb
entry.
In the event that the no-execute or guard bits are set, an ISI should be
generated with the SRR1_NOEXEC_GUARD (0x10000000) bit set in srr1. Add a
define for this for clarity.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
[dwg: Move constants to cpu.h since they're not MMUv3 specific] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
target/ppc: Add execute permission checking to access authority check
Basic storage protection defines various access authority permissions
based on a slb storage key and pte pp value pair. This access authority
defines read, write and execute permissions however currently we only
use this to control read and write permissions and ignore the execute
control.
Fix the code to allow execute permissions based on the key-pp value pair.
Execute is allowed under the same conditions which enable reads.
(i.e. read permission -> execute permission)
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The instruction authority mask register (IAMR) can be used to restrict
permissions for instruction fetch accesses on a per key basis for each
of 32 different key values. Access permissions are derived based on the
specific key value stored in the relevant page table entry.
The IAMR was introduced in, and is present in processors since, POWER8
(ISA v2.07). Thus introduce a function to check access permissions based
on the pte key value and the contents of the IAMR when handling a page
fault to ensure sufficient access permissions for an instruction fetch.
A hash pte contains a key value in bits 2:3|52:54 of the second double word
of the pte, this key value gives an index into the IAMR which contains 32
2-bit access masks. If the least significant bit of the 2-bit access mask
corresponding to the given key value is set (IAMR[key] & 0x1 == 0x1) then
the instruction fetch is not permitted and an ISI is generated accordingly.
While we're here, add defines for the srr1 bits to be set for the ISI for
clarity.
Least significant bit of the access mask is set, thus the instruction fetch
is not permitted. We should generate an instruction storage interrupt (ISI)
with bit 42 of SRR1 set to indicate access precluded by virtual page class
key protection.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
[dwg: Move new constants to cpu.h, since they're not MMUv3 specific] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
target/ppc/POWER9: Add cpu_has_work function for POWER9
The cpu has work function is used to mask interrupts used to determine
if there is work for the cpu based on the LPCR. Add a function to do this
for POWER9 and add it to the POWER9 cpu definition. This is similar to that
for POWER8 except using the LPCR bits as defined for POWER9.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Add a pa-features definition which includes all of the new fields which
have been added, note we don't claim support for any of these new features
at this stage.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Acked-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
target/ppc: Don't gen an SDR1 on POWER9 and rework register creation
POWER9 doesn't have a storage description register 1 (SDR1) which is used
to store the base and size of the hash table. Thus we don't need to
generate this register on the POWER9 cpu model. While we're here, the
register generation code for 970, POWER5+, POWER<7/8/9> in general is a
mess where we call a generic function from a model specific function which
then attempts to call model specific functions, so rework this for
readability.
We update ppc_cpu_dump_state so that "info registers" will only display
the value of sdr1 if the register has been generated.
As mentioned above the register generation for the pcc->init_proc
function for 970, POWER5+, POWER7, POWER8 and POWER9 has been reworked
for improved clarity. Instead of calling init_proc_book3s_64 which then
attempts to generate the correct registers through a mess of if statements,
we remove this function and instead call the appropriate register
generation functions directly. This follows the register generation model
used for earlier cpu models (pre-970) whereby cpu specific registers are
generated directly in the init_proc function and makes it easier to
add/remove specific registers for new cpu models.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
ISA v3.00 adds the idea of a partition table which is used to store the
address translation details for all partitions on the system. The partition
table consists of double word entries indexed by partition id where the second
double word contains the location of the process table in guest memory. The
process table is registered by the guest via a h-call.
We need somewhere to store the address of the process table so we add an entry
to the sPAPRMachineState struct called patb_entry to represent the second
doubleword of a single partition table entry corresponding to the current
guest. We need to store this value so we know if the guest is using radix or
hash translation and the location of the corresponding process table in guest
memory. Since we only have a single guest per qemu instance, we only need one
entry.
Since the partition table is technically a hypervisor resource we require that
access to it is abstracted by the virtual hypervisor through the get_patbe()
call. Currently the value of the entry is never set (and thus
defaults to 0 indicating hash), but it will be required to both implement
POWER9 kvm support and tcg radix support.
We also add this field to be migrated as part of the sPAPRMachineState as we
will need it on the receiving side as the guest will never tell us this
information again and we need it to perform translation.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
David Gibson [Thu, 2 Mar 2017 01:13:07 +0000 (12:13 +1100)]
target/ppc/POWER9: Add POWERPC_MMU_V3 bit
For easier handling of future processors using the POWER9 or something
close to it, add a new bit in the MMU model. This was originally from a
revised version of 86cf1e9 "target/ppc/POWER9: Add ISAv3.00 MMU definition"
but the older version of the patch was already merged. This makes the
change on top of the original version.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
David Gibson [Thu, 2 Mar 2017 04:29:31 +0000 (15:29 +1100)]
powernv: Don't test POWER9 CPU yet
A couple of tests for the work-in-progress 'powernv' machine type attempt
to test on POWER9 CPUs. However the POWER9 CPU support is incomplete and
this doesn't really work. In particular the firmware image we have
currently assumes the presence of the SDR1 register, which no longer exists
on POWER9. We only got away with this so far, because of a different bug
which added SDR1 to POWER9 even though it shouldn't be there.
For now, remove POWER9 testing of powernv, POWER8 testing will do for now
until the POWER9 support is more complete.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
exec, kvm, target-ppc: Move getrampagesize() to common code
getrampagesize() returns the largest supported page size and mainly
used to know if huge pages are enabled.
However is implemented in target-ppc/kvm.c and not available
in TCG or other architectures.
This renames and moves gethugepagesize() to mmap-alloc.c where
fd-based analog of it is already implemented. This renames and moves
getrampagesize() to exec.c as it seems to be the common place for
helpers like this.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
compat_table contains the list of logical pvr compat modes which a cpu can
operate in. It is a list of struct CompatInfo which contains the given pvr
value for a compat mode, the pcr bits which should be set to operate in
that compat mode, the pcr level which must be present in pcr_supported for
a processor to support that compat mode and the max threads possible in
that compat mode.
Add an entry for the POWER9/ISAv3.00 logical pvr which represents a
processor running with support for logical pvr 0x0f000005. A processor
running in this mode should have PCR_COMPAT_3_00 set in the pcr (if
available in pcr_mask) and should have PCR_COMPAT_3_00 in pcr_supported
to indicate that it is capable of running in this compat mode.
Also add PCR_COMPAT_3_00 to the bits which must be set for all previous
compat modes. Since no processor models contain this bit yet in pcr_mask
it will never be set, but this ensures we don't forget to in the future.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Peter Maydell [Thu, 2 Mar 2017 17:39:12 +0000 (17:39 +0000)]
Merge remote-tracking branch 'remotes/dgilbert/tags/pull-migration-20170228a' into staging
Migration pull
Note: The 'postcopy: Update userfaultfd.h header' is part of
Paolo's header update and will disappear if applied after it.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
# gpg: Signature made Tue 28 Feb 2017 12:38:34 GMT
# gpg: using RSA key 0x0516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <dgilbert@redhat.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A 9FA9 0516 331E BC5B FDE7
* remotes/dgilbert/tags/pull-migration-20170228a: (27 commits)
postcopy: Add extra check for COPY function
postcopy: Add doc about hugepages and postcopy
postcopy: Check for userfault+hugepage feature
postcopy: Update userfaultfd.h header
postcopy: Allow hugepages
postcopy: Send whole huge pages
postcopy: Mask fault addresses to huge page boundary
postcopy: Load huge pages in one go
postcopy: Use temporary for placing zero huge pages
postcopy: Plumb pagesize down into place helpers
postcopy: Record largest page size
postcopy: enhance ram_block_discard_range for hugepages
exec: ram_block_discard_range
postcopy: Chunk discards for hugepages
postcopy: Transmit and compare individual page sizes
postcopy: Transmit ram size summary word
migration: fix use-after-free of to_dst_file
migration: Update docs to discourage version bumps
migration: fix id leak regression
migrate: Introduce a 'dc->vmsd' check to avoid segfault for --only-migratable
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Thu, 2 Mar 2017 15:25:37 +0000 (15:25 +0000)]
Merge remote-tracking branch 'remotes/elmarco/tags/leak-pull-request' into staging
# gpg: Signature made Wed 01 Mar 2017 09:02:53 GMT
# gpg: using RSA key 0xDAE8E10975969CE5
# gpg: Good signature from "Marc-André Lureau <marcandre.lureau@redhat.com>"
# gpg: aka "Marc-André Lureau <marcandre.lureau@gmail.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 87A9 BD93 3F87 C606 D276 F62D DAE8 E109 7596 9CE5
* remotes/elmarco/tags/leak-pull-request: (28 commits)
tests: fix virtio-blk-test leaks
tests: add specialized device_find function
tests: fix usb-test leaks
tests: allows to run single test in usb-hcd-ehci-test
usb: release the created buses
bus: do not unref hotplug handler
tests: fix virtio-9p-test leaks
tests: fix virtio-scsi-test leak
tests: fix e1000e leaks
tests: fix i440fx-test leaks
tests: fix e1000-test leak
tests: fix tco-test leaks
tests: fix eepro100-test leak
pc: pcihp: avoid adding ACPI_PCIHP_PROP_BSEL twice
tests: fix ipmi-bt-test leak
tests: fix ipmi-kcs-test leak
tests: fix bios-tables-test leak
tests: fix hd-geo-test leaks
tests: fix ide-test leaks
tests: fix vhost-user-test leaks
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Thu, 2 Mar 2017 13:50:54 +0000 (13:50 +0000)]
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.9-20170301' into staging
ppc patch queue for 2017-03-01
I was hoping to get this pull request squeezed in before the soft
freeze, but I ran into some difficulties during testing. Everything
here was at least posted before the soft freeze, so I'm hoping we can
still merge it for 2.9.
The biggest things here are:
* Cleanups to handling of hashed page tables, that will make
adding support for the POWER9 MMU easier
* Cleanups to the XICS interrupt controller that will make
implementing the powernv machine easier
* TCG implementation of extended overflow and carry handling for
POWER9
It also includes:
* Increasing the CPU limit for pseries to 1024 vCPUs
* Generating proper OF node names in qemu (making hotplug and
coldplug logic closer together)
* remotes/dgibson/tags/ppc-for-2.9-20170301: (50 commits)
Add PowerPC 32-bit guest memory dump support
ppc/xics: rename 'ICPState *' variables to 'icp'
ppc/xics: move InterruptStatsProvider to the sPAPR machine
ppc/xics: move ics-simple post_load under the machine
ppc/xics: remove the XICSState classes
ppc/xics: export the XICS init routines
ppc/xics: move the ICP array under the sPAPR machine
ppc/xics: register the reset handler of ICP objects
ppc/xics: simplify spapr_dt_xics() interface
ppc/xics: use the QOM interface to grab an ICP
ppc/xics: move the cpu_setup() handler under the ICPState class
ppc/xics: simplify the cpu_setup() handler
ppc/xics: move kernel_xics_fd out of KVMXICSState
ppc/xics: extend the QOM interface to handle ICPs
ppc/xics: remove the XICS list of ICS
ppc/xics: register the reset handler of ICS objects
ppc/xics: remove xics_find_source()
ppc/xics: use the QOM interface to resend irqs
ppc/xics: use the QOM interface to get irqs
ppc/xics: use the QOM interface under the sPAPR machine
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Thu, 2 Mar 2017 11:18:01 +0000 (11:18 +0000)]
Merge remote-tracking branch 'remotes/ehabkost/tags/x86-pull-request' into staging
x86 queue, 2017-02-27
"-cpu max" and query-cpu-model-expansion support for x86. This
should be the last x86 pull request before 2.9 soft freeze.
# gpg: Signature made Mon 27 Feb 2017 16:24:15 GMT
# gpg: using RSA key 0x2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF D1AA 2807 936F 984D C5A6
* remotes/ehabkost/tags/x86-pull-request:
i386: Improve query-cpu-model-expansion full mode
i386: Implement query-cpu-model-expansion QMP command
i386: Define static "base" CPU model
i386: Don't set CPUClass::cpu_def on "max" model
i386: Make "max" model not use any host CPUID info on TCG
i386: Create "max" CPU model
qapi-schema: Comment about full expansion of non-migration-safe models
i386: Reorganize and document CPUID initialization steps
i386: Rename X86CPU::host_features to X86CPU::max_features
i386: Add ordering field to CPUClass
i386: Unset cannot_destroy_with_object_finalize_yet on "host" model
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Wed, 1 Mar 2017 23:09:46 +0000 (23:09 +0000)]
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches
# gpg: Signature made Tue 28 Feb 2017 20:35:32 GMT
# gpg: using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream: (46 commits)
block: Add Error parameter to bdrv_append()
block: Add Error parameter to bdrv_set_backing_hd()
block: Assertions for resize permission
block: Assertions for write permissions
block: Pass BdrvChild to bdrv_aligned_preadv/pwritev and copy-on-read
tests: Remove FIXME comments
nbd/server: Use real permissions for NBD exports
migration/block: Use real permissions
hmp: Request permissions in qemu-io
commit: Add filter-node-name to block-commit
mirror: Add filter-node-name to blockdev-mirror
stream: Use real permissions in streaming block job
mirror: Use real permissions in mirror/active commit block job
blockjob: Factor out block_job_remove_all_bdrv()
block: Allow backing file links in change_parent_backing_link()
block: BdrvChildRole.attach/detach() callbacks
block: Fix pending requests check in bdrv_append()
backup: Use real permissions in backup block job
commit: Use real permissions for HMP 'commit'
commit: Use real permissions in commit block job
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Wed, 1 Mar 2017 20:33:47 +0000 (20:33 +0000)]
Merge remote-tracking branch 'remotes/sstabellini/tags/xen-20170228-tag' into staging
Xen 2017/02/28
# gpg: Signature made Tue 28 Feb 2017 19:13:08 GMT
# gpg: using RSA key 0x894F8F4870E1AE90
# gpg: Good signature from "Stefano Stabellini <sstabellini@kernel.org>"
# gpg: aka "Stefano Stabellini <stefano.stabellini@eu.citrix.com>"
# Primary key fingerprint: D04E 33AB A51F 67BA 07D3 0AEA 894F 8F48 70E1 AE90
* remotes/sstabellini/tags/xen-20170228-tag:
Add a new qmp command to do checkpoint, query xen replication status
Add a new qmp command to start/stop replication
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Cc: qemu-stable@nongnu.org Reported-by: Michael Russo <mike@papersolve.com> Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Signed-off-by: Richard Henderson <rth@twiddle.net>
Peter Maydell [Wed, 1 Mar 2017 17:58:53 +0000 (17:58 +0000)]
Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20170228-1' into staging
target-arm queue:
* raspi2: add gpio controller and sdhost controller, with
the wiring so the guest can switch which controller the
SD card is attached to
(this is sufficient to get raspbian kernels to boot)
* GICv3: support state save/restore from KVM
* update Linux headers to 4.11
* refactor and QOMify the ARMv7M container object
# gpg: Signature made Tue 28 Feb 2017 17:11:49 GMT
# gpg: using RSA key 0x3C2525ED14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg: aka "Peter Maydell <pmaydell@gmail.com>"
# gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE
* remotes/pmaydell/tags/pull-target-arm-20170228-1: (21 commits)
bcm2835: add sdhost and gpio controllers
bcm2835_gpio: add bcm2835 gpio controller
hw/sd: add card-reparenting function
qdev: Have qdev_set_parent_bus() handle devices already on a bus
hw/intc/arm_gicv3_kvm: Reset GICv3 cpu interface registers
target-arm: Add GICv3CPUState in CPUARMState struct
hw/intc/arm_gicv3_kvm: Implement get/put functions
hw/intc/arm_gicv3_kvm: Add ICC_SRE_EL1 register to vmstate
update Linux headers to 4.11
update-linux-headers: update for 4.11
stm32f205: Rename 'nvic' local to 'armv7m'
stm32f205: Create armv7m object without using armv7m_init()
armv7m: Split systick out from NVIC
armv7m: Don't put core v7M devices under CONFIG_STELLARIS
armv7m: Make bitband device take the address space to access
armv7m: Make NVIC expose a memory region rather than mapping itself
armv7m: Make ARMv7M object take memory region link
armv7m: Use QOMified armv7m object in armv7m_init()
armv7m: QOMify the armv7m container
armv7m: Move NVICState struct definition into header
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Thomas Huth [Tue, 31 Jan 2017 08:46:38 +0000 (09:46 +0100)]
audio/sdlaudio: Allow audio playback with SDL2
When compiling with SDL2, the semaphore trick used in sdlaudio.c
does not work - QEMU locks up completely in this case. To avoid
the hang and get at least some audio playback up and running (it's
a little bit crackling, but better than nothing), we can use the
SDL locking functions SDL_LockAudio() and SDL_UnlockAudio() to sync
with the sound playback thread instead.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-id: 1485852398-2327-1-git-send-email-thuth@redhat.com Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Pavel Dovgalyuk [Tue, 14 Feb 2017 07:15:10 +0000 (10:15 +0300)]
audio: make audio poll timer deterministic
This patch changes resetting strategy of the audio polling timer.
It does not change expiration time if the timer is already set.
This patch is needed to make this timer deterministic and to use execution
record/replay for audio devices.
audio_reset_timer is used in the function audio_vm_change_state_handler.
Therefore every time VM is stopped or restarted the timer will be reset
to new timeout. Virtual clock does not proceed while VM is stopped.
Therefore there is no need in resetting the timeout when VM restarts.
v2: updated commit message
v3: now using timer_mod_anticipate function (as suggested by Yurii Zubrytskyi)
Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Message-id: 20170214071510.6112.76764.stgit@PASHA-ISP Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Peter Maydell [Wed, 1 Mar 2017 13:53:20 +0000 (13:53 +0000)]
Merge remote-tracking branch 'remotes/gkurz/tags/cve-2016-9602-for-upstream' into staging
This pull request have all the fixes for CVE-2016-9602, so that it can
be easily picked up by downstreams, as suggested by Michel Tokarev.
# gpg: Signature made Tue 28 Feb 2017 10:21:32 GMT
# gpg: using DSA key 0x02FC3AEB0101DBC2
# gpg: Good signature from "Greg Kurz <groug@kaod.org>"
# gpg: aka "Greg Kurz <groug@free.fr>"
# gpg: aka "Greg Kurz <gkurz@linux.vnet.ibm.com>"
# gpg: aka "Gregory Kurz (Groug) <groug@free.fr>"
# gpg: aka "[jpeg image of size 3330]"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 2BD4 3B44 535E C0A7 9894 DBA2 02FC 3AEB 0101 DBC2
Andrea Bolognani [Fri, 17 Feb 2017 10:14:39 +0000 (11:14 +0100)]
mach-virt: Provide sample configuration files
These are very much like the sample configuration files
for q35, and can be used both as documentation and as
a starting point for creating your own guest.
Two sample configuration files are provided:
* mach-virt-graphical.cfg can be used to start a
fully-featured (USB, graphical console, etc.)
guest that uses VirtIO devices;
* mach-virt-serial.cfg is similar but has a minimal
set of devices and uses the serial console.
All configuration files are fully commented and neatly
organized.
Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Message-id: 1487326479-8664-3-git-send-email-abologna@redhat.com Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Andrea Bolognani [Fri, 17 Feb 2017 10:14:38 +0000 (11:14 +0100)]
q35: Improve sample configuration files
Instead of having a single sample configuration file,
we now have several:
* q35-emulated.cfg documents the default devices QEMU
adds to a q35 guest and the additional devices that
are pretty much guaranteed to be present in a
physical q35-based machine;
* q35-virtio-graphical.cfg can be used to start a
fully-featured (USB, graphical console, audio, etc.)
guest that uses VirtIO instead of emulated devices;
* q35-virtio-serial.cfg is similar but has a minimal
set of devices and uses the serial console.
All configuration files are fully commented and neatly
organized.
Peter Maydell [Wed, 1 Mar 2017 13:06:00 +0000 (13:06 +0000)]
Merge remote-tracking branch 'remotes/famz/tags/docker-pull-request' into staging
# gpg: Signature made Tue 28 Feb 2017 12:40:00 GMT
# gpg: using RSA key 0xCA35624C6A9171C6
# gpg: Good signature from "Fam Zheng <famz@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 5003 7CB7 9706 0F76 F021 AD56 CA35 624C 6A91 71C6
Apparently, none of the bus owner give a reference to the hotplug
handler property, do not unref it on bus release.
Furthermore, a bus is allowed to be its own hotplug handler, which can
be seen in qbus_set_bus_hotplug_handler() function. However, in this
case, the reference can't be given to the property, or this will create
a cyclic dependency and the bus will never be free.
Each bus owner should manage the lifecycle of the hotplug handler.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
PCI hotplug for bridges was introduced only since 2.0 however
acpi_set_bsel()->object_property_add_uint32_ptr(bus, ACPI_PCIHP_PROP_BSEL)
didn't take in account that for legacy mode (1.7) when
PCI hotplug for bridges is unavailable and ACPI_PCIHP_PROP_BSEL property
the only bus "PCI.0' has been created earlier at acpi_pcihp_init() time.
We managed to live with it only because of error rised by adding
a duplicate property in acpi_set_bsel() has been ignored which
resulted in useless leaking of just allocated (int)bus_bsel.
Issue affects only 1.7 machine type as ACPI tables supported by
QEMU were introduced at that time, but there wasn't PCI hotplug
for bridges till the next release (2.0).
Fix it by removing duplicate ACPI_PCIHP_PROP_BSEL intialization
in acpi_pcihp_init() and doing it only in one place acpi_set_pci_info().
PS:
do not ignore error returned by object_property_add_uint32_ptr()
and abort QEMU since it's programming error which should be fixed
instead of being ignored.
Mike Nawrocki [Tue, 28 Feb 2017 13:32:17 +0000 (08:32 -0500)]
Add PowerPC 32-bit guest memory dump support
This patch extends support for the `dump-guest-memory` command to the
32-bit PowerPC architecture. It relies on the assumption that a 64-bit
guest will not dump a 32-bit core file (and vice versa).
[dwg: I suspect this patch won't cover all cases, in particular a
32-bit machine type on a 64-bit qemu build. However, it does strictly
more than what we had before, so might as well apply as a starting
point]
Signed-off-by: Mike Nawrocki <michael.nawrocki@gtri.gatech.edu> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Cédric Le Goater [Mon, 27 Feb 2017 14:29:31 +0000 (15:29 +0100)]
ppc/xics: move ics-simple post_load under the machine
The ICS object uses a post_load() handler which is implicitly relying
on the fact that the internal state of the ICS and ICP objects has
been restored but this is not guaranteed. So, let's move the code
under the post_load() handler of the machine where we know the objects
have been fully restored.
The icp_resend() handler of the XICSFabric QOM interface is also
removed as it is now obsolete.
Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Cédric Le Goater [Mon, 27 Feb 2017 14:29:29 +0000 (15:29 +0100)]
ppc/xics: export the XICS init routines
There is nothing left related to the XICS object in the realize
functions of the KVMXICSState and XICSState class. So adapt the
interfaces to call these routines directly from the sPAPR machine init
sequence.
Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Cédric Le Goater [Mon, 27 Feb 2017 14:29:23 +0000 (15:29 +0100)]
ppc/xics: simplify the cpu_setup() handler
The cpu_setup() handler currently takes a 'XICSState *' argument to
grab the kernel ICP file descriptor. This interface can be simplified
by using the 'xics' backlink of the ICP object.
This change is also required by subsequent patches which makes use of
the QOM interface for XICS.
Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Cédric Le Goater [Mon, 27 Feb 2017 14:29:22 +0000 (15:29 +0100)]
ppc/xics: move kernel_xics_fd out of KVMXICSState
The kernel ICP file descriptor is the only reason behind the
KVMXICSState class and it's in the way of more cleanups. Let's make it
a static for the moment and move forward.
If this is problem, we could use an attribute under the sPAPR machine
later on.
Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Cédric Le Goater [Mon, 27 Feb 2017 14:29:21 +0000 (15:29 +0100)]
ppc/xics: extend the QOM interface to handle ICPs
Let's add two new handlers for ICPs. One is to get an ICP object from
a server number and a second is to resend the irqs when needed.
The icp_resend() handler is a temporary workaround needed by the
ics-simple post_load() handler. It will be removed when the post_load
portion can be done at the machine level.
Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Cédric Le Goater [Mon, 27 Feb 2017 14:29:20 +0000 (15:29 +0100)]
ppc/xics: remove the XICS list of ICS
This is not used anymore.
Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Cédric Le Goater [Mon, 27 Feb 2017 14:29:19 +0000 (15:29 +0100)]
ppc/xics: register the reset handler of ICS objects
The reset of the ICS objects is currently handled by XICS but this can
be done for each individual ICS. This also reduces the use of the XICS
list of ICS.
Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Cédric Le Goater [Mon, 27 Feb 2017 14:29:18 +0000 (15:29 +0100)]
ppc/xics: remove xics_find_source()
It is not used anymore now that we have the QOM interface for XICS.
Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Cédric Le Goater [Mon, 27 Feb 2017 14:29:17 +0000 (15:29 +0100)]
ppc/xics: use the QOM interface to resend irqs
Also change the ICPState 'xics' backlink to be a XICSFabric, this
removes the need of using qdev_get_machine() to get the QOM interface
in some of the routines.
Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Cédric Le Goater [Mon, 27 Feb 2017 14:29:16 +0000 (15:29 +0100)]
ppc/xics: use the QOM interface to get irqs
Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Cédric Le Goater [Mon, 27 Feb 2017 14:29:15 +0000 (15:29 +0100)]
ppc/xics: use the QOM interface under the sPAPR machine
Add 'ics_get' and 'ics_resend' handlers to the sPAPR machine. These
are relatively simple for a single ICS.
Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Cédric Le Goater [Mon, 27 Feb 2017 14:29:14 +0000 (15:29 +0100)]
ppc/xics: introduce a XICSFabric QOM interface to handle ICSs
This interface provides two simple handlers. One is to get an ICS
(Interrupt Source Controller) object from an irq number and a second
to resend the irqs when needed.
Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Cédric Le Goater [Mon, 27 Feb 2017 14:29:13 +0000 (15:29 +0100)]
ppc/xics: add an InterruptStatsProvider interface to ICS and ICP objects
This is, again, to reduce the use of the list of ICS objects. Let's
make each individual ICS and ICP object an InterruptStatsProvider and
remove this same interface from XICSState.
The InterruptStatsProvider will be moved at the machine level after
the XICS cleanups are completed.
Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Cédric Le Goater [Mon, 27 Feb 2017 14:29:12 +0000 (15:29 +0100)]
ppc/xics: store the ICS object under the sPAPR machine
A list of ICS objects was introduced under the XICS object for the
PowerNV machine but, for the sPAPR machine, it brings extra complexity
as there is only a single ICS. To simplify the code, let's add the ICS
pointer under the sPAPR machine and try to reduce the use of this list
where possible.
Also, change the xics_spapr_*() routines to use an ICS object instead
of an XICSState and change their name to reflect that these are
specific to the sPAPR ICS object.
Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Cédric Le Goater [Mon, 27 Feb 2017 14:29:11 +0000 (15:29 +0100)]
ppc/xics: remove set_nr_servers() handler from XICSStateClass
Today, the ICP (Interrupt Controller Presenter) objects are created by
the 'nr_servers' property handler of the XICS object and a class
handler. They are realized in the XICS object realize routine.
Let's simplify the process by creating the ICP objects along with the
XICS object at the machine level.
Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Cédric Le Goater [Mon, 27 Feb 2017 14:29:10 +0000 (15:29 +0100)]
ppc/xics: remove set_nr_irqs() handler from XICSStateClass
Today, the ICS (Interrupt Controller Source) object is created and
realized by the init and realize routines of the XICS object, but some
of the parameters are only known at the machine level.
These parameters are passed from the sPAPR machine to the ICS object
in a rather convoluted way using property handlers and a class handler
of the XICS object. The number of irqs required to allocate the IRQ
state objects in the ICS realize routine is one of them.
Let's simplify the process by creating the ICS object along with the
XICS object at the machine level and link the ICS into the XICS list
of ICSs at this level also. In the sPAPR machine, there is only a
single ICS but that will change with the PowerNV machine.
Also, QOMify the creation of the objects and get rid of the
superfluous code.
Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
David Gibson [Tue, 14 Feb 2017 01:58:05 +0000 (12:58 +1100)]
xics: XICS should not be a SysBusDevice
Currently xics - the component of the IBM POWER interrupt controller
representing the overall interrupt fabric / architecture is
represented as a descendent of SysBusDevice. However, this is not
really correct - the xics presents nothing in MMIO space so it should
be an "unattached" device in the current QOM model.
Since this device will always be created by the machine type, not created
specifically from the command line, and because it has no migrated state
it should be safe to move it around the device composition tree.
Therefore this patch changes it to a descendent of TYPE_DEVICE, and
makes it an unattached device. So that its reset handler still gets
called correctly, we add a qdev_set_parent_bus() to attach it to
sysbus. It's not really clear that's correct (instead of using
register_reset()) but it appears to a common technique.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
[clg corrected problems with reset] Signed-off-by: Cédric Le Goater <clg@kaod.org>
[dwg folded together and updated commit message] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This behaviour change is not actually a bug since no assumptions should be
made on DT ordering. But it has no real justification either, other than
being the consequence of the way fdt_add_subnode() inserts new elements
to the front of the FDT rather than adding them to the tail.
This patch reverts to the historical SLOF ordering by walking PCI devices
in reverse order. This reconciles pseries with x86 machine types behavior.
It is expected to make things easier when porting existing applications to
power.
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Tested-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
(slight update to the changelog) Signed-off-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Add helper_div_compute_ov() in the int_helper for updating the overflow
flags.
For Divide Word:
SO, OV, and OV32 bits reflects overflow of the 32-bit result
For Divide DoubleWord:
SO, OV, and OV32 bits reflects overflow of the 64-bit result
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
For Multiply Word:
SO, OV, and OV32 bits reflects overflow of the 32-bit result
For Multiply DoubleWord:
SO, OV, and OV32 bits reflects overflow of the 64-bit result
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Adds routine to compute ca32 - gen_op_arith_compute_ca32
For 64-bit mode use the compute ca32 routine. While for 32-bit mode, CA
and CA32 will have same value.
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
POWER ISA 3.0 adds CA32 and OV32 status in 64-bit mode. Add the flags
and corresponding defines.
Moreover, CA32 is updated when CA is updated and OV32 is updated when OV
is updated.
Arithmetic instructions:
* Addition and Substractions:
addic, addic., subfic, addc, subfc, adde, subfe, addme, subfme,
addze, and subfze always updates CA and CA32.
=> CA reflects the carry out of bit 0 in 64-bit mode and out of
bit 32 in 32-bit mode.
=> CA32 reflects the carry out of bit 32 independent of the
mode.
=> SO and OV reflects overflow of the 64-bit result in 64-bit
mode and overflow of the low-order 32-bit result in 32-bit
mode
=> OV32 reflects overflow of the low-order 32-bit independent of
the mode
* Multiply Low and Divide:
For mulld, divd, divde, divdu and divdeu: SO, OV, and OV32 bits
reflects overflow of the 64-bit result
For mullw, divw, divwe, divwu and divweu: SO, OV, and OV32 bits
reflects overflow of the 32-bit result
* Negate with OE=1 (nego)
For 64-bit mode if the register RA contains
0x8000_0000_0000_0000, OV and OV32 are set to 1.
For 32-bit mode if the register RA contains 0x8000_0000, OV and
OV32 are set to 1.
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
David Gibson [Fri, 24 Feb 2017 06:35:50 +0000 (17:35 +1100)]
target/ppc: Correct SDR1 masking
SDR_64_HTABORG, which indicates the bits of the SDR1 register to use for
the base of a 64-bit machine's hashed page table (HPT) isn't correct. It
includes the top 46 bits of the register, but in fact the top 4 bits must
be zero (according to the ISA v2.07). No actual implementation has
supported close to 2^60 bytes of physical address space, so it's kind of
irrelevant, but we might as well correct this.
In addition, although we checked for bad size values in SDR1, we never
reported an error if entirely invalid bits were set there. Add this check
to ppc_store_sdr1().
Reported-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
target/ppc: Remove the function ppc_hash64_set_sdr1()
The function ppc_hash64_set_sdr1 basically checked the htabsize and set an
error if it was too big, otherwise it just stored the value in SPR_SDR1.
Given that the only function which calls ppc_hash64_set_sdr1() is
ppc_store_sdr1(), why not handle the checking in ppc_store_sdr1() to avoid
the extra function call. Note that ppc_store_sdr1() already stores the
value in SPR_SDR1 anyway, so we were doing it twice.
David Gibson [Thu, 23 Feb 2017 00:39:18 +0000 (11:39 +1100)]
target/ppc: Manage external HPT via virtual hypervisor
The pseries machine type implements the behaviour of a PAPR compliant
hypervisor, without actually executing such a hypervisor on the virtual
CPU. To do this we need some hooks in the CPU code to make hypervisor
facilities get redirected to the machine instead of emulated internally.
For hypercalls this is managed through the cpu->vhyp field, which points
to a QOM interface with a method implementing the hypercall.
For the hashed page table (HPT) - also a hypervisor resource - we use an
older hack. CPUPPCState has an 'external_htab' field which when non-NULL
indicates that the HPT is stored in qemu memory, rather than within the
guest's address space.
For consistency - and to make some future extensions easier - this merges
the external HPT mechanism into the vhyp mechanism. Methods are added
to vhyp for the basic operations the core hash MMU code needs: map_hptes()
and unmap_hptes() for reading the HPT, store_hpte() for updating it and
hpt_mask() to retrieve its size.
To match this, the pseries machine now sets these vhyp fields in its
existing vhyp class, rather than reaching into the cpu object to set the
external_htab field.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
David Gibson [Fri, 24 Feb 2017 05:36:44 +0000 (16:36 +1100)]
target/ppc: Eliminate htab_base and htab_mask variables
CPUPPCState includes fields htab_base and htab_mask which store the base
address (GPA) and size (as a mask) of the guest's hashed page table (HPT).
These are set when the SDR1 register is updated.
Keeping these in sync with the SDR1 is actually a little bit fiddly, and
probably not useful for performance, since keeping them expands the size of
CPUPPCState. It also makes some upcoming changes harder to implement.
This patch removes these fields, in favour of calculating them directly
from the SDR1 contents when necessary.
This does make a change to the behaviour of attempting to write a bad value
(invalid HPT size) to the SDR1 with an mtspr instruction. Previously, the
bad value would be stored in SDR1 and could be retrieved with a later
mfspr, but the HPT size as used by the softmmu would be, clamped to the
allowed values. Now, writing a bad value is treated as a no-op. An error
message is printed in both new and old versions.
I'm not sure which behaviour, if either, matches real hardware. I don't
think it matters that much, since it's pretty clear that if an OS writes
a bad value to SDR1, it's not going to boot.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
David Gibson [Mon, 27 Feb 2017 05:03:41 +0000 (16:03 +1100)]
target/ppc: Cleanup HPTE accessors for 64-bit hash MMU
Accesses to the hashed page table (HPT) are complicated by the fact that
the HPT could be in one of three places:
1) Within guest memory - when we're emulating a full guest CPU at the
hardware level (e.g. powernv, mac99, g3beige)
2) Within qemu, but outside guest memory - when we're emulating user and
supervisor instructions within TCG, but instead of emulating
the CPU's hypervisor mode, we just emulate a hypervisor's behaviour
(pseries in TCG or KVM-PR)
3) Within the host kernel - a pseries machine using KVM-HV
acceleration. Mostly accesses to the HPT are handled by KVM,
but there are a few cases where qemu needs to access it via a
special fd for the purpose.
In order to batch accesses to the fd in case (3), we use a somewhat awkward
ppc_hash64_start_access() / ppc_hash64_stop_access() pair, which for case
(3) reads / releases several HPTEs from the kernel as a batch (usually a
whole PTEG). For cases (1) & (2) it just returns an address value. The
actual HPTE load helpers then need to interpret the returned token
differently in the 3 cases.
This patch keeps the same basic structure, but simplfiies the details.
First start_access() / stop_access() are renamed to map_hptes() and
unmap_hptes() to make their operation more obvious. Second, map_hptes()
now always returns a qemu pointer, which can always be used in the same way
by the load_hpte() helpers. In case (1) it comes from address_space_map()
in case (2) directly from qemu's HPT buffer and in case (3) from a
temporary buffer read from the KVM fd.
While we're at it, make things a bit more consistent in terms of types and
variable names: avoid variables named 'index' (it shadows index(3) which
can lead to confusing results), use 'hwaddr ptex' for HPTE indices and
uint64_t for each of the HPTE words, use ptex throughout the call stack
instead of pte_offset in some places (we still need that at the bottom
layer, but nowhere else).
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
David Gibson [Sun, 19 Feb 2017 23:54:48 +0000 (10:54 +1100)]
target/ppc: SDR1 is a hypervisor resource
At present the SDR1 register - the base of the system's hashed page table
(HPT) - is represented as an SPR with supervisor read and write permission.
However, on CPUs which have a hypervisor mode, the SDR1 is a hypervisor
only resource. Change the permission checking on the SPR to reflect this.
Now that this is done, we don't need to check for an external HPT executing
mtsdr1: an external HPT only applies when we're emulating the behaviour of
a hypervisor, rather than modelling the CPU's hypervisor mode internally,
so if we're permitted to execute mtsdr1, we don't have an external HPT.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
David Gibson [Sun, 19 Feb 2017 23:47:09 +0000 (10:47 +1100)]
target/ppc: Merge cpu_ppc_set_vhyp() with cpu_ppc_set_papr()
cpu_ppc_set_papr() sets up various aspects of CPU state for use with PAPR
paravirtualized guests. However, it doesn't set the virtual hypervisor,
so callers must also call cpu_ppc_set_vhyp() so that PAPR hypercalls are
handled properly. This is a bit silly, so fold setting the virtual
hypervisor into cpu_ppc_set_papr().
Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
David Gibson [Tue, 21 Feb 2017 03:00:16 +0000 (14:00 +1100)]
pseries: Minor cleanups to HPT management hypercalls
* Standardize on 'ptex' instead of 'pte_index' for HPTE index variables
for consistency and brevity
* Avoid variables named 'index'; shadowing index(3) from libc can lead to
surprising bugs if the variable is removed, because compiler errors
might not appear for remaining references
* Clarify index calculations in h_enter() - we have two cases, H_EXACT
where the exact HPTE slot is given, and !H_EXACT where we search for
an empty slot within the hash bucket. Make the calculation more
consistent between the cases.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
David Gibson [Mon, 27 Feb 2017 04:34:19 +0000 (15:34 +1100)]
target/ppc: Fix KVM-HV HPTE accessors
When a 'pseries' guest is running with KVM-HV, the guest's hashed page
table (HPT) is stored within the host kernel, so it is not directly
accessible to qemu. Most of the time, qemu doesn't need to access it:
we're using the hardware MMU, and KVM itself implements the guest
hypercalls for manipulating the HPT.
However, qemu does need access to the in-KVM HPT to implement
get_phys_page_debug() for the benefit of the gdbstub, and maybe for
other debug operations.
To allow this, 7c43bca "target-ppc: Fix page table lookup with kvm
enabled" added kvmppc_hash64_read_pteg() to target/ppc/kvm.c to read
in a batch of HPTEs from the KVM table. Unfortunately, there are a
couple of problems with this:
First, the name of the function implies it always reads a whole PTEG
from the HPT, but in fact in some cases it's used to grab individual
HPTEs (which ends up pulling 8 HPTEs, not aligned to a PTEG from the
kernel).
Second, and more importantly, the code to read the HPTEs from KVM is
simply wrong, in general. The data from the fd that KVM provides is
designed mostly for compact migration rather than this sort of one-off
access, and so needs some decoding for this purpose. The current code
will work in some cases, but if there are invalid HPTEs then it will
not get sane results.
This patch rewrite the HPTE reading function to have a simpler
interface (just read n HPTEs into a caller provided buffer), and to
correctly decode the stream from the kernel.
For consistency we also clean up the similar function for altering
HPTEs within KVM (introduced in c138593 "target-ppc: Update
ppc_hash64_store_hpte to support updating in-kernel htab").
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>