The ISA bus has a different DT nodename on POWER9. Compute the name
when the PnvChip is realized, that is before it is used by the machine
to populate the device tree with the ISA devices.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190307223548.20516-6-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
It will ease the introduction of the LPC Controller model for POWER9.
Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190307223548.20516-5-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The PowerNV LPC Controller exposes different sets of registers for
each of the functional units it encompasses, among which the OPB
(On-Chip Peripheral Bus) Master and Arbitrer and the LPC HOST
Controller.
The mapping addresses of each register range are correct but the sizes
are too large. Fix the sizes and define the OPB Arbitrer range to fill
the gap between the OPB Master registers and the LPC HOST Controller
registers.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190307223548.20516-4-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The PSI bridge on POWER9 is very similar to POWER8. The BAR is still
set through XSCOM but the controls are now entirely done with MMIOs.
More interrupts are defined and the interrupt controller interface has
changed to XIVE. The POWER9 model is a first example of the usage of
the notify() handler of the XiveNotifier interface, linking the PSI
XiveSource to its owning device model.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190307223548.20516-3-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
To ease the introduction of the PSI bridge model for POWER9, abstract
the POWER chip differences in a PnvPsi class model and introduce a
specific Pnv8Psi type for POWER8. POWER8 interface to the interrupt
controller is still XICS whereas POWER9 uses the new XIVE model.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190307223548.20516-2-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
mac_newworld: use node name instead of alias name for hd device in FWPathProvider
When using -drive to configure the hd drive for the New World machine, the node
name "disk" should be used instead of the "hd" alias.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20190307212058.4890-3-mark.cave-ayland@ilande.co.uk> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
mac_oldworld: use node name instead of alias name for hd device in FWPathProvider
When using -drive to configure the hd drive for the Old World machine, the node
name "disk" should be used instead of the "hd" alias.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20190307212058.4890-2-mark.cave-ayland@ilande.co.uk> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
target/ppc: introduce vsr64_offset() to simplify get_cpu_vsr{l,h}() and set_cpu_vsr{l,h}()
Now that all VSX registers are stored in host endian order, there is no need
to go via different accessors depending upon the register number. Instead we
introduce vsr64_offset() and use it directly from within get_cpu_vsr{l,h}() and
set_cpu_vsr{l,h}().
This also allows us to rewrite avr64_offset() and fpr_offset() in terms of the
new vsr64_offset() function to more clearly express the relationship between the
VSX, FPR and VMX registers, and also remove vsrl_offset() which is no longer
required.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20190307180520.13868-8-mark.cave-ayland@ilande.co.uk> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
target/ppc: switch fpr/vsrl registers so all VSX registers are in host endian order
When VSX support was initially added, the fpr registers were added at
offset 0 of the VSR register and the vsrl registers were added at offset
1. This is in contrast to the VMX registers (the last 32 VSX registers) which
are stored in host-endian order.
Switch the fpr/vsrl registers so that the lower 32 VSX registers are now also
stored in host endian order to match the VMX registers. This ensures that TCG
vector operations involving mixed VMX and VSX registers will function
correctly.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20190307180520.13868-7-mark.cave-ayland@ilande.co.uk> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
target/ppc: improve avr64_offset() and use it to simplify get_avr64()/set_avr64()
By using the VsrD macro in avr64_offset() the same offset calculation can be
used regardless of the host endian. This allows get_avr64() and set_avr64() to
be simplified accordingly.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20190307180520.13868-6-mark.cave-ayland@ilande.co.uk> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
All TCG vector operations require pointers to the base address of the vector
rather than separate access to the top and bottom 64-bits. Convert the VMX TCG
instructions to use a new avr_full_offset() function instead of avr64_offset()
which can then itself be written as a simple wrapper onto vsr_full_offset().
This same function can also reused in cpu_avr_ptr() to avoid having more than
one copy of the offset calculation logic.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20190307180520.13868-5-mark.cave-ayland@ilande.co.uk> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
target/ppc: move Vsr* macros from internal.h to cpu.h
It isn't possible to include internal.h from cpu.h so move the Vsr* macros
into cpu.h alongside the other VMX/VSX register access functions.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20190307180520.13868-4-mark.cave-ayland@ilande.co.uk> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
target/ppc: introduce single vsrl_offset() function
Instead of having multiple copies of the offset calculation logic, move it to a
single vsrl_offset() function.
This commit also renames the existing get_vsr()/set_vsr() functions to
get_vsrl()/set_vsrl() which better describes their purpose.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20190307180520.13868-3-mark.cave-ayland@ilande.co.uk> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
target/ppc: introduce single fpr_offset() function
Instead of having multiple copies of the offset calculation logic, move it to a
single fpr_offset() function.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20190307180520.13868-2-mark.cave-ayland@ilande.co.uk> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
spapr_iommu: Do not replay mappings from just created DMA window
On sPAPR vfio_listener_region_add() is called in 2 situations:
1. a new listener is registered from vfio_connect_container();
2. a new IOMMU Memory Region is added from rtas_ibm_create_pe_dma_window().
In both cases vfio_listener_region_add() calls
memory_region_iommu_replay() to notify newly registered IOMMU notifiers
about existing mappings which is totally desirable for case 1.
However for case 2 it is nothing but noop as the window has just been
created and has no valid mappings so replaying those does not do anything.
It is barely noticeable with usual guests but if the window happens to be
really big, such no-op replay might take minutes and trigger RCU stall
warnings in the guest.
For example, a upcoming GPU RAM memory region mapped at 64TiB (right
after SPAPR_PCI_LIMIT) causes a 64bit DMA window to be at least 128TiB
which is (128<<40)/0x10000=2.147.483.648 TCEs to replay.
This mitigates the problem by adding an "skipping_replay" flag to
sPAPRTCETable and defining sPAPR own IOMMU MR replay() hook which does
exactly the same thing as the generic one except it returns early if
@skipping_replay==true.
Another way of fixing this would be delaying replay till the very first
H_PUT_TCE but this does not work if in-kernel H_PUT_TCE handler is
enabled (a likely case).
When "ibm,create-pe-dma-window" is complete, the guest will map only
required regions of the huge DMA window.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20190307050518.64968-2-aik@ozlabs.ru> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The NSR register of the HV ring has a different, although similar, bit
layout. TM_QW3_NSR_HE_PHYS bit should now be raised when the
Hypervisor interrupt line is signaled. Other bits TM_QW3_NSR_HE_POOL
and TM_QW3_NSR_HE_LSI are not modeled. LSI are for special interrupts
reserved for HW bringup and the POOL bit is used when signaling a
group of VPs. This is not currently implemented in Linux but it is in
pHyp.
The most important special commands on the HV TIMA page are added to
let the core manage interrupts : acking and changing the CPU priority.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190306085032.15744-10-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
ppc/pnv: add a XIVE interrupt controller model for POWER9
This is a simple model of the POWER9 XIVE interrupt controller for the
PowerNV machine which only addresses the needs of the skiboot
firmware. The PowerNV model reuses the common XIVE framework developed
for sPAPR as the fundamentals aspects are quite the same. The
difference are outlined below.
The controller initial BAR configuration is performed using the XSCOM
bus from there, MMIO are used for further configuration.
The MMIO regions exposed are :
- Interrupt controller registers
- ESB pages for IPIs and ENDs
- Presenter MMIO (Not used)
- Thread Interrupt Management Area MMIO, direct and indirect
The virtualization controller MMIO region containing the IPI ESB pages
and END ESB pages is sub-divided into "sets" which map portions of the
VC region to the different ESB pages. These are modeled with custom
address spaces and the XiveSource and XiveENDSource objects are sized
to the maximum allowed by HW. The memory regions are resized at
run-time using the configuration of EDT set translation table provided
by the firmware.
The XIVE virtualization structure tables (EAT, ENDT, NVTT) are now in
the machine RAM and not in the hypervisor anymore. The firmware
(skiboot) configures these tables using Virtual Structure Descriptor
defining the characteristics of each table : SBE, EAS, END and
NVT. These are later used to access the virtual interrupt entries. The
internal cache of these tables in the interrupt controller is updated
and invalidated using a set of registers.
Still to address to complete the model but not fully required is the
support for block grouping. Escalation support will be necessary for
KVM guests.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190306085032.15744-7-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The PowerNV machine can perform indirect loads and stores on the TIMA
on behalf of another CPU. Give the controller the possibility to call
the TIMA memory accessors with a XiveTCTX of its choice.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190306085032.15744-4-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
ppc/xive: hardwire the Physical CAM line of the thread context
By default on P9, the HW CAM line (23bits) is hardwired to :
0x000||0b1||4Bit chip number||7Bit Thread number.
When the block group mode is enabled at the controller level (PowerNV),
the CAM line is changed for CAM compares to :
4Bit chip number||0x001||7Bit Thread number
This will require changes in xive_presenter_tctx_match() possibly.
This is a lowlevel functionality of the HW controller and it is not
strictly needed. Leave it for later.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190306085032.15744-2-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
PPC: E500: Add FSL I2C controller and integrate RTC with it
Original commit message:
This patch adds an emulation model for i2c controller found on most of the FSL SoCs.
It also integrates the RTC (ds1338) that sits on the i2c Bus with e500 machine model.
Patch was originally written by Amit Singh Tomar <amit.tomar@freescale.com>
see http://patchwork.ozlabs.org/patch/431475/
I only fixed it enough for application on top of current qemu master 20b084c4b1401b7f8fbc385649d48c67b6f43d44, and hopefully fixed checkpatch errors
Tested by booting Linux kernel 4.20.12. Now e500 machine doesn't need
network time protocol daemon because it will have working RTC
(before all timestamps on files were from 2016)
Signed-off-by: Amit Singh Tomar <amit.tomar@freescale.com> Signed-off-by: Andrew Randrianasulu <randrianasulu@gmail.com>
Message-Id: <20190306102812.28972-1-randrianasulu@gmail.com>
[dwg: Add Kconfig stanza to define the new symbol, update MAINTAINERS] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
David Gibson [Wed, 6 Mar 2019 03:15:26 +0000 (14:15 +1100)]
spapr: Force SPAPR_MEMORY_BLOCK_SIZE to be a hwaddr (64-bit)
SPAPR_MEMORY_BLOCK_SIZE is logically a difference in memory addresses, and
hence of type hwaddr which is 64-bit. Previously it wasn't marked as such
which means that it could be treated as 32-bit. That will work in some
circumstances but if multiplied by another 32-bit value it could lead to
a 32-bit overflow and an incorrect result.
One specific instance of this in spapr_lmb_dt_populate() was spotted by
Coverity (CID 1399145).
Reported-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
target/ppc/spapr: Clear partition table entry when allocating hash table
If we allocate a hash page table then we know that the guest won't be
using process tables, so set the partition table entry maintained for
the guest to zero. If this isn't done, then the guest radix bit will
remain set in the entry. This means that when the guest calls
H_REGISTER_PROCESS_TABLE there will be a mismatch between then flags
and the value in spapr->patb_entry, and the call will fail. The guest
will then panic:
Failed to register process table (rc=-4)
kernel BUG at arch/powerpc/platforms/pseries/lpar.c:959
The result being that it isn't possible to boot a hash guest on a P9
system.
Also fix a bug in the flags parsing in h_register_process_table() which
was introduced by the same patch, and simplify the handling to make it
less likely that errors will be introduced in the future. The effect
would have been setting the host radix bit LPCR_HR for a hash guest
using process tables, which currently isn't supported and so couldn't
have been triggered.
Fixes: 00fd075e18 "target/ppc/spapr: Set LPCR:HR when using Radix mode" Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Message-Id: <20190305022102.17610-1-sjitindarsingh@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Alexander Graf [Mon, 4 Mar 2019 10:39:30 +0000 (11:39 +0100)]
PPC: E500: Update u-boot to v2019.01
Quite a while has passed since we last updated U-Boot for e500. This patch
bumps it to the last released version 2019.01 to make sure users don't feel
like they're using out of date software.
Signed-off-by: Alexander Graf <agraf@csgraf.de>
Message-Id: <20190304103930.16319-1-agraf@csgraf.de> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Fabiano Rosas [Thu, 28 Feb 2019 22:57:58 +0000 (19:57 -0300)]
target/ppc: Refactor kvm_handle_debug
There are four scenarios being handled in this function:
- single stepping
- hardware breakpoints
- software breakpoints
- fallback (no debug supported)
A future patch will add code to handle specific single step and
software breakpoints cases so let's split each scenario into its own
function now to avoid hurting readability.
target/ppc/spapr: Enable mitigations by default for pseries-4.0 machine type
There are currently 3 mitigations the availability of which is controlled
by the spapr-caps mechanism, cap-cfpc, cap-sbbc, and cap-ibs. Enable these
mitigations by default for the pseries-4.0 machine type.
By now machine firmware should have been upgraded to allow these
settings.
target/ppc/tcg: make spapr_caps apply cap-[cfpc/sbbc/ibs] non-fatal for tcg
The spapr_caps cap-cfpc, cap-sbbc and cap-ibs are used to control the
availability of certain mitigations to the guest. These haven't been
implemented under TCG, it is unlikely they ever will be, and it is unclear
as to whether they even need to be.
As such, make failure to apply these capabilities under TCG non-fatal.
Instead we print a warning message to the user but still allow the guest
to continue.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Message-Id: <20190301044609.9626-2-sjitindarsingh@gmail.com>
[dwg: Small style fix] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Introduce a new spapr_cap SPAPR_CAP_CCF_ASSIST to be used to indicate
the requirement for a hw-assisted version of the count cache flush
workaround.
The count cache flush workaround is a software workaround which can be
used to flush the count cache on context switch. Some revisions of
hardware may have a hardware accelerated flush, in which case the
software flush can be shortened. This cap is used to set the
availability of such hardware acceleration for the count cache flush
routine.
The availability of such hardware acceleration is indicated by the
H_CPU_CHAR_BCCTR_FLUSH_ASSIST flag being set in the characteristics
returned from the KVM_PPC_GET_CPU_CHAR ioctl.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Message-Id: <20190301031912.28809-2-sjitindarsingh@gmail.com>
[dwg: Small style fixes] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
target/ppc/spapr: Add workaround option to SPAPR_CAP_IBS
The spapr_cap SPAPR_CAP_IBS is used to indicate the level of capability
for mitigations for indirect branch speculation. Currently the available
values are broken (default), fixed-ibs (fixed by serialising indirect
branches) and fixed-ccd (fixed by diabling the count cache).
Introduce a new value for this capability denoted workaround, meaning that
software can work around the issue by flushing the count cache on
context switch. This option is available if the hypervisor sets the
H_CPU_BEHAV_FLUSH_COUNT_CACHE flag in the cpu behaviours returned from
the KVM_PPC_GET_CPU_CHAR ioctl.
target/ppc/spapr: Enable the large decrementer for pseries-4.0
Enable the large decrementer by default for the pseries-4.0 machine type.
It is disabled again by default_caps_with_cpu() for pre-POWER9 cpus
since they don't support the large decrementer.
target/ppc: Implement large decrementer support for KVM
Implement support to allow KVM guests to take advantage of the large
decrementer introduced on POWER9 cpus.
To determine if the host can support the requested large decrementer
size, we check it matches that specified in the ibm,dec-bits device-tree
property. We also need to enable it in KVM by setting the LPCR_LD bit in
the LPCR. Note that to do this we need to try and set the bit, then read
it back to check the host allowed us to set it, if so we can use it but
if we were unable to set it the host cannot support it and we must not
use the large decrementer.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190301024317.22137-3-sjitindarsingh@gmail.com>
[dwg: Small style fixes] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
target/ppc: Implement large decrementer support for TCG
Prior to POWER9 the decrementer was a 32-bit register which decremented
with each tick of the timebase. From POWER9 onwards the decrementer can
be set to operate in a mode called large decrementer where it acts as a
n-bit decrementing register which is visible as a 64-bit register, that
is the value of the decrementer is sign extended to 64 bits (where n is
implementation dependant).
The mode in which the decrementer operates is controlled by the LPCR_LD
bit in the logical paritition control register (LPCR).
>From POWER9 onwards the HDEC (hypervisor decrementer) was enlarged to
h-bits, also sign extended to 64 bits (where h is implementation
dependant). Note this isn't configurable and is always enabled.
On POWER9 the large decrementer and hdec are both 56 bits, as
represented by the lrg_decr_bits cpu class property. Since they are the
same size we only add one property for now, which could be extended in
the case they ever differ in the future.
We also add the lrg_decr_bits property for POWER5+/7/8 since it is used
to determine the size of the hdec, which is only generated on the
POWER5+ processor and later. On these processors it is 32 bits.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190301024317.22137-2-sjitindarsingh@gmail.com>
[dwg: Small style fixes] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Greg Kurz [Fri, 1 Mar 2019 19:32:31 +0000 (20:32 +0100)]
Revert "spapr: support memory unplug for qtest"
Commit b8165118f52c broke CPU hotplug tests for old machine types:
$ QTEST_QEMU_BINARY=ppc64-softmmu/qemu-system-ppc64 ./tests/cpu-plug-test -m=slow
/ppc64/cpu-plug/pseries-3.1/device-add/2x3x1&maxcpus=6: OK
/ppc64/cpu-plug/pseries-2.12-sxxm/device-add/2x3x1&maxcpus=6: OK
/ppc64/cpu-plug/pseries-3.0/device-add/2x3x1&maxcpus=6: OK
/ppc64/cpu-plug/pseries-2.10/device-add/2x3x1&maxcpus=6: OK
/ppc64/cpu-plug/pseries-2.11/device-add/2x3x1&maxcpus=6: OK
/ppc64/cpu-plug/pseries-2.12/device-add/2x3x1&maxcpus=6: OK
/ppc64/cpu-plug/pseries-2.9/device-add/2x3x1&maxcpus=6: OK
/ppc64/cpu-plug/pseries-2.7/device-add/2x3x1&maxcpus=6: **
ERROR:/home/thuth/devel/qemu/hw/ppc/spapr_events.c:313:rtas_event_log_to_source: assertion failed: (source->enabled)
Broken pipe
/home/thuth/devel/qemu/tests/libqtest.c:143: kill_qemu() detected QEMU death from signal 6 (Aborted) (core dumped)
Aborted (core dumped)
The approach of faking the availability of OV5_HP_EVT causes the
code to assume the hotplug event source is enabled, which is wrong
for older machines.
A subsequent patch will address the problem of CAS under qtest from
a different angle.
Reported-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155146875097.147873.1732264036668112686.stgit@bahia.lan> Tested-by: Michael Roth <mdroth@linux.vnet.ibm.com> Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Greg Kurz [Fri, 1 Mar 2019 19:32:37 +0000 (20:32 +0100)]
spapr: Simulate CAS for qtest
The RTAS event hotplug code for machine types 2.8 and newer depends on
the CAS negotiated ov5 in order to work properly. However, there's no
CAS when running under qtest. There has been a tentative to trick the
code by faking the OV5_HP_EVT bit, but it turned out to break other
assumptions in the code and the change got reverted.
Go for a more general approach and simulate a CAS when running under
qtest. For simplicity, this pseudo CAS simple simulates the case where
the guest supports the same features as the machine. It is done at
reset time, just before we reset the DRCs, which could potentially
exercise the unplug code.
This allows to test unplug on spapr with both older and newer machine
types.
Suggested-by: Michael Roth <mdroth@linux.vnet.ibm.com> Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155146875704.147873.10563808578795890265.stgit@bahia.lan> Tested-by: Michael Roth <mdroth@linux.vnet.ibm.com> Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The "systempagesize" name suggests that it is the host system page size
while it is the smallest page size of memory backing the guest RAM so
let's rename it to stop confusion. This should cause no behavioral change.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20190227085149.38596-4-aik@ozlabs.ru> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The current code assumes that we can address more bits on a PCI bus
for DMA than we really can but there is no way knowing the actual limit.
This makes a better guess for the number of levels and if the kernel
fails to allocate that, this increases the level numbers till succeeded
or reached the 64bit limit.
This adds levels to the trace point.
This may cause the kernel to warn about failed allocation:
[65122.837458] Failed to allocate a TCE memory, level shift=28
which might happen if MAX_ORDER is not large enough as it can vary:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/powerpc/Kconfig?h=v5.0-rc2#n727
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20190227085149.38596-3-aik@ozlabs.ru> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* remotes/bonzini/tags/for-upstream: (31 commits)
qemugdb: fix licensing
chardev: add support for authorization for TLS clients
qom: cpu: destroy work_mutex in cpu_common_finalize
exec.c: refactor function flatview_add_to_dispatch()
lsi: 810/895A are always little endian
lsi: return dfifo value
lsi: use SCSI phase names instead of numbers in trace
lsi: use enum type for s->msg_action
lsi: use enum type for s->waiting
lsi: use ldn_le_p()/stn_le_p()
scsi-disk: Fix crash if request is invaild or disk is no medium
configure: Disable W^X on OpenBSD
oslib-posix: Ignore fcntl("/dev/null", F_SETFL, O_NONBLOCK) failure
accel: Allow to build QEMU without TCG or KVM support
build: clean trace/generated-helpers.c
build: remove unnecessary assignments from Makefile.target
build: get rid of target-obj-y
update copyright notice
lsi: check if SIGP bit is already set in Wait reselect
lsi: implement basic SBCL functionality
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Mon, 11 Mar 2019 17:16:38 +0000 (17:16 +0000)]
Merge remote-tracking branch 'remotes/amarkovic/tags/mips-queue-mar-11-2019' into staging
MIPS queue for March 11th, 2019
# gpg: Signature made Mon 11 Mar 2019 14:16:09 GMT
# gpg: using RSA key D4972A8967F75A65
# gpg: Good signature from "Aleksandar Markovic <amarkovic@wavecomp.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 8526 FBF1 5DA3 811F 4A01 DD75 D497 2A89 67F7 5A65
* remotes/amarkovic/tags/mips-queue-mar-11-2019:
target/mips: Add tests for a variety of MSA integer subtract instructions
target/mips: Add tests for a variety of MSA integer multiply instructions
target/mips: Add tests for a variety of MSA integer dot product instructions
target/mips: Add tests for a variety of MSA integer divide instructions
target/mips: Add tests for a variety of MSA integer average instructions
tests/tcg: target/mips: Rename two header files for consistency
tests/tcg: target/mips: Correct preambles of test source files
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
chardev: add support for authorization for TLS clients
Currently any client which can complete the TLS handshake is able to use
a chardev server. The server admin can turn on the 'verify-peer' option
for the x509 creds to require the client to provide a x509
certificate. This means the client will have to acquire a certificate
from the CA before they are permitted to use the chardev server. This is
still a fairly low bar.
This adds a 'tls-authz=OBJECT-ID' option to the socket chardev backend
which takes the ID of a previously added 'QAuthZ' object instance. This
will be used to validate the client's x509 distinguished name. Clients
failing the check will not be permitted to use the chardev server.
For example to setup authorization that only allows connection from a
client whose x509 certificate distinguished name contains 'CN=fred', you
would use:
Li Qiang [Wed, 2 Jan 2019 07:41:14 +0000 (23:41 -0800)]
qom: cpu: destroy work_mutex in cpu_common_finalize
Commit 376692b9dc6(cpus: protect work list with work_mutex)
initialize a work_mutex in cpu_common_initfn, however forget
to destroy it. This will cause resource leak when hotunplug cpu
or hotplug cpu fails.
Signed-off-by: Li Qiang <liq3ea@163.com>
Message-Id: <20190102074114.26988-1-liq3ea@163.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Wei Yang [Mon, 11 Mar 2019 05:42:52 +0000 (13:42 +0800)]
exec.c: refactor function flatview_add_to_dispatch()
flatview_add_to_dispatch() registers page based on the condition of
*section*, which may looks like this:
|s|PPPPPPP|s|
where s stands for subpage and P for page.
The procedure of this function could be described as:
- register first subpage
- register page
- register last subpage
This means the procedure could be simplified into these three steps
instead of a loop iteration.
This patch refactors the function into three corresponding steps and
adds some comment to clarify it.
Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Message-Id: <20190311054252.6094-1-richardw.yang@linux.intel.com>
[Paolo: move exit before adjustment of remain.offset_within_*,
otherwise int128_get64 fails when a region is 2^64 bytes long] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sven Schnelle [Mon, 18 Feb 2019 17:55:28 +0000 (18:55 +0100)]
lsi: 810/895A are always little endian
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Message-Id: <20190218175529.11237-1-svens@stackframe.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sven Schnelle [Tue, 5 Mar 2019 19:55:18 +0000 (20:55 +0100)]
lsi: use SCSI phase names instead of numbers in trace
This makes trace logs much easier to read, especially for
people who are not fluent in SCSI.
Signed-off-by: Sven Schnelle <svens@stackframe.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190305195519.24303-5-svens@stackframe.org>
Sven Schnelle [Tue, 5 Mar 2019 19:55:17 +0000 (20:55 +0100)]
lsi: use enum type for s->msg_action
This makes the code easier to read - no functional change.
Signed-off-by: Sven Schnelle <svens@stackframe.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190305195519.24303-4-svens@stackframe.org>
Sven Schnelle [Tue, 5 Mar 2019 19:55:16 +0000 (20:55 +0100)]
lsi: use enum type for s->waiting
This makes the code easier to read - no functional change.
Signed-off-by: Sven Schnelle <svens@stackframe.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190305195519.24303-3-svens@stackframe.org>
Sven Schnelle [Tue, 5 Mar 2019 19:55:15 +0000 (20:55 +0100)]
lsi: use ldn_le_p()/stn_le_p()
Instead of using the open-coded versions, use the helper already
present as this makes the code easier to read and less error-prone.
Signed-off-by: Sven Schnelle <svens@stackframe.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190305195519.24303-2-svens@stackframe.org>
Zhengui Li [Thu, 7 Mar 2019 09:12:46 +0000 (17:12 +0800)]
scsi-disk: Fix crash if request is invaild or disk is no medium
Qemu will crash with the assertion error that "assert(r->req.aiocb !=
NULL)" in scsi_read_complete if request is invaild or disk is no medium.
The error is below:
qemu-kvm: hw/scsi/scsi_disk.c:299: scsi_read_complete: Assertion
`r->req.aiocb != NULL' failed.
This patch add a funtion scsi_read_complete_noio to fix it.
Signed-off-by: Zhengui Li <lizhengui@huawei.com>
Message-Id: <1551949966-20092-1-git-send-email-lizhengui@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Since OpenBSD 6.0 [1], W^X is enforced by default [2].
TCG requires WX access. Disable W^X if it is available.
This fixes:
# lm32-softmmu/qemu-system-lm32
Could not allocate dynamic translator buffer
# sysctl kern.wxabort=1
kern.wxabort: 0 -> 1
# lm32-softmmu/qemu-system-lm32
mmap: Not supported
Abort trap (core dumped)
# gdb -q lm32-softmmu/qemu-system-lm32 qemu-system-lm32.core
(gdb) bt
#0 0x000017e3c156c50a in _thread_sys___syscall () at {standard input}:5
#1 0x000017e3c15e5d7a in *_libc_mmap (addr=Variable "addr" is not available.) at /usr/src/lib/libc/sys/mmap.c:47
#2 0x000017e17d9abc8b in alloc_code_gen_buffer () at /usr/src/qemu/accel/tcg/translate-all.c:1064
#3 0x000017e17d9abd04 in code_gen_alloc (tb_size=0) at /usr/src/qemu/accel/tcg/translate-all.c:1112
#4 0x000017e17d9abe81 in tcg_exec_init (tb_size=0) at /usr/src/qemu/accel/tcg/translate-all.c:1149
#5 0x000017e17d9897e9 in tcg_init (ms=0x17e45e456800) at /usr/src/qemu/accel/tcg/tcg-all.c:66
#6 0x000017e17d9891b8 in accel_init_machine (acc=0x17e3c3f50800, ms=0x17e45e456800) at /usr/src/qemu/accel/accel.c:63
#7 0x000017e17d989312 in configure_accelerator (ms=0x17e45e456800, progname=0x7f7fffff07b0 "lm32-softmmu/qemu-system-lm32") at /usr/src/qemu/accel/accel.c:111
#8 0x000017e17d9d8616 in main (argc=1, argv=0x7f7fffff06b8, envp=0x7f7fffff06c8) at vl.c:4325
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190307142822.8531-3-philmd@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Previous to OpenBSD 6.3 [1], fcntl(F_SETFL) is not permitted on
memory devices.
Trying this call sets errno to ENODEV ("not a memory device"):
19 ENODEV Operation not supported by device.
An attempt was made to apply an inappropriate function to a device,
for example, trying to read a write-only device such as a printer.
Do not assert fcntl failures in this specific case (errno set to ENODEV)
on OpenBSD. This fixes:
$ lm32-softmmu/qemu-system-lm32
assertion "f != -1" failed: file "util/oslib-posix.c", line 247, function "qemu_set_nonblock"
Abort trap (core dumped)
[1] The fix seems https://github.com/openbsd/src/commit/c2a35b387f9d3c
"fcntl(F_SETFL) invokes the FIONBIO and FIOASYNC ioctls internally, so
the memory devices (/dev/null, /dev/zero, etc) need to permit them."
Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190307142822.8531-2-philmd@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Anthony PERARD [Wed, 16 Jan 2019 17:35:27 +0000 (17:35 +0000)]
accel: Allow to build QEMU without TCG or KVM support
Instead of deny build of QEMU without a default accelerator, simply
report an error when the user haven't passed -accel or -machine accel=
and TCG and KVM isn't builtin.
./configure already check that at least one accelerator is available.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Fri, 15 Feb 2019 09:15:22 +0000 (10:15 +0100)]
build: remove unnecessary assignments from Makefile.target
It is only necessary to clear block-obj-y because Makefile.objs
uses "+=" instead of "="; fix that and remove the assignment.
The other variables need not be cleared at all.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
David Kiarie [Mon, 4 Mar 2019 15:18:27 +0000 (18:18 +0300)]
update copyright notice
Signed-off-by: David Kiarie <davidkiarie4@gmail.com>
Message-Id: <20190304151827.1813-2-davidkiarie4@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sven Schnelle [Sun, 17 Feb 2019 11:37:17 +0000 (12:37 +0100)]
lsi: check if SIGP bit is already set in Wait reselect
If SIGP is set, the 'Wait for Reselection' command should jump
immediately to the address stored in the second DWORD of the
instruction. This fixes spurious hangs in the HP-UX 11.11
installer when the SIGP bit gets set by the kernel before the
'Wait for Reselection' command is executed by SCRIPTS.
Signed-off-by: Sven Schnelle <svens@stackframe.org> Tested-by: Helge Deller <deller@gmx.de>
Message-Id: <20190217113717.7077-1-svens@stackframe.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sven Schnelle [Fri, 15 Feb 2019 19:40:21 +0000 (20:40 +0100)]
lsi: implement basic SBCL functionality
HP-UX checks this register after sending data to the target. If there's no valid
information present, it assumes the client disconnected because the kernel sent
to much data. Implement at least some of the SBCL functionality that is possible
without having a real SCSI bus.
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Message-Id: <20190215194021.20543-1-svens@stackframe.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Greg Kurz [Thu, 28 Feb 2019 17:59:42 +0000 (18:59 +0100)]
virtio-scsi: Fix build with gcc 9
Build fails with gcc 9:
CC ppc64-softmmu/hw/scsi/virtio-scsi.o
hw/scsi/virtio-scsi.c: In function ‘virtio_scsi_do_tmf’:
hw/scsi/virtio-scsi.c:265:39: error: taking address of packed member of ‘struct virtio_scsi_ctrl_tmf_req’ may result in an unaligned pointer value [-Werror=address-of-packed-member]
265 | virtio_tswap32s(VIRTIO_DEVICE(s), &req->req.tmf.subtype);
| ^~~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
All the fields in struct virtio_scsi_ctrl_tmf_req are naturally aligned,
so we could in theory drop QEMU_PACKED. Unfortunately, the header file
is imported from linux which already has the packed attribute. Trying to
fix that in the update-linux-headers.sh script is likely to produce
ugliness. Turn the call to virtio_tswap32s() into an assignment instead.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155137678223.44753.5438092367451176318.stgit@bahia.lan> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Thu, 28 Feb 2019 09:23:18 +0000 (10:23 +0100)]
target-i386: add kvm stubs to user-mode emulators
The CPUID code will call kvm_arch_get_supported_cpuid() and, even though
it is undef kvm_enabled() so it never runs for user-mode emulators,
sometimes clang will not optimize it out at -O0.
That could be considered a compiler bug, however at -O0 we give it
a pass and just add the stubs.
The configure script checks multiple times whether it works in a git
repository and it does this by "test -e "${source_path}/.git" in 4 cases
but in one case where it tries to enable werror "-d" is used there which
fails on git worktrees as .git is a file then and not a directory.
This changes the test to "-e" as other occurrences.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20190228043503.68494-1-aik@ozlabs.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Before this patch, if elf2dmp failed to find NT kernel PE magic in
allowed virtual address range, then it assumes NULL as NT kernel
address and cause segfault.
This patch fix the problem described above by checking NT kernel address
before futher processing.
Signed-off-by: Viktor Prutyanov <viktor.prutyanov@phystech.edu>
Message-Id: <20190219211936.6466-1-viktor.prutyanov@phystech.edu> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some Linux specific code is missing guards, leading to
build failure on OSX:
$ sudo brew install libiscsi
$ ./configure && make
[...]
CC block/iscsi.o
qemu/block/iscsi.c:338:24: error: 'iscsi_aiocb_info' defined but not used [-Werror=unused-const-variable=]
static const AIOCBInfo iscsi_aiocb_info = {
^~~~~~~~~~~~~~~~
qemu/block/iscsi.c:168:1: error: 'iscsi_schedule_bh' defined but not used [-Werror=unused-function]
iscsi_schedule_bh(IscsiAIOCB *acb)
^~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
Add guards to restrict this code for Linux.
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190220000553.28438-1-philmd@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
hw/i386/pc: run the multiboot loader before the PVH loader
Some multiboot images could be in the ELF format. In the current
implementation QEMU fails because we try to load these images
as a PVH image.
In order to fix this issue, we should try multiboot first (we
already check the multiboot magic header before to load it).
If it is not a multiboot image, we can try the PVH loader.
Fixes: ab969087da6 ("pvh: Boot uncompressed kernel using direct boot ABI", 2019-01-15) Reported-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20190214180216.246707-1-sgarzare@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Li Qiang [Sun, 10 Mar 2019 16:02:27 +0000 (09:02 -0700)]
tests: test-qgraph: fix a memory leak
Spotted by ASAN when 'make check'.
Signed-off-by: Li Qiang <liq3ea@163.com>
Message-Id: <20190310160227.103090-1-liq3ea@163.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Fixes: fc281c80202 Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Fri, 8 Mar 2019 17:33:27 +0000 (18:33 +0100)]
vfio-pci: enable by default
CONFIG_VFIO_PCI was not "default y" - and once you do that, it is also
important to disable it if PCI is not there.
Reported-by: Alex Williamson <alex.williamson@redhat.com> Tested-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
tests/tcg: target/mips: Rename two header files for consistency
Rename two header files for consistency and clarity. Do all other
changes to accommodate new names.
Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com> Reviewed-by: Aleksandar Rikalo <amarkovic@wavecomp.com>
Message-Id: <1551981716-30664-3-git-send-email-aleksandar.markovic@rt-rk.com>
tests/tcg: target/mips: Correct preambles of test source files
Correct preambles of test source files.
Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com> Reviewed-by: Aleksandar Rikalo <amarkovic@wavecomp.com>
Message-Id: <1551981716-30664-2-git-send-email-aleksandar.markovic@rt-rk.com>
Peter Maydell [Fri, 8 Mar 2019 13:57:44 +0000 (13:57 +0000)]
Makefile: Don't install non-sphinx files in sphinx docs install
If we're doing an out-of-tree build of Sphinx, then we
copy some extra spurious files to the install directory
as part of 'make install':
qemu-ga-qapi.texi
qemu-ga-ref.7
qemu-ga-ref.7.pod
qemu-ga-ref.html
qemu-ga-ref.txt
qemu-qmp-qapi.texi
qemu-qmp-ref.7
qemu-qmp-ref.7.pod
qemu-qmp-ref.html
qemu-qmp-ref.txt
because these have been built into build/docs/interop along
with the Sphinx interop documents. Filter them out of the
set of files we install when we're installing the Sphinx-built
manual files. (They are installed into their correct locations
as part of the main install-doc target already.)
Fixes: 5f71eac06e15b9a3fa1134d446f ("Makefile, configure: Support building rST documentation") Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190308135744.6480-4-peter.maydell@linaro.org
Peter Maydell [Fri, 8 Mar 2019 13:57:43 +0000 (13:57 +0000)]
Makefile: Fix 'make distclean'
We forgot the '-r' option on the rm command to clean up the
Sphinx .doctrees working directory, which meant that
"make distclean" fails:
rm: cannot remove '.doctrees': Is a directory
Add the missing option.
Fixes: 5f71eac06e15b9a3fa1134d446f ("Makefile, configure: Support building rST documentation") Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190308135744.6480-3-peter.maydell@linaro.org
Peter Maydell [Fri, 8 Mar 2019 13:57:42 +0000 (13:57 +0000)]
Makefile: Fix Sphinx documentation builds for in-tree builds
The Sphinx build-sphinx tool does not permit building a manual
into the same directory as its source files. This meant that
commit 5f71eac06e15b9a3fa1134d446f broke QEMU in-source-tree
builds, which would fail with:
Error: source directory and destination directory are same.
Fix this by making in-tree builds build the Sphinx manuals
into a subdirectory of docs/.
Fixes: 5f71eac06e15b9a3fa1134d446f ("Makefile, configure: Support building rST documentation") Reported-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190308135744.6480-2-peter.maydell@linaro.org
Combine all variant in a single handler. As source and destination
have different element sizes, we can't use gvec expansion. Expand
manually. Also watch out for overlapping source and destination
registers. Use a safe evaluation order depending on the operation.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-33-david@redhat.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Instead of checking e.g. the first access on every touched page, we should
check the actual access, otherwise we might get false positives when Low
Address Protection (LAP) is active. As probe_write() can only deal with
accesses to one page, we have to loop.
Use i64 for the length, although not needed - easier to reuse
TCG temps we already have in the translation functions where this will
be used. Also allow it to be used from other helpers.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-28-david@redhat.com>
[CH: add missing page_check_range()] Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Cornelia Huck <cohuck@redhat.com>