KONRAD Frederic [Thu, 21 Mar 2013 14:15:17 +0000 (15:15 +0100)]
virtio-scsi-ccw: switch to new API
Here the virtio-scsi-ccw is modified for the new API. The device
virtio-scsi-ccw extends virtio-ccw-device as before. It creates and
connects a virtio-scsi during the init. The properties are not modified.
Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Tested-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Message-id: 1363875320-7985-8-git-send-email-fred.konrad@greensocs.com Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
KONRAD Frederic [Thu, 21 Mar 2013 14:15:16 +0000 (15:15 +0100)]
virtio-scsi-s390: switch to the new API.
Here the virtio-scsi-s390 is modified for the new API. The device
virtio-scsi-s390 extends virtio-s390-device as before. It creates and
connects a virtio-scsi during the init. The properties are not modified.
Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Tested-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Message-id: 1363875320-7985-7-git-send-email-fred.konrad@greensocs.com Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
KONRAD Frederic [Thu, 21 Mar 2013 14:15:15 +0000 (15:15 +0100)]
virtio-scsi-pci: switch to new API.
Here the virtio-scsi-pci is modified for the new API. The device virtio-scsi-pci
extends virtio-pci. It creates and connects a virtio-scsi during the init.
Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Tested-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Message-id: 1363875320-7985-6-git-send-email-fred.konrad@greensocs.com Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Anthony Liguori [Tue, 26 Mar 2013 14:25:45 +0000 (09:25 -0500)]
Merge remote-tracking branch 'luiz/queue/qmp' into staging
# By Corey Bryant (2) and others
# Via Luiz Capitulino
* luiz/queue/qmp:
New QMP command query-cpu-max and HMP command cpu_max
qmp: fix handling of boolean values in qmp-shell
QMP: TPM QMP and man page documentation updates
QMP: Remove duplicate TPM type from query-tpm
Orit Wasserman [Fri, 22 Mar 2013 14:48:03 +0000 (16:48 +0200)]
Use qemu_put_buffer_async for guest memory pages
This will remove an unneeded copy of guest memory pages.
For the page header and device state we still copy the data to the
static buffer the other option is to allocate the memory on demand
which is more expensive.
Signed-off-by: Orit Wasserman <owasserm@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
Orit Wasserman [Fri, 22 Mar 2013 14:48:02 +0000 (16:48 +0200)]
Add qemu_put_buffer_async
This allows us to add a buffer to the iovec to send without copying it
into the static buffer, the buffer will be sent later when qemu_fflush is called.
Signed-off-by: Orit Wasserman <owasserm@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
Peter Lieven [Tue, 26 Mar 2013 09:58:39 +0000 (10:58 +0100)]
migration: use XBZRLE only after bulk stage
at the beginning of migration all pages are marked dirty and
in the first round a bulk migration of all pages is performed.
currently all these pages are copied to the page cache regardless
of whether they are frequently updated or not. this doesn't make sense
since most of these pages are never transferred again.
this patch changes the XBZRLE transfer to only be used after
the bulk stage has been completed. that means a page is added
to the page cache the second time it is transferred and XBZRLE
can benefit from the third time of transfer.
since the page cache is likely smaller than the number of pages
it's also likely that in the second round the page is missing in the
cache due to collisions in the bulk phase.
on the other hand a lot of unnecessary mallocs, memdups and frees
are saved.
the following results have been taken earlier while executing
the test program from docs/xbzrle.txt. (+) with the patch and (-)
without. (thanks to Eric Blake for reformatting and comments)
+ total time: 22185 milliseconds
- total time: 22410 milliseconds
Peter Lieven [Tue, 26 Mar 2013 09:58:37 +0000 (10:58 +0100)]
migration: do not sent zero pages in bulk stage
during bulk stage of ram migration if a page is a
zero page do not send it at all.
the memory at the destination reads as zero anyway.
even if there is an madvise with QEMU_MADV_DONTNEED
at the target upon receipt of a zero page I have observed
that the target starts swapping if the memory is overcommitted.
it seems that the pages are dropped asynchronously.
this patch also updates QMP to return the number of
skipped pages in MigrationStats.
Signed-off-by: Peter Lieven <pl@kamp.de> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
Peter Lieven [Tue, 26 Mar 2013 09:58:36 +0000 (10:58 +0100)]
migration: add an indicator for bulk state of ram migration
the first round of ram transfer is special since all pages
are dirty and thus all memory pages are transferred to
the target. this patch adds a boolean variable to track
this stage.
Signed-off-by: Peter Lieven <pl@kamp.de> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Orit Wasserman <owasserm@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
Peter Lieven [Tue, 26 Mar 2013 09:58:32 +0000 (10:58 +0100)]
cutils: add a function to find non-zero content in a buffer
this adds buffer_find_nonzero_offset() which is a SSE2/Altivec
optimized function that searches for non-zero content in a
buffer.
the function starts full unrolling only after the first few chunks have
been checked one by one. analyzing real memory page data has revealed
that non-zero pages are non-zero within the first 256-512 bits in
most cases. as this function is also heavily used to check for zero memory
pages this tweak has been made to avoid the high setup costs of the fully
unrolled check for non-zero pages.
due to the optimizations used in the function there are restrictions
on buffer address and search length. the function
can_use_buffer_find_nonzero_content() can be used to check if
the function can be used safely.
Signed-off-by: Peter Lieven <pl@kamp.de> Signed-off-by: Juan Quintela <quintela@redhat.com>
David Gibson [Tue, 12 Mar 2013 03:06:04 +0000 (14:06 +1100)]
savevm: Fix bugs in the VMSTATE_VBUFFER_MULTIPLY definition
The VMSTATE_BUFFER_MULTIPLY macro is misnamed - it actually specifies
a variably sized buffer with VMS_VBUFFER, so should be named
VMSTATE_VBUFFER_MULTIPLY. This patch fixes this (the macro had no current
users under either name).
In addition, unlike the other VMSTATE_VBUFFER variants, this macro did not
specify VMS_POINTER. This patch fixes this bug as well.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Juan Quintela <quintela@redhat.com>
David Gibson [Tue, 12 Mar 2013 03:06:03 +0000 (14:06 +1100)]
savevm: Add VMSTATE_STRUCT_VARRAY_POINTER_UINT32
Currently the savevm code contains a VMSTATE_STRUCT_VARRAY_POINTER_INT32
helper (a variably sized array with the number of elements in an int32_t),
but not VMSTATE_STRUCT_VARRAY_POINTER_UINT32 (... with the number of
elements in a uint32_t). This patch (trivially) fixes the deficiency.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Juan Quintela <quintela@redhat.com>
David Gibson [Tue, 12 Mar 2013 03:06:02 +0000 (14:06 +1100)]
savevm: Add VMSTATE_FLOAT64 helpers
The current savevm code includes VMSTATE helpers for a number of commonly
used data types, but not for the float64 type used by the internal floating
point emulation code. This patch fixes the deficiency.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Juan Quintela <quintela@redhat.com>
David Gibson [Tue, 12 Mar 2013 03:06:00 +0000 (14:06 +1100)]
savevm: Add VMSTATE_UINT64_EQUAL helpers
The savevm code already includes a number of *_EQUAL helpers which act as
sanity checks verifying that the configuration of the saved state matches
that of the machine we're loading into to work. Variants already exist
for 8 bit 16 bit and 32 bit integers, but not 64 bit integers. This patch
fills that hole, adding a UINT64 version.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Juan Quintela <quintela@redhat.com>
Igor Mammedov [Mon, 25 Mar 2013 14:48:46 +0000 (15:48 +0100)]
qmp: fix handling of boolean values in qmp-shell
qmp-shell converts only integer arguments and the rest
is assumed to be strings which are faithfully sent as
quoted strings by json. But QEMU refuses to accept qmp
command with boolean argument whose value is escaped
as string.
Fix it by special-casing true/false keywords and store
value as corresponding boolean.
Signed-off-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Anthony Liguori [Mon, 25 Mar 2013 18:14:26 +0000 (13:14 -0500)]
Merge remote-tracking branch 'stefanha/net' into staging
# By Dmitry Fleytman (5) and others
# Via Stefan Hajnoczi
* stefanha/net:
net: increase buffer size to accommodate Jumbo frame pkts
VMXNET3 device implementation
Packet abstraction for VMWARE network devices
Common definitions for VMWARE devices
net: iovec checksum calculator
Checksum-related utility functions
net: use socket_set_nodelay() for -netdev socket
Anthony Liguori [Mon, 25 Mar 2013 18:14:20 +0000 (13:14 -0500)]
Merge remote-tracking branch 'stefanha/block' into staging
# By Liu Yuan (1) and Stefan Weil (1)
# Via Stefan Hajnoczi
* stefanha/block:
block: Add options QDict to bdrv_file_open() prototypes (fix MinGW build)
rbd: fix compile error
Scott Feldman [Mon, 18 Mar 2013 18:43:44 +0000 (11:43 -0700)]
net: increase buffer size to accommodate Jumbo frame pkts
Socket buffer sizes were hard-coded to 4K for VDE and socket netdevs. Bump this
up to 68K (ala tap netdev) to handle maximum GSO packet size (64k) plus plenty
of room for the ethernet and virtio_net headers.
Originally, ran into this limitation when using -netdev UDP sockets to connect
VM-to-VM, where VM interface is configure with MTU=9000. (Using virtio_net
NIC model). Test is simple: ping -M do -s 8500 <target>. This test will
attempt to ping with unfragmented packet of given size. Without patch, size
is limited to < 4K (minus protocol hdrs). With patch, ping test works with pkt
size up to 9000 (again, minus protocol hdrs).
v2: per Stefan, increase buf size to (4096+65536) as done in tap and apply
to vde and socket netdevs.
v1: increase buf size to 12K just for -netdev UDP sockets
Signed-off-by: Scott Feldman <sfeldma@cumulusnetworks.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Stefan Hajnoczi [Wed, 27 Feb 2013 14:05:47 +0000 (15:05 +0100)]
net: use socket_set_nodelay() for -netdev socket
Reduce -netdev socket latency by disabling the Nagle algorithm on
SOCK_STREAM sockets in net/socket.c. Since we are tunelling Ethernet
over TCP we shouldn't artificially delay outgoing packets, let the guest
decide packet scheduling.
I already get sub-millisecond -netdev socket ping times on localhost, so
there was no measurable difference in my testing. This won't hurt
though and may improve remote socket performance.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Benoit Canet <benoit@irqsave.net> Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com> Cc: Eric Blake <eblake@redhat.com> Signed-off-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Liu Yuan <tailai.ly@taobao.com> Reviewed-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Aurelien Jarno [Fri, 22 Mar 2013 20:43:57 +0000 (21:43 +0100)]
Merge branch 'ppc-for-upstream' of git://github.com/agraf/qemu
* 'ppc-for-upstream' of git://github.com/agraf/qemu: (58 commits)
target-ppc: Use NARROW_MODE macro for tlbie
target-ppc: Use NARROW_MODE macro for addresses
target-ppc: Use NARROW_MODE macro for comparisons
target-ppc: Use NARROW_MODE macro for branches
target-ppc: Fix add and subf carry generation in narrow mode
target-ppc: Use QOM method dispatch for MMU fault handling
target-ppc: Move ppc tlb_fill implementation into mmu_helper.c
target-ppc: Split user only code out of mmu_helper.c
mmu-hash64: Implement Virtual Page Class Key Protection
mmu-hash*: Merge translate and fault handling functions
mmu-hash*: Don't use full ppc_hash{32, 64}_translate() path for get_phys_page_debug()
mmu-hash*: Correctly mask RPN from hash PTE
mmu-hash*: Clean up real address calculation
mmu-hash*: Clean up PTE flags update
mmu-hash64: Factor SLB N bit into permissions bits
mmu-hash*: Clean up permission checking
mmu-hash32: Remove nx from context structure
mmu-hash*: Don't update PTE flags when permission is denied
mmu-hash32: Don't look up page tables on BAT permission error
mmu-hash32: Cleanup BAT lookup
...
Yeongkyoon Lee [Fri, 22 Mar 2013 12:50:17 +0000 (21:50 +0900)]
tcg: Fix occasional TCG broken problem when ldst optimization enabled
is_tcg_gen_code() checks the upper limit of TCG generated code range wrong, so
that TCG could get broken occasionally only when CONFIG_QEMU_LDST_OPTIMIZATION
enabled. The reason is code_gen_buffer_max_size does not cover the upper range
up to (TCG_MAX_OP_SIZE * OPC_BUF_SIZE), thus code_gen_buffer_max_size should be
modified to code_gen_buffer_size.
CC: qemu-stable@nongnu.org Signed-off-by: Yeongkyoon Lee <yeongkyoon.lee@samsung.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Anthony Liguori [Fri, 22 Mar 2013 18:08:01 +0000 (13:08 -0500)]
Merge remote-tracking branch 'kwolf/for-anthony' into staging
# By Kevin Wolf (12) and Peter Lieven (2)
# Via Kevin Wolf
* kwolf/for-anthony:
nbd: Check against invalid option combinations
nbd: Use default port if only host is specified
block: Allow omitting the file name when using driver-specific options
block: Make find_image_format safe with NULL filename
block: Rename variable to avoid shadowing
block: Introduce .bdrv_parse_filename callback
nbd: Accept -drive options for the network connection
nbd: Remove unused functions
nbd: Keep hostname and port separate
qemu-socket: Make socket_optslist public
block: Pass bdrv_file_open() options to block drivers
block: Add options QDict to bdrv_file_open() prototypes
block: complete all IOs before resizing a device
Revert "block: complete all IOs before .bdrv_truncate"
Anthony Liguori [Fri, 22 Mar 2013 18:05:57 +0000 (13:05 -0500)]
Merge remote-tracking branch 'stefanha/trivial-patches' into staging
# By liguang (2) and others
# Via Stefan Hajnoczi
* stefanha/trivial-patches:
qdev: remove redundant abort()
gitignore: ignore more files
Use proper term in TCG README
serial: Fix debug format strings
Fix typos and misspellings
Advertise --libdir in configure --help output
memory: fix a bug of detection of memory region collision
MinGW: Replace setsockopt by qemu_setsocketopt
Kevin Wolf [Mon, 18 Mar 2013 15:56:05 +0000 (16:56 +0100)]
nbd: Use default port if only host is specified
The URL method already takes care to apply the default port when none is
specfied. Directly specifying driver-specific options required the port
number until now. Allow leaving it out and apply the default.
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
Kevin Wolf [Mon, 18 Mar 2013 15:40:51 +0000 (16:40 +0100)]
block: Allow omitting the file name when using driver-specific options
After this patch, using -drive with an empty file name continues to open
the file if driver-specific options are used. If no driver-specific
options are specified, the semantics stay as it was: It defines a drive
without an inserted medium.
In order to achieve this, bdrv_open() must be made safe to work with a
NULL filename parameter. The assumption that is made is that only block
drivers which implement bdrv_parse_filename() support using driver
specific options and could therefore work without a filename. These
drivers must make sure to cope with NULL in their implementation of
.bdrv_open() (this is only NBD for now). For all other drivers, the
block layer code will make sure to error out before calling into their
code - they can't possibly work without a filename.
Kevin Wolf [Mon, 18 Mar 2013 15:20:27 +0000 (16:20 +0100)]
block: Make find_image_format safe with NULL filename
In order to achieve this, the .bdrv_probe callbacks of all drivers must
cope with this. The DMG driver is the only one that bases its decision
on the filename and it needs to be changed.
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
Kevin Wolf [Fri, 15 Mar 2013 17:47:22 +0000 (18:47 +0100)]
block: Introduce .bdrv_parse_filename callback
If a driver needs structured data and not just a string, it can provide
a .bdrv_parse_filename callback now that parses the command line string
into separate options. Keeping this separate from .bdrv_open_filename
ensures that the preferred way of directly specifying the options always
works as well if parsing the string works.
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
Kevin Wolf [Thu, 7 Mar 2013 15:15:11 +0000 (16:15 +0100)]
nbd: Accept -drive options for the network connection
The existing parsers for the file name now parse everything into the
bdrv_open() options QDict. Instead of using these parsers, you can now
directly specify the options on the command line, like this:
Kevin Wolf [Fri, 15 Mar 2013 10:55:29 +0000 (11:55 +0100)]
nbd: Keep hostname and port separate
The NBD block supports an URL syntax, for which a URL parser returns
separate hostname and port fields. It also supports the traditional qemu
syntax encoded in a filename. Until now, after parsing the URL to get
each piece of information, a new string is built to be fed to socket
functions.
Instead of building a string in the URL case that is immediately parsed
again, parse the string in both cases and use the QemuOpts interface to
qemu-sockets.c.
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
Peter Lieven [Mon, 11 Mar 2013 10:04:24 +0000 (11:04 +0100)]
block: complete all IOs before resizing a device
this patch ensures that all pending IOs are completed
before a device is resized. this is especially important
if a device is shrinked as it the bdrv_check_request()
result is invalidated.
Signed-off-by: Peter Lieven <pl@kamp.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Peter Lieven [Mon, 11 Mar 2013 10:03:28 +0000 (11:03 +0100)]
Revert "block: complete all IOs before .bdrv_truncate"
brdv_truncate() is also called from readv/writev commands on self-
growing file based storage. this will result in requests waiting
for theirselves to complete.
In TCG, "target" means the host architecture for which TCG generates
the code. Using "guest" rather than "target" to make the document more
consistent.
Signed-off-by: Chen Wei-Ren <chenwj@iis.sinica.edu.tw> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
target-ppc: Fix add and subf carry generation in narrow mode
The set of computations used in b5a73f8d8a57e940f9bbeb399a9e47897522ee9a
are only valid if the current word size == target_long size. This failed
to take ppc64 in 32-bit (narrow) mode into account.
Add a NARROW_MODE macro to avoid conditional compilation.
Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Alexander Graf <agraf@suse.de>
David Gibson [Wed, 13 Mar 2013 00:40:33 +0000 (11:40 +1100)]
target-ppc: Use QOM method dispatch for MMU fault handling
After previous cleanups, the many scattered checks of env->mmu_model in
the ppc MMU implementation have, at least for "classic" hash MMUs been
reduced (almost) to a single switch at the top of
cpu_ppc_handle_mmu_fault().
An explicit switch is still a pretty ugly way of handling this though. Now
that Andreas Färber's CPU QOM cleanups for ppc have gone in, it's quite
straightforward to instead make the handle_mmu_fault function a QOM method
on the CPU object.
This patch implements such a scheme, initializing the method pointer at
the same time as the mmu_model variable. We need to keep the latter around
for now, because of the MMU types (BookE, 4xx, et al) which haven't been
converted to the new scheme yet, and also for a few other uses. It would
be good to clean those up eventually.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>
David Gibson [Tue, 12 Mar 2013 00:31:49 +0000 (00:31 +0000)]
target-ppc: Move ppc tlb_fill implementation into mmu_helper.c
For softmmu builds the interface from the generic code to the target
specific MMU implementation is through the tlb_fill() function. For ppc
this is currently in mem_helper.c, whereas it would make more sense in
mmu_helper.c. This patch moves it, which also allows
cpu_ppc_handle_mmu_fault() to become a local function in mmu_helper.c
Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>
David Gibson [Tue, 12 Mar 2013 00:31:48 +0000 (00:31 +0000)]
target-ppc: Split user only code out of mmu_helper.c
mmu_helper.c is, for obvious reasons, almost entirely concerned with
softmmu builds of qemu. However, it does contain one stub function which
is used when CONFIG_USER_ONLY=y - the user only versoin of
cpu_ppc_handle_mmu_fault, which always triggers an exception. The entire
rest of the file is surrounded by #if !defined(CONFIG_USER_ONLY).
We clean this up by moving the user only stub into its own new file,
removing the ifdefs and building mmu_helper.c only when CONFIG_SOFTMMU
is set. This also lets us remove the #define of cpu_handle_mmu_fault to
cpu_ppc_handle_mmu_fault - that name is only used from generic code for
user only - so we just name our split user version by the generic name.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>
David Gibson [Tue, 12 Mar 2013 00:31:47 +0000 (00:31 +0000)]
mmu-hash64: Implement Virtual Page Class Key Protection
Version 2.06 of the Power architecture describes an additional page
protection mechanism. Each virtual page has a "class" (0-31) recorded in
the PTE. The AMR register contains bits which can prohibit reads and/or
writes on a class by class basis. Interestingly, the AMR is userspace
readable and writable, however user mode writes are masked by the contents
of the UAMOR which is privileged.
This patch implements this protection mechanism, along with the AMR and
UAMOR SPRs. The architecture also specifies a hypervisor-privileged AMOR
register which masks user and supervisor writes to the AMR and UAMOR. We
leave this out for now, since we don't at present model hypervisor mode
correctly in any case.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
[agraf: fix 32-bit hosts] Signed-off-by: Alexander Graf <agraf@suse.de>
David Gibson [Tue, 12 Mar 2013 00:31:46 +0000 (00:31 +0000)]
mmu-hash*: Merge translate and fault handling functions
ppc_hash{32,64}_handle_mmu_fault() is now the only caller of
ppc_hash{32,64{_translate(), so this patch combines them together. This
means that instead of one returning a variety of non-obvious error codes
which then get translated into the various mmu exception conditions, we can
just generate the exceptions as we discover problems in the translation
path. This also removes the last usage of mmu_ctx_hash{32,64}.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>
David Gibson [Tue, 12 Mar 2013 00:31:45 +0000 (00:31 +0000)]
mmu-hash*: Don't use full ppc_hash{32, 64}_translate() path for get_phys_page_debug()
Currently the hash mmu versionsof get_phys_page_debug() use the same
ppc64_hash64_translate() function to do the translation logic as the normal
mm fault handler code.
That sounds like a good idea, but has some complications. The debug path
doesn't need, or even want some parts of the full translation path, like
permissions checking. Furthermore, the pte flags update included in the
normal path means that the debug call is not quite side effect free.
This patch, therefore, reimplements get_phys_page_debug as the minimal
required subset of the full translation path.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>`z Signed-off-by: Alexander Graf <agraf@suse.de>
David Gibson [Tue, 12 Mar 2013 00:31:44 +0000 (00:31 +0000)]
mmu-hash*: Correctly mask RPN from hash PTE
BEHAVIOUR CHANGE
At present we take the whole of word 1 of the hash PTE as the real page
number used to calculate the translated address. This is incorrect,
because it leaves the flags from the low bits of PTE word 1 in place in the
rpm. We mostly get away with that because the value is later masked by
TARGET_PAGE_MASK.
More recent 64-bit CPUs also have a small number of flag bits (PP0 and
KEY) in the top bits of PTE word 1. Any guest which used those bits would
fail with the current code.
This patch fixes the problem by correctly masking out the RPN field of
PTE word 1. This is safe, even for older CPUs which didn't have PP0 and
KEY, because although the RPN notionally extended to the very top of PTE
word 1, none of those CPUs actually implemented that many real address
bits.
We add analogous masking to the 32-bit code, even though it also doesn't
have the high flag bits, for consistency and clarity.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>
David Gibson [Tue, 12 Mar 2013 00:31:43 +0000 (00:31 +0000)]
mmu-hash*: Clean up real address calculation
More recent 64-bit hash MMUs support multiple page sizes, and PTEs for
large pages only include the offset of the whole large page. But the qemu
tlb only handles pages of the base size (4k) so we need to break up the
large pages into 4k pieces for the qemu tlb. To do that we have a somewhat
awkward piece of code that adds the folds address bits 4k and the page size
from the virtual address into the real address from the pte.
This patch simplifies this redefining the raddr output of
ppc_hash64_translate() to be the full real address of the faulting address,
rather than just the (4k) page offset. Computing that turns out to be
simpler, and is fine for the caller, since it already masks with
TARGET_PAGE_MASK before inserting into the qemu tlb.
The multiple page size complication doesn't exist for 32-bit hash mmus, but
we make an analogous cleanup there for consistency.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>
David Gibson [Tue, 12 Mar 2013 00:31:42 +0000 (00:31 +0000)]
mmu-hash*: Clean up PTE flags update
Currently the ppc_hash{32,64}_pte_update_flags() helper functions update a
PTE's referenced and changed bits as necessary to reflect the access. It
is somewhat long winded, though. This patch open codes them in their
(single) callers, in a simpler way.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>
David Gibson [Tue, 12 Mar 2013 00:31:41 +0000 (00:31 +0000)]
mmu-hash64: Factor SLB N bit into permissions bits
BEHAVIOUR CHANGE
Currently, for 64-bit hash mmu, the execute protection bit placed into the
qemu tlb is based only on the N (No execute) bit from the PTE. However,
No Execute can also be set at the segment level. We do check this on
execute faults, but this still means we could incorrectly allow execution
of code from a No Execute segment, if a prior read or write fault caused
the page to be loaded into the qemu tlb with PROT_EXEC set.
To correct this, we (re-)check the segment level no execute permission when
generating the protection bits for the qemu tlb.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>
David Gibson [Tue, 12 Mar 2013 00:31:40 +0000 (00:31 +0000)]
mmu-hash*: Clean up permission checking
Currently checking of PTE permission bits is split messily amongst
ppc_hash{32,64}_pp_check(), ppc_hash{32,64}_check_prot() and their callers.
This patch cleans this up to have the new function
ppc_hash{32,64}_pte_prot() compute the page permissions from the SLBE (for
64-bit) or segment register (32-bit) and the pte. A greatly simplified
version of the actual permissions check is then open coded in the callers.
The 32-bit version of ppc_hash32_pte_prot() is implemented in terms of
ppc_hash32_pp_prot(), a renamed and slightly cleaned up version of the old
ppc_hash32_pp_check(), which is also used for checking BAT permissions on
the 601.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>
David Gibson [Tue, 12 Mar 2013 00:31:39 +0000 (00:31 +0000)]
mmu-hash32: Remove nx from context structure
Previous cleanups have meant the nx field of the mmu_ctx_hash32 structure
is now only used within ppc_hash32_translate(), and so it can be replaced
by a local variable.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>
David Gibson [Tue, 12 Mar 2013 00:31:38 +0000 (00:31 +0000)]
mmu-hash*: Don't update PTE flags when permission is denied
BEHAVIOUR CHANGE
Currently if ppc_hash{32,64}_translate() finds a PTE matching the given
virtual address, it will always update the PTE's R & C (Referenced and
Changed) bits. This happens even if the PTE's permissions mean we are
about to deny the translation.
This is clearly a bug, although we get away with it because:
a) It will only incorrectly set, never reset the bits, which should not
cause guest correctness problems.
b) Linux guests never use the R & C bits anyway.
This patch fixes the behaviour, only updating R & C when access is granted
by the PTE.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>