Stefan Weil [Thu, 13 Sep 2012 17:37:43 +0000 (19:37 +0200)]
w64: Fix TCG helper functions with 5 arguments
TCG uses 6 registers for function arguments on 64 bit Linux hosts,
but only 4 registers on W64 hosts.
Commit 2999a0b20074a7e4a58f56572bb1436749368f59 increased the number
of arguments for some important helper functions from 4 to 5
which triggered a bug for W64 hosts: QEMU aborts when executing
helper_lcall_real in the guest's BIOS because function
tcg_target_get_call_iarg_regs_count always returned 6.
As W64 has only 4 registers for arguments, the 5th argument must be
passed on the stack using a correct stack offset.
Signed-off-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Commit 25c4d9cc changed all TCGOpcode enums to be available, so we don't
need to #ifdef #endif the one that are available only on some targets.
This makes the code easier to read.
Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
tcg/optimize: prefer the "op a, a, b" form for commutative ops
The "op a, a, b" form is better handled on non-RISC host than the "op
a, b, a" form, so swap the arguments to this form when possible, and
when b is not a constant.
This reduces the number of generated instructions by a tiny bit.
Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
tcg/optimize: further optimize brcond/movcond/setcond
When both argument of brcond/movcond/setcond are the same or when one
of the two values is a constant equal to zero, it's possible to do
further optimizations.
Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Now that it's possible to detect copies, we can optimize the case
the "op r, a, a => movi r, 0". This helps in the computation of
overflow flags when one of the two args is 0.
Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
tcg/optimize: do copy propagation for all operations
It is possible to due copy propagation for all operations, even the one
that have side effects or clobber arguments (it only concerns input
arguments). That said, the call operation should be handled differently
due to the variable number of arguments.
Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The copy propagation pass tries to keep track what is a copy of what
and what has copy of what, and in addition it keep a circular list of
of all the copies. Unfortunately this doesn't fully work: a mov from
a temp which has a state "COPY" changed it into a state "HAS_COPY".
Later when this temp is used again, it is considered has not having
copy and thus no propagation is done.
This patch fixes that by removing the hiearchy between copies, and thus
only keeping a "COPY" state both meaning "is a copy" and "has a copy".
The decision of which copy to use is deferred to the actual temp
replacement. At this stage there is not one best choice to do, but only
better choices than others. For doing the best choice the operation
would have to be parsed in reversed to know if a temp is going to be
used later or not. That what is done by the liveness analysis. At this
stage it is known that globals will be always live, that local temps
will be dead at the end of the translation block, and that the temps
will be dead at the end of the basic block. This means that this stage
should try to replace temps by local temps or globals and local temps
by globals.
Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The copy propagation doesn't check the types of the temps during copy
propagation. However TCG is using the mov_i32 for the i64 to i32
conversion and thus the two are not equivalent.
With this patch tcg_opt_gen_mov() doesn't consider two temps of
different type as copies anymore.
So far it seems the optimization was not aggressive enough to trigger
this bug, but it will be triggered later in this series once the copy
propagation is improved.
Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
rotr operations can be optimized on MIPS32 Release 2 using the ROTR and
ROTRV instructions. Also implemented rotl operations by subtracting the
shift from 32.
Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
bswap operations can be optimized on MIPS32 Release 2 using the ROTR,
WSBH and SEH instructions. We can't use the non-R2 code to implement the
ops due to registers constraints, so don't define the corresponding
TCG_TARGET_HAS_bswap* values.
Also bswap16* operations are supposed to be called with the 16 high bits
zeroed. This is the case everywhere (including for TCG by definition)
except when called from the store helper. Remove the AND instructions from
bswap16* and move it there.
Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Instead of int, use the correct TCGArg and TCGReg type: TCGReg when
representing a TCG target register, TCGArg when representing the latter
or a constant.
Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The 'Z' constraint has been introduced to map the zero register. However
when the op also accept a constant, there is no point to accept the zero
register in addition.
Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
gdbstub/sh4: fix build with USE_SOFTFLOAT_STRUCT_TYPES
We have to use different type to access float values when
USE_SOFTFLOAT_STRUCT_TYPES is defined.
Rework SH4 version of cpu_gdb_{read,write}_register() using
a single case, and fixing the coding style. Use ldll_p() and
stfl_p() to access float values.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Max Filippov [Thu, 20 Sep 2012 22:59:50 +0000 (02:59 +0400)]
target-xtensa: don't emit extra tcg_gen_goto_tb
Unconditional gen_check_loop_end at the end of disas_xtensa_insn
can emit tcg_gen_goto_tb with slot id already used in the TB (e.g. when
TB ends at LEND with a branch).
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Cc: qemu-stable <qemu-stable@nongnu.org> Signed-off-by: malc <av1474@comtv.ru> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The real problem is that the all temps states should be reset at the end
of a basic block. This was done by adding such operations in the switch,
but brcond2 was forgotten (that's why the crash was only observed on 32-bit
hosts).
Fix that by looking at the TCG_OPF_BB_END instead. We need to keep the case
for op_set_label as temps might be modified through another path.
Cc: Blue Swirl <blauwirbel@gmail.com> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
set_label is effectively the end of a basic block, as no optimization
can be made accross it. It was treated as such in the liveness analysis
code, but as a special case.
Mark it with TCG_OPF_BB_END flag so that this information can be used
by other parts of the TCG code, and remove the special case in the liveness
analysis code.
Cc: Blue Swirl <blauwirbel@gmail.com> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Blue Swirl [Sun, 2 Sep 2012 15:28:56 +0000 (15:28 +0000)]
Remove unused CONFIG_TCG_PASS_AREG0 and dead code
Now that CONFIG_TCG_PASS_AREG0 is enabled for all targets,
remove dead code and support for !CONFIG_TCG_PASS_AREG0 case.
Remove dyngen-exec.h and all references to it. Although included by
hw/spapr_hcall.c, it does not seem to use it.
Remove unused HELPER_CFLAGS.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Add an explicit CPUCRISState parameter instead of relying on AREG0, and
use cpu_ld* in translation and interrupt handling. Remove AREG0 swapping
in tlb_fill(). Switch to AREG0 free mode
Signed-off-by: Blue Swirl <blauwirbel@gmail.com> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Blue Swirl [Tue, 4 Sep 2012 20:25:59 +0000 (20:25 +0000)]
target-arm: final conversion to AREG0 free mode
Convert code load functions and switch to AREG0 free mode.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Blue Swirl [Tue, 4 Sep 2012 20:19:15 +0000 (20:19 +0000)]
target-arm: convert remaining helpers
Convert remaining helpers to AREG0 free mode: add an explicit
CPUState parameter instead of relying on AREG0.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Blue Swirl [Tue, 4 Sep 2012 20:08:34 +0000 (20:08 +0000)]
target-arm: convert void helpers
Add an explicit CPUState parameter instead of relying on AREG0.
For easier review, convert only op helpers which don't return any value.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
optimizer.c contains some cases were the break is appearing in both the
if and the else parts. Fix that by moving it to the outer part. Also
move some common code there.
Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
tcg/optimize: swap brcond/setcond arguments when possible
brcond and setcond ops are not commutative, but it's easy to compute the
new condition after swapping the arguments. Try to always put the constant
argument in second position like for commutative ops, to help backends to
generate better code.
Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
c7_region is an array with 8 elements, so the index must be less than 8.
Signed-off-by: Stefan Weil <sw@weilnetz.de> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The load/store slow path has been broken in e141ab52d:
- We need to move 4 registers for store functions and 3 registers for
load functions and not the reverse.
- According to the s390x calling convention the arguments of a function
should be zero extended. This means that the register shift should be
done with TCG_TYPE_I64 to ensure the higher word is correctly zero
extended when needed.
I am aware that CONFIG_TCG_PASS_AREG0 is being removed and thus that
this patch can be improved, but doing so means it can also be applied to
the 1.1 and 1.2 stable branches.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Blue Swirl [Sun, 2 Sep 2012 07:33:40 +0000 (07:33 +0000)]
target-s390x: switch to AREG0 free mode
Add an explicit CPUState parameter instead of relying on AREG0.
Remove temporary wrappers and switch to AREG0 free mode.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
[agraf: fix conflicts] Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Blue Swirl [Sun, 2 Sep 2012 07:33:39 +0000 (07:33 +0000)]
target-s390x: avoid AREG0 for misc helpers
Make misc helpers take a parameter for CPUState instead
of relying on global env.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
[agraf: fix conflict] Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Blue Swirl [Sun, 2 Sep 2012 07:33:35 +0000 (07:33 +0000)]
target-s390x: rename op_helper.c to misc_helper.c
Now op_helper.c contains miscellaneous helpers, rename
it to misc_helper.c.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
[agraf: fix conflict] Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Blue Swirl [Sun, 2 Sep 2012 07:33:34 +0000 (07:33 +0000)]
target-s390x: split memory access helpers
Move memory access helpers to mem_helper.c.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
[agraf: fold softmmu include ifdefs together] Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
fcmp{s,d,q} instructions are supposed to ignore quiet NaN (contrary to
the fcmpe{s,d,q} instructions), but the current code is wrongly setting
the NV exception in that case. Moreover the current code is duplicated:
first the arguments are checked for NaN to generate an exception, and
later in case the comparison is unordered (which can only happens if one
of the argument is a NaN), the same check is done to generate an
exception.
Fix that by calling clear_float_exceptions() followed by
check_ieee_exceptions() as for the other floating point instructions.
Use the _compare_quiet functions for fcmp{s,d,q} and the _compare ones
for fcmpe{s,d,q}. Simplify the flag setting by not clearing a flag that
is set the line just below.
This fix allows the math glibc testsuite to pass.
Cc: Blue Swirl <blauwirbel@gmail.com> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Max Filippov [Thu, 6 Sep 2012 00:36:46 +0000 (04:36 +0400)]
target-xtensa: fix missing errno codes for mingw32
Put the following errno value mappings under #ifdef:
xtensa-semi.c: In function 'errno_h2g':
xtensa-semi.c:113: error: 'ENOTBLK' undeclared (first use in this function)
xtensa-semi.c:113: error: (Each undeclared identifier is reported only once
xtensa-semi.c:113: error: for each function it appears in.)
xtensa-semi.c:113: error: array index in initializer not of integer type
xtensa-semi.c:113: error: (near initialization for 'guest_errno')
xtensa-semi.c:124: error: 'ETXTBSY' undeclared (first use in this function)
xtensa-semi.c:124: error: array index in initializer not of integer type
xtensa-semi.c:124: error: (near initialization for 'guest_errno')
xtensa-semi.c:134: error: 'ELOOP' undeclared (first use in this function)
xtensa-semi.c:134: error: array index in initializer not of integer type
xtensa-semi.c:134: error: (near initialization for 'guest_errno')
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Blue Swirl <blauwirbel@gmail.com> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Max Filippov [Wed, 29 Aug 2012 19:54:25 +0000 (23:54 +0400)]
target-xtensa: convert host errno values to guest
Guest errno values are taken from the newlib. Convert only those errno
values that can be returned from used system calls.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Blue Swirl <blauwirbel@gmail.com> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The -net none is important otherwise it seems some events are generated
causing the things to work. When it doesn't work, the guest hangs when
measuring the CPU frequency, after the following line:
[ 0.000000] NR_IRQS:256
Pressing a key on the serial port unblocks it, hinting that the problem
is due to the recent elimination of the 1 second timeout in the main
loop.
The problem is that because init_timer_alarm sets the timer's pending
flag to true, the alarm timer is never armed until after the first time
through the main loop. Thus the bug started when QEMU started testing
the pending flag in qemu_mod_timer (commit 1828be3, more alarm timer
cleanup, 2010-03-10).
But actually, it isn't true at all that a timer is pending when the
alarm timer is created, and the real bug has been latent forever: the
fix is to remove the bogus setting of pending flag.
Reported-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com> Tested-by: Aurelien Jarno <aurelien@aurel32.net> Tested-by: Michael Tokarev <mjt@tls.msk.ru> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Anthony Liguori [Fri, 31 Aug 2012 15:04:54 +0000 (10:04 -0500)]
Merge remote-tracking branch 'kraxel/usb.61' into staging
* kraxel/usb.61:
uas: move transfer kickoff
ehci: Fix interrupt endpoints no longer working
ehci: handle TD deactivation of inflight packets
ehci: add ehci_cancel_queue()
ehci: simplify ehci_state_executing
ehci: Remove unnecessary ehci_flush_qh call
ehci: Schedule async-bh when IAAD bit gets set
ehci: Fix NULL ptr deref when unplugging an USB dev with an iso stream active
usb: unique packet ids
usb: Halt ep queue en cancel pending packets on a packet error
fix info qtree indention
Anthony Liguori [Fri, 31 Aug 2012 15:04:18 +0000 (10:04 -0500)]
Merge remote-tracking branch 'kwolf/for-anthony' into staging
* kwolf/for-anthony:
qemu-iotests: add backing file smaller than image test case
stream: complete early if end of backing file is reached
qed: refuse unaligned zero writes with a backing file
Gerd Hoffmann [Fri, 31 Aug 2012 12:34:19 +0000 (14:34 +0200)]
uas: move transfer kickoff
Kick next scsi transfer from request release callback instead of command
completion callback, otherwise we might get stuck in case scsi_req_unref()
doesn't release the request instantly due to someone else holding a
reference too.
Hans de Goede [Fri, 17 Aug 2012 09:39:17 +0000 (11:39 +0200)]
ehci: simplify ehci_state_executing
ehci_state_executing does not need to check for p->usb_status == USB_RET_ASYNC
or USB_RET_PROCERR, since ehci_execute_complete already does a similar check
and will trigger an assert if either value is encountered.
USB_RET_ASYNC should never be the packet status when execute_complete runs
for obvious reasons, and USB_RET_PROCERR is only used by ehci_state_execute /
ehci_execute not by ehci_state_executing / ehci_execute_complete.
Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Hans de Goede [Thu, 30 Aug 2012 07:55:19 +0000 (09:55 +0200)]
ehci: Schedule async-bh when IAAD bit gets set
After the "ehci: Print a warning when a queue unexpectedly contains packets
on cancel" commit. Under certain reproducable conditions I was getting the
following message: "EHCI: Warning queue not empty on queue reset".
After aprox. 8 hours of debugging I've finally found the cause. The Linux EHCI
driver has an IAAD watchdog, to work around certain EHCI hardware sometimes
not acknowledging the doorbell at all. This watchdog has a timeout of 10 ms,
which is less then the time between 2 runs through the async schedule when
async_stepdown is at its highest value.
Thus the watchdog can trigger, after which Linux clears the IAAD bit and
re-uses the QH. IOW we were not properly detecting the unlink of the qh, due
to us missing (ignoring for more then 10 ms) the IAAD command, which triggered
the warning.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Gerd Hoffmann [Thu, 23 Aug 2012 11:30:13 +0000 (13:30 +0200)]
usb: unique packet ids
This patch adds IDs to usb packets. Those IDs are (a) supposed to be
unique for the lifecycle of a packet (from packet setup until the packet
is either completed or canceled) and (b) stable across migration.
uhci, ohci, ehci and xhci use the guest physical address of the transfer
descriptor for this.
musb needs a different approach because there is no transfer descriptor.
But musb also doesn't support pipelining, so we have never more than one
packet per endpoint in flight. So we go create an ID based on endpoint
and device address.
Hans de Goede [Fri, 17 Aug 2012 13:24:49 +0000 (15:24 +0200)]
usb: Halt ep queue en cancel pending packets on a packet error
For controllers which queue up more then 1 packet at a time, we must halt the
ep queue, and inside the controller code cancel all pending packets on an
error.
There are multiple reasons for this:
1) Guests expect the controllers to halt ep queues on error, so that they
get the opportunity to cancel transfers which the scheduled after the failing
one, before processing continues
2) Not cancelling queued up packets after a failed transfer also messes up
the controller state machine, in the case of EHCI causing the following
assert to trigger: "assert(p->qtdaddr == q->qtdaddr)" at hcd-ehci.c:2075
3) For bulk endpoints with pipelining enabled (redirection to a real USB
device), we must cancel all the transfers after this a failed one so that:
a) If they've completed already, they are not processed further causing more
stalls to be reported, originating from the same failed transfer
b) If still in flight, they are cancelled before the guest does
a clear stall, otherwise the guest and device can loose sync!
Note this patch only touches the ehci and uhci controller changes, since AFAIK
no other controllers actually queue up multiple transfer. If I'm wrong on this
other controllers need to be updated too!
Also note that this patch was heavily tested with the ehci code, where I had
a reproducer for a device causing a transfer to fail. The uhci code is not
tested with actually failing transfers and could do with a thorough review!
Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>