]> git.proxmox.com Git - mirror_qemu.git/commit
target/riscv/vector_helper.c: skip set tail when vta is zero
authorDaniel Henrique Barboza <dbarboza@ventanamicro.com>
Thu, 27 Apr 2023 20:57:07 +0000 (17:57 -0300)
committerAlistair Francis <alistair.francis@wdc.com>
Tue, 13 Jun 2023 06:35:02 +0000 (16:35 +1000)
commitbc0ec52eb258e55aa8a4a4ab89cb5c8ad49b30ee
tree4f41547ea4bcd7f9d41126d77a8bda37e68463e3
parentfdd0df5340a8ebc8de88078387ebc85c5af7b40f
target/riscv/vector_helper.c: skip set tail when vta is zero

The function is a no-op if 'vta' is zero but we're still doing a lot of
stuff in this function regardless. vext_set_elems_1s() will ignore every
single time (since vta is zero) and we just wasted time.

Skip it altogether in this case. Aside from the code simplification
there's a noticeable emulation performance gain by doing it. For a
regular C binary that does a vectors operation like this:

=======
 #define SZ 10000000

int main ()
{
  int *a = malloc (SZ * sizeof (int));
  int *b = malloc (SZ * sizeof (int));
  int *c = malloc (SZ * sizeof (int));

  for (int i = 0; i < SZ; i++)
    c[i] = a[i] + b[i];
  return c[SZ - 1];
}
=======

Emulating it with qemu-riscv64 and RVV takes ~0.3 sec:

$ time ~/work/qemu/build/qemu-riscv64 \
    -cpu rv64,debug=false,vext_spec=v1.0,v=true,vlen=128 ./foo.out

real    0m0.303s
user    0m0.281s
sys     0m0.023s

With this skip we take ~0.275 sec:

$ time ~/work/qemu/build/qemu-riscv64 \
    -cpu rv64,debug=false,vext_spec=v1.0,v=true,vlen=128 ./foo.out

real    0m0.274s
user    0m0.252s
sys     0m0.019s

This performance gain adds up fast when executing heavy benchmarks like
SPEC.

Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com>
Reviewed-by: Weiwei Li <liweiwei@iscas.ac.cn>
Message-Id: <20230427205708.246679-2-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
target/riscv/vector_helper.c