Several performance improvements in XDP TX datapath,
including:
- Ring a single doorbell for XDP TX ring per NAPI budget,
instead of doing it per a lower threshold (was 8).
This includes removing the flow of immediate doorbell ringing
in case of a full TX ring.
- Compiler branch predictor hints.
- Calculate values in compile time rather than in runtime.
Performance tests:
Tested on ConnectX3Pro, Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
Single queue no-RSS optimization ON.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Cc: kernel-team@fb.com Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>