Always fully coalesce guest Rx packets into the minimum number of ring
slots. Reducing the number of slots per packet has significant
performance benefits when receiving off-host traffic.

Results from XenServer's performance benchmarks:

                         Baseline    Full coalesce
Interhost VM receive     7.2 Gb/s    11 Gb/s
Interhost aggregate      24 Gb/s     24 Gb/s
Intrahost single stream  14 Gb/s     14 Gb/s
Intrahost aggregate      34 Gb/s     34 Gb/s

However, this can increase the number of grant ops per packet, which
decreases the performance of backend (dom0) to VM traffic (by ~10%)
/unless/ grant copy has been optimized for adjacent ops with the same
source or destination (see "grant-table: defer releasing pages
acquired in a grant copy"[1], expected in Xen 4.6).
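
The trade-off described above can be illustrated with a small model. This
is only a sketch, not the netback code: SLOT_SIZE and all function names
are hypothetical, each ring slot is assumed to hold one 4 KiB page, and
each fragment is assumed to come from a single source page. It compares
the slots a packet needs when every fragment starts a new slot with the
slots needed when fragments are packed back to back, and counts the copy
operations the packed layout requires.

```c
#include <stddef.h>

/* Assumption: one ring slot corresponds to one 4 KiB page. */
#define SLOT_SIZE 4096u

/* Slots used if every fragment starts in a fresh slot (no coalescing). */
static unsigned int slots_uncoalesced(const size_t *frag_len, size_t nfrags)
{
    unsigned int slots = 0;

    for (size_t i = 0; i < nfrags; i++)
        slots += (frag_len[i] + SLOT_SIZE - 1) / SLOT_SIZE;
    return slots;
}

/* Slots used if fragments are packed back to back (full coalescing). */
static unsigned int slots_coalesced(const size_t *frag_len, size_t nfrags)
{
    size_t total = 0;

    for (size_t i = 0; i < nfrags; i++)
        total += frag_len[i];
    return (total + SLOT_SIZE - 1) / SLOT_SIZE;
}

/*
 * Copy operations needed for the packed layout: a fragment that
 * straddles a slot boundary has to be split into two copies.
 */
static unsigned int copy_ops_coalesced(const size_t *frag_len, size_t nfrags)
{
    unsigned int ops = 0;
    size_t off = 0; /* fill level of the current destination slot */

    for (size_t i = 0; i < nfrags; i++) {
        size_t left = frag_len[i];

        while (left > 0) {
            size_t room = SLOT_SIZE - off;
            size_t chunk = left < room ? left : room;

            ops++;
            left -= chunk;
            off = (off + chunk) % SLOT_SIZE;
        }
    }
    return ops;
}
```

In this model, a packet with fragments of 1500, 3000 and 3000 bytes takes
3 slots when uncoalesced (one copy per fragment) but 2 slots and 4 copies
when fully coalesced: fewer slots at the cost of an extra grant copy op.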

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>