This series updates the GRED Qdisc. The Qdisc matches nfp offload very
well, but before we can offload it there are a number of improvements
to make.
First few patches add extack messages to the Qdisc and pass extack
to netlink validation.
Next a new netlink attribute group is added, to allow GRED to be
extended more easily. Currently GRED passes C structures as attributes,
and even an array of C structs for virtual queue configuration. User
space has hard coded the expected length of that array, so adding new
fields is not possible.
Statistics are dump only. Patch 4 switches the byte counts to be 64 bit,
and patch 5 introduces the new stats attributes for dump. Patch 6
switches RED flags to be per-virtual queue, and patch 7 allows them
to be dumped and set at virtual queue granularity.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 15 Nov 2018 06:23:51 +0000 (22:23 -0800)]
net: sched: gred: allow manipulating per-DP RED flags
Allow users to set and dump RED flags (ECN enabled and harddrop)
on per-virtual queue basis. Validation of attributes is split
from changes to make sure we won't have to undo previous operations
when we find out configuration is invalid.
The objective is to allow changing per-Qdisc parameters without
overwriting the per-vq configured flags.
Old user space will not pass the TCA_GRED_VQ_FLAGS attribute and
per-Qdisc flags will always get propagated to the virtual queues.
New user space which wants to make use of per-vq flags should set
per-Qdisc flags to 0 and then configure per-vq flags as it
sees fit. Once per-vq flags are set per-Qdisc flags can't be
changed to non-zero. Vice versa - if the per-Qdisc flags are
non-zero the TCA_GRED_VQ_FLAGS attribute has to either be omitted
or set to the same value as per-Qdisc flags.
Update per-Qdisc parameters:
per-Qdisc | per-VQ | result
0 | 0 | all vq flags updated
0 | non-0 | error (vq flags in use)
non-0 | 0 | -- impossible --
non-0 | non-0 | all vq flags updated
Update per-VQ state (flags parameter not specified):
no change to flags
Update per-VQ state (flags parameter set):
per-Qdisc | per-VQ | result
0 | any | per-vq flags updated
non-0 | 0 | -- impossible --
non-0 | non-0 | error (per-Qdisc flags in use)
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 15 Nov 2018 06:23:50 +0000 (22:23 -0800)]
net: sched: gred: store red flags per virtual queue
Right now ECN marking and HARD drop (the common RED flags) can only
be configured for the entire Qdisc. In preparation for per-vq flags
store the values in the virtual queue structure. Setting per-vq
flags will only be allowed when no flags are set for the entire Qdisc.
For the new flags we will also make sure undefined bits are 0.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 15 Nov 2018 06:23:49 +0000 (22:23 -0800)]
net: sched: gred: provide a better structured dump and expose stats
Currently all GRED's virtual queue data is dumped in a single
array in a single attribute. This makes it pretty much impossible
to add new fields. In order to expose more detailed stats add a
new set of attributes. We can now expose the 64 bit value of bytesin
and all the mark stats which were not part of the original design.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 15 Nov 2018 06:23:48 +0000 (22:23 -0800)]
net: sched: gred: store bytesin as a 64 bit value
32 bit counters for bytes are not really going to last long in modern
world. Make sch_gred count bytes on a 64 bit counter. It will still
get truncated during dump but follow up patch will add set of new
stat dump attributes.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 15 Nov 2018 06:23:47 +0000 (22:23 -0800)]
net: sched: gred: use extack to provide more details on configuration errors
Add extack messages to -EINVAL errors, to help users identify
their mistakes.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 15 Nov 2018 06:23:46 +0000 (22:23 -0800)]
net: sched: gred: pass extack to nla_parse_nested()
In case netlink wants to provide parsing error pass extack
to nla_parse_nested().
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 15 Nov 2018 06:23:45 +0000 (22:23 -0800)]
net: sched: gred: separate error and non-error path in gred_change()
We will soon want to add more code to the non-error path, separate
it from the error handling flow.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Paolo Abeni [Thu, 15 Nov 2018 01:34:50 +0000 (02:34 +0100)]
udp: fix jump label misuse
The commit 60fb9567bf30 ("udp: implement complete book-keeping for
encap_needed") introduced a severe misuse of jump label APIs, which
syzbot, as reported by Eric, was able to exploit.
When multiple sockets/process can concurrently request (and than
disable) the udp encap, we need to track the activation counter with
*_inc()/*_dec() jump label variants, or we can experience bad things
at disable time.
Fixes: 60fb9567bf30 ("udp: implement complete book-keeping for encap_needed") Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Currently on dequeue() ETF only drops the first expired packet, which
causes a problem if the next packet is already expired. When this
happens, the watchdog will be configured with a time in the past, fire
straight way and the packet will finally be dropped once the dequeue()
function of the qdisc is called again.
We can save quite a few cycles and improve the overall behavior of the
qdisc if we drop all expired packets if the next packet is expired.
This should allow ETF to recover faster from bad situations. But
packet drops are still a very serious warning that the requirements
imposed on the system aren't reasonable.
This was inspired by how the implementation of hrtimers use the
rb_tree inside the kernel.
Signed-off-by: Jesus Sanchez-Palencia <jesus.s.palencia@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
This is just a refactor that will simplify the implementation of the
next patch in this series which will drop all expired packets on the
dequeue flow.
Signed-off-by: Jesus Sanchez-Palencia <jesus.s.palencia@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
ETF's peek() operation is heavily used so use an rb_root_cached instead
and leverage rb_first_cached() which will run in O(1) instead of
O(log n).
Even if on 'timesortedlist_clear()' we could be using rb_erase(), we
choose to use rb_erase_cached(), because if in the future we allow
runtime changes to ETF parameters, and need to do a '_clear()', this
might cause some hard to debug issues.
Signed-off-by: Jesus Sanchez-Palencia <jesus.s.palencia@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yafang Shao [Wed, 14 Nov 2018 14:26:17 +0000 (22:26 +0800)]
tcp: clean up STATE_TRACE
Currently we can use bpf or tcp tracepoint to conveniently trace the tcp
state transition at the run time.
So we don't need to do this stuff at the compile time anymore.
Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
This is a series of a few driver cleanups and some fixups of the code
for the SMSC95XX driver. There have been a few reviews, and the issues
have been fixed so this should be ready for merging.
I will work on the tx-alignment and the other bits of usbnet changes
and produce at least two more patch series for this later.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Dooks [Wed, 14 Nov 2018 11:50:22 +0000 (11:50 +0000)]
usbnet: smsc95xx: check for csum being in last four bytes
The manual states that the checksum cannot lie in the last DWORD of the
transmission, so add a basic check for this and fall back to software
checksumming the packet.
This only seems to trigger for ACK packets with no options or data to
return to the other end, and the use of the tx-alignment option makes
it more likely to happen.
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Dooks [Wed, 14 Nov 2018 11:50:20 +0000 (11:50 +0000)]
usbnet: smsc95xx: simplify tx_fixup code
The smsc95xx_tx_fixup is doing multiple calls to skb_push() to
put an 8-byte command header onto the packet. It would be easier
to do one skb_push() and then copy the data in once the push is
done.
We also make the code smaller by using proper unaligned puts for
the header. This merges in the CPU to LE32 conversion as well and
makes the whole sequence easier to understand hopefully.
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Dooks [Wed, 14 Nov 2018 11:50:19 +0000 (11:50 +0000)]
usbnet: smsc95xx: fix rx packet alignment
The smsc95xx driver already takes into account the NET_IP_ALIGN
parameter when setting up the receive packet data, which means
we do not need to worry about aligning the packets in the usbnet
driver.
Adding the EVENT_NO_IP_ALIGN means that the IPv4 header is now
passed to the ip_rcv() routine with the start on an aligned address.
Tested on Raspberry Pi B3.
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 17 Nov 2018 04:12:31 +0000 (20:12 -0800)]
Merge branch 'dpaa2-eth-add-bql-support'
Ioana Ciocoi Radulescu says:
====================
dpaa2-eth: add bql support
The first two patches make minor tweaks to the driver to
simplify bql implementation. The third patch adds the actual
bql support.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Change the frame consume callback signature:
* the entire FQ structure is passed to the callback instead
of just the queue index
* the NAPI structure can be easily obtained from the channel
it is associated to, so we don't need to pass it explicitly
Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The DPNI object on which we build a network interface has a
certain number of {Rx, Tx, Tx confirmation} frame queues as
resources. The default hardware setup offers one queue of each
type, as well as one DPCON channel, for each core available
in the system.
There are however cases where the number of queues is greater
than the number of cores or channels. Until now, we configured
and used all the frame queues associated with a DPNI, even if it
meant assigning multiple queues of one type to the same channel.
Update the driver to only use a number of queues equal to the
number of channels, ensuring each channel will contain exactly
one Rx and one Tx confirmation queue.
>From the user viewpoint, this change is completely transparent.
Performance wise there is no impact in most scenarios. In case
the number of queues is larger than and not a multiple of the
number of channels, Rx hash distribution offers now better load
balancing between cores, which can have a positive impact on
overall system performance.
Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Tue, 13 Nov 2018 22:22:48 +0000 (23:22 +0100)]
net: 8021q: move vlan offload registrations into vlan_core
Currently, the vlan packet offloads are registered only upon 8021q module
load. However, even without this module loaded, the offloads could be
utilized, for example by openvswitch datapath. As reported by Michael,
that causes 2x to 5x performance improvement, depending on a testcase.
So move the vlan offload registrations into vlan_core and make this
available even without 8021q module loaded.
Reported-by: Michael Shteinbok <michaelsh86@gmail.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Tested-by: Michael Shteinbok <michaelsh86@gmail.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Heiner Kallweit [Mon, 12 Nov 2018 20:16:06 +0000 (21:16 +0100)]
net: phy: check for implementation of both callbacks in phy_drv_supports_irq
Now that the icplus driver has been fixed all PHY drivers supporting
interrupts have both callbacks (config_intr and ack_interrupt)
implemented - as it should be. Therefore phy_drv_supports_irq()
can be changed now to check for both callbacks being implemented.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Michał Mirosław [Sat, 10 Nov 2018 18:58:36 +0000 (19:58 +0100)]
net: remove VLAN_TAG_PRESENT
Replace VLAN_TAG_PRESENT with single bit flag and free up
VLAN.CFI overload. Now VLAN.CFI is visible in networking stack
and can be passed around intact.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
Li RongQing [Tue, 13 Nov 2018 01:34:31 +0000 (09:34 +0800)]
net: slightly optimize eth_type_trans
netperf udp stream shows that eth_type_trans takes certain cpu,
so adjust the mac address check order, and firstly check if it
is device address, and only check if it is multicast address
only if not the device address.
After this change:
To unicast, and skb dst mac is device mac, this is most of time
reduce a comparision
To unicast, and skb dst mac is not device mac, nothing change
To multicast, increase a comparision
Before:
1.03% [kernel] [k] eth_type_trans
After:
0.78% [kernel] [k] eth_type_trans
Signed-off-by: Zhang Yu <zhangyu31@baidu.com> Signed-off-by: Li RongQing <lirongqing@baidu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 15 Nov 2018 23:05:11 +0000 (15:05 -0800)]
Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:
====================
40GbE Intel Wired LAN Driver Updates 2018-11-14
This series contains updates to i40e and virtchnl.
Lance Roy updates i40e to use lockdep_assert_held() instead of
spin_is_locked(), since it is better suited to check locking
requirements.
Jan improves the code readability in XDP by adding the use of a local
variable. Provides protection on methods that create/modify/destroy
VF's via locking mechanism to prevent unstable behaviour and potential
kernel panics.
Krzysztof adds a hardware capability flag to indicate whether firmware
supports stopping the LLDP agent.
Patryk replaces the use of strncpy() with strlcpy() to ensure the buffer
is NULL terminated.
Mitch fixes the issue of trying to start nway on devices that do not
support auto-negotiation, by checking the autoneg state before
attempting to restart nway.
Alice updates virtchnl to keep the checks all together for ease of
readability and consistency. Also fixed a "off by one" error in the
number of traffic classes being calculated.
Richard fixed VF port VLANs, where the priority bits were incorrectly
set because the incorrect shift and mask bits were being used.
Alan adds a bit to set and check if a timeout recovery is already
pending to prevent overlapping transmit timeout recovery.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 15 Nov 2018 22:57:09 +0000 (14:57 -0800)]
test_objagg: Fix warning.
lib/test_objagg.c: In function ‘test_delta_action_item’:
./include/linux/printk.h:308:2: warning: ‘errmsg’ may be used uninitialized in this function [-Wmaybe-uninitialized]
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 15 Nov 2018 22:43:44 +0000 (14:43 -0800)]
Merge branch 'mlxsw-ERP-sharing-multiple-masks'
Ido Schimmel says:
====================
mlxsw: spectrum: acl: Introduce ERP sharing by multiple masks
Jiri says:
The Spectrum-2 hardware has limitation number of ERPs per-region. In
order to accommodate more masks than number of ERPs, the hardware
supports to insert rules with delta bits. By that, the rules with masks
that differ in up-to 8 consecutive bits can share the same ERP.
Patches 1 and 2 fix couple of issues that would appear in existing
selftests after adding delta support
Patch 3 introduces a generic object aggregation library. Now it is
static, but it will get extended for recalculation of aggregations in
the future in order to reach more optimal aggregation.
Patch 4 just simply converts existing ERP code to use the objagg library
instead of a rhashtable.
Patches 5-9 do more or less small changes to prepare ground for the last
patch.
Patch 10 fills-up delta callbacks of objagg library and utilizes the
delta bits for rule insertion.
The last patch adds selftest to test the mlxsw Spectrum-2 delta flows.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Wed, 14 Nov 2018 08:22:35 +0000 (08:22 +0000)]
mlxsw: spectrum: acl: Implement delta for ERP
Allow ERP sharing for multiple mask. Do it by properly implementing
delta_create() objagg object. Use the computed delta info for inserting
rules in A-TCAM.
Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Wed, 14 Nov 2018 08:22:34 +0000 (08:22 +0000)]
mlxsw: spectrum: acl: Push code related to num_ctcam_erps inc/dec into separate helpers
Later on the same code is going to be needed for deltas as well. So push
the procedures related to increment and decrement of num_ctcam_erps
into a separate helpers.
Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Wed, 14 Nov 2018 08:22:33 +0000 (08:22 +0000)]
mlxsw: spectrum: acl: Remove mlxsw_afk_encode() block range args and key/mask check
Since two remaining users of mlxsw_afk_encode() do not specify
block ranges to work on, remove the args. Also, key/mask is always
non-NULL now, so skip the checks.
Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Wed, 14 Nov 2018 08:22:32 +0000 (08:22 +0000)]
mlxsw: spectrum: acl: Don't encode the key again in mlxsw_sp_acl_atcam_12kb_lkey_id_get()
No need to do key encoding again in
mlxsw_sp_acl_atcam_12kb_lkey_id_get(). Instead of that, introduce
a new helper that would just clear unused blocks.
Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Wed, 14 Nov 2018 08:22:30 +0000 (08:22 +0000)]
mlxsw: spectrum: acl: Pass key pointer to master_mask_set/clear
The device requires that the master mask of each region will be
composed from a logical OR between all the unmasked bits in the region.
Currently, this is just a logical OR between all the eRPs used in the
region, but the next patch is going to introduce delta bits support
which need to be taken into account as well.
Since the eRP does not include the delta bits, pass the key pointer to
mlxsw_sp_acl_erp_master_mask_set/clear instead. Convert key->mask to
the bitmap on fly.
Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Wed, 14 Nov 2018 08:22:29 +0000 (08:22 +0000)]
mlxsw: spectrum: acl_erp: Convert to use objagg for tracking ERPs
Currently the ERPs are tracked internally in a hashtable. Benefit from
the newly introduced objagg library and use it to track ERPs. At this
point, there is no nesting of objects done, as the delta_create callback
always returns -EOPNOTSUPP. On the way, add "mask" into ERP mask get and
set functions and struct names.
Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Wed, 14 Nov 2018 08:22:28 +0000 (08:22 +0000)]
lib: introduce initial implementation of object aggregation manager
This lib tracks objects which could be of two types:
1) root object
2) nested object - with a "delta" which differentiates it from
the associated root object
The objects are tracked by a hashtable and reference-counted. User is
responsible of implementing callbacks to create/destroy root entity
related to each root object and callback to create/destroy nested object
delta.
Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Cong Wang [Tue, 13 Nov 2018 00:17:00 +0000 (16:17 -0800)]
net: get rid of __tcp_checksum_complete()
__tcp_checksum_complete() is 100% same with __skb_checksum_complete()
and there is no other caller except tcp_checksum_complete().
So, just use __skb_checksum_complete() there.
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Amritha Nambiar [Tue, 13 Nov 2018 00:15:55 +0000 (16:15 -0800)]
net: sched: cls_flower: Classify packets using port ranges
Added support in tc flower for filtering based on port ranges.
Example:
1. Match on a port range:
-------------------------
$ tc filter add dev enp4s0 protocol ip parent ffff:\
prio 1 flower ip_proto tcp dst_port range 20-30 skip_hw\
action drop
$ tc -s filter show dev enp4s0 parent ffff:
filter protocol ip pref 1 flower chain 0
filter protocol ip pref 1 flower chain 0 handle 0x1
eth_type ipv4
ip_proto tcp
dst_port range 20-30
skip_hw
not_in_hw
action order 1: gact action drop
random type none pass val 0
index 1 ref 1 bind 1 installed 85 sec used 3 sec
Action statistics:
Sent 460 bytes 10 pkt (dropped 10, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
2. Match on IP address and port range:
--------------------------------------
$ tc filter add dev enp4s0 protocol ip parent ffff:\
prio 1 flower dst_ip 192.168.1.1 ip_proto tcp dst_port range 100-200\
skip_hw action drop
$ tc -s filter show dev enp4s0 parent ffff:
filter protocol ip pref 1 flower chain 0 handle 0x2
eth_type ipv4
ip_proto tcp
dst_ip 192.168.1.1
dst_port range 100-200
skip_hw
not_in_hw
action order 1: gact action drop
random type none pass val 0
index 2 ref 1 bind 1 installed 58 sec used 2 sec
Action statistics:
Sent 920 bytes 20 pkt (dropped 20, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
v4:
1. Added condition before setting port key.
2. Organized setting and dumping port range keys into functions
and added validation of input range.
v3:
1. Moved new fields in UAPI enum to the end of enum.
2. Removed couple of empty lines.
v2:
Addressed Jiri's comments:
1. Added separate functions for dst and src comparisons.
2. Removed endpoint enum.
3. Added new bit TCA_FLOWER_FLAGS_RANGE to decide normal/range
lookup.
4. Cleaned up fl_lookup function.
Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Mon, 12 Nov 2018 17:51:01 +0000 (18:51 +0100)]
net: dsa: mv88e6xxx: Work around mv886e6161 SERDES missing MII_PHYSID2
We already have a workaround for a couple of switches whose internal
PHYs only have the Marvel OUI, but no model number. We detect such
PHYs and give them the 6390 ID as the model number. However the
mv88e6161 has two SERDES interfaces in the same address range as its
internal PHYs. These suffer from the same problem, the Marvell OUI,
but no model number. As a result, these SERDES interfaces were getting
the same PHY ID as the mv88e6390, even though they are not PHYs, and
the Marvell PHY driver was trying to drive them.
Add a special case to stop this from happen.
Reported-by: Chris Healy <Chris.Healy@zii.aero> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 15 Nov 2018 19:23:30 +0000 (11:23 -0800)]
Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:
====================
100GbE Intel Wired LAN Driver Updates 2018-11-13
This series contains updates to the ice driver only.
Brett cleans up debug print messages by removing useless or duplicate
messages, and make sure we assign the hardware head pointer to head
instead of the software head pointer. Resolved an issue when disabling
SRIOV we were trying to stop queues multiple times, so make sure we
disable SRIOV before stopping transmit and receive queues for VF.
Tony fixes a potential NULL pointer dereference during a VF reset.
Anirudh resolves an issue where we were releasing the VSI before
removing the VSI scheduler node, which was resulting in an error "Failed
to set LAN Tx queue context, error: -1". Also fixed the guaranteed
number of VSIs available and used by discovering the device
capabilities to determine the 'guar_num_vsi' per function, rather than
always using the theoretical max number of VSIs every time.
Dave avoids a deadlock by nesting RTNL locking, so added a boolean to
determine if the RTNL lock is already held.
Lev fixes bad mask values which would break compilation.
Piotr increases the receive queue disable timeout since it can take
additional time to finish all pending queue requests.
Usha resolves an issue of VLAN priority tagged traffic not appearing on
all traffic classes, which was causing ETS bandwidth shaping to not work
as expected.
Henry fixes the reset path to cleanup the old scheduler tree before
rebuilding it.
Md Fahad removes a unnecessary check which was causing a driver load
error on platforms with more than 128 cores.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 15 Nov 2018 17:44:46 +0000 (09:44 -0800)]
Merge branch 'hns3-hwgro'
Salil Mehta says:
====================
net: hns3: Add support of hardware GRO to HNS3 Driver
This patch-set adds support of hardware assisted GRO feature to
HNS3 driver on Rev B(=0x21) platform. Current hardware only
supports TCP/IPv{4|6} flows.
Peng Li [Thu, 15 Nov 2018 09:29:24 +0000 (09:29 +0000)]
net: hns3: Add skb chain when num of RX buf exceeds MAX_SKB_FRAGS
MAX_SKB_FRAGS in protocol stack is defined as:
MAX_SKB_FRAGS is 17 when PAGE_SIZE is 4K. If HW enable GRO, it may
merge small packets and the rx buffer may be more than
MAX_SKB_FRAGS. So driver will add skb chain when RX buffer num.
more than MAX_SKB_FRAGS.
Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Peng Li [Thu, 15 Nov 2018 09:29:22 +0000 (09:29 +0000)]
net: hns3: Add handling of GRO Pkts not fully RX'ed in NAPI poll
The "FE bit" in the description means the last description for
a packets. When HW GRO enable, HW write data to ring every
packet/buffer, there is greater probability that driver handle
with the describtion but HW still not set the "FE bit".
When drier handle the packet and HW still not set "FE bit",
driver stores skb and bd_num in rx ring, and continue to use the
skb and bd_num in next napi.
Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Heiner Kallweit [Sun, 11 Nov 2018 20:49:12 +0000 (21:49 +0100)]
net: phy: icplus: add config_intr callback
Move IRQ configuration for IP101A/G from config_init to config_intr
callback. Reasons:
1. This allows phylib to disable interrupts if needed.
2. Icplus was the only driver supporting interrupts w/o defining a
config_intr callback. Now we can add a phylib plausibility check
disabling interrupt mode if one of the two irq-related callbacks
isn't defined.
I don't own hardware with this PHY, and the change is based on the
datasheet for IP101A LF (which is supposed to be register-compatible
with IP101A/G). Change is compile-tested only.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Alan Brady [Mon, 29 Oct 2018 18:27:21 +0000 (11:27 -0700)]
i40e: prevent overlapping tx_timeout recover
If a TX hang occurs, we attempt to recover by incrementally resetting.
If we're starved for CPU time, it's possible the reset doesn't actually
complete (or even fire) before another tx_timeout fires causing us to
fly through the different resets without actually doing them.
This adds a bit to set and check if a timeout recovery is already
pending and, if so, bail out of tx_timeout. The bit will get cleared at
the end of i40e_rebuild when reset is complete.
Signed-off-by: Alan Brady <alan.brady@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Mitch Williams [Fri, 26 Oct 2018 21:33:33 +0000 (14:33 -0700)]
i40e: suppress bogus error message
The i40e driver complains about unprivileged VFs trying to configure
promiscuous mode each time a VF reset occurs. This isn't the fault of
the poor VF driver - the PF driver itself is making the request.
To fix this, skip the privilege check if the request is to disable all
promiscuous activity. This gets rid of the bogus message, but doesn't
affect privilege checks, since we really only care if the unprivileged
VF is trying to enable promiscuous mode.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When using port VLAN, for VFs, and setting priority bits, the device
was sending out incorrect priority bits, and also setting the CFI
bit incorrectly.
To fix this, changed shift and mask bit definition for this function, to
use the correct ones.
Signed-off-by: Richard Rodriguez <richard.rodriguez@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Alice Michael [Fri, 26 Oct 2018 21:33:31 +0000 (14:33 -0700)]
virtchnl: Fix off by one error
When calculating the valid length for a VIRTCHNL_OP_ENABLE_CHANNELS
message, we accidentally allowed messages with one extra
virtchnl_channel_info structure on the end. This happened due
to an off by one error, because we forgot that valid_len already
accounted for one virtchnl_channel_info structure, so we need to
subtract one from the num_tc value.
Signed-off-by: Alice Michael <alice.michael@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Alice Michael [Fri, 26 Oct 2018 21:33:30 +0000 (14:33 -0700)]
virtchnl: white space and reorder
White space change.
Move the check on the virtchnl_vsi_queue_config_info struct
to be close to the struct like all the other similar checks.
This keeps it clearer and easier to read.
Signed-off-by: Alice Michael <alice.michael@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jacob Keller [Fri, 26 Oct 2018 21:33:29 +0000 (14:33 -0700)]
i40e: always set ks->base.speed in i40e_get_settings_link_up
In i40e_get_settings_link_up, set ks->base.speed to SPEED_UNKNOWN
in the case where we don't know the link speed.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Mitch Williams [Fri, 26 Oct 2018 21:33:28 +0000 (14:33 -0700)]
i40e: don't restart nway if autoneg not supported
On link types that do not support autoneg, we cannot attempt to restart
nway negotiation. This results in a dead link that requires a power
cycle to remedy.
Fix this by saving off the autoneg state and checking this value before
we try to restart nway.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Patryk Małek [Tue, 30 Oct 2018 17:50:47 +0000 (10:50 -0700)]
i40e: Allow disabling FW LLDP on X722 devices
This patch allows disabling FW LLDP agent on X722 devices.
It also changes a source of information for this feature from
pf->hw_features to pf->hw.flags which are set in i40e_init_adminq.
Signed-off-by: Patryk Małek <patryk.malek@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Alice Michael [Tue, 30 Oct 2018 17:50:46 +0000 (10:50 -0700)]
i40e: update driver version
The version numbers have not been kept up to date and this is
an effort to ammend that.
Signed-off-by: Alice Michael <alice.michael@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jan Sokolowski [Tue, 30 Oct 2018 17:50:45 +0000 (10:50 -0700)]
i40e: Protect access to VF control methods
A scenario has been found in which simultaneous
addition/removal and modification of VF's might cause
unstable behaviour, up to and including kernel panics.
Protect the methods that create/modify/destroy VF's
by locking them behind an atomically set bit in PF status
bitfield.
Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Patryk Małek [Tue, 30 Oct 2018 17:50:44 +0000 (10:50 -0700)]
i40e: Replace strncpy with strlcpy to ensure null termination
Using strncpy allows destination buffer to be not null terminated
after the copying takes place. strlcpy ensures that's not the
case by explicitly setting last element in the buffer as '\0'.
Signed-off-by: Patryk Małek <patryk.malek@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add HW capability flag to indicate that firmware supports stopping
LLDP agent. This feature has been added in FW API 1.7 for XL710
devices and 1.6 for X722. Also raise expected minor version number
for X722 FW API to 6.
Signed-off-by: Krzysztof Galazka <krzysztof.galazka@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jan Sokolowski [Tue, 30 Oct 2018 17:50:42 +0000 (10:50 -0700)]
i40e: Use a local variable for readability
Use a local variable to make the code a bit more readable.
Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Lance Roy [Wed, 3 Oct 2018 05:38:50 +0000 (22:38 -0700)]
i40e: Replace spin_is_locked() with lockdep
lockdep_assert_held() is better suited to checking locking requirements,
since it won't get confused when someone else holds the lock. This is
also a step towards possibly removing spin_is_locked().
Signed-off-by: Lance Roy <ldr709@gmail.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
David S. Miller [Wed, 14 Nov 2018 16:51:28 +0000 (08:51 -0800)]
Merge branch 'nfp-abm-track-all-Qdiscs'
Jakub Kicinski says:
====================
nfp: abm: track all Qdiscs
Our Qdisc offload so far has been very simplistic. We held
and array of marking thresholds and statistics sized to the
number of PF queues. This was sufficient since the only
configuration we supported was single layer of RED Qdiscs
(on top of MQ or not, but MQ isn't really about queuing).
As we move to add more Qdiscs it's time to actually try to
track the full Qdisc hierarchy. This allows us to make sure
our offloaded configuration reflects the SW path better.
We add graft notifications to MQ and RED (PRIO already sends
them) to allow drivers offloading those to learn how Qdiscs
are linked. MQ graft gives us the obvious advantage of being
able to track when Qdiscs are shared or moved. It seems
unlikely HW would offload RED's child Qdiscs but since the
behaviour would change based on linked child we should
stop offloading REDs with modified child. RED will also
handle the child differently during reconfig when limit
parameter is set - so we have to inform the drivers about
the limit, and have them reset the child state when
appropriate.
The NFP driver will now allocate a structure to track each
Qdisc and link it to its children. We will also maintain
a shadow copy of threshold settings - to save device writes
and make it easier to apply defaults when config is
re-evaluated.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 12 Nov 2018 22:58:19 +0000 (14:58 -0800)]
nfp: abm: restructure Qdisc handling
In preparation of handling more Qdisc types switch to a different
offload strategy. We have now recreated the Qdisc hierarchy in
the driver. Every time the hierarchy changes parse it, and update
the configuration of the HW accordingly.
While at it drop the support of pretending that we can instantiate
a single queue on a multi-queue device in HW/FW. MQ is now required,
and each queue will have its own instance of RED.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 12 Nov 2018 22:58:18 +0000 (14:58 -0800)]
nfp: abm: save RED's parameters
Use the new driver Qdisc structure to keep track of parameters
of RED Qdiscs. This way as the Qdisc moves around in the hierarchy
we will be able to configure the HW appropriately.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 12 Nov 2018 22:58:17 +0000 (14:58 -0800)]
nfp: abm: reset RED's child based on limit
RED qdisc will replace its child Qdisc with a new FIFO queue if
it is reconfigured and the limit parameter is not 0.
This means that when it's created with limit of 0 it will have no FIFO,
and all packets will be dropped. If it's changed and limit is specified
it will loose its existing child (implicit graft). Make sure we mark
RED Qdisc child as NFP_QDISC_UNTRACKED if its not the expected FIFO.
nfp_abm_qdisc_replace() will return 1 if Qdisc already existed.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 12 Nov 2018 22:58:16 +0000 (14:58 -0800)]
net: sched: red: notify drivers about RED's limit parameter
RED qdisc's limit parameter changes the behaviour of the qdisc,
for instance if it's set to 0 qdisc will drop all the packets.
When replace operation happens and parameter is set to non-0
a new fifo qdisc will be instantiated and replace the old child
qdisc which will be destroyed.
Drivers need to know the parameter, even if they don't impose
the actual limit to be able to reliably reconstruct the Qdisc
hierarchy.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 12 Nov 2018 22:58:15 +0000 (14:58 -0800)]
nfp: abm: build full Qdisc hierarchy based on graft notifications
Using graft notifications recreate in the driver the full Qdisc
hierarchy. Keep track of how many times each Qdisc is attached
to the hierarchy to make sure we don't offload Qdiscs which are
attached multiple times (device queues can't be shared). For
graft events of Qdiscs we don't know exist make the child as
invalid/untracked.
Note that MQ Qdisc doesn't send destruction events reliably when
device is dismantled, so we need to manually clean out the
children otherwise we'd think Qdiscs which are still in use
are getting freed.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 12 Nov 2018 22:58:14 +0000 (14:58 -0800)]
net: sched: mq: offload a graft notification
Drivers offloading Qdiscs should have reasonable certainty
the offloaded behaviour matches the SW path. This is impossible
if the driver does not know about all Qdiscs or when Qdiscs move
and are reused. Send a graft notification from MQ.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 12 Nov 2018 22:58:13 +0000 (14:58 -0800)]
net: sched: red: offload a graft notification
Drivers offloading Qdiscs should have reasonable certainty
the offloaded behaviour matches the SW path. This is impossible
if the driver does not know about all Qdiscs or when Qdiscs move
and are reused. Send a graft notification from RED. The drivers
are expected to simply stop offloading the Qdisc, if a non-standard
child is ever grafted onto it.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 12 Nov 2018 22:58:12 +0000 (14:58 -0800)]
nfp: abm: allocate Qdisc child table
To keep track of Qdisc hierarchy allocate a table for children
for each Qdisc. RED Qdisc can only have one child.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 12 Nov 2018 22:58:11 +0000 (14:58 -0800)]
nfp: abm: remember which Qdisc is root
Keep track of which Qdisc is currently root. We need to implement
TC_SETUP_ROOT_QDISC handling, and for completeness also clear the
root Qdisc pointer when it's freed. TC_SETUP_ROOT_QDISC isn't always
sent when device is dismantled.
Remembering the root Qdisc will allow us to build the entire hierarchy
in following patches.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 12 Nov 2018 22:58:10 +0000 (14:58 -0800)]
net: sched: provide notification for graft on root
Drivers are currently not notified when a Qdisc is grafted as root.
This requires special casing Qdiscs added with parent = TC_H_ROOT in
the driver. Also there is no notification sent to the driver when
an existing Qdisc is grafted as root.
Add this very simple notifications, drivers should now be able to
track their Qdisc tree fully.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 12 Nov 2018 22:58:09 +0000 (14:58 -0800)]
nfp: abm: track all offload-enabled qdiscs
Allocate an object corresponding to any offloaded qdisc we are
informed about by the kernel. Not only the qdiscs we have a
chance of offloading.
The count of created objects will be used to decide whether
the ethtool TC offload can be disabled, since otherwise we may
miss destroy commands.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 12 Nov 2018 22:58:08 +0000 (14:58 -0800)]
nfp: abm: keep track of all RED thresholds
Instead of writing the threshold out when Qdisc is configured
and not remembering it move to a scheme where we remember all
thresholds. When configuration changes parse the offloaded
Qdiscs and set thresholds appropriately.
This will help future extensions.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 12 Nov 2018 22:58:07 +0000 (14:58 -0800)]
nfp: abm: rename qdiscs -> red_qdiscs
Rename qdiscs member to red_qdiscs. One of following patches will
use the name qdiscs for tracking all qdisc types.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Dmitry Bogdanov [Mon, 12 Nov 2018 15:46:09 +0000 (15:46 +0000)]
net: aquantia: add support of rx-vlan-filter offload
Since it uses the same NIC table as rx flow vlan filter therefore
rx-flow vlan filter accepts only vlans that present on the interface
in case of rx-vlan-filter is on.
Signed-off-by: Dmitry Bogdanov <dmitry.bogdanov@aquantia.com> Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Dmitry Bogdanov [Mon, 12 Nov 2018 15:46:07 +0000 (15:46 +0000)]
net: aquantia: add ethertype and PCP to rx flow filters
L2 EtherType filters allows to filter packet by EtherType field or
both EtherType and User Priority (PCP) field of 802.1Q.
UserPriority (vlan) parameter must be accompanied by mask 0x1FFF. That
is to distinguish VLAN filter from L2 Ethertype filter with
UserPriority since both User Priority and VLAN ID are passed in the
same 'vlan' parameter.
Example:
To add a filter that directs IP4 packess of priority 3 to queue 3:
ethtool -N <ethX> flow-type ether proto 0x800 vlan 0x600 m 0x1FFF \
action 3 loc 16
Signed-off-by: Dmitry Bogdanov <dmitry.bogdanov@aquantia.com> Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Dmitry Bogdanov [Mon, 12 Nov 2018 15:46:05 +0000 (15:46 +0000)]
net: aquantia: add vlan id to rx flow filters
The VLAN filter (VLAN id) is compared against 16 filters.
VLAN id must be accompanied by mask 0xF000. That is to distinguish
VLAN filter from L2 Ethertype filter with UserPriority since both
User Priority and VLAN ID are passed in the same 'vlan' parameter.
Flow type may be any as it is not matched for VLAN filter.
Due to fixed order of the rules in the NIC, the location 0-15 are
reserved for vlan filters.
Example:
To add a rule that directs packets from VLAN 2001 to queue 5:
ethtool -N <ethX> flow-type ip4 vlan 2001 m 0xF000 action 5 loc 0
Signed-off-by: Dmitry Bogdanov <dmitry.bogdanov@aquantia.com> Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Dmitry Bogdanov [Mon, 12 Nov 2018 15:46:02 +0000 (15:46 +0000)]
net: aquantia: add support of L3/L4 ntuple filters
Add support of L3/L4 5-tuple {protocol, src-ip, dst-ip, src-port, dst-port}
filters. Mask is not supported. Src-port and dst-port are only compared for
TCP/UDP/SCTP packets. Both IPv4 and IPv6 are supported.
The supported actions are the drop and the queue assignment.
Due to fixed order of the rules in the NIC, the location 32-39 are
reserved for L3/L4 5-tuple filters. The locations 32 and 36 are
reserved for IPv6 filters.
Signed-off-by: Dmitry Bogdanov <dmitry.bogdanov@aquantia.com> Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Dmitry Bogdanov [Mon, 12 Nov 2018 15:46:00 +0000 (15:46 +0000)]
net: aquantia: add infrastructure for ntuple rules
Add infrastructure to support ntuple filter configuration.
Add rule, remove rule, reapply on interface up.
Signed-off-by: Dmitry Bogdanov <dmitry.bogdanov@aquantia.com> Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>