Edward Cree [Thu, 8 Mar 2018 15:45:03 +0000 (15:45 +0000)]
net: ethtool: extend RXNFC API to support RSS spreading of filter matches
We use a two-step process to configure a filter with RSS spreading. First,
the RSS context is allocated and configured using ETHTOOL_SRSSH; this
returns an identifier (rss_context) which can then be passed to subsequent
invocations of ETHTOOL_SRXCLSRLINS to specify that the offset from the RSS
indirection table lookup should be added to the queue number (ring_cookie)
when delivering the packet. Drivers for devices which can only use the
indirection table entry directly (not add it to a base queue number)
should reject rule insertions combining RSS with a nonzero ring_cookie.
Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes: 9426bbc6de99 ("rds: use list structure to track information for zerocopy completion notification") Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Thu, 8 Mar 2018 09:36:28 +0000 (12:36 +0300)]
net/ncsi: unlock on error in ncsi_set_interface_nl()
There are two error paths which are missing unlocks in this function.
Fixes: 955dc68cb9b2 ("net/ncsi: Add generic netlink family") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Thu, 8 Mar 2018 09:36:04 +0000 (12:36 +0300)]
net/ncsi: use kfree_skb() instead of kfree()
We're supposed to use kfree_skb() to free these sk_buffs.
Fixes: 955dc68cb9b2 ("net/ncsi: Add generic netlink family") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Avoid doing useless work by making sure that the response_list is not empty
before scheduling work to process it.
Signed-off-by: Prasad Kanneganti <prasad.kanneganti@cavium.com> Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Intiyaz Basha [Thu, 8 Mar 2018 06:12:24 +0000 (22:12 -0800)]
liquidio: Resolved mbox read issue while reading more than one 64bit data
Corrected length check when data received in the mbox is more than one
64 bit data value
Signed-off-by: Intiyaz Basha <intiyaz.basha@cavium.com> Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
This series follows our previous one to lay out the foundations for IPSec
in user-space and extend current kernel netdev IPSec support. As noted in
our previous pull request cover letter "mlx5-updates-2018-02-28-1 (IPSec-1)",
the IPSec mechanism will be supported through our flow steering mechanism.
Therefore, we need to change the initialization order. Furthermore, IPsec
is also supported in both egress and ingress. Since our current flow
steering is egress only, we add an empty (only implemented through FPGA
steering ops) egress namespace to handle that case. We also implement
the required flow steering callbacks and logic in our FPGA driver.
We extend the FPGA support for ESN and modifying a xfrm too. Therefore, we
add support for some new FPGA command interface that supports them. The
other required bits are added too. The new features and requirements are
advertised via cap bits.
Last but not least, we revise our driver's accel_esp API. This API will be
shared between our netdev and IB driver, so we need to have all the required
functionality from both worlds.
Regards,
Aviad and Matan
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
====================
ibmvnic: Clean up net close and fix reset bug
This patch set cleans up and reorganizes the driver's net_device
close function and leverages that to fix up a bug that can occur
during some device resets. Some reset cases require the backing
adapter to be disabled before continuing, but other cases, such as
during a device failover or partition migration, do not require this
step. Since the device will not be initialized at this stage and
its command-processing queue is closed, do not send the request to
disable the device as it could result in an error or timeout
disrupting the reset.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Falcon [Wed, 7 Mar 2018 23:51:47 +0000 (17:51 -0600)]
ibmvnic: Do not disable device during failover or partition migration
During a device failover or partition migration reset, it is not
necessary to disable the backing adapter since it should not be
running yet and its Command-Response Queue is closed. Sending
device commands during this time could result in an error or
timeout disrupting the reset process. In these cases, just halt
transmissions, clean up resources, and continue with reset.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Falcon [Wed, 7 Mar 2018 23:51:46 +0000 (17:51 -0600)]
ibmvnic: Reorganize device close
Introduce a function to halt network operations and clean up any
unused or outstanding socket buffers. Then, during device close,
disable backing adapter before halting all queues and performing
cleanup. This ensures all backing device operations will be
stopped before the driver cleans up shared resources.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Falcon [Wed, 7 Mar 2018 23:51:45 +0000 (17:51 -0600)]
ibmvnic: Clean up device close
Remove some dead code now that RX pools are being cleaned. This
was included to wait until any pending RX queue interrupts are
processed, but NAPI polling should be disabled by this point.
Another minor change is to use the net device parameter for any
print functions instead of accessing it from the adapter structure.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
William Tu [Wed, 7 Mar 2018 23:38:48 +0000 (15:38 -0800)]
openvswitch: fix vport packet length check.
When sending a packet to a tunnel device, the dev's hard_header_len
could be larger than the skb->len in function packet_length().
In the case of ip6gretap/erspan, hard_header_len = LL_MAX_HEADER + t_hlen,
which is around 180, and an ARP packet sent to this tunnel has
skb->len = 42. This causes the 'unsign int length' to become super
large because it is negative value, causing the later ovs_vport_send
to drop it due to over-mtu size. The patch fixes it by setting it to 0.
Signed-off-by: William Tu <u9012063@gmail.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
this series continues to review and to convert pernet_operations
to make them possible to be executed in parallel for several
net namespaces in the same time. There are mostly netfilter
operations (and they should be the last netfilter's), also
there are two patches touching pktgen and xfrm.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Kirill Tkhai [Wed, 7 Mar 2018 09:41:16 +0000 (12:41 +0300)]
net: Convert ipv4_net_ops
These pernet_operations register and unregister bunch
of nf_conntrack_l4proto. Exit method unregisters related
sysctl, init method calls init_net and get_net_proto.
The whole builtin_l4proto4 array has pretty simple
init_net and get_net_proto methods. The first one register
sysctl table, the second one is just RO memory dereference.
So, these pernet_operations are safe to be marked as async.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Kirill Tkhai [Wed, 7 Mar 2018 09:41:07 +0000 (12:41 +0300)]
net: Convert iptable_security_net_ops
These pernet_operations unregister net::ipv4::iptable_security table.
Another net/pernet_operations do not send ipv4 packets to foreign
net namespaces. So, we mark them async.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Kirill Tkhai [Wed, 7 Mar 2018 09:40:58 +0000 (12:40 +0300)]
net: Convert iptable_raw_net_ops
These pernet_operations unregister net::ipv4::iptable_raw table.
Another net/pernet_operations do not send ipv4 packets to foreign
net namespaces. So, we mark them async.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Kirill Tkhai [Wed, 7 Mar 2018 09:40:45 +0000 (12:40 +0300)]
net: Convert iptable_nat_net_ops
These pernet_operations unregister net::ipv4::nat_table table.
Another net/pernet_operations do not send ipv4 packets to foreign
net namespaces. So, we mark them async.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Kirill Tkhai [Wed, 7 Mar 2018 09:40:36 +0000 (12:40 +0300)]
net: Convert iptable_mangle_net_ops
These pernet_operations unregister net::ipv4::iptable_mangle table.
Another net/pernet_operations do not send ipv4 packets to foreign
net namespaces. So, we mark them async.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Kirill Tkhai [Wed, 7 Mar 2018 09:40:28 +0000 (12:40 +0300)]
net: Convert arptable_filter_net_ops
These pernet_operations unregister net::ipv4::arptable_filter.
Another net/pernet_operations do not send arp packets to foreign
net namespaces. So, we mark them async.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Kirill Tkhai [Wed, 7 Mar 2018 09:40:19 +0000 (12:40 +0300)]
net: Convert pg_net_ops
These pernet_operations create per-net pktgen threads
and /proc entries. These pernet subsys looks closed
in itself, and there are no pernet_operations outside
this file, which are interested in the threads.
Init and/or exit methods look safe to be executed
in parallel.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Kirill Tkhai [Wed, 7 Mar 2018 09:40:09 +0000 (12:40 +0300)]
net: Convert nfnl_queue_net_ops
These pernet_operations register and unregister net::nf::queue_handler
and /proc entry. The handler is accessed only under RCU, so this looks
safe to convert them.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Kirill Tkhai [Wed, 7 Mar 2018 09:40:00 +0000 (12:40 +0300)]
net: Convert nfnl_log_net_ops
These pernet_operations create and destroy /proc entries.
Also, exit method unsets nfulnl_logger. The logger is not
set by default, and it becomes bound via userspace request.
So, they look safe to be made async.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Kirill Tkhai [Wed, 7 Mar 2018 09:39:51 +0000 (12:39 +0300)]
net: Convert cttimeout_ops
These pernet_operations also look closed in themself.
Exit method touch only per-net structures, so it's
safe to execute them for several net namespaces in parallel.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Kirill Tkhai [Wed, 7 Mar 2018 09:39:42 +0000 (12:39 +0300)]
net: Convert nfnl_acct_ops
These pernet_operations look closed in themself,
and there are no other users of net::nfnl_acct_list
outside. They are safe to be executed for several
net namespaces in parallel.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Kirill Tkhai [Wed, 7 Mar 2018 09:39:33 +0000 (12:39 +0300)]
net: Convert nfnetlink_net_ops
These pernet_operations create and destroy net::nfnl
socket of NETLINK_NETFILTER code. There are no other
places, where such type the socket is created, except
these pernet_operations. It seem other pernet_operations
depending on CONFIG_NETFILTER_NETLINK send messages
to this socket. So, we mark it async.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Kirill Tkhai [Wed, 7 Mar 2018 09:39:14 +0000 (12:39 +0300)]
net: Convert xfrm_user_net_ops
These pernet_operations create and destroy net::xfrm::nlsk
socket of NETLINK_XFRM. There is only entry point, where
it's dereferenced, it's xfrm_user_rcv_msg(). There is no
in-kernel senders to this socket.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
have exit methods, which call ip6t_unregister_table().
ip6table_filter_net_ops has init method registering
filter table.
Since there must not be in-flight ipv6 packets at the time
of pernet_operations execution and since pernet_operations
don't send ipv6 packets each other, these pernet_operations
are safe to be async.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net/sched: cls_flower: Add support to handle first frag as match field
Allow setting firstfrag as matching option in tc flower classifier.
# tc filter add dev eth0 protocol ip parent ffff: \
flower indev eth0 \
ip_flags firstfrag
action mirred egress redirect dev eth1
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 8 Mar 2018 16:23:38 +0000 (11:23 -0500)]
Merge branch 'hns3-next'
Peng Li says:
====================
fix some bugs for hns3 driver
This patchset fix some bugs for hns3 driver.
[Patch 1/6 - Patch 3/6] fix bugs related about VF driver.
[Patch 3/6 - Patch 6/6] fix the bugs about ethtool_ops.set_channels.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Peng Li [Thu, 8 Mar 2018 11:41:53 +0000 (19:41 +0800)]
net: hns3: fix the queue id for tqp enable&&reset
Command HCLGE_OPC_CFG_COM_TQP_QUEUE should use queue id in the
function, but command HCLGE_OPC_RESET_TQP_QUEUE should use global
queue id.
This patch fixes the queue id about queue enable/disable/reset.
Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Peng Li [Thu, 8 Mar 2018 11:41:51 +0000 (19:41 +0800)]
net: hns3: set the cmdq out_vld bit to 0 after used
Driver check the out_vld bit when get a new cmdq BD, if the bit is 1,
the BD is valid. driver Should set the bit 0 after used and hw will
set the bit 1 if get a valid BD.
Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Anders Roxell [Thu, 8 Mar 2018 10:17:23 +0000 (11:17 +0100)]
selftests/net: enable fragments for fib-onlink-tests
We miss CONFIG_* fragments so test fib-onlink-tests.sh can do:
ip li add lisa type vrf table 1101
ip li add veth1 type veth peer name veth2
And the follow message occurs if it isn't enabled:
Configuring interfaces
RTNETLINK answers: Operation not supported
This enables for NET_NRF (and friends) and VETH so we can create a vrf
table and veth.
Fixes: 153e1b84f477 ("selftests: Add FIB onlink tests") Signed-off-by: Anders Roxell <anders.roxell@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Paolo Abeni [Thu, 8 Mar 2018 09:29:30 +0000 (10:29 +0100)]
ipvlan: properly annotate rx_handler access
The rx_handler field is rcu-protected, but I forgot to use the
proper accessor while refactoring netif_is_ipvlan_port(). Such
function only check the rx_handler value, so it is safe, but we need
to properly read rx_handler via rcu_access_pointer() to avoid sparse
warnings.
Fixes: 1ec54cb44e67 ("net: unpollute priv_flags space") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The kernel compiled with CONFIG_REFCOUNT_FULL produces the following
error. The reason to it that initial value of refcount_t is supposed
to be more than 0, change it.
Aviad Yehezkel [Thu, 18 Jan 2018 14:02:17 +0000 (16:02 +0200)]
net/mlx5: IPSec, Add support for ESN
Currently ESN is not supported with IPSec device offload.
This patch adds ESN support to IPsec device offload.
Implementing new xfrm device operation to synchronize offloading device
ESN with xfrm received SN. New QP command to update SA state at the
following:
^ - marks where QP command invoked to update the SA ESN state
machine.
| - marks the start of the ESN scope (0-2^32-1). At this point move SA
ESN overlap bit to zero and increment ESN.
* - marks the middle of the ESN scope (2^31). At this point move SA
ESN overlap bit to one.
Aviad Yehezkel [Sun, 18 Feb 2018 13:07:20 +0000 (15:07 +0200)]
net/mlx5: Add flow-steering commands for FPGA IPSec implementation
In order to add a context to the FPGA, we need to get both the software
transform context (which includes the keys, etc) and the
source/destination IPs (which are included in the steering
rule). Therefore, we register new set of firmware like commands for
the FPGA. Each time a rule is added, the steering core infrastructure
calls the FPGA command layer. If the rule is intended for the FPGA,
it combines the IPs information with the software transformation
context and creates the respective hardware transform.
Afterwards, it calls the standard steering command layer.
Aviad Yehezkel [Thu, 18 Jan 2018 11:05:48 +0000 (13:05 +0200)]
net/mlx5: Refactor accel IPSec code
The current code has one layer that executed FPGA commands and
the Ethernet part directly used this code. Since downstream patches
introduces support for IPSec in mlx5_ib, we need to provide some
abstractions. This patch refactors the accel code into one layer
that creates a software IPSec transformation and another one which
creates the actual hardware context.
The internal command implementation is now hidden in the FPGA
core layer. The code also adds the ability to share FPGA hardware
contexts. If two contexts are the same, only a reference count
is taken.
Aviad Yehezkel [Tue, 16 Jan 2018 14:12:22 +0000 (16:12 +0200)]
net/mlx5: IPSec, Add command V2 support
This patch adds V2 command support.
New fpga devices support extended features (udp encap, esn etc...), this
features require new hardware sadb format therefore we have a new version
of commands to manipulate it.
Yossi Kuperman [Sun, 22 Oct 2017 16:45:45 +0000 (19:45 +0300)]
net/mlx5e: IPSec, Add support for ESP trailer removal by hardware
Current hardware decrypts and authenticates incoming ESP packets.
Subsequently, the software extracts the nexthdr field, truncates the
trailer and adjusts csum accordingly.
With this patch and a capable device, the trailer is being removed
by the hardware and the nexthdr field is conveyed via PET. This way
we avoid both the need to access the trailer (cache miss) and to
compute its relative checksum, which significantly improve
the performance.
Experiment shows that trailer removal improves the performance by
2Gbps, (netperf). Both forwarding and host-to-host configurations.
Yossi Kuperman [Sun, 22 Oct 2017 16:43:58 +0000 (19:43 +0300)]
net/mlx5: IPSec, Generalize sandbox QP commands
The current code assume only SA QP commands.
Refactor in order to pave the way for new QP commands:
1. Generic cmd response format.
2. SA cmd checks are in dedicated functions.
3. Aligned debug prints.
Eric Dumazet [Wed, 7 Mar 2018 16:43:19 +0000 (08:43 -0800)]
ip6mr: remove synchronize_rcu() in favor of SOCK_RCU_FREE
Kirill found that recently added synchronize_rcu() call in
ip6mr_sk_done()
was slowing down netns dismantle and posted a patch to use it only if
the socket
was found.
I instead suggested to get rid of this call, and use instead
SOCK_RCU_FREE
We might later change IPv4 side to use the same technique and unify
both stacks. IPv4 does not use synchronize_rcu() but has a call_rcu()
that could be replaced by SOCK_RCU_FREE.
Tested:
time for i in {1..1000}; do unshare -n /bin/false;done
Before : real 7m18.911s
After : real 10.187s
Fixes: 8571ab479a6e ("ip6mr: Make mroute_sk rcu-based") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Kirill Tkhai <ktkhai@virtuozzo.com> Cc: Yuval Mintz <yuvalm@mellanox.com> Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
A couple of enhancements to the rds zerocop code
- patch 1 refactors rds_message_copy_from_user to pull the zcopy logic
into its own function
- patch 2 drops the usage sk_buff to track MSG_ZEROCOPY cookies and
uses a simple linked list (enhancement suggested by willemb during
code review)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
rds: use list structure to track information for zerocopy completion notification
Commit 401910db4cd4 ("rds: deliver zerocopy completion notification
with data") removes support fo r zerocopy completion notification
on the sk_error_queue, thus we no longer need to track the cookie
information in sk_buff structures.
This commit removes the struct sk_buff_head rs_zcookie_queue by
a simpler list that results in a smaller memory footprint as well
as more efficient memory_allocation time.
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
rds: refactor zcopy code into rds_message_zcopy_from_user
Move the large block of code predicated on zcopy from
rds_message_copy_from_user into a new function,
rds_message_zcopy_from_user()
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Remove VLA usage and change the 'len' argument to a u8 and use a 256
byte buffer on the stack. Notice that these lengths are limited by the
encoding field in the VPD structure, which is a u8 [1].
Fix the SO_ZEROCOPY switch case on sock_setsockopt() avoiding the
ret values to be overwritten by the one set on the default case.
Fixes: 28190752c7092 ("sock: permit SO_ZEROCOPY on PF_RDS socket") Signed-off-by: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com> Acked-by: Willem de Bruijn <willemb@google.com> Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
This series adds unicast filtering support to the Marvell PPv2 controller.
This is implemented using the header parser cababilities of the PPv2,
which allows for generic packet filtering based on matching patterns in
the packet headers.
PPv2 controller only has 256 of these entries, and we need to share them
with other features, such as VLAN filtering.
For each interface, we have 5 entries dedicated to unicast filtering (the
controller's own address, and 4 other), and 21 to multicast filtering.
When this number is reached, the controller switches to unicast or
multicast promiscuous mode.
The first patch reworks the function that adds and removes addresses to the
filter. This is preparatory work to ease UC filter implementation.
The second patch adds the UC filtering feature.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Marvell PPv2 controller can be used to implement packet filtering based
on the destination MAC address. This is already used to implement
multicast filtering. This patch adds support for Unicast filtering.
Filtering is based on so-called "TCAM entries" to implement filtering.
Due to their limited number and the fact that these are also used for
other purposes, we reserve 80 entries for both unicast and multicast
filters. On top of the broadcast address, and each interface's own MAC
address, we reserve 25 entries per port, 4 for unicast filters, 21 for
multicast.
Whenever unicast or multicast range for one port is full, the filtering
is disabled and port goes into promiscuous mode for the given type of
addresses.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: mvpp2: Simplify MAC filtering function parameters
The mvpp2_prs_mac_da_accept function takes into parameter both the
struct representing the controller and the port id. This is meaningful
when we want to create TCAM entries for non-initialized ports, but in
this case we expect the port to be initialized before starting adding or
removing MAC addresses to the per-port filter.
This commit changes the function so that it takes struct mvpp2_port as
a parameter instead.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Johan Hovold [Wed, 7 Mar 2018 09:46:58 +0000 (10:46 +0100)]
net: cdc_eem: clean up bind error path
Drop bogus call to usb_driver_release_interface() from an error path in
the usbnet bind() callback, which is called during interface probe. At
this point the interface is not bound and usb_driver_release_interface()
returns early.
Also remove the bogus call to clear the interface data, which is owned
by the usbnet driver and would not even have been set by the time bind()
is called.
Signed-off-by: Johan Hovold <johan@kernel.org> Acked-by: Oliver Neukum <oneukum@suse.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Johan Hovold [Wed, 7 Mar 2018 09:46:57 +0000 (10:46 +0100)]
net: kalmia: clean up bind error path
Drop bogus call to usb_driver_release_interface() from an error path in
the usbnet bind() callback, which is called during interface probe. At
this point the interface is not bound and usb_driver_release_interface()
returns early.
Also remove the bogus call to clear the interface data, which is owned
by the usbnet driver and would not even have been set by the time bind()
is called.
Signed-off-by: Johan Hovold <johan@kernel.org> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
This series consists of some fixes and refactors for the mlx5 drivers,
especially around the FPGA and flow steering. Most of them are trivial
fixes and are the foundation of allowing IPSec acceleration from user-space.
We use flow steering abstraction in order to accelerate IPSec packets.
When a user creates a steering rule, [s]he states that we'll carry an
encrypt/decrypt flow action (using a specific configuration) for every
packet which conforms to a certain match. Since currently offloading these
packets is done via FPGA, we'll add another set of flow steering ops.
These ops will execute the required FPGA commands and then call the
standard steering ops.
In order to achieve this, we need that the commands will get all the
required information. Therefore, we pass the fte object and embed the
flow_action struct inside the fte. In addition, we add the shim layer
that will later be used for alternating between the standard and the
FPGA steering commands.
Some fixes, like " net/mlx5e: Wait for FPGA command responses with a timeout"
are very relevant for user-space applications, as these applications could
be killed, but we still want to wait for the FPGA and update the kernel's
database.
Regards,
Aviad and Matan
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefano Brivio [Tue, 6 Mar 2018 21:16:27 +0000 (22:16 +0100)]
selftests: net: Introduce first PMTU test
One single test implemented so far: test_pmtu_vti6_exception
checks that the PMTU of a route exception, caused by a tunnel
exceeding the link layer MTU, is affected by administrative
changes of the tunnel MTU. Creation of the route exception is
checked too.
Requested-by: David Ahern <dsahern@gmail.com> Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Fengguang Wu [Tue, 6 Mar 2018 18:23:23 +0000 (02:23 +0800)]
enic: fix boolreturn.cocci warnings
drivers/net/ethernet/cisco/enic/vnic_dev.c:1294:9-10: WARNING: return of 0/1 in function 'vnic_dev_capable_udp_rss' with return type bool
Return statements in functions returning bool should use
true/false instead of 1/0.
Generated by: scripts/coccinelle/misc/boolreturn.cocci
Fixes: 48398b6e7065 ("enic: set UDP rss flag") CC: Govindarajulu Varadarajan <gvaradar@cisco.com> Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
drivers/net/dsa/mv88e6xxx/serdes.c:66:9-10: WARNING: return of 0/1 in function 'mv88e6352_port_has_serdes' with return type bool
Return statements in functions returning bool should use
true/false instead of 1/0.
Generated by: scripts/coccinelle/misc/boolreturn.cocci
Fixes: eb755c3f6b7d ("net: dsa: mv88e6xxx: Add helper to determining if port has SERDES") CC: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Paolo Abeni [Tue, 6 Mar 2018 09:56:31 +0000 (10:56 +0100)]
net: unpollute priv_flags space
the ipvlan device driver defines and uses 2 bits inside the priv_flags
net_device field. Such bits and the related helper are used only
inside the ipvlan device driver, and the core networking does not
need to be aware of them.
This change moves netif_is_ipvlan* helper in the ipvlan driver and
re-implement them looking for ipvlan specific symbols instead of
using priv_flags.
Overall this frees two bits inside priv_flags - and move the following
ones to avoid gaps - without any intended functional change.
Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
====================
net: phy: remove phy_error from phy_disable_interrupts
All callers of phy_disable_interrupts() call phy_error() in the error
case. Therefore we don't need to do this within the function too.
This change also allows us to use phy_disable_interrupts() in code
holding phydev->lock (because phy_error() takes this lock).
Make use of this in phy_stop().
v2:
- splitted into two separate patches
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Heiner Kallweit [Mon, 5 Mar 2018 21:34:27 +0000 (22:34 +0100)]
net: phy: remove phy_error from phy_disable_interrupts
All callers of phy_disable_interrupts() call phy_error() in the error
case. Therefore we don't need to do this within the function too.
This change also allows us to use phy_disable_interrupts() in code
holding phydev->lock (because phy_error() can take this lock).
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Prashant Bhole [Tue, 6 Mar 2018 08:31:32 +0000 (17:31 +0900)]
selftests/net: fix in_netns.sh script
execute the subprocess in netns using 'ip netns exec'
Fixes: cc30c93fa020 ("selftests/net: ignore background traffic in psock_fanout") Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: mvpp2: mvpp2_check_hw_buf_num() can be static
Fixes: effbf5f58d64 ("net: mvpp2: update the BM buffer free/destroy logic") Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
There are two compatibility strings for mv88e6xxx, but it isn't clear
from the documentation why only those two exist when the mv88e6xxx driver
supports more than the 6085 and 6190. Briefly describe how the compatible
property is used, and provide guidance on which to use.
The model list comes from looking at port_base_addr values (0x0 vs 0x10)
in drivers/net/dsa/mv88e6xxx/chip.c.
Signed-off-by: Brandon Streiff <brandon.streiff@ni.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
tipc: bcast: use true and false for boolean values
Assign true or false to boolean variables instead of an integer value.
This issue was detected with the help of Coccinelle.
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 7 Mar 2018 17:05:56 +0000 (12:05 -0500)]
Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:
====================
1GbE Intel Wired LAN Driver Updates 2018-03-05
This series contains updates to igb only.
Corinna Vinschen adds the support for trusted VFs into the igb driver.
Mika fixes an issue where PCIe device is physically unplugged can cause
a kernel crash. This issue is that netif_device_detach() is called in
these cases, which prevents netif_unregister() from bringing the device
down properly.
Christophe JAILLET fixes an issue with igb where HWTSTAMP_TX_ON was
being handled like a bit mask and not a value.
v2: dropped the e1000e fix from the series since I will be pushing it
through David Miller's net tree.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 7 Mar 2018 16:44:43 +0000 (11:44 -0500)]
Merge branch 'lan743x-driver'
Bryan Whitehead says:
====================
lan743x: Add new lan743x driver
Add new lan743x driver.
The lan743x from Microchip Technologies Inc,
is a PCIe to Gigabit Ethernet Controller.
Updates for V4:
Patch 1/2 - Applied community suggestions
convert to using module_pci_driver
Updates for V3:
Patch 1/2 - Applied community suggestions
removed initialization tracking flags.
converted to 64 bit statistics.
converted tx clean up tasklet to napi.
Updates for V2:
Patch 1/2 - Applied community suggestions
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Bryan Whitehead [Mon, 5 Mar 2018 19:23:30 +0000 (14:23 -0500)]
lan743x: Add main source files for new lan743x driver
Add main source files for new lan743x driver
Signed-off-by: Bryan Whitehead <Bryan.Whitehead@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
SCTP_SENDALL: This flag, if set, will cause a one-to-many
style socket to send the message to all associations that
are currently established on this socket. For the one-to-
one style socket, this flag has no effect.
Note there is another msg_control option:
5.3.8. SCTP AUTH Information Structure (SCTP_AUTHINFO)
It's a little complicated, I will post it in another patchset after
this.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Mon, 5 Mar 2018 12:44:20 +0000 (20:44 +0800)]
sctp: add support for snd flag SCTP_SENDALL process in sendmsg
This patch is to add support for snd flag SCTP_SENDALL process
in sendmsg, as described in section 5.3.4 of RFC6458.
With this flag, you can send the same data to all the asocs of
this sk once.
Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Mon, 5 Mar 2018 12:44:19 +0000 (20:44 +0800)]
sctp: add support for SCTP_DSTADDRV4/6 Information for sendmsg
This patch is to add support for Destination IPv4/6 Address options
for sendmsg, as described in section 5.3.9/10 of RFC6458.
With this option, you can provide more than one destination addrs
to sendmsg when creating asoc, like sctp_connectx.
It's also a necessary send info for sctp_sendv.
Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Mon, 5 Mar 2018 12:44:18 +0000 (20:44 +0800)]
sctp: add support for PR-SCTP Information for sendmsg
This patch is to add support for PR-SCTP Information for sendmsg,
as described in section 5.3.7 of RFC6458.
With this option, you can specify pr_policy and pr_value for user
data in sendmsg.
It's also a necessary send info for sctp_sendv.
Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
When addressing a review comment in a early version of the offending
patch a comment where left in which should have been removed. Remove the
comment to keep it consistent with the code.
Fixes: 75efa06f457bbed3 ("ravb: add support for changing MTU") Reported-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Kirill Tkhai [Thu, 1 Mar 2018 12:23:28 +0000 (15:23 +0300)]
net: Make account struct net to memcg
The patch adds SLAB_ACCOUNT to flags of net_cachep cache,
which enables accounting of struct net memory to memcg kmem.
Since number of net_namespaces may be significant, user
want to know, how much there were consumed, and control.
Note, that we do not account net_generic to the same memcg,
where net was accounted, moreover, we don't do this at all (*).
We do not want the situation, when single memcg memory deficit
prevents us to register new pernet_operations.
(*)Even despite there is !current process accounting already
available in linux-next. See kmalloc_memcg() there for the details.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Aviad Yehezkel [Sun, 18 Feb 2018 13:00:54 +0000 (15:00 +0200)]
net/mlx5: Flow steering cmd interface should get the fte when deleting
Previously, deleting a flow steering entry only got the index.
Since the FPGA implementation of FTE's deletion might need to dig
inside the FTE itself, we would like to get the FTE's context.
Changing the interface to pass the FTE context.
Aviad Yehezkel [Sun, 18 Feb 2018 11:17:17 +0000 (13:17 +0200)]
net/mlx5: Add empty egress namespace to flow steering core
Currently, we don't support egress flow steering namespace in mlx5
flow steering core implementation. However, when we want to encrypt
a packet, we model it as a flow steering rule in the egress path.
To overcome this, we add an empty egress namespace to flow steering.
This namespace is initialized only when ipsec support exists.
In the future, this will grow to a full blown full steering
implementation, resembling the ingress path.
Matan Barak [Sun, 20 Aug 2017 12:46:51 +0000 (15:46 +0300)]
net/mlx5: Add shim layer between fs and cmd
The shim layer allows each namespace to define possibly different
functionality for add/delete/update commands. The shim layer
introduced here, will be used to support flow steering with the FPGA.
Matan Barak [Wed, 16 Aug 2017 06:43:48 +0000 (09:43 +0300)]
{net,IB}/mlx5: Add has_tag to mlx5_flow_act
The has_tag member will indicate whether a tag action was specified
in flow specification.
A flow tag 0 = MLX5_FS_DEFAULT_FLOW_TAG is assumed a valid flow tag
that is currently used by mlx5 RDMA driver, whereas in HW flow_tag = 0
means that the user doesn't care about flow_tag. HW always provide
a flow_tag = 0 if all flow tags requested on a specific flow are 0.
So we need a way (in the driver) to differentiate between a user really
requesting flow_tag = 0 and a user who does not care, in order to be
able to report conflicting flow tags on a specific flow.
Boris Pismenny [Wed, 16 Aug 2017 06:33:30 +0000 (09:33 +0300)]
IB/mlx5: Pass mlx5_flow_act struct instead of multiple arguments
Group and pass all function arguments of parse_flow_attr call in one
common struct mlx5_flow_act.
This patch passes all the action arguments of parse_flow_attr in one common
struct mlx5_flow_act. It allows us to scale the number of actions without adding
new arguments to the function.
Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Acked-by: Jason Gunthorpe <jgg@mellanox.com>
Matan Barak [Sun, 19 Nov 2017 15:51:13 +0000 (15:51 +0000)]
net/mlx5: FPGA and IPSec initialization to be before flow steering
Some flow steering namespace initialization (i.e. egress namespace)
might depend on FPGA capabilities. Changing the initialization order
such that the FPGA will be initialized before flow steering.
Flow steering fs cmds initialization might depend on
IPSec capabilities. Changing the initialization order such
that the IPSec will be initialized before flow steering as well.