Jose Abreu [Wed, 27 Jun 2018 14:57:02 +0000 (15:57 +0100)]
net: stmmac: Add support for CBS QDISC
This adds support for CBS reconfiguration using the TC application.
A new callback was added to TC ops struct and another one to DMA ops to
reconfigure the channel mode.
Tested in GMAC5.10.
Signed-off-by: Jose Abreu <joabreu@synopsys.com> Cc: David S. Miller <davem@davemloft.net> Cc: Joao Pinto <jpinto@synopsys.com> Cc: Vitor Soares <soares@synopsys.com> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Alexandre Torgue <alexandre.torgue@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 29 Jun 2018 14:54:31 +0000 (23:54 +0900)]
Merge tag 'mlx5e-updates-2018-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5e-updates-2018-06-28
mlx5e netdevice driver updates:
- Boris Pismenny added the support for UDP GSO in the first two patches.
Impressive performance numbers are included in the commit message,
@Line rate with ~half of the cpu utilization compared to non offload
or no GSO at all.
- From Tariq Toukan:
- Convert large order kzalloc allocations to kvzalloc.
- Added performance diagnostic statistics to several places in data path.
From Saeed and Eran,
- Update NIC HW stats on demand only, this is to eliminate the background
thread needed to update some HW statistics in the driver cache in
order to report error and drop counters from HW in ndo_get_stats.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
====================
net: Geneve options support for TC act_tunnel_key
Simon & Pieter say:
This set adds Geneve Options support to the TC tunnel key action.
It provides the plumbing required to configure Geneve variable length
options. The options can be configured in the form CLASS:TYPE:DATA,
where CLASS is represented as a 16bit hexadecimal value, TYPE as an 8bit
hexadecimal value and DATA as a variable length hexadecimal value.
Additionally multiple options may be listed using a comma delimiter.
v2:
- fix sparse warnings in patches 3 and 4 (first one reported by
build bot).
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Simon Horman [Wed, 27 Jun 2018 04:39:37 +0000 (21:39 -0700)]
net/sched: add tunnel option support to act_tunnel_key
Allow setting tunnel options using the act_tunnel_key action.
Options are expressed as class:type:data and multiple options
may be listed using a comma delimiter.
# ip link add name geneve0 type geneve dstport 0 external
# tc qdisc add dev eth0 ingress
# tc filter add dev eth0 protocol ip parent ffff: \
flower indev eth0 \
ip_proto udp \
action tunnel_key \
set src_ip 10.0.99.192 \
dst_ip 10.0.99.193 \
dst_port 6081 \
id 11 \
geneve_opts 0102:80:00800022,0102:80:00800022 \
action mirred egress redirect dev geneve0
Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Check the tunnel option type stored in tunnel flags when creating options
for tunnels. Thereby ensuring we do not set geneve, vxlan or erspan tunnel
options on interfaces that are not associated with them.
Make sure all users of the infrastructure set correct flags, for the BPF
helper we have to set all bits to keep backward compatibility.
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Simon Horman [Wed, 27 Jun 2018 04:39:35 +0000 (21:39 -0700)]
net/sched: act_tunnel_key: add extended ack support
Add extended ack support for the tunnel key action by using NL_SET_ERR_MSG
during validation of user input.
Cc: Alexander Aring <aring@mojatatu.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Metadata may be NULL for one of two reasons:
* Missing user input
* Failure to allocate the metadata dst
Disambiguate these case by returning -EINVAL for the former and -ENOMEM
for the latter rather than -EINVAL for both cases.
This is in preparation for using extended ack to provide more information
to users when parsing their input.
Signed-off-by: Simon Horman <simon.horman@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Arjun Vynipadath [Tue, 26 Jun 2018 11:40:25 +0000 (17:10 +0530)]
cxgb4: Add support for FW_ETH_TX_PKT_VM_WR
The present TX workrequest(FW_ETH_TX_PKT_WR) cant be used for
host->vf communication, since it doesn't loopback the outgoing
packets to virtual interfaces on the same port. This can be done
using FW_ETH_TX_PKT_VM_WR.
This fix depends on ethtool_flags to determine what WR to use for
TX path. Support for setting this flags by user is added in next
commit.
Based on the original work by : Casey Leedom <leedom@chelsio.com>
Signed-off-by: Casey Leedom <leedom@chelsio.com> Signed-off-by: Arjun Vynipadath <arjun@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Thu, 28 Jun 2018 07:31:00 +0000 (15:31 +0800)]
sctp: add support for SCTP_REUSE_PORT sockopt
This feature is actually already supported by sk->sk_reuse which can be
set by socket level opt SO_REUSEADDR. But it's not working exactly as
RFC6458 demands in section 8.1.27, like:
- This option only supports one-to-one style SCTP sockets
- This socket option must not be used after calling bind()
or sctp_bindx().
Besides, SCTP_REUSE_PORT sockopt should be provided for user's programs.
Otherwise, the programs with SCTP_REUSE_PORT from other systems will not
work in linux.
To separate it from the socket level version, this patch adds 'reuse' in
sctp_sock and it works pretty much as sk->sk_reuse, but with some extra
setup limitations that are needed when it is being enabled.
"It should be noted that the behavior of the socket-level socket option
to reuse ports and/or addresses for SCTP sockets is unspecified", so it
leaves SO_REUSEADDR as is for the compatibility.
Note that the name SCTP_REUSE_PORT is somewhat confusing, as its
functionality is nearly identical to SO_REUSEADDR, but with some
extra restrictions. Here it uses 'reuse' in sctp_sock instead of
'reuseport'. As for sk->sk_reuseport support for SCTP, it will be
added in another patch.
Thanks to Neil to make this clear.
v1->v2:
- add sctp_sk->reuse to separate it from the socket level version.
v2->v3:
- improve changelog according to Marcelo's suggestion.
Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David Wu [Thu, 28 Jun 2018 01:33:21 +0000 (09:33 +0800)]
net: ethernet: stmmac: dwmac-rk: Add GMAC support for px30
Add constants and callback functions for the dwmac on px30 Soc.
The base structure is the same, but registers and the bits in
them are moved slightly, and add the clk_mac_speed for selecting
mac speed.
Signed-off-by: David Wu <david.wu@rock-chips.com> Signed-off-by: David S. Miller <davem@davemloft.net>
In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Acked-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 29 Jun 2018 02:32:55 +0000 (11:32 +0900)]
Merge branch 'ila-Cleanup'
Tom Herbert says:
====================
ila: Cleanup
Perform some cleanup in ILA code. This includes:
- Fix rhashtable walk for cases where nl dumps are done with muliple
function calls. Add a skip index to skip over entries in
a node that have been previously visitied. Call rhashtable_walk_peek
to avoid dropping items between calls to ila_nl_dump.
- Call alloc_bucket_spinlocks to create bucket locks.
- Split out module initialization and netlink definitions into
separate files.
- Add ILA_CMD_FLUSH netlink command to clear the ILA translation table.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert [Wed, 27 Jun 2018 21:39:01 +0000 (14:39 -0700)]
ila: Create main ila source file
Create a main ila file that contains the module initialization functions
as well as netlink definitions. Previously these were defined in
ila_xlat and ila_common. This approach allows better extensibility.
Signed-off-by: Tom Herbert <tom@quantonium.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert [Wed, 27 Jun 2018 21:38:59 +0000 (14:38 -0700)]
ila: Fix use of rhashtable walk in ila_xlat.c
Perform better EAGAIN handling, handle case where ila_dump_info
fails and we missed objects in the dump, and add a skip index
to skip over ila entires in a list on a rhashtable node that have
already been visited (by a previous call to ila_nl_dump).
Signed-off-by: Tom Herbert <tom@quantonium.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Huazhong Tan [Thu, 28 Jun 2018 04:12:29 +0000 (12:12 +0800)]
net: hns3: use lower_32_bits and upper_32_bits
MACRO lower_32_bits and upper_32_bits can help to get bits 0-31
and bits 32-63 of a number, so just use it.
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Huazhong Tan [Thu, 28 Jun 2018 04:12:28 +0000 (12:12 +0800)]
net: hns3: remove back in struct hclge_hw
hclge_hw is embedded in hclge_dev, so use container_of instead of
back to get hclge_dev.
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Peng Li [Thu, 28 Jun 2018 04:12:27 +0000 (12:12 +0800)]
net: hns3: remove the Redundant put_vector in hns3_client_uninit
The interface h->ae_algo->ops->put_vector is called in both
hns3_nic_dealloc_vector_data and hns3_nic_uninit_vector_data in
hns3_client_uninit, this will cause vector freed twice.
This patch remove the Redundant put_vector to make vector freed
only once.
Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Peng Li [Thu, 28 Jun 2018 04:12:23 +0000 (12:12 +0800)]
net: hns3: add unlikely for error check
The first bd of a packet is invalid and invalid ring head for tx
IRQ is not offen, they may occur when there is error,
Add unlikely for error check branch is better for performance.
Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Peng Li [Thu, 28 Jun 2018 04:12:20 +0000 (12:12 +0800)]
net: hns3: rename the interface for init_client_instance and uninit_client_instance
The interface init_client_instance and uninit_client_instance
do not register anything, only initialize the client instance.
This patch rename the related interface to make the function
name to indicate the purpose.
Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Peng Li [Thu, 28 Jun 2018 04:12:19 +0000 (12:12 +0800)]
net: hns3: remove hclge_get_vector_index from hclge_bind_ring_with_vector
In hclge_unmap_ring_frm_vector, there are 2 steps:
step 1: get vector index.
step 2 unbind ring with vector.
But it gets vector id again in step 2 interface. This patch
removes hclge_get_vector_index from hclge_bind_ring_with_vector,
and make the step the same with hns3 PF driver.
Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Saeed Mahameed [Thu, 24 May 2018 01:26:09 +0000 (18:26 -0700)]
net/mlx5e: Update NIC HW stats on demand only
Disable periodic stats update background thread and update stats in
background on demand when ndo_get_stats is called.
Having a background thread running in the driver all the time is bad for
power consumption and normally a user space daemon will query the stats
once every specific interval, so ideally the background thread and its
interval can be done in user space..
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Boris Pismenny [Thu, 31 May 2018 12:29:42 +0000 (15:29 +0300)]
net/mlx5e: Add UDP GSO support
This patch enables UDP GSO support. We enable this by using two WQEs
the first is a UDP LSO WQE for all segments with equal length, and the
second is for the last segment in case it has different length.
Due to HW limitation, before sending, we must adjust the packet length fields.
====================
net: preserve sock reference when scrubbing the skb.
The sock reference is lost when scrubbing the packet and that breaks
TSQ (TCP Small Queues) and XPS (Transmit Packet Steering) causing
performance impacts of about 50% in a single TCP stream when crossing
network namespaces.
XPS breaks because the queue mapping stored in the socket is not
available, so another random queue might be selected when the stack
needs to transmit something like a TCP ACK, or TCP Retransmissions.
That causes packet re-ordering and/or performance issues.
TSQ breaks because it orphans the packet while it is still in the
host, so packets are queued contributing to the buffer bloat problem.
Preserving the sock reference fixes both issues. The socket is
orphaned anyways in the receiving path before any relevant action,
but the transmit side needs some extra checking included in the
first patch.
The first patch will update netfilter to check if the socket
netns is local before use it.
The second patch removes the skb_orphan() from the skb_scrub_packet()
and improve the documentation.
ChangeLog:
- split into two (Eric)
- addressed Paolo's offline feedback to swap the checks in xt_socket.c
to preserve original behavior.
- improved ip-sysctl.txt (reported by Cong)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Flavio Leitner [Wed, 27 Jun 2018 13:34:26 +0000 (10:34 -0300)]
skbuff: preserve sock reference when scrubbing the skb.
The sock reference is lost when scrubbing the packet and that breaks
TSQ (TCP Small Queues) and XPS (Transmit Packet Steering) causing
performance impacts of about 50% in a single TCP stream when crossing
network namespaces.
XPS breaks because the queue mapping stored in the socket is not
available, so another random queue might be selected when the stack
needs to transmit something like a TCP ACK, or TCP Retransmissions.
That causes packet re-ordering and/or performance issues.
TSQ breaks because it orphans the packet while it is still in the
host, so packets are queued contributing to the buffer bloat problem.
Preserving the sock reference fixes both issues. The socket is
orphaned anyways in the receiving path before any relevant action
and on TX side the netfilter checks if the reference is local before
use it.
Signed-off-by: Flavio Leitner <fbl@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Flavio Leitner [Wed, 27 Jun 2018 13:34:25 +0000 (10:34 -0300)]
netfilter: check if the socket netns is correct.
Netfilter assumes that if the socket is present in the skb, then
it can be used because that reference is cleaned up while the skb
is crossing netns.
We want to change that to preserve the socket reference in a future
patch, so this is a preparation updating netfilter to check if the
socket netns matches before use it.
Signed-off-by: Flavio Leitner <fbl@redhat.com> Acked-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
====================
net sched actions: code style cleanup and fixes
The patchset fixes a few code stylistic issues and typos, as well as one
detected by sparse semantic checker tool.
No functional changes introduced.
Patch 1 & 2 fix coding style bits caught by the checkpatch.pl script
Patch 3 fixes an issue with a shadowed variable
Patch 4 adds sizeof() operator instead of magic number for buffer length
Patch 5 fixes typos in diagnostics messages
Patch 6 explicitly sets unsigned char for bitwise operation
v2:
- submit for net-next
- added Reviewed-by tags
- use u8* instead of char* as per Davide Caratti suggestion
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Roman Mashak [Wed, 27 Jun 2018 17:33:35 +0000 (13:33 -0400)]
net sched actions: avoid bitwise operation on signed value in pedit
Since char can be unsigned or signed, and bitwise operators may have
implementation-dependent results when performed on signed operands,
declare 'u8 *' operand instead.
Suggested-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: Roman Mashak <mrv@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Roman Mashak [Wed, 27 Jun 2018 17:33:34 +0000 (13:33 -0400)]
net sched actions: fix misleading text strings in pedit action
Change "tc filter pedit .." to "tc actions pedit .." in error
messages to clearly refer to pedit action.
Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Roman Mashak <mrv@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Roman Mashak [Wed, 27 Jun 2018 17:33:33 +0000 (13:33 -0400)]
net sched actions: use sizeof operator for buffer length
Replace constant integer with sizeof() to clearly indicate
the destination buffer length in skb_header_pointer() calls.
Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Roman Mashak <mrv@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Roman Mashak [Wed, 27 Jun 2018 17:33:32 +0000 (13:33 -0400)]
net sched actions: fix sparse warning
The variable _data in include/asm-generic/sections.h defines sections,
this causes sparse warning in pedit:
net/sched/act_pedit.c:293:35: warning: symbol '_data' shadows an earlier one
./include/asm-generic/sections.h:36:13: originally declared here
Therefore rename the variable.
Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Roman Mashak <mrv@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Roman Mashak [Wed, 27 Jun 2018 17:33:31 +0000 (13:33 -0400)]
net sched actions: fix coding style in pedit headers
Fix coding style issues in tc pedit headers detected by the
checkpatch script.
Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Roman Mashak <mrv@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Roman Mashak [Wed, 27 Jun 2018 17:33:30 +0000 (13:33 -0400)]
net sched actions: fix coding style in pedit action
Fix coding style issues in tc pedit action detected by the
checkpatch script.
Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Roman Mashak <mrv@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yousuk Seung [Wed, 27 Jun 2018 17:32:19 +0000 (10:32 -0700)]
netem: slotting with non-uniform distribution
Extend slotting with support for non-uniform distributions. This is
similar to netem's non-uniform distribution delay feature.
Commit f043efeae2f1 ("netem: support delivering packets in delayed
time slots") added the slotting feature to approximate the behaviors
of media with packet aggregation but only supported a uniform
distribution for delays between transmission attempts. Tests with TCP
BBR with emulated wifi links with non-uniform distributions produced
more useful results.
The syntax and use of the distribution table is the same as in the
non-uniform distribution delay feature. A file DISTRIBUTION must be
present in TC_LIB_DIR (e.g. /usr/lib/tc) containing numbers scaled by
NETEM_DIST_SCALE. A random value x is selected from the table and it
takes DELAY + ( x * JITTER ) as delay. Correlation between values is not
supported.
Examples:
Normal distribution delay with mean = 800us and stdev = 100us.
> tc qdisc add dev eth0 root netem slot dist normal 800us 100us
Optionally set the max slot size in bytes and/or packets.
> tc qdisc add dev eth0 root netem slot dist normal 800us 100us \
bytes 64k packets 42
Signed-off-by: Yousuk Seung <ysseung@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Brandon Maier [Tue, 26 Jun 2018 17:50:50 +0000 (12:50 -0500)]
net: phy: xgmiitorgmii: Check read_status results
We're ignoring the result of the attached phy device's read_status().
Return it so we can detect errors.
Signed-off-by: Brandon Maier <brandon.maier@rockwellcollins.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Brandon Maier [Tue, 26 Jun 2018 17:50:49 +0000 (12:50 -0500)]
net: phy: xgmiitorgmii: Use correct mdio bus
The xgmiitorgmii is using the mii_bus of the device it's attached to,
instead of the bus it was given during probe.
Signed-off-by: Brandon Maier <brandon.maier@rockwellcollins.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Brandon Maier [Tue, 26 Jun 2018 17:50:48 +0000 (12:50 -0500)]
net: phy: xgmiitorgmii: Check phy_driver ready before accessing
Since a phy_device is added to the global mdio_bus list during
phy_device_register(), but a phy_device's phy_driver doesn't get
attached until phy_probe(). It's possible of_phy_find_device() in
xgmiitorgmii will return a valid phy with a NULL phy_driver. Leading to
a NULL pointer access during the memcpy().
Fixes this Oops:
Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = c0004000
[00000000] *pgd=00000000
Internal error: Oops: 5 [#1] PREEMPT SMP ARM
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.14.40 #1
Hardware name: Xilinx Zynq Platform
task: ce4c8d00 task.stack: ce4ca000
PC is at memcpy+0x48/0x330
LR is at xgmiitorgmii_probe+0x90/0xe8
pc : [<c074bc68>] lr : [<c0529548>] psr: 20000013
sp : ce4cbb54 ip : 00000000 fp : ce4cbb8c
r10: 00000000 r9 : 00000000 r8 : c0c49178
r7 : 00000000 r6 : cdc14718 r5 : ce762800 r4 : cdc14710
r3 : 00000000 r2 : 00000054 r1 : 00000000 r0 : cdc14718
Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
Control: 18c5387d Table: 0000404a DAC: 00000051
Process swapper/0 (pid: 1, stack limit = 0xce4ca210)
...
[<c074bc68>] (memcpy) from [<c0529548>] (xgmiitorgmii_probe+0x90/0xe8)
[<c0529548>] (xgmiitorgmii_probe) from [<c0526a94>] (mdio_probe+0x28/0x34)
[<c0526a94>] (mdio_probe) from [<c04db98c>] (driver_probe_device+0x254/0x414)
[<c04db98c>] (driver_probe_device) from [<c04dbd58>] (__device_attach_driver+0xac/0x10c)
[<c04dbd58>] (__device_attach_driver) from [<c04d96f4>] (bus_for_each_drv+0x84/0xc8)
[<c04d96f4>] (bus_for_each_drv) from [<c04db5bc>] (__device_attach+0xd0/0x134)
[<c04db5bc>] (__device_attach) from [<c04dbdd4>] (device_initial_probe+0x1c/0x20)
[<c04dbdd4>] (device_initial_probe) from [<c04da8fc>] (bus_probe_device+0x98/0xa0)
[<c04da8fc>] (bus_probe_device) from [<c04d8660>] (device_add+0x43c/0x5d0)
[<c04d8660>] (device_add) from [<c0526cb8>] (mdio_device_register+0x34/0x80)
[<c0526cb8>] (mdio_device_register) from [<c0580b48>] (of_mdiobus_register+0x170/0x30c)
[<c0580b48>] (of_mdiobus_register) from [<c05349c4>] (macb_probe+0x710/0xc00)
[<c05349c4>] (macb_probe) from [<c04dd700>] (platform_drv_probe+0x44/0x80)
[<c04dd700>] (platform_drv_probe) from [<c04db98c>] (driver_probe_device+0x254/0x414)
[<c04db98c>] (driver_probe_device) from [<c04dbc58>] (__driver_attach+0x10c/0x118)
[<c04dbc58>] (__driver_attach) from [<c04d9600>] (bus_for_each_dev+0x8c/0xd0)
[<c04d9600>] (bus_for_each_dev) from [<c04db1fc>] (driver_attach+0x2c/0x30)
[<c04db1fc>] (driver_attach) from [<c04daa98>] (bus_add_driver+0x50/0x260)
[<c04daa98>] (bus_add_driver) from [<c04dc440>] (driver_register+0x88/0x108)
[<c04dc440>] (driver_register) from [<c04dd6b4>] (__platform_driver_register+0x50/0x58)
[<c04dd6b4>] (__platform_driver_register) from [<c0b31248>] (macb_driver_init+0x24/0x28)
[<c0b31248>] (macb_driver_init) from [<c010203c>] (do_one_initcall+0x60/0x1a4)
[<c010203c>] (do_one_initcall) from [<c0b00f78>] (kernel_init_freeable+0x15c/0x1f8)
[<c0b00f78>] (kernel_init_freeable) from [<c0763d10>] (kernel_init+0x18/0x124)
[<c0763d10>] (kernel_init) from [<c0112d74>] (ret_from_fork+0x14/0x20)
Code: ba000002f5d1f03cf5d1f05cf5d1f07c (e8b151f8)
---[ end trace 3e4ec21905820a1f ]---
Signed-off-by: Brandon Maier <brandon.maier@rockwellcollins.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Shannon Nelson [Tue, 26 Jun 2018 17:07:54 +0000 (10:07 -0700)]
netdevsim: add ipsec offload testing
Implement the IPsec/XFRM offload API for testing.
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Shannon Nelson [Tue, 26 Jun 2018 17:07:53 +0000 (10:07 -0700)]
selftests: rtnetlink: use dummydev as a test device
We really shouldn't mess with local system settings, so let's
use the already created dummy device instead for ipsec testing.
Oh, and let's put the temp file into a proper directory.
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Shannon Nelson [Tue, 26 Jun 2018 17:07:52 +0000 (10:07 -0700)]
selftests: rtnetlink: clear the return code at start of ipsec test
Following the custom from the other functions, clear the global
ret code before starting the test so as to not have previously
failed tests cause us to thing this test has failed.
Reported-by: Anders Roxell <anders.roxell@linaro.org> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Guillaume Nault [Tue, 26 Jun 2018 16:41:36 +0000 (18:41 +0200)]
l2tp: define helper for parsing struct sockaddr_pppol2tp*
'sockaddr_len' is checked against various values when entering
pppol2tp_connect(), to verify its validity. It is used again later, to
find out which sockaddr structure was passed from user space. This
patch combines these two operations into one new function in order to
simplify pppol2tp_connect().
A new structure, l2tp_connect_info, is used to pass sockaddr data back
to pppol2tp_connect(), to avoid passing too many parameters to
l2tp_sockaddr_get_info(). Also, the first parameter is void* in order
to avoid casting between all sockaddr_* structures manually.
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 26 Jun 2018 15:45:49 +0000 (08:45 -0700)]
tcp: remove one indentation level in tcp_create_openreq_child
Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Sergei Shtylyov [Tue, 26 Jun 2018 15:42:33 +0000 (18:42 +0300)]
sh_eth: fix *enum* {A|M}PR_BIT
The *enum* {A|M}PR_BIT were declared in the commit 86a74ff21a7a ("net:
sh_eth: add support for Renesas SuperH Ethernet") adding SH771x support,
however the SH771x manual doesn't have the APR/MPR registers described
and the code writing to them for SH7710 was later removed by the commit 380af9e390ec ("net: sh_eth: CPU dependency code collect to "struct
sh_eth_cpu_data""). All the newer SoC manuals have these registers
documented as having a 16-bit TIME parameter of the PAUSE frame, not
1-bit -- update the *enum* accordingly, fixing up the APR/MPR writes...
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: David S. Miller <davem@davemloft.net>
Antoine Tenart [Tue, 26 Jun 2018 12:28:49 +0000 (14:28 +0200)]
net: mscc: ocelot: add VLAN filtering
Add hardware VLAN filtering offloading on ocelot.
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Casey Leedom [Tue, 26 Jun 2018 09:18:48 +0000 (14:48 +0530)]
cxgb4: Add flag tc_flower_initialized
Add flag tc_flower_initialized to indicate the
completion if tc flower initialization.
Signed-off-by: Casey Leedom <leedom@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Roopa Prabhu [Tue, 26 Jun 2018 03:32:53 +0000 (20:32 -0700)]
neighbour: force neigh_invalidate when NUD_FAILED update is from admin
In systems where neigh gc thresh holds are set to high values,
admin deleted neigh entries (eg ip neigh flush or ip neigh del) can
linger around in NUD_FAILED state for a long time until periodic gc kicks
in. This patch forces neigh_invalidate when NUD_FAILED neigh_update is
from an admin.
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 27 Jun 2018 01:42:13 +0000 (10:42 +0900)]
Merge branch 'Multipath-tests-for-tunnel-devices'
Petr Machata says:
====================
Multipath tests for tunnel devices
This patchset adds a test for ECMP and weighted ECMP between two GRE
tunnels.
In patches #1 and #2, the function multipath_eval() is first moved from
router_multipath.sh to lib.sh for ease of reuse, and then fixed up.
In patch #3, the function tc_rule_stats_get() is parameterized to be
useful for egress rules as well.
In patch #4, a new function __simple_if_init() is extracted from
simple_if_init(). This covers the logic that needs to be done for the
usual interface: VRF migration, upping and installation of IP addresses.
Patch #5 then adds the test itself.
Additionally in patch #6, a requirement to add diagrams to selftests is
documented.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Petr Machata [Tue, 26 Jun 2018 00:08:17 +0000 (02:08 +0200)]
selftests: forwarding: README: Require diagrams
ASCII art diagrams are well suited for presenting the topology that a
test uses while being easy to embed directly in the test file iteslf.
They make the information very easy to grasp even for simple topologies,
and for more complex ones they are almost essential, as figuring out the
interconnects from the script itself proves to be difficult.
Therefore state the requirement for topology ASCII art in README.
Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Petr Machata [Tue, 26 Jun 2018 00:08:05 +0000 (02:08 +0200)]
selftests: forwarding: Test multipath tunneling
Add a GRE-tunneling test such that there are two tunnels involved, with
a multipath route listing both as next hops. Similarly to
router_multipath.sh, test that the distribution of traffic to the
tunnels honors the configured weights.
Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The function simple_if_init() does two things: it creates a VRF, then
moves an interface into this VRF and configures addresses. The latter
comes in handy when adding more interfaces into a VRF later on. The
situation is similar for simple_if_fini().
Therefore split the interface remastering and address de/initialization
logic to a new pair of helpers __simple_if_init() / __simple_if_fini(),
and defer to these helpers from simple_if_init() and simple_if_fini().
Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Petr Machata [Tue, 26 Jun 2018 00:07:45 +0000 (02:07 +0200)]
selftests: forwarding: tc_rule_stats_get: Parameterize direction
The GRE multipath tests need stats on an egress counter. Change
tc_rule_stats_get() to take direction as an optional argument, with
default of ingress.
Take the opportunity to change line continuation character from | to \.
Move the | to the next line, which indent.
Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
- Change the indentation of the function body from 7 spaces to one tab.
- Move initialization of weights_ratio up so that it can be referenced
from the error message about packet difference being zero.
- Move |'s consistently to continuation line, which reindent.
Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Kees Cook [Mon, 25 Jun 2018 23:55:05 +0000 (16:55 -0700)]
net/tls: Remove VLA usage on nonce
It looks like the prior VLA removal, commit b16520f7493d ("net/tls: Remove
VLA usage"), and a new VLA addition, commit c46234ebb4d1e ("tls: RX path
for ktls"), passed in the night. This removes the newly added VLA, which
happens to have its bounds based on the same max value.
Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>
The IP addresses of tunnel endpoint at H3 are set at the VLAN device
$h3.555. Therefore when test_gretap_untagged_egress() sets vlan 555 to
egress untagged at $swp3, $h3's rp_filter rejects these packets. The
test then spuriously fails.
Therefore turn off net.ipv4.conf.{all, $h3}.rp_filter.
Fixes: 9c7c8a82442c ("selftests: forwarding: mirror_gre_vlan_bridge_1q: Add more tests") Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Kees Cook [Mon, 25 Jun 2018 22:49:49 +0000 (15:49 -0700)]
mdio-mux-gpio: Remove VLA usage
In the quest to remove all stack VLA usage from the kernel[1], this
allocates the values buffer during the callback instead of putting it
on the stack.
====================
net: sched: support replay of filter offload when binding to block
This series from John adds the ability to replay filter offload requests
when new offload callback is being registered on a TC block. This is most
likely to take place for shared blocks today, when a block which already
has rules is bound to another interface. Prior to this patch set if any
of the rules were offloaded the block bind would fail.
A new tcf_proto_op is added to generate a filter-specific offload request.
The new 'offload' op is supporting extack from day 0, hence we need to
propagate extack to .ndo_setup_tc TC_BLOCK_BIND/TC_BLOCK_UNBIND and
through tcf_block_cb_register() to tcf_block_playback_offloads().
The immediate use of this patch set is to simplify life of drivers which
require duplicating rules when sharing blocks. Switch drivers (mlxsw)
can bind ports to rule lists dynamically, NIC drivers generally don't
have that ability and need the rules to be duplicated for each ingress
they match on. In code terms this means that switch drivers don't
register multiple callbacks for each port. NIC drivers do, and get a
separate request and hance rule per-port, as if the block was not shared.
The registration fails today, however, if some rules were already present.
As John notes in description of patch 7, drivers which register multiple
callbacks to shared blocks will likely need to flush the rules on block
unbind. This set makes the core not only replay the the offload add
requests but also offload remove requests when callback is unregistered.
v2:
- name parameters in patch 2;
- use unsigned int instead of u32 for in_hw_coun;
- improve extack message in patch 7.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
John Hurley [Mon, 25 Jun 2018 21:30:10 +0000 (14:30 -0700)]
net: sched: call reoffload op on block callback reg
Call the reoffload tcf_proto_op on all tcf_proto nodes in all chains of a
block when a callback tries to register to a block that already has
offloaded rules. If all existing rules cannot be offloaded then the
registration is rejected. This replaces the previous policy of rejecting
such callback registration outright.
On unregistration of a callback, the rules are flushed for that given cb.
The implementation of block sharing in the NFP driver, for example,
duplicates shared rules to all devs bound to a block. This meant that
rules could still exist in hw even after a device is unbound from a block
(assuming the block still remains active).
Signed-off-by: John Hurley <john.hurley@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Add the offload tcf_proto_op in cls_bpf to generate an offload message for
each bpf prog in the given tcf_proto. Call the specified callback with
this new offload message. The function only returns an error if the
callback rejects adding a 'hardware only' prog.
A prog contains a flag to indicate if it is in hardware or not. To
ensure the offload function properly maintains this flag, keep a reference
counter for the number of instances of the prog that are in hardware. Only
update the flag when this counter changes from or to 0.
Signed-off-by: John Hurley <john.hurley@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Add the offload tcf_proto_op in cls_u32 to generate an offload message for
each filter and the hashtable in the given tcf_proto. Call the specified
callback with this new offload message. The function only returns an error
if the callback rejects adding a 'hardware only' rule.
A filter contains a flag to indicate if it is in hardware or not. To
ensure the offload function properly maintains this flag, keep a reference
counter for the number of instances of the filter that are in hardware.
Only update the flag when this counter changes from or to 0.
Signed-off-by: John Hurley <john.hurley@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Add the reoffload tcf_proto_op in matchall to generate an offload message
for each filter in the given tcf_proto. Call the specified callback with
this new offload message. The function only returns an error if the
callback rejects adding a 'hardware only' rule.
Ensure matchall flags correctly report if the rule is in hw by keeping a
reference counter for the number of instances of the rule offloaded. Only
update the flag when this counter changes from or to 0.
Signed-off-by: John Hurley <john.hurley@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Add the reoffload tcf_proto_op in flower to generate an offload message
for each filter in the given tcf_proto. Call the specified callback with
this new offload message. The function only returns an error if the
callback rejects adding a 'hardware only' rule.
A filter contains a flag to indicate if it is in hardware or not. To
ensure the reoffload function properly maintains this flag, keep a
reference counter for the number of instances of the filter that are in
hardware. Only update the flag when this counter changes from or to 0. Add
a generic helper function to implement this behaviour.
Signed-off-by: John Hurley <john.hurley@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
John Hurley [Mon, 25 Jun 2018 21:30:05 +0000 (14:30 -0700)]
net: sched: add tcf_proto_op to offload a rule
Create a new tcf_proto_op called 'reoffload' that generates a new offload
message for each node in a tcf_proto. Pointers to the tcf_proto and
whether the offload request is to add or delete the node are included.
Also included is a callback function to send the offload message to and
the option of priv data to go with the cb.
Signed-off-by: John Hurley <john.hurley@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
John Hurley [Mon, 25 Jun 2018 21:30:04 +0000 (14:30 -0700)]
net: sched: pass extack pointer to block binds and cb registration
Pass the extact struct from a tc qdisc add to the block bind function and,
in turn, to the setup_tc ndo of binding device via the tc_block_offload
struct. Pass this back to any block callback registrations to allow
netlink logging of fails in the bind process.
Signed-off-by: John Hurley <john.hurley@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 26 Jun 2018 14:15:43 +0000 (23:15 +0900)]
Merge branch 'sh_eth-RPADIR-related-clean-ups'
Sergei Shtylyov says:
====================
sh_eth: RPADIR related clean-ups
Here's a set of 2 patches against DaveM's 'net-next.git' repo. They are
clean-ups related to RPADIR (DMA padding to NET_IP_ALIGN)...
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Sergei Shtylyov [Mon, 25 Jun 2018 20:37:06 +0000 (23:37 +0300)]
sh_eth: remove sh_eth_cpu_data::rpadir_value
If RPADIR exists, the value written to it is always the same for all SoCs
(and derived from NET_IP_ALIGN), so there has not been any need to store
it in the *struct* sh_eth_cpu_data...
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: David S. Miller <davem@davemloft.net>
Sergei Shtylyov [Mon, 25 Jun 2018 20:36:21 +0000 (23:36 +0300)]
sh_eth: fix *enum* RPADIR_BIT
The *enum* RPADIR_BIT was declared in the commit 86a74ff21a7a ("net:
sh_eth: add support for Renesas SuperH Ethernet") adding SH771x support,
however the SH771x manual doesn't have the RPADIR register described and,
moreover, tells why the padding insertion must not be used. The newer SoC
manuals do have RPADIR documented, though with somewhat different layout --
update the *enum* according to these manuals...
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: David S. Miller <davem@davemloft.net>
Heiner Kallweit [Mon, 25 Jun 2018 18:34:41 +0000 (20:34 +0200)]
r8169: reject unsupported WoL options
So far unsupported WoL options are silently ignored. Change this and
reject attempts to set unsupported options. This prevents situations
where a user tries to set an unsupported WoL option and is under the
impression it was successful because ethtool doesn't complain.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Petr Machata [Mon, 25 Jun 2018 14:43:55 +0000 (16:43 +0200)]
selftests: net: Test headroom handling of ip6_gre devices
Commit 5691484df961 ("net: ip6_gre: Fix headroom request in
ip6erspan_tunnel_xmit()") and commit 01b8d064d58b ("net: ip6_gre:
Request headroom in __gre6_xmit()") fix problems in reserving headroom
in the packets tunneled through ip6gre/tap and ip6erspan netdevices.
These two patches included snippets that reproduced the issues. This
patch elevates the snippets to a full-fledged test case.
Suggested-by: David Miller <davem@davemloft.net> Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>