Daniel Borkmann [Sat, 10 Jun 2017 22:50:40 +0000 (00:50 +0200)]
bpf: avoid excessive stack usage for perf_sample_data
perf_sample_data consumes 386 bytes on stack, reduce excessive stack
usage and move it to per cpu buffer. It's allowed due to preemption
being disabled for tracing, xdp and tc programs, thus at all times
only one program can run on a specific CPU and programs cannot run
from interrupt. We similarly also handle bpf_pt_regs.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Fabio Estevam [Sat, 10 Jun 2017 01:37:22 +0000 (22:37 -0300)]
net: fec: Add a fec_enet_clear_ethtool_stats() stub for CONFIG_M5272
Commit 2b30842b23b9 ("net: fec: Clear and enable MIB counters on imx51")
introduced fec_enet_clear_ethtool_stats(), but missed to add a stub
for the CONFIG_M5272=y case, causing build failure for the
m5272c3_defconfig.
Add the missing empty stub to fix the build failure.
Fixes: Commit 2b30842b23b9 ("net: fec: Clear and enable MIB counters on imx51") Reported-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Chenbo Feng [Sat, 10 Jun 2017 19:35:38 +0000 (12:35 -0700)]
Remove the redundant skb->dev initialization in ip6_fragment
After moves the skb->dev and skb->protocol initialization into
ip6_output, setting the skb->dev inside ip6_fragment is unnecessary.
Fixes: 97a7a37a7b7b("ipv6: Initial skb->dev and skb->protocol in ip6_output") Signed-off-by: Chenbo Feng <fengc@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Sat, 10 Jun 2017 07:27:12 +0000 (15:27 +0800)]
sctp: no need to check assoc id before calling sctp_assoc_set_id
sctp_assoc_set_id does the assoc id check in the beginning when
processing dupcookie, no need to do the same check before calling
it.
v1->v2:
fix some typo errs Marcelo pointed in changelog.
Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Sat, 10 Jun 2017 07:13:32 +0000 (15:13 +0800)]
sctp: use read_lock_bh in sctp_eps_seq_show
This patch is to use read_lock_bh instead of local_bh_disable
and read_lock in sctp_eps_seq_show.
Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
This warning is caused by the lock held by sctp_getsockopt() is on one
socket, while the other lock that sctp_close() is getting later is on
the newly created (which failed) socket during peeloff operation.
This patch is to avoid this warning by use lock_sock with subclass
SINGLE_DEPTH_NESTING as Wang Cong and Marcelo's suggestion.
Reported-by: Dmitry Vyukov <dvyukov@google.com> Suggested-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Suggested-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Chenbo Feng [Fri, 9 Jun 2017 19:17:37 +0000 (12:17 -0700)]
bpf: Remove duplicate tcp_filter hook in ipv6
There are two tcp_filter hooks in tcp_ipv6 ingress path currently.
One is at tcp_v6_rcv and another is in tcp_v6_do_rcv. It seems the
tcp_filter() call inside tcp_v6_do_rcv is redundent and some packet
will be filtered twice in this situation. This will cause trouble
when using eBPF filters to account traffic data.
Signed-off-by: Chenbo Feng <fengc@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Nicolas Dichtel [Fri, 9 Jun 2017 15:58:08 +0000 (17:58 +0200)]
bonding: warn user when 802.3ad speed is unknown
Goal is to advertise the user when ethtool speeds and 802.3ad speeds are
desynchronized.
When this case happens, the kernel needs to be patched.
Suggested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Nicolas Dichtel [Fri, 9 Jun 2017 12:41:57 +0000 (14:41 +0200)]
netns: fix error code when the nsid is already used
When the user tries to assign a specific nsid, idr_alloc() is called with
the range [nsid, nsid+1]. If this nsid is already used, idr_alloc() returns
ENOSPC (No space left on device). In our case, it's better to return
EEXIST to make it clear that the nsid is not available.
CC: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Nicolas Dichtel [Fri, 9 Jun 2017 12:41:56 +0000 (14:41 +0200)]
netns: define extack error msg for nsis cmds
It helps the user to identify errors.
CC: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ganesh Goudar [Fri, 9 Jun 2017 13:56:24 +0000 (19:26 +0530)]
cxgb4: fix memory leak in init_one()
Free up mbox_log allocated for PF0 to PF3.
Fixes: 7829451c695e ("cxgb4: Add control net_device for configuring PCIe VF") Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Fri, 9 Jun 2017 10:37:35 +0000 (12:37 +0200)]
qed: add qed_int_sb_init() stub function
When CONFIG_QED_SRIOV is disabled, we get a build error:
drivers/net/ethernet/qlogic/qed/qed_int.c: In function 'qed_int_sb_init':
drivers/net/ethernet/qlogic/qed/qed_int.c:1499:4: error: implicit declaration of function 'qed_vf_set_sb_info'; did you mean 'qed_mcp_get_resc_info'? [-Werror=implicit-function-declaration]
All the other declarations have a 'static inline' stub as an alternative
here, so this adds one more for qed_int_sb_init.
Fixes: 50a207147fce ("qed: Hold a single array for SBs") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 9 Jun 2017 19:49:03 +0000 (15:49 -0400)]
Merge branch 'qed-Light-L2-updates'
Yuval Mintz says:
====================
qed*: Light L2 updates
This series does a major overhaul of the LL2 logic in qed.
The single biggest change done here is in #5 where we're changing
the API qed provides for LL2 [both internally in case of storage and
externally in case of RoCE] to become callback-based to allow cleaner
scalability in preperation to the future iWARP submission which would
aadd additional flavors of LL2. It's also the only patch in series
to modify !qed logic [qedr].
Patches prior to that mostly deal with refactoring LL2 code,
encapsulating varaious parameters into structure and re-ordering
of LL2 code. The latter patches add some small missing bits of LL2
ffunctionality.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Michal Kalderon [Fri, 9 Jun 2017 14:13:22 +0000 (17:13 +0300)]
qed*: LL2 callback operations
LL2 today is interrupt driven - when tx/rx completion arrives [or any
other indication], qed needs to operate on the connection and pass
the information to the protocol-driver [or internal qed consumer].
Since we have several flavors of ll2 employeed by the driver,
each handler needs to do an if-else to determine the right functionality
to use based on the connection type.
In order to make things more scalable [given that we're going to add
additional types of ll2 flavors] move the infrastrucutre into using
a callback-based approach - the callbacks would be provided as part
of the connection's initialization parameters.
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Mintz, Yuval [Fri, 9 Jun 2017 14:13:20 +0000 (17:13 +0300)]
qed: Cleaner seperation of LL2 inputs
A LL2 connection [qed_ll2_info] has a sub-structure of type qed_ll2_conn
that contain various inputs for ll2 acquisition, but the connection also
utilizes a couple of other inputs.
Restructure the input structure to include all the inputs and refactor
the code necessary to populate those.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Mintz, Yuval [Fri, 9 Jun 2017 14:13:18 +0000 (17:13 +0300)]
qed: LL2 to use packed information for tx
First step in revising the LL2 interface, this declares
qed_ll2_tx_pkt_info as part of the ll2 interface, and uses it for
transmission instead of receiving lots of parameters.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
llvm 5.0 does not like the section name and the function name
to be the same:
clang -I. -I./include/uapi -I../../../include/uapi \
-I../../../../samples/bpf/ \
-Wno-compare-distinct-pointer-types \
-O2 -target bpf -c \
linux/tools/testing/selftests/bpf/test_obj_id.c -o \
linux/tools/testing/selftests/bpf/test_obj_id.o
fatal error: error in backend: 'test_prog_id' label emitted multiple times to
assembly file
clang-5.0: error: clang frontend command failed with exit code 70 (use -v to
see invocation)
clang version 5.0.0 (trunk 304326) (llvm/trunk 304329)
This patch makes changes to the section name and the function name.
Fixes: 95b9afd3987f ("bpf: Test for bpf ID") Reported-by: Alexei Starovoitov <ast@fb.com> Reported-by: Yonghong Song <yhs@fb.com> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
bpf: Fix test_bpf_obj_id() when the bpf_jit_enable sysctl is diabled
test_bpf_obj_id() should not expect a non zero jited_prog_len
to be returned by bpf_obj_get_info_by_fd() when
net.core.bpf_jit_enable is 0.
The patch checks for net.core.bpf_jit_enable and
has different expectation on jited_prog_len.
This patch also removes the pwd.h header which I forgot
to remove after making changes.
Fixes: 95b9afd3987f ("bpf: Test for bpf ID") Reported-by: Yonghong Song <yhs@fb.com> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Chenbo Feng [Fri, 9 Jun 2017 19:06:07 +0000 (12:06 -0700)]
ipv6: Initial skb->dev and skb->protocol in ip6_output
Move the initialization of skb->dev and skb->protocol from
ip6_finish_output2 to ip6_output. This can make the skb->dev and
skb->protocol information avalaible to the CGROUP eBPF filter.
Signed-off-by: Chenbo Feng <fengc@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Handle TIMER0INT when FW crashes. Check for PCIE_FW[FW_EVAL]
and if it says "Device FW Crashed", then treat it as fatal.
Else, non-fatal.
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 9 Jun 2017 16:52:09 +0000 (12:52 -0400)]
Merge branch 'nfp-FW-app-build-name-reporting'
Jakub Kicinski says:
====================
nfp: FW app build name reporting
This series adds reporting FW build name in ethtool -i. Most
of the patches are restructuring where information caching is
done. There is also a minor error path fix.
These are last few patches finishing the basic nfp_app support.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 9 Jun 2017 03:56:11 +0000 (20:56 -0700)]
nfp: remove automatic caching of RTsym table
The fact that RTsym table is cached inside nfp_cpp handle is
a relic of old times when nfpcore was a library module. All
the nfp_cpp "caches" are awkward to deal with because of
concurrency and prone to keeping stale information. Make
the run time symbol table be an object read out from the device
and managed by whoever requested it. Since the driver loads
FW at ->probe() and never reloads, we can hold onto the table
for ever.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 9 Jun 2017 03:56:10 +0000 (20:56 -0700)]
nfp: make sure to cancel port refresh on the error path
If very last stages of netdev registering and init fail some
other netdevs and devlink ports may have been visible to user
space before we torn them back down. In this case there is a
slight chance user may have triggered port refresh. We need
to make sure the async work is cancelled.
We have to cancel after releasing pf->lock, so we will always
try to cancel, regardless of which part of probe has failed.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Derek Chickles [Fri, 9 Jun 2017 02:20:36 +0000 (19:20 -0700)]
liquidio: disallow enabling firmware debug from a VF
Disallow enabling firmware debug from a VF. Only PF is allowed to do that.
Signed-off-by: Derek Chickles <derek.chickles@cavium.com> Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ipvlan should return an error when an address is already in use.
The ipvlan code already knows how to detect when a duplicate address is
about to be assigned to an ipvlan device. However, that failure is not
propogated outward and leads to a silent failure.
Introduce a validation step at ip address creation time and allow device
drivers to register to validate the incoming ip addresses. The ipvlan
code is the first consumer. If it detects an address in use, we can
return an error to the user before beginning to commit the new ifa in
the networking code.
This can be especially useful if it is necessary to provision many
ipvlans in containers. The provisioning software (or operator) can use
this to detect situations where an ip address is unexpectedly in use.
Signed-off-by: Krister Johansen <kjlx@templeofstupid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Speed up transmit check for fragmented packets by using existing
macros to compute number of pages, and eliminate loop since
skb fragments each take a page. Number of slots is also unsigned.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
This patchset brings no functional changes. It is a first step in a
bigger cosmetics change to the driver. It simplifies print messages and
polishes data types and chip operations.
The next patchs will only prefix and document the port registers macros.
Changes in v2:
- KISS and simply use dev_* since chip->ds may not be initialized
- add reviewers tags
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Thu, 8 Jun 2017 22:34:14 +0000 (18:34 -0400)]
net: dsa: mv88e6xxx: prefix PHY macros
Prefix the PHY_* macros with a Marvell specific MV88E6XXX_ prefix.
There is no functional changes.
Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Thu, 8 Jun 2017 22:34:13 +0000 (18:34 -0400)]
net: dsa: mv88e6xxx: rework jumbo size operation
Marvell chips have a Jumbo Mode to set the maximum frame size (MTU).
The mv88e6xxx_ops structure is meant to contain generic functionalities,
no driver logic. Change port_jumbo_config to port_set_jumbo_size setting
the mode from a given maximum size value.
There is no functional changes since we still use 10240 bytes.
At the same time, correctly clear all Jumbo Mode bits before writing.
Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Thu, 8 Jun 2017 22:34:12 +0000 (18:34 -0400)]
net: dsa: mv88e6xxx: rework pause limit operation
All Marvell chips supporting Pause frames limiting use 1-byte value for
input and output.
Old chips have both bytes adjacent in a 16-bit register. New ones have
an indirect table using 8-bit data.
The mv88e6xxx library functions (such as in port.c) must not contain
driver logic, but only generic helpers. This patch changes the
port_pause_config operation for port_pause_limit taking two u8 arguments
for input and output limits. There is no functional changes.
Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Thu, 8 Jun 2017 22:34:10 +0000 (18:34 -0400)]
net: dsa: mv88e6xxx: use bridge state values
Reuse the BR_STATE_* values to abstract a port STP state value.
This provides shorter names and better control over the DSA switch
operation call.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Thu, 8 Jun 2017 22:34:09 +0000 (18:34 -0400)]
net: dsa: mv88e6xxx: add egress mode enumeration
As for the frame mode, add a mv88e6xxx_egress_mode enumeration instead
of a 16-bit register mask.
Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Thu, 8 Jun 2017 22:34:08 +0000 (18:34 -0400)]
net: dsa: mv888e6xxx: do not use netdev printing
The mv888e6xxx driver accesses a port's netdev mostly for printing.
This is bad for 2 reasons: DSA and CPU ports do not have a netdev
pointer; it doesn't give us a correct picture of why a DSA driver might
need to access a port's netdev.
Instead simply use dev_* printing functions with chip->dev (or ds->dev
depending on the scope, both guaranteed to exist), with a p%d prefix for
the target port.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
When inheriting tx_flags from one skbuff to another, always apply a
mask to avoid overwriting unrelated other bits in the field.
The two SKBTX_SHARED_FRAG cases clears all other bits. In practice,
tx_flags are zero at this point now. But this is fragile. Timestamp
flags are set, for instance, if in tcp_gso_segment, after this clear
in skb_segment.
The SKBTX_ANY_TSTAMP mask in __skb_tstamp_tx ensures that new
skbs do not accidentally inherit flags such as SKBTX_SHARED_FRAG.
Signed-off-by: Willem de Bruijn <willemb@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
====================
drivers: net: add const to mii_phy_ops structures
The object references of mii_phy_ops structures are only stored
in the ops field of a mii_phy_def structure. This ops field is of type
const. So, mii_phy_ops structures having similar properties can be
declared as const.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Bhumika Goyal [Thu, 8 Jun 2017 06:00:58 +0000 (11:30 +0530)]
drivers: net: emac: add const to mii_phy_ops structures
The object references of mii_phy_ops structures are only stored
in the ops field of a mii_phy_def structure. This ops field is of type
const. So, mii_phy_ops structures having similar properties can be
declared as const.
Signed-off-by: Bhumika Goyal <bhumirks@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Bhumika Goyal [Thu, 8 Jun 2017 06:00:57 +0000 (11:30 +0530)]
drivers/net/sungem: add const to mii_phy_ops structures
The object references of mii_phy_ops structures are only stored
in the ops field of a mii_phy_def structure. This ops field is of type
const. So, mii_phy_ops structures having similar properties can be
declared as const.
Signed-off-by: Bhumika Goyal <bhumirks@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 8 Jun 2017 18:41:19 +0000 (14:41 -0400)]
Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:
====================
1GbE Intel Wired LAN Driver Updates 2017-06-07
This series contains a fix for e1000e and igb.
Colin Ian King fixes sparse warnings in igb by making functions static.
Chris Wilson provides a fix for a previous commit which is causing an
issue during suspend "e1000e_pm_suspend()", where we need to run
e1000e_pm_thaw() if __e1000_shutdown() is unsuccessful.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Use PORT_REG for T4 and T5_PORT_REG for > T4 to write to correct
register to bring down link during shutdown after adapter crash.
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Currently there's no way to dump the VIF table for an ipmr table other
than the default (via proc). This is a major issue when debugging ipmr
issues and in general it is good to know which interfaces are
configured. This patch adds support for RTM_GETLINK for the ipmr family
so we can dump the VIF table and the ipmr table's current config for
each table. We're protected by rtnl so no need to acquire RCU or
mrt_lock.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
====================
mlxsw: Remove compatibility with old firmware
Up until recently we couldn't enforce a minimal firmware version, which
forced us to be compatible with old firmware versions. This patchset
removes this code and simplifies the driver.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Thu, 8 Jun 2017 06:47:45 +0000 (08:47 +0200)]
mlxsw: spectrum: Pass port argument to module mapping functions
Previous patch made it unnecessary to map ports to modules before we
allocate their struct. We can now therefore pass the port struct to
these functions, thereby making them consistent with other functions
that operate on ports.
Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Thu, 8 Jun 2017 06:47:44 +0000 (08:47 +0200)]
mlxsw: spectrum: Simplify port split flow
In commit be94535f9531 ("mlxsw: spectrum: Make split flow match firmware
requirements") we had to modify the port split flow to overcome quirks
in the device's firmware. This resulted in asymmetrical code with
regards to port creation and removal.
The problem in the firmware is long gone and since we can now enforce a
minimal firmware version, we can simplify the code and make it symmetric
again.
Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Thu, 8 Jun 2017 06:47:43 +0000 (08:47 +0200)]
mlxsw: spectrum_router: Mark only first LPM tree as reserved
In new firmware versions (that we can now enforce via
request_firmware()), only the first LPM tree is reserved and not the
first two as in older versions.
Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
===================
net: Remove support from bridge bypass for mlxsw/rocker drivers
Currently setting bridge port attributes and adding FDBs are done via
setting the SELF flag which implies unconsistent offloading model. This
patch-set fixes this behavior by making the bridge and drivers which are
using it to be totally in sync.
This implies several changes:
- Offloading bridge flags from the bridge code.
- Sending notification about FDB add/del to the software bridge in a
similiar way it is done for the hardware externally learned FDBs.
By making the offloading model more consistent a cleanup is done in
the drivers supporting it. This is done in order to remove un-needed
logic related to dump operation which is redundant.
First add missing functionality to bridge, then clean up the mlxsw/rocker
drivers.
v1->v2
- Move bridge-switchdev related stuff to br_switchdev.c as suggested by Nik
===================
Signed-off-by: David S. Miller <davem@davemloft.net>
The FDB add/delete are now done through the notification chain. The FDBs
are synced with the bridge and there is no need for extra dumping.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
rocker: Remove support for bypass bridge port attributes/vlan set
The bridge port attributes/vlan for mlxsw devices should be set only
from bridge code. The vlans are synced totally with the bridge so
there is no need to special dump support.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
rocker: Add support for learning FDB through notification
Add support for learning FDB through notification. The driver defers
the hardware update via ordered work queue.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
rocker: Change world_ops API and implementation to be switchdev independant
Currently the switchdev_trans struct is embedded in the world_ops API.
In order to add support for adding FDB via a notfication chain the API should
be switchdev independent.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
rocker: Add support for querying supported bridge flags
Add support for querying supported bridge flags.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
rocker: Remove support for bridge FDB learning sync
Currently the rocker driver supports an option for disabling syncing
the hardware learned FDBs with the software bridge. This behavior
breaks the bridge offload model and thus it is removed.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: Remove support for bridge bypass ndos from stacked devices
Remove support for bridge bypass ndos from stacked devices. At this point
no driver which supports stack device behavior offload supports operation
with SELF flag. The case for upper device is already taken care of in both
of the following cases:
1. FDB add/del - driver should check at the notification cb if the
stacked device contains his ports.
2. Port attribute - calls switchdev code directly which checks
for case of stack device.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
mlxsw: spectrum_switchdev: Add support for learning FDB through notification
Add support for learning FDB through notification. The driver defers
the hardware update via ordered work queue. Support for stacked devices
is also provided. In case of a successful FDB add a notification is
sent back to bridge.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
mlxsw: spectrum_switchdev: Change switchdev notifier API
The current API for sending switchdev notifications implies only FDB
add/del. In order to support notification about successful FDB offload
the API is changed.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
mlxsw: spectrum: Remove support for bypass bridge port attributes/vlan set
The bridge port attributes/vlan for mlxsw devices should be set only
from bridge code. The vlans are synced totally with the bridge so
there is no need to special dump support.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
mlxsw: spectrum: Remove support for bridge FDB learning sync
Currently the mlxsw driver supports an option for disabling syncing
the hardware learned FDBs with the software bridge. This behavior
breaks the bridge offload model and thus it is removed.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: bridge: Receive notification about successful FDB offload
When a new static FDB is added to the bridge a notification is sent to
the driver for offload. In case of successful offload the driver should
notify the bridge back, which in turn should mark the FDB as offloaded.
Currently, externally learned is equivalent for being offloaded which is
not correct due to the fact that FDBs which are added from user-space are
also marked as externally learned. In order to specify if an FDB was
successfully offloaded a new flag is introduced.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: bridge: Add support for notifying devices about FDB add/del
Currently the bridge doesn't notify the underlying devices about new
FDBs learned. The FDB sync is placed on the switchdev notifier chain
because devices may potentially learn FDB that are not directly related
to their ports, for example:
1. Mixed SW/HW bridge - FDBs that point to the ASICs external devices
should be offloaded as CPU traps in order to
perform forwarding in slow path.
2. EVPN - Externally learned FDBs for the vtep device.
Notification is sent only about static FDB add/del. This is done due
to fact that currently this is the only scenario supported by switch
drivers.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Reviewed-by: Ivan Vecera <ivecera@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: switchdev: Change notifier chain to be atomic
In order to use the switchdev notifier chain for FDB sync with the
device it has to be changed to atomic. The is done because the bridge
can learn new FDBs in atomic context.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Ivan Vecera <ivecera@redhat.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ivan Vecera <ivecera@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: bridge: Add support for offloading port attributes
Currently the flood, learning and learning_sync port attributes are
offloaded by setting the SELF flag. Add support for offloading the
flood and learning attribute through the bridge code. In case of
setting an unsupported flag on a offloded port the operation will
fail.
The learning_sync attribute doesn't have any software representation
and cannot be offloaded through the bridge code.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Reviewed-by: Ivan Vecera <ivecera@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: switchdev: Add support for querying supported bridge flags by hardware
This is done as a preparation stage before setting the bridge port flags
from the bridge code. Currently the device can be queried for the bridge
flags state, but the querier cannot distinguish if the flag is disabled
or if it is not supported at all. Thus, add new attr and a bit-mask which
include information regarding the support on a per-flag basis.
Drivers that support bridge offload but not support bridge flags should
return zeroed bitmask.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Ivan Vecera <ivecera@redhat.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ivan Vecera <ivecera@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 8 Jun 2017 15:43:33 +0000 (11:43 -0400)]
Merge branch 'dsa-add-cross-chip-VLAN-support'
Vivien Didelot says:
====================
net: dsa: add cross-chip VLAN support
The current code in DSA does not support cross-chip VLAN. This means
that in a multi-chip environment such as this one (similar to ZII Rev B)
[CPU].................... (mdio)
(eth0) | : : :
_|_____ _______ _______
[__sw0__]--[__sw1__]--[__sw2__]
| | | | | | | | |
v v v v v v v v v
p1 p2 p3 p4 p5 p6 p7 p8 p9
adding a VLAN to p9 won't be enough to reach the CPU, until at least one
port of sw0 and sw1 join the VLAN as well and become aware of the VID.
This patchset makes the DSA core program the VLAN on the CPU and DSA
links itself, which brings seamlessly cross-chip VLAN support to DSA.
With this series applied*, the hardware VLAN tables of a 3-switch setup
look like this after adding a VLAN to only one port of the end switch:
# cat /sys/class/net/br0/bridge/default_pvid
42
# cat /sys/kernel/debug/mv88e6xxx/sw{0,1,2}/vtu
# ip link set up master br0 dev lan6
# cat /sys/kernel/debug/mv88e6xxx/sw{0,1,2}/vtu
VID FID SID 0 1 2 3 4 5 6
42 1 0 x x x x x = =
VID FID SID 0 1 2 3 4 5 6
42 1 0 x x x x x = =
VID FID SID 0 1 2 3 4 5 6 7 8 9
42 1 0 u x x x x x x x x =
('x' is excluded, 'u' is untagged, '=' is unmodified DSA and CPU ports.)
Completely removing a VLAN entry (which is currently the responsibility
of drivers anyway) is not supported yet since it requires some caching.
Changes in v2:
- canonical incrementation (port++ instead of ++port)
- check CPU and DSA ports before purging a VLAN
- add Reviewed-by tags
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Wed, 7 Jun 2017 22:12:17 +0000 (18:12 -0400)]
net: dsa: mv88e6xxx: do not skip ports on VLAN del
The mv88e6xxx driver currently tries to be smart and remove by itself a
VLAN entry from the VTU when the driven switch sees no user ports as
members of the VLAN.
This is bad in a multi-chip switch fabric, since a chip in between
others may have no bridge port members, but still needs to be aware of
the VID in order to correctly pass frames in the data path.
Now that the DSA core explicitly manages DSA and CPU ports, do not skip
them when checking remaining VLAN members.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Wed, 7 Jun 2017 22:12:16 +0000 (18:12 -0400)]
net: dsa: mv88e6xxx: exclude all ports in new VLAN
Now that the DSA core adds the CPU and DSA ports itself to the new VLAN
entry, there is no need to include them as members of this VLAN when
initializing a new VTU entry.
As of now, initialize a new VTU entry with all ports excluded.
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Wed, 7 Jun 2017 22:12:15 +0000 (18:12 -0400)]
net: dsa: add CPU and DSA ports as VLAN members
In a multi-chip switch fabric, it is currently the responsibility of the
driver to add the CPU or DSA (interconnecting chips together) ports as
members of a new VLAN entry. This makes the drivers more complicated.
We want the DSA drivers to be stupid and the DSA core being the one
responsible for caring about the abstracted switch logic and topology.
Make the DSA core program the CPU and DSA ports as part of the VLAN.
This makes all chips of the data path to be aware of VIDs spanning the
the whole fabric and thus, seamlessly add support for cross-chip VLAN.
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Wed, 7 Jun 2017 22:12:14 +0000 (18:12 -0400)]
net: dsa: check VLAN capability of every switch
Now that the VLAN object is propagated to every switch chip of the
switch fabric, we can easily ensure that they all support the required
VLAN operations before modifying an entry on a single switch.
To achieve that, remove the condition skipping other target switches,
and add a bitmap of VLAN members, eventually containing the target port,
if we are programming the switch target.
This will allow us to easily add other VLAN members, such as the DSA or
CPU ports (to introduce cross-chip VLAN support) or the other port
members if we want to reduce hardware accesses later.
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Wed, 7 Jun 2017 22:12:13 +0000 (18:12 -0400)]
net: dsa: mv88e6xxx: define membership on VLAN add
Define the target port membership of the VLAN entry in
mv88e6xxx_port_vlan_add where ds is scoped.
Allow the DSA core to call later the port_vlan_add operation for CPU or
DSA ports, by using the Unmodified membership for these ports, as in the
current behavior.
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 8 Jun 2017 15:41:41 +0000 (11:41 -0400)]
Merge tag 'rxrpc-rewrite-20170607-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
David Howells says:
====================
rxrpc: Tx length parameter
Here's a set of patches that allows someone initiating a client call with
AF_RXRPC to indicate upfront the total amount of data that will be
transmitted. This will allow AF_RXRPC to encrypt directly from source
buffer to packet rather than having to copy into the buffer and only
encrypt when it's full (the encrypted portion of the packet starts with a
length and so we can't encrypt until we know what the length will be).
The three patches are:
(1) Provide a means of finding out what control message types are actually
supported. EINVAL is reported if an unsupported cmsg type is seen, so
we don't want to set the new cmsg unless we know it will be accepted.
(2) Consolidate some stuff into a struct to reduce the parameter count on
the function that parses the cmsg buffer.
(3) Introduce the RXRPC_TX_LENGTH cmsg. This can be provided on the first
sendmsg() that contributes data to a client call request or a service
call reply. If provided, the user must provide exactly that amount of
data or an error will be incurred.
Changes in version 2:
(*) struct rxrpc_send_params::tx_total_len should be s64 not u64. Thanks to
Julia Lawall for reporting this.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>