Mickaël Salaün [Thu, 9 Feb 2017 23:21:36 +0000 (00:21 +0100)]
bpf: Change the include directory for selftest
Use the tools include directory instead of the installed one to allow
builds from other kernels.
Signed-off-by: Mickaël Salaün <mic@digikod.net> Cc: Alexei Starovoitov <ast@fb.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Mickaël Salaün [Thu, 9 Feb 2017 23:21:35 +0000 (00:21 +0100)]
tools: Sync {,tools/}include/uapi/linux/bpf.h
The tools version of this header is out of date; update it to the latest
version from kernel header.
Synchronize with the following commits:
* b95a5c4db09b ("bpf: add a longest prefix match trie map implementation")
* a5e8c07059d0 ("bpf: add bpf_probe_read_str helper")
* d1b662adcdb8 ("bpf: allow option for setting bpf_l4_csum_replace from scratch")
Signed-off-by: Mickaël Salaün <mic@digikod.net> Cc: Alexei Starovoitov <ast@fb.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Daniel Mack <daniel@zonque.org> Cc: David S. Miller <davem@davemloft.net> Cc: Gianluca Borello <g.borello@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
====================
Netronome NFP4000 and NFP6000 PF driver
This is a base PF driver for Netronome NFP4000 and NFP6000 chips. This
series doesn't add any exciting new features, it provides a foundation
for supporting more advanced firmware applications.
Patch 1 moves a bitfield-related helper from our BPF code to the global
header.
Patch 2 renames the kernel module and adds a new main file. We were
considering 3-module approach (pf, vf, common netdev library) but
ultimately settled on a single module to keep things simple.
Patch 3 adds support for accessing chip internals. It provides a way of
configuring access windows to different parts of chip memory and issuing
pretty much any commands on chip's NoC.
Patches 4, 5, 6, 7, 8 provide support for accessing and interpreting
various hardware and firmware information structures.
Patch 9 introduces service processor (NSP) ABI. This ABI gives us
access to PHY/SFP module configuration and information as well as
methods for unloading and loading application firmware.
Patches 10 and 11 modify the existing netdev code to make it possible
to support multi-port devices (sharing a PCI device).
Patch 12 adds a new driver probe path which will be used for the PF
PCI device IDs. It utilizes the newly added infrastructure and is able
to load application FW and spawn netdevs for all card's ports.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 9 Feb 2017 17:17:37 +0000 (09:17 -0800)]
nfp: allocate irqs in lower driver
PF services multiple ports using single PCI device therefore
IRQs can no longer be allocated in the netdev code. Lower
portion of the driver has to allocate the IRQs and hand them
out to ports.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 9 Feb 2017 17:17:35 +0000 (09:17 -0800)]
nfp: add support for service processor access
NFP Service Processor (NSP) is an ARM core inside the chip which
is responsible for management and control functions. Add support
for chip reset, FW load and external module access using the NSP.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 9 Feb 2017 17:17:32 +0000 (09:17 -0800)]
nfp: add support for reading nffw info
NFFW info is a resource which contains information about
the loaded application firmware. Add code which will allow
us to decode it and retrieve MIP location.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 9 Feb 2017 17:17:31 +0000 (09:17 -0800)]
nfp: add hwinfo support
Hwinfo is a simple key=value store of information which is read
from the flash and populated during chip power on. Add code to
look up information in it.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 9 Feb 2017 17:17:30 +0000 (09:17 -0800)]
nfp: add support for resources
Resource table is an array placed in a well defined location
in device's memory which describes device resources and contains
locks which have to be acquired to use them.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 9 Feb 2017 17:17:29 +0000 (09:17 -0800)]
nfp: add CPP access core
Command Push Pull is the name of NFP's network on a chip.
PCIe PF can access the interconnect through a number of mappings
controlled via Base Access Registers. BARs allow the PF to issue
pretty much any command or address any memory on the chip.
Add appropriate logic and a handful of helper for simple operations
like reading scalars from memories.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 9 Feb 2017 17:17:28 +0000 (09:17 -0800)]
nfp: rename the driver and add new main file
Support for the PF driver is about to be added and will share
much of the code. When the VF driver was added we planned to
maintain the PF driver as a separate module but have decided
that for our simple use case just maintaining a single module
is more reasonable. Rename the driver to just "nfp" and update
the Kconfig.
While at it remove latent references to NFP3200.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Thu, 9 Feb 2017 14:54:33 +0000 (15:54 +0100)]
devlink: fix the name of eswitch commands
The eswitch_[gs]et command is supposed to be similar to port_[gs]et
command - for multiple eswitch attributes. However, when it was introduced
by 08f4b5918b2d ("net/devlink: Add E-Switch mode control") it was wrongly
named with the word "mode" in it. So fix this now, make the oririnal
enum value existing but obsolete.
Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 10 Feb 2017 19:31:51 +0000 (14:31 -0500)]
Merge tag 'mac80211-next-for-davem-2017-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next
Johannes Berg says:
====================
Some more updates:
* use shash in mac80211 crypto code where applicable
* some documentation fixes
* pass RSSI levels up in change notifications
* remove unused rfkill-regulator
* various other cleanups
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 10 Feb 2017 18:51:05 +0000 (13:51 -0500)]
Merge branch 'dsa-phy-include'
Florian Fainelli says:
====================
net: dsa: remove unnecessary phy.h include
Including phy.h and phy_fixed.h into net/dsa.h causes phy*.h to be an
unnecessary dependency for quite a large amount of the kernel. There's
very little which actually requires definitions from phy.h in net/dsa.h
- the include itself only wants the declaration of a couple of
structures and IFNAMSIZ.
Add linux/if.h for IFNAMSIZ, declarations for the structures, phy.h to
mv88e6xxx.h as it needs it for phy_interface_t, and remove both phy.h
and phy_fixed.h from net/dsa.h.
This patch reduces from around 800 files rebuilt to around 40 - even
with ccache, the time difference is noticable.
In order to make this change, several drivers need to be updated to
include necessary headers that they were picking up through this
include. This has resulted in a much larger patch series.
I'm assuming the 0-day builder has had 24 hours with this series, and
hasn't reported any further issues with it - the last issue was two
weeks ago (before I became ill) which I fixed over the last weekend.
I'm hoping this doesn't conflict with what's already in net-next...
David, this should probably go via your tree considering the diffstat.
Changes in v2:
- took Russell's patch series
- removed Qualcomm EMAC patch
- rebased against net-next/master
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Russell King [Tue, 7 Feb 2017 23:03:05 +0000 (15:03 -0800)]
net: dsa: remove unnecessary phy*.h includes
Including phy.h and phy_fixed.h into net/dsa.h causes phy*.h to be an
unnecessary dependency for quite a large amount of the kernel. There's
very little which actually requires definitions from phy.h in net/dsa.h
- the include itself only wants the declaration of a couple of
structures and IFNAMSIZ.
Add linux/if.h for IFNAMSIZ, declarations for the structures, phy.h to
mv88e6xxx.h as it needs it for phy_interface_t, and remove both phy.h
and phy_fixed.h from net/dsa.h.
This patch reduces from around 800 files rebuilt to around 40 - even
with ccache, the time difference is noticable.
Tested-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
Russell King [Tue, 7 Feb 2017 23:03:04 +0000 (15:03 -0800)]
net: ath5k: fix build errors when linux/phy*.h is removed from net/dsa.h
Fix these errors reported by the 0-day builder by replacing the
linux/export.h include with linux/module.h.
In file included from include/linux/platform_device.h:14:0,
from drivers/net/wireless/ath/ath5k/ahb.c:20:
include/linux/device.h:1463:1: warning: data definition has no type or storage class
module_init(__driver##_init); \
^
include/linux/platform_device.h:228:2: note: in expansion of macro 'module_driver'
module_driver(__platform_driver, platform_driver_register, \
^~~~~~~~~~~~~
drivers/net/wireless/ath/ath5k/ahb.c:233:1: note: in expansion of macro 'module_platform_driver'
module_platform_driver(ath_ahb_driver);
^~~~~~~~~~~~~~~~~~~~~~
include/linux/device.h:1463:1: error: type defaults to 'int' in declaration of 'module_init' [-Werror=implicit-int]
module_init(__driver##_init); \
^
include/linux/platform_device.h:228:2: note: in expansion of macro 'module_driver'
module_driver(__platform_driver, platform_driver_register, \
^~~~~~~~~~~~~
drivers/net/wireless/ath/ath5k/ahb.c:233:1: note: in expansion of macro 'module_platform_driver'
module_platform_driver(ath_ahb_driver);
^~~~~~~~~~~~~~~~~~~~~~
drivers/net/wireless/ath/ath5k/ahb.c:233:1: warning: parameter names (without types) in function declaration
In file included from include/linux/platform_device.h:14:0,
from drivers/net/wireless/ath/ath5k/ahb.c:20:
include/linux/device.h:1468:1: warning: data definition has no type or storage class
module_exit(__driver##_exit);
^
include/linux/platform_device.h:228:2: note: in expansion of macro 'module_driver'
module_driver(__platform_driver, platform_driver_register, \
^~~~~~~~~~~~~
drivers/net/wireless/ath/ath5k/ahb.c:233:1: note: in expansion of macro 'module_platform_driver'
module_platform_driver(ath_ahb_driver);
^~~~~~~~~~~~~~~~~~~~~~
include/linux/device.h:1468:1: error: type defaults to 'int' in declaration of 'module_exit' [-Werror=implicit-int]
module_exit(__driver##_exit);
^
include/linux/platform_device.h:228:2: note: in expansion of macro 'module_driver'
module_driver(__platform_driver, platform_driver_register, \
^~~~~~~~~~~~~
drivers/net/wireless/ath/ath5k/ahb.c:233:1: note: in expansion of macro 'module_platform_driver'
module_platform_driver(ath_ahb_driver);
^~~~~~~~~~~~~~~~~~~~~~
drivers/net/wireless/ath/ath5k/ahb.c:233:1: warning: parameter names (without types) in function declaration
In file included from include/linux/platform_device.h:14:0,
from drivers/net/wireless/ath/ath5k/ahb.c:20:
drivers/net/wireless/ath/ath5k/ahb.c:233:24: warning: 'ath_ahb_driver_exit' defined but not used [-Wunused-function]
module_platform_driver(ath_ahb_driver);
^
include/linux/device.h:1464:20: note: in definition of macro 'module_driver'
static void __exit __driver##_exit(void) \
^~~~~~~~
drivers/net/wireless/ath/ath5k/ahb.c:233:1: note: in expansion of macro 'module_platform_driver'
module_platform_driver(ath_ahb_driver);
^~~~~~~~~~~~~~~~~~~~~~
drivers/net/wireless/ath/ath5k/ahb.c:233:24: warning: 'ath_ahb_driver_init' defined but not used [-Wunused-function]
module_platform_driver(ath_ahb_driver);
^
include/linux/device.h:1459:19: note: in definition of macro 'module_driver'
static int __init __driver##_init(void) \
^~~~~~~~
drivers/net/wireless/ath/ath5k/ahb.c:233:1: note: in expansion of macro 'module_platform_driver'
module_platform_driver(ath_ahb_driver);
^~~~~~~~~~~~~~~~~~~~~~
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Acked-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Russell King [Tue, 7 Feb 2017 23:03:03 +0000 (15:03 -0800)]
net: liquidio: fix build errors when linux/phy*.h is removed from net/dsa.h
drivers/net/ethernet/cavium/liquidio/lio_vf_main.c:30: error: expected declaration specifiers or '...' before string constant
drivers/net/ethernet/cavium/liquidio/lio_vf_main.c:30: warning: data definition has no type or storage class
drivers/net/ethernet/cavium/liquidio/lio_vf_main.c:30: error: type defaults to 'int' in declaration of 'MODULE_AUTHOR'
drivers/net/ethernet/cavium/liquidio/lio_vf_main.c:30: error: function declaration isn't a prototype
drivers/net/ethernet/cavium/liquidio/lio_vf_main.c:31: error: expected declaration specifiers or '...' before string constant
drivers/net/ethernet/cavium/liquidio/lio_vf_main.c:31: warning: data definition has no type or storage class
drivers/net/ethernet/cavium/liquidio/lio_vf_main.c:31: error: type defaults to 'int' in declaration of 'MODULE_DESCRIPTION'
drivers/net/ethernet/cavium/liquidio/lio_vf_main.c:31: error: function declaration isn't a prototype
drivers/net/ethernet/cavium/liquidio/lio_vf_main.c:32: error: expected declaration specifiers or '...' before string constant
drivers/net/ethernet/cavium/liquidio/lio_vf_main.c:32: warning: data definition has no type or storage class
drivers/net/ethernet/cavium/liquidio/lio_vf_main.c:32: error: type defaults to 'int' in declaration of 'MODULE_LICENSE'
drivers/net/ethernet/cavium/liquidio/lio_vf_main.c:32: error: function declaration isn't a prototype
drivers/net/ethernet/cavium/liquidio/lio_vf_main.c:33: error: expected declaration specifiers or '...' before string constant
drivers/net/ethernet/cavium/liquidio/lio_vf_main.c:33: warning: data definition has no type or storage class
drivers/net/ethernet/cavium/liquidio/lio_vf_main.c:33: error: type defaults to 'int' in declaration of 'MODULE_VERSION'
drivers/net/ethernet/cavium/liquidio/lio_vf_main.c:33: error: function declaration isn't a prototype
drivers/net/ethernet/cavium/liquidio/lio_vf_main.c:36: error: expected ')' before 'int'
drivers/net/ethernet/cavium/liquidio/lio_vf_main.c:37: error: expected ')' before string constant
drivers/net/ethernet/cavium/liquidio/lio_vf_main.c:325: warning: data definition has no type or storage class
drivers/net/ethernet/cavium/liquidio/lio_vf_main.c:325: error: type defaults to 'int' in declaration of 'MODULE_DEVICE_TABLE'
drivers/net/ethernet/cavium/liquidio/lio_vf_main.c:325: warning: parameter names (without types) in function declaration
drivers/net/ethernet/cavium/liquidio/lio_vf_main.c:3250: warning: data definition has no type or storage class
drivers/net/ethernet/cavium/liquidio/lio_vf_main.c:3250: error: type defaults to 'int' in declaration of 'module_init'
drivers/net/ethernet/cavium/liquidio/lio_vf_main.c:3250: warning: parameter names (without types) in function declaration
drivers/net/ethernet/cavium/liquidio/lio_vf_main.c:3251: warning: data definition has no type or storage class
drivers/net/ethernet/cavium/liquidio/lio_vf_main.c:3251: error: type defaults to 'int' in declaration of 'module_exit'
drivers/net/ethernet/cavium/liquidio/lio_vf_main.c:3251: warning: parameter names (without types) in function declaration
drivers/net/ethernet/cavium/liquidio/lio_main.c:36: error: expected declaration specifiers or '...' before string constant
drivers/net/ethernet/cavium/liquidio/lio_main.c:36: warning: data definition has no type or storage class
drivers/net/ethernet/cavium/liquidio/lio_main.c:36: error: type defaults to 'int' in declaration of 'MODULE_AUTHOR'
drivers/net/ethernet/cavium/liquidio/lio_main.c:36: error: function declaration isn't a prototype
drivers/net/ethernet/cavium/liquidio/lio_main.c:37: error: expected declaration specifiers or '...' before string constant
drivers/net/ethernet/cavium/liquidio/lio_main.c:37: warning: data definition has no type or storage class
drivers/net/ethernet/cavium/liquidio/lio_main.c:37: error: type defaults to 'int' in declaration of 'MODULE_DESCRIPTION'
drivers/net/ethernet/cavium/liquidio/lio_main.c:37: error: function declaration isn't a prototype
drivers/net/ethernet/cavium/liquidio/lio_main.c:38: error: expected declaration specifiers or '...' before string constant
drivers/net/ethernet/cavium/liquidio/lio_main.c:38: warning: data definition has no type or storage class
drivers/net/ethernet/cavium/liquidio/lio_main.c:38: error: type defaults to 'int' in declaration of 'MODULE_LICENSE'
drivers/net/ethernet/cavium/liquidio/lio_main.c:38: error: function declaration isn't a prototype
drivers/net/ethernet/cavium/liquidio/lio_main.c:39: error: expected declaration specifiers or '...' before string constant
drivers/net/ethernet/cavium/liquidio/lio_main.c:39: warning: data definition has no type or storage class
drivers/net/ethernet/cavium/liquidio/lio_main.c:39: error: type defaults to 'int' in declaration of 'MODULE_VERSION'
drivers/net/ethernet/cavium/liquidio/lio_main.c:39: error: function declaration isn't a prototype
drivers/net/ethernet/cavium/liquidio/lio_main.c:40: error: expected declaration specifiers or '...' before string constant
drivers/net/ethernet/cavium/liquidio/lio_main.c:40: warning: data definition has no type or storage class
drivers/net/ethernet/cavium/liquidio/lio_main.c:40: error: type defaults to 'int' in declaration of 'MODULE_FIRMWARE'
drivers/net/ethernet/cavium/liquidio/lio_main.c:40: error: function declaration isn't a prototype
drivers/net/ethernet/cavium/liquidio/lio_main.c:41: error: expected declaration specifiers or '...' before string constant
drivers/net/ethernet/cavium/liquidio/lio_main.c:41: warning: data definition has no type or storage class
drivers/net/ethernet/cavium/liquidio/lio_main.c:41: error: type defaults to 'int' in declaration of 'MODULE_FIRMWARE'
drivers/net/ethernet/cavium/liquidio/lio_main.c:41: error: function declaration isn't a prototype
drivers/net/ethernet/cavium/liquidio/lio_main.c:42: error: expected declaration specifiers or '...' before string constant
drivers/net/ethernet/cavium/liquidio/lio_main.c:42: warning: data definition has no type or storage class
drivers/net/ethernet/cavium/liquidio/lio_main.c:42: error: type defaults to 'int' in declaration of 'MODULE_FIRMWARE'
drivers/net/ethernet/cavium/liquidio/lio_main.c:42: error: function declaration isn't a prototype
drivers/net/ethernet/cavium/liquidio/lio_main.c:43: error: expected declaration specifiers or '...' before string constant
drivers/net/ethernet/cavium/liquidio/lio_main.c:43: warning: data definition has no type or storage class
drivers/net/ethernet/cavium/liquidio/lio_main.c:43: error: type defaults to 'int' in declaration of 'MODULE_FIRMWARE'
drivers/net/ethernet/cavium/liquidio/lio_main.c:43: error: function declaration isn't a prototype
drivers/net/ethernet/cavium/liquidio/lio_main.c:46: error: expected ')' before 'int'
drivers/net/ethernet/cavium/liquidio/lio_main.c:48: error: expected ')' before string constant
drivers/net/ethernet/cavium/liquidio/lio_main.c:53: error: expected ')' before 'int'
drivers/net/ethernet/cavium/liquidio/lio_main.c:54: error: expected ')' before string constant
drivers/net/ethernet/cavium/liquidio/lio_main.c:57: error: expected ')' before 'sizeof'
drivers/net/ethernet/cavium/liquidio/lio_main.c:58: error: expected ')' before string constant
drivers/net/ethernet/cavium/liquidio/lio_main.c:498: warning: data definitionhas no type or storage class
drivers/net/ethernet/cavium/liquidio/lio_main.c:498: error: type defaults to 'int' in declaration of 'MODULE_DEVICE_TABLE'
drivers/net/ethernet/cavium/liquidio/lio_main.c:498: warning: parameter names (without types) in function declaration
drivers/net/ethernet/cavium/liquidio/lio_main.c: In function 'octeon_recv_vf_drv_notice':
drivers/net/ethernet/cavium/liquidio/lio_main.c:4393: error: implicit declaration of function 'try_module_get'
drivers/net/ethernet/cavium/liquidio/lio_main.c:4400: error: implicit declaration of function 'module_put'
drivers/net/ethernet/cavium/liquidio/lio_main.c: At top level:
drivers/net/ethernet/cavium/liquidio/lio_main.c:4670: warning: data definition has no type or storage class
drivers/net/ethernet/cavium/liquidio/lio_main.c:4670: error: type defaults to 'int' in declaration of 'module_init'
drivers/net/ethernet/cavium/liquidio/lio_main.c:4670: warning: parameter names (without types) in function declaration
drivers/net/ethernet/cavium/liquidio/lio_main.c:4671: warning: data definition has no type or storage class
drivers/net/ethernet/cavium/liquidio/lio_main.c:4671: error: type defaults to 'int' in declaration of 'module_exit'
drivers/net/ethernet/cavium/liquidio/lio_main.c:4671: warning: parameter names (without types) in function declaration
Add linux/module.h to both these files.
drivers/net/ethernet/cavium/liquidio/octeon_console.c:40:31: error: expected ')' before 'int'
drivers/net/ethernet/cavium/liquidio/octeon_console.c:42:4: error: expected ')' before string constant
Add linux/moduleparam.h to this file.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Acked-by: Felix Manlunas <felix.manlunas@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(b) the multiple *_initcall() statements, each of which are translated
to a module_init() call when attempting a module build, become
aliases to init_module(). Having more than one alias will cause a
build error.
Hence, rather than adding a linux/module.h include, remove the redundant
MODULE_*() from this file.
Acked-by: David Daney <david.daney@cavium.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
Russell King [Tue, 7 Feb 2017 23:03:01 +0000 (15:03 -0800)]
iscsi: fix build errors when linux/phy*.h is removed from net/dsa.h
drivers/target/iscsi/iscsi_target_login.c:1135:7: error: implicit declaration of function 'try_module_get' [-Werror=implicit-function-declaration]
Add linux/module.h to iscsi_target_login.c.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Acked-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Russell King [Tue, 7 Feb 2017 23:03:00 +0000 (15:03 -0800)]
net: mvneta: fix build errors when linux/phy*.h is removed from net/dsa.h
drivers/net/ethernet/marvell/mvneta.c:2694:26: error: storage size of 'status' isn't known
drivers/net/ethernet/marvell/mvneta.c:2695:26: error: storage size of 'changed' isn't known
drivers/net/ethernet/marvell/mvneta.c:2695:9: error: variable 'changed' has initializer but incomplete type
drivers/net/ethernet/marvell/mvneta.c:2709:2: error: implicit declaration of function 'fixed_phy_update_state' [-Werror=implicit-function-declaration]
Add linux/phy_fixed.h to mvneta.c
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Acked-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Russell King [Tue, 7 Feb 2017 23:02:58 +0000 (15:02 -0800)]
net: bgmac: fix build errors when linux/phy*.h is removed from net/dsa.h
drivers/net/ethernet/broadcom/bgmac.c:1015:17: error: dereferencing pointer to incomplete type 'struct mii_bus'
drivers/net/ethernet/broadcom/bgmac.c:1185:2: error: implicit declaration of function 'phy_start' [-Werror=implicit-function-declaration]
drivers/net/ethernet/broadcom/bgmac.c:1198:2: error: implicit declaration of function 'phy_stop' [-Werror=implicit-function-declaration]
drivers/net/ethernet/broadcom/bgmac.c:1239:9: error: implicit declaration of function 'phy_mii_ioctl' [-Werror=implicit-function-declaration]
drivers/net/ethernet/broadcom/bgmac.c:1389:28: error: 'phy_ethtool_get_link_ksettings' undeclared here (not in a function)
drivers/net/ethernet/broadcom/bgmac.c:1390:28: error: 'phy_ethtool_set_link_ksettings' undeclared here (not in a function)
drivers/net/ethernet/broadcom/bgmac.c:1403:13: error: dereferencing pointer to incomplete type 'struct phy_device'
drivers/net/ethernet/broadcom/bgmac.c:1417:3: error: implicit declaration of function 'phy_print_status' [-Werror=implicit-function-declaration]
drivers/net/ethernet/broadcom/bgmac.c:1424:26: error: storage size of 'fphy_status' isn't known
drivers/net/ethernet/broadcom/bgmac.c:1424:9: error: variable 'fphy_status' has initializer but incomplete type
drivers/net/ethernet/broadcom/bgmac.c:1425:11: warning: excess elements in struct initializer
drivers/net/ethernet/broadcom/bgmac.c:1425:3: error: unknown field 'link' specified in initializer
drivers/net/ethernet/broadcom/bgmac.c:1426:12: note: in expansion of macro 'SPEED_1000'
drivers/net/ethernet/broadcom/bgmac.c:1426:3: error: unknown field 'speed' specified in initializer
drivers/net/ethernet/broadcom/bgmac.c:1427:13: note: in expansion of macro 'DUPLEX_FULL'
drivers/net/ethernet/broadcom/bgmac.c:1427:3: error: unknown field 'duplex' specified in initializer
drivers/net/ethernet/broadcom/bgmac.c:1432:12: error: implicit declaration of function 'fixed_phy_register' [-Werror=implicit-function-declaration]
drivers/net/ethernet/broadcom/bgmac.c:1432:31: error: 'PHY_POLL' undeclared (first use in this function)
drivers/net/ethernet/broadcom/bgmac.c:1438:8: error: implicit declaration of function 'phy_connect_direct' [-Werror=implicit-function-declaration]
drivers/net/ethernet/broadcom/bgmac.c:1439:6: error: 'PHY_INTERFACE_MODE_MII' undeclared (first use in this function)
drivers/net/ethernet/broadcom/bgmac.c:1521:2: error: implicit declaration of function 'phy_disconnect' [-Werror=implicit-function-declaration]
drivers/net/ethernet/broadcom/bgmac.c:1541:15: error: expected declaration specifiers or '...' before string constant
Add linux/phy.h to bgmac.c
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Acked-by: Rafał Miłecki <rafal@milecki.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
Russell King [Tue, 7 Feb 2017 23:02:57 +0000 (15:02 -0800)]
net: lan78xx: fix build errors when linux/phy*.h is removed from net/dsa.h
drivers/net/usb/lan78xx.c:394:33: sparse: expected ; at end of declaration
drivers/net/usb/lan78xx.c:394:33: sparse: Expected } at end of struct-union-enum-specifier
drivers/net/usb/lan78xx.c:394:33: sparse: got interface
drivers/net/usb/lan78xx.c:403:1: sparse: Expected ; at the end of type declaration
drivers/net/usb/lan78xx.c:403:1: sparse: got }
Add linux/phy.h to lan78xx.c
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
Russell King [Tue, 7 Feb 2017 23:02:56 +0000 (15:02 -0800)]
net: macb: fix build errors when linux/phy*.h is removed from net/dsa.h
drivers/net/ethernet/cadence/macb.h:862:33: sparse: expected ; at end of declaration
drivers/net/ethernet/cadence/macb.h:862:33: sparse: Expected } at end of struct-union-enum-specifier
drivers/net/ethernet/cadence/macb.h:862:33: sparse: got phy_interface
drivers/net/ethernet/cadence/macb.h:877:1: sparse: Expected ; at the end of type declaration
drivers/net/ethernet/cadence/macb.h:877:1: sparse: got }
In file included from drivers/net/ethernet/cadence/macb_pci.c:29:0:
drivers/net/ethernet/cadence/macb.h:862:2: error: unknown type name 'phy_interface_t'
phy_interface_t phy_interface;
^~~~~~~~~~~~~~~
Add linux/phy.h to macb.h
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Russell King [Tue, 7 Feb 2017 23:02:54 +0000 (15:02 -0800)]
net: sunrpc: fix build errors when linux/phy*.h is removed from net/dsa.h
Removing linux/phy.h from net/dsa.h reveals a build error in the sunrpc
code:
net/sunrpc/xprtrdma/svc_rdma_backchannel.c: In function 'xprt_rdma_bc_put':
net/sunrpc/xprtrdma/svc_rdma_backchannel.c:277:2: error: implicit declaration of function 'module_put' [-Werror=implicit-function-declaration]
net/sunrpc/xprtrdma/svc_rdma_backchannel.c: In function 'xprt_setup_rdma_bc':
net/sunrpc/xprtrdma/svc_rdma_backchannel.c:348:7: error: implicit declaration of function 'try_module_get' [-Werror=implicit-function-declaration]
Fix this by adding linux/module.h to svc_rdma_backchannel.c
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Acked-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 10 Feb 2017 18:18:34 +0000 (13:18 -0500)]
Merge branch 'act_pedit-relative-offset'
Amir Vadai says:
====================
net/sched: act_pedit: Use offset relative to conventional network headers
Some FW/HW parser APIs are such that they need to get the specific header type (e.g
IPV4 or IPV6, TCP or UDP) and not only the networking level (e.g network or transport).
Enhancing the UAPI to allow for specifying that, would allow the same flows to be
set into both SW and HW.
This patchset also makes pedit more robust. Currently fields offset is specified
by offset relative to the ip header, while using negative offsets for
MAC layer fields.
This series enables the user to set offset relative to the relevant header.
Usage example:
$ tc filter add dev enp0s9 protocol ip parent ffff: \
flower \
ip_proto tcp \
dst_port 80 \
action \
pedit munge ip ttl add 0xff \
pedit munge tcp dport set 8080 \
pipe action mirred egress redirect dev veth0
Will forward traffic destined to tcp dport 80, while modifying the
destination port to 8080, and decreasing the ttl by one.
I've uploaded a draft for the userspace [2] to make it easier to review and
test the patchset.
Patchset was tested and applied on top of upstream commit bd092ad1463c ("Merge
branch 'remove-__napi_complete_done'")
Thanks,
Amir
Changes since V2:
- Instead of reusing unused bits in existing uapi fields, using new netlink
attributes for the new information. This way new/old user space and new/old
kernel can live together without having misunderstandings.
Changes since V1:
- No changes - V1 was sent and didn't make it for 4.10.
- You asked me [1] why did I use specific header names instead of layers (L2,
L3...), and I explained that it is on purpose, this extra information is
planned to be used by hardware drivers to offload the action.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Amir Vadai [Tue, 7 Feb 2017 07:56:08 +0000 (09:56 +0200)]
net/act_pedit: Introduce 'add' operation
This command could be useful to inc/dec fields.
For example, to forward any TCP packet and decrease its TTL:
$ tc filter add dev enp0s9 protocol ip parent ffff: \
flower ip_proto tcp \
action pedit munge ip ttl add 0xff pipe \
action mirred egress redirect dev veth0
In the example above, adding 0xff to this u8 field is actually
decreasing it by one, since the operation is masked.
Signed-off-by: Amir Vadai <amir@vadai.me> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Amir Vadai [Tue, 7 Feb 2017 07:56:07 +0000 (09:56 +0200)]
net/act_pedit: Support using offset relative to the conventional network headers
Extend pedit to enable the user setting offset relative to network
headers. This change would enable to work with more complex header
schemes (vs the simple IPv4 case) where setting a fixed offset relative
to the network header is not enough.
After this patch, the action has information about the exact header type
and field inside this header. This information could be used later on
for hardware offloading of pedit.
Backward compatibility was being kept:
1. Old kernel <-> new userspace
2. New kernel <-> old userspace
3. add rule using new userspace <-> dump using old userspace
4. add rule using old userspace <-> dump using new userspace
When using the extended api, new netlink attributes are being used. This
way, operation will fail in (1) and (3) - and no malformed rule be added
or dumped. Of course, new user space that doesn't need the new
functionality can use the old netlink attributes and operation will
succeed.
Since action can support both api's, (2) should work, and it is easy to
write the new user space to have (4) work.
The action is having a strict check that only header types and commands
it can handle are accepted. This way future additions will be much
easier.
Usage example:
$ tc filter add dev enp0s9 protocol ip parent ffff: \
flower \
ip_proto tcp \
dst_port 80 \
action pedit munge tcp dport set 8080 pipe \
action mirred egress redirect dev veth0
Will forward tcp port whose original dest port is 80, while modifying
the destination port to 8080.
Signed-off-by: Amir Vadai <amir@vadai.me> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 10 Feb 2017 16:46:42 +0000 (11:46 -0500)]
Merge branch 'mlxsw-offload-mc-flood'
Jiri Pirko says:
====================
mlxsw: Offload MC flood for unregister MC
Nogah says:
When multicast is enabled, the Linux bridge floods unregistered multicast
packets only to ports connected to a multicast router. Devices capable of
offloading the Linux bridge need to be made aware of such ports, for
proper flooding behavior.
On the other hand, when multicast is disabled, such packets should be
flooded to all ports. This patchset aims to fix that, by offloading
the multicast state and the list of multicast router ports.
The first 3 patches adds switchdev attributes to offload this data.
The rest of the patchset add implementation for handling this data in the
mlxsw driver.
The effects this data has on the MDB (namely, when the multicast is
disabled the MDB should be considered as invalid, and when it is enabled, a
packet that is flooded by it should also be flooded to the multicast
routers ports) is subject of future work.
Testing of this patchset included:
Sending 3 mc packets streams, LL, register and unregistered, and checking
that they reached only to the ports that should have received them.
The configs were:
mc disabled, mc without mc router ports and mc with fixed router port.
It was checked for vlan aware bridge, vlan unaware bridge and vlan unaware
bridge with another vlan unaware bridge on the same machine
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Nogah Frankel [Thu, 9 Feb 2017 13:54:48 +0000 (14:54 +0100)]
mlxsw: spectrum: Extend port_orig_get for bridge devices
The function mlxsw_sp_port_orig_get returns the vport from the physical
port if needed, based on the original device.
This patch addresses the case where the original device is a bridge.
If it is vlan unaware bridge, it returns the matching vport. If it is vlan
aware bridge, there is no matching vport, and it returns the original port.
Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Nogah Frankel [Thu, 9 Feb 2017 13:54:47 +0000 (14:54 +0100)]
mlxsw: spectrum: Add an option to flood mc by mc_router_port
The decision whether to flood a multicast packet to a port dependent
on three flags: mc_disabled, mc_router_port, mc_flood.
If mc_disabled is on, the port will be flooded according to mc_flood,
otherwise, according to mc_router_port. To accomplish that, add those
flags into the mlxsw_sp_port struct and update the mc flood table
accordingly.
Update mc_router_port by switchdev attribute
SWITCHDEV_ATTR_ID_PORT_MC_ROUTER_PORT.
Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Nogah Frankel [Thu, 9 Feb 2017 13:54:46 +0000 (14:54 +0100)]
mlxsw: spectrum: Separate bc and mc floods
Break the bm (broadcast-multicast) into two tables, one for broadcast
(and link local multicast that behaves like bc) and one for unknown
multicasts.
Add a bool into mlxsw_sp_port named mc_flood that reflect the value this
port should have in the mc flood table (currently, always 1);
Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Nogah Frankel [Thu, 9 Feb 2017 13:54:45 +0000 (14:54 +0100)]
mlxsw: spectrum: Change max vfid
A user that wants many bridges will use 1.Q bridge which are scalable.
One can have as many 1.Q bridges as vfids.
This patch sets their number to 1k, which is a reasonably large number.
This change is done here because the next patches will add a new flood
table, and without it, it will increase the overall size of the flood
tables dramatically.
Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Nogah Frankel [Thu, 9 Feb 2017 13:54:43 +0000 (14:54 +0100)]
mlxsw: spectrum: Break flood set func to be per table
Currently, the flood set function can't operate on only one table, but
sets both uc_flood and mb_flood together.
This patch creates a function that sets the flood state per table.
Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Nogah Frankel [Thu, 9 Feb 2017 13:54:42 +0000 (14:54 +0100)]
switchdev: bridge: Offload mc router ports
Offload the mc router ports list, whenever it is being changed.
It is done because in some cases mc packets needs to be flooded to all
the ports in this list.
Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Ivan Vecera <ivecera@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Nogah Frankel [Thu, 9 Feb 2017 13:54:41 +0000 (14:54 +0100)]
bridge: mcast: Merge the mc router ports deletions to one function
There are three places where a port gets deleted from the mc router port
list. This patch join the actual deletion to one function.
It will be helpful for later patch that will offload changes in the mc
router ports list.
Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Ivan Vecera <ivecera@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Nogah Frankel [Thu, 9 Feb 2017 13:54:40 +0000 (14:54 +0100)]
switchdev: bridge: Offload multicast disabled
Offload multicast disabled flag, for more accurate mc flood behavior:
When it is on, the mdb should be ignored.
When it is off, unregistered mc packets should be flooded to mc router
ports.
Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Ivan Vecera <ivecera@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Thu, 9 Feb 2017 13:39:00 +0000 (14:39 +0100)]
sched: check negative err value to safe one level of indent
As it is more common, check err for !0. That allows to safe one level of
indentation and makes the code easier to read. Also, make 'next' variable
global in function as it is used twice.
Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The kernel can store several FIB aliases that share the same prefix and
length. These aliases can differ in other parameters such as TOS and
metric, which are taken into account during lookup.
Offloading devices might not have the same flexibility, allowing only a
single route with the same prefix and length to be reflected. mlxsw is
one such device.
This patchset aims to correctly handle this situation in the mlxsw
driver. The first four patches introduce small changes in the IPv4 FIB
code, so that listeners of the FIB notification chain will be able to
correctly handle identical routes.
The last three patches build on top of previous work and introduce the
necessary changes in the mlxsw driver. The biggest change is the
introduction of a FIB node, where identical routes are chained, instead
of a primitive reference counting. This is explained in detail in the
fifth patch.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Thu, 9 Feb 2017 09:28:44 +0000 (10:28 +0100)]
mlxsw: spectrum_router: Add support for route replace
Upon the reception of an ENTRY_REPLACE notification, resolve the FIB
node corresponding to the prefix and length and insert the new route
before the first matching entry.
Since the notification also signals the deletion of the replaced route,
delete it from the driver's cache.
Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Thu, 9 Feb 2017 09:28:43 +0000 (10:28 +0100)]
mlxsw: spectrum_router: Add support for route append
When a new route is appended, it's placed after existing routes sharing
the same parameters (prefix, length, table ID, TOS and priority).
While the device supports only one route with the same prefix and length
in a single table, it's important to correctly place the appended route
in the driver's cache, as when a route is deleted the next one is
programmed into the device.
Following the reception of an ENTRY_APPEND notification, resolve the
FIB node corresponding to the prefix and length and correctly place the
new entry in its entry list.
Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
In the device, routes are indexed in a routing table based on the prefix
and its length. This is in contrast to the kernel's FIB where several
FIB aliases can exist with these parameters being identical. In such
cases, the routes will be sorted by table ID (LOCAL first, then MAIN),
TOS and finally priority (metric).
During lookup, these routes will be evaluated in order. In case the
packet's TOS field is non-zero and a FIB alias with a matching TOS is
found, then it's selected. Otherwise, the lookup defaults to the route
with TOS 0 (if it exists). However, if the requested scope is narrower
than the one found, then the lookup continues.
To best reflect the kernel's datapath we should take the above into
account. Given a prefix and its length, the reflected route will always
be the first one in the FIB alias list. However, if the route has a
non-zero TOS then its action will be converted to trap instead of
forward, since we currently don't support TOS-based routing. If this
turns out to be a real issue, we can add support for that using
policy-based switching.
The route's scope can be effectively ignored as any packet being routed
by the device would've been looked-up using the widest scope (UNIVERSE).
To achieve that we need to do two changes. Firstly, we need to create
another struct (FIB node) that will hold the list of FIB entries sharing
the same prefix and length. This struct will be hashed using these two
parameters.
Secondly, we need to change the route reflection to match the above
logic, so that the first FIB entry in the list will be programmed into
the device while the rest will remain in the driver's cache in case of
subsequent changes.
Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Thu, 9 Feb 2017 09:28:41 +0000 (10:28 +0100)]
ipv4: fib: Add events for FIB replace and append
The FIB notification chain currently uses the NLM_F_{REPLACE,APPEND}
flags to signal routes being replaced or appended.
Instead of using netlink flags for in-kernel notifications we can simply
introduce two new events in the FIB notification chain. This has the
added advantage of making the API cleaner, thereby making it clear that
these events should be supported by listeners of the notification chain.
Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> CC: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Thu, 9 Feb 2017 09:28:40 +0000 (10:28 +0100)]
ipv4: fib: Send notification before deleting FIB alias
When a FIB alias is replaced following NLM_F_REPLACE, the ENTRY_ADD
notification is sent after the reference on the previous FIB info was
dropped. This is problematic as potential listeners might need to access
it in their notification blocks.
Solve this by sending the notification prior to the deletion of the
replaced FIB alias. This is consistent with ENTRY_DEL notifications.
Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> CC: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Thu, 9 Feb 2017 09:28:39 +0000 (10:28 +0100)]
ipv4: fib: Send deletion notification with actual FIB alias type
When a FIB alias is removed, a notification is sent using the type
passed from user space - can be RTN_UNSPEC - instead of the actual type
of the removed alias. This is problematic for listeners of the FIB
notification chain, as several FIB aliases can exist with matching
parameters, but the type.
Solve this by passing the actual type of the removed FIB alias.
Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> CC: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Thu, 9 Feb 2017 09:28:38 +0000 (10:28 +0100)]
ipv4: fib: Only flush FIB aliases belonging to currently flushed table
In case the MAIN table is flushed and its trie is shared with the LOCAL
table, then we might be flushing FIB aliases belonging to the latter.
This can lead to FIB_ENTRY_DEL notifications sent with the wrong table
ID.
The above doesn't affect current listeners, as the table ID is ignored
during entry deletion, but this will change later in the patchset.
When flushing a particular table, skip any aliases belonging to a
different one.
Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> CC: Alexander Duyck <alexander.h.duyck@intel.com> CC: Patrick McHardy <kaber@trash.net> Reviewed-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jarno Rajahalme [Thu, 9 Feb 2017 19:22:01 +0000 (11:22 -0800)]
openvswitch: Pack struct sw_flow_key.
struct sw_flow_key has two 16-bit holes. Move the most matched
conntrack match fields there. In some typical cases this reduces the
size of the key that needs to be hashed into half and into one cache
line.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Joe Stringer <joe@ovn.org> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Jarno Rajahalme [Thu, 9 Feb 2017 19:22:00 +0000 (11:22 -0800)]
openvswitch: Add force commit.
Stateful network admission policy may allow connections to one
direction and reject connections initiated in the other direction.
After policy change it is possible that for a new connection an
overlapping conntrack entry already exists, where the original
direction of the existing connection is opposed to the new
connection's initial packet.
Most importantly, conntrack state relating to the current packet gets
the "reply" designation based on whether the original direction tuple
or the reply direction tuple matched. If this "directionality" is
wrong w.r.t. to the stateful network admission policy it may happen
that packets in neither direction are correctly admitted.
This patch adds a new "force commit" option to the OVS conntrack
action that checks the original direction of an existing conntrack
entry. If that direction is opposed to the current packet, the
existing conntrack entry is deleted and a new one is subsequently
created in the correct direction.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Joe Stringer <joe@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Jarno Rajahalme [Thu, 9 Feb 2017 19:21:59 +0000 (11:21 -0800)]
openvswitch: Add original direction conntrack tuple to sw_flow_key.
Add the fields of the conntrack original direction 5-tuple to struct
sw_flow_key. The new fields are initially marked as non-existent, and
are populated whenever a conntrack action is executed and either finds
or generates a conntrack entry. This means that these fields exist
for all packets that were not rejected by conntrack as untrackable.
The original tuple fields in the sw_flow_key are filled from the
original direction tuple of the conntrack entry relating to the
current packet, or from the original direction tuple of the master
conntrack entry, if the current conntrack entry has a master.
Generally, expected connections of connections having an assigned
helper (e.g., FTP), have a master conntrack entry.
The main purpose of the new conntrack original tuple fields is to
allow matching on them for policy decision purposes, with the premise
that the admissibility of tracked connections reply packets (as well
as original direction packets), and both direction packets of any
related connections may be based on ACL rules applying to the master
connection's original direction 5-tuple. This also makes it easier to
make policy decisions when the actual packet headers might have been
transformed by NAT, as the original direction 5-tuple represents the
packet headers before any such transformation.
When using the original direction 5-tuple the admissibility of return
and/or related packets need not be based on the mere existence of a
conntrack entry, allowing separation of admission policy from the
established conntrack state. While existence of a conntrack entry is
required for admission of the return or related packets, policy
changes can render connections that were initially admitted to be
rejected or dropped afterwards. If the admission of the return and
related packets was based on mere conntrack state (e.g., connection
being in an established state), a policy change that would make the
connection rejected or dropped would need to find and delete all
conntrack entries affected by such a change. When using the original
direction 5-tuple matching the affected conntrack entries can be
allowed to time out instead, as the established state of the
connection would not need to be the basis for packet admission any
more.
It should be noted that the directionality of related connections may
be the same or different than that of the master connection, and
neither the original direction 5-tuple nor the conntrack state bits
carry this information. If needed, the directionality of the master
connection can be stored in master's conntrack mark or labels, which
are automatically inherited by the expected related connections.
The fact that neither ARP nor ND packets are trackable by conntrack
allows mutual exclusion between ARP/ND and the new conntrack original
tuple fields. Hence, the IP addresses are overlaid in union with ARP
and ND fields. This allows the sw_flow_key to not grow much due to
this patch, but it also means that we must be careful to never use the
new key fields with ARP or ND packets. ARP is easy to distinguish and
keep mutually exclusive based on the ethernet type, but ND being an
ICMPv6 protocol requires a bit more attention.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Joe Stringer <joe@ovn.org> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Jarno Rajahalme [Thu, 9 Feb 2017 19:21:58 +0000 (11:21 -0800)]
openvswitch: Inherit master's labels.
We avoid calling into nf_conntrack_in() for expected connections, as
that would remove the expectation that we want to stick around until
we are ready to commit the connection. Instead, we do a lookup in the
expectation table directly. However, after a successful expectation
lookup we have set the flow key label field from the master
connection, whereas nf_conntrack_in() does not do this. This leads to
master's labels being inherited after an expectation lookup, but those
labels not being inherited after the corresponding conntrack action
with a commit flag.
This patch resolves the problem by changing the commit code path to
also inherit the master's labels to the expected connection.
Resolving this conflict in favor of inheriting the labels allows more
information be passed from the master connection to related
connections, which would otherwise be much harder if the 32 bits in
the connmark are not enough. Labels can still be set explicitly, so
this change only affects the default values of the labels in presense
of a master connection.
Fixes: 7f8a436eaa2c ("openvswitch: Add conntrack action") Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Joe Stringer <joe@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Jarno Rajahalme [Thu, 9 Feb 2017 19:21:57 +0000 (11:21 -0800)]
openvswitch: Refactor labels initialization.
Refactoring conntrack labels initialization makes changes in later
patches easier to review.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Joe Stringer <joe@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Jarno Rajahalme [Thu, 9 Feb 2017 19:21:56 +0000 (11:21 -0800)]
openvswitch: Simplify labels length logic.
Since 23014011ba42 ("netfilter: conntrack: support a fixed size of 128
distinct labels"), the size of conntrack labels extension has fixed to
128 bits, so we do not need to check for labels sizes shorter than 128
at run-time. This patch simplifies labels length logic accordingly,
but allows the conntrack labels size to be increased in the future
without breaking the build. In the event of conntrack labels
increasing in size OVS would still be able to deal with the 128 first
label bits.
Suggested-by: Joe Stringer <joe@ovn.org> Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Joe Stringer <joe@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Jarno Rajahalme [Thu, 9 Feb 2017 19:21:55 +0000 (11:21 -0800)]
openvswitch: Unionize ovs_key_ct_label with a u32 array.
Make the array of labels in struct ovs_key_ct_label an union, adding a
u32 array of the same byte size as the existing u8 array. It is
faster to loop through the labels 32 bits at the time, which is also
the alignment of netlink attributes.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Joe Stringer <joe@ovn.org> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Jarno Rajahalme [Thu, 9 Feb 2017 19:21:54 +0000 (11:21 -0800)]
openvswitch: Do not trigger events for unconfirmed connections.
Receiving change events before the 'new' event for the connection has
been received can be confusing. Avoid triggering change events for
setting conntrack mark or labels before the conntrack entry has been
confirmed.
Fixes: 182e3042e15d ("openvswitch: Allow matching on conntrack mark") Fixes: c2ac66735870 ("openvswitch: Allow matching on conntrack label") Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Joe Stringer <joe@ovn.org> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Jarno Rajahalme [Thu, 9 Feb 2017 19:21:53 +0000 (11:21 -0800)]
openvswitch: Use inverted tuple in ovs_ct_find_existing() if NATted.
The conntrack lookup for existing connections fails to invert the
packet 5-tuple for NATted packets, and therefore fails to find the
existing conntrack entry. Conntrack only stores 5-tuples for incoming
packets, and there are various situations where a lookup on a packet
that has already been transformed by NAT needs to be made. Looking up
an existing conntrack entry upon executing packet received from the
userspace is one of them.
This patch fixes ovs_ct_find_existing() to invert the packet 5-tuple
for the conntrack lookup whenever the packet has already been
transformed by conntrack from its input form as evidenced by one of
the NAT flags being set in the conntrack state metadata.
Fixes: 05752523e565 ("openvswitch: Interface with NAT.") Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Joe Stringer <joe@ovn.org> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Jarno Rajahalme [Thu, 9 Feb 2017 19:21:52 +0000 (11:21 -0800)]
openvswitch: Fix comments for skb->_nfct
Fix comments referring to skb 'nfct' and 'nfctinfo' fields now that
they are combined into '_nfct'.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Joe Stringer <joe@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 10 Feb 2017 03:27:08 +0000 (22:27 -0500)]
Merge branch 'ena-bug-fixes'
Netanel Belgazal says:
====================
Bug Fixes in ENA driver
Changes from V3:
* Rebase patchset to master and solve merge conflicts.
* Remove redundant bug fix (fix error handling when probe fails)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
net/ena: change condition for host attribute configuration
Move the host info config to be the first admin command that is executed.
This change require the driver to remove the 'feature check'
from host info configuration flow.
The check is removed since the supported features bitmask field
is retrieved only after calling ENA_ADMIN_DEVICE_ATTRIBUTES admin command.
If set host info is not supported an error will be returned by the device.
Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net/ena: fix potential access to freed memory during device reset
If the ena driver detects that the device is not behave as expected,
it tries to reset the device.
The reset flow calls ena_down, which will frees all the resources
the driver allocates and then it will reset the device.
This flow can cause memory corruption if the device is still writes
to the driver's memory space.
To overcome this potential race, move the reset before the device
resources are freed.
Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net/ena: refactor ena_get_stats64 to be atomic context safe
ndo_get_stat64() can be called from atomic context, but the current
implementation sends an admin command to retrieve the statistics from
the device. This admin command can sleep.
This patch re-factors the implementation of ena_get_stats64() to use
the {rx,tx}bytes/count from the driver's inner counters, and to obtain
the rx drop counter from the asynchronous keep alive (heart bit)
event.
Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net/ena: fix NULL dereference when removing the driver after device reset failed
If for some reason the device stops responding, and the device reset
failes to recover the device, the mmio register read data structure
will not be reinitialized.
On driver removal, the driver will also try to reset the device, but
this time the mmio data structure will be NULL.
To solve this issue, perform the device reset in the remove function
only if the device is runnig.
ENA default hash configures IPv4_frag hash twice instead of
configure non-IP packets.
The bug caused IPv4 fragmented packets to be calculated based on
L2 source and destination address instead of L3 source and destination.
IPv4 packets can reach to the wrong Rx queue.
Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The ENA driver tries to open a queue per vCPU.
To determine how many vCPUs the instance have it uses num_possible_cpus()
while it should have use num_online_cpus() instead.
Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com> Signed-off-by: David S. Miller <davem@davemloft.net>