mlxsw: reg: Add Policy-Engine Region Configuration Register
The PERCR register configures the region parameters such as whether to
consult the bloom filter before performing a lookup using a specific
eRP.
For C-TCAM only usage we don't need to accurately set the master mask.
Instead, we can set all of its bits to make sure all the extracted keys
are actually used.
Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
mlxsw: acl: Introduce activity get operation for action block/set
In Spectrum-2, activity cannot be find out by TCAM rule (PTCEv2 register),
but rather by associated action set. For that purpose, extend action ops
to allow query activity from PEFA register. Block activity is decided
according to activity of the first set.
Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
mlxsw: reg: Add support for activity information from PEFA register
In Spectrum-2, the PEFA register is extend to report if the action set
was hit during processing of packets. Introduce this extension and
adjust the code around this accordingly.
Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
mlxsw: spectrum: Add Spectrum-2 variant of flex actions ops
In Spectrum-2, no action set is stored directly in TCAM, all are located
in KVD linear. So ask core to treat the first set as dummy empty one,
to be just used for PTCEV2 purposes.
Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
mlxsw: spectrum: Add KVDL manager implementation for Spectrum-2
In Spectrum-2, KVD linear indexes are hashed into KVD hash. Therefore it
is possible for multiple resource types to use same indexes. There are
multiple index spaces. Also, the index space is bigger than the actual
KVD hash area, which allows to have holes in the index space without any
penalization. The HW has to be told in case the index for particular
resource type is no longer used so it can be freed from KVD hash. IEDR
register is used for that.
Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Commit b6fb0df12db6 ("RDS/IB: Make ib_recv_refill return void") did
not change the comment accordingly.
Fixes: b6fb0df12db6 ("RDS/IB: Make ib_recv_refill return void") Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.ccom> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 18 Jul 2018 05:21:49 +0000 (14:21 +0900)]
Merge branch 'ravb-small-sparse-fixes'
Niklas Söderlund says:
====================
ravb: small sparse fixes
This are fixes that have bugged me whenever I run sparse to check my own
changes to the driver. It's based on the latest net-next tree and tested
on M3-N.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
ravb: fix byte order for TX descriptor tag field lower bits
The wrong helper is used to swap the bytes when adding the lower bits of
the TX descriptors tag field in the shared ds_tagl variable. The
variable contains the DS[11:0] field and then the TAG[3:0] bits.
The mistake was highlighted by the sparse warning:
ravb_main.c:1622:31: left side has type restricted __le16
ravb_main.c:1622:31: right side has type unsigned short
ravb_main.c:1622:31: warning: invalid assignment: |=
ravb_main.c:1622:34: warning: cast to restricted __le16
Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
ravb_main.c:1257 ravb_get_strings() error: memcpy() '*ravb_gstrings_stats' too small (32 vs 960)
Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
ravb: fix shadowing of symbol 'stats' in ravb_get_ethtool_stats()
Inside a loop in ravb_get_ethtool_stats() a variable 'stats' is declared
resulting in the argument also named 'stats' to be shadowed. Fix this
warning by renaming the unused argument 'stats' to 'estats'.
This fixes the sparse warning:
ravb_main.c:1225:36: originally declared here
ravb_main.c:1233:41: warning: symbol 'stats' shadows an earlier one
Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
This adds a driver core for the Realtek SMI chips and a
subdriver for the RTL8366RB. I just added this chip simply
because it is all I can test.
The code is a massaged variant of the code that has been
sitting out-of-tree in OpenWRT for years in the absence of
a proper switch subsystem. This creates a DSA driver for it.
I have tried to credit the original authors wherever
possible.
The main changes I've done from the OpenWRT code:
- Added an IRQ chip inside the RTL8366RB switch to demux and
handle the line state IRQs.
- Distributed the phy handling out to the PHY driver.
- Added some RTL8366RB code that was missing in the driver at
the time, such as setting up "green ethernet" with a funny
jam table and forcing MAC5 (the CPU port) into 1 GBit.
- Select jam table and add the default jam table from the
vendor driver, also for ASIC "version 0" if need be.
- Do not store jam tables in the device tree, store them
in the driver.
- Pick in the "initvals" jam tables from OpenWRT's driver
and make those get selected per compatible for the
whole system. It's apparently about electrical settings
for this system and whatnot, not really configuration
from device tree.
- Implemented LED control: beware of bugs because there are
no LEDs on the device I am using!
We do not implement custom DSA tags. This is explained in
a comment in the driver as well: this "tagging protocol" is
not simply a few extra bytes tagged on to the ethernet
frame as DSA is used to. Instead, enabling the CPU tags
will make the switch start talking Realtek RRCP internally.
For example a simple ping will make this kind of packets
appear inside the switch:
As you can see a custom "8899" tagged packet using the
protocol 0xa2. Norm RRCP appears to always have this
protocol set to 0x01 according to OpenRRCP. You can also
see that this is not a ping packet at all, instead the
switch is starting to talk network management issues
with the CPU port.
So for now custom "tagging" is disabled.
This was tested on the D-Link DIR-685 with initramfs and
OpenWRT userspaces and works fine on all the LAN ports
(lan0 .. lan3). The WAN port is yet not working.
Cc: Antti Seppälä <a.seppala@gmail.com> Cc: Roman Yeryomin <roman@advem.lv> Cc: Colin Leitner <colin.leitner@googlemail.com> Cc: Gabor Juhos <juhosg@openwrt.org> Cc: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
The Realtek SMI family is a set of DSA chips that provide
switching in routers. This binding just follows the pattern
set by other switches but with the introduction of an embedded
irqchip to demux and handle the interrupts fired by the single
line from the chip.
This interrupt construction is similar to how we handle
interrupt controllers inside PCI bridges etc.
Cc: Antti Seppälä <a.seppala@gmail.com> Cc: Roman Yeryomin <roman@advem.lv> Cc: Colin Leitner <colin.leitner@googlemail.com> Cc: Gabor Juhos <juhosg@openwrt.org> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: devicetree@vger.kernel.org Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
The RTL8366RB is an ASIC with five internal PHYs for
LAN0..LAN3 and WAN. The PHYs are spawn off the main
device so they can be handled in a distributed manner
by the Realtek PHY driver. All that is really needed
is the power save feature enablement and letting the
PHY driver core pick up the IRQ from the switch chip.
Cc: Antti Seppälä <a.seppala@gmail.com> Cc: Roman Yeryomin <roman@advem.lv> Cc: Colin Leitner <colin.leitner@googlemail.com> Cc: Gabor Juhos <juhosg@openwrt.org> Cc: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
The removed code would be called in two situations:
1. interface is brought up never or >10s after driver load
2. after close()
Case 1 we can handle cleaner by ensuring chip is powered down when
leaving probe(). open() callback will power up the chip.
In case 2 we call rtl_pll_power_down() twice currently, from the
close() callback and 10s later when entering runtime-suspend.
This is avoided by this patch.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 18 Jul 2018 01:02:11 +0000 (10:02 +0900)]
Merge branch 'HWMON-support-for-SFP-modules'
Andrew Lunn says:
====================
HWMON support for SFP modules
This patchset adds HWMON support to SFP modules. The two patches add
some attributes for temperature and power sensors which are currently
missing from the hwmon core. The third patch adds a helper for
filtering out characters in hwmon names which are invalid. The last
patch then extends the core SFP code to export the sensors found in
SFP modules.
This code has been tested with two SFP modules:
module OEM SFP-7000-85 rev 11.0 sn M1512220075 dc 160221
module FINISAR CORP. FTLF8524E2GNL rev A sn PW40MNN dc 160725
The anonymous module uses external calibration, while the FINISAR uses
internal calibration. Thus both code paths have been tested.
Due to the cross subsystem nature of these patches, as discussed with
the RFC, it is hoped Guenter Roeck will ACK the patches, and then Dave
Miller will merge them all via net-next.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Tue, 17 Jul 2018 19:48:13 +0000 (21:48 +0200)]
net: phy: sfp: Add HWMON support for module sensors
SFP modules can contain a number of sensors. The EEPROM also contains
recommended alarm and critical values for each sensor, and indications
of if these have been exceeded. Export this information via
HWMON. Currently temperature, VCC, bias current, transmit power, and
possibly receiver power is supported.
The sensors in the modules can either return calibrate or uncalibrated
values. Uncalibrated values need to be manipulated, using coefficients
provided in the SFP EEPROM. Uncalibrated receive power values require
floating point maths in order to calibrate them. Performing this in
the kernel is hard. So if the SFP module indicates it uses
uncalibrated values, RX power is not made available.
With this hwmon device, it is possible to view the sensor values using
lm-sensors programs:
in0: +3.29 V (crit min = +2.90 V, min = +3.00 V)
(max = +3.60 V, crit max = +3.70 V)
temp1: +33.0°C (low = -5.0°C, high = +80.0°C)
(crit low = -10.0°C, crit = +85.0°C)
power1: 1000.00 nW (max = 794.00 uW, min = 50.00 uW) ALARM (LCRIT)
(lcrit = 40.00 uW, crit = 1000.00 uW)
curr1: +0.00 A (crit min = +0.00 A, min = +0.00 A) ALARM (LCRIT, MIN)
(max = +0.01 A, crit max = +0.01 A)
The scaling sensors performs on the bias current is not particularly
good. The raw values are more useful:
In order to keep the I2C overhead to a minimum, the constant values,
such as limits and calibration coefficients are read once at module
insertion time. Thus only reading *_input and *_alarm properties
requires i2c read operations.
Signed-off-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Tue, 17 Jul 2018 19:48:12 +0000 (21:48 +0200)]
hwmon: Add helper to tell if a char is invalid in a name
HWMON device names are not allowed to contain "-* \t\n". Add a helper
which will return true if passed an invalid character. It can be used
to massage a string into a hwmon compatible name by replacing invalid
characters with '_'.
Signed-off-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Tue, 17 Jul 2018 19:48:11 +0000 (21:48 +0200)]
hwmon: Add support for power min, lcrit, min_alarm and lcrit_alarm
Some sensors support reporting minimal and lower critical power, as
well as alarms when these thresholds are reached. Add support for
these attributes to the hwmon core.
Signed-off-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 18 Jul 2018 00:46:40 +0000 (09:46 +0900)]
Merge branch 'r8169-add-phylib-support'
Heiner Kallweit says:
====================
r8169: add phylib support
Now that all the basic refactoring has been done we can add phylib
support. This patch series was successfully tested on:
RTL8168h
RTL8168evl
RTL8169sb
Changes in v2:
- return error in mdio ops if phyaddr > 0
- advertise pause modes
- added reviewed-by for several patches
Changes in v3:
- return ENODEV for unused phy addresses in mdio ops
- remove unneeded PHY suspend in patch 2
- use recently added phy_speed_down and phy_speed_up in patch 7
- other minor changes based on review comments
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of accessing the PHYstatus register we can use the information
phylib stores in the phy_device structure.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
r8169: remove mii_if_info member from struct rtl8169_private
The only remaining usage of the struct mii_if_info member is to store the
information whether the chip is GMII-capable. So we can replace it with
a simple flag.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Use phy_ethtool_(g|s)et_link_ksettings() for the respective ethtool_ops
callbacks.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
r8169: replace open-coded PHY soft reset with genphy_soft_reset
Use genphy_soft_reset() instead of open-coding a PHY soft reset. We have
to do an explicit PHY soft reset because some chips use the genphy driver
which uses a no-op as soft_reset callback.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Use phy_resume() / phy_suspend() instead of open coding this functionality.
The chip version specific differences are handled by the respective PHY
drivers.
The call to r8168_phy_power_down() in r8168_pll_power_down() can be
removed because phylib takes care now. The relevant scenarios are:
- rtl8169_close(): phy_disconnect() powers down PHY
- suspend: mdio_bus_phy_suspend() takes care
- runtime-suspend: WoL is active, don't suspend PHY
- rtl_shutdown(): no need to power down PHY
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Rick Farrington [Tue, 17 Jul 2018 01:06:07 +0000 (18:06 -0700)]
liquidio: correct error msg text when removing VLAN ID
Signed-off-by: Rick Farrington <ricardo.farrington@cavium.com> Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 17 Jul 2018 00:02:04 +0000 (17:02 -0700)]
net: Fix GRO_HASH_BUCKETS assertion.
FIELD_SIZEOF() is in bytes, but we want bits.
Fixes: d9f37d01e294 ("net: convert gro_count to bitmask") Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
sch_cake: Fix tin order when set through skb->priority
In diffserv mode, CAKE stores tins in a different order internally than
the logical order exposed to userspace. The order remapping was missing
in the handling of 'tc filter' priority mappings through skb->priority,
resulting in bulk and best effort mappings being reversed relative to
how they are displayed.
Fix this by adding the missing mapping when reading skb->priority.
Fixes: 83f8fd69af4f ("sch_cake: Add DiffServ handling") Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
Rick Farrington [Fri, 13 Jul 2018 19:50:21 +0000 (12:50 -0700)]
liquidio: fix hang when re-binding VF host drv after running DPDK VF driver
When configuring SLI_PKTn_OUTPUT_CONTROL, VF driver was assuming that IPTR
mode was disabled by reset, which was not true. Since DPDK driver had
set IPTR mode previously, the VF driver (which uses buf-ptr-only mode) was
not properly handling DROQ packets (i.e. it saw zero-length packets).
This represented an invalid hardware configuration which the driver could
not handle.
Signed-off-by: Rick Farrington <ricardo.farrington@cavium.com> Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Antoine Tenart [Fri, 13 Jul 2018 14:57:57 +0000 (16:57 +0200)]
net: mscc: simplify retrieving the tag type from the frame header
The tag type in the frame extraction header is only a bit wide. There's
no need to use GENMASK when retrieving the information. This patch
simplify the code by dropping GENMASK and using BIT instead.
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com> Reviewed-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
cxgb4: do not return DUPLEX_UNKNOWN when link is down
We were returning DUPLEX_UNKNOWN in get_link_ksettings() when
the link was down. Unfortunately, this causes a problem when
"ethtool -s autoneg on" is issued for a link which is down because
the ethtool code first reads the settings and then reapplies them
with only the changes provided on the command line. Which results
in us diving into set_link_ksettings() with DUPLEX_UNKNOWN which is
not DUPLEX_FULL, so set_link_ksettings() throws an -EINVAL error.
do not return DUPLEX_UNKNOWN to fix the issue.
Signed-off-by: Casey Leedom <leedom@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Li RongQing [Fri, 13 Jul 2018 06:41:36 +0000 (14:41 +0800)]
net: convert gro_count to bitmask
gro_hash size is 192 bytes, and uses 3 cache lines, if there is few
flows, gro_hash may be not fully used, so it is unnecessary to iterate
all gro_hash in napi_gro_flush(), to occupy unnecessary cacheline.
convert gro_count to a bitmask, and rename it as gro_bitmask, each bit
represents a element of gro_hash, only flush a gro_hash element if the
related bit is set, to speed up napi_gro_flush().
and update gro_bitmask only if it will be changed, to reduce cache
update
Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Li RongQing <lirongqing@baidu.com> Cc: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
bnxt_en: remove redundant debug register dma mem allocation
hwrm_dbg_resp_addr and hwrm_dbg_resp_dma_addr are never used
and can be removed.
Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: phy: realtek: add missing entry for RTL8211C to mdio_device_id table
Add missing entry for RTL8211C to mdio_device_id table.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Fixes: cf87915cb9f8 ("net: phy: realtek: add support for RTL8211C") Signed-off-by: David S. Miller <davem@davemloft.net>
net: usb: hso: use swap macro in hso_kick_transmit
Make use of the swap macro and remove unnecessary variable *temp*.
This makes the code easier to read and maintain. Also, slightly
refactor some code due to the removal of *temp*.
This code was detected with the help of Coccinelle.
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Some network drivers include functionality to speed down the PHY when
suspending and just waiting for a WoL packet because this saves energy.
This functionality is quite generic, therefore let's factor it out to
phylib.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
This functionality will also be needed in subsequent patches of this
series, therefore factor it out to a helper.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Dave Watson [Thu, 12 Jul 2018 17:59:20 +0000 (10:59 -0700)]
selftests: tls: add selftests for TLS sockets
Add selftests for tls socket. Tests various iov and message options,
poll blocking and nonblocking behavior, partial message sends / receives,
and control message data. Tests should pass regardless of if TLS
is enabled in the kernel or not, and print a warning message if not.
Signed-off-by: Dave Watson <davejwatson@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
This is my first patch set to net-next. Please shout loud and clear if
I've botched anything.
Recently failover and net_failover modules were added to the mainline.
Documentation was included in rst format but they were not added to the
toctree in `networking/index.rst`. Also building docs for net_failover
is currently emitting a few warnings.
Patch 1 adds failover and net_failover to the index toctree
Patch 2 fixes the build warnings for net_failover
I haven't been super active on netdev list so if there is some reason I
missed why these files are not in the index please do say so.
Has there been any discussion on preferred order for the toctree index
list? I just added them to the bottom of the list.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently building the net_failover docs causes a bunch of warnings to
be emitted. These warnings are all related to indentation and correctly
highlight missing '::' (for code sections). It looks, from other rst
files in Documentation, that the first column should be indented 2
spaces.
Add '::' before code snippets and indent all snippets uniformly starting
with 2 spaces.
Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Tobin C. Harding <me@tobin.cc> Signed-off-by: David S. Miller <davem@davemloft.net>
net: hns3: Fix comments for hclge_get_ring_chain_from_mbx
Actually, hclge_get_ring_chain_from_mbx is used to get ring type, tqp id,
and int_gl index from mailbox message. So the comments is incorrect. This
patch fixes it.
Fixes: dde1a86e93ca ("net: hns3: Add mailbox support to PF driver") Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: hns3: Fix for using wrong mask and shift in hclge_get_ring_chain_from_mbx
HCLGE_INT_GL_IDX_M and HCLGE_INT_GL_IDX_S are used to set fireware
cmd. When getting int_gl value from mailbox message, we should use
HNAE3_RING_GL_IDX_M and HNAE3_RING_GL_IDX_S.
Fixes: 79eee4108541 ("net: hns3: add int_gl_idx setup for VF") Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yunsheng Lin [Mon, 16 Jul 2018 15:36:25 +0000 (16:36 +0100)]
net: hns3: Fix for reset_level default assignment probelm
handle->reset_level is assigned to HNAE3_NONE_RESET when client is
initialized, if a tx timeout happens right after initialization,
then handle->reset_level is not resetted to HNAE3_FUNC_RESET in
hclge_reset_event, which will cause reset event not properly
handled problem.
This patch fixes it by setting handle->reset_level properly when
client is initialized.
Fixes: 6d4c3981a8d8 ("net: hns3: Changes to make enet watchdog timeout func common for PF/VF") Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Huazhong Tan [Mon, 16 Jul 2018 15:36:24 +0000 (16:36 +0100)]
net: hns3: remove unnecessary ring configuration operation while resetting
The configuration of the ring will be used to reinitialize the
ring after the hardware reset is completed. So we should not
release and reacquire this configuration during reset.
Fixes: bb6b94a896d4 ("net: hns3: Add reset interface implementation in client") Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Huazhong Tan [Mon, 16 Jul 2018 15:36:23 +0000 (16:36 +0100)]
net: hns3: Fix return value error in hns3_reset_notify_down_enet
When doing reset, netdev has not been brought up is not an error,
it means that we do not need do the stop operation, so just return
zero.
Fixes: 76ad4f0ee747 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC") Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Huazhong Tan [Mon, 16 Jul 2018 15:36:22 +0000 (16:36 +0100)]
net: hns3: Correct reset event status register
According to hardware's description, driver should get reset event
from VECTOR0_PF_OTHER_INT_ST(0x20800) instead of
VECTOR0_PF_OTHER_INT_SRC(0x20700).
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Huazhong Tan [Mon, 16 Jul 2018 15:36:21 +0000 (16:36 +0100)]
net: hns3: Prevent to request reset frequently
Netdevice reset should not be requested frequently, a new one
must wait a moment since there may be some work not completed.
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Huazhong Tan [Mon, 16 Jul 2018 15:36:20 +0000 (16:36 +0100)]
net: hns3: Reset net device with rtnl_lock
Since current locking was not covering certain code where
netdev was being accessed or manipulated, this patch fixes
it.
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Huazhong Tan [Mon, 16 Jul 2018 15:36:19 +0000 (16:36 +0100)]
net: hns3: Modify the order of initializing command queue register
According to hardware's description, the head pointer register should
be written before the tail pointer register while doing command queue
initialization.
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The following series provides TLS RX inline crypto offload.
v5->v4:
- Remove the Kconfig to mutually exclude both IPsec and TLS
v4->v3:
- Remove the iov revert for zero copy send flow
v2->v3:
- Fix typo
- Adjust cover letter
- Fix bug in zero copy flows
- Use network byte order for the record number in resync
- Adjust the sequence provided in resync
v1->v2:
- Fix bisectability problems due to variable name changes
- Fix potential uninitialized return value
This series completes the generic infrastructure to offload TLS crypto to
a network devices. It enables the kernel TLS socket to skip decryption and
authentication operations for SKBs marked as decrypted on the receive
side of the data path. Leaving those computationally expensive operations
to the NIC.
This infrastructure doesn't require a TCP offload engine. Instead, the
NIC decrypts a packet's payload if the packet contains the expected TCP
sequence number. The TLS record authentication tag remains unmodified
regardless of decryption. If the packet is decrypted successfully and it
contains an authentication tag, then the authentication check has passed.
Otherwise, if the authentication fails, then the packet is provided
unmodified and the KTLS layer is responsible for handling it.
Out-Of-Order TCP packets are provided unmodified. As a result,
in the slow path some of the SKBs are decrypted while others remain as
ciphertext.
The GRO and TCP layers must not coalesce decrypted and non-decrypted SKBs.
At the worst case a received TLS record consists of both plaintext
and ciphertext packets. These partially decrypted records must be
reencrypted, only to be decrypted.
The notable differences between SW KTLS and NIC offloaded TLS
implementations are as follows:
1. Partial decryption - Software must handle the case of a TLS record
that was only partially decrypted by HW. This can happen due to packet
reordering.
2. Resynchronization - tls_read_size calls the device driver to
resynchronize HW whenever it lost track of the TLS record framing in
the TCP stream.
The infrastructure should be extendable to support various NIC offload
implementations. However it is currently written with the
implementation below in mind:
The NIC identifies packets that should be offloaded according to
the 5-tuple and the TCP sequence number. If these match and the
packet is decrypted and authenticated successfully, then a syndrome
is provided to software. Otherwise, the packet is unmodified.
Decrypted and non-decrypted packets aren't coalesced by the network stack,
and the KTLS layer decrypts and authenticates partially decrypted records.
The NIC provides an indication whenever a resync is required. The resync
operation is triggered by the KTLS layer while parsing TLS record headers.
Finally, we measure the performance obtained by running single stream
iperf with two Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz machines connected
back-to-back with Innova TLS (40Gbps) NICs. We compare TCP (upper bound)
and KTLS-Offload running both in Tx and Rx. The results show that the
performance of offload is comparable to TCP.
Boris Pismenny [Fri, 13 Jul 2018 11:33:48 +0000 (14:33 +0300)]
net/mlx5e: TLS, add Innova TLS rx data path
Implement the TLS rx offload data path according to the
requirements of the TLS generic NIC offload infrastructure.
Special metadata ethertype is used to pass information to
the hardware.
When hardware loses synchronization a special resync request
metadata message is used to request resync.
Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Boris Pismenny [Fri, 13 Jul 2018 11:33:47 +0000 (14:33 +0300)]
net/mlx5e: TLS, add innova rx support
Add the mlx5 implementation of the TLS Rx routines to add/del TLS
contexts, also add the tls_dev_resync_rx routine
to work with the TLS inline Rx crypto offload infrastructure.
Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Boris Pismenny [Fri, 13 Jul 2018 11:33:46 +0000 (14:33 +0300)]
net/mlx5: Accel, add TLS rx offload routines
In Innova TLS, TLS contexts are added or deleted
via a command message over the SBU connection.
The HW then sends a response message over the same connection.
Complete the implementation for Innova TLS (FPGA-based) hardware by
adding support for rx inline crypto offload.
Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Boris Pismenny [Fri, 13 Jul 2018 11:33:45 +0000 (14:33 +0300)]
net/mlx5e: TLS, refactor variable names
For symmetry, we rename mlx5e_tls_offload_context to
mlx5e_tls_offload_context_tx before we add mlx5e_tls_offload_context_rx.
Signed-off-by: Boris Pismenny <borisp@mellanox.com> Reviewed-by: Aviad Yehezkel <aviadye@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Boris Pismenny [Fri, 13 Jul 2018 11:33:44 +0000 (14:33 +0300)]
tls: Fix zerocopy_from_iter iov handling
zerocopy_from_iter iterates over the message, but it doesn't revert the
updates made by the iov iteration. This patch fixes it. Now, the iov can
be used after calling zerocopy_from_iter.
Fixes: 3c4d75591 ("tls: kernel TLS support") Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Boris Pismenny [Fri, 13 Jul 2018 11:33:43 +0000 (14:33 +0300)]
tls: Add rx inline crypto offload
This patch completes the generic infrastructure to offload TLS crypto to a
network device. It enables the kernel to skip decryption and
authentication of some skbs marked as decrypted by the NIC. In the fast
path, all packets received are decrypted by the NIC and the performance
is comparable to plain TCP.
This infrastructure doesn't require a TCP offload engine. Instead, the
NIC only decrypts packets that contain the expected TCP sequence number.
Out-Of-Order TCP packets are provided unmodified. As a result, at the
worst case a received TLS record consists of both plaintext and ciphertext
packets. These partially decrypted records must be reencrypted,
only to be decrypted.
The notable differences between SW KTLS Rx and this offload are as
follows:
1. Partial decryption - Software must handle the case of a TLS record
that was only partially decrypted by HW. This can happen due to packet
reordering.
2. Resynchronization - tls_read_size calls the device driver to
resynchronize HW after HW lost track of TLS record framing in
the TCP stream.
Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Boris Pismenny [Fri, 13 Jul 2018 11:33:41 +0000 (14:33 +0300)]
tls: Split tls_sw_release_resources_rx
This patch splits tls_sw_release_resources_rx into two functions one
which releases all inner software tls structures and another that also
frees the containing structure.
In TLS_DEVICE we will need to release the software structures without
freeeing the containing structure, which contains other information.
Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Boris Pismenny [Fri, 13 Jul 2018 11:33:40 +0000 (14:33 +0300)]
tls: Split decrypt_skb to two functions
Previously, decrypt_skb also updated the TLS context.
Now, decrypt_skb only decrypts the payload using the current context,
while decrypt_skb_update also updates the state.
Later, in the tls_device Rx flow, we will use decrypt_skb directly.
Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Boris Pismenny [Fri, 13 Jul 2018 11:33:38 +0000 (14:33 +0300)]
tcp: Don't coalesce decrypted and encrypted SKBs
Prevent coalescing of decrypted and encrypted SKBs in GRO
and TCP layer.
Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds a netdev feature to configure TLS RX inline crypto offload.
Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com> Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Boris Pismenny [Fri, 13 Jul 2018 11:33:35 +0000 (14:33 +0300)]
net: Add decrypted field to skb
The decrypted bit is propogated to cloned/copied skbs.
This will be used later by the inline crypto receive side offload
of tls.
Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The PPv2 Header Parser and Classifier are not straightforward to debug,
having easy access to some of the many lookup tables configuration is
helpful during development and debug.
This series adds a basic debugfs interface, allowing to read data from
the Header Parser and some of the Classifier tables.
For now, the interface is read-only, and contains only some basic info.
This was actually used during RSS development, and might be useful to
troubleshoot some issues we might find.
The first patch of the series converts the mvpp2 files to SPDX, which
eases adding the new debugfs dedicated file.
The second patch adds the interface, and exposes basic Header Parser data.
The 3rd patch adds a hit counter for the Header Parser TCAM.
The 4th patch exposes classifier info.
The 5th patch adds some hit counters for some of the classifier engines.
Changes since V1:
- Rebased on the lastest net-next
- Made cls_flow_get non static so that it can be used in mvpp2_debugfs
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The classification operations that are used for RSS make use of several
lookup tables. Having hit counters for these tables is really helpful
to determine what flows were matched by ingress traffic, and see the
path of packets among all the classifier tables.
This commit adds hit counters for the 3 tables used at the moment :
- The decoding table (also called lookup_id table), that links flows
identified by the Header Parser to the flow table.
There's one entry per flow, located at :
.../mvpp2/<controller>/flows/XX/dec_hits
Note that there are 21 flows in the decoding table, whereas there are
52 flows in the Header Parser. That's because there are several kind
of traffic that will match a given flow. Reading the hit counter from
one sub-flow will clear all hit counter that have the same flow_id.
This also applies to the flow_hits.
- The flow table, that contains all the different lookups to be
performed by the classifier for each packet of a given flow. The match
is done on the first entry of the flow sequence.
- The C2 engine entries, that are used to assign the default rx queue,
and enable or disable RSS for a given port.
There's one entry per flow, located at:
.../mvpp2/<controller>/flows/XX/flow_hits
There is one C2 entry per port, so the c2 hit counter is located at :
.../mvpp2/<controller>/ethX/c2_hits
All hit counter values are 16-bits clear-on-read values.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: mvpp2: debugfs: add entries for classifier flows
The classifier configuration for RSS is quite complex, with several
lookup tables being used. This commit adds useful info in debugfs to
see how the different tables are configured :
Added 2 new entries in the per-port directory :
- .../eth0/default_rxq : The default rx queue on that port
- .../eth0/rss_enable : Indicates if RSS is enabled in the C2 entry
Added the 'flows' directory :
It contains one entry per sub-flow. a 'sub-flow' is a unique path from
Header Parser to the flow table. Multiple sub-flows can point to the
same 'flow' (each flow has an id from 8 to 29, which is its index in the
Lookup Id table) :
- .../flows/00/...
/01/...
...
/51/id : The flow id. There are 21 unique flows. There's one
flow per combination of the following parameters :
- L4 protocol (TCP, UDP, none)
- L3 protocol (IPv4, IPv6)
- L3 parameters (Fragmented or not)
- L2 parameters (Vlan tag presence or not)
.../type : The flow type. This is an even higher level flow,
that we manipulate with ethtool. It can be :
"udp4" "tcp4" "udp6" "tcp6" "ipv4" "ipv6" "other".
.../eth0/...
.../eth1/engine : The hash generation engine used for this
flow on the given port
.../hash_opts : The hash generation options indicating on
what data we base the hash (vlan tag, src
IP, src port, etc.)
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: mvpp2: debugfs: add hit counter stats for Header Parser entries
One helpful feature to help debug the Header Parser TCAM filter in PPv2
is to be able to see if the entries did match something when a packet
comes in. This can be done by using the built-in hit counter for TCAM
entries.
This commit implements reading the counter, and exposing its value on
debugfs for each filter entry.
The counter is a 16-bits clear-on-read value, located at:
.../mvpp2/<controller>/parser/XXX/hits
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: mvpp2: add a debugfs interface for the Header Parser
Marvell PPv2 Packer Header Parser has a TCAM based filter, that is not
trivial to configure and debug. Being able to dump TCAM entries from
userspace can be really helpful to help development of new features
and debug existing ones.
This commit adds a basic debugfs interface for the PPv2 driver, focusing
on TCAM related features.
Antoine Tenart [Sat, 14 Jul 2018 11:29:24 +0000 (13:29 +0200)]
net: mvpp2: switch to SPDX identifiers
Use the appropriate SPDX license identifiers and drop the license text.
This patch is only cosmetic.
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com> Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The following pull-request contains BPF updates for your *net-next* tree.
The main changes are:
1) Various different arm32 JIT improvements in order to optimize code emission
and make the JIT code itself more robust, from Russell.
2) Support simultaneous driver and offloaded XDP in order to allow for advanced
use-cases where some work is offloaded to the NIC and some to the host. Also
add ability for bpftool to load programs and maps beyond just the cgroup case,
from Jakub.
3) Add BPF JIT support in nfp for multiplication as well as division. For the
latter in particular, it uses the reciprocal algorithm to emulate it, from Jiong.
4) Add BTF pretty print functionality to bpftool in plain and JSON output
format, from Okash.
5) Add build and installation to the BPF helper man page into bpftool, from Quentin.
6) Add a TCP BPF callback for listening sockets which is triggered right after
the socket transitions to TCP_LISTEN state, from Andrey.
7) Add a new cgroup tree command to bpftool which iterates over the whole cgroup
tree and prints all attached programs, from Roman.
8) Improve xdp_redirect_cpu sample to support parsing of double VLAN tagged
packets, from Jesper.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
selftests/bpf: Test case for BPF_SOCK_OPS_TCP_LISTEN_CB
Cover new TCP-BPF callback in test_tcpbpf: when listen() is called on
socket, set BPF_SOCK_OPS_STATE_CB_FLAG so that BPF_SOCK_OPS_STATE_CB
callback can be called on future state transition, and when such a
transition happens (TCP_LISTEN -> TCP_CLOSE), track it in the map and
verify it in user space later.
Reduce amount of copy/paste for debug info when result is verified in
the test and keep that info together with values being checked so that
they won't get out of sync.
It also improves debug experience: instead of checking manually what
doesn't match in debug output for all fields, only unexpected field is
printed.