David S. Miller [Fri, 7 Apr 2017 14:03:35 +0000 (07:03 -0700)]
Merge branch 'net_device_stats'
Tobias Klauser says:
====================
Use net_device_stats from struct net_device
Along the lines of previous patches, switch (almost) all remaining net
drivers to use net_device_stats from net_device instead of including a
copy of it in their netdev_priv struct.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
usbnet: pegasus: Use net_device_stats from struct net_device
Instead of using a private copy of struct net_device_stats in struct
pegasus, use stats from struct net_device. Also remove the now
unnecessary .ndo_get_stats function.
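As a rough illustration of the pattern applied throughout this series (a minimal sketch with made-up names, not the actual pegasus code):

  /* Before: a private copy of the counters plus a trivial accessor */
  struct foo_priv {
          struct net_device_stats stats;          /* private copy, removed */
          /* ... */
  };

  static struct net_device_stats *foo_get_stats(struct net_device *dev)
  {
          struct foo_priv *priv = netdev_priv(dev);
          return &priv->stats;                    /* .ndo_get_stats hook, removed */
  }

  /* After: bump the counters embedded in struct net_device directly.
   * With no .ndo_get_stats/.ndo_get_stats64 hook, the core falls back
   * to dev->stats on its own.
   */
  dev->stats.rx_packets++;
  dev->stats.rx_bytes += skb->len;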
Cc: Petko Manolov <petkan@nucleusys.com> Cc: linux-usb@vger.kernel.org Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
usbnet: kaweth: Use net_device_stats from struct net_device
Instead of using a private copy of struct net_device_stats in struct
kaweth_device, use stats from struct net_device. Also remove the now
unnecessary .ndo_get_stats function.
Cc: linux-usb@vger.kernel.org Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
net: nuvoton: Use net_device_stats from struct net_device
Instead of using a private copy of struct net_device_stats in
struct w90p910_ether, use stats from struct net_device. Also remove
the now unnecessary .ndo_get_stats function.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
net: moxa: Use net_device_stats from struct net_device
Instead of using a private copy of struct net_device_stats in struct
moxart_mac_priv_t, use stats from struct net_device. Also remove the now
unnecessary .ndo_get_stats function.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
net: dl2k: Use net_device_stats from struct net_device
Instead of using a private copy of struct net_device_stats in struct
netdev_private, use stats from struct net_device. Also remove the now
unnecessary .ndo_get_stats function and the #ifdef'ed increment of the
collisions16 counter which doesn't exist in struct net_device_stats.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 7 Apr 2017 13:26:15 +0000 (06:26 -0700)]
Merge branch 'qed-XDP-header-adjust'
Yuval Mintz says:
====================
qede: support XDP head adjustments
Daniel has brought to my attention the fact that qede is the only driver
that currently supports XDP but still fails any program where
xdp_adjust_head is set on the bpf_prog. This series is meant to remedy
this and align qede with the rest of the drivers, making it possible to
remove said field.
Patch #1 contains a minor cache-saving optimization for later patches.
Patches #2 & #3 address existing issues with the qede implementation
[#2 should have been a part of this series as it addresses something that's
affected by the additional headroom; #3 is simply here for the ride].
Patches #4 & #5 add the necessary logic in the driver for ingress headroom,
the first adding the infrastructure needed to support the headroom
[as currently qede doesn't support any], and the second removing the
existing XDP limitation.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
In case an XDP program is attached, reserve XDP_PACKET_HEADROOM
bytes at the beginning of the packet for the program to play
with.
Modify the XDP logic in the driver to fill-in the missing bits
and re-calculate offsets and length after the program has finished
running to properly reflect the current status of the packet.
We can then go and remove the limitation of not supporting XDP programs
where xdp_adjust_head is set.
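A sketch of the resulting RX flow (simplified; the offset variables are illustrative, not the literal qede code):

  struct xdp_buff xdp;
  u32 act;

  /* Reserve XDP_PACKET_HEADROOM in front of the frame so the program
   * can grow headers via bpf_xdp_adjust_head().
   */
  xdp.data_hard_start = page_address(page) + placement_offset;
  xdp.data = xdp.data_hard_start + XDP_PACKET_HEADROOM;
  xdp.data_end = xdp.data + len;

  act = bpf_prog_run_xdp(prog, &xdp);

  /* The program may have moved xdp.data, so recompute the offset and
   * length used by the rest of the RX path before building the skb.
   */
  data_offset = xdp.data - xdp.data_hard_start;
  len = xdp.data_end - xdp.data;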
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The driver currently doesn't support any headroom; the only 'available'
space it has at the head of the buffer is due to the placement
offset.
In order to allow [later] support of XDP headroom adjustment,
modify the ingress flow to properly handle a scenario where
the packets would have such headroom.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Current implementation of VFs is very tight in regard to queue
resources. VFs support for XDP would require quite a bit of additional
infrastructure in qede and qed [sharing of queue-zones between queues,
more VF cids, mapping of the doorbell bar, etc.].
For now, prevent XDP programs from being attached to VFs.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The driver is currently using dma_unmap_single() with the address it
passed to the device for the purpose of forwarding, but the XDP
transmission buffer was originally a page allocated for the rx-queue.
The mapped address is likely to differ from the original mapped
address due to the placement offset.
This difference is going to get even bigger once we support headroom.
Cache the original mapped address of the page, and use it for unmapping
of the buffer when completion arrives for the XDP forwarded packet.
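Roughly (a simplified sketch; the structure and field names, DMA API calls and sizes shown here are illustrative):

  /* At RX-buffer mapping time, cache the address returned by the
   * mapping call itself:
   */
  rx_buf->mapping = dma_map_page(dev, page, 0, PAGE_SIZE, DMA_FROM_DEVICE);

  /* The address programmed into the TX descriptor for the forwarded
   * frame is rx_buf->mapping + placement_offset (+ headroom later on),
   * but the unmap on completion must use the cached original address:
   */
  dma_unmap_page(dev, xdp_tx_buf->mapping, PAGE_SIZE, DMA_FROM_DEVICE);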
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, each time an ingress packet is passed to the networking stack
the driver increments a per-queue SW statistic.
As we want to have additional fields in the first cache-line of the
Rx-queue struct, change flow so this statistic would be updated once per
NAPI run. We will later push the statistic to a different cache line.
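Schematically (a sketch; the helper and field names are illustrative):

  static int foo_rx_int(struct foo_rx_queue *rxq, int budget)
  {
          int work_done = 0;

          while (work_done < budget && foo_has_rx_work(rxq)) {
                  /* ...build the skb and hand it to the stack... */
                  work_done++;
          }

          /* Update the per-queue SW statistic once per NAPI run rather
           * than once per packet; the counter can then be pushed off the
           * first (hot) cache line of the Rx-queue struct.
           */
          rxq->rcv_pkts += work_done;
          return work_done;
  }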
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 7 Apr 2017 12:52:52 +0000 (05:52 -0700)]
Merge branch 's390-next'
Ursula Braun says:
====================
s390 patches for net-next
here are some cleanup patches for drivers/s390/net.
V2: respin, now patch "s390/qeth: improve endianness handling"
is supposed to apply cleanly to net-next
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Hans Wippel [Fri, 7 Apr 2017 07:15:38 +0000 (09:15 +0200)]
s390/netiucv: improve endianness handling
Replace ntohs with endianness conversion for the SKB protocol assignment
to avoid an endianness warning reported by sparse. No functional change.
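The change boils down to assigning a big-endian value to the __be16 skb->protocol field instead of the result of ntohs (a sketch; ETH_P_IP stands in for whatever value netiucv actually assigns):

  /* before */ skb->protocol = ntohs(ETH_P_IP);        /* host-endian type, sparse warns */
  /* after  */ skb->protocol = cpu_to_be16(ETH_P_IP);  /* proper __be16 */

On big-endian s390 both produce the same bytes, hence no functional change.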
Signed-off-by: Hans Wippel <hwippel@linux.vnet.ibm.com> Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Hans Wippel [Fri, 7 Apr 2017 07:15:37 +0000 (09:15 +0200)]
s390/ctcm: improve endianness handling
Use endianness conversions for SKB protocol assignments and usage to
avoid endianness warnings reported by sparse. No functional changes.
Signed-off-by: Hans Wippel <hwippel@linux.vnet.ibm.com> Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Hans Wippel [Fri, 7 Apr 2017 07:15:36 +0000 (09:15 +0200)]
s390/qeth: improve endianness handling
Avoid endianness warnings reported by sparse by (1) using endianness
conversions for assigning and using network packet fields, and (2)
removing unnecessary endianness conversions from qeth_l3_rebuild_skb. No
functional changes.
Signed-off-by: Hans Wippel <hwippel@linux.vnet.ibm.com> Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
1. options.add_hhlen is set but never used, drop it
2. clean up no longer required forward declarations
3. delete all sorts of unused defines
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com> Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
qeth_qdio_output_handler() is the only caller of
qeth_handle_send_error() and doesn't care about the return value.
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com> Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The ac fields are bitmaps, so format them as hex.
While at it, also print the ac2 field.
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com> Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com> Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 7 Apr 2017 12:41:49 +0000 (05:41 -0700)]
Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:
====================
40GbE Intel Wired LAN Driver Updates 2017-04-06
This series contains updates to i40e and i40evf.
Preethi adds support for the outer checksum and TSO offloads for
encapsulated packets for the VF.
Mitch fixes a possible memory leak, where we need to remove the client
instance when the driver unloads. Also we need to check to see if the
client (i40iw) is already present during probe, and add a client instance
if necessary. Lastly, make sure we close any attached clients when the
driver is removed or shut down to prevent a kernel panic.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Mitch Williams [Thu, 30 Mar 2017 07:46:08 +0000 (00:46 -0700)]
i40e: close client on remove and shutdown
When the driver is removed or shut down, close any attached clients
(i.e. i40iw). This prevents a panic seen sometimes on forced driver
removal or system shutdown when iWarp is running.
Change-ID: I4f6161e5a73ffbb2fd5883567b007310302bfcb5 Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Mitch Williams [Thu, 30 Mar 2017 07:46:07 +0000 (00:46 -0700)]
i40e: register existing client on probe
In some cases, a client (i40iw) may already be present when probe is
called. Check for this, and add a client instance if necessary.
Change-ID: I2009312694b7ad81f1023919e4c6c86181f21689 Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Mitch Williams [Thu, 30 Mar 2017 07:46:06 +0000 (00:46 -0700)]
i40e: remove client instance on driver unload
When the driver is unloaded, we need to remove the client instance,
otherwise we leak memory.
Change-ID: If1e7882ac1f6ce15d004722fafbe31afbe0adc9a Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Preethi Banala [Mon, 27 Mar 2017 21:43:18 +0000 (14:43 -0700)]
i40e/i40evf: Add capability exchange for outer checksum
This patch adds a capability negotiation between VF and PF using ENCAP/
ENCAP_CSUM offload flags in order for the VF to support outer checksum
and TSO offloads for encapsulated packets. These capabilities were assumed
by default and enabled in current hardware. Going forward, these features
need to be negotiated with the PF before being advertised to the stack.
Additionally, strip out the mac.type checks for X722, since outer checksums
are enabled based on the ENCAP_CSUM offload negotiation flag; this maintains
consistency between drivers in how the features are configured.
Change-ID: Ie380a6f57eca557a2bb575b66b12fae36d308920 Signed-off-by: Preethi Banala <preethi.banala@intel.com> Signed-off-by: Alan Brady <alan.brady@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This is the second batch of updates to the ftgmac100 driver.
This one tackles the RX path of the driver, simplifying
it greatly to match common practice while significantly
increasing the performance.
(The bulk of the performance gains of my series will be
provided by the TX path improvements, notably fragmented
sends; these will be in the next batch.)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
ftgmac100: Work around HW bug in runt frame detection
The HW incorrectly calculates the frame size without the vlan
tag and compares that against 64. It will thus flag 64-byte
frames that carry a vlan tag as 60-byte "runt" packets,
which we then drop. We thus end up dropping ARP packets
on vlans ...
It does that whether vlan tag stripping is enabled or not.
This works around it by ignoring the "runt" error bit if the
frame has been vlan tagged and is at least 60 bytes.
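In code, the workaround amounts to something like this (the descriptor bit names are illustrative, not the exact ftgmac100 defines):

  if (unlikely(status & RXDES0_RUNT)) {
          /* Trust the runt indication only for untagged frames or
           * frames that really are shorter than 60 bytes; a 64-byte
           * vlan-tagged frame is mis-flagged by the HW and is kept.
           */
          if (!(status & RXDES0_VLAN_TAGGED) || size < 60) {
                  netdev->stats.rx_length_errors++;
                  goto drop;
          }
  }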
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Directly access the fields when needed. The accessors add clutter,
not clarity, and in some cases cause unnecessary read-modify-write
type access on the slow (uncached) descriptor memory.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: David S. Miller <davem@davemloft.net>
The current driver receive path allocates pages and stashes
them into SKB fragments. This is not particularly useful as
we don't support jumbo frames (which wouldn't be great with
the small FIFOs on all the known implementations) anyway.
It also makes us flush the caches and allocate more memory
for RX than necessary.
So set our RX buf to our max packet size instead (which we
bump to 1536 bytes to account for packets with vlan tags
etc...) like most other ethernet drivers.
Then allocate skbs when populating the receive ring and DMA
directly into them.
This simplifies the RX path further.
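A sketch of the resulting RX buffer handling (error handling trimmed; RX_BUF_SIZE and the descriptor helper are illustrative):

  /* Populating the ring: allocate an skb and DMA-map its data area */
  struct sk_buff *skb = netdev_alloc_skb_ip_align(netdev, RX_BUF_SIZE);
  dma_addr_t map = dma_map_single(priv->dev, skb->data, RX_BUF_SIZE,
                                  DMA_FROM_DEVICE);
  priv->rx_skbs[entry] = skb;
  ftgmac100_rxdes_set_dma_addr(rxdes, map);   /* illustrative helper */

  /* On receive, the skb goes straight to the stack */
  dma_unmap_single(priv->dev, map, RX_BUF_SIZE, DMA_FROM_DEVICE);
  skb_put(skb, size);
  skb->protocol = eth_type_trans(skb, netdev);
  napi_gro_receive(&priv->napi, skb);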
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: David S. Miller <davem@davemloft.net>
The fast path has a single unlikely() test for any error bit,
calling into a helper that sets the appropriate statistics.
The various netdev_info messages aren't particularly interesting. If
we want to differentiate the various length errors later we
can introduce driver specific stats using ethtool.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: David S. Miller <davem@davemloft.net>
ftgmac100: Use a scratch buffer for failed RX allocations
We can occasionally fail to allocate new RX buffers at
runtime or when starting the driver. At the moment the
latter just fails to open which is fine but the former
leaves stale DMA pointers in the ring.
Instead, use a scratch page and have all RX ring descriptors
point to it by default unless a proper buffer can be allocated.
It will help later on when re-initializing the whole ring
at runtime on link changes since there is no clean failure
path there unlike open().
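Schematically (a sketch; the allocation call and names are assumptions, not the exact driver code):

  /* At ring-init time, one DMA-able scratch buffer is allocated: */
  priv->rx_scratch = dma_alloc_coherent(priv->dev, RX_BUF_SIZE,
                                        &priv->rx_scratch_dma, GFP_KERNEL);

  /* When a real RX buffer cannot be allocated, never leave a stale
   * DMA pointer in the ring; park the descriptor on the scratch page:
   */
  if (!skb) {
          ftgmac100_rxdes_set_dma_addr(rxdes, priv->rx_scratch_dma);
          return -ENOMEM;
  }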
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: David S. Miller <davem@davemloft.net>
We don't support jumbo frames, so we will never receive a
fragmented packet and the RX buffer is always big enough;
if not, then it's a runaway packet that can be dropped.
So take out the loop that handles such things in
ftgmac100_rx_packet(), which will help with subsequent
simplifications and improvements to the RX path.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 6 Apr 2017 21:26:32 +0000 (14:26 -0700)]
Merge branch 'qed-misc-cleanups-and-fixes'
Yuval Mintz says:
====================
qed: Misc cleanups and fixes
Patches #1 and #2 revolve around register access performed by driver;
The first merely adds some debug, while the second does some fixing
of incorrect PTT usage as well as preventing issues similar to those
fixed by 6f437d431930 ("qed: Don't use attention PTT for configuring BW").
Patch #3 better configures HW for architecture where cacheline isn't 64B.
Patches #4-#8 all affect iSCSI-related functionality -
adding statistics information [both to driver & management firmware],
passing information on the number of resources to qedi, and simplifying
the Out-of-order implementation in SW.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Michal Kalderon [Thu, 6 Apr 2017 12:58:35 +0000 (15:58 +0300)]
qed: Make OOO archipelagos into an array
No need to maintain the various open archipelagos as a list -
the maximal number of them is known, and we can use the CID
as a key for random access into the array.
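Conceptually (a sketch; the field names are illustrative, not the exact qed structures):

  /* Before: linear search over a list of open archipelagos */
  list_for_each_entry(p_arch, &p_ooo_info->archipelagos_list, list_entry)
          if (p_arch->cid == cid)
                  return p_arch;

  /* After: the CID, offset by the base CID, indexes an array sized for
   * the maximal number of connections:
   */
  idx = cid - p_ooo_info->cid_base;
  if (idx >= p_ooo_info->max_num_archipelagos)
          return NULL;
  return &p_ooo_info->p_archipelagos_db[idx];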
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Now that management firmware is capable of telling us the number of CQs
available for a given PF, qed needs to communicate the number to qedi
so it would know how many to use.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Before initializing the chip's engine, driver currently closes a set
of registers on the HW's ingress flow to prevent packets from slipping
in while they're not supposed to.
This configuration is insufficient, as there are some scenarios where
packets would still arrive even when said registers are set,
but the management firmware already closes other per-port registers
that do suffice, making this setting unnecessary.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Default HW configuration is optimal for an architecture where cache
line size is 64B.
During chip initialization, properly initialize the cache line size
in HW to avoid possible redundant PCI transactions.
Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
In order to access HW registers driver needs to acquire a PTT entry
[mapping between bar memory and internal chip address].
Since acquiring PTT entries could fail [at least in theory] as their
number is finite and other flows can hold them, we reserve special PTT
entries for 'important' enough flows - ones we want to guarantee that
would not be susceptible to such issues.
One such special entry is the 'main' PTT which is meant to be used in
flows such as chip initialization and de-initialization.
However, there are other flows that are also using that same entry
for their own purposes, and might run concurrently with the original
flows [note that for most of the cases mistakenly using the main PTT,
such a race is still impossible, at least today].
This patch re-organizes the various functions that currently use the
main_ptt in one of two ways:
- If a function shouldn't use the main_ptt, it now acquires and
releases its own PTT entry and uses that instead. Note that if those
functions previously couldn't fail, they now can [as acquisition
might fail].
- Change the prototypes so that the main_ptt would be received as
a parameter [instead of explicitly accessing it].
This prevents the future risk of adding code that introduces new
use-cases for flows using the main_ptt, ones that might race
with the actual 'main' flows.
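The first pattern looks roughly like this (a sketch of the acquire/release flow; qed_hw_foo() is a made-up placeholder):

  struct qed_ptt *p_ptt = qed_ptt_acquire(p_hwfn);

  if (!p_ptt)
          return -EAGAIN;         /* callers must now handle failure */
  qed_hw_foo(p_hwfn, p_ptt);      /* register access via our own PTT */
  qed_ptt_release(p_hwfn, p_ptt);

The second pattern simply threads the main PTT through the function prototypes as an explicit parameter instead of having the callee reach for it implicitly.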
Signed-off-by: Rahul Verma <Rahul.Verma@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
PTT entries are per-hwfn; if some erroneous flow is trying
to use a PTT belonging to a different hwfn, warn the user, as this
can break every register-accessing flow later and is very hard
to root-cause.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 6 Apr 2017 21:22:46 +0000 (14:22 -0700)]
Merge tag 'rxrpc-rewrite-20170406' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
David Howells says:
====================
rxrpc: Miscellany
Here's a set of patches that make some minor changes to AF_RXRPC:
(1) Store error codes in struct rxrpc_call::error as negative codes and
only convert to positive in recvmsg() to avoid confusion inside the
kernel.
(2) Note the result of trying to abort a call (this fails if the call is
already 'completed').
(3) Don't abort on temporary errors whilst processing challenge and
response packets, but rather drop the packet and wait for
retransmission.
And also adds some more tracing:
(4) Protocol errors.
(5) Received abort packets.
(6) Changes in the Rx window size due to ACK packet information.
(7) Client call initiation (to allow the rxrpc_call struct pointer, the
wire call ID and the user ID/afs_call pointer to be cross-referenced).
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 6 Apr 2017 20:53:14 +0000 (13:53 -0700)]
Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:
====================
100GbE Intel Wired LAN Driver Updates 2017-04-05
This series contains updates to fm10k only.
Phil Turnbull from Oracle fixes an issue where the argument provided to
the FM10K_REMOVED macro was not what it was expecting.
Jake modifies the driver to replace the bitwise operators and defines with
a BITMAP and enumeration values to avoid race conditions. He also future-
proofs the driver so that developers do not have to remember to re-size the
bitmaps when adding new values, and fixes the wording of a code comment to
avoid stating that we return a value for a void function.
Ngai-Mint makes sure that the receive queue is disabled when configuring
the receive ring. He also fixes an issue where interfaces were
resetting because the transmit mailbox FIFO was becoming full while the
host was not ready, by ensuring the host is ready before queueing up
mailbox messages.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Existing L2TP kernel code does not derive the optimal MTU for Ethernet
pseudowires and instead leaves this to a userspace L2TP daemon or
operator. If an MTU is not specified, the existing kernel code chooses
an MTU that does not take account of all tunnel header overheads, which
can lead to unwanted IP fragmentation. When L2TP is used without a
control plane (userspace daemon), we would prefer that the kernel does a
better job of choosing a default pseudowire MTU, taking account of all
tunnel header overheads, including IP header options, if any. This patch
addresses this.
Change-set is organized as a two part patch series, with one patch
introducing a new kernel function to compute the IP overhead on a
socket, and the other patch using this new kernel function to compute
the default L2TP MTU for an Ethernet pseudowire.
Existing code also seems to assume an Ethernet (non-jumbo) underlay. The
change proposed here uses the PMTU mechanism and the dst entry in the
L2TP tunnel socket to directly pull up the underlay MTU (as the baseline
number on top of which the encapsulation headers are factored in).
A default MTU value of 1500 bytes is assumed as a fallback only if
this fails.
Fixed the kbuild test robot error in the previous posting.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
R. Parameswaran [Thu, 6 Apr 2017 00:00:07 +0000 (17:00 -0700)]
L2TP:Adjust intf MTU, add underlay L3, L2 hdrs.
Existing L2TP kernel code does not derive the optimal MTU for Ethernet
pseudowires and instead leaves this to a userspace L2TP daemon or
operator. If an MTU is not specified, the existing kernel code chooses
an MTU that does not take account of all tunnel header overheads, which
can lead to unwanted IP fragmentation. When L2TP is used without a
control plane (userspace daemon), we would prefer that the kernel does a
better job of choosing a default pseudowire MTU, taking account of all
tunnel header overheads, including IP header options, if any. This patch
addresses this.
Change-set here uses the new kernel function, kernel_sock_ip_overhead(),
to factor the outer IP overhead on the L2TP tunnel socket (including
IP Options, if any) when calculating the default MTU for an Ethernet
pseudowire, along with consideration of the inner Ethernet header.
Signed-off-by: R. Parameswaran <rparames@brocade.com> Signed-off-by: David S. Miller <davem@davemloft.net>
R. Parameswaran [Wed, 5 Apr 2017 23:50:35 +0000 (16:50 -0700)]
New kernel function to get IP overhead on a socket.
A new function, kernel_sock_ip_overhead(), is provided
to calculate the cumulative overhead imposed by the IP
Header and IP options, if any, on a socket's payload.
The new function returns an overhead of zero for sockets
that do not belong to the IPv4 or IPv6 address families.
This is used in the L2TP code path to compute the
total outer IP overhead on the L2TP tunnel socket when
calculating the default MTU for Ethernet pseudowires.
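A hedged sketch of how the helper feeds into the pseudowire MTU calculation (the variable names and exact set of header lengths are illustrative, not the literal L2TP code):

  /* Outer IP header + options of the tunnel socket; 0 for non-IP
   * address families.
   */
  int overhead = kernel_sock_ip_overhead(tunnel->sock);

  /* Default MTU for the Ethernet pseudowire: underlay MTU (from the
   * tunnel socket's dst/PMTU, 1500 as a fallback) minus the outer IP,
   * UDP and L2TP encapsulation and the inner Ethernet header.
   */
  mtu = underlay_mtu - overhead - sizeof(struct udphdr)
        - l2tp_hdr_len - ETH_HLEN;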
Signed-off-by: R. Parameswaran <rparames@brocade.com> Signed-off-by: David S. Miller <davem@davemloft.net>
When qedr is enabled, qed would try dividing the msi-x vectors between
L2 and RoCE, starting with L2 and providing it with sufficient vectors
for its queues.
Problem is qed would also do that for storage partitions, and as those
don't need queues it would lead qed to award those partitions with 0
msi-x vectors, causing them to believe they're using INTa and
preventing them from operating.
Fixes: 51ff17251c9c ("qed: Add support for RoCE hw init") Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: dsa: loop: Initialize err in dsa_loop_vlan_dump
Dan's static checker reported the following:
drivers/net/dsa/dsa_loop.c:223 dsa_loop_port_vlan_dump()
error: uninitialized symbol 'err'.
which could happen if we do hit the continue statement for each iteration of
the loop. Initialize err to 0 here.
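A schematic reconstruction of the problem (not the literal dsa_loop loop body; the condition and callback are placeholders):

  int err = 0, vid;       /* err was previously uninitialized */

  for (vid = 0; vid < VLAN_N_VID; vid++) {
          if (!entry_present(ps, vid))    /* illustrative condition */
                  continue;               /* err never assigned on this path */
          err = cb(&vlan->obj);
          if (err)
                  break;
  }
  /* without the initialization, err is garbage if every iteration
   * hit 'continue'
   */
  return err;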
Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Fixes: 98cd1552ea27 ("net: dsa: Mock-up driver") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
drivers/net/dsa/dsa_loop.c:181 dsa_loop_port_vlan_del()
error: XXX uninitialized symbol 'pvid'.
We were missing the assignment of pvid to ps->vid, so add that.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Fixes: 98cd1552ea27 ("net: dsa: Mock-up driver") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Or Gerlitz [Wed, 5 Apr 2017 16:09:25 +0000 (19:09 +0300)]
net/sched: Removed unused vlan actions definition
Commit c7e2b9689ef "sched: introduce vlan action" added both the
UAPI values for the vlan actions (TCA_VLAN_ACT_) and these two
in-kernel ones which are not used; remove them.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Wed, 5 Apr 2017 12:35:44 +0000 (13:35 +0100)]
qed: fix missing break in OOO_LB_TC case
There seems to be a missing break in the OOO_LB_TC case: pq_id
is being assigned and then re-assigned on the fall-through to the default
case, and that seems suspect.
Detected by CoverityScan, CID#1424402 ("Missing break in switch")
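The shape of the fix (the pq_id sources are placeholders for the actual qed lookups):

  switch (tc) {
  case OOO_LB_TC:
          pq_id = ooo_pq_id;
          break;          /* previously missing: fell into default */
  default:
          pq_id = offload_pq_id;
          break;
  }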
Fixes: b5a9ee7cf3be1 ("qed: Revise QM cofiguration") Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 9008ae074885 ("net/mlx5e: Minimize mlx5e_{open/close}_locked")
copied the calls to netif_set_real_num_{tx,rx}_queues from
mlx5e_open_locked to mlx5e_activate_priv_channels and wrapped them in an
if condition to test for netdev->real_num_{tx,rx}_queues.
But netdev->real_num_rx_queues is conditionally compiled in if CONFIG_SYSFS
is set. Without CONFIG_SYSFS the build fails:
drivers/net/ethernet/mellanox/mlx5/core/en_main.c: In function 'mlx5e_activate_priv_channels':
drivers/net/ethernet/mellanox/mlx5/core/en_main.c:2515:12: error: 'struct net_device' has no member named 'real_num_rx_queues'; did you mean 'real_num_tx_queues'?
Fix this by unconditionally calling netif_set_real_num_{tx,rx}_queues as before
commit 9008ae074885.
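A sketch of the fix (the queue counts shown are illustrative):

  /* Called unconditionally, as before commit 9008ae074885; both helpers
   * are safe to call whether or not CONFIG_SYSFS is set.
   */
  netif_set_real_num_tx_queues(netdev, num_txqs);
  netif_set_real_num_rx_queues(netdev, priv->channels.num);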
Fixes: 9008ae074885 ("net/mlx5e: Minimize mlx5e_{open/close}_locked") Signed-off-by: Tobias Regnery <tobias.regnery@gmail.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Prepare to mark sensitive kernel structures for randomization by making
sure they're using designated initializers. These were identified during
allyesconfig builds of x86, arm, and arm64, and the initializer fixes
were extracted from grsecurity. In this case, NULL initialize with { }
instead of undesignated NULLs.
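For example (a generic before/after sketch, not one of the structures actually converted):

  /* before: position-dependent zero/NULL initializer */
  static const struct foo_ops foo_ops = { NULL, NULL };

  /* after: empty initializer, independent of field order, so it keeps
   * working when the struct layout is randomized
   */
  static const struct foo_ops foo_ops = { };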
Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>
====================
ftgmac100: Rework batch 1 - Link & Interrupts
This is version 2 of the first batch of updates to the
ftgmac100 driver.
Essentially:
- A few misc cleanups
- Fixing link speed & duplex handling (including dealing with
an Aspeed requirement to double reset the controller when
the speed changes)
- And addition of a reset task workqueue which will be used
for delaying the re-initialization of the controller
- Fixing a number of issues with how interrupts and NAPI
are dealt with.
Subsequent batches will rework and improve the rx path, the
tx path, and add a bunch of features and fixes.
Version 2 addresses some review comments to patches 5 and 10
(see version history in the respective emails).
====================
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
First, don't look at the interrupt status in the poll loop
to decide what to poll. It's wrong. If we have run out of
budget, we may still have RX packets to unqueue but no more
RX interrupt pending.
So instead move the code looking at the interrupt status
into the interrupt handler where it belongs. That avoids a slow
MMIO read in the NAPI fast path. We keep the abnormal interrupts
enabled while NAPI is scheduled.
While at it, actually do something useful in the "error" cases:
On AHB bus error, trigger the new reset task, that's about all
we can do. On RX packet fifo or descriptor overflows, we need
to restart the MAC after having freed things up. So set a flag
that NAPI will see and use to perform that restart after
harvesting the RX ring.
Finally, we shouldn't complete NAPI if there are still outgoing
packets that will need harvesting. Waiting for more interrupts
is less efficient than letting NAPI run a while longer while
the queue drains.
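Putting it together, the poll function ends up shaped roughly like this (a sketch; the helper names and flag are illustrative):

  static int ftgmac100_poll(struct napi_struct *napi, int budget)
  {
          struct ftgmac100 *priv = container_of(napi, struct ftgmac100, napi);
          int work_done;

          /* No MMIO read of the interrupt status here; the IRQ handler
           * already noted what needs attention.
           */
          work_done = ftgmac100_rx_poll(priv, budget);
          ftgmac100_tx_complete(priv);

          if (priv->need_mac_restart)
                  ftgmac100_restart_mac(priv);    /* after freeing ring space */

          /* Complete NAPI only when both RX and pending TX are drained */
          if (work_done < budget && !ftgmac100_tx_pending(priv)) {
                  napi_complete(napi);
                  ftgmac100_enable_all_interrupts(priv);
          }
          return work_done;
  }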
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: David S. Miller <davem@davemloft.net>
The HW requires a full MAC reset when changing the speed.
Additionally the Aspeed documentation spells out that the
MAC needs to be reset twice with a 10us interval.
We thus move the speed setting and top level reset code
into a new ftgmac100_reset_and_config_mac() function which
handles both. Move the ring pointers initialization there
too in order to reflect the HW change.
Also reduce the timeout for the MAC reset as it shouldn't
take more than 300 clock cycles according to the doc.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: David S. Miller <davem@davemloft.net>
ftgmac100: Add a reset task and use it for link changes
Link speed changes require a full HW reset. This isn't done
properly at the moment. It will involve delays and thus isn't
suitable to do from the link poll callback.
So let's create a reset_task that we can queue up when the
link changes. It will be useful for various cases of error
handling as well.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
ftgmac100: Move the bulk of inits to a separate function
The link monitoring and error handling code will have to
redo the ring inits and HW setup so move the code out of
ftgmac100_open() into a dedicated function.
This forces a bit of re-ordering of ftgmac100_open() but
nothing dramatic.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: David S. Miller <davem@davemloft.net>
ftgmac100: Split ring alloc, init and rx buffer alloc
Currently, a single function is used to allocate the rings
themselves, initialize them, populate the rx ring, and
allocate the rx buffers. The same happens on free.
This splits them into separate functions. This will be
useful when properly implementing re-initialization on
link changes and error handling when the rings will be
repopulated but not freed.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: David S. Miller <davem@davemloft.net>
ftgmac100: Cleanup speed/duplex tracking and fix duplex config
Keep track of both the current speed and duplex settings
instead of only speed and properly apply the duplex setting
to the HW.
This reworks the adjust_link() function to also avoid trying
to reconfigure the HW when there is no link and to display
the link state to the user.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
The divisions they represent are not particularly meaningful
and things are going to be moving around with upcoming changes
making these comments more a burden than anything else.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Detection of watchdog timeout of Octeon cores is flawed and susceptible to
false alarms. Refactor by removing the detection code, and in its place,
leverage existing code that monitors for an indication from the NIC
firmware that an Octeon core crashed; expand the meaning of the indication
to "an Octeon core crashed or its watchdog timer expired". Detection of
watchdog timeout is now delegated to an exception handler in the NIC
firmware; this is free of false alarms.
Also if there's an Octeon core crash or watchdog timeout:
(1) Disable VF Ethernet links.
(2) Decrement the module refcount by an amount equal to the number of
active VFs of the NIC whose Octeon core crashed or had a watchdog
timeout. The refcount will continue to reflect the active VFs of
other liquidio NIC(s) (if present) whose Octeon cores are faultless.
Item (2) is needed to avoid the case of not being able to unload the driver
because the module refcount is stuck at some non-zero number. There is
code that, in normal cases, decrements the refcount upon receiving a
message from the firmware that a VF driver was unloaded. But in
exceptional cases like an Octeon core crash or watchdog timeout, arrival of
that particular message from the firmware might be unreliable. That normal
case code is changed to not touch the refcount in the exceptional case to
avoid contention (over the refcount) with the liquidio_watchdog kernel
thread, which will carry out item (2).
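Item (2) amounts to something like this (a sketch; the VF accounting variable is illustrative):

  /* One module reference was taken per active VF of the faulted NIC;
   * drop them so the refcount can reach zero and rmmod succeeds.
   */
  for (vf = 0; vf < active_vfs_of_faulted_nic; vf++)
          module_put(THIS_MODULE);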
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com> Signed-off-by: Derek Chickles <derek.chickles@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
drivers/net/usb/usbnet.c:85:19: warning: 'driver_name' defined but not
used [-Wunused-const-variable=]
static const char driver_name [] = "usbnet";
^~~~~~~~~~~
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
David Howells [Thu, 6 Apr 2017 09:12:00 +0000 (10:12 +0100)]
rxrpc: Trace client call connection
Add a tracepoint (rxrpc_connect_call) to log the combination of rxrpc_call
pointer, afs_call pointer/user data and wire call parameters to make it
easier to match the tracebuffer contents to captured network packets.
Signed-off-by: David Howells <dhowells@redhat.com>
David Howells [Thu, 6 Apr 2017 09:12:00 +0000 (10:12 +0100)]
rxrpc: Trace protocol errors in received packets
Add a tracepoint (rxrpc_rx_proto) to record protocol errors in received
packets. The following changes are made:
(1) Add a function, __rxrpc_abort_eproto(), to note a protocol error on a
call and mark the call aborted. This is wrapped by
rxrpc_abort_eproto() that makes the why string usable in trace.
(2) Add trace_rxrpc_rx_proto() or rxrpc_abort_eproto() to protocol error
generation points, replacing rxrpc_abort_call() with the latter.
(3) Only send an abort packet in rxkad_verify_packet*() if we actually
managed to abort the call.
Note that a trace event is also emitted if a kernel user (e.g. afs) tries
to send data through a call when it's not in the transmission phase, though
it's not technically a receive event.
Signed-off-by: David Howells <dhowells@redhat.com>
David Howells [Thu, 6 Apr 2017 09:11:59 +0000 (10:11 +0100)]
rxrpc: Handle temporary errors better in rxkad security
In the rxkad security module, when we encounter a temporary error (such as
ENOMEM) from which we could conceivably recover, don't abort the
connection, but rather permit retransmission of the relevant packets to
induce a retry.
Note that I'm leaving some places that could be merged together to insert
tracing in the next patch.
Signed-off-by: David Howells <dhowells@redhat.com>
David Howells [Thu, 6 Apr 2017 09:11:59 +0000 (10:11 +0100)]
rxrpc: Note a successfully aborted kernel operation
Make rxrpc_kernel_abort_call() return an indication as to whether it
actually aborted the operation or not so that kafs can trace the failure of
the operation. Note that 'success' in this context means changing the
state of the call, not necessarily successfully transmitting an ABORT
packet.
Signed-off-by: David Howells <dhowells@redhat.com>
David Howells [Thu, 6 Apr 2017 09:11:56 +0000 (10:11 +0100)]
rxrpc: Use negative error codes in rxrpc_call struct
Use negative error codes in struct rxrpc_call::error because that's what
the kernel normally deals with and to make the code consistent. We only
turn them positive when transcribing into a cmsg for userspace recvmsg.
Signed-off-by: David Howells <dhowells@redhat.com>
Ngai-Mint Kwan [Mon, 6 Feb 2017 22:21:13 +0000 (14:21 -0800)]
fm10k: do not enqueue mailbox when host not ready
Interfaces will reset whenever the TX mailbox FIFO has become full. This
occurs more frequently whenever the IES API application is not running
to process and clear the messages in the FIFO. Thus, this could lead to
situations where the interface would enter an infinite reset loop. That
is: if the interface is trying to synchronize a huge number of unicast
and multicast entries with the IES API application, the TX mailbox FIFO
will become full and the interface resets. Once the interface exits
reset, it'll try to synchronize the unicast and multicast entries again.
Ergo, this creates an infinite loop. Other actions such as multiple
multicast mode or up/down transitions will fill the TX mailbox FIFO and
induce the interface to reset. To correct these situations, check if the
interface's "host_ready" flag is enabled before enqueuing any messages
to the TX mailbox FIFO. This check will be conducted by a function call.
Lastly, this issue mainly affects the PF and, thus, the VF is exempt.
Signed-off-by: Ngai-Mint Kwan <ngai-mint.kwan@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jacob Keller [Thu, 12 Jan 2017 23:59:41 +0000 (15:59 -0800)]
fm10k: update function header comment for fm10k_get_stats64
Re-word the comment to avoid stating that we return a value for this
void function. Additionally, there is no need to mention older kernels,
since this is the upstream kernel.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jacob Keller [Thu, 12 Jan 2017 23:59:40 +0000 (15:59 -0800)]
fm10k: allow service task to reschedule itself
If some code path executes fm10k_service_event_schedule(), it is
guaranteed that we only queue the service task once, since we use
__FM10K_SERVICE_SCHED flag. Unfortunately this has a side effect that if
a service request occurs while we are currently running the watchdog, it
is possible that we will fail to notice the request and ignore it until
the next time the request occurs.
This can cause problems with pf/vf mailbox communication and other
service event tasks. To avoid this, introduce a FM10K_SERVICE_REQUEST
bit. When we successfully schedule (and set the _SCHED bit) the service
task, we will clear this bit. However, if we are unable to currently
schedule the service event, we just set the new SERVICE_REQUEST bit.
Finally, after the service event completes, we will re-schedule if the
request bit has been set.
This should ensure that we do not miss any service event schedules,
since we will re-schedule it once the currently running task finishes.
This means that for each request, we will always schedule the service
task to run at least once in full after the request came in.
This will avoid timing issues that can occur with the service event
scheduling. We do pay a cost in re-running many tasks, but all the
service event tasks use either flags to avoid duplicate work, or are
tolerant of being run multiple times.
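Schematically, the request/schedule handshake looks like this (a sketch; the exact struct layout and workqueue name are assumptions):

  static void fm10k_service_event_schedule(struct fm10k_intfc *interface)
  {
          if (!test_and_set_bit(__FM10K_SERVICE_SCHED, interface->state)) {
                  clear_bit(FM10K_SERVICE_REQUEST, interface->state);
                  queue_work(fm10k_workqueue, &interface->service_task);
          } else {
                  /* task already scheduled/running: remember the request */
                  set_bit(FM10K_SERVICE_REQUEST, interface->state);
          }
  }

  static void fm10k_service_event_complete(struct fm10k_intfc *interface)
  {
          clear_bit(__FM10K_SERVICE_SCHED, interface->state);

          /* re-schedule once if a request arrived while we were running */
          if (test_bit(FM10K_SERVICE_REQUEST, interface->state))
                  fm10k_service_event_schedule(interface);
  }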
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>