Nicolas Dichtel [Thu, 25 Oct 2012 22:28:49 +0000 (22:28 +0000)]
rtnl: add a new type of msg to advertise protocol configuration
A new type is added to allow userland to monitor protocol configuration, like
IPv4 or IPv6.
For example, monitoring the state of the forwarding status of an interface of
the system.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 26 Oct 2012 18:40:55 +0000 (14:40 -0400)]
Merge branch 'master' of git://1984.lsi.us.es/nf-next
Pablo Neira Ayuso says:
====================
The following changeset contains updates for IPVS from Jesper Dangaard
Brouer that did not reach the previous merge window in time.
More specifically, updates to improve IPv6 support in IPVS. More
relevantly, some of the existing code performed wrong handling of the
extensions headers and better fragmentation handling.
Jesper promised more follow-up patches to refine this after this batch
hits net-next. Yet to come.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of updating the sk_cgrp_prioidx struct field on every send
this only updates the field when a task is moved via cgroup
infrastructure.
This allows sockets that may be used by a kernel worker thread
to be managed. For example in the iscsi case today a user can
put iscsid in a netprio cgroup and control traffic will be sent
with the correct sk_cgrp_prioidx value set but as soon as data
is sent the kernel worker thread isssues a send and sk_cgrp_prioidx
is updated with the kernel worker threads value which is the
default case.
It seems more correct to only update the field when the user
explicitly sets it via control group infrastructure. This allows
the users to manage sockets that may be used with other threads.
Since classid is now updated when the task is moved between the
cgroups, we don't have to call sock_update_classid() from various
places to ensure we always using the latest classid value.
[v2: Use iterate_fd() instead of open coding]
Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de> Cc: Li Zefan <lizefan@huawei.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Joe Perches <joe@perches.com> Cc: John Fastabend <john.r.fastabend@intel.com> Cc: Neil Horman <nhorman@tuxdriver.com> Cc: Stanislav Kinsbursky <skinsbursky@parallels.com> Cc: Tejun Heo <tj@kernel.org> Cc: <netdev@vger.kernel.org> Cc: <cgroups@vger.kernel.org> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Wagner [Thu, 25 Oct 2012 04:16:58 +0000 (04:16 +0000)]
cgroup: net_cls: Pass in task to sock_update_classid()
sock_update_classid() assumes that the update operation always are
applied on the current task. sock_update_classid() needs to know on
which tasks to work on in order to be able to migrate task between
cgroups using the struct cgroup_subsys attach() callback.
Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de> Cc: "David S. Miller" <davem@davemloft.net> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Glauber Costa <glommer@parallels.com> Cc: Joe Perches <joe@perches.com> Cc: Neil Horman <nhorman@tuxdriver.com> Cc: Stanislav Kinsbursky <skinsbursky@parallels.com> Cc: Tejun Heo <tj@kernel.org> Cc: <netdev@vger.kernel.org> Cc: <cgroups@vger.kernel.org> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Wagner [Thu, 25 Oct 2012 04:16:57 +0000 (04:16 +0000)]
cgroup: net_cls: Remove rcu_read_lock/unlock
As Eric pointed out:
"Hey task_cls_classid() has its own rcu protection since commit 3fb5a991916091a908d (cls_cgroup: Fix rcu lockdep warning)
So we can safely revert Paul commit (1144182a8757f2a1)
(We no longer need rcu_read_lock/unlock here)"
Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Glauber Costa <glommer@parallels.com> Cc: Li Zefan <lizefan@huawei.com> Cc: Neil Horman <nhorman@tuxdriver.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Tejun Heo <tj@kernel.org> Cc: netdev@vger.kernel.org Cc: cgroups@vger.kernel.org Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Wagner [Thu, 25 Oct 2012 04:16:56 +0000 (04:16 +0000)]
cgroup: net_cls: Fix local variable type decleration
The classid type used throughout the kernel is u32.
Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de> Cc: "David S. Miller" <davem@davemloft.net> Cc: Li Zefan <lizefan@huawei.com> Cc: Tejun Heo <tj@kernel.org> Cc: <netdev@vger.kernel.org> Cc: <cgroups@vger.kernel.org> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Wagner [Thu, 25 Oct 2012 04:16:55 +0000 (04:16 +0000)]
cgroup: net_prio: Mark local used function static
net_prio_attach() is only access via cgroup_subsys callbacks,
therefore we can reduce the visibility of this function.
Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de> Cc: "David S. Miller" <davem@davemloft.net> Cc: John Fastabend <john.r.fastabend@intel.com> Cc: Li Zefan <lizefan@huawei.com> Cc: Neil Horman <nhorman@tuxdriver.com> Cc: Tejun Heo <tj@kernel.org> Cc: <netdev@vger.kernel.org> Cc: <cgroups@vger.kernel.org> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
so that each low level driver doesn't need to implement them
by itself, and the dma buffer allocation for usb transfer and
runtime PM things can be handled just in one place.
Acked-by: Oliver Neukum <oneukum@suse.de> Signed-off-by: Ming Lei <ming.lei@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Neil Horman [Wed, 24 Oct 2012 09:26:51 +0000 (09:26 +0000)]
MAINTAINERS: Add myself to list of SCTP maintainers
I've been messing with the code for a bit, and I figured Vlad could use a hand
as interest in the protocol has picked up over the last year or so. I've asked
him, and he doesn't seem too upset over the idea :)
Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: Vlad Yasevich <vyasevich@gmail.com> CC: "David S. Miller" <davem@davemloft.net> CC: linux-sctp@vger.kernel.org Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Neil Horman [Wed, 24 Oct 2012 09:20:03 +0000 (09:20 +0000)]
sctp: Make hmac algorithm selection for cookie generation dynamic
Currently sctp allows for the optional use of md5 of sha1 hmac algorithms to
generate cookie values when establishing new connections via two build time
config options. Theres no real reason to make this a static selection. We can
add a sysctl that allows for the dynamic selection of these algorithms at run
time, with the default value determined by the corresponding crypto library
availability.
This comes in handy when, for example running a system in FIPS mode, where use
of md5 is disallowed, but SHA1 is permitted.
Note: This new sysctl has no corresponding socket option to select the cookie
hmac algorithm. I chose not to implement that intentionally, as RFC 6458
contains no option for this value, and I opted not to pollute the socket option
namespace.
Change notes:
v2)
* Updated subject to have the proper sctp prefix as per Dave M.
* Replaced deafult selection options with new options that allow
developers to explicitly select available hmac algs at build time
as per suggestion by Vlad Y.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: Vlad Yasevich <vyasevich@gmail.com> CC: "David S. Miller" <davem@davemloft.net> CC: netdev@vger.kernel.org Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Wed, 24 Oct 2012 13:27:24 +0000 (15:27 +0200)]
packet: minor: remove unused err assignment
This tiny patch removes two unused err assignments. In those two cases the
err variable is either overwritten with another value at a later point in
time without having read the previous assigment, or it is assigned and the
function returns without using/reading err after the assignment.
Signed-off-by: Daniel Borkmann <daniel.borkmann@tik.ee.ethz.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
hayeswang [Tue, 23 Oct 2012 20:24:03 +0000 (20:24 +0000)]
r8169: enable ALDPS for power saving
Enable ALDPS function to save power when link down. Note that the
feature should be set after the other PHY settings. And the firmware
is necessary. Don't enable it without loading the firmware.
None of the firmware-free chipsets support ALDPS. Neither do the
RTL8168d/8111d.
For 8136 series, make sure the ALDPS is disabled before loading the
firmware. For 8168 series, the ALDPS would be disabled automatically
when loading firmware. You must not disable it directly.
Signed-off-by: Hayes Wang <hayeswang@realtek.com> Acked-by: Francois Romieu <romieu@fr.zoreil.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Nicolas Dichtel [Mon, 22 Oct 2012 23:35:06 +0000 (23:35 +0000)]
ipv6: fix sparse warnings in rt6_info_hash_nhsfn()
Adding by commit 51ebd3181572 which adds the support of ECMP for IPv6.
Spotted-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
FW flashing code, even though it works correctly, makes some hidden
assumptions about buffer sizes. This is causing code analysers to
report error. Cleanup FW flashing code to remove these hidden assumptions.
Reported-by: Yuanhan Liu <yuanhan.liu@intel.com> Signed-off-by: Vasundhara Volam <vasundhara.volam@emulex.com> Signed-off-by: Sathya Perla <sathya.perla@emulex.com> Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Hans Zhang [Mon, 22 Oct 2012 22:21:23 +0000 (22:21 +0000)]
netlink: cleanup the unnecessary return value check
It's no needed to check the return value of tab since the NULL situation
has been handled already, and the rtnl_msg_handlers[PF_UNSPEC] has been
initialized as non-NULL during the rtnetlink_init().
Signed-off-by: Hans Zhang <zhanghonghui@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 22 Oct 2012 21:42:47 +0000 (21:42 +0000)]
ipv4: tcp: clean up tcp_v4_early_demux()
Use same header helpers than tcp_v6_early_demux() because they
are a bit faster, and as they make IPv4/IPv6 versions look
the same.
Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Neal Cardwell [Mon, 22 Oct 2012 21:41:48 +0000 (21:41 +0000)]
ipv6: tcp: clean up tcp_v6_early_demux() icsk variable
Remove an icsk variable, which by convention should refer to an
inet_connection_sock rather than an inet_sock. In the process, make
the tcp_v6_early_demux() code and formatting a bit more like
tcp_v4_early_demux(), to ease comparisons and maintenance.
Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
John Fastabend [Mon, 1 Oct 2012 14:52:20 +0000 (14:52 +0000)]
ixgbevf: fix softirq-safe to unsafe splat on internal mbx_lock
The lockdep splat below identifies a case where irq safe to unsafe
lock order is detected. Resolved by making mbx_lock bh.
======================================================
[ INFO: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected ]
3.6.0-rc5jk-net-next+ #119 Not tainted
------------------------------------------------------
ip/2608 [HC0[0]:SC0[2]:HE1:SE0] is trying to acquire:
(&(&adapter->mbx_lock)->rlock){+.+...}, at: [<ffffffffa008114e>] ixgbevf_set_rx_mode+0x36/0xd2 [ixgbevf]
and this task is already holding:
(_xmit_ETHER){+.....}, at: [<ffffffff814097c8>] dev_set_rx_mode+0x1e/0x33
which would create a new lock dependency:
(_xmit_ETHER){+.....} -> (&(&adapter->mbx_lock)->rlock){+.+...}
Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Greg Rose <gregory.v.rose@intel.com> Tested-by: Sibai Li <sibai.li@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Greg Rose [Fri, 21 Sep 2012 00:14:14 +0000 (00:14 +0000)]
ixgbevf: Check for error on dma_map_single call
Ignoring the return value from a call to the kernel dma_map API functions
can cause data corruption and system instability. Check the return value
and take appropriate action.
Signed-off-by: Greg Rose <gregory.v.rose@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Tested-by: Sibai Li <sibai.li@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
John Fastabend [Sun, 16 Sep 2012 08:19:46 +0000 (08:19 +0000)]
ixgbevf: make netif_napi_add and netif_napi_del symmetric
ixgbevf_alloc_q_vectors() calls netif_napi_add for each qvector
where qvectors is determined by the number of msix vectors. This
makes perfect sense.
However on cleanup when ixgbevf_free_q_vectors() is called and
for each qvector we should call netif_napi_del there is some
extra logic to add a dependency on RX queues. This patch makes
the add/del operations symmetric by removing the RX queues
dependency.
Without this if free_netdev() is called we see the general
protection fault below in netif_napi_del when list_del_init()
is called.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Greg Rose <gregory.v.rose@intel.com> Acked-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Sibai Li <sibai.li@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Carolyn Wyborny [Wed, 10 Oct 2012 04:42:59 +0000 (04:42 +0000)]
igb: Update get cable length function for i210/i211
There was a problem in the initial implementation of the get cable length
function for i210 and it did not work properly. This patch fixes that
problem for i210/i211 devices.
Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Tushar Dave [Fri, 14 Sep 2012 02:21:37 +0000 (02:21 +0000)]
e1000e: Minimum packet size must be 17 bytes
This is a HW requirement. Although a buffer as short as 1 byte is allowed,
the total length of packet before, padding and CRC insertion, must be at
least 17 bytes. So pad all small packets manually up to 17 bytes before
delivering them to HW.
Signed-off-by: Tushar Dave <tushar.n.dave@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
David S. Miller [Tue, 23 Oct 2012 06:51:00 +0000 (02:51 -0400)]
Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next
Jeff Kirsher says:
====================
This series contains updates to ixgbe only. Only change to this series
is I dropped the "ixgbe: Add support for pipeline reset" due to
change requested by Martin Josefsson.
Alexander Duyck (7):
ixgbe: Add support for IPv6 and UDP to ixgbe_get_headlen
ixgbe: Add support for tracking the default user priority to SR-IOV
ixgbe: Add support for GET_QUEUES message to get DCB configuration
ixgbe: Enable support for VF API version 1.1 in the PF.
ixgbevf: Add VF DCB + SR-IOV support
ixgbe: Drop unnecessary addition from ixgbe_set_rx_buffer_len
ixgbe: Fix possible memory leak in ixgbe_set_ringparam
Don Skidmore (1):
ixgbe: Add function ixgbe_reset_pipeline_82599
Emil Tantilov (1):
ixgbe: add WOL support for new subdevice id
Jacob Keller (1):
ixgbe: (PTP) refactor init, cyclecounter and reset
Tushar Dave (1):
ixgbe: Correcting small packet padding
Wei Yongjun (1):
ixgbe: using is_zero_ether_addr() to simplify the code
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Bjørn Mork [Mon, 22 Oct 2012 10:56:40 +0000 (10:56 +0000)]
net: cdc_mbim: Device Service Stream support
MBIM devices can support up to 256 generic streams called
Device Service Streams (DSS). The MBIM spec says
The format of the Device Service Stream payload depends
on the device service (as identified by the corresponding
UUID) that is used when opening the data stream.
Example use cases are serial AT command interfaces and NMEA
data streams. We cannot make any assumptions about these
device services.
Adding support for Device Service Stream by extending
the MBIM session to VLAN mapping scheme, allocating
VLAN IDs 256 to 511 for DSS, using the DSS SessionID
as the lower 8bit of the VLAN ID.
Using a netdev for DSS keeps the device framing intact and
allows userspace to do whatever it want with the streams.
For example, exporting an AT command interface using DSS
session #0 to a PTY for use with a terminal application like
minicom:
vconfig add wwan0 256
ip link set dev wwan0 up
ip link set dev wwan0.256 up
socat INTERFACE:wwan0.256,type=2 PTY:,echo=0,link=/tmp/modem
Device configuration must be done using MBIM control commands
over the /dev/cdc-wdmx device. The userspace management
application should coordinate host VLAN configuration and the
device MBIM configuration using the device capabilities to
find out if it needs to set up PTY mappings etc.
Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
Bjørn Mork [Mon, 22 Oct 2012 10:56:39 +0000 (10:56 +0000)]
net: cdc_ncm: map MBIM IPS SessionID to VLAN ID
MBIM devices can support up to 256 independent IP Streams.
The main network device will only handle SessionID 0. Mapping
SessionIDs 1 to 255 to VLANs using the SessionID as VLAN ID
allow userspace to use these streams with traditional tools
like vconfig.
Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
Bjørn Mork [Mon, 22 Oct 2012 10:56:38 +0000 (10:56 +0000)]
net: cdc_ncm: do not bind to NCM compatible MBIM devices
The MBIM specification allows a MBIM device to disguise
itself as NCM for backwards compatibility, using additional
altsettings with different subclass (control) or protocol
(data):
Greg Suarez [Mon, 22 Oct 2012 10:56:36 +0000 (10:56 +0000)]
net: cdc_mbim: adding MBIM driver
The CDC Mobile Broadband Interface Model (MBIM) specification
extends CDC NCM by
- removing the redundant ethernet header from the point-to-point
USB channel
- adding support for multiple IP (v4 and/or v6) sessions multiplexed
on the same USB channel
- adding a MBIM control channel encapsulated in CDC
- adding Device Service Streams (DSS), which are non IP generic data
streams multiplexed on the same USB channel as the IP sessions
MBIM devices are managed using the dedicated control channel, and no
data will flow on the data channel until a control session has been
established. This driver has no knowledge of MBIM control messages.
It just exports the control channel to a /dev/cdc-wdmX character
device for userspace management applications. Such an application is
therefore required to use this driver.
This patch implements basic MBIM support, reusing the NCM and WDM driver
APIs, currently limited to IP sessions with SessionID 0. DSS and
multiplexed IP sessions are not yet supported.
Signed-off-by: Greg Suarez <gsuarez@smithmicro.com> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
Bjørn Mork [Mon, 22 Oct 2012 10:56:34 +0000 (10:56 +0000)]
net: cdc_ncm: refactoring for tx multiplexing
Adding multiplexed NDP support to cdc_ncm_fill_tx_frame, allowing
transmissions of multiple independent sessions within the same NTB.
Refactoring the code quite a bit to avoid having to store copies
of multiple NDPs being prepared for tx. The old code would still
reserve enough room for a maximum sized NDP in the skb so we might
as well keep them in the skb while they are being prepared.
Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
Bjørn Mork [Mon, 22 Oct 2012 10:56:33 +0000 (10:56 +0000)]
net: cdc_ncm: splitting rx_fixup for code reuse
Verifying and handling received MBIM and NCM frames will need
to be different in three areas:
- verifying the NDP signature
- checking valid datagram length
- datagram header manipulation
This makes it inconvenient to share rx_fixup in whole. But
some verification parts are common. Split these out in separate
functions.
Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
Bjørn Mork [Mon, 22 Oct 2012 10:56:32 +0000 (10:56 +0000)]
net: cdc_ncm: process chained NDPs
The NCM 1.0 spefication makes provisions for linking more than
one NDP into a single NTB. This is important for MBIM support,
where these NDPs might be of different types.
Following the chain of NDPs is also correct for NCM, and will
not change anything in the common case where there is only
one NDP
Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
Greg Suarez [Mon, 22 Oct 2012 10:56:30 +0000 (10:56 +0000)]
net: cdc_ncm: adding MBIM support to ncm_setup
MBIM and NCM are very similar, so we can reuse most of the
setup and bind logic in cdc_ncm for CDC MBIM devices. Handle
a few minor differences in ncm_setup.
Signed-off-by: Greg Suarez <gsuarez@smithmicro.com> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
Greg Suarez [Mon, 22 Oct 2012 10:56:29 +0000 (10:56 +0000)]
USB: cdc: add MBIM constants and structures
Based on revision 1.0 of "Universal Serial Bus Communications
Class Subclass Specification for Mobile Broadband Interface
Model" available from www.usb.org
Signed-off-by: Greg Suarez <gsuarez@smithmicro.com>
[bmork: added DSS defines] Signed-off-by: Bjørn Mork <bjorn@mork.no> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Some devices do not support the 8 byte variants of the NTB input
size control messages despite announcing such support in their
NCM or MBIM functional descriptor.
According to the NCM specification, all devices must support the
4 byte variant regardless of whether or not the flag is set:
If bit D5 is set in the bmNetworkCapabilities field of
function’s NCM Functional Descriptor, the host may
set wLength either to 4 or to 8. If wLength is 4, the
function shall assume that wNtbInMaxDatagrams is to be
set to zero. If wLength is 8, then the function shall
use the provided value as the limit. The function shall
return an error response (a STALL PID) if wLength is set
to any other value.
We do not set wNtbInMaxDatagrams in any case, so we can just as
well unconditionally use the 4 byte variant without losing any
functionality. This works around the known firmware bug, and
simplifies the code considerably.
Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
Joachim Eastwood [Mon, 22 Oct 2012 08:45:31 +0000 (08:45 +0000)]
net/macb: fix truncate warnings
When building macb on x86_64 the following warnings show up:
drivers/net/ethernet/cadence/macb.c: In function macb_interrupt:
drivers/net/ethernet/cadence/macb.c:556:4: warning: large integer implicitly truncated to unsigned type [-Woverflow]
drivers/net/ethernet/cadence/macb.c: In function macb_reset_hw:
drivers/net/ethernet/cadence/macb.c:792:2: warning: large integer implicitly truncated to unsigned type [-Woverflow]
drivers/net/ethernet/cadence/macb.c:793:2: warning: large integer implicitly truncated to unsigned type [-Woverflow]
drivers/net/ethernet/cadence/macb.c:796:2: warning: large integer implicitly truncated to unsigned type [-Woverflow]
Use -1 insted of ~0UL, as done in other places in the driver,
to silence these warnings.
Signed-off-by: Joachim Eastwood <manabian@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Nicolas Dichtel [Mon, 22 Oct 2012 03:42:09 +0000 (03:42 +0000)]
ipv6: add support of equal cost multipath (ECMP)
Each nexthop is added like a single route in the routing table. All routes
that have the same metric/weight and destination but not the same gateway
are considering as ECMP routes. They are linked together, through a list called
rt6i_siblings.
ECMP routes can be added in one shot, with RTA_MULTIPATH attribute or one after
the other (in both case, the flag NLM_F_EXCL should not be set).
The patch is based on a previous work from
Luc Saillard <luc.saillard@6wind.com>.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Wed, 12 Sep 2012 07:09:51 +0000 (07:09 +0000)]
ixgbe: Fix possible memory leak in ixgbe_set_ringparam
We were not correctly freeing the temporary rings on error in
ixgbe_set_ring_param. In order to correct this I am unwinding a number of
changes that were made in order to get things back to the original working
form with modification for the current ring layouts.
This approach has multiple advantages including a smaller memory footprint,
and the fact that the interface is stopped while we are allocating the rings
meaning that there is less potential for some sort of memory corruption on the
ring.
The only disadvantage I see with this approach is that on a Rx allocation
failure we will report an error and only update the Tx rings. However the
adapter should be fully functional in this state and the likelihood of such
an error is very low. In addition it is not unreasonable to expect the
user to need to recheck the ring configuration should they experience an
error setting the ring sizes.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Don Skidmore [Tue, 11 Sep 2012 06:58:19 +0000 (06:58 +0000)]
ixgbe: Add function ixgbe_reset_pipeline_82599
This patch adds a function that forces a full pipeline reset. This
function will be used in following patches to completely reset the PHY
during resets.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Alexander Duyck [Sat, 8 Sep 2012 03:17:01 +0000 (03:17 +0000)]
ixgbe: Drop unnecessary addition from ixgbe_set_rx_buffer_len
We still had some code floating around from the old single buffer receive
path. As a result we were adding VLAN_HLEN to max_frame although the
resultant value was never used. Since that is the case we can drop this from
the function.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Tushar Dave [Fri, 14 Sep 2012 04:24:49 +0000 (04:24 +0000)]
ixgbe: Correcting small packet padding
Driver pad skb up to 17 bytes because of the HW requirement. However, that code
implementation mess up the skb tail pointer after padding. This patch sets
skb->tail correctly.
Signed-off-by: Tushar Dave <tushar.n.dave@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jacob Keller [Sat, 25 Aug 2012 03:54:19 +0000 (03:54 +0000)]
ixgbe: (PTP) refactor init, cyclecounter and reset
This patch modifies when and where PTP registers and data are set. Previously
a work-around was used inside cyclecounter_start in order to reset some of the
time registers. This patch creates a new ixgbe_ptp_reset specifically for this
purpose. The cyclecounter configuration has trimmed down to only modify what
is necessary. Due to hardware conditions after probe and before open, PTP init
has now moved into the ixgbe_open call. This allows the ptp device name in the
sysfs to be the ethernet device name instead of the MAC address.
The cyclecounter check flag is renamed to PTP_ENABLED and is used to prevent
PTP init from happening when PTP has not been enabled.
CC: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Emil Tantilov [Thu, 16 Aug 2012 08:13:07 +0000 (08:13 +0000)]
ixgbe: add WOL support for new subdevice id
This patch adds a subdevice id for new 82599 device. The define is needed
to allow enabling WOL support.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Alexander Duyck [Fri, 20 Jul 2012 08:10:03 +0000 (08:10 +0000)]
ixgbevf: Add VF DCB + SR-IOV support
This change adds support for DCB and SR-IOV from the VF. With this change
in place the VF will correctly use a traffic class other than 0 in the case
that the PF is configured with the default user priority belonging to a
traffic class other than 0.
Cc: Greg Rose <gregory.v.rose@intel.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Sibai Li <sibai.li@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Alexander Duyck [Fri, 20 Jul 2012 08:09:37 +0000 (08:09 +0000)]
ixgbe: Enable support for VF API version 1.1 in the PF.
This change switches on the last few bits for us enabling version 1.1 VF
support in the PF.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Tested-by: Robert Garrett <RobertX.Garrett@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Alexander Duyck [Fri, 20 Jul 2012 08:09:32 +0000 (08:09 +0000)]
ixgbe: Add support for GET_QUEUES message to get DCB configuration
This patch addresses several issues in regards to the combination of DCB
and SR-IOV. Specifically it allows us to send information to the VF on
which queues it should be using.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Alexander Duyck [Tue, 2 Oct 2012 00:17:03 +0000 (00:17 +0000)]
ixgbe: Add support for tracking the default user priority to SR-IOV
It is necessary to track the default user priority in the PF so that we can
force it upon the VFs. The motivation behind this is to keep the VFs from
getting access to user priorities meant for things like storage.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Alexander Duyck [Thu, 24 May 2012 08:26:29 +0000 (08:26 +0000)]
ixgbe: Add support for IPv6 and UDP to ixgbe_get_headlen
This change adds support for IPv6 and UDP to ixgbe_get_headlen. The
advantage to this is that we can now handle ipv4/UDP, ipv6/TCP, and
ipv6/UDP with a single memcpy instead of having to do them in multiple
pskb_may_pull calls.
A quick bit of testing shows that we increase throughput for a single
session of netperf from 8800Mpbs to about 9300Mpbs in the case of ipv6/TCP.
As such overall ipv6 performance should improve with this change.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Stephen Ko <stephen.s.ko@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
ipvs: fix build errors related to config option combinations
Fix two build error introduced by commit 63dca2c0:
"ipvs: Fix faulty IPv6 extension header handling in IPVS"
First build error was fairly trivial and can occur, when
CONFIG_IP_VS_IPV6 is disabled.
The second build error was tricky, and can occur when deselecting
both all Netfilter and IPVS, but selecting CONFIG_IPV6. This is
caused by "kernel/sysctl_binary.c" including "net/ip_vs.h", which
includes "linux/netfilter_ipv6/ip6_tables.h" causing include
of "include/linux/netfilter/x_tables.h" which then cannot find
the typedef nf_hookfn.
Fix this by only including "linux/netfilter_ipv6/ip6_tables.h" in
case of CONFIG_IP_VS_IPV6 as its already used to guard the usage
of ipv6_find_hdr().
Reported-by: Fengguang Wu <fengguang.wu@intel.com> Reported-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
Eric Dumazet [Sun, 21 Oct 2012 20:12:09 +0000 (20:12 +0000)]
ipv4: 16 slots in initial fib_info hash table
A small host typically needs ~10 fib_info structures, so create initial
hash table with 16 slots instead of only one. This removes potential
false sharing and reallocs/rehashes (1->2->4->8->16)
Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sun, 21 Oct 2012 19:57:11 +0000 (19:57 +0000)]
tcp: RFC 5961 5.2 Blind Data Injection Attack Mitigation
RFC 5961 5.2 [Blind Data Injection Attack].[Mitigation]
All TCP stacks MAY implement the following mitigation. TCP stacks
that implement this mitigation MUST add an additional input check to
any incoming segment. The ACK value is considered acceptable only if
it is in the range of ((SND.UNA - MAX.SND.WND) <= SEG.ACK <=
SND.NXT). All incoming segments whose ACK value doesn't satisfy the
above condition MUST be discarded and an ACK sent back.
Move tcp_send_challenge_ack() before tcp_ack() to avoid a forward
declaration.
Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Jerry Chu <hkchu@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/ipvs-next
Pull updates from Jesper Dangaard Brouer for IPVS mostly targeted
to improve IPv6 support (7 commits):
ipvs: Trivial changes, use compressed IPv6 address in output
ipvs: IPv6 extend ICMPv6 handling for future types
ipvs: Use config macro IS_ENABLED()
ipvs: Fix faulty IPv6 extension header handling in IPVS
ipvs: Complete IPv6 fragment handling for IPVS
ipvs: API change to avoid rescan of IPv6 exthdr
ipvs: SIP fragment handling
FW flash layout on Skyhawk-R is different from BE3-R.
Hence the code needs to be fixed to flash FW on Skyhawk-R.
Also cleaning up code in BE3-R flashing function.
Signed-off-by: Vasundhara Volam <vasundhara.volam@emulex.com> Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
be2net: Enabling Wake-on-LAN is not supported in S5 state
be_shutdown is enabling wake-on-lan by calling be_setup_wol.
Emulex adapter do not support wake-on-lan in S5 state.
Signed-off-by: Vasundhara Volam <vasundhara.volam@emulex.com> Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
be2net: Fix issues in error recovery due to wrong queue state
During recovery from a FW error, destroy queue operation may fail.
Queue should be marked as destroyed so that recovery code can recreate
the queue. Also fix queue created state not getting checked at one instance.
Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
be2net: Fix error messages while driver load for VFs
VF does not have privileges to execute many commands. When VFs try
to execute those commands there are unnecessary error messages.
Fix this by executing only those commands for which VF has privilege.
Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
be2net: Fix change MAC operation for VF for Lancer
For changing MAC of VF from PF, delete MAC operation needs to be done before
assigning new MAC. Also in ndo_set_mac_address operation avoid delete MAC if
it has been already deleted by PF.
Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
be2net: Fix driver load failure for different FW configs in Lancer
Driver assumes FW resource counts and capabilities while creating queues and
using functionality like RSS. This causes driver load to fail in FW configs
where resources and capabilities are reduced. Fix this by querying FW
configuration during probe and using resources and capabilities accordingly.
Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Rami Rosen [Fri, 19 Oct 2012 01:09:30 +0000 (01:09 +0000)]
net:dev: remove double indentical assignment in dev_change_net_namespace().
This patch removes double assignment of err to -EINVAL in dev_change_net_namespace().
Signed-off-by: Rami Rosen <ramirose@gmail.com> Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Thu, 18 Oct 2012 23:55:56 +0000 (23:55 +0000)]
sockopt: Make SO_BINDTODEVICE readable
The SO_BINDTODEVICE option is the only SOL_SOCKET one that can be set, but
cannot be get via sockopt API. The only way we can find the device id a
socket is bound to is via sock-diag interface. But the diag works only on
hashed sockets, while the opt in question can be set for yet unhashed one.
That said, in order to know what device a socket is bound to (we do want
to know this in checkpoint-restore project) I propose to make this option
getsockopt-able and report the respective device index.
Another solution to the problem might be to teach the sock-diag reporting
info on unhashed sockets. Should I go this way instead?
Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>