Instead of calling 3 times ntohs(random->param_hdr.length), 2 times
ntohs(hmacs->param_hdr.length), and 3 times ntohs(chunks->param_hdr.length)
within the same function, we only call each once and store it in a
variable.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Frank Li [Wed, 6 Feb 2013 14:59:59 +0000 (14:59 +0000)]
net: fec: fix spin_lock dead lock
=========================================================
[ INFO: possible irq lock inversion dependency detected ]
3.8.0-rc5+ #82 Not tainted
---------------------------------------------------------
swapper/0/0 just changed the state of lock:
(&(&fep->hw_lock)->rlock){..-...}, at: [<8034e2f8>] fec_enet_start_xmit+0x48/0x 2cc
but this lock took another, SOFTIRQ-unsafe lock in the past:
(prepare_lock){+.+.+.}
and interrupts could create inverse lock ordering between them.
other info that might help us debug this:
Possible interrupt unsafe locking scenario:
Tom Herbert [Wed, 6 Feb 2013 07:58:41 +0000 (07:58 +0000)]
mlx4_en: Fix BQL reset TX queue call point
Fix issue in Mellanox driver related to BQL. netdev_tx_reset_queue
was not being called in certain situations where the device was
being start and stopped. Moved netdev_tx_reset_queue from the reset
device path to mlx4_en_free_tx_buf which is where the rings are
cleaned in a reset (specifically from device being stopped).
Signed-off-by: Tom Herbert <therbert@google.com> Acked-By: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 8 Feb 2013 04:28:32 +0000 (23:28 -0500)]
Merge branch 'mlx4'
Amir Vadai says:
====================
This series from Yan Burman adds support for unicast MAC address filtering and
ndo FDB operations. It also includes some optimizations to loopback related
decisions and checks in the TX/RX fast path and one cleanup, all in separate
patches.
Today, when adding macvlan devices, the NIC goes into promiscuous mode, since
unicast MAC filtering is not supported. With these changes, macvlan devices can
be added without the penalty of promiscuous mode.
If for some reason adding a unicast address filter fails e.g as of missing space in
the HW mac table, the device forces itself into promiscuous mode (and out of this
forced state when enough space is available).
Also, now it is possible to have bridge under multi-function configuration that include
PF and VFs. In order to use bridge over PF/VFs, VM MAC fdb entries must be added e.g.
using 'bridge fdb add' command.
Changes from v1 - based on more comments from Eric Dumazet:
* added failure handling when adding unicast address filter
Changes from v0 - based on comments from Eric Dumazet:
* Removed unneeded synchronize_rcu()
* Use kfree_rcu() instead of synchronize_rcu() + kfree()
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Yan Burman [Thu, 7 Feb 2013 02:25:27 +0000 (02:25 +0000)]
net/mlx4_en: Implement ndo fdb functionality
Add support for setting embedded switch fdb in case of SRIOV, by
implementing ndo_fdb_{add, del, dump}. This will allow to use
bridged configuration with multi-function. In order to add VM MAC
to the eSwitch fdb, the following command may be used over the relevant function interface:
bridge fdb add <MAC> permanent self dev <IFACE>
Signed-off-by: Yan Burman <yanb@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yan Burman [Thu, 7 Feb 2013 02:25:26 +0000 (02:25 +0000)]
net/mlx4_en: Add unicast MAC filtering
Implement and advertise unicast MAC filtering, such that setting macvlan
instance over mlx4_en interfaces will not require the networking core
to put mlx4_en devices in promiscuous mode.
If for some reason adding a unicast address filter fails e.g as of missing space in
the HW mac table, the device forces itself into promiscuous mode (and out of this
forced state when enough space is available).
Signed-off-by: Yan Burman <yanb@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yan Burman [Thu, 7 Feb 2013 02:25:25 +0000 (02:25 +0000)]
net/mlx4_en: Manage hash of MAC addresses per port
As a preparation step for supporting multiple unicast addresses, store MAC addresses in hash table.
Remove the radix tree for MAC addresses per QP, as it's not in use.
Signed-off-by: Yan Burman <yanb@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yan Burman [Thu, 7 Feb 2013 02:25:24 +0000 (02:25 +0000)]
net/mlx4_en: Save previous MAC address of the port so we can replace it later
In preparation to having more than one unicast MAC per port, we need to keep track
of the previous MAC address in the flow of ndo_set_mac_address,
so that mlx4_en_replace_mac will know what to replace.
Signed-off-by: Yan Burman <yanb@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yan Burman [Thu, 7 Feb 2013 02:25:23 +0000 (02:25 +0000)]
net/mlx4_en: Re-arrange ndo_set_rx_mode related code
Currently, mlx4_en_do_set_multicast serves as the ndo_set_rx_mode entry for mlx4_en,
doing all related work. Split it to few calls, one per required functionality
(e.g multicast, promiscuous, etc) and rename some structures and calls
to use rx_mode notation instead of multicast.
Signed-off-by: Yan Burman <yanb@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yan Burman [Thu, 7 Feb 2013 02:25:22 +0000 (02:25 +0000)]
net/mlx4: Move Ethernet related functionality from mlx4_core to mlx4_en
Move low level code that deals with management of Ethernet MACs and QPs from mlx4_core to mlx4_en.
Also convert the new functions to deal with MACs in form of char array instead of u64.
Actual functions moved:
mlx4_replace_mac
mlx4_get_eth_qp
mlx4_put_eth_qp
To conduct this change, some functionality had to be exported from the core,
the following functions were added:
mlx4_get_base_qp
__mlx4_replace_mac (low level function for CX1/A0 compatibility)
Signed-off-by: Yan Burman <yanb@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yan Burman [Thu, 7 Feb 2013 02:25:20 +0000 (02:25 +0000)]
net/mlx4_en: Optimize Rx fast path filter checks
Currently, RX path code that does RX filtering is not optimized
and does an expensive conversion. In order to use ether_addr_equal_64bits
which is optimized for such cases, we need the MAC address kept by the device
to be in the form of unsigned char array instead of u64. Store the MAC address
as unsigned char array and convert to/from u64 out of the fast path when needed.
Side effect of this is that we no longer need priv->mac, since it's the same
as dev->dev_addr.
This optimization was suggested by Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Yan Burman <yanb@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yan Burman [Thu, 7 Feb 2013 02:25:19 +0000 (02:25 +0000)]
net/mlx4_en: Optimize loopback related checks in data path
Currently there are relatively complex conditional checks in the fast path,
for TX loopback enabling and resulting RX filter logic.
Move elaborate if's out of data path, replace them with a single flag
for each state and update that state from appropriate places.
Also, in native (non SRIOV) mode and not in loopback or in selftest,
there is no need to try and filter out packets that HW loopback-ed,
as in native mode we do not loopback packets anymore.
Signed-off-by: Yan Burman <yanb@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Hauke Mehrtens [Wed, 6 Feb 2013 05:51:49 +0000 (05:51 +0000)]
bgmac: add ndo_set_rx_mode netdev ops
When changing the device from or to promisc mode this only affects the
device after the device is bought up the next time. For bridging it is
needed to change the device to promisc mode while it is up, which is
possible with this patch.
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Cong Wang [Tue, 5 Feb 2013 16:36:38 +0000 (16:36 +0000)]
net: adjust skb_gso_segment() for calling in rx path
skb_gso_segment() is almost always called in tx path,
except for openvswitch. It calls this function when
it receives the packet and tries to queue it to user-space.
In this special case, the ->ip_summed check inside
skb_gso_segment() is no longer true, as ->ip_summed value
has different meanings on rx path.
This patch adjusts skb_gso_segment() so that we can at least
avoid such warnings on checksum.
Cc: Jesse Gross <jesse@nicira.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Flavio Leitner [Tue, 5 Feb 2013 09:30:55 +0000 (09:30 +0000)]
team: allow userspace to take control over carrier
Some modes don't require any special carrier handling so
in these cases, the kernel can control the carrier as for
any other interface. However, some other modes, e.g. lacp,
requires more than just that, so userspace needs to control
the carrier itself.
The daemon today is ready to control it, but the kernel
still can change it based on events.
This fix so that either kernel or userspace is controlling
the carrier.
Signed-off-by: Flavio Leitner <fbl@redhat.com> Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
Neil Horman [Tue, 5 Feb 2013 08:05:43 +0000 (08:05 +0000)]
netpoll: protect napi_poll and poll_controller during dev_[open|close]
Ivan Vercera was recently backporting commit 9c13cb8bb477a83b9a3c9e5a5478a4e21294a760 to a RHEL kernel, and I noticed that,
while this patch protects the tg3 driver from having its ndo_poll_controller
routine called during device initalization, it does nothing for the driver
during shutdown. I.e. it would be entirely possible to have the
ndo_poll_controller method (or subsequently the ndo_poll) routine called for a
driver in the netpoll path on CPU A while in parallel on CPU B, the ndo_close or
ndo_open routine could be called. Given that the two latter routines tend to
initizlize and free many data structures that the former two rely on, the result
can easily be data corruption or various other crashes. Furthermore, it seems
that this is potentially a problem with all net drivers that support netpoll,
and so this should ideally be fixed in a common path.
As Ben H Pointed out to me, we can't preform dev_open/dev_close in atomic
context, so I've come up with this solution. We can use a mutex to sleep in
open/close paths and just do a mutex_trylock in the napi poll path and abandon
the poll attempt if we're locked, as we'll just retry the poll on the next send
anyway.
I've tested this here by flooding netconsole with messages on a system whos nic
driver I modfied to periodically return NETDEV_TX_BUSY, so that the netpoll tx
workqueue would be forced to send frames and poll the device. While this was
going on I rapidly ifdown/up'ed the interface and watched for any problems.
I've not found any.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: Ivan Vecera <ivecera@redhat.com> CC: "David S. Miller" <davem@davemloft.net> CC: Ben Hutchings <bhutchings@solarflare.com> CC: Francois Romieu <romieu@fr.zoreil.com> CC: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Calling icmpv6_send() on a local message size error leads to an
incorrect update of the path mtu in the case when IPsec is used.
So use ipv6_local_error() instead to notify the socket about the
error.
Reported-by: Jiri Bohac <jbohac@suse.cz> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Mon, 4 Feb 2013 18:22:29 +0000 (18:22 +0000)]
drivers: net: misc: Remove unused OOM variables
commits 9d11bd159
("wimax: Remove unnecessary alloc/OOM messages, alloc cleanups")
and b2adaca92
("ethernet: Remove unnecessary alloc/OOM messages, alloc cleanups")
added a couple of unused variable warnings.
Remove the now unused variables.
Noticed-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 5 Feb 2013 19:54:49 +0000 (14:54 -0500)]
Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next
Jeff Kirsher says:
====================
This series contains updates to e1000e and ixgbe. Majority of the patches
are against e1000e, where Bruce makes several cosmetic #define moves into
header files. In addition, Bruce does a cleanup of braces to resolve
checkpatch warnings (when using the strict option).
Ixgbe patches contain several fixes as well as updating the copyright. The
fixes from Josh Hay, resolved a possible NULL pointer dereference and
resolved Smatch warnings by fixing return values and memcpy parameters.
Alex provides 2 fixes, the first is to replace rmb() with
read_barrier_depends() in the Tx cleanup. The second fixes an MTU
warning when using SR-IOV which corrects the fact that we were using 1522
to test for the max frame size in ixgbe_change_mtu and 1518 in
ixgbe_set_vf_lpe. The difference was the addition of VLAN_HLEN, which we
only need to add in the case of computing a buffer size, but not a filter
size. Lastly, a patch from Emil which is based on a community patch from
Aurélien Guillaume which adds functions needed for reading SFF-8472
diagnostic data from SFP modules.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
TCP Appropriate Byte Count was added by me, but later disabled.
There is no point in maintaining it since it is a potential source
of bugs and Linux already implements other better window protection
heuristics.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Parkin [Thu, 31 Jan 2013 23:43:03 +0000 (23:43 +0000)]
l2tp: create tunnel sockets in the right namespace
When creating unmanaged tunnel sockets we should honour the network namespace
passed to l2tp_tunnel_create. Furthermore, unmanaged tunnel sockets should
not hold a reference to the network namespace lest they accidentally keep
alive a namespace which should otherwise have been released.
Unmanaged tunnel sockets now drop their namespace reference via sk_change_net,
and are released in a new pernet exit callback, l2tp_exit_net.
Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Parkin [Thu, 31 Jan 2013 23:43:02 +0000 (23:43 +0000)]
l2tp: prevent tunnel creation on netns mismatch
l2tp_tunnel_create is passed a pointer to the network namespace for the
tunnel, along with an optional file descriptor for the tunnel which may
be passed in from userspace via. netlink.
In the case where the file descriptor is defined, ensure that the namespace
associated with that socket matches the namespace explicitly passed to
l2tp_tunnel_create.
Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Parkin [Thu, 31 Jan 2013 23:43:01 +0000 (23:43 +0000)]
l2tp: set netnsok flag for netlink messages
The L2TP netlink code can run in namespaces. Set the netnsok flag in
genl_family to true to reflect that fact.
Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Parkin [Thu, 31 Jan 2013 23:43:00 +0000 (23:43 +0000)]
l2tp: put tunnel socket release on a workqueue
To allow l2tp_tunnel_delete to be called from an atomic context, place the
tunnel socket release calls on a workqueue for asynchronous execution.
Tunnel memory is eventually freed in the tunnel socket destructor.
Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The ipv6 route.c conflict is simple, just ignore the 'net' side change
as we fixed the same problem in 'net-next' by eliminating cached
neighbours from ipv6 routes.
The e1000e conflict is an addition of a new statistic in the ethtool
code, trivial.
The vmxnet3 conflict is about one change in 'net' removing a guarding
conditional, whilst in 'net-next' we had a netdev_info() conversion.
The iwlwifi conflict is dealing with a WARN_ON() conversion in
'net-next' vs. a revert happening in 'net'.
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Wed, 9 Jan 2013 08:50:42 +0000 (08:50 +0000)]
ixgbe: Fix SR-IOV MTU warning
This change corrects the fact that we were using 1522 to test for the
max frame size in ixgbe_change_mtu and 1518 in ixgbe_set_vf_lpe. The
difference was the addition of VLAN_HLEN which we only need to add in the case
of computing a buffer size, but not a filter size.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Sibai Li <Sibai.li@intel.com> Tested-by: Stephen Ko <stephen.s.ko@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Alexander Duyck [Tue, 8 Jan 2013 07:00:58 +0000 (07:00 +0000)]
ixgbe: Replace rmb in Tx cleanup with read_barrier_depends
The rmb in the Tx cleanup path is a much stronger barrier than we really need.
All that is really needed is a read_barrier_depends since the location of the
EOP descriptor is dependent on the eop_desc value.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Don Skidmore [Tue, 8 Jan 2013 05:02:28 +0000 (05:02 +0000)]
ixgbe: update date to 2013
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Josh Hay [Fri, 4 Jan 2013 03:34:42 +0000 (03:34 +0000)]
ixgbe: fix return values and memcpy parameters to eliminate Smatch warnings
This patch removes the rval variable returns from function and replaces
them with direct returns in ixgbe_dcbnl_getnumtcs. It also changes how
ixgbe_gstrings_test is copied into data with memcpy in ixgbe_get_strings
because "*ixgbe_gstrings_test too small (32 vs 160)".
Signed-off-by: Josh Hay <joshua.a.hay@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Josh Hay [Fri, 4 Jan 2013 03:34:36 +0000 (03:34 +0000)]
ixgbe: fix potential null dereference
This patch adds a default case which goes to the next loop iteration
in the case where p is not set, preventing p from being dereferenced.
Signed-off-by: Josh Hay <joshua.a.hay@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Thu, 24 Jan 2013 00:50:18 +0000 (00:50 +0000)]
e1000e: cleanup checkpatch braces checks
Resolve the following strict checkpatch checks:
CHECK:BRACES: Blank lines aren't necessary after an open brace '{'
CHECK:BRACES: Blank lines aren't necessary before a close brace '}'
CHECK:BRACES: braces {} should be used on all arms of this statement
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Tue, 5 Feb 2013 08:30:59 +0000 (00:30 -0800)]
e1000e: convert enums of register offsets and move #defines to regs.h
There are enough register offsets to warrant being in their own header
file, and doing so logically separates them from other header file content.
They have been converted from an enumerated data type to #defines as is
done in all the other Intel wired ethernet drivers.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Tue, 22 Jan 2013 08:44:35 +0000 (08:44 +0000)]
e1000e: cosmetic move of #defines and prototypes to the new manage.h
Move #defines, function prototypes and data types which are applicable to
all/most devices supported by the driver but are specific to the
manageability component of each device to the new manage.h header file.
These #defines, function prototypes and data types can be used by other
files in the driver and moving them to the manageability-specific file
makes it clearer to which component they are applicable.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Tue, 22 Jan 2013 08:44:30 +0000 (08:44 +0000)]
e1000e: cosmetic move of #defines and function prototypes to the new nvm.h
Move #defines and function prototypes which are applicable to all/most
devices supported by the driver and are specific to the NVM component of
each device to the new nvm.h header file. These #defines and function
prototypes can be used by other files in the driver and moving them to the
NVM-specific file makes it clearer to which component they are applicable.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Tue, 22 Jan 2013 08:44:25 +0000 (08:44 +0000)]
e1000e: cosmetic move of #defines and function prototypes to the new phy.h
Move #defines and function prototypes which are applicable to all/most
devices supported by the driver and are specific to the PHY component of
each device to the new phy.h header file. These function prototypes can be
used by other files in the driver and moving them to the PHY-specific file
makes it clearer to which component they are applicable.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Tue, 22 Jan 2013 08:44:19 +0000 (08:44 +0000)]
e1000e: cosmetic move of function prototypes to the new mac.h
Move prototypes for functions which are applicable to all/most devices
supported by the driver and are specific to the MAC component of each
device to the new mac.h header file. These function prototypes can be used
by other files in the driver and moving them to the MAC-specific file makes
it clearer to which component they are applicable.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Tue, 22 Jan 2013 08:44:14 +0000 (08:44 +0000)]
e1000e: cosmetic move of #defines and prototypes to the new ich8lan.h
Move #defines and function prototypes specific to the ICH/PCH family of
devices (ICH8/82562, ICH8/82566, ICH8/82567, ICH9/82562, ICH9/82566,
ICH9/82567, ICH10/82567, 82577, 82578, 82579, I217, I218) to the new
ich8lan.h header file (the convention for Intel wired ethernet drivers is
to use the name of the first device in the family for related file and
function names). These defines and function prototypes can be used by
other files in the driver and moving them to the ICH/PCH-family-specific
file makes it clearer to which devices they are applicable.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Tue, 22 Jan 2013 08:44:09 +0000 (08:44 +0000)]
e1000e: cosmetic move of #defines to the new 80003es2lan.h
Move #defines specific to the ESB2/82563 family of devices to the new
80003es2lan.h header file. These defines can be used by other files in the
driver and moving them to the 80003es2lan-family-specific file makes it
clearer to which devices they are applicable.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Tue, 22 Jan 2013 08:44:04 +0000 (08:44 +0000)]
e1000e: cosmetic move of #defines and prototypes to the new 82571.h
Move #defines and function prototypes specific to the 8257x family of
devices (82571, 82572, 82573, 82574, 82583) to the new 82571.h header file
(the convention for Intel wired ethernet drivers is to use the name of the
first device in the family for related file and function names). These
defines and function prototypes can be used by other files in the driver
and moving them to the 8257x-family-specific file makes it clearer to which
devices they are applicable.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
"gianfar: Pack struct gfar_priv_grp into three cachelines"
causes the following null dereference at driver init on sbc8548:
libphy: Freescale PowerQUICC MII Bus: probed
Unable to handle kernel paging request for data at address 0x00000000
Faulting instruction address: 0xc01d6a38
Oops: Kernel access of bad area, sig: 11 [#1]
[...]
NIP [c01d6a38] gfar_parse_group+0x228/0x280
LR [c01d6a34] gfar_parse_group+0x224/0x280
Call Trace:
[ef82dd60] [c01d6a34] gfar_parse_group+0x224/0x280 (unreliable)
[ef82dd90] [c01d73a4] gfar_probe+0x284/0xfe0
The reason is that the commit also changed the allocation of the
Rx and error handling irq structs to be skipped for !MQ_MG_MODE.
In the !MQ_MG_MODE case, only the Tx irq struct is allocated.
Digging further, we see that MQ_MG_MODE is set only if we find
the OF compatible string "fsl,etsec2".
A quick grep in the dts directory shows lots of boards that support
Rx/Tx/Err, but without this specific compat string. And hence they
go after the unallocated Rx/Error structs and cause the above oops.
Hence such a change can not be deployed until all the dts files
are updated and sufficiently deployed. Further, the optimization
is of limited value, since the kmalloc'd struct in question has only
a single unsigned int, and an (IFNAMSIZ + 6) sized string.
Note that no changes to the freeing code are needed here, as it
already did an unconditional free of Rx/Tx/Error gfar_irqinfo.
Cc: Claudiu Manoil <claudiu.manoil@freescale.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Fri, 1 Feb 2013 08:17:25 +0000 (08:17 +0000)]
team: move netlink event notifiers after team_port_leave()
In team_port_del(), there is need to be do all the cleanup related
things first and netlink event notifiers should be called after that.
This fixes two problems:
team carrier is now correctly set (port is removed from list first)
mode can set option as changed in .port_leave op
Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Fri, 1 Feb 2013 08:17:24 +0000 (08:17 +0000)]
team: handle sending port list in the same way option list is sent
Essentially do the same thing with port list as with option list.
Multipart netlink message.
Side effect is that port event message can send port which is not longer
in team->port_list.
Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
Hadar Hen Zion [Mon, 4 Feb 2013 03:01:21 +0000 (03:01 +0000)]
net/mlx4_en: Fix compilation error when CONFIG_INET isn't defined
ip_eth_mc_map function can't be used when CONFIG_INET isn't defined.
Fixed compilation error by adding CONFIG_INET define check before using the
function.
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ying Xue [Sun, 3 Feb 2013 20:32:57 +0000 (20:32 +0000)]
net: remove redundant check for timer pending state before del_timer
As in del_timer() there has already placed a timer_pending() function
to check whether the timer to be deleted is pending or not, it's
unnecessary to check timer pending state again before del_timer() is
called.
Signed-off-by: Ying Xue <ying.xue@windriver.com> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Sathya Perla [Sun, 3 Feb 2013 20:30:11 +0000 (20:30 +0000)]
be2net: fix re-loaded PF driver to re-gain control of its VFs
Currently, when the PF driver is unloaded and re-loaded while VFs are attached
to VMs, it loses control of its VFs.
The PF driver now uses the newly defined/created GET_IFACE_LIST cmd
(available in FW ver >= 4.6) to query the if_id of the VFs
(enabled in its previous life). The PF driver then uses the if_id for
further VF configuration.
The GET_IFACE_MAC_LIST cmd has also implemented in BE3 FW for PF to
query pmac-ids used by its VFs.
Signed-off-by: Sathya Perla <sathya.perla@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
alloc failures already get standardized OOM
messages and a dump_stack.
Convert kzalloc's with multiplies to kcalloc.
Convert kmalloc's with multiplies to kmalloc_array.
Remove now unused variables.
Remove unnecessary memset after kzalloc->kcalloc.
Whitespace cleanups for these changes.
Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
alloc failures already get standardized OOM
messages and a dump_stack.
Convert kzalloc's with multiplies to kcalloc.
Convert kmalloc's with multiplies to kmalloc_array.
Fix a few whitespace defects.
Convert a constant 6 to ETH_ALEN.
Use parentheses around sizeof.
Convert vmalloc/memset to vzalloc.
Remove now unused size variables.
Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Fri, 1 Feb 2013 04:37:43 +0000 (04:37 +0000)]
sctp: sctp_close: fix release of bindings for deferred call_rcu's
It seems due to RCU usage, i.e. within SCTP's address binding list,
a, say, ``behavioral change'' was introduced which does actually
not conform to the RFC anymore. In particular consider the following
(fictional) scenario to demonstrate this:
do:
Two SOCK_SEQPACKET-style sockets are opened (S1, S2)
S1 is bound to 127.0.0.1, port 1024 [server]
S2 is bound to 127.0.0.1, port 1025 [client]
listen(2) is invoked on S1
From S2 we call one sendmsg(2) with msg.msg_name and
msg.msg_namelen parameters set to the server's
address
S1, S2 are closed
goto do
The first pass of this loop passes successful, while the second round
fails during binding of S1 (address still in use). What is happening?
In the first round, the initial handshake is being done, and, at the
time close(2) is called on S1, a non-graceful shutdown is performed via
ABORT since in S1's receive queue an unprocessed packet is present,
thus stating an error condition. This can be considered as a correct
behavior.
During close also all bound addresses are freed, thus nothing *must*
be active anymore. In reference to RFC2960:
After checking the Verification Tag, the receiving endpoint shall
remove the association from its record, and shall report the
termination to its upper layer. (9.1 Abort of an Association)
Also, no half-open states are supported, thus after an ungraceful
shutdown, we leave nothing behind. However, this seems not to be
happening though. In a real-world scenario, this is exactly where
it breaks the lksctp-tools functional test suite, *for instance*:
./test_sockopt
test_sockopt.c 1 PASS : getsockopt(SCTP_STATUS) on a socket with no assoc
test_sockopt.c 2 PASS : getsockopt(SCTP_STATUS)
test_sockopt.c 3 PASS : getsockopt(SCTP_STATUS) with invalid associd
test_sockopt.c 4 PASS : getsockopt(SCTP_STATUS) with NULL associd
test_sockopt.c 5 BROK : bind: Address already in use
The underlying problem is that sctp_endpoint_destroy() hasn't been
triggered yet while the next bind attempt is being done. It will be
triggered eventually (but too late) by sctp_transport_destroy_rcu()
after one RCU grace period:
Thus, we move out the condition with sctp_association_put() as well as
the sctp_packet_free() invocation and the issue can be solved. We also
better free the SCTP chunks first before putting the ref of the association.
With this patch, the example above (which simulates a similar scenario
as in the implementation of this test case) and therefore also the test
suite run successfully through. Tested by myself.
Cc: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
since the mdb table is belong to bridge device,and the
bridge device can only be seen in one netns.
So it's safe to allow unprivileged user which is the
creator of userns and netns to modify the mdb table.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Gao feng [Thu, 31 Jan 2013 16:30:58 +0000 (16:30 +0000)]
netns: ebtable: allow unprivileged users to operate ebtables
ebt_table is a private resource of netns, operating ebtables
in one netns will not affect other netns, we can allow the
creator user of userns and netns to change the ebtables.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Gao feng [Thu, 31 Jan 2013 16:30:57 +0000 (16:30 +0000)]
netns: fdb: allow unprivileged users to add/del fdb entries
Right now,only ixgdb,macvlan,vxlan and bridge implement
fdb_add/fdb_del operations.
these operations only operate the private data of net
device. So allowing the unprivileged users who creates
the userns and netns to add/del fdb entries will do no
harm to other netns.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Bjørn Mork [Thu, 31 Jan 2013 08:36:05 +0000 (08:36 +0000)]
net: usbnet: fix tx_dropped statistics
It is normal for minidrivers accumulating frames to return NULL
from their tx_fixup function. We do not want to count this as a
drop, or log any debug messages. A different exit path is
therefore chosen for such drivers, skipping the debug message
and the tx_dropped increment.
The test for accumulating drivers was however completely bogus,
making the exit path selection depend on whether the user had
enabled tx_err logging or not. This would arbitrarily mess up
accounting for both accumulating and non-accumulating minidrivers,
and would result in unwanted debug messages for the accumulating
drivers.
Fix by testing for FLAG_MULTI_PACKET instead, which probably was
the intention from the beginning. This usage match the documented
behaviour of this flag:
Indicates to usbnet, that USB driver accumulates multiple IP packets.
Affects statistic (counters) and short packet handling.
Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
This patch updates LINUX_MIB_LISTENDROPS and LINUX_MIB_LISTENOVERFLOWS in
tcp_v6_conn_request() and tcp_v6_err(). tcp_v6_conn_request() in particular can
drop SYNs for various reasons which are not currently tracked.
Signed-off-by: Vijay Subramanian <subramanian.vijay@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
This patch updates LINUX_MIB_LISTENDROPS in tcp_v4_conn_request() and
tcp_v4_err(). tcp_v4_conn_request() in particular can drop SYNs for various
reasons which are not currently tracked.
Signed-off-by: Vijay Subramanian <subramanian.vijay@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Phil Sutter [Fri, 1 Feb 2013 07:21:41 +0000 (07:21 +0000)]
packet: fix leakage of tx_ring memory
When releasing a packet socket, the routine packet_set_ring() is reused
to free rings instead of allocating them. But when calling it for the
first time, it fills req->tp_block_nr with the value of rb->pg_vec_len
which in the second invocation makes it bail out since req->tp_block_nr
is greater zero but req->tp_block_size is zero.
This patch solves the problem by passing a zeroed auto-variable to
packet_set_ring() upon each invocation from packet_release().
As far as I can tell, this issue exists even since 69e3c75 (net: TX_RING
and packet mmap), i.e. the original inclusion of TX ring support into
af_packet, but applies only to sockets with both RX and TX ring
allocated, which is probably why this was unnoticed all the time.
Signed-off-by: Phil Sutter <phil.sutter@viprinet.com> Cc: Johann Baudy <johann.baudy@gnu-log.net> Cc: Daniel Borkmann <dborkman@redhat.com> Acked-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
stmmac: don't return zero on failure path in stmmac_pci_probe()
If stmmac_dvr_probe() fails in stmmac_pci_probe(), it breaks off initialization,
deallocates all resources, but returns zero.
The patch adds -ENODEV as return value in this case.
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sun, 3 Feb 2013 09:13:05 +0000 (09:13 +0000)]
tcp: frto should not set snd_cwnd to 0
Commit 9dc274151a548 (tcp: fix ABC in tcp_slow_start())
uncovered a bug in FRTO code :
tcp_process_frto() is setting snd_cwnd to 0 if the number
of in flight packets is 0.
As Neal pointed out, if no packet is in flight we lost our
chance to disambiguate whether a loss timeout was spurious.
We should assume it was a proper loss.
Reported-by: Pasi Kärkkäinen <pasik@iki.fi> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Cc: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sat, 2 Feb 2013 05:23:16 +0000 (05:23 +0000)]
tcp: fix an infinite loop in tcp_slow_start()
Since commit 9dc274151a548 (tcp: fix ABC in tcp_slow_start()),
a nul snd_cwnd triggers an infinite loop in tcp_slow_start()
Avoid this infinite loop and log a one time error for further
analysis. FRTO code is suspected to cause this bug.
Reported-by: Pasi Kärkkäinen <pasik@iki.fi> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sun, 3 Feb 2013 04:13:00 +0000 (23:13 -0500)]
Merge branch 'delete-wanrouter' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux
Paul Gortmaker says:
====================
The removal of wanrouter code was originally listed in the (now
gone) feature removal file since May 2012, and an RFC of the
deletion was posted[1] in late 2012. The overall concept was given
an OK, but defconfig contamination, build failures, etc. meant that
it didn't quite make it into mainline for 3.8.
Since that time, Dan discovered (via code audit) a runtime bug that
proves nobody has been using this for over four years[2]. With that
new information, I think it makes sense for someone to follow through
on Joe's original RFC and get this done for the 3.9 release.
In addition to resolving the build failures of the RFC by keeping
stub headers, this also splits the change into two parts, just like
the token ring removal did. Part #1 decouples the mainline kernel
from the expired subsystem, and part #2 does the large scale
deletion of the subsystem content.
The advantage of the above, is that a "git blame" will never lead
you to a 4000+ line deletion commit. The large scale deletion will
never show up in a "git blame" and hence the same advantages that we
get from the "--irreversible-delete" in the review stage of "git
format-patch" are also embedded into the git history itself. This
may seem like a moot point to some, but for those who spend a
considerable amount of time data mining in the git history, this is
probably worth doing.
I have done build tests of all[mod/yes]config for both the stage 1
(Makefile and Kconfig) and stage 2 (full driver delete) as a sanity
check, and the issues with the previously posted RFC should be gone.
Speaking of "--irreversible-delete" -- these patches were created
with that option, so if you want to use them locally, you are going
to have to pull (location below) the content instead of doing a
"git am" of the mailed out content.
David S. Miller [Sun, 3 Feb 2013 04:09:32 +0000 (23:09 -0500)]
Merge branch 'fixes-for-3.8' of git://gitorious.org/linux-can/linux-can
Marc Kleine-Budde says:
====================
here's a patch for net for the v3.8 release cycle. Alexander Stein noticed that
the c_can hardware has a fixed bit in the IFx_MASK2 register. His patch fixes
writing of this register by always setting this bit.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Wed, 30 Jan 2013 22:14:10 +0000 (22:14 +0000)]
qlcnic: silence false positive overflow warning
We actually store the MAC address as well as the board_name here. The
longest board_name is 75 characters so there is more than enough room
to hold the 17 character MAC and the ": " divider. But making this
buffer larger silences a static checker warning.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-By: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Mahesh Bandewar [Wed, 30 Jan 2013 07:00:12 +0000 (07:00 +0000)]
bnx2x: Force link UP when the interface is in LOOPBACK mode
When the interface does not have carrier but when it's put into
loopback mode (for tests), it does not make sense to not have
the carrier. So force it!
Signed-off-by: Mahesh Bandewar <maheshb@google.com> Acked-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 30 Jan 2013 03:58:04 +0000 (22:58 -0500)]
via-rhine: Fix bugs in NAPI support.
1) rhine_tx() should use dev_kfree_skb() not dev_kfree_skb_irq()
2) rhine_slow_event_task's NAPI triggering logic is racey, it
should just hit the interrupt mask register. This is the
same as commit 7dbb491878a2c51d372a8890fa45a8ff80358af1
("r8169: avoid NAPI scheduling delay.") made to fix the same
problem in the r8169 driver. From Francois Romieu.
Reported-by: Jamie Gloudon <jamie.gloudon@gmail.com> Tested-by: Jamie Gloudon <jamie.gloudon@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sun, 3 Feb 2013 03:55:16 +0000 (22:55 -0500)]
Merge branch 'intel'
Jeff Kirsher says:
====================
This series contains updates to ixgbe and e1000e. The ixgbe patches are
a mix of fixes, cleanup and added functionality. The first fix is for
traffic classes, where if the mapping has changed reset the NIC. The other
ixgbe fix resolves an issue where the device lookup neglected to do a
pci_dev_put() to decrement the device reference count.
The ixgbe cleanup was done by Josh, where the auto-negotiation variables
were renamed/cleaned up and refactored.
The remaining patches are from Bruce to do additional cleanup on e1000e as
well as bump the driver version. Most notably is the cleanup to use the
kernel IEEE MII definitions where possible instead of the local MII
definitions.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>