ixgbe: Add module parameter to allow untested and unsafe SFP+ modules
The X520 family of network devices, with the 82599 chip, support a
small number of Intel-verified SFP+ modules on their NICs. To maintain
stability and quality, the current devices restrict untested 3rd party
SFP+ modules.
This patch introduces a module parameter for ixgbe to allow these untested
modules at the user's peril. It also includes a warning to the syslog
alerting users that the modules aren't supported, and results may
vary.
CC: Jesper Dangaard Brouer <hawk@comx.dk> Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Akinobu Mita [Fri, 27 Jan 2012 04:24:55 +0000 (04:24 +0000)]
mISDN: use memchr_inv
Use memchr_inv to check if the data contains all same bytes. It is
faster than looping for each byte.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Karsten Keil <isdn@linux-pingi.de> Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
Shawn Lu [Tue, 31 Jan 2012 22:35:48 +0000 (22:35 +0000)]
tcp: md5: RST: getting md5 key from listener
TCP RST mechanism is broken in TCP md5(RFC2385). When
connection is gone, md5 key is lost, sending RST
without md5 hash is deem to ignored by peer. This can
be a problem since RST help protocal like bgp to fast
recove from peer crash.
In most case, users of tcp md5, such as bgp and ldp,
have listener on both sides to accept connection from peer.
md5 keys for peers are saved in listening socket.
There are two cases in finding md5 key when connection is
lost:
1.Passive receive RST: The message is send to well known port,
tcp will associate it with listner. md5 key is gotten from
listener.
2.Active receive RST (no sock): The message is send to ative
side, there is no socket associated with the message. In this
case, finding listener from source port, then find md5 key from
listener.
we are not loosing sercuriy here:
packet is checked with md5 hash. No RST is generated
if md5 hash doesn't match or no md5 key can be found.
Signed-off-by: Shawn Lu <shawn.lu@ericsson.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Tue, 31 Jan 2012 21:45:26 +0000 (21:45 +0000)]
xfrm6: remove unneeded NULL check in __xfrm6_output()
We don't check for NULL consistently in __xfrm6_output(). If "x" were
NULL here it would lead to an OOPs later. I asked Steffen Klassert
about this and he suggested that we remove the NULL check.
On 10/29/11, Steffen Klassert <steffen.klassert@secunet.com> wrote:
>> net/ipv6/xfrm6_output.c
>> 148
>> 149 if ((x && x->props.mode == XFRM_MODE_TUNNEL) &&
>> ^
>
> x can't be null here. It would be a bug if __xfrm6_output() is called
> without a xfrm_state attached to the skb. I think we can just remove
> this null check.
Cc: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 31 Jan 2012 18:45:40 +0000 (18:45 +0000)]
tcp: md5: protects md5sig_info with RCU
This patch makes sure we use appropriate memory barriers before
publishing tp->md5sig_info, allowing tcp_md5_do_lookup() being used from
tcp_v4_send_reset() without holding socket lock (upcoming patch from
Shawn Lu)
Note we also need to respect rcu grace period before its freeing, since
we can free socket without this grace period thanks to
SLAB_DESTROY_BY_RCU
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Shawn Lu <shawn.lu@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Francois Romieu [Tue, 31 Jan 2012 09:47:34 +0000 (10:47 +0100)]
r8169: bh locking redux and task scheduling.
- atomic bit operations are globally visible
- pending status is always cleared before execution
- scheduled works are either idempotent or only required to happen once
after a series of originating events, say link events for instance
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com> Suggested-by: Michał Mirosław <mirqus@gmail.com> Cc: Hayes Wang <hayeswang@realtek.com>
This change also removes commented-out copy of __nlmsg_put
which was last touched in 2005 with "Enable once all users
have been converted" comment on top.
Changes in v2: rediffed against net-next.
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 30 Jan 2012 04:29:24 +0000 (04:29 +0000)]
ipv6: fix RFC5722 comment
RFC5722 Section 4 was amended by Errata 3089
Our implementation did the right thing anyway...
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Tony Zelenoff [Thu, 26 Jan 2012 22:28:58 +0000 (22:28 +0000)]
net: Allow ipv6 proxies and arp proxies be shown with iproute2
Add ability to return neighbour proxies list to caller if
it sent full ndmsg structure and has NTF_PROXY flag set.
Before this patch (and before iproute2 patches):
$ ip neigh add proxy 2001::1 dev eth0
$ ip -6 neigh show
$
After it and with applied iproute2 patches:
$ ip neigh add proxy 2001::1 dev eth0
$ ip -6 neigh show
2001::1 dev eth0 proxy
$
Compatibility with old versions of iproute2 is not broken,
kernel checks for incoming structure size and properly
works if old structure is came.
[v2]
* changed comments style.
* removed useless line with continue and curly bracket.
* changed incoming message size check from equal to more or
equal.
CC: davem@davemloft.net CC: kuznet@ms2.inr.ac.ru CC: netdev@vger.kernel.org CC: xemul@parallels.com Signed-off-by: Tony Zelenoff <antonz@parallels.com> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Paul Gortmaker [Fri, 27 Jan 2012 13:55:46 +0000 (13:55 +0000)]
drivers/net: strip unused module code from sun3_82586.c
This code is clearly unused, since it has a #error right
in it. Given the vintage of sun3 hardware, it is probably
safe to assume that there is little interest in adding new
functionality to the driver now, so just delete the unused
block of code.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Sam Creasey <sammy@sammy.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Paul Gortmaker [Fri, 27 Jan 2012 13:36:01 +0000 (13:36 +0000)]
drivers/net: fix up stale paths from driver reorg
The reorganization of the driver layout in drivers/net
left behind some stale paths in comments and in Kconfig
help text. Bring them up to date. No actual change to
any code takes place here.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Hutchings [Mon, 30 Jan 2012 16:55:05 +0000 (16:55 +0000)]
sfc: Use a more sensible cast in efx_rx_buf_offset()
This function returns the page offset of the buffer, which can be
calculated based on either its DMA address or its virtual address. It
used to use the virtual address and we would cast that to unsigned
long, as anything smaller would result in a compiler warning. Now
that it's using the DMA address we should use unsigned int, matching
the return type. It is also unnecessary to use __force.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
David S. Miller [Fri, 27 Jan 2012 23:14:01 +0000 (15:14 -0800)]
ipv6: fib: Convert fib6_age() to dst_neigh_lookup().
In this specific situation we know we are dealing with a gatewayed route
and therefore rt6i_gateway is not going to be in6addr_any even in future
interpretations.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 27 Jan 2012 23:07:56 +0000 (15:07 -0800)]
ipv6: ndisc: Convert to dst_neigh_lookup()
Now all code paths grab a local reference to the neigh, so if neigh
is not NULL we unconditionally release it at the end. The old logic
would only release if we didn't have a non-NULL 'rt'.
Signed-off-by: David S. Miller <davem@davemloft.net>
Francois Romieu [Thu, 26 Jan 2012 13:18:23 +0000 (14:18 +0100)]
r8169: remove work from irq handler.
The irq handler was a mess.
See 7ab87ff4c770eed71e3777936299292739fcd0fe ("via-rhine: move work from
irq handler to softirq and beyond") for similar changes. One can notice:
- all non-napi tasks are explicitely scheduled trough a single work queue.
- hiding software tx queue start behind the rtl_hw_start method is mildly
natural. Move it in the caller where needed.
- as can be seen from the heavy use of bh disabling locks, the driver is
not safe for irq context messages with netconsole. It is still quite
usable for general messaging though. Tested ok with concurrent registers
dump (ethtool -d) + background traffic + "echo t > /proc/sysrq-trigger".
As a side note, the comments in f11a377b3f4e897d11f0e8d1fc688667e2f19708
("r8169: avoid losing MSI interrupts") does not seem completely clear: if
I hack the driver further to stop acking the irq link event bit, MSI
interrupts keep being delivered (RTL8168b/8111b, XID 18000000).
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com> Cc: Hayes Wang <hayeswang@realtek.com>
Francois Romieu [Thu, 26 Jan 2012 10:23:32 +0000 (11:23 +0100)]
r8169: stop delaying workqueue.
Though motivated by the move of the driver to a single work queue of
sequential events and removal of hard irq processing, it looks safe as
a standalone change.
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com> Cc: Hayes Wang <hayeswang@realtek.com>
Francois Romieu [Thu, 26 Jan 2012 08:59:50 +0000 (09:59 +0100)]
r8169: remove rtl8169_reinit_task.
I see no good reason to keep both rtl8169_reinit_task and rtl8169_reset_task:
- rtl8169_reinit_task adds a software failure point which does relate to
any hardware state
- they handle hardware the same. Remember that rtl8169_reinit_task was
introduced in the 8169 only era to handle PCI errors way before the 8168
asked for pll and firmware ops and compare :
Bruce Allan [Sun, 1 Jan 2012 16:00:03 +0000 (16:00 +0000)]
e1000e: update copyright year
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Wed, 21 Dec 2011 09:47:10 +0000 (09:47 +0000)]
e1000e: split lib.c into three more-appropriate files
The generic lib.c file contains code relative to the various MACs, NVM and
Manageability supported by the driver. This patch splits the file into
three which are specific to those areas similar to how the PHY-specific
code is in phy.c and code specific to the 80003es2lan, 8257x, and ichX
MAC families are in their own files. The generic code that is applicable
to all MAC/PHY parts supported by the driver remains in netdev.c, param.c
and ethtool.c files. No change in functionality, just moving code
around for ease of maintenance, with some whitespace and other checkpatch
cleanups.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Sat, 17 Dec 2011 08:32:57 +0000 (08:32 +0000)]
e1000e: call er16flash() instead of __er16flash()
__er16flash() is not meant to be called directly.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Fri, 16 Dec 2011 00:47:04 +0000 (00:47 +0000)]
e1000e: increase version number
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Fri, 16 Dec 2011 00:46:59 +0000 (00:46 +0000)]
e1000e: convert final strncpy() to strlcpy()
Convert the last instances of strncpy() to the preferred strlcpy().
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Fri, 16 Dec 2011 00:46:54 +0000 (00:46 +0000)]
e1000e: concatenate long debug strings which span multiple lines
To ease searching for debug message strings, concatenate strings that span
multiple lines even if the resulting line exceeds 80 columns; these will
not cause checkpatch warnings.
Also, add '\n' and remove unnecessary '\r' from a few debug strings.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Fri, 16 Dec 2011 00:46:49 +0000 (00:46 +0000)]
e1000e: conditionally restart autoneg on 82577/8/9 when setting LPLU state
When setting the Low Power Link Up (LPLU, a.k.a. reverse auto-negotiation)
on 82577/8278/82579, do not restart auto-negotiation if reset of the Phy is
blocked by the Manageability Engine.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Fri, 16 Dec 2011 00:46:43 +0000 (00:46 +0000)]
e1000e: increase Rx PBA to prevent dropping received packets on 82566/82567
During bi-directional stress on some 82566/82567 devices, some received
packets were dropped. Increasing the Receive Packet Buffer Allocation
resolves this.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Fri, 16 Dec 2011 00:46:38 +0000 (00:46 +0000)]
e1000e: ICHx/PCHx LOMs should use LPLU setting in NVM when going to Sx
When going to Sx with an ICHx/PCH device, the default Low Power Link Up
(LPLU, a.k.a. reverse auto-negotiation) behavior should be whatever is set
in the NVM. However, the function e1000_suspend_workarounds_ich8lan()
called when going to Sx always enabled LPLU in all power states.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Fri, 16 Dec 2011 00:46:33 +0000 (00:46 +0000)]
e1000e: update workaround for 82579 intermittently disabled during S0->Sx
The workaround which toggles the LANPHYPC (LAN PHY Power Control) value bit
to force the MAC-Phy interconnect into PCIe mode from SMBus mode during
driver load and resume should always be done except if PHY resets are
blocked by the Manageability Engine (ME). Previously, the toggle was done
only if PHY resets are blocked and the ME was disabled.
The rest of the patch is just indentation changes as a consequence of the
updated workaround.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Fri, 16 Dec 2011 00:46:27 +0000 (00:46 +0000)]
e1000e: disable Early Receive DMA on ICH LOMs
Internal stress testing with jumbo frames shows the reliability of ICH9 and
ICH10D devices is improved in certain corner cases by disabling the Early
Receive feature. To reduce the performance impact caused by disabling this
feature, the packet buffer sizes and relevant flow control settings are
modified accordingly.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Ben Hutchings [Wed, 12 Oct 2011 16:20:25 +0000 (17:20 +0100)]
sfc: Make all MAC statistics consistently 64 bits wide
Currently we use type u64 for byte counts, which can very quickly
exceed 2^32, and unsigned long for packet counts, which do not. But
it can still take only 20-something minutes to send or receive 2^32
packets, and not all tools properly handle overflow even if they
sample more often than this.
The MAC statistics are all updated synchronously, so it costs very
little to make them all 64-bit regardless of native word size.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Ben Hutchings [Mon, 9 Jan 2012 19:47:08 +0000 (19:47 +0000)]
sfc: Remove remnants of on-load self-test
The out-of-tree version of the sfc driver used to run a self-test on
each device before registering it. Although this was never included
in-tree, some functions have checks for this special case which is not
really possible.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Ben Hutchings [Fri, 6 Jan 2012 20:25:39 +0000 (20:25 +0000)]
sfc: Add hwmon driver for boards using SFC9000-family controllers
The SFC9000-family controllers have firmware to manage all board
peripherals including temperature, heat sink continuity and voltage
sensors. The firmware reports sensor alarms, which we log, and
will shut down the board if necessary.
Some users may want to monitor their boards more closely, so add an
hwmon driver that exposes all sensors reported by the firmware. Move
efx_mcdi_sensor_event() into the new file so it can share the array of
sensor labels with the hwmon driver.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Ben Hutchings [Thu, 5 Jan 2012 20:14:10 +0000 (20:14 +0000)]
sfc: Clean up test interrupt handling
Interrupts are normally generated by the event queues, moderated by
timers. However, they may also be triggered by detection of a 'fatal'
error condition (e.g. memory parity error) or by the host writing to
certain CSR fields as part of a self-test.
The IRQ level/index used for these on Falcon rev B0 and Siena is set
by the KER_INT_LEVE_SEL field and cached by the driver in
efx_nic::fatal_irq_level. Since this value is also relevant to
self-tests rename the field to just 'irq_level'.
Avoid unnecessary cache traffic by using a per-channel 'last_irq_cpu'
field and only writing to the per-controller field when the interrupt
matches efx_nic::irq_level. Remove the volatile qualifier and use
ACCESS_ONCE in the places we read these fields.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Unlike the INT_ISR0 register on later controller revisions, the
NET_IVEC_INT_Q bits written to memory are only ever set for
interrupting event queues, not for any other interrupt sources.
By definition there can only be one legacy interrupt handler per
function, so there is no need to worry about detecting a fatal
interrupt more than once.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Ben Hutchings [Fri, 4 Nov 2011 23:06:04 +0000 (23:06 +0000)]
sfc: Remove dependence on NAPI polling in efx_test_eventq_irq()
We cannot safely assume that the NAPI handler will complete within the
20 ms that we allow for the event self-test. The handler may be
deferred for longer than this, particularly on realtime kernels.
Instead, check whether either an event has been handled or (as in the
old failure path) whether an interrupt has been received and an event
has been delivered but not yet handled. Use napi_disable() to
synchronize with the NAPI handler before checking, since it will
clear events before updating eventq_read_ptr.
Remove the test result chan.N.eventq.poll, since it is not an error
if the NAPI handler does not run during the test.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Ben Hutchings [Thu, 8 Dec 2011 19:51:47 +0000 (19:51 +0000)]
sfc: Correct interrupt timer quantum for Siena (normal and turbo mode)
We currently assume that the timer quantum for Siena is 5 us, the same
as for Falcon. This is not correct; timer ticks are generated on a
rota which takes a minimum of 768 cycles (each event delivery or other
timer change will delay it by 3 cycles). The timer quantum should be
6.144 or 3.072 us depending on whether turbo mode is active.
Replace EFX_IRQ_MOD_RESOLUTION with a timer_quantum_ns field in struct
efx_nic, initialised by the efx_nic_type::probe function.
While we're at it, replace EFX_IRQ_MOD_MAX with a timer_period_max
field in struct efx_nic_type.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Ben Hutchings [Fri, 4 Nov 2011 22:29:14 +0000 (22:29 +0000)]
sfc: Consistently test DEBUG macro, not EFX_ENABLE_DEBUG
The netif_dbg() macro is defined in <linux/netdevice.h>. If the DEBUG
macro is defined, it logs a message at 'debug' level, otherwise it
does nothing.
In net_driver.h we define DEBUG if EFX_ENABLE_DEBUG is defined, but
this is too late for those source files that already got a
definition of netif_dbg() by including <linux/netdevice.h>
Get rid of EFX_ENABLE_DEBUG, and only define and test DEBUG.
In mtd.c, we do not use DEBUG as a condition flag but are forced to
use the DEBUG macro-function from <linux/mtd/mtd.h>. Undefine DEBUG
before including it.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Both implementations of efx_nic_type::reconfigure_mac operation
push the multicast hash filter to the hardware. It is therefore
redundant to call efx_nic_type::push_multicast_hash as well.
efx_mcdi_mac_reconfigure() also uses this operation, but the
implementation for Siena just uses MCDI anyway. Merge that into
efx_mcdi_mac_reconfigure().
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Ben Hutchings [Thu, 8 Sep 2011 01:09:42 +0000 (02:09 +0100)]
sfc: Merge efx_mcdi_mac_check_fault() and efx_mcdi_get_mac_faults()
The latter is only called by the former, which is a very short
wrapper. Further, gcc 4.5 may currently wrongly warn that the
'faults' variable may be used uninitialised.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Ben Hutchings [Fri, 2 Sep 2011 23:15:00 +0000 (00:15 +0100)]
sfc: Merge efx_mac_operations into efx_nic_type
No NICs need to switch efx_mac_operations at run-time, and the MAC
operations are fairly closely bound to NIC types.
Move efx_mac_operations::reconfigure to efx_nic_type::reconfigure_mac
and efx_mac_operations::check_fault fo efx_nic_type::check_mac_fault.
Change callers to call through efx->type or directly if the NIC type
is known.
Remove efx_mac_operations::update_stats. The implementations for
Falcon used to fetch MAC statistics synchronously and this was used by
efx_register_netdev() to clear statistics after running self-tests.
However, it now only converts statistics that have already been
fetched (and that only for Falcon), and the call from
efx_register_netdev() has no effect.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Ben Hutchings [Fri, 2 Sep 2011 22:23:00 +0000 (23:23 +0100)]
sfc: Hold efx_nic::stats_lock while reading efx_nic::mac_stats
efx_nic::stats_lock is used to serialise stats updates, but each
reader was dropping it before it finished reading efx_nic::mac_stats.
If there were concurrent stats reads using procfs, or one using procfs
and one using ethtool, an update could race with a read. On a 32-bit
system, the reader could see word-tearing of 64-bit stats (32 bits of
the old value and 32 bits of the new).
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Ben Hutchings [Tue, 20 Dec 2011 23:39:31 +0000 (23:39 +0000)]
sfc: Make handling of MC reboot more reliable
When the MC reboots, either as part of a firmware upgrade or due to a
bug, it attempts to complete (with an error) any requests that were
outstanding before the reboot. Since there is an inherent race
condition in checking this, it will also write to a status word in
shared memory.
If we look at each of these separately, we may detect each reboot
twice, resulting in a spurious command failure after a firmware
upgrade or frustrating recovery from a firmware bug. Instead, if a
request completion indicates a reboot, we must poll and clear the
status word.
This bug was previously masked by use of an incorrect address for the
status word. Fix that, using the definition now included in
mcdi_pcol.h.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Ben Hutchings [Tue, 20 Dec 2011 01:22:51 +0000 (01:22 +0000)]
sfc: Remove fallback for invalid permanent MAC address
By the time we look at the MAC address in efx_probe_port(), either the
driver or the firmware has already validated the board configuration.
The possibility of having an invalid MAC address just isn't worth
considering. It certainly isn't worth having a compile-time option
for this.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
David S. Miller [Thu, 26 Jan 2012 20:22:32 +0000 (15:22 -0500)]
ipv4/ipv6: Prepare for new route gateway semantics.
In the future the ipv4/ipv6 route gateway will take on two types
of values:
1) INADDR_ANY/IN6ADDR_ANY, for local network routes, and in this case
the neighbour must be obtained using the destination address in
ipv4/ipv6 header as the lookup key.
2) Everything else, the actual nexthop route address.
So if the gateway is not inaddr-any we use it, otherwise we must use
the packet's destination address.
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Wed, 25 Jan 2012 04:44:20 +0000 (04:44 +0000)]
tcp: add LINUX_MIB_TCPRETRANSFAIL counter
It might be useful to get a counter of failed tcp_retransmit_skb()
calls.
Reported-by: Satoru Moriya <satoru.moriya@hds.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Wed, 25 Jan 2012 03:56:30 +0000 (03:56 +0000)]
be2net: allocate more headroom in incoming skbs
Allocation of 64 bytes in skb headroom is not enough if we have to pull
ethernet + ipv6 + tcp headers, and/or extra tunneling header.
Its currently not noticed because netdev_alloc_skb_ip_align(64) give us
more room, thanks to power-of-two kmalloc() roundups.
Make sure we ask for 128 bytes so that side effects of upcoming patches
from Ian Campbell dont decrease benet rx performance, because of extra
skb head reallocations.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Ian Campbell <Ian.Campbell@citrix.com> Cc: Vasundhara Volam <vasundhara.volam@emulex.com> Cc: Sathya Perla <sathya.perla@emulex.com> Cc: Ajit Khaparde <ajit.khaparde@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ariel Elior [Thu, 26 Jan 2012 06:01:54 +0000 (06:01 +0000)]
bnx2x: Update version to 1.72.0 and copyrights
Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ariel Elior [Thu, 26 Jan 2012 06:01:53 +0000 (06:01 +0000)]
bnx2x: Recoverable and unrecoverable error statistics
Add statistics for tracking parity errors from which we successfully
recovered and those which were deemed unrecoverable.
Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ariel Elior [Thu, 26 Jan 2012 06:01:52 +0000 (06:01 +0000)]
bnx2x: Recovery flow bug fixes
1. Sample mcp pulse and mcp sequence in nic load instead of in init_one
as they may change by the time we want to use them.
2. Allow cnic to access device during nic load (by adding a new "LOADING" state
to recovery flow). This prevents the unnecessary cnic timeout which resulted
by cnic attempting to access because nic is loading, but being blocked because
of the Recovery state.
3. Issue 'fake' driver load command to mcp when last driver unloads to prevent
mcp from taking ownership. When recovery is complete unload fake driver to
allow mcp to initialize the hardware before first driver loads.
Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ariel Elior [Thu, 26 Jan 2012 06:01:51 +0000 (06:01 +0000)]
bnx2x: Track active PFs with bitmap
The recovery register (to which a hardware lock has been added in previous
patch) is used amongst other things to track the active PFs. The old
implementation which used a per path counter is not viable in a virtualized
environment where a pf may increment the counter and then have the kernel
crash around it preventing the counter from ever reaching zero.
In the new implementation the scenario described will result in the PF timing
out against the mcp, which will clear the PF's bit in the bitmask allowing
recovery process to proceed.
Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ariel Elior [Thu, 26 Jan 2012 06:01:50 +0000 (06:01 +0000)]
bnx2x: Lock PF-common resources
Use hardware locks to protect resources common to several Physical Functions. In
a virtualized environment the RTNL lock only protects a PF's driver against
the PFs sharing it's VMs with regard to device resources. Other PFs may reside
in other VMs under other OSs, and are not subject to the lock. Such resources
which were previously protected implicitly by the RTNL lock must now be
protected explicitly with dedicated HW locks.
Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ariel Elior [Thu, 26 Jan 2012 06:01:49 +0000 (06:01 +0000)]
bnx2x: Loaded Firmware Version Validation
In a virtualized environment it is possible for a loading driver to discover
that Firmware is already loaded to the device, and that this FW does not match
its own. This can happen for example if different Physical Functions are
Assigned to different VMs in which different driver versions are loaded. The
code in this patch ensures that only drivers with matching FW are loaded over
the device, and that in the case described above where the Firmware version
doesn't match the driver load is aborted.
Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ariel Elior [Thu, 26 Jan 2012 06:01:48 +0000 (06:01 +0000)]
bnx2x: Function Level Reset Final Cleanup
1. Fix bug where return value is ignored
2. Improve printouts
3. Fix typos
Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ariel Elior [Thu, 26 Jan 2012 06:01:47 +0000 (06:01 +0000)]
bnx2x: Obtain Bus Device Function from register
BDF was obtained from kernel but since in virtualized environment
(e.g. physical device assigment in KVM) the function number may
not be the real one, the info must be obtained from the device.
Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ariel Elior [Thu, 26 Jan 2012 06:01:46 +0000 (06:01 +0000)]
bnx2x: Removing indirect register access
In virtualized environments indirect access to the device may not be supported
(depending on the Hypervisor type). Indirect device access was used since in
some harware contexts (i.e. certain chipset and BIOS) every access the driver
makes across the pci is followed by a BIOS initiated Zero Length Read to the
same address. When accessing widebus registers this zero length read corrupts
the serialization of the read/write sequence resulting with errors. To avoid
this problem widebus registers are always accessed via the DMAE or the indirect
interface. However, the 57712x and 578xx devices intercept the zero length read
and so using the indirect interface with these devices is not necessary. Since
PDA is only supported for 57712x and 578xx the indirect access to device was
restricted to 57710 and 57711x.
Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ariel Elior [Thu, 26 Jan 2012 06:01:45 +0000 (06:01 +0000)]
bnx2x: Support Queue Per Cos in 5771xx devices
Enable the use of up to three hardware queues for transmission. The queues
are always dequed round robin (i.e. strict priority, PFC and ETS are not
supported). This does allow the allocation of a seperate HW queue for low
volume, high priority traffic which will be serviced more promptly.
Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Bruce Allan [Fri, 16 Dec 2011 00:46:22 +0000 (00:46 +0000)]
e1000e: 82574/82583 Tx hang workaround
On 82574/82583, there is a hardware bug which might cause a Tx hang when
the internal buffer is full. Setting this bit enables a hardware fix to
work around the issue.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Fri, 16 Dec 2011 00:46:17 +0000 (00:46 +0000)]
e1000e: use hardware default values for Transmit Control register
This code snippet is simply writing default values to the register which is
unnecessary since the values are programmed into the register by default.
There is a special case for 80003es2lan needing the Retransmit on Late
Collision bit set but that is also done in e1000_init_hw_80003es2lan().
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Fri, 16 Dec 2011 00:46:12 +0000 (00:46 +0000)]
e1000e: use default settings for Tx Inter Packet Gap timer
Use the default hardware values for TIPG except for 80003es2lan(*). The
code that is removed in this patch is either unnecessarily writing the TIPG
register with the hardware default values for some devices (82571/2/3/4) or
writing the wrong value for others (ICH/PCH LOMs). The only change in
functionality is setting the correct default TIPG for the latter devices.
(*) The correct value for 80003es2lan is already set properly in
e1000_init_hw_80003es2lan() and e1000_cfg_kmrn_{10_100|1000}_80003es2lan(),
and the unused flag FLAG_TIPG_MEDIUM_FOR_80003ESLAN is removed.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Fri, 16 Dec 2011 00:46:06 +0000 (00:46 +0000)]
e1000e: 82579: workaround for link drop issue
When connected to certain switches, the 82579 PHY might drop link
unexpectedly. Work around the issue by setting the Mean Square Error
higher than the hardware default.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Fri, 16 Dec 2011 00:46:01 +0000 (00:46 +0000)]
e1000e: always set transmit descriptor control registers the same
The hardware erratum workaround where the TXDCTL register must be the same
setting for both queues should always be done.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Fri, 16 Dec 2011 00:45:56 +0000 (00:45 +0000)]
e1000e: default IntMode based on kernel config & available hardware support
Based on a patch from Prabhakar Kushwaha <prabhakar@freescale.com>, set
appropriate default interrupt mode dependent on whether CONFIG_PCI_MSI
is enabled in the kernel configuration and if the hardware supports
MSI-X. Set the module parameter log message accordingly.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Cc: Jin Qing <b24347@freescale.com> Cc: Prabhakar Kushwaha <prabhakar@freescale.com> Cc: Jin Qing <b24347@freescale.com> Cc: Kumar Gala <galak@kernel.crashing.org> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Fri, 16 Dec 2011 00:45:51 +0000 (00:45 +0000)]
e1000e: re-factor ethtool get/set ring parameter
Make it more like how igb does it, with some additional error checking.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Fri, 16 Dec 2011 00:45:45 +0000 (00:45 +0000)]
e1000e: pass pointer to ring struct instead of adapter struct
For ring-specific functions, pass a pointer to the ring struct instead of a
pointer to the adapter struct.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Fri, 16 Dec 2011 00:45:40 +0000 (00:45 +0000)]
e1000e: convert head, tail and itr_register offsets to __iomem pointers
The Tx/Rx head and tail registers and itr_register are always at known
addresses based on the __iomem address at which the PCI region (from BAR 0)
is mapped and known offsets within the region for each of these registers.
Store and use the full address rather than just the region offset to reduce
unnecessary address calculations. Also, change current u8 __iomem pointers
to void __iomem pointers.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Fri, 16 Dec 2011 00:45:35 +0000 (00:45 +0000)]
e1000e: re-enable alternate MAC address for all devices which support it
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>