Johannes Berg [Thu, 23 Apr 2009 14:13:26 +0000 (16:13 +0200)]
mac80211: unify config_interface and bss_info_changed
The config_interface method is a little strange, it contains the
BSSID and beacon updates, while bss_info_changed contains most
other BSS information for each interface. This patch removes
config_interface and rolls all the information it previously
passed to drivers into bss_info_changed.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Johannes Berg [Thu, 23 Apr 2009 14:10:04 +0000 (16:10 +0200)]
mac80211: clean up beacon interval settings
We currently have two beacon interval configuration knobs:
hw.conf.beacon_int and vif.bss_info.beacon_int. This is
rather confusing, even though the former is used when we
beacon ourselves and the latter when we are associated to
an AP.
This just deprecates the hw.conf.beacon_int setting in favour
of always using vif.bss_info.beacon_int. Since it touches all
the beaconing IBSS code anyway, we can also add support for
the cfg80211 IBSS beacon interval configuration easily.
NOTE: The hw.conf.beacon_int setting is retained for now due
to drivers still using it -- I couldn't untangle all
drivers, some are updated in this patch.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Johannes Berg [Thu, 23 Apr 2009 14:01:47 +0000 (16:01 +0200)]
mac80211: fix scan races and rework scanning
There are some places marked
/* XXX maybe racy? */
and they really are racy because there's no locking.
This patch reworks much of the scan code, and introduces proper
locking for the scan request as well as the internal scanning
(which is necessary for IBSS/managed modes). Helper functions
are added to call the scanning code whenever necessary. The
scan deferring is changed to simply queue the scanning work
instead of trying to start the scan in place, the scanning work
will then take care of the rest.
Also, currently when internal scans are requested for an interface
that is trying to associate, we reject such scans. This was not
intended, the mlme code has provisions to scan twice when it can't
find the BSS to associate with right away; this has never worked
properly. Fix this by not rejecting internal scan requests for an
interface that is associating.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Johannes Berg [Thu, 23 Apr 2009 08:38:26 +0000 (10:38 +0200)]
mac80211: internally clear failed scans properly
When the IBSS code wants to scan, but that fails, we can
get stuck in a situation where you can never scan again.
Fix this by properly notifying ourselves when the scan
request has failed.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Johannes Berg [Thu, 23 Apr 2009 08:32:36 +0000 (10:32 +0200)]
mac80211: rename max_sleep_interval to max_sleep_period
Kalle points out that max_sleep_interval is somewhat confusing
because the value is measured in beacon intervals, and not in
TU. Rename it to max_sleep_period to be consistent with things
like DTIM period that are also measured in beacon intervals.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Johannes Berg [Wed, 22 Apr 2009 21:02:51 +0000 (23:02 +0200)]
mac80211: fix PS vs. scan race
When somebody changes the PS parameters while scanning
is in progress, we enable PS -- during the scan. This
is clearly not desirable, and we can just abort enabling
PS when scanning since when the scan finishes it will
be taken care of.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Reviewed-by: Kalle Valo <kalle.valo@iki.fi> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Johannes Berg [Thu, 23 Apr 2009 09:48:56 +0000 (11:48 +0200)]
mac80211: fix various problems in ibss code
There are a few problems in the IBSS code:
a) it tries to activate interfaces that are down after scanning
b) it crashes after scanning on an IBSS iface that isn't active
c) since the ssid_len is used as a flag, need to make it visible
only after all other settings are set, this helps protect
against b)
For b), we get a system crash:
wlan0: Creating new IBSS network, BSSID ce:f9:88:76:1e:4d
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<...>] ieee80211_sta_find_ibss+0x294/0x37d [mac80211]
Call Trace:
[<...>] ieee80211_ibss_notify_scan_completed+0x0/0x88 [mac80211]
Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
net: remove driver_data direct access of struct device from more drivers
In the near future, the driver core is going to not allow direct access
to the driver_data pointer in struct device. Instead, the functions
dev_get_drvdata() and dev_set_drvdata() should be used. These functions
have been around since the beginning, so are backwards compatible with
all older kernel versions.
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>
With this patch in place, I'm successfully able to use the netconsole
mechanism with the Calao USB-A9263 board, which uses the AT91SAM9263
CPU, which in terms of Ethernet controller is supported by the macb
driver.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
[haavard.skinnemoen@atmel.com: disable_irq() -> local_irq_save()]
[haavard.skinnemoen@atmel.com: convert to net_device_ops] Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ilpo Järvinen [Mon, 4 May 2009 18:07:36 +0000 (11:07 -0700)]
tcp: extend ECN sysctl to allow server-side only ECN
This should be very safe compared with full enabled, so I see
no reason why it shouldn't be done right away. As ECN can only
be negotiated if the SYN sending party is also supporting it,
somebody in the loop probably knows what he/she is doing. If
SYN does not ask for ECN, the server side SYN-ACK is identical
to what it is without ECN. Thus it's quite safe.
The chosen value is safe w.r.t to existing configs which
choose to currently set manually either 0 or 1 but
silently upgrades those who have not explicitly requested
ECN off.
Whether to just enable both sides comes up time to time but
unless that gets done now we can at least make the servers
aware of ECN already. As there are some known problems to occur
if ECN is enabled, it's currently questionable whether there's
any real gain from enabling clients as servers mostly won't
support it anyway (so we'd hit just the negative sides). After
enabling the servers and getting that deployed, the client end
enable really has some potential gain too.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 0ba25ff4c669e5395110ba6ab4958a97a9f96922 ("br2684: convert to
net_device_ops") inadvertently deleted the initialization of the net_dev
pointer in the br2684_dev structure, leading to crashes. This patch
adds it back.
Reported-by: Mikko Vinni <mmvinni@yahoo.com> Tested-by: Mikko Vinni <mmvinni@yahoo.com> Signed-off-by: Rabin Vincent <rabin@rab.in> Signed-off-by: David S. Miller <davem@davemloft.net>
Robert Love [Sat, 2 May 2009 20:48:32 +0000 (13:48 -0700)]
net: Only store high 16 bits of kernel generated filter priorities
The kernel should only be using the high 16 bits of a kernel
generated priority. Filter priorities in all other cases only
use the upper 16 bits of the u32 'prio' field of 'struct tcf_proto',
but when the kernel generates the priority of a filter is saves all
32 bits which can result in incorrect lookup failures when a filter
needs to be deleted or modified.
Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
We were avoiding calling sg_init* on scatterlists passed
into virtnet_send_command to prevent extraneous end markers.
This caused build warnings for uninitialized variables.
Cleanup the code to create proper scatterlists.
Signed-off-by: Alex Williamson <alex.williamson@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ayaz Abdulla [Fri, 1 May 2009 01:41:50 +0000 (01:41 +0000)]
forcedeth: add clock gating feature <resend>
This patch adds new logic to support a clock gating feature found on the
latest set of chipsets. The clock gating is performed on the tx/rx
engines when the link is disconnected. Clock gating helps in reducing
power consumption.
* modified based on comments from netdev
Signed-off-by: Ayaz Abdulla <aabdulla@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
LAN9512 and LAN9514 are USB hubs with an integrated 10/100 ethernet
controller. Logically this looks like an ethernet controller (similar
to LAN9500) permanently attached to one of the hub's downstream ports.
This patch adds the usb device id of the new ethernet controller to the
smsc95xx driver. This id is the same in both new devices.
Signed-off-by: Steve Glendinning <steve.glendinning@smsc.com> Signed-off-by: David S. Miller <davem@davemloft.net>
SMSC LAN9500 has dual purpose GPIO/LED pins, and by default at power-on
these are configured as GPIOs. This means that if LEDs are fitted they
won't ever light.
This patch sets them to be LED outputs for speed, duplex and
link/activity.
Signed-off-by: Steve Glendinning <steve.glendinning@smsc.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Bruno Prémont [Wed, 29 Apr 2009 20:45:17 +0000 (20:45 +0000)]
netconsole: take care of NETDEV_UNREGISTER event
When netconsole is loaded and a network interface fades away (e.g. on
rmmod $interface_driver_module) the rmmod remains stuck and some locks
are taken that prevent any additional module loading/unloading as well
as interface up/down changes.
In addition kernel logs (and console) get flooded at 10s interval with
[ 122.464065] unregister_netdevice: waiting for eth0 to become free. Usage count = 1
[ 132.704059] unregister_netdevice: waiting for eth0 to become free. Usage count = 1
This patch lets netconsole take NETDEV_UNREGISTER event into account
and release the affected interface if it was in use.
Signed-off-by: Bruno Prémont <bonbons@linux-vserver.org> Acked-by: Matt Mackall <mpm@selenic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: remove driver_data direct access of struct device
In the near future, the driver core is going to not allow direct access
to the driver_data pointer in struct device. Instead, the functions
dev_get_drvdata() and dev_set_drvdata() should be used. These functions
have been around since the beginning, so are backwards compatible with
all older kernel versions.
Cc: netdev@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>
cxgb3: fixing gcc 4.4 compiler warning: suggest parentheses around operand of ‘!’
Trivial: fixing gcc 4.4 compiler warning:
drivers/net/cxgb3/t3_hw.c: In function ‘t3_prep_adapter’:
drivers/net/cxgb3/t3_hw.c:3782: warning: suggest parentheses around operand of ‘!’ or change ‘|’ to ‘||’ or ‘!’ to ‘~’
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@mail.by> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 1 May 2009 16:05:06 +0000 (09:05 -0700)]
net: Fix skb_tx_hash() for forwarding workloads.
When skb_rx_queue_recorded() is true, we dont want to use jash distribution
as the device driver exactly told us which queue was selected at RX time.
jhash makes a statistical shuffle, but this wont work with 8 static inputs.
Later improvements would be to compute reciprocal value of real_num_tx_queues
to avoid a divide here. But this computation should be done once,
when real_num_tx_queues is set. This needs a separate patch, and a new
field in struct net_device.
Reported-by: Andrew Dickinson <andrew@whydna.net> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: Fix oops when splicing skbs from a frag_list.
Lennert Buytenhek wrote:
> Since 4fb669948116d928ae44262ab7743732c574630d ("net: Optimize memory
> usage when splicing from sockets.") I'm seeing this oops (e.g. in
> 2.6.30-rc3) when splicing from a TCP socket to /dev/null on a driver
> (mv643xx_eth) that uses LRO in the skb mode (lro_receive_skb) rather
> than the frag mode:
My patch incorrectly assumed skb->sk was always valid, but for
"frag_listed" skbs we can only use skb->sk of their parent.
Reported-by: Lennert Buytenhek <buytenh@wantstofly.org> Debugged-by: Lennert Buytenhek <buytenh@wantstofly.org> Tested-by: Lennert Buytenhek <buytenh@wantstofly.org> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Hutchings [Wed, 29 Apr 2009 08:04:14 +0000 (08:04 +0000)]
mdio: Add register definitions for MDIO (clause 45)
IEEE 802.3 clause 45 specifies the MDIO interface and registers for
use in 10G and other PHYs, similar to the MII management interface.
PHYs may have up to 32 MMDs corresponding to different sub-layers and
functions, each with up to 65536 registers. These are addressed by
PRTAD (similar to the MII PHY address) and DEVAD. Define a mapping
for specifying PRTAD and DEVAD through the existing MII ioctls.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Hutchings [Wed, 29 Apr 2009 08:02:59 +0000 (08:02 +0000)]
ethtool: Add port type PORT_OTHER
Add a PORT_OTHER to represent all other physical port types. Current
NICs generally do not allow switching between multiple port types in
software so specific types should not be needed.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
On several mv643xx_eth hardware versions, the two 64bit mib counters
for 'good octets received' and 'good octets sent' are actually 32bit
counters, and reading from the upper half of the register has the same
effect as reading from the lower half of the register: an atomic
read-and-clear of the entire 32bit counter value. This can under heavy
traffic occasionally lead to small numbers being added to the upper
half of the 64bit mib counter even though no 32bit wrap has occured.
Since we poll the mib counters at least every 30 seconds anyway, we
might as well just skip the reads of the upper halves of the hardware
counters without breaking the stats, which this patch does.
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Cc: stable@kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, when OOM occurs during rx ring refill, mv643xx_eth will get
into an infinite loop, due to the refill function setting the OOM bit
but not clearing the 'rx refill needed' bit for this queue, while the
calling function (the NAPI poll handler) will call the refill function
in a loop until the 'rx refill needed' bit goes off, without checking
the OOM bit.
This patch fixes this by checking the OOM bit in the NAPI poll handler
before attempting to do rx refill. This means that once OOM occurs,
we won't try to do any memory allocations again until the next invocation
of the poll handler.
While we're at it, change the OOM flag to be a single bit instead of
one bit per receive queue since OOM is a system state rather than a
per-queue state, and cancel the OOM timer on entry to the NAPI poll
handler if it's running to prevent it from firing when we've already
come out of OOM.
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Cc: stable@kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
John Dykstra [Thu, 30 Apr 2009 00:22:30 +0000 (17:22 -0700)]
pcnet32: Remove pointless memory barriers
These two memory barriers in performance-critical paths are not needed
on x86. Even if some other architecture does buffer PCI I/O space
writes, the existing memory-mapped I/O barriers are unlikely to be what
is needed.
Signed-off-by: John Dykstra <john.dykstra1@gmail.com> Acked-by: Don Fry <pcnet32@verizon.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Johannes Berg [Tue, 28 Apr 2009 22:28:18 +0000 (00:28 +0200)]
mac80211: default to automatic power control
In "mac80211: correct wext transmit power handler"
I fixed the wext handler, but forgot to make the default of the
user_power_level -1 (aka "auto"), so that now the transmit power
is always set to 0, causing associations to time out and similar
problems since we're transmitting with very little power. Correct
this by correcting the default user_power_level to -1.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Bisected-by: Niel Lambrechts <niel.lambrechts@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Don Skidmore [Wed, 29 Apr 2009 07:22:31 +0000 (00:22 -0700)]
ixgbe: Use pci_wake_from_d3() instead of multiple pci_enable_wake()
We were calling pci_enable_wake() twice in a row for both D3_hot
and D3_cold. This replaces those calls with a call to pci_wake_from_d3()
to avoid issues with PCI PM vs ordering constraints.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
o hold the firmware in memory across suspend, since filesystem
may not be up after resuming.
o reset the chip after requesting firmware, to minimize downtime
for NC-SI.
Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com> Signed-off-by: David S. Miller <davem@davemloft.net>
e100: do not go D3 in shutdown unless system is powering off
After experimenting with kexec with the last merges after 2.6.29, I've
had some problems when probing e100. It would not read the eeprom. After
some bisects, I realized this has been like that since forever (at least
2.6.18). The problem is that shutdown is doing the same thing that
suspend does and puts the device in D3 state. I couldn't find a way to
get the device back to a sane state in the probe function. So, based on
some similar patches from Rafael J. Wysocki for e1000, e1000e, and ixgbe,
I wrote this one for e100.
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@holoscopio.com> Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The x_tables are organized with a table structure and a per-cpu copies
of the counters and rules. On older kernels there was a reader/writer
lock per table which was a performance bottleneck. In 2.6.30-rc, this
was converted to use RCU and the counters/rules which solved the performance
problems for do_table but made replacing rules much slower because of
the necessary RCU grace period.
This version uses a per-cpu set of spinlocks and counters to allow to
table processing to proceed without the cache thrashing of a global
reader lock and keeps the same performance for table updates.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Bob Copeland [Tue, 28 Apr 2009 02:12:43 +0000 (22:12 -0400)]
ath5k: fix buffer overrun in rate debug code
char bname[5] is too small for the string "X GHz" when the null
terminator is taken into account. Thus, turning on rate debugging
can crash unless we have lucky stack alignment.
Cc: stable@kernel.org Reported-by: Paride Legovini <legovini@spiro.fisica.unipd.it> Signed-off-by: Bob Copeland <me@bobcopeland.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Johannes Berg [Thu, 23 Apr 2009 08:45:04 +0000 (10:45 +0200)]
iwlwifi: notify on scan completion even when shutting down
Under certain circumstances iwlwifi can get stuck and will no
longer accept scan requests, because the core code (cfg80211)
thinks that it's still processing one. This fixes one of the
points where it can happen, but I've still seen it (although
only with my radio-off-when-idle patch).
Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Acked-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Jussi Kivilinna [Wed, 22 Apr 2009 07:59:37 +0000 (10:59 +0300)]
rndis_wlan: fix initialization order for workqueue&workers
rndis_wext_link_change() might be called from rndis_command() at
initialization stage and priv->workqueue/priv->work have not been
initialized yet. This causes invalid opcode at rndis_wext_bind on
some brands of bcm4320.
Fix by initializing workqueue/workers in rndis_wext_bind() before
rndis_command is used.
This bug has existed since 2.6.25, reported at:
http://bugzilla.kernel.org/show_bug.cgi?id=12794
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Stephen Rothwell [Wed, 22 Apr 2009 05:11:05 +0000 (15:11 +1000)]
wireless: remove unneeded EXPORT_SYMBOL the tickles a powerpc compiler bug
drivers/net/wireless/iwlwifi/iwl3945-base.c:1415: error: __ksymtab_iwl3945_rx_queue_reset causes a section type conflict
I am pretty sure that this is a compiler bug, so not to worry. However,
as far as I can see, iwl-3945.o (the only user) and iwl3945-base.o are
always linked into the same module, so the EXPORT_SYMBOL (which causes
the problem) should not be needed. Correct?
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Bluetooth: Fix connection establishment with low security requirement
The Bluetooth 2.1 specification introduced four different security modes
that can be mapped using Legacy Pairing and Simple Pairing. With the
usage of Simple Pairing it is required that all connections (except
the ones for SDP) are encrypted. So even the low security requirement
mandates an encrypted connection when using Simple Pairing. When using
Legacy Pairing (for Bluetooth 2.0 devices and older) this is not required
since it causes interoperability issues.
To support this properly the low security requirement translates into
different host controller transactions depending if Simple Pairing is
supported or not. However in case of Simple Pairing the command to
switch on encryption after a successful authentication is not triggered
for the low security mode. This patch fixes this and actually makes
the logic to differentiate between Simple Pairing and Legacy Pairing
a lot simpler.
Based on a report by Ville Tervo <ville.tervo@nokia.com>
Bluetooth: Add different pairing timeout for Legacy Pairing
The Bluetooth stack uses a reference counting for all established ACL
links and if no user (L2CAP connection) is present, the link will be
terminated to save power. The problem part is the dedicated pairing
when using Legacy Pairing (Bluetooth 2.0 and before). At that point
no user is present and pairing attempts will be disconnected within
10 seconds or less. In previous kernel version this was not a problem
since the disconnect timeout wasn't triggered on incoming connections
for the first time. However this caused issues with broken host stacks
that kept the connections around after dedicated pairing. When the
support for Simple Pairing got added, the link establishment procedure
needed to be changed and now causes issues when using Legacy Pairing
When using Simple Pairing it is possible to do a proper reference
counting of ACL link users. With Legacy Pairing this is not possible
since the specification is unclear in some areas and too many broken
Bluetooth devices have already been deployed. So instead of trying to
deal with all the broken devices, a special pairing timeout will be
introduced that increases the timeout to 60 seconds when pairing is
triggered.
If a broken devices now puts the stack into an unforeseen state, the
worst that happens is the disconnect timeout triggers after 120 seconds
instead of 4 seconds. This allows successful pairings with legacy and
broken devices now.
Based on a report by Johan Hedberg <johan.hedberg@nokia.com>
Kumar Gala [Tue, 28 Apr 2009 15:04:10 +0000 (08:04 -0700)]
gianfar: Use memset instead of cacheable_memzero
cacheable_memzero() is completely overkill for the clearing out the FCB
block which is only 8-bytes. The compiler should easily optimize this
with memset. Additionally, cacheable_memzero() only exists on ppc32 and
thus breaks builds of gianfar on ppc64.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 28 Apr 2009 09:24:21 +0000 (02:24 -0700)]
net: Avoid extra wakeups of threads blocked in wait_for_packet()
In 2.6.25 we added UDP mem accounting.
This unfortunatly added a penalty when a frame is transmitted, since
we have at TX completion time to call sock_wfree() to perform necessary
memory accounting. This calls sock_def_write_space() and utimately
scheduler if any thread is waiting on the socket.
Thread(s) waiting for an incoming frame was scheduled, then had to sleep
again as event was meaningless.
(All threads waiting on a socket are using same sk_sleep anchor)
This adds lot of extra wakeups and increases latencies, as noted
by Christoph Lameter, and slows down softirq handler.
Fortunatly, Davide Libenzi recently added concept of keyed wakeups
into kernel, and particularly for sockets (see commit 37e5540b3c9d838eb20f2ca8ea2eb8072271e403
epoll keyed wakeups: make sockets use keyed wakeups)
Davide goal was to optimize epoll, but this new wakeup infrastructure
can help non epoll users as well, if they care to setup an appropriate
handler.
This patch introduces new DEFINE_WAIT_FUNC() helper and uses it
in wait_for_packet(), so that only relevant event can wakeup a thread
blocked in this function.
Trace of function calls from bnx2 TX completion bnx2_poll_work() is :
__kfree_skb()
skb_release_head_state()
sock_wfree()
sock_def_write_space()
__wake_up_sync_key()
__wake_up_common()
receiver_wake_function() : Stops here since thread is waiting for an INPUT
Reported-by: Christoph Lameter <cl@linux.com> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Grant Likely [Tue, 28 Apr 2009 09:11:53 +0000 (02:11 -0700)]
net: Fix ucc_geth.c handling of fixed-link w/o phy-connection-type property.
Previous rework to ucc_geth.c to add of_mdio support (net: Rework
ucc_geth driver to use of_mdio infrastructure) added a block of
code which broke older openfirmware device trees which this case.
This patch removes the offending blurb.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
Ayyappan at VMware noticed that we're missing this check from ixgbe which
is in our other drivers. The difference with this implementation from our
other drivers is that this checks all the tx queues rather than just tx[0].
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Update the interrupt management to correctly handle greater
than 16 queue vectors.
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Mon, 27 Apr 2009 22:42:37 +0000 (22:42 +0000)]
ixgbe: enable HW RSC for 82599
This patch enables hardware receive side coalescing for 82599 hardware.
82599 can merge multiple frames from the same TCP/IP flow into a single
structure that can span one ore more descriptors. The accumulated data is
arranged similar to how jumbo frames are arranged with the exception that
other packets can be interlaced inbetween. To overcome this issue a next
pointer is included in the written back descriptor which indicates the next
descriptor in the writeback sequence.
This feature sets the NETIF_F_LRO flag and clearing it via the ethtool set
flags operation will also disable hardware RSC.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
This is the code to enable ixgbe for hardware offload support
of CRC32c on both transmit and receive of SCTP traffic.
only 82599 supports this offload, not 82598.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Originally from: Vlad Yasevich <vladislav.yasevich@hp.com>
This patch, both the driver portion and the sctp code was
modified by Jesse Brandeburg and is
Copyright(c) 2009 Intel Corporation.
Thanks go to Vlad for starting this work.
Intel 82576 chipset supports SCTP checksum offloading. This
patch enables this functionality in the driver. A new NETIF
feature is introduced for SCTP checksum offload. If the driver
supports CRC32c checksum, it can set this feature flag. The
hardware can offload both transmit and receive.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
sctp: add feature bit for SCTP offload in hardware
this is the sctp code to enable hardware crc32c offload for
adapters that support it.
Originally by: Vlad Yasevich <vladislav.yasevich@hp.com>
modified by Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Mon, 27 Apr 2009 22:35:33 +0000 (22:35 +0000)]
igb/ixgbe: remove unecessary checks for CHECKSUM_UNNECESSARY
Both of these drivers do a check to verify ip_summed is set to
CHECKSUM_UNNECESSARY prior to passing the packet to GRO. GRO itself
already does such a check so it is redundant and can be removed as this
will likely cause out of order issues when receiving a packet that didn't
pass checksum validation.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Mon, 27 Apr 2009 22:35:14 +0000 (22:35 +0000)]
igb: make rxcsum configuration seperate from multiqueue
The igb driver was being incorrectly setup to only allow disabling receive
checksum if multiqueue was disabled. This change corrects that so that
RXCSUM is configured regardless of queue configuration.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Mon, 27 Apr 2009 22:34:54 +0000 (22:34 +0000)]
igb: reconfigure mailbox timeout logic
This change updates the timeout logic so that it is not possible to have a
sucessful check for message and still return an error if countdown = 0.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Reported-by: Juha Leppanen <juha_motorsportscom@luukku.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Magnus Damm [Mon, 27 Apr 2009 21:32:16 +0000 (21:32 +0000)]
smsc911x: add fifo byteswap support V2
This is V2 of the smsc911x fifo byteswap patch.
The smsc911x hardware supports both big and little and endian
hardware configurations, and the linux smsc911x driver currently
detects word order.
For correct operation on big endian platforms lacking swapped
byte lanes the following patch is needed. Only fifo data is
swapped, register data does not require any swapping.
Signed-off-by: Magnus Damm <damm@igel.co.jp> Acked-by: Steve Glendinning <steve.glendinning@smsc.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jie Yang [Mon, 27 Apr 2009 19:42:03 +0000 (19:42 +0000)]
atl1c: disable L1/L0s when link detected
Disable L1/L0s when link detected. We enable L1/L0s when link connected
before, but there is some hareware error on some platform. So just diable
this feature when link connected. This feature is about power saving.
Signed-off-by: Jie Yang <jie.yang@atheros.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Mon, 27 Apr 2009 12:44:45 +0000 (05:44 -0700)]
gro: Fix COMPLETE checksum handling
On a brand new GRO skb, we cannot call ip_hdr since the header
may lie in the non-linear area. This patch adds the helper
skb_gro_network_header to handle this.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Mon, 27 Apr 2009 12:44:29 +0000 (05:44 -0700)]
gro: Fix handling of headers that extend over the tail
The skb_gro_* code fails to handle the case where a header starts
in the linear area but ends in the frags area. Since the goal
of skb_gro_* is to optimise the case of completely non-linear
packets, we can simply bail out if we have anything in the linear
area.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
It would be nice to cap this for memory consumption reasons, but a massive
hashtable also causes a significant spike when measuring OS jitter.
With a 32MB hashtable and 4 million entries, rt_worker_func is taking
5 ms to complete. On another system with more memory it's taking 14 ms.
Even though rt_worker_func does call cond_sched() to limit its impact,
in an HPC environment we want to keep all sources of OS jitter to a minimum.
With the patch applied we limit the number of entries to 512k which
can still be overriden by using the rt_entries boot option:
Create a file Documentation/isdn/INTERFACE.CAPI describing the
interface between the kernel CAPI subsystem and ISDN device drivers,
analogous to the existing Documentation/isdn/INTERFACE for the old
isdn4linux subsystem. Also add kerneldoc comments to the exported
functions in drivers/isdn/capi/kcapi.c.
Impact: Documentation Signed-off-by: Tilman Schmidt <tilman@imap.cc> Signed-off-by: Karsten Keil <keil@b1-systems.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>