Cc: Amir Vadai <amirv@mellanox.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
John Fastabend [Sun, 15 Apr 2012 06:44:37 +0000 (06:44 +0000)]
macvlan: add FDB bridge ops and macvlan flags
This adds FDB bridge ops to the macvlan device passthru mode.
Additionally a flags field was added and a NOPROMISC bit to
allow users to use passthru mode without the driver calling
dev_set_promiscuity(). The flags field is a u16 placed in a
4 byte hole (consuming 2 bytes) of the macvlan_dev struct.
We want to do this so that the macvlan driver or stack
above the macvlan driver does not have to process every
packet. For the use case where we know all the MAC addresses
of the endstations above us this works well.
This patch is a result of Roopa Prabhu's work. Follow up
patches are needed for VEPA and VEB macvlan modes.
v2: Change from distinct nopromisc mode to a flags field to
configure this. This avoids the tendency to add a new
mode every time we need some slightly different behavior.
v3: fix error in dev_set_promiscuity and add change and get
link attributes for flags.
CC: Roopa Prabhu <roprabhu@cisco.com> CC: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Greg Rose [Sun, 15 Apr 2012 06:44:31 +0000 (06:44 +0000)]
ixgbe: UTA table incorrectly programmed
The UTA table was being set to the functional equivalent of promiscuous
mode. This was resulting in traffic from the virtual function being
flooded onto the wire and the PF device. This resulted in additional
overhead for VF traffic sent to the network and in the case of traffic
sent to the PF or another VF resulted in unwanted packets on the wire.
This was actually not the intended behavior. Now that we can program
the embedded switch correctly we can remove this snippit of code. Users
who want to support this should configure the FDB correctly using the
FDB ops.
Signed-off-by: Greg Rose <gregory.v.rose@intel.com> Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
John Fastabend [Sun, 15 Apr 2012 06:44:25 +0000 (06:44 +0000)]
ixgbe: allow RAR table to be updated in promisc mode
This allows RAR table updates while in promiscuous. With
SR-IOV enabled it is valuable to allow the RAR table to
be updated even when in promisc mode to configure forwarding
Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
John Fastabend [Sun, 15 Apr 2012 06:44:14 +0000 (06:44 +0000)]
net: rtnetlink notify events for FDB NTF_SELF adds and deletes
It is useful to be able to monitor for FDB events in user space.
This patch adds support to generate netlink events when a change
is made to a device supporting the FDB ops.
This brings embedded switches inline with the SW net/bridge which
triggers events on FDB updates as well.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
John Fastabend [Sun, 15 Apr 2012 06:44:08 +0000 (06:44 +0000)]
net: add fdb generic dump routine
This adds a generic dump routine drivers can call. It
should be sufficient to handle any bridging model that
uses the unicast address list. This should be most SR-IOV
enabled NICs.
v2: return error on nlmsg_put and use -EMSGSIZE instead
of -ENOMEM this is inline other usages
Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
John Fastabend [Sun, 15 Apr 2012 06:44:02 +0000 (06:44 +0000)]
net: addr_list: add exclusive dev_uc_add and dev_mc_add
This adds a dev_uc_add_excl() and dev_mc_add_excl() calls
similar to the original dev_{uc|mc}_add() except it sets
the global bit and returns -EEXIST for duplicat entires.
This is useful for drivers that support SR-IOV, macvlan
devices and any other devices that need to manage the
unicast and multicast lists.
v2: fix typo UNICAST should be MULTICAST in dev_mc_add_excl()
CC: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
John Fastabend [Sun, 15 Apr 2012 06:43:56 +0000 (06:43 +0000)]
net: add generic PF_BRIDGE:RTM_ FDB hooks
This adds two new flags NTF_MASTER and NTF_SELF that can
now be used to specify where PF_BRIDGE netlink commands should
be sent. NTF_MASTER sends the commands to the 'dev->master'
device for parsing. Typically this will be the linux net/bridge,
or open-vswitch devices. Also without any flags set the command
will be handled by the master device as well so that current user
space tools continue to work as expected.
The NTF_SELF flag will push the PF_BRIDGE commands to the
device. In the basic example below the commands are then parsed
and programmed in the embedded bridge.
Note if both NTF_SELF and NTF_MASTER bits are set then the
command will be sent to both 'dev->master' and 'dev' this allows
user space to easily keep the embedded bridge and software bridge
in sync.
There is a slight complication in the case with both flags set
when an error occurs. To resolve this the rtnl handler clears
the NTF_ flag in the netlink ack to indicate which sets completed
successfully. The add/del handlers will abort as soon as any
error occurs.
To support this new net device ops were added to call into
the device and the existing bridging code was refactored
to use these. There should be no required changes in user space
to support the current bridge behavior.
A basic setup with a SR-IOV enabled NIC looks like this,
In this case the embedded bridge must be managed to allow 'veth0'
to communicate with 'ethx.y' correctly. At present drivers managing
the embedded bridge either send frames onto the network which
then get dropped by the switch OR the embedded bridge will flood
these frames. With this patch we have a mechanism to manage the
embedded bridge correctly from user space. This example is specific
to SR-IOV but replacing the VF with another PF or dropping this
into the DSA framework generates similar management issues.
Examples session using the 'br'[1] tool to add, dump and then
delete a mac address with a new "embedded" option and enabled
ixgbe driver:
# br fdb add 22:35:19:ac:60:59 dev eth3
# br fdb
port mac addr flags
veth0 22:35:19:ac:60:58 static
veth0 9a:5f:81:f7:f6:ec local
eth3 00:1b:21:55:23:59 local
eth3 22:35:19:ac:60:59 static
veth0 22:35:19:ac:60:57 static
#br fdb add 22:35:19:ac:60:59 embedded dev eth3
#br fdb
port mac addr flags
veth0 22:35:19:ac:60:58 static
veth0 9a:5f:81:f7:f6:ec local
eth3 00:1b:21:55:23:59 local
eth3 22:35:19:ac:60:59 static
veth0 22:35:19:ac:60:57 static
eth3 22:35:19:ac:60:59 local embedded
#br fdb del 22:35:19:ac:60:59 embedded dev eth3
I added a couple lines to 'br' to set the flags correctly is all. It
is my opinion that the merit of this patch is now embedded and SW
bridges can both be modeled correctly in user space using very nearly
the same message passing.
[1] 'br' tool was published as an RFC here and will be renamed 'bridge'
http://patchwork.ozlabs.org/patch/117664/
Thanks to Jamal Hadi Salim, Stephen Hemminger and Ben Hutchings for
valuable feedback, suggestions, and review.
v2: fixed api descriptions and error case with both NTF_SELF and
NTF_MASTER set plus updated patch description.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Tony Zelenoff [Fri, 13 Apr 2012 06:09:54 +0000 (06:09 +0000)]
atl1: do not drop rx/tx interrupts before they are scheduled
To prevent interrupts lost they should be dropped only if
they are scheduled via napi interfaces. In other case, there is
exists situation when napi handler process TX interrupt, stay in
RX processing and in that moment any other interrupt received.
Then before this patch TX bit in ISR will be cleaned, napi
schedule will not occur in case of currently processing event and
TX interrupt definitely will be lost.
Signed-off-by: Tony Zelenoff <antonz@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Tony Zelenoff [Fri, 13 Apr 2012 06:09:51 +0000 (06:09 +0000)]
atl1: add value to check ability of reenabling IRQs
Unfortunately it is not clear from code is usage of
IMR register possible or not. So, to prevent possible
side-effects of reading this register i prefer store
interrupts enable flag separately.
Signed-off-by: Tony Zelenoff <antonz@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Fri, 13 Apr 2012 02:37:42 +0000 (02:37 +0000)]
bridge: Add multicast_querier toggle and disable queries by default
Sending general queries was implemented as an optimisation to speed
up convergence on start-up. In order to prevent interference with
multicast routers a zero source address has to be used.
Unfortunately these packets appear to cause some multicast-aware
switches to misbehave, e.g., by disrupting multicast packets to us.
Since the multicast snooping feature still functions without sending
our own queries, this patch will change the default to not send
queries.
For those that need queries in order to speed up convergence on start-up,
a toggle is provided to restore the previous behaviour.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Fri, 13 Apr 2012 02:37:42 +0000 (02:37 +0000)]
bridge: Restart queries when last querier expires
As it stands when we discover that a real querier (one that queries
with a non-zero source address) we stop querying. However, even
after said querier has fallen off the edge of the earth, we will
never restart querying (unless the bridge itself is restarted).
This patch fixes this by kicking our own querier into gear when
the timer for other queriers expire.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
Jason Wang [Wed, 11 Apr 2012 20:43:52 +0000 (20:43 +0000)]
virtio-net: send gratuitous packets when needed
As hypervior does not have the knowledge of guest network configuration, it's
better to ask guest to send gratuitous packets when needed.
This patch implements VIRTIO_NET_F_GUEST_ANNOUNCE feature: hypervisor would
notice the guest when it thinks it's time for guest to announce the link
presnece. Guest tests VIRTIO_NET_S_ANNOUNCE bit during config change interrupt
and woule send gratuitous packets through netif_notify_peers() and ack the
notification through ctrl vq.
We need to make sure the atomicy of read and ack in guest otherwise we may ack
more times than being notified. This is done through handling the whole config
change interrupt in an non-reentrant workqueue.
Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Peter Hüwe [Sat, 14 Apr 2012 13:42:59 +0000 (13:42 +0000)]
isdn/hysdn: Convert to kstrtoul_from_user
This patch replaces the code for getting an number from a
userspace buffer by a simple call to kstroul_from_user.
This makes it easier to read and less error prone.
Signed-off-by: Peter Huewe <peterhuewe@gmx.de> Signed-off-by: David S. Miller <davem@davemloft.net>
tcp_enter_quickack_mode() already calls tcp_incr_quickack() and sets
icsk->icsk_ack.ato to TCP_ATO_MIN. This patch removes the duplication.
Signed-off-by: Vijay Subramanian <subramanian.vijay@gmail.com> Reviewed-by: Flavio Leitner <fbl@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Copot [Thu, 12 Apr 2012 22:21:45 +0000 (22:21 +0000)]
tcp: bind() use stronger condition for bind_conflict
We must try harder to get unique (addr, port) pairs when
doing port autoselection for sockets with SO_REUSEADDR
option set.
We achieve this by adding a relaxation parameter to
inet_csk_bind_conflict. When 'relax' parameter is off
we return a conflict whenever the current searched
pair (addr, port) is not unique.
Tests where ran for creating and binding(0) many sockets
on 100 IPs. The results are, on average:
* 60000 sockets, 600 ports / IP:
* 0.210 s, 620 (IP, port) duplicates without patch
* 0.219 s, no duplicates with patch
* 100000 sockets, 1000 ports / IP:
* 0.371 s, 1720 duplicates without patch
* 0.373 s, no duplicates with patch
* 200000 sockets, 2000 ports / IP:
* 0.766 s, 6900 duplicates without patch
* 0.768 s, no duplicates with patch
* 500000 sockets, 5000 ports / IP:
* 2.227 s, 41500 duplicates without patch
* 2.284 s, no duplicates with patch
Signed-off-by: Alex Copot <alex.mihai.c@gmail.com> Signed-off-by: Daniel Baluta <dbaluta@ixiacom.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 12 Apr 2012 22:16:05 +0000 (22:16 +0000)]
inet: makes syn_ack_timeout mandatory
There are two struct request_sock_ops providers, tcp and dccp.
inet_csk_reqsk_queue_prune() can avoid testing syn_ack_timeout being
NULL if we make it non NULL like syn_ack_timeout
Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Gerrit Renker <gerrit@erg.abdn.ac.uk> Cc: dccp@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 12 Apr 2012 19:48:40 +0000 (19:48 +0000)]
tcp: RFC6298 supersedes RFC2988bis
Updates some comments to track RFC6298
Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: H.K. Jerry Chu <hkchu@google.com> Cc: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
At the beginning of ks_rcv(), a for loop retrieves the
header information relevant to all the frames stored
in the mac's internal buffers. The number of pending
frames is stored as an 8 bits field in KS_RXFCTR.
If interrupts are disabled long enough to allow for more than
32 frames to accumulate in the MAC's internal buffers, a buffer
overflow occurs.
This patch fixes the problem by making the
driver's frame_head_info buffer big enough.
Well actually, since the chip appears to have 12K of
internal rx buffers and the shortest ethernet frame should
be 64 bytes long, maybe the limit could be set to
12*1024/64 = 192 frames, but 255 should be safer.
Signed-off-by: Davide Ciminaghi <ciminaghi@gnudd.com> Signed-off-by: Raffaele Recalcati <raffaele.recalcati@bticino.it> Signed-off-by: David S. Miller <davem@davemloft.net>
Axel Lin [Fri, 13 Apr 2012 18:41:21 +0000 (18:41 +0000)]
net/wan: use module_pci_driver
This patch converts the drivers in drivers/net/wan/* to use
module_pci_driver() macro which makes the code smaller and a bit simpler.
Signed-off-by: Axel Lin <axel.lin@gmail.com> Cc: Francois Romieu <romieu@fr.zoreil.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Axel Lin [Fri, 13 Apr 2012 18:40:17 +0000 (18:40 +0000)]
net/tokenring: use module_pci_driver
This patch converts the drivers in drivers/net/tokenring/* to use
module_pci_driver() macro which makes the code smaller and a bit simpler.
Signed-off-by: Axel Lin <axel.lin@gmail.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Don Skidmore [Thu, 15 Mar 2012 07:36:37 +0000 (07:36 +0000)]
ixgbe: add I2C clock stretching
This patch adds support for I2C clock stretching which is required per
SFF-8636. Customers with passive DA cables implement clock stretching
would fail without this patch.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Fri, 13 Apr 2012 00:08:31 +0000 (00:08 +0000)]
e1000e: cleanup boolean logic
Replace occurrences of 'if (<bool expr> == <1|0>)' with
'if ([!]<bool expr>)'
Replace occurrences of '<bool var> = (<non-bool expr>) ? true : false'
with '<bool var> = <non-bool expr>'.
Replace occurrence of '<bool var> = <non-bool expr>' with
'<bool var> = !!<non-bool expr>'
While the latter replacement is not really necessary, it is done here for
consistency and clarity. No functional changes.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Thu, 12 Apr 2012 05:47:09 +0000 (05:47 +0000)]
e1000e: cleanup remaining strings split across multiple lines
Now that split strings generate checkpatch warnings (per Chapter 2 of
Documentation/CodingStyle to make it easier to grep the code for the
string) cleanup the remaining instances of them in the driver.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Richard Cochran [Sun, 8 Apr 2012 14:38:10 +0000 (14:38 +0000)]
e100: enable transmit time stamping.
This patch enables software (and phy device) transmit time stamping.
Tested on an old PIII laptop with built in NIC.
Signed-off-by: Richard Cochran <richardcochran@gmail.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Richard Cochran [Wed, 4 Apr 2012 17:43:31 +0000 (17:43 +0000)]
e100: Support the get_ts_info ethtool method.
Signed-off-by: Richard Cochran <richardcochran@gmail.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Don Skidmore [Thu, 5 Apr 2012 08:12:05 +0000 (08:12 +0000)]
ixgbe: fix WoL issue with fiber
There are times we turn of the laser before shutdown. This is a bad thing
if we want to wake on lan to work so now we make sure the laser is on
before shutdown if we support WoL.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Tested-by: Stephen Ko <stephen.s.ko@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Thu, 12 Apr 2012 06:27:03 +0000 (06:27 +0000)]
e1000e: issues in Sx on 82577/8/9
A workaround was previously put in the driver to reset the device when
transitioning to Sx in order to activate the changed settings of the PHY
OEM bits (Low Power Link Up, or LPLU, and GbE disable configuration) for
82577/8/9 devices. After further review, it was found such a reset can
cause the 82579 to confuse which version of 82579 it actually is and broke
LPLU on all 82577/8/9 devices. The workaround during an S0->Sx transition
on 82579 (instead of resetting the PHY) is to restart auto-negotiation
after the OEM bits are configured; the restart of auto-negotiation
activates the new OEM bits as does the reset. With 82577/8, the reset is
changed to a generic reset which fixes the LPLU bits getting set wrong.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
David S. Miller [Fri, 13 Apr 2012 18:21:04 +0000 (14:21 -0400)]
rtnetlink: ops->get_tx_queue() cannot take a const 'tb'.
net/core/rtnetlink.c: In function ‘rtnl_create_link’:
net/core/rtnetlink.c:1645:3: warning: passing argument 2 of ‘ops->get_tx_queues’ from incompatible pointer type [enabled by default]
net/core/rtnetlink.c:1645:3: note: expected ‘const struct nlattr **’ but argument is of type ‘struct nlattr **’
Signed-off-by: David S. Miller <davem@davemloft.net>
Will Deacon [Thu, 12 Apr 2012 05:54:09 +0000 (05:54 +0000)]
net: smsc911x: fix skb handling in receive path
The SMSC911x driver resets the ->head, ->data and ->tail pointers in the
skb on the reset path in order to avoid buffer overflow due to packet
padding performed by the hardware.
This patch fixes the receive path so that the skb pointers are fixed up
after the data has been read from the device, The error path is also
fixed to use number of words consistently and prevent erroneous FIFO
fastforwarding when skipping over bad data.
Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Matt Renzelmann [Fri, 13 Apr 2012 07:59:40 +0000 (07:59 +0000)]
ks8851: Fix missing mutex_lock/unlock
Move the ks8851_rdreg16 call above the call to request_irq and cache
the result for subsequent repeated use. A spurious interrupt may
otherwise cause a crash. Thanks to Stephen Boyd, Flavio Leitner, and
Ben Hutchings for feedback.
Signed-off-by: Matt Renzelmann <mjr@cs.wisc.edu> Signed-off-by: David S. Miller <davem@davemloft.net>
Jason Wang [Wed, 11 Apr 2012 22:10:54 +0000 (22:10 +0000)]
8139cp: set intr mask after its handler is registered
We set intr mask before its handler is registered, this does not work well when
8139cp is sharing irq line with other devices. As the irq could be enabled by
the device before 8139cp's hander is registered which may lead unhandled
irq. Fix this by introducing an helper cp_irq_enable() and call it after
request_irq().
Signed-off-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Flavio Leitner <fbl@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Wed, 11 Apr 2012 23:05:28 +0000 (23:05 +0000)]
udp: intoduce udp_encap_needed static_key
Most machines dont use UDP encapsulation (L2TP)
Adds a static_key so that udp_queue_rcv_skb() doesnt have to perform a
test if L2TP never setup the encap_rcv on a socket.
Idea of this patch came after Simon Horman proposal to add a hook on TCP
as well.
If static_key is not yet enabled, the fast path does a single JMP .
When static_key is enabled, JMP destination is patched to reach the real
encap_type/encap_rcv logic, possibly adding cache misses.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Simon Horman <horms@verge.net.au> Cc: dev@openvswitch.org Signed-off-by: David S. Miller <davem@davemloft.net>
Tony Zelenoff [Wed, 11 Apr 2012 06:15:03 +0000 (06:15 +0000)]
atl1: fix kernel panic in case of DMA errors
Problem:
There was two separate work_struct structures which share one
handler. Unfortunately getting atl1_adapter structure from
work_struct in case of DMA error was done from incorrect
offset which cause kernel panics.
Solution:
The useless work_struct for DMA error removed and
handler name changed to more generic one.
Signed-off-by: Tony Zelenoff <antonz@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Mike Sinkovsky [Tue, 10 Apr 2012 19:53:53 +0000 (19:53 +0000)]
net: WIZnet drivers: fix possible NULL dereference
This fixes possible null dereference in probe() function: when both
.mac_addr and .link_gpio are unknown, dev.platform_data may be NULL
Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Mike Sinkovsky <msink@permonline.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
net: Remove redundant spi driver bus initialization
In ancient times it was necessary to manually initialize the bus field of an
spi_driver to spi_bus_type. These days this is done in spi_driver_register() so
we can drop the manual assignment.
The patch was generated using the following coccinelle semantic patch:
// <smpl>
@@
identifier _driver;
@@
struct spi_driver _driver = {
.driver = {
- .bus = &spi_bus_type,
},
};
// </smpl>
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Cc: "David S. Miller" <davem@davemloft.net> Cc: Gabor Juhos <juhosg@openwrt.org> Cc: Frederic Lambert <frdrc66@gmail.com> Cc: netdev@vger.kernel.org Acked-by: Gabor Juhos <juhosg@openwrt.org> Signed-off-by: David S. Miller <davem@davemloft.net>
David Ward [Mon, 9 Apr 2012 04:13:53 +0000 (04:13 +0000)]
net/garp: fix GID rbtree ordering
The comparison operators were backwards in both garp_attr_lookup and
garp_attr_create, so the entire GID rbtree was in reverse order.
(There was no practical side effect to this though, except that PDUs
were sent with attributes listed in reverse order, which is still
valid by the protocol. This change is only for clarity.)
Signed-off-by: David Ward <david.ward@ll.mit.edu> Signed-off-by: David S. Miller <davem@davemloft.net>
The skb struct ubuf_info callback gets passed struct ubuf_info
itself, not the arg value as the field name and the function signature
seem to imply. Rename the arg field to ctx to match usage,
add documentation and change the callback argument type
to make usage clear and to have compiler check correctness.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David Woodhouse [Sun, 8 Apr 2012 10:01:44 +0000 (10:01 +0000)]
ppp: Fix race condition with queue start/stop
Commit e675f0cc9a872fd152edc0c77acfed19bf28b81e ("ppp: Don't stop and
restart queue on every TX packet") introduced a race condition which
could leave the net queue stopped even when the channel is no longer
busy. By calling netif_stop_queue() from ppp_start_xmit(), based on the
return value from ppp_xmit_process() but *after* all the locks have been
dropped, we could potentially do so *after* the channel has actually
finished transmitting and attempted to re-wake the queue.
Fix this by moving the netif_stop_queue() into ppp_xmit_process() under
the xmit lock. I hadn't done this previously, because it gets called
from other places than ppp_start_xmit(). But I now think it's the better
option. The net queue *should* be stopped if the channel becomes
congested due to writes from pppd, anyway.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David Woodhouse [Sun, 8 Apr 2012 09:55:43 +0000 (09:55 +0000)]
pppoatm: Fix excessive queue bloat
We discovered that PPPoATM has an excessively deep transmit queue. A
queue the size of the default socket send buffer (wmem_default) is
maintained between the PPP generic core and the ATM device.
Fix it to queue a maximum of *two* packets. The one the ATM device is
currently working on, and one more for the ATM driver to process
immediately in its TX done interrupt handler. The PPP core is designed
to feed packets to the channel with minimal latency, so that really
ought to be enough to keep the ATM device busy.
While we're at it, fix the fact that we were triggering the wakeup
tasklet on *every* pppoatm_pop() call. The comment saying "this is
inefficient, but doing it right is too hard" turns out to be overly
pessimistic... I think :)
On machines like the Traverse Geos, with a slow Geode CPU and two
high-speed ADSL2+ interfaces, there were reports of extremely high CPU
usage which could partly be attributed to the extra wakeups.
(The wakeup handling could actually be made a whole lot easier if we
stop checking sk->sk_sndbuf altogether. Given that we now only queue
*two* packets ever, one wonders what the point is. As it is, you could
already deadlock the thing by setting the sk_sndbuf to a value lower
than the MTU of the device, and it'd just block for ever.)
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
If the ipv6 dst cache which copy from the dst generated by ICMPV6 RA packet.
this dst cache will not check expire because it has no RTF_EXPIRES flag.
So this dst cache will always be used until the dst gc run.
Change the struct dst_entry,add a union contains new pointer from and expires.
When rt6_info.rt6i_flags has no RTF_EXPIRES flag,the dst.expires has no use.
we can use this field to point to where the dst cache copy from.
The dst.from is only used in IPV6.
rt6_check_expired check if rt6_info.dst.from is expired.
ip6_rt_copy only set dst.from when the ort has flag RTF_ADDRCONF
and RTF_DEFAULT.then hold the ort.
ip6_dst_destroy release the ort.
Add some functions to operate the RTF_EXPIRES flag and expires(from) together.
and change the code to use these new adding functions.
Changes from v5:
modify ip6_route_add and ndisc_router_discovery to use new adding functions.
Only set dst.from when the ort has flag RTF_ADDRCONF
and RTF_DEFAULT.then hold the ort.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Do the initialization of the HSI interface when the
interface is opened, instead of upon registration.
When the interface is closed the HSI interface is
de-initialized, allowing other modules to use the
HSI interface.
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Implement aggregation algorithm, combining more data into a single
HSI transfer. 4 different traffic categories are supported:
1. TC_PRIO_CONTROL .. TC_PRIO_MAX (CTL)
2. TC_PRIO_INTERACTIVE (VO)
3. TC_PRIO_INTERACTIVE_BULK (VI)
4. TC_PRIO_BESTEFFORT, TC_PRIO_BULK, TC_PRIO_FILLER (BEBK)
Signed-off-by: Dmitry Tarnyagin <dmitry.tarnyagin@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
James Chapman [Tue, 10 Apr 2012 00:10:43 +0000 (00:10 +0000)]
l2tp: don't overwrite source address in l2tp_ip_bind()
Applications using L2TP/IP sockets want to be able to bind() an L2TP/IP
socket to set the local tunnel id while leaving the auto-assigned source
address alone. So if no source address is supplied, don't overwrite
the address already stored in the socket.
Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
James Chapman [Tue, 10 Apr 2012 00:10:42 +0000 (00:10 +0000)]
l2tp: fix refcount leak in l2tp_ip sockets
The l2tp_ip socket close handler does not update the module refcount
correctly which prevents module unload after the first bind() call on
an L2TPv3 IP encapulation socket.
Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Torsten Kaiser [Mon, 9 Apr 2012 05:14:15 +0000 (05:14 +0000)]
net: Fix misplaced parenthesis in virtio_net.c
Commit 2e57b79ccef1ff1422fdf45a9b28fe60f8f084f7 misplaced its
parenthesis and now tx_fifo_errors will only be incremented if an
ENOMEM error is not written to the syslog.
Correct the parenthesis and indentation to the original goal of
counting all non ENOMEM errors and ratelimiting only the messages.
Signed-of-by: Torsten Kaiser <just.for.lkml@googlemail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
After investigation it turns out there were two issues.
1) Phonet was not implementing network devices but was using register_pernet_device
instead of register_pernet_subsys.
This was allowing there to be cases when phonenet was not initialized and
the phonet net_generic was not set for a network namespace when network
device events were being reported on the netdevice_notifier for a network
namespace leading to the oops above.
2) phonet_exit_net was implementing a confusing and special case of handling all
network devices from going away that it was hard to see was correct, and would
only occur when the phonet module was removed.
Now that unregister_netdevice_notifier has been modified to synthesize unregistration
events for the network devices that are extant when called this confusing special
case in phonet_exit_net is no longer needed.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: In unregister_netdevice_notifier unregister the netdevices.
We already synthesize events in register_netdevice_notifier and synthesizing
events in unregister_netdevice_notifier allows to us remove the need for
special case cleanup code.
This change should be safe as it adds no new cases for existing callers
of unregiser_netdevice_notifier to handle.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Added strict checking of PadN, as PadN can be used to increase header
size and thus push the protocol header into the 2nd fragment.
PadN is used to align the options within the Hop-by-Hop or
Destination Options header to 64-bit boundaries. The maximum valid
size is thus 7 bytes.
RFC 4942 recommends to actively check the "payload" itself and
ensure that it contains only zeroes.
See also RFC 4942 section 2.1.9.5.
Signed-off-by: Eldad Zack <eldad@fogrefinery.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* emailed from Andrew Morton <akpm@linux-foundation.org>: (14 patches)
panic: fix stack dump print on direct call to panic()
drivers/rtc/rtc-pl031.c: enable clock on all ST variants
Revert "mm: vmscan: fix misused nr_reclaimed in shrink_mem_cgroup_zone()"
hugetlb: fix race condition in hugetlb_fault()
drivers/rtc/rtc-twl.c: use static register while reading time
drivers/rtc/rtc-s3c.c: add placeholder for driver private data
drivers/rtc/rtc-s3c.c: fix compilation error
MAINTAINERS: add PCDP console maintainer
memcg: do not open code accesses to res_counter members
drivers/rtc/rtc-efi.c: fix section mismatch warning
drivers/rtc/rtc-r9701.c: reset registers if invalid values are detected
drivers/char/random.c: fix boot id uniqueness race
memcg: fix broken boolen expression
memcg: fix up documentation on global LRU
1) Fix bluetooth userland regression reported by Keith Packard, from
Gustavo Padovan.
2) Revert ath9k PS idle change, from Sujith Manoharan.
3) Correct default TCP memory limits (again), from Eric Dumazet.
4) Fix tcp_rcv_rtt_update() accidental use of unscaled RTT, from Neal
Cardwell.
5) We made a facility for layers like wireless to say how much tailroom
they need in the SKB for link layer stuff such as wireless
encryption etc., but TCP works hard to fill every SKB out to the end
defeating this specification.
This leads to every TCP packet getting reallocated by the wireless
code in order to have the right amount of tailroom available.
Fix TCP to only fill SKBs out to the real amount of data area it
asked for during the allocation, this way it won't eat into the
slack added for the device's tailroom needs.
Reported by Marc Merlin and fixed by Eric Dumazet.
6) Leaks, endian bugs, and new device IDs in bluetooth from Santosh
Nayak, João Paulo Rechi Vita, Cho, Yu-Chen, Andrei Emeltchenko,
AceLan Kao, and Andrei Emeltchenko.
7) OOPS on tty_close fix in bluetooth's hci_ldisc from Johan Hovold.
10) Consistently handle invalid TCP packets in ipv4 vs. ipv6 conntrack,
from Jozsef Kadlecsik.
11) Validate IP header length properly in netfilter conntrack's
ipv4_get_l4proto().
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (39 commits)
NFC: Fix the LLCP Tx fragmentation loop
rtlwifi: Add missing DMA buffer unmapping for PCI drivers
rtlwifi: Preallocate USB read buffers and eliminate kalloc in read routine
tcp: avoid order-1 allocations on wifi and tx path
net: allow pskb_expand_head() to get maximum tailroom
bridge: Do not send queries on multicast group leaves
MAINTAINERS: Mark NATSEMI driver as orphan'd.
tcp: fix tcp_rcv_rtt_update() use of an unscaled RTT sample
tcp: restore correct limit
Revert "ath9k: fix going to full-sleep on PS idle"
rt2x00: Fix rfkill_polling register function.
bcma: fix build error on MIPS; implicit pcibios_enable_device
netfilter: nf_conntrack: fix incorrect logic in nf_conntrack_init_net
netfilter: nf_ct_ipv4: packets with wrong ihl are invalid
netfilter: nf_ct_ipv4: handle invalid IPv4 and IPv6 packets consistently
net/wireless/wext-core.c: add missing kfree
rtlwifi: Fix oops on rate-control failure
mac80211: Convert WARN_ON to WARN_ON_ONCE
rtlwifi: rtl8192de: Fix firmware initialization
nl80211: ensure interface is up in various APIs
...
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"Mostly exynos and intel.
Intel has 3 regression fixers (more info in intel merge commit), along
with some other make hw work fixes, exynos has some cleanups and an
ioctl fix.
A couple of radeon fixes, couple of build fixes, and a savage
userspace interface possible overflow fix."
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (23 commits)
drm/exynos: fixed exynos broken ioctl
drm/i915: clear fencing tracking state when retiring requests
drm/exynos: fix to pointer manager member of struct exynos_drm_subdrv
drm/exynos: fix struct for operation callback functions to driver name
drm/exynos: use define instead of default_win member in struct mixer_context
drm/exynos: rename s/HDMI_OVERLAY_NUMBER/MIXER_WIN_NR
drm/exynos: remove unused codes in hdmi and mixer
drm/exynos: remove unnecessary type conversion of hdmi and mixer
drm/i915: make rc6 module parameter read-only
drm/i915: implement ColorBlt w/a
drm/i915/ringbuffer: Exclude last 2 cachlines of ring on 845g
Revert "drm/i915: reenable gmbus on gen3+ again"
drm/radeon: only add the mm i2c bus if the hw_i2c module param is set
vgaarb.h: fix build warnings
drm/i915: properly compute dp dithering for user-created modes
drm/radeon/kms: fix DVO setup on some r4xx chips
drm/savage: fix integer overflows in savage_bci_cmdbuf()
drm/radeon: replace udelay with mdelay for long timeouts
drm/i915: Finish any pending operations on the framebuffer before disabling
drm/i915: Removed IVB forced enable of sprite dest key.
...
Merge tag 'md-3.4-fixes' of git://neil.brown.name/md
Pull a few more fixes for md from NeilBrown:
"Two are tagged for -stable. They can cause an oops, but very rarely."
* tag 'md-3.4-fixes' of git://neil.brown.name/md:
md/bitmap: prevent bitmap_daemon_work running while initialising bitmap
md/raid1,raid10: Fix calculation of 'vcnt' when processing error recovery.
MD: Bitmap version cleanup.
Jason Wessel [Thu, 12 Apr 2012 19:49:17 +0000 (12:49 -0700)]
panic: fix stack dump print on direct call to panic()
Commit 6e6f0a1f0fa6 ("panic: don't print redundant backtraces on oops")
causes a regression where no stack trace will be printed at all for the
case where kernel code calls panic() directly while not processing an
oops, and of course there are 100's of instances of this type of call.
The original commit executed the check (!oops_in_progress), but this will
always be false because just before the dump_stack() there is a call to
bust_spinlocks(1), which does the following:
void __attribute__((weak)) bust_spinlocks(int yes)
{
if (yes) {
++oops_in_progress;
The proper way to resolve the problem that original commit tried to
solve is to avoid printing a stack dump from panic() when the either of
the following conditions is true:
1) TAINT_DIE has been set (this is done by oops_end())
This indicates and oops has already been printed.
2) oops_in_progress > 1
This guards against the rare case where panic() is invoked
a second time, or in between oops_begin() and oops_end()
Signed-off-by: Jason Wessel <jason.wessel@windriver.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: <stable@vger.kernel.org> [3.3+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
drivers/rtc/rtc-pl031.c: enable clock on all ST variants
The ST variants of the PL031 all require bit 26 in the control register
to be set before they work properly. Discovered this when testing on
the Nomadik board where it would suprisingly just stand still.
Before the commit, the code makes senses to me but not after the commit.
The "nr_reclaimed" is the number of pages reclaimed by scanning through
the memcg's lru lists. The "nr_to_reclaim" is the target value for the
whole function. For example, we like to early break the reclaim if
reclaimed 32 pages under direct reclaim (not DEF_PRIORITY).
After the reverted commit, the target "nr_to_reclaim" is decremented each
time by "nr_reclaimed" but we still use it to compare the "nr_reclaimed".
It just doesn't make sense to me...
Signed-off-by: Ying Han <yinghan@google.com> Acked-by: Hugh Dickins <hughd@google.com> Cc: Rik van Riel <riel@redhat.com> Cc: Hillf Danton <dhillf@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Chris Metcalf [Thu, 12 Apr 2012 19:49:15 +0000 (12:49 -0700)]
hugetlb: fix race condition in hugetlb_fault()
The race is as follows:
Suppose a multi-threaded task forks a new process (on cpu A), thus
bumping up the ref count on all the pages. While the fork is occurring
(and thus we have marked all the PTEs as read-only), another thread in
the original process (on cpu B) tries to write to a huge page, taking an
access violation from the write-protect and calling hugetlb_cow(). Now,
suppose the fork() fails. It will undo the COW and decrement the ref
count on the pages, so the ref count on the huge page drops back to 1.
Meanwhile hugetlb_cow() also decrements the ref count by one on the
original page, since the original address space doesn't need it any
more, having copied a new page to replace the original page. This
leaves the ref count at zero, and when we call unlock_page(), we panic.
fork on CPU A fault on CPU B
============= ==============
...
down_write(&parent->mmap_sem);
down_write_nested(&child->mmap_sem);
...
while duplicating vmas
if error
break;
...
up_write(&child->mmap_sem);
up_write(&parent->mmap_sem); ...
down_read(&parent->mmap_sem);
...
lock_page(page);
handle COW
page_mapcount(old_page) == 2
alloc and prepare new_page
...
handle error
page_remove_rmap(page);
put_page(page);
...
fold new_page into pte
page_remove_rmap(page);
put_page(page);
...
oops ==> unlock_page(page);
up_read(&parent->mmap_sem);
The solution is to take an extra reference to the page while we are
holding the lock on it.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com> Cc: Hillf Danton <dhillf@gmail.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Hugh Dickins <hughd@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
drivers/rtc/rtc-twl.c: use static register while reading time
RTC stores time and date in several registers. Due to the fact that
these registers can't be read instantaneously, there is a chance that
reading from counting registers gives an error of one minute, one hour,
one day, etc.
To address this issue, the RTC has hardware support to copy the RTC
counting registers to static shadowed registers. The current
implementation does not use this feature, and in a stress test, we can
reproduce this error at a rate of around two times per 300000 readings.
Fix the implementation to ensure that the right snapshot of time is
captured.
Tushar Behera [Thu, 12 Apr 2012 19:49:14 +0000 (12:49 -0700)]
drivers/rtc/rtc-s3c.c: add placeholder for driver private data
Driver data field is a pointer, hence assigning that to an integer results
in compilation warnings.
Fixes following compilation warnings:
drivers/rtc/rtc-s3c.c: In function `s3c_rtc_get_driver_data':
drivers/rtc/rtc-s3c.c:452:3: warning: return makes integer from pointer without a cast [enabled by default]
drivers/rtc/rtc-s3c.c: At top level:
drivers/rtc/rtc-s3c.c:674:3: warning: initialization makes pointer from integer without a cast [enabled by default]
drivers/rtc/rtc-s3c.c:674:3: warning: (near initialization for `s3c_rtc_dt_match[1].data') [enabled by default]
drivers/rtc/rtc-s3c.c:677:3: warning: initialization makes pointer from integer without a cast [enabled by default]
drivers/rtc/rtc-s3c.c:677:3: warning: (near initialization for `s3c_rtc_dt_match[2].data') [enabled by default]
drivers/rtc/rtc-s3c.c:680:3: warning: initialization makes pointer from integer without a cast [enabled by default]
drivers/rtc/rtc-s3c.c:680:3: warning: (near initialization for `s3c_rtc_dt_match[3].data') [enabled by default]
Signed-off-by: Tushar Behera <tushar.behera@linaro.org> Cc: Heiko Stuebner <heiko@sntech.de> Cc: Alessandro Zummo <a.zummo@towertech.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Tushar Behera [Thu, 12 Apr 2012 19:49:14 +0000 (12:49 -0700)]
drivers/rtc/rtc-s3c.c: fix compilation error
Fix this error:
drivers/rtc/rtc-s3c.c: At top level:
drivers/rtc/rtc-s3c.c:671:3: error: request for member `data' in something not a structure or union
drivers/rtc/rtc-s3c.c:674:3: error: request for member `data' in something not a structure or union
drivers/rtc/rtc-s3c.c:677:3: error: request for member `data' in something not a structure or union
drivers/rtc/rtc-s3c.c:680:3: error: request for member `data' in something not a structure or union
Signed-off-by: Tushar Behera <tushar.behera@linaro.org> Cc: Heiko Stuebner <heiko@sntech.de> Cc: Alessandro Zummo <a.zummo@towertech.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>