Willem de Bruijn [Tue, 19 Mar 2013 10:18:11 +0000 (10:18 +0000)]
packet: packet fanout rollover during socket overload
Changes:
v3->v2: rebase (no other changes)
passes selftest
v2->v1: read f->num_members only once
fix bug: test rollover mode + flag
Minimize packet drop in a fanout group. If one socket is full,
roll over packets to another from the group. Maintain flow
affinity during normal load using an rxhash fanout policy, while
dispersing unexpected traffic storms that hit a single cpu, such
as spoofed-source DoS flows. Rollover breaks affinity for flows
arriving at saturated sockets during those conditions.
The patch adds a fanout policy ROLLOVER that rotates between sockets,
filling each socket before moving to the next. It also adds a fanout
flag ROLLOVER. If passed along with any other fanout policy, the
primary policy is applied until the chosen socket is full. Then,
rollover selects another socket, to delay packet drop until the
entire system is saturated.
Probing sockets is not free. Selecting the last used socket, as
rollover does, is a greedy approach that maximizes chance of
success, at the cost of extreme load imbalance. In practice, with
sufficiently long queues to absorb bursts, sockets are drained in
parallel and load balance looks uniform in `top`.
To avoid contention, scales counters with number of sockets and
accesses them lockfree. Values are bounds checked to ensure
correctness.
Tested using an application with 9 threads pinned to CPUs, one socket
per thread and sufficient busywork per packet operation to limits each
thread to handling 32 Kpps. When sent 500 Kpps single UDP stream
packets, a FANOUT_CPU setup processes 32 Kpps in total without this
patch, 270 Kpps with the patch. Tested with read() and with a packet
ring (V1).
Also, passes psock_fanout.c unit test added to selftests.
Signed-off-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Oliver Hartkopp [Mon, 18 Mar 2013 07:52:06 +0000 (07:52 +0000)]
can: dump stack on protocol bugs
The rework of the kernel hlist implementation "hlist: drop the node parameter
from iterators" (b67bfe0d42cac56c512dd5da4b1b347a23f4b70a) created some
fallout in the form of non matching comments and obsolete code.
Additionally to the cleanup this patch adds a WARN() statement to catch the
caller of the wrong filter removal request.
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Dmitry Kravkov [Mon, 18 Mar 2013 06:51:04 +0000 (06:51 +0000)]
bnx2x: add RSS capability for GRE traffic
The patch drives FW to perform RSS for GRE traffic,
based on inner headers.
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Dmitry Kravkov [Mon, 18 Mar 2013 06:51:03 +0000 (06:51 +0000)]
bnx2x: add CSUM and TSO support for encapsulation protocols
The patch utilizes FW offload capabilities for
encapsulation protocols.
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Christoph Paasch [Sun, 17 Mar 2013 08:23:34 +0000 (08:23 +0000)]
tcp: Remove TCPCT
TCPCT uses option-number 253, reserved for experimental use and should
not be used in production environments.
Further, TCPCT does not fully implement RFC 6013.
As a nice side-effect, removing TCPCT increases TCP's performance for
very short flows:
Doing an apache-benchmark with -c 100 -n 100000, sending HTTP-requests
for files of 1KB size.
before this patch:
average (among 7 runs) of 20845.5 Requests/Second
after:
average (among 7 runs) of 21403.6 Requests/Second
Signed-off-by: Christoph Paasch <christoph.paasch@uclouvain.be> Signed-off-by: David S. Miller <davem@davemloft.net>
Cong Wang [Sat, 16 Mar 2013 04:47:55 +0000 (04:47 +0000)]
net: fix some typos in netif features
Cc: Pravin B Shelar <pshelar@nicira.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sun, 17 Mar 2013 16:58:47 +0000 (12:58 -0400)]
Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch
Conflicts:
net/openvswitch/vport-internal_dev.c
Jesse Gross says:
====================
A couple of minor enhancements for net-next/3.10. The largest is an
extension to allow variable length metadata to be passed to userspace
with packets.
There is a merge conflict in net/openvswitch/vport-internal_dev.c:
A existing commit modifies internal_dev_mac_addr() and a new commit
deletes it. The new one is correct, so you can just remove that function.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Fri, 15 Mar 2013 07:23:58 +0000 (07:23 +0000)]
drivers:net: dma_alloc_coherent: use __GFP_ZERO instead of memset(, 0)
Reduce the number of calls required to alloc
a zeroed block of memory.
Trivially reduces overall object size.
Other changes around these removals
o Neaten call argument alignment
o Remove an unnecessary OOM message after dma_alloc_coherent failure
o Remove unnecessary gfp_t stack variable
Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David Stevens [Fri, 15 Mar 2013 04:35:51 +0000 (04:35 +0000)]
vxlan: generalize forwarding tables
This patch generalizes VXLAN forwarding table entries allowing an administrator
to:
1) specify multiple destinations for a given MAC
2) specify alternate vni's in the VXLAN header
3) specify alternate destination UDP ports
4) use multicast MAC addresses as fdb lookup keys
5) specify multicast destinations
6) specify the outgoing interface for forwarded packets
The combination allows configuration of more complex topologies using VXLAN
encapsulation.
Changes since v1: rebase to 3.9.0-rc2
Signed-Off-By: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Erwan Yvin [Mon, 11 Mar 2013 03:13:03 +0000 (03:13 +0000)]
caif: remove caif_shm
caif_shm is an old implementation
caif_shm will be replaced by caif_virtio
[ As explained by Linus Walleij: "U5500 used this, but was cancelled
and the silicon did not reach anyone outside ST-Ericsson. Then for
the next platforms, we have gone for the leaner & cleaner approach
of using virtio, rpmesg and rproc." ]
Signed-off-by: Erwan Yvin <erwan.yvin@stericsson.com> Acked-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Sjur Brendeland <sjur.brandeland@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Somnath Kotur [Thu, 14 Mar 2013 02:41:51 +0000 (02:41 +0000)]
be2net: enable interrupts in be_probe() (RoCE and other ULPs need them)
As the NIC PCI function may be used by other protocols, the chip interrupts
must be enabled in be_probe() itself rather than be_open().
Signed-off-by: Sathya Perla <sathya.perla@emulex.com> Signed-off-by: Somnath Kotur <somnath.kotur@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Reilly Grant [Thu, 14 Mar 2013 11:55:41 +0000 (11:55 +0000)]
VSOCK: Support VM sockets connected to the hypervisor.
The resource ID used for VM socket control packets (0) is already
used for the VMCI_GET_CONTEXT_ID hypercall so a new ID (15) must be
used when the guest sends these datagrams to the hypervisor.
The hypervisor context ID must also be removed from the internal
blacklist.
Signed-off-by: Reilly Grant <grantr@vmware.com> Acked-by: Andy King <acking@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
netxen: write IP address to firmware when using bonding
This patch allows LRO aggregation on bonded devices that contain an
NX3031 device. It also adds a for_each_netdev_in_bond_rcu(bond, slave)
macro which executes for each slave that has bond as master.
V3: After testing and discussing this with Rajesh, I decided to keep the
vlan ip cache and just rename it to ip_cache since it will store bond
ip addresses too. A new master flag has been added to the ip cache to
denote that the address has been added because of a master device.
I've taken care of the enslave/release cases by checking for various
combinations of events and flags (e.g. netxen has a master, it's a
bond master and it's not marked as a slave means it is being enslaved
and is dev_open()ed in bond_enslave).
I've changed netxen_free_ip_list() to have a "master" parameter which
causes all IP addresses marked as master to be deleted (used when a
netxen is being released). I've made the patch use the new upper
device API as well. The following cases were tested:
- bond -> netxen
- vlan -> netxen
- vlan -> bond -> netxen
V2: Remove local ip caching, retrieve addresses dynamically and
restore them if necessary.
Note: Tested with NX3031 adapter.
Tested-by: Rajesh Borundia <rajesh.borundia@qlogic.com> Signed-off-by: Andy Gospodarek <agospoda@redhat.com> Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 14 Mar 2013 15:47:15 +0000 (11:47 -0400)]
Merge tag 'batman-adv-for-davem' of git://git.open-mesh.org/linux-merge
Included changes:
- introduction of the new Network Coding component. This new mechanism aims to
increase throughput by fusing multiple packets in one transmission.
- minor cleanups
Signed-off-by: David S. Miller <davem@davemloft.net>
Arvind Bhushan [Thu, 14 Mar 2013 05:09:08 +0000 (05:09 +0000)]
csiostor: Cleanup chip specific operations.
This patch removes chip specific operations from the common hardware
paths, as well as the Makefile change to accomodate the new files.
Signed-off-by: Arvind Bhushan <arvindb@chelsio.com> Signed-off-by: Naresh Kumar Inna <naresh@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Arvind Bhushan [Thu, 14 Mar 2013 05:09:07 +0000 (05:09 +0000)]
csiostor: Header file modifications for chip support and bug fixes.
This patch defines the common operations to support multiple chips. It
includes common header file modifications to support the current chips
(T4 and T5). It also includes the following bug fixes:
- reconfirms the rnode state after an implicit logo.
- corrects the stats array size.
- sets up and checks flags correctly when coming up as master and finding
the card initialized
Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Arvind Bhushan <arvindb@chelsio.com> Signed-off-by: Naresh Kumar Inna <naresh@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Arvind Bhushan [Thu, 14 Mar 2013 05:09:06 +0000 (05:09 +0000)]
csiostor: Add T5 adapter operations.
This patch creates a new file for T5 adapter operations.
Signed-off-by: Arvind Bhushan <arvindb@chelsio.com> Signed-off-by: Naresh Kumar Inna <naresh@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Arvind Bhushan [Thu, 14 Mar 2013 05:09:05 +0000 (05:09 +0000)]
csiostor: Segregate T4 adapter operations.
This patch separates T4 adapter operations into a new file.
Signed-off-by: Arvind Bhushan <arvindb@chelsio.com> Signed-off-by: Naresh Kumar Inna <naresh@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Vipul Pandya [Thu, 14 Mar 2013 05:09:04 +0000 (05:09 +0000)]
RDMA/cxgb4: Fix onchip queue support for T5
T5 adapter does not support onchip queue memory. Present logic fails to
allocate QP for T5 and returns an error. Also, if module parameter ocqp_support
is zero then we are unable to allocate QP which should not be the case. Ideally
if ocqp_support parameter is 0 or onchip queue support is disable then host QP
should be allocated before returning an error.
Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Vipul Pandya [Thu, 14 Mar 2013 05:09:01 +0000 (05:09 +0000)]
RDMA/cxgb4: Use DSGLs for fastreg and adapter memory writes for T5.
It enables direct DMA by HW to memory region PBL arrays and fast register PBL
arrays from host memory, vs the T4 way of passing these arrays in the WR itself.
The result is lower latency for memory registration, and larger PBL array
support for fast register operations.
This patch also updates ULP_TX_MEM_WRITE command fields for T5. Ordering bit of
ULP_TX_MEM_WRITE is at bit position 22 in T5 and at 23 in T4.
Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Santosh Rastapur [Thu, 14 Mar 2013 05:08:57 +0000 (05:08 +0000)]
cxgb4vf: Add support for Chelsio T5 adapter
Signed-off-by: Santosh Rastapur <santosh@chelsio.com> Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Santosh Rastapur [Thu, 14 Mar 2013 05:08:56 +0000 (05:08 +0000)]
cxgb4: Disable SR-IOV support for PF4-7 for T5
All T5 adapters will only support VFs on PF0-3 despite the ability of the
hardware to support them on PF4-7. This keeps our T4 and T5 adapters more
similar which simplifies host driver software.
Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Santosh Rastapur [Thu, 14 Mar 2013 05:08:55 +0000 (05:08 +0000)]
cxgb4: Update driver version and description
Signed-off-by: Santosh Rastapur <santosh@chelsio.com> Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Santosh Rastapur [Thu, 14 Mar 2013 05:08:54 +0000 (05:08 +0000)]
cxgb4: Add T5 PCI ids
Signed-off-by: Santosh Rastapur <santosh@chelsio.com> Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Santosh Rastapur [Thu, 14 Mar 2013 05:08:53 +0000 (05:08 +0000)]
cxgb4: Add T5 debugfs support
Signed-off-by: Santosh Rastapur <santosh@chelsio.com> Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Santosh Rastapur [Thu, 14 Mar 2013 05:08:52 +0000 (05:08 +0000)]
cxgb4: Enable doorbell drop recovery only for T4 adapter
Signed-off-by: Santosh Rastapur <santosh@chelsio.com> Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Santosh Rastapur [Thu, 14 Mar 2013 05:08:51 +0000 (05:08 +0000)]
cxgb4: Add T5 write combining support
This patch implements a low latency Write Combining (aka Write Coalescing) work
request path. PCIE maps User Space Doorbell BAR2 region writes to the new
interface to SGE. SGE pulls a new message from PCIE new interface and if its a
coalesced write work request then pushes it for processing. This patch copies
coalesced work request to memory mapped BAR2 space.
Signed-off-by: Santosh Rastapur <santosh@chelsio.com> Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Santosh Rastapur [Thu, 14 Mar 2013 05:08:50 +0000 (05:08 +0000)]
cxgb4: Dump T5 registers
Signed-off-by: Santosh Rastapur <santosh@chelsio.com> Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Santosh Rastapur [Thu, 14 Mar 2013 05:08:49 +0000 (05:08 +0000)]
cxgb4: Initialize T5
Signed-off-by: Santosh Rastapur <santosh@chelsio.com> Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Santosh Rastapur [Thu, 14 Mar 2013 05:08:48 +0000 (05:08 +0000)]
cxgb4: Add macros, structures and inline functions for T5
Signed-off-by: Santosh Rastapur <santosh@chelsio.com> Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Santosh Rastapur [Thu, 14 Mar 2013 05:08:47 +0000 (05:08 +0000)]
cxgb4: Add register definations for T5
Signed-off-by: Santosh Rastapur <santosh@chelsio.com> Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Martin Hundebøll [Fri, 25 Jan 2013 10:12:43 +0000 (11:12 +0100)]
batman-adv: network coding - receive coded packets and decode them
When receiving a network coded packet, the decoding buffer is searched
for a packet to use for decoding. The source, destination, and crc32 from
the coded packet is used to identify the wanted packet. The decoded
packet is passed to the usual unicast receiver function, as had it never
been network coded.
Signed-off-by: Martin Hundebøll <martin@hundeboll.net> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Martin Hundebøll [Fri, 25 Jan 2013 10:12:42 +0000 (11:12 +0100)]
batman-adv: network coding - save overheard and tx packets for decoding
To be able to decode a network coded packet, a node must already know
one of the two coded packets. This is done by buffering skbs before
transmission and buffering packets sniffed with promiscuous mode from
other hosts.
Packets are kept in a buffer similar to the one with forward-skbs: A
hash table, where each entry, which corresponds to a src-dst pair, has a
linked list packets.
Signed-off-by: Martin Hundebøll <martin@hundeboll.net> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Martin Hundebøll [Fri, 25 Jan 2013 10:12:41 +0000 (11:12 +0100)]
batman-adv: network coding - code and transmit packets if possible
Before adding forward-skbs to the coding buffer, the buffer is searched
for a potential coding opportunity. If one is found, the two packets are
network coded and transmitted right away. If not, the forward-skb is
added to the buffer.
Network coded packets are transmitted with information about the two
receivers and the two coded packets. The first receiver is given by the
MAC header, while the second is given in the payload/bat-header. The
second receiver uses promiscuous mode to receive the packet and check
the second destination.
Signed-off-by: Martin Hundebøll <martin@hundeboll.net> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Martin Hundebøll [Fri, 25 Jan 2013 10:12:40 +0000 (11:12 +0100)]
batman-adv: network coding - buffer unicast packets before forward
Two be able to network code two packets, one packet must be buffered
until the next is available. This is done in a "coding buffer", which is
essentially a hash table with lists of packets. Each entry in the hash
table corresponds to a specific src-dst pair, which has a linked list of
packets that are buffered.
This patch adds skbs to the buffer just before forwarding them. The
buffer is traversed every 10 ms, where timed skbs are removed from the
buffer and transmitted. To allow experiments with the network coding
scheme, the timeout is tunable through a file in debugfs.
Signed-off-by: Martin Hundebøll <martin@hundeboll.net> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Martin Hundebøll [Fri, 25 Jan 2013 10:12:39 +0000 (11:12 +0100)]
batman-adv: network coding - detect coding nodes and remove these after timeout
To use network coding efficiently, a relay must know when neighbor nodes
are likely to have enough information to be able to decode a network
coded packet. This is detected by using OGMs from batman-adv to discover
when one neighbor is in range of another neighbor. The relay check the
TLL to detect when an OGM is forwarded from one neighbor by another
neighbor, and thereby knows that the two neighbors are in range and thus
overhear packets sent by each other.
This information is saved in the orig_node struct to be used when
searching for coding opportunities. Two lists are added to the
orig_node struct: One for neighbors that can hear the orig_node
(outgoing nc_nodes) and one for neighbors that the orig_node can hear
(incoming nc_nodes).
Information about nc_nodes is kept for 10 seconds and is available
through debugfs in batman_adv/nc_nodes to use when debugging network
coding.
Signed-off-by: Martin Hundebøll <martin@hundeboll.net> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Martin Hundebøll [Fri, 25 Jan 2013 10:12:38 +0000 (11:12 +0100)]
batman-adv: network coding - add the initial infrastructure code
Network coding exploits the 802.11 shared medium to allow multiple
packets to be sent in a single transmission. In brief, a relay can XOR
two packets, and send the coded packet to two destinations. The
receivers can decode one of the original packets by XOR'ing the coded
packet with the other original packet. This will lead to increased
throughput in topologies where two packets cross one relay.
In a simple topology with three nodes, it takes four transmissions
without network coding to get one packet from Node A to Node B and one
from Node B to Node A:
1. Node A ---- p1 ---> Node R Node B
2. Node A Node R <--- p2 ---- Node B
3. Node A <--- p2 ---- Node R Node B
4. Node A Node R ---- p1 ---> Node B
With network coding, the relay only needs one transmission, which saves
us one slot of valuable airtime:
1. Node A ---- p1 ---> Node R Node B
2. Node A Node R <--- p2 ---- Node B
3. Node A <- p1 x p2 - Node R - p1 x p2 -> Node B
The same principle holds for a topology including five nodes. Here the
packets from Node A and Node B are overheard by Node C and Node D,
respectively. This allows Node R to send a network coded packet to save
one transmission:
Node A Node B
| \ / |
| p1 p2 |
| \ / |
p1 > Node R < p2
| |
| / \ |
| p1 x p2 p1 x p2 |
v / \ v
/ \
Node C < > Node D
More information is available on the open-mesh.org wiki[1].
This patch adds the initial code to support network coding in
batman-adv. It sets up a worker thread to do house keeping and adds a
sysfs file to enable/disable network coding. The feature is disabled by
default, as it requires a wifi-driver with working promiscuous mode, and
also because it adds a small delay at each hop.
Signed-off-by: Martin Hundebøll <martin@hundeboll.net> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <ordex@autistici.org>
In C standard any expression different from 0 will be converted to
'true' when casting to bool (whatever is the length of the value).
Therefore all the "!!" conversions can be removed.
Signed-off-by: Antonio Quartulli <ordex@autistici.org> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Martin Hundebøll [Sun, 13 Jan 2013 23:20:32 +0000 (00:20 +0100)]
batman-adv: Return reason for failure in batadv_check_unicast_packet()
batadv_check_unicast_packet() is changed to return a value based on the
reason to drop the packet, which will be useful information for
future users of batadv_check_unicast_packet().
Signed-off-by: Martin Hundebøll <martin@hundeboll.net> Acked-by: Antonio Quartulli <ordex@autistici.org> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <ordex@autistici.org>
The batadv_priv struct carries a pointer to its own interface
struct. Therefore, it is not necessary to retrieve the soft_iface
via the primary interface.
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Wei Yongjun [Wed, 13 Mar 2013 03:03:58 +0000 (03:03 +0000)]
tuntap: remove unused variable in __tun_detach()
The variable dev is initialized but never used
otherwise, so remove the unused variable.
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Himanshu Madhani [Tue, 12 Mar 2013 09:02:16 +0000 (09:02 +0000)]
qlcnic: Implement flash sysfs callback for 83xx adapter
QLogic applications use these callbacks to perform
o NIC Partitioning (NPAR) configuration and management
o Diagnostic tests
o Flash access and updates
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com> Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 13 Mar 2013 08:38:27 +0000 (04:38 -0400)]
Merge branch 'cpsw'
Mugunthan V N says:
====================
This patch serires implements the following features in CPSW driver
* get/set phy link settings
* interrupt pacing
* get phy id via ioctl cmd SIOCGMIIPHY
Changes from initial version
* Made active-slave common for cpts, ethtool and SIOCGMIIPHY ioctl
* Cleaned CPSW DT binding documentation by seperating slave nodes
under sub-section
* implemented get phy id via ioctl cmd SIOCGMIIPHY
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Mugunthan V N [Mon, 11 Mar 2013 23:16:37 +0000 (23:16 +0000)]
driver: net: ethernet: cpsw: implement interrupt pacing via ethtool
This patch implements support for interrupt pacing block of CPSW via ethtool
Inetrrupt pacing block is common of both the ethernet interface in
dual emac mode
Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Zhang Yanfei [Mon, 11 Mar 2013 19:15:49 +0000 (19:15 +0000)]
driver: isdn: hisax: remove cast for kmalloc/kzalloc return value
remove cast for kmalloc/kzalloc return value.
Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Karsten Keil <isdn@linux-pingi.de> Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
Zhang Yanfei [Mon, 11 Mar 2013 19:13:47 +0000 (19:13 +0000)]
driver: isdn: capi: remove cast for kmalloc return value
remove cast for kmalloc return value.
Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Karsten Keil <isdn@linux-pingi.de> Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
This has been tested on a qnap TS-119P II. Note that enabling WOL with
"ethtool -s eth0 wol g" is not enough; you also need to tell the PIC
microcontroller inside the qnap that WOL should be enabled by sending
0xF2 with qcontrol(1) and you have to disable EUP ("Energy-using
Products", a European power-saving thing) by sending 0xF4.
Signed-off-by: Michael Stapelberg <michael@stapelberg.de> Signed-off-by: David S. Miller <davem@davemloft.net>
This allows ethernet drivers (such as the mv643xx_eth) to support
Wake on LAN on platforms where PHY registers have to be configured
for Wake on LAN (e.g. the Marvell Kirkwood based qnap TS-119P II).
Signed-off-by: Michael Stapelberg <michael@stapelberg.de> Signed-off-by: David S. Miller <davem@davemloft.net>
This is the second of the TLP patch series; it augments the basic TLP
algorithm with a loss detection scheme.
This patch implements a mechanism for loss detection when a Tail
loss probe retransmission plugs a hole thereby masking packet loss
from the sender. The loss detection algorithm relies on counting
TLP dupacks as outlined in Sec. 3 of:
http://tools.ietf.org/html/draft-dukkipati-tcpm-tcp-loss-probe-01
The basic idea is: Sender keeps track of TLP "episode" upon
retransmission of a TLP packet. An episode ends when the sender receives
an ACK above the SND.NXT (tracked by tlp_high_seq) at the time of the
episode. We want to make sure that before the episode ends the sender
receives a "TLP dupack", indicating that the TLP retransmission was
unnecessary, so there was no loss/hole that needed plugging. If the
sender gets no TLP dupack before the end of the episode, then it reduces
ssthresh and the congestion window, because the TLP packet arriving at
the receiver probably plugged a hole.
Signed-off-by: Nandita Dukkipati <nanditad@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
This patch series implement the Tail loss probe (TLP) algorithm described
in http://tools.ietf.org/html/draft-dukkipati-tcpm-tcp-loss-probe-01. The
first patch implements the basic algorithm.
TLP's goal is to reduce tail latency of short transactions. It achieves
this by converting retransmission timeouts (RTOs) occuring due
to tail losses (losses at end of transactions) into fast recovery.
TLP transmits one packet in two round-trips when a connection is in
Open state and isn't receiving any ACKs. The transmitted packet, aka
loss probe, can be either new or a retransmission. When there is tail
loss, the ACK from a loss probe triggers FACK/early-retransmit based
fast recovery, thus avoiding a costly RTO. In the absence of loss,
there is no change in the connection state.
PTO stands for probe timeout. It is a timer event indicating
that an ACK is overdue and triggers a loss probe packet. The PTO value
is set to max(2*SRTT, 10ms) and is adjusted to account for delayed
ACK timer when there is only one oustanding packet.
TLP Algorithm
On transmission of new data in Open state:
-> packets_out > 1: schedule PTO in max(2*SRTT, 10ms).
-> packets_out == 1: schedule PTO in max(2*RTT, 1.5*RTT + 200ms)
-> PTO = min(PTO, RTO)
Conditions for scheduling PTO:
-> Connection is in Open state.
-> Connection is either cwnd limited or no new data to send.
-> Number of probes per tail loss episode is limited to one.
-> Connection is SACK enabled.
When PTO fires:
new_segment_exists:
-> transmit new segment.
-> packets_out++. cwnd remains same.
no_new_packet:
-> retransmit the last segment.
Its ACK triggers FACK or early retransmit based recovery.
ACK path:
-> rearm RTO at start of ACK processing.
-> reschedule PTO if need be.
In addition, the patch includes a small variation to the Early Retransmit
(ER) algorithm, such that ER and TLP together can in principle recover any
N-degree of tail loss through fast recovery. TLP is controlled by the same
sysctl as ER, tcp_early_retrans sysctl.
tcp_early_retrans==0; disables TLP and ER.
==1; enables RFC5827 ER.
==2; delayed ER.
==3; TLP and delayed ER. [DEFAULT]
==4; TLP only.
The TLP patch series have been extensively tested on Google Web servers.
It is most effective for short Web trasactions, where it reduced RTOs by 15%
and improved HTTP response time (average by 6%, 99th percentile by 10%).
The transmitted probes account for <0.5% of the overall transmissions.
Signed-off-by: Nandita Dukkipati <nanditad@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Wei Yongjun [Mon, 11 Mar 2013 04:40:14 +0000 (04:40 +0000)]
bnx2x: use list_move instead of list_del/list_add
Using list_move() instead of list_del() + list_add().
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Acked-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ariel Elior [Mon, 11 Mar 2013 05:17:46 +0000 (05:17 +0000)]
bnx2x: Control number of vfs dynamically
1. Support sysfs interface for getting the maximal number of virtual functions
of a given physical function.
2. Support sysfs interface for getting and setting the current number of
virtual functions.
Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ariel Elior [Mon, 11 Mar 2013 05:17:45 +0000 (05:17 +0000)]
bnx2x: Add iproute2 support for vfs
This patch adds support for iproute2 callbacks allowing querying a physical
function as to its child virtual functions, and setting the macs and vlans
of said virtual functions.
Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>