git.proxmox.com Git - mirror_ubuntu-bionic-kernel.git/log

Merge branch 'bcmgenet_xmit_more'

Florian Fainelli says:

====================
net: bcmgenet: xmit_more support

This patch series adds xmit_more support to the GENET driver by allowing
the deferal of the producer index write to the TDMA engine.

Changes in v2:

- move the netif_tx_stop_queue check *before* updating the producer index
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: bcmgenet: add support for xmit_more

Delay the update of the TDMA producer index unless this is the last SKB
in a batch, or the queue is already stopped. Move the check for whether
the queue should be stopped before the xmit_more check to avoid locking
the transmit queue in case there was a SKB submitted which has xmit_more
set.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: bcmgenet: update ring producer index and buffer count in xmit

There is no need to have both bcmgenet_xmit_single() and
bcmgenet_xmit_frag() perform a free_bds decrement and a prod_index
increment by one. In case one of these functions fails to map a SKB or
fragment for transmit, we will return and exit bcmgenet_xmit() with an
error.

We can therefore safely use our local copy of nr_frags to know by how
much we should decrement the number of free buffers available, and by
how much the producer count must be incremented and do this in the tail
of bcmgenet_xmit().

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Petri Gynther <pgynther@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: bcmgenet: rewrite bcmgenet_rx_refill()

Currently, bcmgenet_desc_rx() calls bcmgenet_rx_refill() at the end of
Rx packet processing loop, after the current Rx packet has already been
passed to napi_gro_receive(). However, bcmgenet_rx_refill() might fail
to allocate a new Rx skb, thus leaving a hole on the Rx queue where no
valid Rx buffer exists.

To eliminate this situation:
1. Rewrite bcmgenet_rx_refill() to retain the current Rx skb on the Rx
   queue if a new replacement Rx skb can't be allocated and DMA-mapped.
   In this case, the data on the current Rx skb is effectively dropped.
2. Modify bcmgenet_desc_rx() to call bcmgenet_rx_refill() at the top of
   Rx packet processing loop, so that the new replacement Rx skb is
   already in place before the current Rx skb is processed.

Signed-off-by: Petri Gynther <pgynther@google.com>
Tested-by: Jaedon Shin <jaedon.shin@gmail.com>--
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'tcp_metrics_netns_debloat'

Eric W. Biederman says:

====================
tcp_metrics: Network namespace bloat reduction v3

This is a small pile of patches that convert tcp_metrics from using a
hash table per network namespace to using a single hash table for all
network namespaces.

This is broken up into several patches so that each small step along
the way could be carefully scrutinized as I wrote it, and equally so
that each small step can be reviewed.

There are several cleanups included in this series.  The addition of
panic calls during boot where we can not handle failure, and not trying
simplifies the code.  The removal of the return code from
tcp_metrics_flush_all.

The motivation for this change is that the tcp_metrics hash table at
128KiB is one of the largest components of a freshly allocated network
namespace.

I am resending the the previous version I sent has suffered bitrot, so I
have respun the patches so that they apply.  I believe I have addressed
all of the review concerns except optimal behavior on little machines
with 32-byte cache lines, which is beyond me as even the current code
has bad behavior in that case.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

tcp_metrics: Use a single hash table for all network namespaces.

Now that all of the operations are safe on a single hash table
accross network namespaces, allocate a single global hash table
and update the code to use it.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp_metrics: Rewrite tcp_metrics_flush_all

Rewrite tcp_metrics_flush_all so that it can cope with entries from
different network namespaces on it's hash chain.

This is based on the logic in tcp_metrics_nl_cmd_del for deleting
a selection of entries from a tcp metrics hash chain.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp_metrics: Remove the unused return code from tcp_metrics_flush_all

tcp_metrics_flush_all always returns 0. Remove the unnecessary return code.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp_metrics: Add a field tcpm_net and verify it matches on lookup

In preparation for using one tcp metrics hash table for all network
namespaces add a field tcpm_net to struct tcp_metrics_block, and
verify that field on all hash table lookups.

Make the field tcpm_net of type possible_net_t so it takes no space
when network namespaces are disabled.

Further add a function tm_net to read that field so we can be
efficient when network namespaces are disabled and concise
the rest of the time.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp_metrics: Mix the network namespace into the hash function.

In preparation for using one hash table for all network namespaces
mix the network namespace into the hash value.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp_metrics: panic when tcp_metrics_init fails.

There is not a practical way to cleanup during boot so
just panic if there is a problem initializing tcp_metrics.

That will at least give us a clear place to start debugging
if something does go wrong.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

vxlan: Don't set s_addr in vxlan_create_sock

In the case of AF_INET s_addr was set to INADDR_ANY (0) which which both
symmetric with the AF_INET6 case, where s_addr is not set, and unnecessary
as udp_conf is zeroed out earlier in the same function.

I suspect this change does not have any run-time effect due to compiler
optimisations. But it does make the code a little easier on the/my eyes.

Cc: Tom Herbert <therbert@google.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mpls: In mpls_egress verify the packet length.

Reobert Shearman noticed that mpls_egress is failing to verify that
the bytes to be examined are in fact present in the packet before
mpls_egress reads those bytes.

As suggested by David Miller reduce this to a single pskb_may_pull
call so that we don't do unnecessary work in the fast path.

Reported-by: Robert Shearman <rshearma@brocade.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/macb: Only adjust tx_clk on link change

The PHY state machine (in drivers/net/phy/phy.c) will unconditionally
call phydev->adjust_link (macb_handle_link_change) when polling in the
PHY_CHANGELINK state. As currently written, macb always ends up
requesting a new tx_clk frequency in macb_handle_link_change. It is a
waste of time to request a new tx_clk frequency if the link state hasn't
changed, as the tx_clk will already be configured properly.

Let's only request a new tx_clk clock frequency when necessary.

Signed-off-by: Jaeden Amero <jaeden.amero@ni.com>
Cc: Josh Cartwright <joshc@ni.com>
Cc: Soren Brinkmann <soren.brinkmann@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

rhashtable: Fix read-side crash during rehash

This patch fixes a typo rhashtable_lookup_compare where we fail
to recompute the hash when looking up the new table. This causes
elements to be missed and potentially a crash during a resize.

Reported-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

rhashtable: kill ht->shift atomic operations

Commit c0c09bfdc415 ("rhashtable: avoid unnecessary wakeup for worker
queue") changed ht->shift to be atomic, which is actually unnecessary.

Instead of leaving the current shift in the core rhashtable structure,
it can be cached inside the individual bucket tables.

There, it will only be initialized once during a new table allocation
in the shrink/expansion slow path, and from then onward it stays immutable
for the rest of the bucket table liftime.

That allows shift to be non-atomic. The patch also moves hash_rnd
management into the table setup. The rhashtable structure now consumes
3 instead of 4 cachelines.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Ying Xue <ying.xue@windriver.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

rhashtable: Fix reader/rehash race

There is a potential race condition between readers and the rehasher.
In particular, the rehasher could have started a rehash while the
reader finishes a scan of the old table but fails to see the new
table pointer.

This patch closes this window by adding smp_wmb/smp_rmb.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'listener_refactor'

Eric Dumazet says:

====================
inet: tcp listener refactoring, part 8

These patches prepare request socks being hashed into general ehash
table : We declare 3 aliases (ireq_state, ireq_refcnt, ireq_family)

Note that refcnt is not yet handled, this will be done later.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

inet: introduce ireq_family

Before inserting request socks into general hash table,
fill their socket family.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

inet: get_openreq4() & get_openreq6() do not need listener

ireq->ir_num contains local port, use it.

Also, get_openreq4() dumping listen_sk->refcnt makes litle sense.

inet_diag_fill_req() can also use ireq->ir_num

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

inet: prepare sock_edemux() & sock_gen_put() for new SYN_RECV state

sock_edemux() & sock_gen_put() should be ready to cope with request socks.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: add req_prot_cleanup() & req_prot_init() helpers

Make proto_register() & proto_unregister() a bit nicer.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

inet: add rsk_refcnt/ireq_refcnt to request socks

When request socks will be in ehash, they'll need to be refcounted.

This patch adds rsk_refcnt/ireq_refcnt macros, and adds
reqsk_put() function, but nothing yet use them.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

inet: add ireq_state field to inet_request_sock

We need to identify request sock when they'll be visible in
global ehash table.

ireq_state is an alias to req.__req_common.skc_state.

Its value is set to TCP_NEW_SYN_RECV

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

inet: add TCP_NEW_SYN_RECV state

TCP_SYN_RECV state is currently used by fast open sockets.

Initial TCP requests (the pseudo sockets created when a SYN is received)
are not yet associated to a state. They are attached to their parent,
and the parent is in TCP_LISTEN state.

This commit adds TCP_NEW_SYN_RECV state, so that we can convert
TCP stack to a different schem gradually.

This state is not exported to user space.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv6: add missing ireq_net & ir_cookie initializations

I forgot to update dccp_v6_conn_request() & cookie_v6_check().
They both need to set ireq->ireq_net and ireq->ir_cookie

Lets clear ireq->ir_cookie in inet_reqsk_alloc()

Signed-off-by: Eric Dumazet <edumazet@google.com>
Fixes: 33cf7c90fe2f ("net: add real socket cookies")
Signed-off-by: David S. Miller <davem@davemloft.net>

cls_bpf: do eBPF invocation under non-bh RCU lock variant for maps

Currently, it is possible in cls_bpf to access eBPF maps only under
rcu_read_lock_bh() variants: while on ingress side, that is, handle_ing(),
the classifier would be called from __netif_receive_skb_core() under
rcu_read_lock(); on egress side, however, it's rcu_read_lock_bh() via
__dev_queue_xmit().

This rcu/rcu_bh mix doesn't work together with eBPF maps as they require
soley to be called under rcu_read_lock(). eBPF maps could also be shared
among various other eBPF programs (possibly even with other eBPF program
types, f.e. tracing) and user space processes, so any context is assumed.

Therefore, a possible fix for cls_bpf is to wrap/nest eBPF program
invocation under non-bh RCU lock variant.

Fixes: e2e9b6541dd4 ("cls_bpf: add initial eBPF support for programmable classifiers")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'fib_trie_table_merge_fixes'

Alexander Duyck says:

====================
fib_trie: Minor fixes for table merge

This patch set addresses two issues reported with the tables merged, the
first is a NULL pointer dereference, and the other is to remove a WARN_ON
and set the ordering for aliases from different tables with the same slen
values.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

fib_trie: Provide a deterministic order for fib_alias w/ tables merged

This change makes it so that we should always have a deterministic ordering
for the main and local aliases within the merged table when two leaves
overlap.

So for example if we have a leaf with a key of 192.168.254.0.  If we
previously added two aliases with a prefix length of 24 from both local and
main the first entry would be first and the second would be second.  When I
was coding this I had added a WARN_ON should such a situation occur as I
wasn't sure how likely it would be.  However this WARN_ON has been
triggered so this is something that should be addressed.

With this patch the ordering of the aliases is as follows.  First they are
sorted on prefix length, then on their table ID, then tos, and finally
priority.  This way what we end up doing is essentially interleaving the
two tables on what used to be leaf_info structure boundaries.

Fixes: 0ddcf43d5 ("ipv4: FIB Local/MAIN table collapse")
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fib_trie: Avoid NULL pointer if local table is not allocated

The function fib_unmerge assumed the local table had already been
allocated. If that is not the case however when custom rules are applied
then this can result in a NULL pointer dereference.

In order to prevent this we must check the value of the local table pointer
and if it is NULL simply return 0 as there is no local table to separate
from the main.

Fixes: 0ddcf43d5 ("ipv4: FIB Local/MAIN table collapse")
Reported-by: Madhu Challa <challa@noironetworks.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ebpf: verifier: check that call reg with ARG_ANYTHING is initialized

I noticed that a helper function with argument type ARG_ANYTHING does
not need to have an initialized value (register).

This can worst case lead to unintented stack memory leakage in future
helper functions if they are not carefully designed, or unintended
application behaviour in case the application developer was not careful
enough to match a correct helper function signature in the API.

The underlying issue is that ARG_ANYTHING should actually be split
into two different semantics:

  1) ARG_DONTCARE for function arguments that the helper function
     does not care about (in other words: the default for unused
     function arguments), and

  2) ARG_ANYTHING that is an argument actually being used by a
     helper function and *guaranteed* to be an initialized register.

The current risk is low: ARG_ANYTHING is only used for the 'flags'
argument (r4) in bpf_map_update_elem() that internally does strict
checking.

Fixes: 17a5267067f3 ("bpf: verifier (add verifier core)")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'possible_net_t'

Eric W. Biederman says:

====================
Introduce possible_net_t

The current usage of write_pnet and read_pnet is a little laborious and
error prone as you only notice if you failed to include them if are
compiling with network namespaces enabled.

possible_net_t remedies that by using a type that is 0 bytes when
network namespaces are disabled and can only be read and written to with
read_pnet and write_pnet.

Aka less work and safer for the same effect.

I kill hold_net and release_net first as are they are haven't been used
since 2008 and are noise at the points where write_pnet and read_pnet
are used.

I have folded in Eric Dumazets suggestions to improve the killing of
hold_net and release net. And respon. I had to respin anyway as
there was enough changes elsewhere in the tree the previous version
of these patches did not quite apply cleanly.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: Introduce possible_net_t

Having to say
> #ifdef CONFIG_NET_NS
> struct net *net;
> #endif

in structures is a little bit wordy and a little bit error prone.

Instead it is possible to say:
> typedef struct {
> #ifdef CONFIG_NET_NS
> struct net *net;
> #endif
> } possible_net_t;

And then in a header say:

> possible_net_t net;

Which is cleaner and easier to use and easier to test, as the
possible_net_t is always there no matter what the compile options.

Further this allows read_pnet and write_pnet to be functions in all
cases which is better at catching typos.

This change adds possible_net_t, updates the definitions of read_pnet
and write_pnet, updates optional struct net * variables that
write_pnet uses on to have the type possible_net_t, and finally fixes
up the b0rked users of read_pnet and write_pnet.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Kill hold_net release_net

hold_net and release_net were an idea that turned out to be useless.
The code has been disabled since 2008. Kill the code it is long past due.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'rhashtable-cleanups'

Herbert Xu says:

====================
rhashtable hash cleanups

This is a rebase on top of the nested lock annotation fix.

Nothing to see here, just a bunch of simple clean-ups before
I move onto something more substantial (hopefully).
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

rhashtable: Remove obj_raw_hashfn

Now that the only caller of obj_raw_hashfn is head_hashfn, we can
simply kill it and fold it into the latter.

This patch also moves the common shift from head_hashfn/key_hashfn
into rht_bucket_index.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

rhashtable: Remove key length argument to key_hashfn

key_hashfn has only one caller and it doesn't really need to supply
the key length as an extra parameter.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

rhashtable: Use head_hashfn instead of obj_raw_hashfn

Now that we don't have cross-table hashes, we no longer need to
keep the entire hash value so all users of obj_raw_hashfn can
use head_hashfn instead.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

rhashtable: Move masking back into key_hashfn

This patch reverts commit c88455ce50ae4224d84960ce2baa53e61580df27
("rhashtable: key_hashfn() must return full hash value") because
the only user of it always masks the hash value.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx5_core: don't export static symbol

The semantic patch that fixes this problem is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@r@
type T;
identifier f;
@@

static T f (...) { ... }

@@
identifier r.f;
declarer name EXPORT_SYMBOL;
@@

-EXPORT_SYMBOL(f);
// </smpl>

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>

rhashtable: Add annotation to nested lock

Commit aa34a6cb0478842452bac58edb50d3ef9e178c92 ("rhashtable:
Add arbitrary rehash function") killed the annotation on the
nested lock which leads to bitching from lockdep.

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: fix CONFIG_NET_NS=n compilation

I forgot to use write_pnet() in three locations.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Fixes: 33cf7c90fe2f9 ("net: add real socket cookies")
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv6: expose RFC4191 route preference via rtnetlink

This makes it possible to retain the route preference when RAs are handled in
userspace.

Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
Reviewed-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>

switchdev: correct spelling of notifier in comments

Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

vxlan: Correct path typo in comment

Flags are used in the return path rather than the return patch.

Fixes: af33c1adae1e ("vxlan: Eliminate dependency on UDP socket in transmit path")
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: add real socket cookies

A long standing problem in netlink socket dumps is the use
of kernel socket addresses as cookies.

1) It is a security concern.

2) Sockets can be reused quite quickly, so there is
   no guarantee a cookie is used once and identify
   a flow.

3) request sock, establish sock, and timewait socks
   for a given flow have different cookies.

Part of our effort to bring better TCP statistics requires
to switch to a different allocator.

In this patch, I chose to use a per network namespace 64bit generator,
and to use it only in the case a socket needs to be dumped to netlink.
(This might be refined later if needed)

Note that I tried to carry cookies from request sock, to establish sock,
then timewait sockets.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Eric Salo <salo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fib_trie: Only display main table in /proc/net/route

When we merged the tries for local and main I had overlooked the iterator
for /proc/net/route. As a result it was outputting both local and main
when the two tries were merged.

This patch resolves that by only providing output for aliases that are
actually in the main trie. As a result we should go back to the original
behavior which I assume will be necessary to maintain legacy support.

Fixes: 0ddcf43d5 ("ipv4: FIB Local/MAIN table collapse")
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'dsa_phy_divert'

Florian Fainelli says:

====================
net: dsa: support PHY reads/writes diversion

This patch series completes the PHY reads/writes diversion when we need to use
the slave MII bus provided by DSA and the underlying switch drivers to
implement the real PHY reads and writes. This is particularly useful when they
are conflicting MDIO bus addresses as in the case of multiple Broadcom switches
connected to each other (internal and external, or just external).
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: fully divert PHY reads/writes if requested

In case a PHY is found via Device Tree, and is also flagged by the
switch driver as needing indirect reads/writes using the switch driver
implemented MDIO bus, make sure that we bind this PHY to the slave MII
bus in order for this to happen.

Without this, we would succeed in having the PHY driver probe()'s
function to use slave MII bus read/write functions, because this is done
during dsa_slave_mii_init(), but past that point, the PHY driver would
not go through these diverted reads and writes.

Fixes: 0d8bcdd383b88 ("net: dsa: allow for more complex PHY setups")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: move PHY setup on DSA MII bus to its own function

In preparation for dealing with indirect reads and writes towards
certain PHY devices, move the code which deals with binding the PHY
device to the slave MII bus created by DSA to its own function:
dsa_slave_phy_connect().

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

of: mdio: export of_mdio_parse_addr

Export of_mdio_parse_addr() which allows parsing a given Ethernet PHY
node MDIO address, verify it is within the allowed range, and return
its value. This is going to be useful for the DSA code which needs to
deal with multiple layers of MDIO buses.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: bcmgenet: collect Rx discarded packet count

Bits 31:16 of RDMA_PROD_INDEX contain Rx discarded packet count, which
are the Rx packets that had to be dropped by MAC hardware since there
was no room on the Rx queue. Add code to collect this information into
the netdev stats.

Signed-off-by: Petri Gynther <pgynther@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fib_trie: Fix uninitialized variable warning

The 0-day kernel test infrastructure reported a use of uninitialized
variable warning for local_table due to the fact that the local and main
allocations had been swapped from the original setup. This change corrects
that by making it so that we free the main table if the local table
allocation fails.

Fixes: 0ddcf43d5 ("ipv4: FIB Local/MAIN table collapse")
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fib_trie: call fib_table_flush_external under RTNL

Move rtnl_lock() before the call to fib4_rules_exit so that
fib_table_flush_external is called under RTNL.

Fixes: 104616e74e0b ("switchdev: don't support custom ip rules, for now")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Acked-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Reviewed-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>

mpls: Allow mpls_gso and mpls_router to be built as modules

CONFIG_MPLS=m doesn't result in a kernel module being built because it
applies to the net/mpls directory, rather than to .o files.

So revert the MPLS menuitem to being a boolean and make MPLS_GSO and
MPLS_ROUTING tristates to allow mpls_gso and mpls_router modules to be
produced as desired.

Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Robert Shearman <rshearma@brocade.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/fsl: remove dependency FSL_SOC from MDIO

FSL_PQ_MDIO and FSL_XGMAC_MDIO are not really depend on FSL_SOC, they
can build on non-PPC platforms.

Signed-off-by: Shaohui Xie <Shaohui.Xie@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

rhashtable: Add arbitrary rehash function

This patch adds a rehash function that supports the use of any
hash function for the new table.  This is needed to support changing
the random seed value during the lifetime of the hash table.

However for now the random seed value is still constant and the
rehash function is simply used to replace the existing expand/shrink
functions.

[ ASSERT_BUCKET_LOCK() and thus debug_dump_table() +
  debug_dump_buckets() are not longer used, so delete them
  entirely. -DaveM ]

Signed-off-by: Herbert Xu <herbert.xu@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

rhashtable: Move hash_rnd into bucket_table

Currently hash_rnd is a parameter that users can set. However,
no existing users set this parameter. It is also something that
people are unlikely to want to set directly since it's just a
random number.

In preparation for allowing the reseeding/rehashing of rhashtable,
this patch moves hash_rnd into bucket_table so that it's now an
internal state rather than a parameter.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv4: FIB Local/MAIN table collapse

This patch is meant to collapse local and main into one by converting
tb_data from an array to a pointer. Doing this allows us to point the
local table into the main while maintaining the same variables in the
table.

As such the tb_data was converted from an array to a pointer, and a new
array called data is added in order to still provide an object for tb_data
to point to.

In order to track the origin of the fib aliases a tb_id value was added in
a hole that existed on 64b systems. Using this we can also reverse the
merge in the event that custom FIB rules are enabled.

With this patch I am seeing an improvement of 20ns to 30ns for routing
lookups as long as custom rules are not enabled, with custom rules enabled
we fall back to split tables and the original behavior.

Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: ensure that idle links are deleted when a bearer is disabled

commit afaa3f65f65fda2e7b190aac7e2a75d9a2a77cb6
(tipc: purge links when bearer is disabled) was an attempt to resolve
a problem that turned out to have a more profound reason.

When we disable a bearer, we delete all its pertaining links if
there is no other bearer to perform failover to, or if the module
is shutting down. In case there are dual bearers, we wait with
deleting links until the failover procedure is finished.

However, this misses the case when a link on the removed bearer
was already down, so that there will be no failover procedure to
finish the link delete. This causes confusion if a new bearer is
added to replace the removed one, and also entails a small memory
leak.

This commit takes the current state of the link into account when
deciding when to delete it, and also reverses the above-mentioned
commit.

Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fib_trie: Address possible NULL pointer dereference in resize

If the inflate call failed it would return NULL. As a result tp would be
set to NULL and cause use to trigger a NULL pointer dereference in
should_halve if the inflate failed on the first attempt.

In order to prevent this we should decrement max_work before we actually
attempt to inflate as this will force us to exit before attempting to halve
a node we should have inflated. In order to keep things symmetric between
inflate and halve I went ahead and also moved the decrement of max_work for
the halve case as well so we take care of that before we actually attempt
to halve the tnode.

Fixes: 88bae714 ("fib_trie: Add key vector to root, return parent key_vector in resize")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

macb: Fix merge error.

The code removed by commit 421d9df0628b ("net/macb: merge
at91_ether driver into macb driver") should be removed
in the merge resolution as well.

Signed-off-by: David S. Miller <davem@davemloft.net>

fib_trie: Correctly handle case of key == 0 in leaf_walk_rcu

In the case of a trie that had no tnodes with a key of 0 the initial
look-up would fail resulting in an out-of-bounds cindex on the first tnode.
This resulted in an entire trie being skipped.

In order resolve this I have updated the cindex logic in the initial
look-up so that if the key is zero we will always traverse the child zero
path.

Fixes: 8be33e95 ("fib_trie: Fib walk rcu should take a tnode and key instead of a trie and a leaf")
Reported-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Tested-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'inet_diag_cleanups'

Eric Dumazet says:

====================
inet_diag: cleanups and improvements

Before changing way request socks are dumped, let's clean up this code
a bit.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

inet_diag: add const to inet_diag_req_v2

diag dumpers should not modify the request.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

inet_diag: cleanups

Remove all inline keywords, add some const, and cleanup style.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: constify sock_diag_check_cookie()

sock_diag_check_cookie() second parameter is constant

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

drivers: atm: nicstar: remove ifdef'd out skb destructors

remove dead code.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next

Pablo Neira Ayuso says:

====================
Netfilter fixes for net-next

The following batch contains a couple of fixes to address some fallout
from the previous pull request, they are:

1) Address link problems in the bridge code after e5de75b. Fix it by
   using rcu hook to address to avoid ifdef pollution and hard
   dependency between bridge and br_netfilter.

2) Address sparse warnings in the netfilter reject code, patch from
   Florian Westphal.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

netfilter: bridge: use rcu hook to resolve br_netfilter dependency

e5de75b ("netfilter: bridge: move DNAT helper to br_netfilter") results
in the following link problem:

net/bridge/br_device.c:29: undefined reference to `br_nf_prerouting_finish_bridge`

Moreover it creates a hard dependency between br_netfilter and the
bridge core, which is what we've been trying to avoid so far.

Resolve this problem by using a hook structure so we reduce #ifdef
pollution and keep bridge netfilter specific code under br_netfilter.c
which was the original intention.

Reported-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netfilter: fix sparse warnings in reject handling

make C=1 CF=-D__CHECK_ENDIAN__ shows following:

net/bridge/netfilter/nft_reject_bridge.c:65:50: warning: incorrect type in argument 3 (different base types)
net/bridge/netfilter/nft_reject_bridge.c:65:50: expected restricted __be16 [usertype] protocol [..]
net/bridge/netfilter/nft_reject_bridge.c:102:37: warning: cast from restricted __be16
net/bridge/netfilter/nft_reject_bridge.c:102:37: warning: incorrect type in argument 1 (different base types) [..]
net/bridge/netfilter/nft_reject_bridge.c:121:50: warning: incorrect type in argument 3 (different base types) [..]
net/bridge/netfilter/nft_reject_bridge.c:168:52: warning: incorrect type in argument 3 (different base types) [..]
net/bridge/netfilter/nft_reject_bridge.c:233:52: warning: incorrect type in argument 3 (different base types) [..]

Caused by two (harmless) errors:
1. htons() instead of ntohs()
2. __be16 for protocol in nf_reject_ipXhdr_put API, use u8 instead.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

net: phy: bcm7xxx: add alternate id for 7439

BCM7439 has an alternate PHY OUI: 0xae025080 which is to be found in
some variants of this chip.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

switchdev: add netlink flags to IPv4 FIB add op

Pass in the netlink flags (NLM_F_*) into switchdev driver for IPv4 FIB add op
to allow driver to 1) optimize hardware updates, 2) handle ip route prepend
and append commands correctly.

Suggested-by: Jamal Hadi Salim <jhs@mojatatu.com>
Suggested-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'dsa-next'

Florian Fainelli says:

====================
net: dsa: remove restriction on platform_device

This patch series removes the restriction in DSA to operate exclusively with
platform_device Ethernet MAC drivers when using Device Tree. This basically
allows arbitrary Ethernet MAC drivers to be used now in conjunction with
Device Tree.

The reason was that DSA was using a of_find_device_by_node() which limits
the device_node to device pointer search exclusively to platform_device,
in our case, we are interested in doing a "class" research and lookup the
net_device.

Thanks to Chris Packham for testing this on his platform.

Changes in v2:

- fix build for !CONFIG_OF_NET
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: utilize of_find_net_device_by_node

Using of_find_device_by_node() restricts the search to platform_device that
match the specified device_node pointer. This is not even remotely true for
network devices backed by a pci_device for instance.

of_find_net_device_by_node() allows us to do a more thorough lookup to find the
struct net_device corresponding to a particular device_node pointer.

For symetry with the non-OF code path, we hold the net_device pointer in
dsa_probe() just like what dev_to_net_dev() does when we call this
function.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: core: add of_find_net_device_by_node()

Add a helper function which allows getting the struct net_device pointer
associated with a given struct device_node pointer. This is useful for
instance for DSA Ethernet devices not backed by a platform_device, but a PCI
device.

Since we need to access net_class which is not accessible outside of
net/core/net-sysfs.c, this helper function is also added here and gated
with CONFIG_OF_NET.

Network devices initialized with SET_NETDEV_DEV() are also taken into
account by checking for dev->parent first and then falling back to
checking the device pointer within struct net_device.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Conflicts:
drivers/net/ethernet/cadence/macb.c

Overlapping changes in macb driver, mostly fixes and cleanups
in 'net' overlapping with the integration of at91_ether into
macb in 'net-next'.

Signed-off-by: David S. Miller <davem@davemloft.net>

net: bcmgenet: core changes for supporting multiple Rx queues

1. Add struct bcmgenet_rx_ring to hold all necessary information
for a single Rx queue.
2. Add bcmgenet_init_rx_queues() to initialize all Rx queues.
3. Modify bcmgenet_init_rx_ring() to initialize a single Rx queue.
4. Modify Rx interrupt path code to use per-queue data.
5. Modify bcmgenet_rx_refill() to use RxCB->bd_addr.

Signed-off-by: Petri Gynther <pgynther@google.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm/s390 bugfixes from Marcelo Tosatti.

* git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: s390: non-LPAR case obsolete during facilities mask init
  KVM: s390: include guest facilities in kvm facility test
  KVM: s390: fix in memory copy of facility lists
  KVM: s390/cpacf: Fix kernel bug under z/VM
  KVM: s390/cpacf: Enable key wrapping by default

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 fixes from Martin Schwidefsky:
"One performance optimization for page_clear and a couple of bug fixes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/mm: fix incorrect ASCE after crst_table_downgrade
  s390/ftrace: fix crashes when switching tracers / add notrace to cpu_relax()
  s390/pci: unify pci_iomap symbol exports
  s390/pci: fix [un]map_resources sequence
  s390: let the compiler do page clearing
  s390/pci: fix possible information leak in mmio syscall
  s390/dcss: array index 'i' is used before limits check.
  s390/scm_block: fix off by one during cluster reservation
  s390/jump label: improve and fix sanity check
  s390/jump label: add missing jump_label_apply_nops() call

Merge tag 'trace-fixes-v4.0-rc2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull seq-buf/ftrace fixes from Steven Rostedt:
"This includes fixes for seq_buf_bprintf() truncation issue.  It also
  contains fixes to ftrace when /proc/sys/kernel/ftrace_enabled and
  function tracing are started.  Doing the following causes some issues:

    # echo 0 > /proc/sys/kernel/ftrace_enabled
    # echo function_graph > /sys/kernel/debug/tracing/current_tracer
    # echo 1 > /proc/sys/kernel/ftrace_enabled
    # echo nop > /sys/kernel/debug/tracing/current_tracer
    # echo function_graph > /sys/kernel/debug/tracing/current_tracer

  As well as with function tracing too.  Pratyush Anand first reported
  this issue to me and supplied a patch.  When I tested this on my x86
  test box, it caused thousands of backtraces and warnings to appear in
  dmesg, which also caused a denial of service (a warning for every
  function that was listed).  I applied Pratyush's patch but it did not
  fix the issue for me.  I looked into it and found a slight problem
  with trampoline accounting.  I fixed it and sent Pratyush a patch, but
  he said that it did not fix the issue for him.

  I later learned tha Pratyush was using an ARM64 server, and when I
  tested on my ARM board, I was able to reproduce the same issue as
  Pratyush.  After applying his patch, it fixed the problem.  The above
  test uncovered two different bugs, one in x86 and one in ARM and
  ARM64.  As this looked like it would affect PowerPC, I tested it on my
  PPC64 box.  It too broke, but neither the patch that fixed ARM or x86
  fixed this box (the changes were all in generic code!).  The above
  test, uncovered two more bugs that affected PowerPC.  Again, the
  changes were only done to generic code.  It's the way the arch code
  expected things to be done that was different between the archs.  Some
  where more sensitive than others.

  The rest of this series fixes the PPC bugs as well"

* tag 'trace-fixes-v4.0-rc2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  ftrace: Fix ftrace enable ordering of sysctl ftrace_enabled
  ftrace: Fix en(dis)able graph caller when en(dis)abling record via sysctl
  ftrace: Clear REGS_EN and TRAMP_EN flags on disabling record via sysctl
  seq_buf: Fix seq_buf_bprintf() truncation
  seq_buf: Fix seq_buf_vprintf() truncation

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:

1) nft_compat accidently truncates ethernet protocol to 8-bits, from
    Arturo Borrero.

2) Memory leak in ip_vs_proc_conn(), from Julian Anastasov.

3) Don't allow the space required for nftables rules to exceed the
    maximum value representable in the dlen field.  From Patrick
    McHardy.

4) bcm63xx_enet can accidently leave interrupts permanently disabled
    due to errors in the NAPI polling exit logic.  Fix from Nicolas
    Schichan.

5) Fix OOPSes triggerable by the ping protocol module, due to missing
    address family validations etc.  From Lorenzo Colitti.

6) Don't use RCU locking in sleepable context in team driver, from Jiri
    Pirko.

7) xen-netback miscalculates statistic offset pointers when reporting
    the stats to userspace.  From David Vrabel.

8) Fix a leak of up to 256 pages per VIF destroy in xen-netaback, also
    from David Vrabel.

9) ip_check_defrag() cannot assume that skb_network_offset(),
    particularly when it is used by the AF_PACKET fanout defrag code.
    From Alexander Drozdov.

10) gianfar driver doesn't query OF node names properly when trying to
    determine the number of hw queues available.  Fix it to explicitly
    check for OF nodes named queue-group.  From Tobias Waldekranz.

11) MID field in macb driver should be 12 bits, not 16.  From Punnaiah
    Choudary Kalluri.

12) Fix unintentional regression in traceroute due to timestamp socket
    option changes.  Empty ICMP payloads should be allowed in
    non-timestamp cases.  From Willem de Bruijn.

13) When devices are unregistered, we have to get rid of AF_PACKET
    multicast list entries that point to it via ifindex.  Fix from
    Francesco Ruggeri.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (38 commits)
  tipc: fix bug in link failover handling
  net: delete stale packet_mclist entries
  net: macb: constify macb configuration data
  MAINTAINERS: add Marc Kleine-Budde as co maintainer for CAN networking layer
  MAINTAINERS: linux-can moved to github
  can: kvaser_usb: Read all messages in a bulk-in URB buffer
  can: kvaser_usb: Avoid double free on URB submission failures
  can: peak_usb: fix missing ctrlmode_ init for every dev
  can: add missing initialisations in CAN related skbuffs
  ip: fix error queue empty skb handling
  bgmac: Clean warning messages
  tcp: align tcp_xmit_size_goal() on tcp_tso_autosize()
  net: fec: fix unbalanced clk disable on driver unbind
  net: macb: Correct the MID field length value
  net: gianfar: correctly determine the number of queue groups
  ipv4: ip_check_defrag should not assume that skb_network_offset is zero
  net: bcmgenet: properly disable password matching
  net: eth: xgene: fix booting with devicetree
  bnx2x: Force fundamental reset for EEH recovery
  xen-netback: refactor xenvif_handle_frag_list()
  ...

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input

Pull input subsystem fixes from Dmitry Torokhov:
"Miscellaneous driver fixes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: psmouse - disable "palm detection" in the focaltech driver
  Input: psmouse - disable changing resolution/rate/scale for FocalTech
  Input: psmouse - ensure that focaltech reports consistent coordinates
  Input: psmouse - remove hardcoded touchpad size from the focaltech driver
  Input: tc3589x-keypad - set IRQF_ONESHOT flag to ensure IRQ request
  Input: ALPS - fix memory leak when detection fails
  Input: sun4i-ts - add thermal driver dependency
  Input: cyapa - remove superfluous type check in cyapa_gen5_read_idac_data()
  Input: cyapa - fix unaligned functions redefinition error
  Input: mma8450 - add parent device

Merge tag 'regulator-v4.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator

Pull regulator fixes from Mark Brown:
"A couple of driver specific fixes plus a fix for a regression in the
  core where the updates to use sysfs group registration were overly
  enthusiastic in eliding properties and removed some that had been
  previously present"

* tag 'regulator-v4.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: Fix regression due to NULL constraints check
  regulator: rk808: Set the enable time for LDOs
  regulator: da9210: Mask all interrupt sources to deassert interrupt line

Merge tag 'spi-v4.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi

Pull spi fixes from Mark Brown:
"A collection of driver specific fixes to which the usual comments
  about them being important if you see them mostly apply (except for
  the comment fix).  The pl022 one is particularly nasty for anyone
  affected by it"

* tag 'spi-v4.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: pl022: Fix race in giveback() leading to driver lock-up
  spi: dw-mid: avoid potential NULL dereference
  spi: img-spfi: Verify max spfi transfer length
  spi: fix a typo in comment.
  spi: atmel: Fix interrupt setup for PDC transfers
  spi: dw: revisit FIFO size detection again
  spi: dw-pci: correct number of chip selects
  drivers: spi: ti-qspi: wait for busy bit clear before data write/read

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security

Pull tpm fixes from James Morris:
"fixes for the TPM driver"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
tpm: fix call order in tpm-chip.c
tpm/ibmvtpm: Additional LE support for tpm_ibmvtpm_send

Merge tag 'fbdev-fixes-4.0' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux

Pull fbdev fixes from Tomi Valkeinen:
- Fix regression in with omapdss when using i2c displays
- Fix possible null deref in fbmon
- Check kalloc return value in AMBA CLCD

* tag 'fbdev-fixes-4.0' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux:
  OMAPDSS: fix regression with display sysfs files
  video: fbdev: fix possible null dereference
  video: ARM CLCD: Add missing error check for devm_kzalloc

Merge branch 'for-4.0-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup

Pull cgroup fixes from Tejun Heo:
"The cgroup iteration update two years ago and the recent cpuset
  restructuring introduced regressions in subset of cpuset
  configurations.  Three patches to fix them.

  All are marked for -stable"

* 'for-4.0-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cpuset: Fix cpuset sched_relax_domain_level
  cpuset: fix a warning when clearing configured masks in old hierarchy
  cpuset: initialize effective masks when clone_children is enabled

Merge branch 'for-4.0-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata

Pull libata fixlet from Tejun Heo:
"Speed limiting fix for sata_fsl"

* 'for-4.0-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
sata-fsl: Apply link speed limits

Merge branch 'for-4.0-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq

Pull workqueue fix from Tejun Heo:
"One fix patch for a subtle livelock condition which can happen on
  PREEMPT_NONE kernels involving two racing cancel_work calls.  Whoever
  comes in the second has to wait for the previous one to finish.  This
  was implemented by making the later one block for the same condition
  that the former would be (work item completion) and then loop and
  retest; unfortunately, depending on the wake up order, the later one
  could lock out the former one to finish by busy looping on the cpu.

  This is fixed by implementing explicit wait mechanism.  Work item
  might not belong anywhere at this point and there's remote possibility
  of thundering herd problem.  I originally tried to use bit_waitqueue
  but it didn't work for static work items on modules.  It's currently
  using single wait queue with filtering wake up function and exclusive
  wakeup.  If this ever becomes a problem, which is not very likely, we
  can try to figure out a way to piggy back on bit_waitqueue"

* 'for-4.0-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  workqueue: fix hang involving racing cancel[_delayed]_work_sync()'s for PREEMPT_NONE

tipc: fix bug in link failover handling

In commit c637c1035534867b85b78b453c38c495b58e2c5a
("tipc: resolve race problem at unicast message reception") we
introduced a new mechanism for delivering buffers upwards from link
to socket layer.

That code contains a bug in how we handle the new link input queue
during failover. When a link is reset, some of its users may be blocked
because of congestion, and in order to resolve this, we add any pending
wakeup pseudo messages to the link's input queue, and deliver them to
the socket. This misses the case where the other, remaining link also
may have congested users. Currently, the owner node's reference to the
remaining link's input queue is unconditionally overwritten by the
reset link's input queue. This has the effect that wakeup events from
the remaining link may be unduely delayed (but not lost) for a
potentially long period.

We fix this by adding the pending events from the reset link to the
input queue that is currently referenced by the node, whichever one
it is.

This commit should be applied to both net and net-next.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: delete stale packet_mclist entries

When an interface is deleted from a net namespace the ifindex in the
corresponding entries in PF_PACKET sockets' mclists becomes stale.
This can create inconsistencies if later an interface with the same ifindex
is moved from a different namespace (not that unlikely since ifindexes are
per-namespace).
In particular we saw problems with dev->promiscuity, resulting
in "promiscuity touches roof, set promiscuity failed. promiscuity
feature of device might be broken" warnings and EOVERFLOW failures of
setsockopt(PACKET_ADD_MEMBERSHIP).
This patch deletes the mclist entries for interfaces that are deleted.
Since this now causes setsockopt(PACKET_DROP_MEMBERSHIP) to fail with
EADDRNOTAVAIL if called after the interface is deleted, also make
packet_mc_drop not fail.

Signed-off-by: Francesco Ruggeri <fruggeri@arista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: Add Ying Xue to TIPC maintainers list

We remove Allan Stephens, who has moved on to other tasks, from
the TIPC maintainers list. He is replaced by Ying Xue, who has
been doing the maintenance on behalf of WindRiver since almost
three years.

Acked-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Remove protocol from struct dst_ops

After my change to neigh_hh_init to obtain the protocol from the
neigh_table there are no more users of protocol in struct dst_ops.
Remove the protocol field from dst_ops and all of it's initializers.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next

Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates 2015-03-09

This series contains updates to i40e and i40evf.

Greg cleans up some "hello world" strings which were left around from
early development.

Shannon modifies the drive to make sure the sizeof() calls are taking
the size of the actual struct that we care about.  Also updates the
NVMUpdate read/write so that it is less noisy when logging.  This was
because the NVMUpdate tool does not necessarily know the ReadOnly map of
the current NVM image, and must try reading and writing words that may be
protected.  This generates an error out of the Firmware request that the
driver logs.  Unfortunately, this ended up spitting out hundreds of
bogus read/write error messages.  If a user wants the noisy logging,
the change can be overridden by enabling the NVM update debugging.

Mitch fixes a possible deadlock issue where if a reset occurred when the
netdev is closed, the reset task will hang in napi_disable.  Added
ethtool RSS support as suggested by Ben Hutchings.

Jesse fixes a bug introduced in the force writeback code, where the
interrupt rate was set to 0 (maximum) by accident.  The driver must
correctly set the NOITR fields to avoid IT update as a side effect
of triggering the software interrupt.

I provided a simple cleanup to make the use of PF/VF consistent, which
was reported by Joe Perches.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next

Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following patchset contains Netfilter updates for your net-next
tree. Basically, improvements for the packet rejection infrastructure,
deprecation of CLUSTERIP, cleanups for nf_tables and some untangling for
br_netfilter. More specifically they are:

1) Send packet to reset flow if checksum is valid, from Florian Westphal.

2) Fix nf_tables reject bridge from the input chain, also from Florian.

3) Deprecate the CLUSTERIP target, the cluster match supersedes it in
   functionality and it's known to have problems.

4) A couple of cleanups for nf_tables rule tracing infrastructure, from
   Patrick McHardy.

5) Another cleanup to place transaction declarations at the bottom of
   nf_tables.h, also from Patrick.

6) Consolidate Kconfig dependencies wrt. NF_TABLES.

7) Limit table names to 32 bytes in nf_tables.

8) mac header copying in bridge netfilter is already required when
   calling ip_fragment(), from Florian Westphal.

9) move nf_bridge_update_protocol() to br_netfilter.c, also from
   Florian.

10) Small refactor in br_netfilter in the transmission path, again from
    Florian.

11) Move br_nf_pre_routing_finish_bridge_slow() to br_netfilter.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: macb: constify macb configuration data

The configurations are not modified by the driver. Make them 'const' so
that they may be placed in a read-only section.

Signed-off-by: Josh Cartwright <joshc@ni.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mpls: Spelling: s/conceved/conceived/, s/as/a/

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: fix inconsistent signal handling regression

Commit 9bbb4ecc6819 ("tipc: standardize recvmsg routine") changed
the sleep/wakeup behaviour for sockets entering recv() or accept().
In this process the order of reporting -EAGAIN/-EINTR was reversed.
This caused problems with wrong errno being reported back if the
timeout expires. The same problem happens if the socket is
nonblocking and recv()/accept() is called when the process have
pending signals. If there is no pending data read or connections to
accept, -EINTR will be returned instead of -EAGAIN.

Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Reported-by László Benedek <laszlo.benedek@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'linux-can-fixes-for-4.0-20150309' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can

Marc Kleine-Budde says:

====================
pull-request: can 2015-03-09

this is a pull request for net/master for the 4.0 release cycle, it consists of
6 patches:

A patch by Oliver Hartkopp fixes a long outstanding bug in the infrastructure,
which leads to skb_under_panics when CAN interfaces are used by AF_PACKET
sockets e.g. by dhclient. Stephane Grosjean contributes a patch for the
peak_usb driver which adds a missing initialization. Two patches by Ahmed S.
Darwish fix problems in the kvaser_usb driver. Followed by two patches by
myself, updating the MAINTAINERS file
====================

Signed-off-by: David S. Miller <davem@davemloft.net>