]> git.proxmox.com Git - mirror_ubuntu-kernels.git/log
mirror_ubuntu-kernels.git
6 years agobpf: sockhash, add release routine
John Fastabend [Sat, 30 Jun 2018 13:17:52 +0000 (06:17 -0700)]
bpf: sockhash, add release routine

Add map_release_uref pointer to hashmap ops. This was dropped when
original sockhash code was ported into bpf-next before initial
commit.

Fixes: 81110384441a ("bpf: sockmap, add hash map support")
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
6 years agobpf: sockhash fix omitted bucket lock in sock_close
John Fastabend [Sat, 30 Jun 2018 13:17:47 +0000 (06:17 -0700)]
bpf: sockhash fix omitted bucket lock in sock_close

First the sk_callback_lock() was being used to protect both the
sock callback hooks and the psock->maps list. This got overly
convoluted after the addition of sockhash (in sockmap it made
some sense because masp and callbacks were tightly coupled) so
lets split out a specific lock for maps and only use the callback
lock for its intended purpose. This fixes a couple cases where
we missed using maps lock when it was in fact needed. Also this
makes it easier to follow the code because now we can put the
locking closer to the actual code its serializing.

Next, in sock_hash_delete_elem() the pattern was as follows,

  sock_hash_delete_elem()
     [...]
     spin_lock(bucket_lock)
     l = lookup_elem_raw()
     if (l)
        hlist_del_rcu()
        write_lock(sk_callback_lock)
         .... destroy psock ...
        write_unlock(sk_callback_lock)
     spin_unlock(bucket_lock)

The ordering is necessary because we only know the {p}sock after
dereferencing the hash table which we can't do unless we have the
bucket lock held. Once we have the bucket lock and the psock element
it is deleted from the hashmap to ensure any other path doing a lookup
will fail. Finally, the refcnt is decremented and if zero the psock
is destroyed.

In parallel with the above (or free'ing the map) a tcp close event
may trigger tcp_close(). Which at the moment omits the bucket lock
altogether (oops!) where the flow looks like this,

  bpf_tcp_close()
     [...]
     write_lock(sk_callback_lock)
     for each psock->maps // list of maps this sock is part of
         hlist_del_rcu(ref_hash_node);
         .... destroy psock ...
     write_unlock(sk_callback_lock)

Obviously, and demonstrated by syzbot, this is broken because
we can have multiple threads deleting entries via hlist_del_rcu().

To fix this we might be tempted to wrap the hlist operation in a
bucket lock but that would create a lock inversion problem. In
summary to follow locking rules the psocks maps list needs the
sk_callback_lock (after this patch maps_lock) but we need the bucket
lock to do the hlist_del_rcu.

To resolve the lock inversion problem pop the head of the maps list
repeatedly and remove the reference until no more are left. If a
delete happens in parallel from the BPF API that is OK as well because
it will do a similar action, lookup the lock in the map/hash, delete
it from the map/hash, and dec the refcnt. We check for this case
before doing a destroy on the psock to ensure we don't have two
threads tearing down a psock. The new logic is as follows,

  bpf_tcp_close()
  e = psock_map_pop(psock->maps) // done with map lock
  bucket_lock() // lock hash list bucket
  l = lookup_elem_raw(head, hash, key, key_size);
  if (l) {
     //only get here if elmnt was not already removed
     hlist_del_rcu()
     ... destroy psock...
  }
  bucket_unlock()

And finally for all the above to work add missing locking around  map
operations per above. Then add RCU annotations and use
rcu_dereference/rcu_assign_pointer to manage values relying on RCU so
that the object is not free'd from sock_hash_free() while it is being
referenced in bpf_tcp_close().

Reported-by: syzbot+0ce137753c78f7b6acc1@syzkaller.appspotmail.com
Fixes: 81110384441a ("bpf: sockmap, add hash map support")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
6 years agobpf: sockmap, fix smap_list_map_remove when psock is in many maps
John Fastabend [Sat, 30 Jun 2018 13:17:41 +0000 (06:17 -0700)]
bpf: sockmap, fix smap_list_map_remove when psock is in many maps

If a hashmap is free'd with open socks it removes the reference to
the hash entry from the psock. If that is the last reference to the
psock then it will also be free'd by the reference counting logic.
However the current logic that removes the hash reference from the
list of references is broken. In smap_list_remove() we first check
if the sockmap entry matches and then check if the hashmap entry
matches. But, the sockmap entry sill always match because its NULL in
this case which causes the first entry to be removed from the list.
If this is always the "right" entry (because the user adds/removes
entries in order) then everything is OK but otherwise a subsequent
bpf_tcp_close() may reference a free'd object.

To fix this create two list handlers one for sockmap and one for
sockhash.

Reported-by: syzbot+0ce137753c78f7b6acc1@syzkaller.appspotmail.com
Fixes: 81110384441a ("bpf: sockmap, add hash map support")
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
6 years agobpf: sockmap, fix crash when ipv6 sock is added
John Fastabend [Sat, 30 Jun 2018 13:17:36 +0000 (06:17 -0700)]
bpf: sockmap, fix crash when ipv6 sock is added

This fixes a crash where we assign tcp_prot to IPv6 sockets instead
of tcpv6_prot.

Previously we overwrote the sk->prot field with tcp_prot even in the
AF_INET6 case. This patch ensures the correct tcp_prot and tcpv6_prot
are used.

Tested with 'netserver -6' and 'netperf -H [IPv6]' as well as
'netperf -H [IPv4]'. The ESTABLISHED check resolves the previously
crashing case here.

Fixes: 174a79ff9515 ("bpf: sockmap with sk redirect support")
Reported-by: syzbot+5c063698bdbfac19f363@syzkaller.appspotmail.com
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Wei Wang <weiwan@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
6 years agoMerge branch 'bpf-fixes'
Alexei Starovoitov [Fri, 29 Jun 2018 17:47:35 +0000 (10:47 -0700)]
Merge branch 'bpf-fixes'

Daniel Borkmann says:

====================
This set contains three fixes that are mostly JIT and set_memory_*()
related. The third in the series in particular fixes the syzkaller
bugs that were still pending; aside from local reproduction & testing,
also 'syz test' wasn't able to trigger them anymore. I've tested this
series on x86_64, arm64 and s390x, and kbuild bot wasn't yelling either
for the rest. For details, please see patches as usual, thanks!
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
6 years agobpf: undo prog rejection on read-only lock failure
Daniel Borkmann [Thu, 28 Jun 2018 21:34:59 +0000 (23:34 +0200)]
bpf: undo prog rejection on read-only lock failure

Partially undo commit 9facc336876f ("bpf: reject any prog that failed
read-only lock") since it caused a regression, that is, syzkaller was
able to manage to cause a panic via fault injection deep in set_memory_ro()
path by letting an allocation fail: In x86's __change_page_attr_set_clr()
it was able to change the attributes of the primary mapping but not in
the alias mapping via cpa_process_alias(), so the second, inner call
to the __change_page_attr() via __change_page_attr_set_clr() had to split
a larger page and failed in the alloc_pages() with the artifically triggered
allocation error which is then propagated down to the call site.

Thus, for set_memory_ro() this means that it returned with an error, but
from debugging a probe_kernel_write() revealed EFAULT on that memory since
the primary mapping succeeded to get changed. Therefore the subsequent
hdr->locked = 0 reset triggered the panic as it was performed on read-only
memory, so call-site assumptions were infact wrong to assume that it would
either succeed /or/ not succeed at all since there's no such rollback in
set_memory_*() calls from partial change of mappings, in other words, we're
left in a state that is "half done". A later undo via set_memory_rw() is
succeeding though due to matching permissions on that part (aka due to the
try_preserve_large_page() succeeding). While reproducing locally with
explicitly triggering this error, the initial splitting only happens on
rare occasions and in real world it would additionally need oom conditions,
but that said, it could partially fail. Therefore, it is definitely wrong
to bail out on set_memory_ro() error and reject the program with the
set_memory_*() semantics we have today. Shouldn't have gone the extra mile
since no other user in tree today infact checks for any set_memory_*()
errors, e.g. neither module_enable_ro() / module_disable_ro() for module
RO/NX handling which is mostly default these days nor kprobes core with
alloc_insn_page() / free_insn_page() as examples that could be invoked long
after bootup and original 314beb9bcabf ("x86: bpf_jit_comp: secure bpf jit
against spraying attacks") did neither when it got first introduced to BPF
so "improving" with bailing out was clearly not right when set_memory_*()
cannot handle it today.

Kees suggested that if set_memory_*() can fail, we should annotate it with
__must_check, and all callers need to deal with it gracefully given those
set_memory_*() markings aren't "advisory", but they're expected to actually
do what they say. This might be an option worth to move forward in future
but would at the same time require that set_memory_*() calls from supporting
archs are guaranteed to be "atomic" in that they provide rollback if part
of the range fails, once that happened, the transition from RW -> RO could
be made more robust that way, while subsequent RO -> RW transition /must/
continue guaranteeing to always succeed the undo part.

Reported-by: syzbot+a4eb8c7766952a1ca872@syzkaller.appspotmail.com
Reported-by: syzbot+d866d1925855328eac3b@syzkaller.appspotmail.com
Fixes: 9facc336876f ("bpf: reject any prog that failed read-only lock")
Cc: Laura Abbott <labbott@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
6 years agobpf, s390: fix potential memleak when later bpf_jit_prog fails
Daniel Borkmann [Thu, 28 Jun 2018 21:34:58 +0000 (23:34 +0200)]
bpf, s390: fix potential memleak when later bpf_jit_prog fails

If we would ever fail in the bpf_jit_prog() pass that writes the
actual insns to the image after we got header via bpf_jit_binary_alloc()
then we also need to make sure to free it through bpf_jit_binary_free()
again when bailing out. Given we had prior bpf_jit_prog() passes to
initially probe for clobbered registers, program size and to fill in
addrs arrray for jump targets, this is more of a theoretical one,
but at least make sure this doesn't break with future changes.

Fixes: 054623105728 ("s390/bpf: Add s390x eBPF JIT compiler backend")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
6 years agobpf, arm32: fix to use bpf_jit_binary_lock_ro api
Daniel Borkmann [Thu, 28 Jun 2018 21:34:57 +0000 (23:34 +0200)]
bpf, arm32: fix to use bpf_jit_binary_lock_ro api

Any eBPF JIT that where its underlying arch supports ARCH_HAS_SET_MEMORY
would need to use bpf_jit_binary_{un,}lock_ro() pair instead of the
set_memory_{ro,rw}() pair directly as otherwise changes to the former
might break. arm32's eBPF conversion missed to change it, so fix this
up here.

Fixes: 39c13c204bb1 ("arm: eBPF JIT compiler")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
6 years agobpf: Change bpf_fib_lookup to return lookup status
David Ahern [Tue, 26 Jun 2018 23:21:18 +0000 (16:21 -0700)]
bpf: Change bpf_fib_lookup to return lookup status

For ACLs implemented using either FIB rules or FIB entries, the BPF
program needs the FIB lookup status to be able to drop the packet.
Since the bpf_fib_lookup API has not reached a released kernel yet,
change the return code to contain an encoding of the FIB lookup
result and return the nexthop device index in the params struct.

In addition, inform the BPF program of any post FIB lookup reason as
to why the packet needs to go up the stack.

The fib result for unicast routes must have an egress device, so remove
the check that it is non-NULL.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
6 years agotest_bpf: flag tests that cannot be jited on s390
Kleber Sacilotto de Souza [Wed, 27 Jun 2018 15:19:21 +0000 (17:19 +0200)]
test_bpf: flag tests that cannot be jited on s390

Flag with FLAG_EXPECTED_FAIL the BPF_MAXINSNS tests that cannot be jited
on s390 because they exceed BPF_SIZE_MAX and fail when
CONFIG_BPF_JIT_ALWAYS_ON is set. Also set .expected_errcode to -ENOTSUPP
so the tests pass in that case.

Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
6 years agoselftests: bpf: notification about privilege required to run test_lwt_seg6local.sh...
Jeffrin Jose T [Fri, 22 Jun 2018 21:40:32 +0000 (03:10 +0530)]
selftests: bpf: notification about privilege required to run test_lwt_seg6local.sh testing script

This test needs root privilege for it's successful execution.

This patch is atleast used to notify the user about the privilege
the script demands for the  smooth execution of the test.

Signed-off-by: Jeffrin Jose T (Rajagiri SET) <ahiliation@gmail.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
6 years agoselftests: bpf: notification about privilege required to run test_lirc_mode2.sh testi...
Jeffrin Jose T [Fri, 22 Jun 2018 18:54:17 +0000 (00:24 +0530)]
selftests: bpf: notification about privilege required to run test_lirc_mode2.sh testing script

The test_lirc_mode2.sh script require root privilege for the successful
execution of the test.

This patch is to notify the user about the privilege the script
demands for the successful execution of the test.

Signed-off-by: Jeffrin Jose T (Rajagiri SET) <ahiliation@gmail.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
6 years agoselftests: bpf: add missing NET_SCHED to config
Anders Roxell [Mon, 25 Jun 2018 14:56:05 +0000 (16:56 +0200)]
selftests: bpf: add missing NET_SCHED to config

CONFIG_NET_SCHED wasn't enabled in arm64's defconfig only for x86.

So bpf/test_tunnel.sh tests fails with:

  RTNETLINK answers: Operation not supported
  RTNETLINK answers: Operation not supported
  We have an error talking to the kernel, -1

Enable NET_SCHED and more tests pass.

Fixes: 3bce593ac06b ("selftests: bpf: config: add config fragments")
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
6 years agobpf: fix attach type BPF_LIRC_MODE2 dependency wrt CONFIG_CGROUP_BPF
Sean Young [Mon, 18 Jun 2018 23:04:24 +0000 (00:04 +0100)]
bpf: fix attach type BPF_LIRC_MODE2 dependency wrt CONFIG_CGROUP_BPF

If the kernel is compiled with CONFIG_CGROUP_BPF not enabled, it is not
possible to attach, detach or query IR BPF programs to /dev/lircN devices,
making them impossible to use. For embedded devices, it should be possible
to use IR decoding without cgroups or CONFIG_CGROUP_BPF enabled.

This change requires some refactoring, since bpf_prog_{attach,detach,query}
functions are now always compiled, but their code paths for cgroups need
moving out. Rather than a #ifdef CONFIG_CGROUP_BPF in kernel/bpf/syscall.c,
moving them to kernel/bpf/cgroup.c and kernel/bpf/sockmap.c does not
require #ifdefs since that is already conditionally compiled.

Fixes: f4364dcfc86d ("media: rc: introduce BPF_PROG_LIRC_MODE2")
Signed-off-by: Sean Young <sean@mess.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
6 years agonfp: bpf: don't stop offload if replace failed
Jakub Kicinski [Fri, 22 Jun 2018 18:56:56 +0000 (11:56 -0700)]
nfp: bpf: don't stop offload if replace failed

Stopping offload completely if replace of program failed dates
back to days of transparent offload.  Back then we wanted to
silently fall back to the in-driver processing.  Today we mark
programs for offload when they are loaded into the kernel, so
the transparent offload is no longer a reality.

Flags check in the driver will only allow replace of a driver
program with another driver program or an offload program with
another offload program.

When driver program is replaced stopping offload is a no-op,
because driver program isn't offloaded.  When replacing
offloaded program if the offload fails the entire operation
will fail all the way back to user space and we should continue
using the old program.  IOW when replacing a driver program
stopping offload is unnecessary and when replacing offloaded
program - it's a bug, old program should continue to run.

In practice this bug would mean that if offload operation was to
fail (either due to FW communication error, kernel OOM or new
program being offloaded but for a different netdev) driver
would continue reporting that previous XDP program is offloaded
but in fact no program will be loaded in hardware.  The failure
is fairly unlikely (found by inspection, when working on the code)
but it's unpleasant.

Backport note: even though the bug was introduced in commit
cafa92ac2553 ("nfp: bpf: add support for XDP_FLAGS_HW_MODE"),
this fix depends on commit 441a33031fe5 ("net: xdp: don't allow
device-bound programs in driver mode"), so this fix is sufficient
only in v4.15 or newer.  Kernels v4.13.x and v4.14.x do need to
stop offload if it was transparent/opportunistic, i.e. if
XDP_FLAGS_HW_MODE was not set on running program.

Fixes: cafa92ac2553 ("nfp: bpf: add support for XDP_FLAGS_HW_MODE")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
6 years agotools/bpf: fix test_sockmap failure
Yonghong Song [Thu, 21 Jun 2018 17:02:20 +0000 (10:02 -0700)]
tools/bpf: fix test_sockmap failure

On one of our production test machine, when running
bpf selftest test_sockmap, I got the following error:
  # sudo ./test_sockmap
  libbpf: failed to create map (name: 'sock_map'): Operation not permitted
  libbpf: failed to load object 'test_sockmap_kern.o'
  libbpf: Can't get the 0th fd from program sk_skb1: only -1 instances
  ......
  load_bpf_file: (-1) Operation not permitted
  ERROR: (-1) load bpf failed

The error is due to not-big-enough rlimit
  struct rlimit r = {10 * 1024 * 1024, RLIM_INFINITY};

The test already includes "bpf_rlimit.h", which sets current
and max rlimit to RLIM_INFINITY. Let us just use it.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
6 years agoselftests: bpf: notification about privilege required to run test_kmod.sh testing...
Jeffrin Jose T [Thu, 21 Jun 2018 17:00:20 +0000 (22:30 +0530)]
selftests: bpf: notification about privilege required to run test_kmod.sh testing script

The test_kmod.sh script require root privilege for the successful
execution of the test.

This patch is to notify the user about the privilege the script
demands for the successful execution of the test.

Signed-off-by: Jeffrin Jose T (Rajagiri SET) <ahiliation@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
6 years agoMerge branch 'bpf-bpftool-fixes'
Daniel Borkmann [Thu, 21 Jun 2018 21:07:13 +0000 (23:07 +0200)]
Merge branch 'bpf-bpftool-fixes'

Jakub Kicinski says:

====================
Two small fixes for error handling in bpftool prog load, first patch
removes a duplicated message, second one frees resources correctly.
Multiple error messages break JSON:

{
    "error": "can't pin the object (/sys/fs/bpf/a): File exists"
},{
    "error": "failed to pin program"
}
====================

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
6 years agotools: bpftool: remember to close the libbpf object after prog load
Jakub Kicinski [Wed, 20 Jun 2018 18:42:46 +0000 (11:42 -0700)]
tools: bpftool: remember to close the libbpf object after prog load

Remembering to close all descriptors and free memory may not seem
important in a user space tool like bpftool, but if we were to run
in batch mode the consumed resources start to add up quickly.  Make
sure program load closes the libbpf object (which unloads and frees
it).

Fixes: 49a086c201a9 ("bpftool: implement prog load command")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
6 years agotools: bpftool: remove duplicated error message on prog load
Jakub Kicinski [Wed, 20 Jun 2018 18:42:45 +0000 (11:42 -0700)]
tools: bpftool: remove duplicated error message on prog load

do_pin_fd() will already print out an error message if something
goes wrong.  Printing another error is unnecessary and will break
JSON output, since error messages are full objects:

$ bpftool -jp prog load tracex1_kern.o /sys/fs/bpf/a
{
    "error": "can't pin the object (/sys/fs/bpf/a): File exists"
},{
    "error": "failed to pin program"
}

Fixes: 49a086c201a9 ("bpftool: implement prog load command")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
6 years agoselftests: net: add tcp_inq to gitignore
Anders Roxell [Wed, 20 Jun 2018 22:43:44 +0000 (00:43 +0200)]
selftests: net: add tcp_inq to gitignore

sha: 702353b538f5 ("selftest: add test for TCP_INQ") forgot to add
tcp_inq to .gitignore.

Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: macb: Fix ptp time adjustment for large negative delta
Harini Katakam [Wed, 20 Jun 2018 11:34:20 +0000 (17:04 +0530)]
net: macb: Fix ptp time adjustment for large negative delta

When delta passed to gem_ptp_adjtime is negative, the sign is
maintained in the ns_to_timespec64 conversion. Hence timespec_add
should be used directly. timespec_sub will just subtract the negative
value thus increasing the time difference.

Signed-off-by: Harini Katakam <harini.katakam@xilinx.com>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoipvlan: fix IFLA_MTU ignored on NEWLINK
Xin Long [Thu, 21 Jun 2018 04:56:04 +0000 (12:56 +0800)]
ipvlan: fix IFLA_MTU ignored on NEWLINK

Commit 296d48568042 ("ipvlan: inherit MTU from master device") adjusted
the mtu from the master device when creating a ipvlan device, but it
would also override the mtu value set in rtnl_create_link. It causes
IFLA_MTU param not to take effect.

So this patch is to not adjust the mtu if IFLA_MTU param is set when
creating a ipvlan device.

Fixes: 296d48568042 ("ipvlan: inherit MTU from master device")
Reported-by: Jianlin Shi <jishi@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agosctp: fix erroneous inc of snmp SctpFragUsrMsgs
Marcelo Ricardo Leitner [Wed, 20 Jun 2018 15:47:52 +0000 (12:47 -0300)]
sctp: fix erroneous inc of snmp SctpFragUsrMsgs

Currently it is incrementing SctpFragUsrMsgs when the user message size
is of the exactly same size as the maximum fragment size, which is wrong.

The fix is to increment it only when user message is bigger than the
maximum fragment size.

Fixes: bfd2e4b8734d ("sctp: refactor sctp_datamsg_from_user")
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agobpf: enforce correct alignment for instructions
Eric Dumazet [Thu, 21 Jun 2018 00:24:09 +0000 (17:24 -0700)]
bpf: enforce correct alignment for instructions

After commit 9facc336876f ("bpf: reject any prog that failed read-only lock")
offsetof(struct bpf_binary_header, image) became 3 instead of 4,
breaking powerpc BPF badly, since instructions need to be word aligned.

Fixes: 9facc336876f ("bpf: reject any prog that failed read-only lock")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: mscc: fix the injection header
Antoine Tenart [Wed, 20 Jun 2018 08:50:46 +0000 (10:50 +0200)]
net: mscc: fix the injection header

When injecting frames in the Ocelot switch driver an injection header
(IFH) should be used to configure various parameters related to a given
frame, such as the port onto which the frame should be departed or its
vlan id. Other parameters in the switch configuration can led to an
injected frame being sent without an IFH but this led to various issues
as the per-frame parameters are then not used. This is especially true
when using multiple ports for injection.

The IFH was injected with the wrong endianness which led to the switch
not taking it into account as the IFH_INJ_BYPASS bit was then unset.
(The bit tells the switch to use the IFH over its internal
configuration). This patch fixes it.

In addition to the endianness fix, the IFH is also fixed. As it was
(unwillingly) unused, some of its fields were not configured the right
way.

Fixes: a556c76adc05 ("net: mscc: Add initial Ocelot switch support")
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Reviewed-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: davinci_emac: match the mdio device against its compatible if possible
Bartosz Golaszewski [Wed, 20 Jun 2018 08:03:56 +0000 (10:03 +0200)]
net: davinci_emac: match the mdio device against its compatible if possible

Device tree based systems without of_dev_auxdata will have the mdio
device named differently than "davinci_mdio(.0)". In this case use the
device's parent's compatible string for matching

Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agor8169: Fix netpoll oops
Ville Syrjälä [Wed, 20 Jun 2018 12:01:53 +0000 (15:01 +0300)]
r8169: Fix netpoll oops

Pass the correct thing to rtl8169_interrupt() from netpoll.

Cc: Realtek linux nic maintainers <nic_swsd@realtek.com>
Cc: netdev@vger.kernel.org
Cc: Heiner Kallweit <hkallweit1@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Fixes: ebcd5daa7ffd ("r8169: change interrupt handler argument type")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agostrparser: Don't schedule in workqueue in paused state
Vakul Garg [Wed, 20 Jun 2018 21:59:49 +0000 (03:29 +0530)]
strparser: Don't schedule in workqueue in paused state

In function strp_data_ready(), it is useless to call queue_work if
the state of strparser is already paused. The state checking should
be done before calling queue_work. The change reduces the context
switches and improves the ktls-rx throughput by approx 20% (measured
on cortex-a53 based platform).

Signed-off-by: Vakul Garg <vakul.garg@nxp.com>
Acked-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoselftests: net: add config fragments
Anders Roxell [Tue, 19 Jun 2018 16:41:11 +0000 (18:41 +0200)]
selftests: net: add config fragments

Add fragments to pass bridge and vlan tests.

Fixes: 33b01b7b4f19 ("selftests: add rtnetlink test script")
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agobpfilter: fix user mode helper cross compilation
Matteo Croce [Wed, 20 Jun 2018 14:04:34 +0000 (16:04 +0200)]
bpfilter: fix user mode helper cross compilation

Use $(OBJDUMP) instead of literal 'objdump' to avoid
using host toolchain when cross compiling.

Fixes: 421780fd4983 ("bpfilter: fix build error")
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Reported-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Linus Torvalds [Wed, 20 Jun 2018 22:22:30 +0000 (07:22 +0900)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma fixes from Jason Gunthorpe:
 "Here are eight fairly small fixes collected over the last two weeks.

  Regression and crashing bug fixes:

   - mlx4/5: Fixes for issues found from various checkers

   - A resource tracking and uverbs regression in the core code

   - qedr: NULL pointer regression found during testing

   - rxe: Various small bugs"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  IB/rxe: Fix missing completion for mem_reg work requests
  RDMA/core: Save kernel caller name when creating CQ using ib_create_cq()
  IB/uverbs: Fix ordering of ucontext check in ib_uverbs_write
  IB/mlx4: Fix an error handling path in 'mlx4_ib_rereg_user_mr()'
  RDMA/qedr: Fix NULL pointer dereference when running over iWARP without RDMA-CM
  IB/mlx5: Fix return value check in flow_counters_set_data()
  IB/mlx5: Fix memory leak in mlx5_ib_create_flow
  IB/rxe: avoid double kfree skb

6 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Wed, 20 Jun 2018 22:13:42 +0000 (07:13 +0900)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:

 1) Fix crash on bpf_prog_load() errors, from Daniel Borkmann.

 2) Fix ATM VCC memory accounting, from David Woodhouse.

 3) fib6_info objects need RCU freeing, from Eric Dumazet.

 4) Fix SO_BINDTODEVICE handling for TCP sockets, from David Ahern.

 5) Fix clobbered error code in enic_open() failure path, from
    Govindarajulu Varadarajan.

 6) Propagate dev_get_valid_name() error returns properly, from Li
    RongQing.

 7) Fix suspend/resume in davinci_emac driver, from Bartosz Golaszewski.

 8) Various act_ife fixes (recursive locking, IDR leaks, etc.) from
    Davide Caratti.

 9) Fix buggy checksum handling in sungem driver, from Eric Dumazet.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (40 commits)
  ip: limit use of gso_size to udp
  stmmac: fix DMA channel hang in half-duplex mode
  net: stmmac: socfpga: add additional ocp reset line for Stratix10
  net: sungem: fix rx checksum support
  bpfilter: ignore binary files
  bpfilter: fix build error
  net/usb/drivers: Remove useless hrtimer_active check
  net/sched: act_ife: preserve the action control in case of error
  net/sched: act_ife: fix recursive lock and idr leak
  net: ethernet: fix suspend/resume in davinci_emac
  net: propagate dev_get_valid_name return code
  enic: do not overwrite error code
  net/tcp: Fix socket lookups with SO_BINDTODEVICE
  ptp: replace getnstimeofday64() with ktime_get_real_ts64()
  net/ipv6: respect rcu grace period before freeing fib6_info
  net: net_failover: fix typo in net_failover_slave_register()
  ipvlan: use ETH_MAX_MTU as max mtu
  net: hamradio: use eth_broadcast_addr
  enic: initialize enic->rfs_h.lock in enic_probe
  MAINTAINERS: Add Sam as the maintainer for NCSI
  ...

6 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid
Linus Torvalds [Wed, 20 Jun 2018 07:42:39 +0000 (16:42 +0900)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid

Pull HID fixes from Jiri Kosina:

 - Wacom 2nd-gen Intuos Pro large Y axis handling fix from Jason Gerecke

 - fix for hibernation in Intel ISH driver, from Even Xu

 - crash fix for hid-steam driver, from Rodrigo Rivas Costa

 - new device ID addition to google-hammer driver

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
  HID: wacom: Correct logical maximum Y for 2nd-gen Intuos Pro large
  HID: intel_ish-hid: ipc: register more pm callbacks to support hibernation
  HID: steam: use hid_device.driver_data instead of hid_set_drvdata()
  HID: google: Add support for whiskers

6 years agoMerge tag 'dma-rename-4.18' of git://git.infradead.org/users/hch/dma-mapping
Linus Torvalds [Wed, 20 Jun 2018 07:30:01 +0000 (16:30 +0900)]
Merge tag 'dma-rename-4.18' of git://git.infradead.org/users/hch/dma-mapping

Pull dma-mapping rename from Christoph Hellwig:
 "Move all the dma-mapping code to kernel/dma and lose their dma-*
  prefixes"

* tag 'dma-rename-4.18' of git://git.infradead.org/users/hch/dma-mapping:
  dma-mapping: move all DMA mapping code to kernel/dma
  dma-mapping: use obj-y instead of lib-y for generic dma ops

6 years agoHID: wacom: Correct logical maximum Y for 2nd-gen Intuos Pro large
Jason Gerecke [Tue, 12 Jun 2018 20:42:46 +0000 (13:42 -0700)]
HID: wacom: Correct logical maximum Y for 2nd-gen Intuos Pro large

The HID descriptor for the 2nd-gen Intuos Pro large (PTH-860) contains
a typo which defines an incorrect logical maximum Y value. This causes
a small portion of the bottom of the tablet to become unusable (both
because the area is below the "bottom" of the tablet and because
'wacom_wac_event' ignores out-of-range values). It also results in a
skewed aspect ratio.

To fix this, we add a quirk to 'wacom_usage_mapping' which overwrites
the data with the correct value.

Signed-off-by: Jason Gerecke <jason.gerecke@wacom.com>
CC: stable@vger.kernel.org # v4.10+
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
6 years agoHID: intel_ish-hid: ipc: register more pm callbacks to support hibernation
Even Xu [Thu, 11 Feb 2016 20:11:34 +0000 (04:11 +0800)]
HID: intel_ish-hid: ipc: register more pm callbacks to support hibernation

Current ISH driver only registers suspend/resume PM callbacks which don't
support hibernation (suspend to disk). Basically after hiberation, the ISH
can't resume properly and user may not see sensor events (for example: screen
rotation may not work).

User will not see a crash or panic or anything except the following message
in log:

hid-sensor-hub 001F:8086:22D8.0001: timeout waiting for response from ISHTP device

So this patch adds support for S4/hiberbation to ISH by using the
SIMPLE_DEV_PM_OPS() MACRO instead of struct dev_pm_ops directly. The suspend
and resume functions will now be used for both suspend to RAM and hibernation.

If power management is disabled, SIMPLE_DEV_PM_OPS will do nothing, the suspend
and resume related functions won't be used, so mark them as __maybe_unused to
clarify that this is the intended behavior, and remove #ifdefs for power
management.

Cc: stable@vger.kernel.org
Signed-off-by: Even Xu <even.xu@intel.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
6 years agoHID: steam: use hid_device.driver_data instead of hid_set_drvdata()
Rodrigo Rivas Costa [Tue, 22 May 2018 20:10:06 +0000 (22:10 +0200)]
HID: steam: use hid_device.driver_data instead of hid_set_drvdata()

When creating the low-level hidraw device, the reference to steam_device
was stored using hid_set_drvdata(). But this value is not guaranteed to
be kept when set before calling probe. If this pointer is reset, it
crashes when opening the emulated hidraw device.

It looks like hid_set_drvdata() is for users "avobe" this hid_device,
while hid_device.driver_data it for users "below" this one.

In this case, we are creating a virtual hidraw device, so we must use
hid_device.driver_data.

Signed-off-by: Rodrigo Rivas Costa <rodrigorivascosta@gmail.com>
Tested-by: Mariusz Ceier <mceier+kernel@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
6 years agoproc: fix missing final NUL in get_mm_cmdline() rewrite
Linus Torvalds [Wed, 20 Jun 2018 00:47:20 +0000 (09:47 +0900)]
proc: fix missing final NUL in get_mm_cmdline() rewrite

The rewrite of the cmdline fetching missed the fact that we used to also
return the final terminating NUL character of the last argument.  I
hadn't noticed, and none of the tools I tested cared, but something
obviously must care, because Michal Kubecek noticed the change in
behavior.

Tweak the "find the end" logic to actually include the NUL character,
and once past the eend of argv, always start the strnlen() at the
expected (original) argument end.

This whole "allow people to rewrite their arguments in place" is a nasty
hack and requires that odd slop handling at the end of the argv array,
but it's our traditional model, so we continue to support it.

Repored-and-bisected-by: Michal Kubecek <mkubecek@suse.cz>
Reviewed-and-tested-by: Michal Kubecek <mkubecek@suse.cz>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agoip: limit use of gso_size to udp
Willem de Bruijn [Tue, 19 Jun 2018 16:47:52 +0000 (12:47 -0400)]
ip: limit use of gso_size to udp

The ipcm(6)_cookie field gso_size is set only in the udp path. The ip
layer copies this to cork only if sk_type is SOCK_DGRAM. This check
proved too permissive. Ping and l2tp sockets have the same type.

Limit to sockets of type SOCK_DGRAM and protocol IPPROTO_UDP to
exclude ping sockets.

v1 -> v2
- remove irrelevant whitespace changes

Fixes: bec1f6f69736 ("udp: generate gso with UDP_SEGMENT")
Reported-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agostmmac: fix DMA channel hang in half-duplex mode
Bhadram Varka [Sun, 17 Jun 2018 14:32:05 +0000 (20:02 +0530)]
stmmac: fix DMA channel hang in half-duplex mode

HW does not support Half-duplex mode in multi-queue
scenario. Fix it by not advertising the Half-Duplex
mode if multi-queue enabled.

Signed-off-by: Bhadram Varka <vbhadram@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: stmmac: socfpga: add additional ocp reset line for Stratix10
Dinh Nguyen [Tue, 19 Jun 2018 15:35:38 +0000 (10:35 -0500)]
net: stmmac: socfpga: add additional ocp reset line for Stratix10

The Stratix10 platform has an additional reset line, OCP(Open Core Protocol),
that also needs to get deasserted for the stmmac ethernet controller to work.
Thus we need to update the Kconfig to include ARCH_STRATIX10 in order to build
dwmac-socfpga.

Also, remove the redundant check for the reset controller pointer. The
reset driver already checks for the pointer and returns 0 if the pointer
is NULL.

Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: sungem: fix rx checksum support
Eric Dumazet [Wed, 20 Jun 2018 02:18:50 +0000 (19:18 -0700)]
net: sungem: fix rx checksum support

After commit 88078d98d1bb ("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE
are friends"), sungem owners reported the infamous "eth0: hw csum failure"
message.

CHECKSUM_COMPLETE has in fact never worked for this driver, but this
was masked by the fact that upper stacks had to strip the FCS, and
therefore skb->ip_summed was set back to CHECKSUM_NONE before
my recent change.

Driver configures a number of bytes to skip when the chip computes
the checksum, and for some reason only half of the Ethernet header
was skipped.

Then a second problem is that we should strip the FCS by default,
unless the driver is updated to eventually support NETIF_F_RXFCS in
the future.

Finally, a driver should check if NETIF_F_RXCSUM feature is enabled
or not, so that the admin can turn off rx checksum if wanted.

Many thanks to Andreas Schwab and Mathieu Malaterre for their
help in debugging this issue.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Meelis Roos <mroos@linux.ee>
Reported-by: Mathieu Malaterre <malat@debian.org>
Reported-by: Andreas Schwab <schwab@linux-m68k.org>
Tested-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agobpfilter: ignore binary files
Matteo Croce [Tue, 19 Jun 2018 15:21:36 +0000 (17:21 +0200)]
bpfilter: ignore binary files

net/bpfilter/bpfilter_umh is a binary file generated when bpfilter is
enabled, add it to .gitignore to avoid committing it.

Fixes: d2ba09c17a064 ("net: add skeleton of bpfilter kernel module")
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agobpfilter: fix build error
Matteo Croce [Tue, 19 Jun 2018 15:16:20 +0000 (17:16 +0200)]
bpfilter: fix build error

bpfilter Makefile assumes that the system locale is en_US, and the
parsing of objdump output fails.
Set LC_ALL=C and, while at it, rewrite the objdump parsing so it spawns
only 2 processes instead of 7.

Fixes: d2ba09c17a064 ("net: add skeleton of bpfilter kernel module")
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet/usb/drivers: Remove useless hrtimer_active check
Daniel Lezcano [Tue, 19 Jun 2018 14:14:30 +0000 (16:14 +0200)]
net/usb/drivers: Remove useless hrtimer_active check

The code does:

 if (hrtimer_active(&t))
    hrtimer_cancel(&t);

However, hrtimer_cancel() checks if the timer is active, so the
test above is pointless.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet/sched: act_ife: preserve the action control in case of error
Davide Caratti [Tue, 19 Jun 2018 13:45:50 +0000 (15:45 +0200)]
net/sched: act_ife: preserve the action control in case of error

in the following script

 # tc actions add action ife encode allow prio pass index 42
 # tc actions replace action ife encode allow tcindex drop index 42

the action control should remain equal to 'pass', if the kernel failed
to replace the TC action. Pospone the assignment of the action control,
to ensure it is not overwritten in the error path of tcf_ife_init().

Fixes: ef6980b6becb ("introduce IFE action")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet/sched: act_ife: fix recursive lock and idr leak
Davide Caratti [Tue, 19 Jun 2018 13:39:46 +0000 (15:39 +0200)]
net/sched: act_ife: fix recursive lock and idr leak

a recursive lock warning [1] can be observed with the following script,

 # $TC actions add action ife encode allow prio pass index 42
 IFE type 0xED3E
 # $TC actions replace action ife encode allow tcindex pass index 42

in case the kernel was unable to run the last command (e.g. because of
the impossibility to load 'act_meta_skbtcindex'). For a similar reason,
the kernel can leak idr in the error path of tcf_ife_init(), because
tcf_idr_release() is not called after successful idr reservation:

 # $TC actions add action ife encode allow tcindex index 47
 IFE type 0xED3E
 RTNETLINK answers: No such file or directory
 We have an error talking to the kernel
 # $TC actions add action ife encode allow tcindex index 47
 IFE type 0xED3E
 RTNETLINK answers: No space left on device
 We have an error talking to the kernel
 # $TC actions add action ife encode use mark 7 type 0xfefe pass index 47
 IFE type 0xFEFE
 RTNETLINK answers: No space left on device
 We have an error talking to the kernel

Since tcfa_lock is already taken when the action is being edited, a call
to tcf_idr_release() wrongly makes tcf_idr_cleanup() take the same lock
again. On the other hand, tcf_idr_release() needs to be called in the
error path of tcf_ife_init(), to undo the last tcf_idr_create() invocation.
Fix both problems in tcf_ife_init().
Since the cleanup() routine can now be called when ife->params is NULL,
also add a NULL pointer check to avoid calling kfree_rcu(NULL, rcu).

 [1]
 ============================================
 WARNING: possible recursive locking detected
 4.17.0-rc4.kasan+ #417 Tainted: G            E
 --------------------------------------------
 tc/3932 is trying to acquire lock:
 000000005097c9a6 (&(&p->tcfa_lock)->rlock){+...}, at: tcf_ife_cleanup+0x19/0x80 [act_ife]

 but task is already holding lock:
 000000005097c9a6 (&(&p->tcfa_lock)->rlock){+...}, at: tcf_ife_init+0xf6d/0x13c0 [act_ife]

 other info that might help us debug this:
  Possible unsafe locking scenario:

        CPU0
        ----
   lock(&(&p->tcfa_lock)->rlock);
   lock(&(&p->tcfa_lock)->rlock);

  *** DEADLOCK ***

  May be due to missing lock nesting notation

 2 locks held by tc/3932:
  #0: 000000007ca8e990 (rtnl_mutex){+.+.}, at: tcf_ife_init+0xf61/0x13c0 [act_ife]
  #1: 000000005097c9a6 (&(&p->tcfa_lock)->rlock){+...}, at: tcf_ife_init+0xf6d/0x13c0 [act_ife]

 stack backtrace:
 CPU: 3 PID: 3932 Comm: tc Tainted: G            E     4.17.0-rc4.kasan+ #417
 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
 Call Trace:
  dump_stack+0x9a/0xeb
  __lock_acquire+0xf43/0x34a0
  ? debug_check_no_locks_freed+0x2b0/0x2b0
  ? debug_check_no_locks_freed+0x2b0/0x2b0
  ? debug_check_no_locks_freed+0x2b0/0x2b0
  ? __mutex_lock+0x62f/0x1240
  ? kvm_sched_clock_read+0x1a/0x30
  ? sched_clock+0x5/0x10
  ? sched_clock_cpu+0x18/0x170
  ? find_held_lock+0x39/0x1d0
  ? lock_acquire+0x10b/0x330
  lock_acquire+0x10b/0x330
  ? tcf_ife_cleanup+0x19/0x80 [act_ife]
  _raw_spin_lock_bh+0x38/0x70
  ? tcf_ife_cleanup+0x19/0x80 [act_ife]
  tcf_ife_cleanup+0x19/0x80 [act_ife]
  __tcf_idr_release+0xff/0x350
  tcf_ife_init+0xdde/0x13c0 [act_ife]
  ? ife_exit_net+0x290/0x290 [act_ife]
  ? __lock_is_held+0xb4/0x140
  tcf_action_init_1+0x67b/0xad0
  ? tcf_action_dump_old+0xa0/0xa0
  ? sched_clock+0x5/0x10
  ? sched_clock_cpu+0x18/0x170
  ? kvm_sched_clock_read+0x1a/0x30
  ? sched_clock+0x5/0x10
  ? sched_clock_cpu+0x18/0x170
  ? memset+0x1f/0x40
  tcf_action_init+0x30f/0x590
  ? tcf_action_init_1+0xad0/0xad0
  ? memset+0x1f/0x40
  tc_ctl_action+0x48e/0x5e0
  ? mutex_lock_io_nested+0x1160/0x1160
  ? tca_action_gd+0x990/0x990
  ? sched_clock+0x5/0x10
  ? find_held_lock+0x39/0x1d0
  rtnetlink_rcv_msg+0x4da/0x990
  ? validate_linkmsg+0x680/0x680
  ? sched_clock_cpu+0x18/0x170
  ? find_held_lock+0x39/0x1d0
  netlink_rcv_skb+0x127/0x350
  ? validate_linkmsg+0x680/0x680
  ? netlink_ack+0x970/0x970
  ? __kmalloc_node_track_caller+0x304/0x3a0
  netlink_unicast+0x40f/0x5d0
  ? netlink_attachskb+0x580/0x580
  ? _copy_from_iter_full+0x187/0x760
  ? import_iovec+0x90/0x390
  netlink_sendmsg+0x67f/0xb50
  ? netlink_unicast+0x5d0/0x5d0
  ? copy_msghdr_from_user+0x206/0x340
  ? netlink_unicast+0x5d0/0x5d0
  sock_sendmsg+0xb3/0xf0
  ___sys_sendmsg+0x60a/0x8b0
  ? copy_msghdr_from_user+0x340/0x340
  ? lock_downgrade+0x5e0/0x5e0
  ? tty_write_lock+0x18/0x50
  ? kvm_sched_clock_read+0x1a/0x30
  ? sched_clock+0x5/0x10
  ? sched_clock_cpu+0x18/0x170
  ? find_held_lock+0x39/0x1d0
  ? lock_downgrade+0x5e0/0x5e0
  ? lock_acquire+0x10b/0x330
  ? __audit_syscall_entry+0x316/0x690
  ? current_kernel_time64+0x6b/0xd0
  ? __fget_light+0x55/0x1f0
  ? __sys_sendmsg+0xd2/0x170
  __sys_sendmsg+0xd2/0x170
  ? __ia32_sys_shutdown+0x70/0x70
  ? syscall_trace_enter+0x57a/0xd60
  ? rcu_read_lock_sched_held+0xdc/0x110
  ? __bpf_trace_sys_enter+0x10/0x10
  ? do_syscall_64+0x22/0x480
  do_syscall_64+0xa5/0x480
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
 RIP: 0033:0x7fd646988ba0
 RSP: 002b:00007fffc9fab3c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
 RAX: ffffffffffffffda RBX: 00007fffc9fab4f0 RCX: 00007fd646988ba0
 RDX: 0000000000000000 RSI: 00007fffc9fab440 RDI: 0000000000000003
 RBP: 000000005b28c8b3 R08: 0000000000000002 R09: 0000000000000000
 R10: 00007fffc9faae20 R11: 0000000000000246 R12: 0000000000000000
 R13: 00007fffc9fab504 R14: 0000000000000001 R15: 000000000066c100

Fixes: 4e8c86155010 ("net sched: net sched: ife action fix late binding")
Fixes: ef6980b6becb ("introduce IFE action")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: ethernet: fix suspend/resume in davinci_emac
Bartosz Golaszewski [Tue, 19 Jun 2018 12:44:00 +0000 (14:44 +0200)]
net: ethernet: fix suspend/resume in davinci_emac

This patch reverts commit 3243ff2a05ec ("net: ethernet: davinci_emac:
Deduplicate bus_find_device() by name matching") and adds a comment
which should stop anyone from reintroducing the same "fix" in the future.

We can't use bus_find_device_by_name() here because the device name is
not guaranteed to be 'davinci_mdio'. On some systems it can be
'davinci_mdio.0' so we need to use strncmp() against the first part of
the string to correctly match it.

Fixes: 3243ff2a05ec ("net: ethernet: davinci_emac: Deduplicate bus_find_device() by name matching")
Cc: stable@vger.kernel.org
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Acked-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: propagate dev_get_valid_name return code
Li RongQing [Tue, 19 Jun 2018 09:23:17 +0000 (17:23 +0800)]
net: propagate dev_get_valid_name return code

if dev_get_valid_name failed, propagate its return code

and remove the setting err to ENODEV, it will be set to
0 again before dev_change_net_namespace exits.

Signed-off-by: Li RongQing <lirongqing@baidu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoenic: do not overwrite error code
Govindarajulu Varadarajan [Mon, 18 Jun 2018 17:01:05 +0000 (10:01 -0700)]
enic: do not overwrite error code

In failure path, we overwrite err to what vnic_rq_disable() returns. In
case it returns 0, enic_open() returns success in case of error.

Reported-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Fixes: e8588e268509 ("enic: enable rq before updating rq descriptors")
Signed-off-by: Govindarajulu Varadarajan <gvaradar@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet/tcp: Fix socket lookups with SO_BINDTODEVICE
David Ahern [Mon, 18 Jun 2018 19:30:37 +0000 (12:30 -0700)]
net/tcp: Fix socket lookups with SO_BINDTODEVICE

Similar to 69678bcd4d2d ("udp: fix SO_BINDTODEVICE"), TCP socket lookups
need to fail if dev_match is not true. Currently, a packet to a given port
can match a socket bound to device when it should not. In the VRF case,
this causes the lookup to hit a VRF socket and not a global socket
resulting in a response trying to go through the VRF when it should not.

Fixes: 3fa6f616a7a4d ("net: ipv4: add second dif to inet socket lookups")
Fixes: 4297a0ef08572 ("net: ipv6: add second dif to inet6 socket lookups")
Reported-by: Lou Berger <lberger@labn.net>
Diagnosed-by: Renato Westphal <renato@opensourcerouting.org>
Tested-by: Renato Westphal <renato@opensourcerouting.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoptp: replace getnstimeofday64() with ktime_get_real_ts64()
Arnd Bergmann [Mon, 18 Jun 2018 14:20:39 +0000 (16:20 +0200)]
ptp: replace getnstimeofday64() with ktime_get_real_ts64()

getnstimeofday64() is deprecated and getting replaced throughout
the kernel with ktime_get_*() based helpers for a more consistent
interface.

The two functions do the exact same thing, so this is just
a cosmetic change.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet/ipv6: respect rcu grace period before freeing fib6_info
Eric Dumazet [Mon, 18 Jun 2018 12:24:31 +0000 (05:24 -0700)]
net/ipv6: respect rcu grace period before freeing fib6_info

syzbot reported use after free that is caused by fib6_info being
freed without a proper RCU grace period.

CPU: 0 PID: 1407 Comm: udevd Not tainted 4.17.0+ #39
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1b9/0x294 lib/dump_stack.c:113
 print_address_description+0x6c/0x20b mm/kasan/report.c:256
 kasan_report_error mm/kasan/report.c:354 [inline]
 kasan_report.cold.7+0x242/0x2fe mm/kasan/report.c:412
 __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
 __read_once_size include/linux/compiler.h:188 [inline]
 find_rr_leaf net/ipv6/route.c:705 [inline]
 rt6_select net/ipv6/route.c:761 [inline]
 fib6_table_lookup+0x12b7/0x14d0 net/ipv6/route.c:1823
 ip6_pol_route+0x1c2/0x1020 net/ipv6/route.c:1856
 ip6_pol_route_output+0x54/0x70 net/ipv6/route.c:2082
 fib6_rule_lookup+0x211/0x6d0 net/ipv6/fib6_rules.c:122
 ip6_route_output_flags+0x2c5/0x350 net/ipv6/route.c:2110
 ip6_route_output include/net/ip6_route.h:82 [inline]
 icmpv6_xrlim_allow net/ipv6/icmp.c:211 [inline]
 icmp6_send+0x147c/0x2da0 net/ipv6/icmp.c:535
 icmpv6_send+0x17a/0x300 net/ipv6/ip6_icmp.c:43
 ip6_link_failure+0xa5/0x790 net/ipv6/route.c:2244
 dst_link_failure include/net/dst.h:427 [inline]
 ndisc_error_report+0xd1/0x1c0 net/ipv6/ndisc.c:695
 neigh_invalidate+0x246/0x550 net/core/neighbour.c:892
 neigh_timer_handler+0xaf9/0xde0 net/core/neighbour.c:978
 call_timer_fn+0x230/0x940 kernel/time/timer.c:1326
 expire_timers kernel/time/timer.c:1363 [inline]
 __run_timers+0x79e/0xc50 kernel/time/timer.c:1666
 run_timer_softirq+0x4c/0x70 kernel/time/timer.c:1692
 __do_softirq+0x2e0/0xaf5 kernel/softirq.c:284
 invoke_softirq kernel/softirq.c:364 [inline]
 irq_exit+0x1d1/0x200 kernel/softirq.c:404
 exiting_irq arch/x86/include/asm/apic.h:527 [inline]
 smp_apic_timer_interrupt+0x17e/0x710 arch/x86/kernel/apic/apic.c:1052
 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:863
 </IRQ>
RIP: 0010:strlen+0x5e/0xa0 lib/string.c:482
Code: 24 00 74 3b 48 bb 00 00 00 00 00 fc ff df 4c 89 e0 48 83 c0 01 48 89 c2 48 89 c1 48 c1 ea 03 83 e1 07 0f b6 14 1a 38 ca 7f 04 <84> d2 75 23 80 38 00 75 de 48 83 c4 08 4c 29 e0 5b 41 5c 5d c3 48
RSP: 0018:ffff8801af117850 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
RAX: ffff880197f53bd0 RBX: dffffc0000000000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffffff81c5b06c RDI: ffff880197f53bc0
RBP: ffff8801af117868 R08: ffff88019a976540 R09: 0000000000000000
R10: ffff88019a976540 R11: 0000000000000000 R12: ffff880197f53bc0
R13: ffff880197f53bc0 R14: ffffffff899e4e90 R15: ffff8801d91c6a00
 strlen include/linux/string.h:267 [inline]
 getname_kernel+0x24/0x370 fs/namei.c:218
 open_exec+0x17/0x70 fs/exec.c:882
 load_elf_binary+0x968/0x5610 fs/binfmt_elf.c:780
 search_binary_handler+0x17d/0x570 fs/exec.c:1653
 exec_binprm fs/exec.c:1695 [inline]
 __do_execve_file.isra.35+0x16fe/0x2710 fs/exec.c:1819
 do_execveat_common fs/exec.c:1866 [inline]
 do_execve fs/exec.c:1883 [inline]
 __do_sys_execve fs/exec.c:1964 [inline]
 __se_sys_execve fs/exec.c:1959 [inline]
 __x64_sys_execve+0x8f/0xc0 fs/exec.c:1959
 do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x7f1576a46207
Code: 77 19 f4 48 89 d7 44 89 c0 0f 05 48 3d 00 f0 ff ff 76 e0 f7 d8 64 41 89 01 eb d8 f7 d8 64 41 89 01 eb df b8 3b 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 02 f3 c3 48 8b 15 00 8c 2d 00 f7 d8 64 89 02
RSP: 002b:00007ffff2784568 EFLAGS: 00000202 ORIG_RAX: 000000000000003b
RAX: ffffffffffffffda RBX: 00000000ffffffff RCX: 00007f1576a46207
RDX: 0000000001215b10 RSI: 00007ffff2784660 RDI: 00007ffff2785670
RBP: 0000000000625500 R08: 000000000000589c R09: 000000000000589c
R10: 0000000000000000 R11: 0000000000000202 R12: 0000000001215b10
R13: 0000000000000007 R14: 0000000001204250 R15: 0000000000000005

Allocated by task 12188:
 save_stack+0x43/0xd0 mm/kasan/kasan.c:448
 set_track mm/kasan/kasan.c:460 [inline]
 kasan_kmalloc+0xc4/0xe0 mm/kasan/kasan.c:553
 kmem_cache_alloc_trace+0x152/0x780 mm/slab.c:3620
 kmalloc include/linux/slab.h:513 [inline]
 kzalloc include/linux/slab.h:706 [inline]
 fib6_info_alloc+0xbb/0x280 net/ipv6/ip6_fib.c:152
 ip6_route_info_create+0x782/0x2b50 net/ipv6/route.c:3013
 ip6_route_add+0x23/0xb0 net/ipv6/route.c:3154
 ipv6_route_ioctl+0x5a5/0x760 net/ipv6/route.c:3660
 inet6_ioctl+0x100/0x1f0 net/ipv6/af_inet6.c:546
 sock_do_ioctl+0xe4/0x3e0 net/socket.c:973
 sock_ioctl+0x30d/0x680 net/socket.c:1097
 vfs_ioctl fs/ioctl.c:46 [inline]
 file_ioctl fs/ioctl.c:500 [inline]
 do_vfs_ioctl+0x1cf/0x16f0 fs/ioctl.c:684
 ksys_ioctl+0xa9/0xd0 fs/ioctl.c:701
 __do_sys_ioctl fs/ioctl.c:708 [inline]
 __se_sys_ioctl fs/ioctl.c:706 [inline]
 __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:706
 do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Freed by task 1402:
 save_stack+0x43/0xd0 mm/kasan/kasan.c:448
 set_track mm/kasan/kasan.c:460 [inline]
 __kasan_slab_free+0x11a/0x170 mm/kasan/kasan.c:521
 kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528
 __cache_free mm/slab.c:3498 [inline]
 kfree+0xd9/0x260 mm/slab.c:3813
 fib6_info_destroy+0x29b/0x350 net/ipv6/ip6_fib.c:207
 fib6_info_release include/net/ip6_fib.h:286 [inline]
 __ip6_del_rt_siblings net/ipv6/route.c:3235 [inline]
 ip6_route_del+0x11c4/0x13b0 net/ipv6/route.c:3316
 ipv6_route_ioctl+0x616/0x760 net/ipv6/route.c:3663
 inet6_ioctl+0x100/0x1f0 net/ipv6/af_inet6.c:546
 sock_do_ioctl+0xe4/0x3e0 net/socket.c:973
 sock_ioctl+0x30d/0x680 net/socket.c:1097
 vfs_ioctl fs/ioctl.c:46 [inline]
 file_ioctl fs/ioctl.c:500 [inline]
 do_vfs_ioctl+0x1cf/0x16f0 fs/ioctl.c:684
 ksys_ioctl+0xa9/0xd0 fs/ioctl.c:701
 __do_sys_ioctl fs/ioctl.c:708 [inline]
 __se_sys_ioctl fs/ioctl.c:706 [inline]
 __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:706
 do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

The buggy address belongs to the object at ffff8801b5df2580
 which belongs to the cache kmalloc-256 of size 256
The buggy address is located 8 bytes inside of
 256-byte region [ffff8801b5df2580ffff8801b5df2680)
The buggy address belongs to the page:
page:ffffea0006d77c80 count:1 mapcount:0 mapping:ffff8801da8007c0 index:0xffff8801b5df2e40
flags: 0x2fffc0000000100(slab)
raw: 02fffc0000000100 ffffea0006c5cc48 ffffea0007363308 ffff8801da8007c0
raw: ffff8801b5df2e40 ffff8801b5df2080 0000000100000006 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff8801b5df2480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff8801b5df2500: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
ffff8801b5df2580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                      ^
 ffff8801b5df2600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff8801b5df2680: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb

Fixes: a64efe142f5e ("net/ipv6: introduce fib6_info struct and helpers")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: David Ahern <dsahern@gmail.com>
Reported-by: syzbot+9e6d75e3edef427ee888@syzkaller.appspotmail.com
Acked-by: David Ahern <dsahern@gmail.com>
Tested-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: net_failover: fix typo in net_failover_slave_register()
Liran Alon [Mon, 18 Jun 2018 12:04:05 +0000 (15:04 +0300)]
net: net_failover: fix typo in net_failover_slave_register()

Sync both unicast and multicast lists instead of unicast twice.

Fixes: cfc80d9a116 ("net: Introduce net_failover driver")
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoipvlan: use ETH_MAX_MTU as max mtu
Xin Long [Mon, 18 Jun 2018 08:15:57 +0000 (16:15 +0800)]
ipvlan: use ETH_MAX_MTU as max mtu

Similar to the fixes on team and bonding, this restores the ability
to set an ipvlan device's mtu to anything higher than 1500.

Fixes: 91572088e3fd ("net: use core MTU range checking in core net infra")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: hamradio: use eth_broadcast_addr
Stefan Agner [Sun, 17 Jun 2018 21:40:53 +0000 (23:40 +0200)]
net: hamradio: use eth_broadcast_addr

The array bpq_eth_addr is only used to get the size of an
address, whereas the bcast_addr is used to set the broadcast
address. This leads to a warning when using clang:
drivers/net/hamradio/bpqether.c:94:13: warning: variable 'bpq_eth_addr' is not
      needed and will not be emitted [-Wunneeded-internal-declaration]
static char bpq_eth_addr[6];
            ^

Remove both variables and use the common eth_broadcast_addr
to set the broadcast address.

Signed-off-by: Stefan Agner <stefan@agner.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoenic: initialize enic->rfs_h.lock in enic_probe
Govindarajulu Varadarajan [Tue, 19 Jun 2018 15:15:24 +0000 (08:15 -0700)]
enic: initialize enic->rfs_h.lock in enic_probe

lockdep spotted that we are using rfs_h.lock in enic_get_rxnfc() without
initializing. rfs_h.lock is initialized in enic_open(). But ethtool_ops
can be called when interface is down.

Move enic_rfs_flw_tbl_init to enic_probe.

INFO: trying to register non-static key.
the code is fine but needs lockdep annotation.
turning off the locking correctness validator.
CPU: 18 PID: 1189 Comm: ethtool Not tainted 4.17.0-rc7-devel+ #27
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-20171110_100015-anatol 04/01/2014
Call Trace:
dump_stack+0x85/0xc0
register_lock_class+0x550/0x560
? __handle_mm_fault+0xa8b/0x1100
__lock_acquire+0x81/0x670
lock_acquire+0xb9/0x1e0
?  enic_get_rxnfc+0x139/0x2b0 [enic]
_raw_spin_lock_bh+0x38/0x80
? enic_get_rxnfc+0x139/0x2b0 [enic]
enic_get_rxnfc+0x139/0x2b0 [enic]
ethtool_get_rxnfc+0x8d/0x1c0
dev_ethtool+0x16c8/0x2400
? __mutex_lock+0x64d/0xa00
? dev_load+0x6a/0x150
dev_ioctl+0x253/0x4b0
sock_do_ioctl+0x9a/0x130
sock_ioctl+0x1af/0x350
do_vfs_ioctl+0x8e/0x670
? syscall_trace_enter+0x1e2/0x380
ksys_ioctl+0x60/0x90
__x64_sys_ioctl+0x16/0x20
do_syscall_64+0x5a/0x170
entry_SYSCALL_64_after_hwframe+0x49/0xbe

Signed-off-by: Govindarajulu Varadarajan <gvaradar@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'NCSI-silence-warning-messages'
David S. Miller [Tue, 19 Jun 2018 22:26:59 +0000 (07:26 +0900)]
Merge branch 'NCSI-silence-warning-messages'

Joel Stanley says:

====================
Slience NCSI logging

v2:
  Fix indent issue and commit message based on Joe's feedback
  Add Sam's acks

Here are three changes to silence unnecessary warnings in the ncsi code.

The final patch adds Sam as the maintainer for NCSI.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMAINTAINERS: Add Sam as the maintainer for NCSI
Joel Stanley [Tue, 19 Jun 2018 05:38:34 +0000 (15:08 +0930)]
MAINTAINERS: Add Sam as the maintainer for NCSI

Sam has been handing the maintenance of NCSI for a number release cycles
now.

Acked-by: Samuel Mendoza-Jonas <sam@mendozajonas.com>
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet/ncsi: Use netdev_dbg for debug messages
Joel Stanley [Tue, 19 Jun 2018 05:38:33 +0000 (15:08 +0930)]
net/ncsi: Use netdev_dbg for debug messages

This moves all of the netdev_printk(KERN_DEBUG, ...) messages over to
netdev_dbg.

As Joe explains:

> netdev_dbg is not included in object code unless
> DEBUG is defined or CONFIG_DYNAMIC_DEBUG is set.
> And then, it is not emitted into the log unless
> DEBUG is set or this specific netdev_dbg is enabled
> via the dynamic debug control file.

Which is what we're after in this case.

Acked-by: Samuel Mendoza-Jonas <sam@mendozajonas.com>
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet/ncsi: Drop no more channels message
Joel Stanley [Tue, 19 Jun 2018 05:38:32 +0000 (15:08 +0930)]
net/ncsi: Drop no more channels message

This does not provide useful information. As the ncsi maintainer said:

 > either we get a channel or broadcom has gone out to lunch

Acked-by: Samuel Mendoza-Jonas <sam@mendozajonas.com>
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet/ncsi: Silence debug messages
Joel Stanley [Tue, 19 Jun 2018 05:38:31 +0000 (15:08 +0930)]
net/ncsi: Silence debug messages

In normal operation we see this series of messages as the host drives
the network device:

 ftgmac100 1e660000.ethernet eth0: NCSI: LSC AEN - channel 0 state down
 ftgmac100 1e660000.ethernet eth0: NCSI: suspending channel 0
 ftgmac100 1e660000.ethernet eth0: NCSI: configuring channel 0
 ftgmac100 1e660000.ethernet eth0: NCSI: channel 0 link down after config
 ftgmac100 1e660000.ethernet eth0: NCSI interface down
 ftgmac100 1e660000.ethernet eth0: NCSI: LSC AEN - channel 0 state up
 ftgmac100 1e660000.ethernet eth0: NCSI: configuring channel 0
 ftgmac100 1e660000.ethernet eth0: NCSI interface up
 ftgmac100 1e660000.ethernet eth0: NCSI: LSC AEN - channel 0 state down
 ftgmac100 1e660000.ethernet eth0: NCSI: suspending channel 0
 ftgmac100 1e660000.ethernet eth0: NCSI: configuring channel 0
 ftgmac100 1e660000.ethernet eth0: NCSI: channel 0 link down after config
 ftgmac100 1e660000.ethernet eth0: NCSI interface down
 ftgmac100 1e660000.ethernet eth0: NCSI: LSC AEN - channel 0 state up
 ftgmac100 1e660000.ethernet eth0: NCSI: configuring channel 0
 ftgmac100 1e660000.ethernet eth0: NCSI interface up

This makes all of these messages netdev_dbg. They are still useful to
debug eg. misbehaving network device firmware, but we do not need them
filling up the kernel logs in normal operation.

Acked-by: Samuel Mendoza-Jonas <sam@mendozajonas.com>
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agobpf, xdp, i40e: fix i40e_build_skb skb reserve and truesize
Daniel Borkmann [Tue, 19 Jun 2018 21:33:54 +0000 (14:33 -0700)]
bpf, xdp, i40e: fix i40e_build_skb skb reserve and truesize

Using skb_reserve(skb, I40E_SKB_PAD + (xdp->data - xdp->data_hard_start))
is clearly wrong since I40E_SKB_PAD already points to the offset where
the original xdp->data was sitting since xdp->data_hard_start is defined
as xdp->data - i40e_rx_offset(rx_ring) where latter offsets to I40E_SKB_PAD
when build skb is used.

However, also before cc5b114dcf98 ("bpf, i40e: add meta data support")
this seems broken since bpf_xdp_adjust_head() helper could have been used
to alter headroom and enlarge / shrink the frame and with that the assumption
that the xdp->data remains unchanged does not hold and would push a bogus
packet to upper stack.

ixgbe got this right in 924708081629 ("ixgbe: add XDP support for pass and
drop actions"). In any case, fix it by removing the I40E_SKB_PAD from both
skb_reserve() and truesize calculation.

Fixes: cc5b114dcf98 ("bpf, i40e: add meta data support")
Fixes: 0c8493d90b6b ("i40e: add XDP support for pass and drop actions")
Reported-by: Keith Busch <keith.busch@linux.intel.com>
Reported-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Björn Töpel <bjorn.topel@intel.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Tested-by: Keith Busch <keith.busch@linux.intel.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'qed-fixes'
David S. Miller [Tue, 19 Jun 2018 22:15:34 +0000 (07:15 +0900)]
Merge branch 'qed-fixes'

Sudarsana Reddy Kalluru says:

====================
qed*: Fix series.

The patch series fixes few issues in the qed/qede drivers.
Please consider applying this series to "net".
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoqed: Do not advertise DCBX_LLD_MANAGED capability.
Sudarsana Reddy Kalluru [Tue, 19 Jun 2018 04:58:02 +0000 (21:58 -0700)]
qed: Do not advertise DCBX_LLD_MANAGED capability.

Do not advertise DCBX_LLD_MANAGED capability i.e., do not allow
external agent to manage the dcbx/lldp negotiation. MFW acts as lldp agent
for qed* devices, and no other lldp agent is allowed to coexist with mfw.

Also updated a debug print, to not to display the redundant info.

Fixes: a1d8d8a51 ("qed: Add dcbnl support.")
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoqed: Add sanity check for SIMD fastpath handler.
Sudarsana Reddy Kalluru [Tue, 19 Jun 2018 04:58:01 +0000 (21:58 -0700)]
qed: Add sanity check for SIMD fastpath handler.

Avoid calling a SIMD fastpath handler if it is NULL. The check is needed
to handle an unlikely scenario where unsolicited interrupt is destined to
a PF in INTa mode.

Fixes: fe56b9e6a ("qed: Add module with basic common support")
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoqed: Fix possible memory leak in Rx error path handling.
Sudarsana Reddy Kalluru [Tue, 19 Jun 2018 04:58:00 +0000 (21:58 -0700)]
qed: Fix possible memory leak in Rx error path handling.

Memory for packet buffers need to be freed in the error paths as there is
no consumer (e.g., upper layer) for such packets and that memory will never
get freed.
The issue was uncovered when port was attacked with flood of isatap
packets, these are multicast packets hence were directed at all the PFs.
For foce PF, this meant they were routed to the ll2 module which in turn
drops such packets.

Fixes: 0a7fb11c ("qed: Add Light L2 support")
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge tag 'jfs-4.18' of git://github.com/kleikamp/linux-shaggy
Linus Torvalds [Mon, 18 Jun 2018 22:47:32 +0000 (07:47 +0900)]
Merge tag 'jfs-4.18' of git://github.com/kleikamp/linux-shaggy

Pull jfs fix from Dave Kleikamp:
 "This fixes a too-small allocation in the xattr code"

* tag 'jfs-4.18' of git://github.com/kleikamp/linux-shaggy:
  jfs: Fix inconsistency between memory allocation and ea_buf->max_size

6 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Linus Torvalds [Mon, 18 Jun 2018 22:44:51 +0000 (07:44 +0900)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 updates from Martin Schwidefsky:
 "common I/O layer
   - Fix bit-fields crossing storage-unit boundaries in css_general_char

  dasd driver
   - Avoid a sparse warning in regard to the queue lock
   - Allocate the struct dasd_ccw_req as per request data. Only for
     internal I/O is the structure allocated separately
   - Remove the unused function dasd_kmalloc_set_cda
   - Save a few bytes in struct dasd_ccw_req by reordering fields
   - Convert remaining users of dasd_kmalloc_request to
     dasd_smalloc_request and remove the now unused function

  vfio/ccw
   - Refactor and improve pfn_array_alloc_pin/pfn_array_pin
   - Add a new tracepoint for failed vfio/ccw requests
   - Add a CCW translation improvement to accept more requests as valid
   - Bug fixes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/dasd: only use preallocated requests
  s390/dasd: reshuffle struct dasd_ccw_req
  s390/dasd: remove dasd_kmalloc_set_cda
  s390/dasd: move dasd_ccw_req to per request data
  s390/dasd: simplify locking in process_final_queue
  s390/cio: sanitize css_general_characteristics definition
  vfio: ccw: add tracepoints for interesting error paths
  vfio: ccw: set ccw->cda to NULL defensively
  vfio: ccw: refactor and improve pfn_array_alloc_pin()
  vfio: ccw: shorten kernel doc description for pfn_array_pin()
  vfio: ccw: push down unsupported IDA check
  vfio: ccw: fix error return in vfio_ccw_sch_event
  s390/archrandom: Rework arch random implementation.
  s390/net: add pnetid support

6 years agorevert "mm/memblock: add missing include <linux/bootmem.h>"
Andrew Morton [Mon, 18 Jun 2018 21:15:30 +0000 (14:15 -0700)]
revert "mm/memblock: add missing include <linux/bootmem.h>"

The patch fixed a W=1 warning but broke the ia64 build:

    CC      mm/memblock.o
  mm/memblock.c:1340: error: redefinition of `memblock_virt_alloc_try_nid_raw'
  ./include/linux/bootmem.h:335: error: previous definition of `memblock_virt_alloc_try_nid_raw' was here

Because inlcude/linux/bootmem.h says

#if defined(CONFIG_HAVE_MEMBLOCK) && defined(CONFIG_NO_BOOTMEM)

whereas mm/Makefile says

obj-$(CONFIG_HAVE_MEMBLOCK) += memblock.o

So revert 26f09e9b3a06 ("mm/memblock: add missing include
<linux/bootmem.h>") while a full fix can be worked on.

Fixes: 26f09e9b3a06 ("mm/memblock: add missing include <linux/bootmem.h>")
Reported-by: Tony Luck <tony.luck@gmail.com>
Cc: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agoMAINTAINERS: Add me as an x86 entry code maintainer
Andy Lutomirski [Mon, 18 Jun 2018 21:41:36 +0000 (14:41 -0700)]
MAINTAINERS: Add me as an x86 entry code maintainer

And update my email address.

Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agoIB/rxe: Fix missing completion for mem_reg work requests
Vijay Immanuel [Wed, 13 Jun 2018 01:16:05 +0000 (18:16 -0700)]
IB/rxe: Fix missing completion for mem_reg work requests

Run the completer task to post a work completion after processing
a memory registration or invalidate work request. This covers the
case where the memory registration or invalidate was the last work
request posted to the qp.

Signed-off-by: Vijay Immanuel <vijayi@attalasystems.com>
Reviewed-by: Yonatan Cohen <yonatanc@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
6 years agoRDMA/core: Save kernel caller name when creating CQ using ib_create_cq()
Bharat Potnuri [Fri, 15 Jun 2018 15:22:33 +0000 (20:52 +0530)]
RDMA/core: Save kernel caller name when creating CQ using ib_create_cq()

Few kernel applications like SCST-iSER create CQ using ib_create_cq(),
where accessing CQ structures using rdma restrack tool leads to below NULL
pointer dereference. This patch saves caller kernel module name similar to
ib_alloc_cq().

BUG: unable to handle kernel NULL pointer dereference at           (null)
IP: [<ffffffff8132ca70>] skip_spaces+0x30/0x30
PGD 738bac067 PUD 8533f0067 PMD 0
Oops: 0000 [#1] SMP
R10: ffff88017fc03300 R11: 0000000000000246 R12: 0000000000000000
R13: ffff88082fa5a668 R14: ffff88017475a000 R15: 0000000000000000
FS:  00002b32726582c0(0000) GS:ffff88087fc40000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 00000008491a1000 CR4: 00000000003607e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 [<ffffffffc05af69c>] ? fill_res_name_pid+0x7c/0x90 [ib_core]
 [<ffffffffc05af79f>] fill_res_cq_entry+0xef/0x170 [ib_core]
 [<ffffffffc05af4c4>] res_get_common_dumpit+0x3c4/0x480 [ib_core]
 [<ffffffffc05af5d3>] nldev_res_get_cq_dumpit+0x13/0x20 [ib_core]
 [<ffffffff815bc1e7>] netlink_dump+0x117/0x2e0
 [<ffffffff815bcb8b>] __netlink_dump_start+0x1ab/0x230
 [<ffffffffc059fead>] ibnl_rcv_msg+0x11d/0x1f0 [ib_core]
 [<ffffffffc05af5c0>] ? nldev_res_get_mr_dumpit+0x20/0x20 [ib_core]
 [<ffffffffc059fd90>] ? rdma_nl_multicast+0x30/0x30 [ib_core]
 [<ffffffff815bea49>] netlink_rcv_skb+0xa9/0xc0
 [<ffffffffc05a0018>] ibnl_rcv+0x98/0xb0 [ib_core]
 [<ffffffff815be132>] netlink_unicast+0xf2/0x1b0
 [<ffffffff815be50f>] netlink_sendmsg+0x31f/0x6a0
 [<ffffffff8156b580>] sock_sendmsg+0xb0/0xf0
 [<ffffffff816ace9e>] ? _raw_spin_unlock_bh+0x1e/0x20
 [<ffffffff8156f998>] ? release_sock+0x118/0x170
 [<ffffffff8156b731>] SYSC_sendto+0x121/0x1c0
 [<ffffffff81568340>] ? sock_alloc_file+0xa0/0x140
 [<ffffffff81221265>] ? __fd_install+0x25/0x60
 [<ffffffff8156c2ce>] SyS_sendto+0xe/0x10
 [<ffffffff816b6c2a>] system_call_fastpath+0x16/0x1b
RIP  [<ffffffff8132ca70>] skip_spaces+0x30/0x30
RSP <ffff88072be97760>
CR2: 0000000000000000

Cc: <stable@vger.kernel.org>
Fixes: f66c8ba4c9fa ("RDMA/core: Save kernel caller name when creating PD and CQ objects")
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
6 years agoMerge tag '4.18-rc1-more-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6
Linus Torvalds [Mon, 18 Jun 2018 05:28:19 +0000 (14:28 +0900)]
Merge tag '4.18-rc1-more-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull cifs fixes from Steve French:
 "Misc SMB3 fixes, including particularly important ones for signing,
  some minor documentation and debug improvements and another posix
  smb3.11 fix"

* tag '4.18-rc1-more-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: Fix invalid check in __cifs_calc_signature()
  cifs: Use correct packet length in SMB2_TRANSFORM header
  smb3: fix corrupt path in subdirs on smb311 with posix
  smb3: do not display empty interface list
  smb3: Fix mode on mkdir on smb311 mounts
  cifs: Fix kernel oops when traceSMB is enabled
  CIFS: dump every session iface info
  CIFS: parse and store info on iface queries
  CIFS: add iface info to struct cifs_ses
  CIFS: complete PDU definitions for interface queries
  CIFS: move default port definitions to cifsglob.h
  cifs: Fix encryption/signing
  cifs: update __smb_send_rqst() to take an array of requests
  cifs: remove smb2_send_recv()
  cifs: push rfc1002 generation down the stack
  smb3: increase initial number of credits requested to allow write
  cifs: minor documentation updates
  cifs: add lease tracking to the cached root fid
  smb3: note that smb3.11 posix extensions mount option is experimental

6 years agoMerge branch 'dmi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvar...
Linus Torvalds [Mon, 18 Jun 2018 04:43:09 +0000 (13:43 +0900)]
Merge branch 'dmi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging

Pull dmi update from Jean Delvare:
 "Expose SKU ID string as a DMI attribute"

* 'dmi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
  firmware: dmi: Add access to the SKU ID string

6 years agoFix Documentation build due to rename of main.c to mtrr.c
Randy Dunlap [Sun, 17 Jun 2018 00:09:41 +0000 (17:09 -0700)]
Fix Documentation build due to rename of main.c to mtrr.c

This fixes this documentation build error that is due to a file rename:

  Error: Cannot open file ../arch/x86/kernel/cpu/mtrr/main.c

Fixes: 0afe832e55a7 ("Merge branch 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agofirmware: dmi: Add access to the SKU ID string
Simon Glass [Sun, 17 Jun 2018 12:09:42 +0000 (14:09 +0200)]
firmware: dmi: Add access to the SKU ID string

This is used in some systems from user space for determining the identity
of the device.

Expose this as a file so that that user-space tools don't need to read
from /sys/firmware/dmi/tables/DMI

Signed-off-by: Simon Glass <sjg@chromium.org>
Signed-off-by: Jean Delvare <jdelvare@suse.de>
6 years agonet_sched: blackhole: tell upper qdisc about dropped packets
Konstantin Khlebnikov [Fri, 15 Jun 2018 10:27:31 +0000 (13:27 +0300)]
net_sched: blackhole: tell upper qdisc about dropped packets

When blackhole is used on top of classful qdisc like hfsc it breaks
qlen and backlog counters because packets are disappear without notice.

In HFSC non-zero qlen while all classes are inactive triggers warning:
WARNING: ... at net/sched/sch_hfsc.c:1393 hfsc_dequeue+0xba4/0xe90 [sch_hfsc]
and schedules watchdog work endlessly.

This patch return __NET_XMIT_BYPASS in addition to NET_XMIT_SUCCESS,
this flag tells upper layer: this packet is gone and isn't queued.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agobluetooth: hci_nokia: Don't include linux/unaligned/le_struct.h directly.
David S. Miller [Sat, 16 Jun 2018 23:38:55 +0000 (08:38 +0900)]
bluetooth: hci_nokia: Don't include linux/unaligned/le_struct.h directly.

This breaks the build as this header is not meant to be used in this
way.

./include/linux/unaligned/access_ok.h:8:28: error: redefinition of ‘get_unaligned_le16’
 static __always_inline u16 get_unaligned_le16(const void *p)
                            ^~~~~~~~~~~~~~~~~~
In file included from drivers/bluetooth/hci_nokia.c:32:
./include/linux/unaligned/le_struct.h:7:19: note: previous definition of ‘get_unaligned_le16’ was here
 static inline u16 get_unaligned_le16(const void *p)

Use asm/unaligned.h instead.

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoatm: Preserve value of skb->truesize when accounting to vcc
David Woodhouse [Sat, 16 Jun 2018 10:55:44 +0000 (11:55 +0100)]
atm: Preserve value of skb->truesize when accounting to vcc

ATM accounts for in-flight TX packets in sk_wmem_alloc of the VCC on
which they are to be sent. But it doesn't take ownership of those
packets from the sock (if any) which originally owned them. They should
remain owned by their actual sender until they've left the box.

There's a hack in pskb_expand_head() to avoid adjusting skb->truesize
for certain skbs, precisely to avoid messing up sk_wmem_alloc
accounting. Ideally that hack would cover the ATM use case too, but it
doesn't — skbs which aren't owned by any sock, for example PPP control
frames, still get their truesize adjusted when the low-level ATM driver
adds headroom.

This has always been an issue, it seems. The truesize of a packet
increases, and sk_wmem_alloc on the VCC goes negative. But this wasn't
for normal traffic, only for control frames. So I think we just got away
with it, and we probably needed to send 2GiB of LCP echo frames before
the misaccounting would ever have caused a problem and caused
atm_may_send() to start refusing packets.

Commit 14afee4b609 ("net: convert sock.sk_wmem_alloc from atomic_t to
refcount_t") did exactly what it was intended to do, and turned this
mostly-theoretical problem into a real one, causing PPPoATM to fail
immediately as sk_wmem_alloc underflows and atm_may_send() *immediately*
starts refusing to allow new packets.

The least intrusive solution to this problem is to stash the value of
skb->truesize that was accounted to the VCC, in a new member of the
ATM_SKB(skb) structure. Then in atm_pop_raw() subtract precisely that
value instead of the then-current value of skb->truesize.

Fixes: 158f323b9868 ("net: adjust skb->truesize in pskb_expand_head()")
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Tested-by: Kevin Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoLinux 4.18-rc1
Linus Torvalds [Sat, 16 Jun 2018 23:04:49 +0000 (08:04 +0900)]
Linux 4.18-rc1

6 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
David S. Miller [Sat, 16 Jun 2018 22:54:24 +0000 (07:54 +0900)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Daniel Borkmann says:

====================
pull-request: bpf 2018-06-16

The following pull-request contains BPF updates for your *net* tree.

The main changes are:

1) Fix a panic in devmap handling in generic XDP where return type
   of __devmap_lookup_elem() got changed recently but generic XDP
   code missed the related update, from Toshiaki.

2) Fix a freeze when BPF progs are loaded that include BPF to BPF
   calls when JIT is enabled where we would later bail out via error
   path w/o dropping kallsyms, and another one to silence syzkaller
   splats from locking prog read-only, from Daniel.

3) Fix a bug in test_offloads.py BPF selftest which must not assume
   that the underlying system have no BPF progs loaded prior to test,
   and one in bpftool to fix accuracy of program load time, from Jakub.

4) Fix a bug in bpftool's probe for availability of the bpf(2)
   BPF_TASK_FD_QUERY subcommand, from Yonghong.

5) Fix a regression in AF_XDP's XDP_SKB receive path where queue
   id check got erroneously removed, from Björn.

6) Fix missing state cleanup in BPF's xfrm tunnel test, from William.

7) Check tunnel type more accurately in BPF's tunnel collect metadata
   kselftest, from Jian.

8) Fix missing Kconfig fragments for BPF kselftests, from Anders.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge tag 'for-linus-20180616' of git://git.kernel.dk/linux-block
Linus Torvalds [Sat, 16 Jun 2018 20:37:55 +0000 (05:37 +0900)]
Merge tag 'for-linus-20180616' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
 "A collection of fixes that should go into -rc1. This contains:

   - bsg_open vs bsg_unregister race fix (Anatoliy)

   - NVMe pull request from Christoph, with fixes for regressions in
     this window, FC connect/reconnect path code unification, and a
     trace point addition.

   - timeout fix (Christoph)

   - remove a few unused functions (Christoph)

   - blk-mq tag_set reinit fix (Roman)"

* tag 'for-linus-20180616' of git://git.kernel.dk/linux-block:
  bsg: fix race of bsg_open and bsg_unregister
  block: remov blk_queue_invalidate_tags
  nvme-fabrics: fix and refine state checks in __nvmf_check_ready
  nvme-fabrics: handle the admin-only case properly in nvmf_check_ready
  nvme-fabrics: refactor queue ready check
  blk-mq: remove blk_mq_tagset_iter
  nvme: remove nvme_reinit_tagset
  nvme-fc: fix nulling of queue data on reconnect
  nvme-fc: remove reinit_request routine
  blk-mq: don't time out requests again that are in the timeout handler
  nvme-fc: change controllers first connect to use reconnect path
  nvme: don't rely on the changed namespace list log
  nvmet: free smart-log buffer after use
  nvme-rdma: fix error flow during mapping request data
  nvme: add bio remapping tracepoint
  nvme: fix NULL pointer dereference in nvme_init_subsystem
  blk-mq: reinit q->tag_set_list entry only after grace period

6 years agoMerge tag 'docs-broken-links' of git://linuxtv.org/mchehab/experimental
Linus Torvalds [Sat, 16 Jun 2018 20:25:18 +0000 (05:25 +0900)]
Merge tag 'docs-broken-links' of git://linuxtv.org/mchehab/experimental

Pull documentation fixes from Mauro Carvalho Chehab:
 "This solves a series of broken links for files under Documentation,
  and improves a script meant to detect such broken links (see
  scripts/documentation-file-ref-check).

  The changes on this series are:

   - can.rst: fix a footnote reference;

   - crypto_engine.rst: Fix two parsing warnings;

   - Fix a lot of broken references to Documentation/*;

   - improve the scripts/documentation-file-ref-check script, in order
     to help detecting/fixing broken references, preventing
     false-positives.

  After this patch series, only 33 broken references to doc files are
  detected by scripts/documentation-file-ref-check"

* tag 'docs-broken-links' of git://linuxtv.org/mchehab/experimental: (26 commits)
  fix a series of Documentation/ broken file name references
  Documentation: rstFlatTable.py: fix a broken reference
  ABI: sysfs-devices-system-cpu: remove a broken reference
  devicetree: fix a series of wrong file references
  devicetree: fix name of pinctrl-bindings.txt
  devicetree: fix some bindings file names
  MAINTAINERS: fix location of DT npcm files
  MAINTAINERS: fix location of some display DT bindings
  kernel-parameters.txt: fix pointers to sound parameters
  bindings: nvmem/zii: Fix location of nvmem.txt
  docs: Fix more broken references
  scripts/documentation-file-ref-check: check tools/*/Documentation
  scripts/documentation-file-ref-check: get rid of false-positives
  scripts/documentation-file-ref-check: hint: dash or underline
  scripts/documentation-file-ref-check: add a fix logic for DT
  scripts/documentation-file-ref-check: accept more wildcards at filenames
  scripts/documentation-file-ref-check: fix help message
  media: max2175: fix location of driver's companion documentation
  media: v4l: fix broken video4linux docs locations
  media: dvb: point to the location of the old README.dvb-usb file
  ...

6 years agoMerge tag 'fsnotify_for_v4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 16 Jun 2018 20:06:18 +0000 (05:06 +0900)]
Merge tag 'fsnotify_for_v4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs

Pull fsnotify updates from Jan Kara:
 "fsnotify cleanups unifying handling of different watch types.

  This is the shortened fsnotify series from Amir with the last five
  patches pulled out. Amir has modified those patches to not change
  struct inode but obviously it's too late for those to go into this
  merge window"

* tag 'fsnotify_for_v4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  fsnotify: add fsnotify_add_inode_mark() wrappers
  fanotify: generalize fanotify_should_send_event()
  fsnotify: generalize send_to_group()
  fsnotify: generalize iteration of marks by object type
  fsnotify: introduce marks iteration helpers
  fsnotify: remove redundant arguments to handle_event()
  fsnotify: use type id to identify connector object type

6 years agoMerge tag 'fbdev-v4.18' of git://github.com/bzolnier/linux
Linus Torvalds [Sat, 16 Jun 2018 20:00:24 +0000 (05:00 +0900)]
Merge tag 'fbdev-v4.18' of git://github.com/bzolnier/linux

Pull fbdev updates from Bartlomiej Zolnierkiewicz:
 "There is nothing really major here, few small fixes, some cleanups and
  dead drivers removal:

   - mark omapfb drivers as orphans in MAINTAINERS file (Tomi Valkeinen)

   - add missing module license tags to omap/omapfb driver (Arnd
     Bergmann)

   - add missing GPIOLIB dependendy to omap2/omapfb driver (Arnd
     Bergmann)

   - convert savagefb, aty128fb & radeonfb drivers to use msleep & co.
     (Jia-Ju Bai)

   - allow COMPILE_TEST build for viafb driver (media part was reviewed
     by media subsystem Maintainer)

   - remove unused MERAM support from sh_mobile_lcdcfb and shmob-drm
     drivers (drm parts were acked by shmob-drm driver Maintainer)

   - remove unused auo_k190xfb drivers

   - misc cleanups (Souptick Joarder, Wolfram Sang, Markus Elfring, Andy
     Shevchenko, Colin Ian King)"

* tag 'fbdev-v4.18' of git://github.com/bzolnier/linux: (26 commits)
  fb_omap2: add gpiolib dependency
  video/omap: add module license tags
  MAINTAINERS: make omapfb orphan
  video: fbdev: pxafb: match_string() conversion fixup
  video: fbdev: nvidia: fix spelling mistake: "scaleing" -> "scaling"
  video: fbdev: fix spelling mistake: "frambuffer" -> "framebuffer"
  video: fbdev: pxafb: Convert to use match_string() helper
  video: fbdev: via: allow COMPILE_TEST build
  video: fbdev: remove unused sh_mobile_meram driver
  drm: shmobile: remove unused MERAM support
  video: fbdev: sh_mobile_lcdcfb: remove unused MERAM support
  video: fbdev: remove unused auo_k190xfb drivers
  video: omap: Improve a size determination in omapfb_do_probe()
  video: sm501fb: Improve a size determination in sm501fb_probe()
  video: fbdev-MMP: Improve a size determination in path_init()
  video: fbdev-MMP: Delete an error message for a failed memory allocation in two functions
  video: auo_k190x: Delete an error message for a failed memory allocation in auok190x_common_probe()
  video: sh_mobile_lcdcfb: Delete an error message for a failed memory allocation in two functions
  video: sh_mobile_meram: Delete an error message for a failed memory allocation in sh_mobile_meram_probe()
  video: fbdev: sh_mobile_meram: Drop SUPERH platform dependency
  ...

6 years agoMerge branch 'afs-proc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Linus Torvalds [Sat, 16 Jun 2018 07:32:04 +0000 (16:32 +0900)]
Merge branch 'afs-proc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull AFS updates from Al Viro:
 "Assorted AFS stuff - ended up in vfs.git since most of that consists
  of David's AFS-related followups to Christoph's procfs series"

* 'afs-proc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  afs: Optimise callback breaking by not repeating volume lookup
  afs: Display manually added cells in dynamic root mount
  afs: Enable IPv6 DNS lookups
  afs: Show all of a server's addresses in /proc/fs/afs/servers
  afs: Handle CONFIG_PROC_FS=n
  proc: Make inline name size calculation automatic
  afs: Implement network namespacing
  afs: Mark afs_net::ws_cell as __rcu and set using rcu functions
  afs: Fix a Sparse warning in xdr_decode_AFSFetchStatus()
  proc: Add a way to make network proc files writable
  afs: Rearrange fs/afs/proc.c to remove remaining predeclarations.
  afs: Rearrange fs/afs/proc.c to move the show routines up
  afs: Rearrange fs/afs/proc.c by moving fops and open functions down
  afs: Move /proc management functions to the end of the file

6 years agoMerge branch 'work.compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Linus Torvalds [Sat, 16 Jun 2018 07:21:50 +0000 (16:21 +0900)]
Merge branch 'work.compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull compat updates from Al Viro:
 "Some biarch patches - getting rid of assorted (mis)uses of
  compat_alloc_user_space().

  Not much in that area this cycle..."

* 'work.compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  orangefs: simplify compat ioctl handling
  signalfd: lift sigmask copyin and size checks to callers of do_signalfd4()
  vmsplice(): lift importing iovec into vmsplice(2) and compat counterpart

6 years agoMerge branch 'work.aio' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Linus Torvalds [Sat, 16 Jun 2018 07:11:40 +0000 (16:11 +0900)]
Merge branch 'work.aio' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull aio fixes from Al Viro:
 "Assorted AIO followups and fixes"

* 'work.aio' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  eventpoll: switch to ->poll_mask
  aio: only return events requested in poll_mask() for IOCB_CMD_POLL
  eventfd: only return events requested in poll_mask()
  aio: mark __aio_sigset::sigmask const

6 years agocifs: Fix invalid check in __cifs_calc_signature()
Paulo Alcantara [Fri, 15 Jun 2018 18:58:00 +0000 (15:58 -0300)]
cifs: Fix invalid check in __cifs_calc_signature()

The following check would never evaluate to true:
  > if (i == 0 && iov[0].iov_len <= 4)

Because 'i' always starts at 1.

This patch fixes it and also move the header checks outside the for loop
- which makes more sense.

Signed-off-by: Paulo Alcantara <palcantara@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
6 years agocifs: Use correct packet length in SMB2_TRANSFORM header
Paulo Alcantara [Fri, 15 Jun 2018 13:22:44 +0000 (10:22 -0300)]
cifs: Use correct packet length in SMB2_TRANSFORM header

In smb3_init_transform_rq(), 'orig_len' was only counting the request
length, but forgot to count any data pages in the request.

Writing or creating files with the 'seal' mount option was broken.

In addition, do some code refactoring by exporting smb2_rqst_len() to
calculate the appropriate packet size and avoid duplicating the same
calculation all over the code.

The start of the io vector is either the rfc1002 length (4 bytes) or a
SMB2 header which is always > 4. Use this fact to check and skip the
rfc1002 length if requested.

Signed-off-by: Paulo Alcantara <palcantara@suse.de>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
6 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Fri, 15 Jun 2018 22:39:34 +0000 (07:39 +0900)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:

 1) Various netfilter fixlets from Pablo and the netfilter team.

 2) Fix regression in IPVS caused by lack of PMTU exceptions on local
    routes in ipv6, from Julian Anastasov.

 3) Check pskb_trim_rcsum for failure in DSA, from Zhouyang Jia.

 4) Don't crash on poll in TLS, from Daniel Borkmann.

 5) Revert SO_REUSE{ADDR,PORT} change, it regresses various things
    including Avahi mDNS. From Bart Van Assche.

 6) Missing of_node_put in qcom/emac driver, from Yue Haibing.

 7) We lack checking of the TCP checking in one special case during SYN
    receive, from Frank van der Linden.

 8) Fix module init error paths of mac80211 hwsim, from Johannes Berg.

 9) Handle 802.1ad properly in stmmac driver, from Elad Nachman.

10) Must grab HW caps before doing quirk checks in stmmac driver, from
    Jose Abreu.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (81 commits)
  net: stmmac: Run HWIF Quirks after getting HW caps
  neighbour: skip NTF_EXT_LEARNED entries during forced gc
  net: cxgb3: add error handling for sysfs_create_group
  tls: fix waitall behavior in tls_sw_recvmsg
  tls: fix use-after-free in tls_push_record
  l2tp: filter out non-PPP sessions in pppol2tp_tunnel_ioctl()
  l2tp: reject creation of non-PPP sessions on L2TPv2 tunnels
  mlxsw: spectrum_switchdev: Fix port_vlan refcounting
  mlxsw: spectrum_router: Align with new route replace logic
  mlxsw: spectrum_router: Allow appending to dev-only routes
  ipv6: Only emit append events for appended routes
  stmmac: added support for 802.1ad vlan stripping
  cfg80211: fix rcu in cfg80211_unregister_wdev
  mac80211: Move up init of TXQs
  mac80211_hwsim: fix module init error paths
  cfg80211: initialize sinfo in cfg80211_get_station
  nl80211: fix some kernel doc tag mistakes
  hv_netvsc: Fix the variable sizes in ipsecv2 and rsc offload
  rds: avoid unenecessary cong_update in loop transport
  l2tp: clean up stale tunnel or session in pppol2tp_connect's error path
  ...

6 years agoMerge tag 'modules-for-v4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu...
Linus Torvalds [Fri, 15 Jun 2018 22:36:39 +0000 (07:36 +0900)]
Merge tag 'modules-for-v4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux

Pull module updates from Jessica Yu:
 "Minor code cleanup and also allow sig_enforce param to be shown in
  sysfs with CONFIG_MODULE_SIG_FORCE"

* tag 'modules-for-v4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux:
  module: Allow to always show the status of modsign
  module: Do not access sig_enforce directly

6 years agoMerge branch 'for-linus-4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 15 Jun 2018 21:50:51 +0000 (06:50 +0900)]
Merge branch 'for-linus-4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml

Pull uml updates from Richard Weinberger:
 "Minor updates for UML:

   - fixes for our new vector network driver by Anton

   - initcall cleanup by Alexander

   - We have a new mailinglist, sourceforge.net sucks"

* 'for-linus-4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
  um: Fix raw interface options
  um: Fix initialization of vector queues
  um: remove uml initcalls
  um: Update mailing list address

6 years agoxdp: Fix handling of devmap in generic XDP
Toshiaki Makita [Thu, 14 Jun 2018 02:07:42 +0000 (11:07 +0900)]
xdp: Fix handling of devmap in generic XDP

Commit 67f29e07e131 ("bpf: devmap introduce dev_map_enqueue") changed
the return value type of __devmap_lookup_elem() from struct net_device *
to struct bpf_dtab_netdev * but forgot to modify generic XDP code
accordingly.

Thus generic XDP incorrectly used struct bpf_dtab_netdev where struct
net_device is expected, then skb->dev was set to invalid value.

v2:
- Fix compiler warning without CONFIG_BPF_SYSCALL.

Fixes: 67f29e07e131 ("bpf: devmap introduce dev_map_enqueue")
Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
6 years agoMerge tag 'riscv-for-linus-4.18-merge_window' of git://git.kernel.org/pub/scm/linux...
Linus Torvalds [Fri, 15 Jun 2018 21:42:43 +0000 (06:42 +0900)]
Merge tag 'riscv-for-linus-4.18-merge_window' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux

Pull RISC-V updates from Palmer Dabbelt:
 "This contains some small RISC-V updates I'd like to target for 4.18.

  They are all fairly small this time. Here's a short summary, there's
  more info in the commits/merges:

   - a fix to __clear_user to respect the passed arguments.

   - enough support for the perf subsystem to work with RISC-V's ISA
     defined performance counters.

   - support for sparse and cleanups suggested by it.

   - support for R_RISCV_32 (a relocation, not the 32-bit ISA).

   - some MAINTAINERS cleanups.

   - the addition of CONFIG_HVC_RISCV_SBI to our defconfig, as it's
     always present.

  I've given these a simple build+boot test"

* tag 'riscv-for-linus-4.18-merge_window' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux:
  RISC-V: Add CONFIG_HVC_RISCV_SBI=y to defconfig
  RISC-V: Handle R_RISCV_32 in modules
  riscv/ftrace: Export _mcount when DYNAMIC_FTRACE isn't set
  riscv: add riscv-specific predefines to CHECKFLAGS
  riscv: split the declaration of __copy_user
  riscv: no __user for probe_kernel_address()
  riscv: use NULL instead of a plain 0
  perf: riscv: Add Document for Future Porting Guide
  perf: riscv: preliminary RISC-V support
  MAINTAINERS: Update Albert's email, he's back at Berkeley
  MAINTAINERS: Add myself as a maintainer for SiFive's drivers
  riscv: Fix the bug in memory access fixup code

6 years agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds [Fri, 15 Jun 2018 21:37:04 +0000 (06:37 +0900)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull more kvm updates from Paolo Bonzini:
 "Mostly the PPC part of the release, but also switching to Arnd's fix
  for the hyperv config issue and a typo fix.

  Main PPC changes:

   - reimplement the MMIO instruction emulation

   - transactional memory support for PR KVM

   - improve radix page table handling"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (63 commits)
  KVM: x86: VMX: redo fix for link error without CONFIG_HYPERV
  KVM: x86: fix typo at kvm_arch_hardware_setup comment
  KVM: PPC: Book3S PR: Fix failure status setting in tabort. emulation
  KVM: PPC: Book3S PR: Enable use on POWER9 bare-metal hosts in HPT mode
  KVM: PPC: Book3S PR: Don't let PAPR guest set MSR hypervisor bit
  KVM: PPC: Book3S PR: Fix failure status setting in treclaim. emulation
  KVM: PPC: Book3S PR: Fix MSR setting when delivering interrupts
  KVM: PPC: Book3S PR: Handle additional interrupt types
  KVM: PPC: Book3S PR: Enable kvmppc_get/set_one_reg_pr() for HTM registers
  KVM: PPC: Book3S: Remove load/put vcpu for KVM_GET_REGS/KVM_SET_REGS
  KVM: PPC: Remove load/put vcpu for KVM_GET/SET_ONE_REG ioctl
  KVM: PPC: Move vcpu_load/vcpu_put down to each ioctl case in kvm_arch_vcpu_ioctl
  KVM: PPC: Book3S PR: Enable HTM for PR KVM for KVM_CHECK_EXTENSION ioctl
  KVM: PPC: Book3S PR: Support TAR handling for PR KVM HTM
  KVM: PPC: Book3S PR: Add guard code to prevent returning to guest with PR=0 and Transactional state
  KVM: PPC: Book3S PR: Add emulation for tabort. in privileged state
  KVM: PPC: Book3S PR: Add emulation for trechkpt.
  KVM: PPC: Book3S PR: Add emulation for treclaim.
  KVM: PPC: Book3S PR: Restore NV regs after emulating mfspr from TM SPRs
  KVM: PPC: Book3S PR: Always fail transactions in guest privileged state
  ...

6 years agoMerge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Linus Torvalds [Fri, 15 Jun 2018 21:35:02 +0000 (06:35 +0900)]
Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull virtio updates from Michael Tsirkin:
 "virtio, vhost: features, fixes

   - PCI virtual function support for virtio

   - DMA barriers for virtio strong barriers

   - bugfixes"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  virtio: update the comments for transport features
  virtio_pci: support enabling VFs
  vhost: fix info leak due to uninitialized memory
  virtio_ring: switch to dma_XX barriers for rpmsg

6 years agofix a series of Documentation/ broken file name references
Mauro Carvalho Chehab [Thu, 14 Jun 2018 15:34:32 +0000 (12:34 -0300)]
fix a series of Documentation/ broken file name references

As files move around, their previous links break. Fix the
references for them.

Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>