]> git.proxmox.com Git - mirror_ubuntu-kernels.git/log
mirror_ubuntu-kernels.git
5 years agonet/mlx5e: Replace deprecated PCI_DMA_TODEVICE
Maxim Mikityanskiy [Wed, 26 Jun 2019 14:35:29 +0000 (17:35 +0300)]
net/mlx5e: Replace deprecated PCI_DMA_TODEVICE

The PCI API for DMA is deprecated, and PCI_DMA_TODEVICE is just defined
to DMA_TO_DEVICE for backward compatibility. Just use DMA_TO_DEVICE.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
5 years agoxsk: Return the whole xdp_desc from xsk_umem_consume_tx
Maxim Mikityanskiy [Wed, 26 Jun 2019 14:35:28 +0000 (17:35 +0300)]
xsk: Return the whole xdp_desc from xsk_umem_consume_tx

Some drivers want to access the data transmitted in order to implement
acceleration features of the NICs. It is also useful in AF_XDP TX flow.

Change the xsk_umem_consume_tx API to return the whole xdp_desc, that
contains the data pointer, length and DMA address, instead of only the
latter two. Adapt the implementation of i40e and ixgbe to this change.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Cc: Björn Töpel <bjorn.topel@intel.com>
Cc: Magnus Karlsson <magnus.karlsson@intel.com>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
5 years agoxsk: Change the default frame size to 4096 and allow controlling it
Maxim Mikityanskiy [Wed, 26 Jun 2019 14:35:27 +0000 (17:35 +0300)]
xsk: Change the default frame size to 4096 and allow controlling it

The typical XDP memory scheme is one packet per page. Change the AF_XDP
frame size in libbpf to 4096, which is the page size on x86, to allow
libbpf to be used with the drivers with the packet-per-page scheme.

Add a command line option -f to xdpsock to allow to specify a custom
frame size.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
5 years agolibbpf: Support getsockopt XDP_OPTIONS
Maxim Mikityanskiy [Wed, 26 Jun 2019 14:35:26 +0000 (17:35 +0300)]
libbpf: Support getsockopt XDP_OPTIONS

Query XDP_OPTIONS in libbpf to determine if the zero-copy mode is active
or not.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
5 years agoxsk: Add getsockopt XDP_OPTIONS
Maxim Mikityanskiy [Wed, 26 Jun 2019 14:35:25 +0000 (17:35 +0300)]
xsk: Add getsockopt XDP_OPTIONS

Make it possible for the application to determine whether the AF_XDP
socket is running in zero-copy mode. To achieve this, add a new
getsockopt option XDP_OPTIONS that returns flags. The only flag
supported for now is the zero-copy mode indicator.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
5 years agoxsk: Add API to check for available entries in FQ
Maxim Mikityanskiy [Wed, 26 Jun 2019 14:35:24 +0000 (17:35 +0300)]
xsk: Add API to check for available entries in FQ

Add a function that checks whether the Fill Ring has the specified
amount of descriptors available. It will be useful for mlx5e that wants
to check in advance, whether it can allocate a bulk of RX descriptors,
to get the best performance.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
5 years agonet/mlx5e: Attach/detach XDP program safely
Maxim Mikityanskiy [Wed, 26 Jun 2019 14:35:23 +0000 (17:35 +0300)]
net/mlx5e: Attach/detach XDP program safely

When an XDP program is set, a full reopen of all channels happens in two
cases:

1. When there was no program set, and a new one is being set.

2. When there was a program set, but it's being unset.

The full reopen is necessary, because the channel parameters may change
if XDP is enabled or disabled. However, it's performed in an unsafe way:
if the new channels fail to open, the old ones are already closed, and
the interface goes down. Use the safe way to switch channels instead.
The same way is already used for other configuration changes.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
5 years agobpf: fix cgroup bpf release synchronization
Roman Gushchin [Tue, 25 Jun 2019 21:38:58 +0000 (14:38 -0700)]
bpf: fix cgroup bpf release synchronization

Since commit 4bfc0bb2c60e ("bpf: decouple the lifetime of cgroup_bpf
from cgroup itself"), cgroup_bpf release occurs asynchronously
(from a worker context), and before the release of the cgroup itself.

This introduced a previously non-existing race between the release
and update paths. E.g. if a leaf's cgroup_bpf is released and a new
bpf program is attached to the one of ancestor cgroups at the same
time. The race may result in double-free and other memory corruptions.

To fix the problem, let's protect the body of cgroup_bpf_release()
with cgroup_mutex, as it was effectively previously, when all this
code was called from the cgroup release path with cgroup mutex held.

Also let's skip cgroups, which have no chances to invoke a bpf
program, on the update path. If the cgroup bpf refcnt reached 0,
it means that the cgroup is offline (no attached processes), and
there are no associated sockets left. It means there is no point
in updating effective progs array! And it can lead to a leak,
if it happens after the release. So, let's skip such cgroups.

Big thanks for Tejun Heo for discovering and debugging of this problem!

Fixes: 4bfc0bb2c60e ("bpf: decouple the lifetime of cgroup_bpf from cgroup itself")
Reported-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
5 years agoxdp: Make __mem_id_disconnect static
YueHaibing [Tue, 25 Jun 2019 02:31:37 +0000 (10:31 +0800)]
xdp: Make __mem_id_disconnect static

Fix sparse warning:

net/core/xdp.c:88:6: warning:
 symbol '__mem_id_disconnect' was not declared. Should it be static?

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
5 years agosamples: bpf: make the use of xdp samples consistent
Daniel T. Lee [Tue, 25 Jun 2019 00:55:36 +0000 (09:55 +0900)]
samples: bpf: make the use of xdp samples consistent

Currently, each xdp samples are inconsistent in the use.
Most of the samples fetch the interface with it's name.
(ex. xdp1, xdp2skb, xdp_redirect_cpu, xdp_sample_pkts, etc.)

But some of the xdp samples are fetching the interface with
ifindex by command argument.

This commit enables xdp samples to fetch interface with it's name
without changing the original index interface fetching.
(<ifname|ifindex> fetching in the same way as xdp_sample_pkts_user.c does.)

Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
5 years agobpf: fix compiler warning with CONFIG_MODULES=n
Yonghong Song [Wed, 26 Jun 2019 00:35:03 +0000 (17:35 -0700)]
bpf: fix compiler warning with CONFIG_MODULES=n

With CONFIG_MODULES=n, the following compiler warning occurs:
  /data/users/yhs/work/net-next/kernel/trace/bpf_trace.c:605:13: warning:
      ‘do_bpf_send_signal’ defined but not used [-Wunused-function]
  static void do_bpf_send_signal(struct irq_work *entry)

The __init function send_signal_irq_work_init(), which calls
do_bpf_send_signal(), is defined under CONFIG_MODULES. Hence,
when CONFIG_MODULES=n, nobody calls static function do_bpf_send_signal(),
hence the warning.

The init function send_signal_irq_work_init() should work without
CONFIG_MODULES. Moving it out of CONFIG_MODULES
code section fixed the compiler warning, and also make bpf_send_signal()
helper work without CONFIG_MODULES.

Fixes: 8b401f9ed244 ("bpf: implement bpf_send_signal() helper")
Reported-By: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
5 years agoselftests/bpf: build tests with debug info
Andrii Nakryiko [Tue, 25 Jun 2019 22:56:28 +0000 (15:56 -0700)]
selftests/bpf: build tests with debug info

Non-BPF (user land) part of selftests is built without debug info making
occasional debugging with gdb terrible. Build with debug info always.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
5 years agolibbpf: fix max() type mismatch for 32bit
Ivan Khoronzhuk [Wed, 26 Jun 2019 10:38:37 +0000 (13:38 +0300)]
libbpf: fix max() type mismatch for 32bit

It fixes build error for 32bit caused by type mismatch
size_t/unsigned long.

Fixes: bf82927125dd ("libbpf: refactor map initialization")
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
5 years agoveth: Support bulk XDP_TX
Toshiaki Makita [Thu, 13 Jun 2019 09:39:59 +0000 (18:39 +0900)]
veth: Support bulk XDP_TX

XDP_TX is similar to XDP_REDIRECT as it essentially redirects packets to
the device itself. XDP_REDIRECT has bulk transmit mechanism to avoid the
heavy cost of indirect call but it also reduces lock acquisition on the
destination device that needs locks like veth and tun.

XDP_TX does not use indirect calls but drivers which require locks can
benefit from the bulk transmit for XDP_TX as well.

This patch introduces bulk transmit mechanism in veth using bulk queue
on stack, and improves XDP_TX performance by about 9%.

Here are single-core/single-flow XDP_TX test results. CPU consumptions
are taken from "perf report --no-child".

- Before:

  7.26 Mpps

  _raw_spin_lock  7.83%
  veth_xdp_xmit  12.23%

- After:

  7.94 Mpps

  _raw_spin_lock  1.08%
  veth_xdp_xmit   6.10%

v2:
- Use stack for bulk queue instead of a global variable.

Signed-off-by: Toshiaki Makita <toshiaki.makita1@gmail.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
5 years agoxdp: Add tracepoint for bulk XDP_TX
Toshiaki Makita [Thu, 13 Jun 2019 09:39:58 +0000 (18:39 +0900)]
xdp: Add tracepoint for bulk XDP_TX

This is introduced for admins to check what is happening on XDP_TX when
bulk XDP_TX is in use, which will be first introduced in veth in next
commit.

v3:
- Add act field to be in line with other XDP tracepoints.

Signed-off-by: Toshiaki Makita <toshiaki.makita1@gmail.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
5 years agoselftests, bpf: Add test for veth native XDP
Toshiaki Makita [Thu, 20 Jun 2019 02:23:23 +0000 (11:23 +0900)]
selftests, bpf: Add test for veth native XDP

Add a test case for veth native XDP. It checks if XDP_PASS, XDP_TX and
XDP_REDIRECT work properly.

  $ cd tools/testing/selftests/bpf
  $ make \
   TEST_CUSTOM_PROGS= \
   TEST_GEN_PROGS= \
   TEST_GEN_PROGS_EXTENDED= \
   TEST_PROGS_EXTENDED= \
   TEST_PROGS="test_xdp_veth.sh" \
   run_tests
  TAP version 13
  1..1
  # selftests: bpf: test_xdp_veth.sh
  # PING 10.1.1.33 (10.1.1.33) 56(84) bytes of data.
  # 64 bytes from 10.1.1.33: icmp_seq=1 ttl=64 time=0.073 ms
  #
  # --- 10.1.1.33 ping statistics ---
  # 1 packets transmitted, 1 received, 0% packet loss, time 0ms
  # rtt min/avg/max/mdev = 0.073/0.073/0.073/0.000 ms
  # selftests: xdp_veth [PASS]
  ok 1 selftests: bpf: test_xdp_veth.sh

Signed-off-by: Toshiaki Makita <toshiaki.makita1@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 years agoxsk: sample kernel code is now in libbpf
Eric Leblond [Fri, 21 Jun 2019 20:13:10 +0000 (22:13 +0200)]
xsk: sample kernel code is now in libbpf

Fix documentation that mention xdpsock_kern.c which has been
replaced by code embedded in libbpf.

Signed-off-by: Eric Leblond <eric@regit.org>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 years agolibbpf: fix spelling mistake "conflictling" -> "conflicting"
Colin Ian King [Wed, 19 Jun 2019 16:27:42 +0000 (17:27 +0100)]
libbpf: fix spelling mistake "conflictling" -> "conflicting"

There are several spelling mistakes in pr_warning messages. Fix these.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 years agosamples: bpf: Remove bpf_debug macro in favor of bpf_printk
Michal Rostecki [Tue, 18 Jun 2019 18:13:18 +0000 (20:13 +0200)]
samples: bpf: Remove bpf_debug macro in favor of bpf_printk

ibumad example was implementing the bpf_debug macro which is exactly the
same as the bpf_printk macro available in bpf_helpers.h. This change
makes use of bpf_printk instead of bpf_debug.

Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 years agoMerge branch 'ipv6-avoid-taking-refcnt-on-dst-during-route-lookup'
David S. Miller [Sun, 23 Jun 2019 20:24:17 +0000 (13:24 -0700)]
Merge branch 'ipv6-avoid-taking-refcnt-on-dst-during-route-lookup'

Wei Wang says:

====================
ipv6: avoid taking refcnt on dst during route lookup

Ipv6 route lookup code always grabs refcnt on the dst for the caller.
But for certain cases, grabbing refcnt is not always necessary if the
call path is rcu protected and the caller does not cache the dst.
Another issue in the route lookup logic is:
When there are multiple custom rules, we have to do the lookup into
each table associated to each rule individually. And when we can't
find the route in one table, we grab and release refcnt on
net->ipv6.ip6_null_entry before going to the next table.
This operation is completely redundant, and causes false issue because
net->ipv6.ip6_null_entry is a shared object.

This patch set introduces a new flag RT6_LOOKUP_F_DST_NOREF for route
lookup callers to set, to avoid any manipulation on the dst refcnt. And
it converts the major input and output path to use it.

The performance gain is noticable.
I ran synflood tests between 2 hosts under the same switch. Both hosts
have 20G mlx NIC, and 8 tx/rx queues.
Sender sends pure SYN flood with random src IPs and ports using trafgen.
Receiver has a simple TCP listener on the target port.
Both hosts have multiple custom rules:
- For incoming packets, only local table is traversed.
- For outgoing packets, 3 tables are traversed to find the route.
The packet processing rate on the receiver is as follows:
- Before the fix: 3.78Mpps
- After the fix:  5.50Mpps

v2->v3:
- Handled fib6_rule_lookup() when CONFIG_IPV6_MULTIPLE_TABLES is not
  configured in patch 03 (suggested by David Ahern)
- Removed the renaming of l3mdev_link_scope_lookup() in patch 05
  (suggested by David Ahern)
- Moved definition of ip6_route_output_flags() from an inline function
  in /net/ipv6/route.c to net/ipv6/route.c in order to address kbuild
  error in patch 05

v1->v2:
- Added a helper ip6_rt_put_flags() in patch 3 suggested by David Miller
====================

Reviewed-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoipv6: convert major tx path to use RT6_LOOKUP_F_DST_NOREF
Wei Wang [Fri, 21 Jun 2019 00:36:41 +0000 (17:36 -0700)]
ipv6: convert major tx path to use RT6_LOOKUP_F_DST_NOREF

For tx path, in most cases, we still have to take refcnt on the dst
cause the caller is caching the dst somewhere. But it still is
beneficial to make use of RT6_LOOKUP_F_DST_NOREF flag while doing the
route lookup. It is cause this flag prevents manipulating refcnt on
net->ipv6.ip6_null_entry when doing fib6_rule_lookup() to traverse each
routing table. The null_entry is a shared object and constant updates on
it cause false sharing.

We converted the current major lookup function ip6_route_output_flags()
to make use of RT6_LOOKUP_F_DST_NOREF.

Together with the change in the rx path, we see noticable performance
boost:
I ran synflood tests between 2 hosts under the same switch. Both hosts
have 20G mlx NIC, and 8 tx/rx queues.
Sender sends pure SYN flood with random src IPs and ports using trafgen.
Receiver has a simple TCP listener on the target port.
Both hosts have multiple custom rules:
- For incoming packets, only local table is traversed.
- For outgoing packets, 3 tables are traversed to find the route.
The packet processing rate on the receiver is as follows:
- Before the fix: 3.78Mpps
- After the fix:  5.50Mpps

Signed-off-by: Wei Wang <weiwan@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoipv6: convert rx data path to not take refcnt on dst
Wei Wang [Fri, 21 Jun 2019 00:36:40 +0000 (17:36 -0700)]
ipv6: convert rx data path to not take refcnt on dst

ip6_route_input() is the key function to do the route lookup in the
rx data path. All the callers to this function are already holding rcu
lock. So it is fairly easy to convert it to not take refcnt on the dst:
We pass in flag RT6_LOOKUP_F_DST_NOREF and do skb_dst_set_noref().
This saves a few atomic inc or dec operations and should boost
performance overall.
This also makes the logic more aligned with v4.

Signed-off-by: Wei Wang <weiwan@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoipv6: honor RT6_LOOKUP_F_DST_NOREF in rule lookup logic
Wei Wang [Fri, 21 Jun 2019 00:36:39 +0000 (17:36 -0700)]
ipv6: honor RT6_LOOKUP_F_DST_NOREF in rule lookup logic

This patch specifically converts the rule lookup logic to honor this
flag and not release refcnt when traversing each rule and calling
lookup() on each routing table.
Similar to previous patch, we also need some special handling of dst
entries in uncached list because there is always 1 refcnt taken for them
even if RT6_LOOKUP_F_DST_NOREF flag is set.

Signed-off-by: Wei Wang <weiwan@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoipv6: initialize rt6->rt6i_uncached in all pre-allocated dst entries
Wei Wang [Fri, 21 Jun 2019 00:36:38 +0000 (17:36 -0700)]
ipv6: initialize rt6->rt6i_uncached in all pre-allocated dst entries

Initialize rt6->rt6i_uncached on the following pre-allocated dsts:
net->ipv6.ip6_null_entry
net->ipv6.ip6_prohibit_entry
net->ipv6.ip6_blk_hole_entry

This is a preparation patch for later commits to be able to distinguish
dst entries in uncached list by doing:
!list_empty(rt6->rt6i_uncached)

Signed-off-by: Wei Wang <weiwan@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoipv6: introduce RT6_LOOKUP_F_DST_NOREF flag in ip6_pol_route()
Wei Wang [Fri, 21 Jun 2019 00:36:37 +0000 (17:36 -0700)]
ipv6: introduce RT6_LOOKUP_F_DST_NOREF flag in ip6_pol_route()

This new flag is to instruct the route lookup function to not take
refcnt on the dst entry. The user which does route lookup with this flag
must properly use rcu protection.
ip6_pol_route() is the major route lookup function for both tx and rx
path.
In this function:
Do not take refcnt on dst if RT6_LOOKUP_F_DST_NOREF flag is set, and
directly return the route entry. The caller should be holding rcu lock
when using this flag, and decide whether to take refcnt or not.

One note on the dst cache in the uncached_list:
As uncached_list does not consume refcnt, one refcnt is always returned
back to the caller even if RT6_LOOKUP_F_DST_NOREF flag is set.
Uncached dst is only possible in the output path. So in such call path,
caller MUST check if the dst is in the uncached_list before assuming
that there is no refcnt taken on the returned dst.

Signed-off-by: Wei Wang <weiwan@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agodoc: phy: document some PHY_INTERFACE_MODE_xxx settings
Russell King [Fri, 21 Jun 2019 14:59:09 +0000 (15:59 +0100)]
doc: phy: document some PHY_INTERFACE_MODE_xxx settings

There seems to be some confusion surrounding three PHY interface modes,
specifically 1000BASE-X, 2500BASE-X and SGMII.  Add some documentation
to phylib detailing precisely what these interface modes refer to.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoinet: fix compilation warnings in fqdir_pre_exit()
Qian Cai [Thu, 20 Jun 2019 14:52:40 +0000 (10:52 -0400)]
inet: fix compilation warnings in fqdir_pre_exit()

The linux-next commit "inet: fix various use-after-free in defrags
units" [1] introduced compilation warnings,

./include/net/inet_frag.h:117:1: warning: 'inline' is not at beginning
of declaration [-Wold-style-declaration]
 static void inline fqdir_pre_exit(struct fqdir *fqdir)
 ^~~~~~
In file included from ./include/net/netns/ipv4.h:10,
                 from ./include/net/net_namespace.h:20,
                 from ./include/linux/netdevice.h:38,
                 from ./include/linux/icmpv6.h:13,
                 from ./include/linux/ipv6.h:86,
                 from ./include/net/ipv6.h:12,
                 from ./include/rdma/ib_verbs.h:51,
                 from ./include/linux/mlx5/device.h:37,
                 from ./include/linux/mlx5/driver.h:51,
                 from
drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c:37:

[1] https://lore.kernel.org/netdev/20190618180900.88939-3-edumazet@google.com/

Signed-off-by: Qian Cai <cai@lca.pw>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dsa: mv88e6xxx: introduce helpers for handling chip->reg_lock
Rasmus Villemoes [Thu, 20 Jun 2019 13:50:42 +0000 (13:50 +0000)]
net: dsa: mv88e6xxx: introduce helpers for handling chip->reg_lock

This is a no-op that simply moves all locking and unlocking of
->reg_lock into trivial helpers. I did that to be able to easily add
some ad hoc instrumentation to those helpers to get some information
on contention and hold times of the mutex. Perhaps others want to do
something similar at some point, so this frees them from doing the
'sed -i' yoga, and have a much smaller 'git diff' while fiddling.

Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: ena: Fix bug where ring allocation backoff stopped too late
Sameeh Jubran [Sun, 23 Jun 2019 07:11:10 +0000 (10:11 +0300)]
net: ena: Fix bug where ring allocation backoff stopped too late

The current code of create_queues_with_size_backoff() allows the ring size
to become as small as ENA_MIN_RING_SIZE/2. This is a bug since we don't
want the queue ring to be smaller than ENA_MIN_RING_SIZE

In this commit we change the loop's termination condition to look at the
queue size of the next iteration instead of that of the current one,
so that the minimal queue size again becomes ENA_MIN_RING_SIZE.

Fixes: eece4d2ab9d2 ("net: ena: add ethtool function for changing io queue sizes")
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agohinic: fix dereference of pointer hwdev before it is null checked
Colin Ian King [Thu, 20 Jun 2019 13:27:51 +0000 (14:27 +0100)]
hinic: fix dereference of pointer hwdev before it is null checked

Currently pointer hwdev is dereferenced when assigning hwif before
hwdev is null checked.  Fix this by only derefencing hwdev after the
null check.

Addresses-Coverity: ("Dereference before null check")
Fixes: 4fdc51bb4e92 ("hinic: add support for rss parameters with ethtool")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'net-mediatek-Add-MT7621-TRGMII-mode-support'
David S. Miller [Sat, 22 Jun 2019 23:58:24 +0000 (16:58 -0700)]
Merge branch 'net-mediatek-Add-MT7621-TRGMII-mode-support'

René van Dorst says:

====================
net: mediatek: Add MT7621 TRGMII mode support

Like many other mediatek SOCs, the MT7621 SOC and the internal MT7530
switch both supports TRGMII mode. MT7621 TRGMII speed is fix 1200MBit.

v1->v2:
 - Fix breakage on non MT7621 SOC
 - Support 25MHz and 40MHz XTAL as MT7530 clocksource
====================

Tested-by: "Frank Wunderlich" <frank-w@public-files.de>
Acked-by: Sean Wang <sean.wang@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dsa: mt7530: Add MT7621 TRGMII mode support
René van Dorst [Thu, 20 Jun 2019 12:21:55 +0000 (14:21 +0200)]
net: dsa: mt7530: Add MT7621 TRGMII mode support

This patch add support TRGMII mode for MT7621 internal MT7530 switch.
MT7621 TRGMII has only one fix speed mode of 1200MBit.

Also adding support for mt7530 25MHz and 40MHz crystal clocksource.
Values are based on Banana Pi R2 bsp [1].

Don't change MT7623 registers on a MT7621 device.

[1] https://github.com/BPI-SINOVOIP/BPI-R2-bsp/blob/master/linux-mt/drivers/net/ethernet/mediatek/gsw_mt7623.c#L769

Signed-off-by: René van Dorst <opensource@vdorst.com>
Tested-by: Frank Wunderlich <frank-w@public-files.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: ethernet: mediatek: Add MT7621 TRGMII mode support
René van Dorst [Thu, 20 Jun 2019 12:21:54 +0000 (14:21 +0200)]
net: ethernet: mediatek: Add MT7621 TRGMII mode support

MT7621 SOC also supports TRGMII.
TRGMII speed is 1200MBit.

Signed-off-by: René van Dorst <opensource@vdorst.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonetns: restore ops before calling ops_exit_list
Li RongQing [Thu, 20 Jun 2019 11:24:40 +0000 (19:24 +0800)]
netns: restore ops before calling ops_exit_list

ops has been iterated to first element when call pre_exit, and
it needs to restore from save_ops, not save ops to save_ops

Fixes: d7d99872c144 ("netns: add pre_exit method to struct pernet_operations")
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoipv6: Error when route does not have any valid nexthops
Ido Schimmel [Thu, 20 Jun 2019 09:10:21 +0000 (12:10 +0300)]
ipv6: Error when route does not have any valid nexthops

When user space sends invalid information in RTA_MULTIPATH, the nexthop
list in ip6_route_multipath_add() is empty and 'rt_notif' is set to
NULL.

The code that emits the in-kernel notifications does not check for this
condition, which results in a NULL pointer dereference [1].

Fix this by bailing earlier in the function if the parsed nexthop list
is empty. This is consistent with the corresponding IPv4 code.

v2:
* Check if parsed nexthop list is empty and bail with extack set

[1]
kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] PREEMPT SMP KASAN
CPU: 0 PID: 9190 Comm: syz-executor149 Not tainted 5.2.0-rc5+ #38
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
RIP: 0010:call_fib6_multipath_entry_notifiers+0xd1/0x1a0
net/ipv6/ip6_fib.c:396
Code: 8b b5 30 ff ff ff 48 c7 85 68 ff ff ff 00 00 00 00 48 c7 85 70 ff ff
ff 00 00 00 00 89 45 88 4c 89 e0 48 c1 e8 03 4c 89 65 80 <42> 80 3c 28 00
0f 85 9a 00 00 00 48 b8 00 00 00 00 00 fc ff df 4d
RSP: 0018:ffff88809788f2c0 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 1ffff11012f11e59 RCX: 00000000ffffffff
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffff88809788f390 R08: ffff88809788f8c0 R09: 000000000000000c
R10: ffff88809788f5d8 R11: ffff88809788f527 R12: 0000000000000000
R13: dffffc0000000000 R14: ffff88809788f8c0 R15: ffffffff89541d80
FS:  000055555632c880(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020000080 CR3: 000000009ba7c000 CR4: 00000000001406f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
  ip6_route_multipath_add+0xc55/0x1490 net/ipv6/route.c:5094
  inet6_rtm_newroute+0xed/0x180 net/ipv6/route.c:5208
  rtnetlink_rcv_msg+0x463/0xb00 net/core/rtnetlink.c:5219
  netlink_rcv_skb+0x177/0x450 net/netlink/af_netlink.c:2477
  rtnetlink_rcv+0x1d/0x30 net/core/rtnetlink.c:5237
  netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
  netlink_unicast+0x531/0x710 net/netlink/af_netlink.c:1328
  netlink_sendmsg+0x8ae/0xd70 net/netlink/af_netlink.c:1917
  sock_sendmsg_nosec net/socket.c:646 [inline]
  sock_sendmsg+0xd7/0x130 net/socket.c:665
  ___sys_sendmsg+0x803/0x920 net/socket.c:2286
  __sys_sendmsg+0x105/0x1d0 net/socket.c:2324
  __do_sys_sendmsg net/socket.c:2333 [inline]
  __se_sys_sendmsg net/socket.c:2331 [inline]
  __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2331
  do_syscall_64+0xfd/0x680 arch/x86/entry/common.c:301
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x4401f9
Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7
48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff
ff 0f 83 fb 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007ffc09fd0028 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 00000000004401f9
RDX: 0000000000000000 RSI: 0000000020000080 RDI: 0000000000000003
RBP: 00000000006ca018 R08: 0000000000000000 R09: 00000000004002c8
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000401a80
R13: 0000000000401b10 R14: 0000000000000000 R15: 0000000000000000

Reported-by: syzbot+382566d339d52cd1a204@syzkaller.appspotmail.com
Fixes: ebee3cad835f ("ipv6: Add IPv6 multipath notifications for add / replace")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agofjes: no need to check return value of debugfs_create functions
Greg Kroah-Hartman [Thu, 20 Jun 2019 07:31:06 +0000 (09:31 +0200)]
fjes: no need to check return value of debugfs_create functions

When calling debugfs functions, there is no need to ever check the
return value.  The function can work or not, but the code logic should
never do something different based on this.

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Yangtao Li <tiny.windzz@gmail.com>
Cc: netdev@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: fastopen: robustness and endianness fixes for SipHash
Ard Biesheuvel [Wed, 19 Jun 2019 21:46:28 +0000 (23:46 +0200)]
net: fastopen: robustness and endianness fixes for SipHash

Some changes to the TCP fastopen code to make it more robust
against future changes in the choice of key/cookie size, etc.

- Instead of keeping the SipHash key in an untyped u8[] buffer
  and casting it to the right type upon use, use the correct
  type directly. This ensures that the key will appear at the
  correct alignment if we ever change the way these data
  structures are allocated. (Currently, they are only allocated
  via kmalloc so they always appear at the correct alignment)

- Use DIV_ROUND_UP when sizing the u64[] array to hold the
  cookie, so it is always of sufficient size, even if
  TCP_FASTOPEN_COOKIE_MAX is no longer a multiple of 8.

- Drop the 'len' parameter from the tcp_fastopen_reset_cipher()
  function, which is no longer used.

- Add endian swabbing when setting the keys and calculating the hash,
  to ensure that cookie values are the same for a given key and
  source/destination address pair regardless of the endianness of
  the server.

Note that none of these are functional changes wrt the current
state of the code, with the exception of the swabbing, which only
affects big endian systems.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
David S. Miller [Sat, 22 Jun 2019 12:59:24 +0000 (08:59 -0400)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Minor SPDX change conflict.

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Sat, 22 Jun 2019 05:23:35 +0000 (22:23 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:

 1) Fix leak of unqueued fragments in ipv6 nf_defrag, from Guillaume
    Nault.

 2) Don't access the DDM interface unless the transceiver implements it
    in bnx2x, from Mauro S. M. Rodrigues.

 3) Don't double fetch 'len' from userspace in sock_getsockopt(), from
    JingYi Hou.

 4) Sign extension overflow in lio_core, from Colin Ian King.

 5) Various netem bug fixes wrt. corrupted packets from Jakub Kicinski.

 6) Fix epollout hang in hvsock, from Sunil Muthuswamy.

 7) Fix regression in default fib6_type, from David Ahern.

 8) Handle memory limits in tcp_fragment more appropriately, from Eric
    Dumazet.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (24 commits)
  tcp: refine memory limit test in tcp_fragment()
  inet: clear num_timeout reqsk_alloc()
  net: mvpp2: debugfs: Add pmap to fs dump
  ipv6: Default fib6_type to RTN_UNICAST when not set
  net: hns3: Fix inconsistent indenting
  net/af_iucv: always register net_device notifier
  net/af_iucv: build proper skbs for HiperTransport
  net/af_iucv: remove GFP_DMA restriction for HiperTransport
  net: dsa: mv88e6xxx: fix shift of FID bits in mv88e6185_g1_vtu_loadpurge()
  hvsock: fix epollout hang from race condition
  net/udp_gso: Allow TX timestamp with UDP GSO
  net: netem: fix use after free and double free with packet corruption
  net: netem: fix backlog accounting for corrupted GSO frames
  net: lio_core: fix potential sign-extension overflow on large shift
  tipc: pass tunnel dev as NULL to udp_tunnel(6)_xmit_skb
  ip6_tunnel: allow not to count pkts on tstats by passing dev as NULL
  ip_tunnel: allow not to count pkts on tstats by setting skb's dev to NULL
  tun: wake up waitqueues after IFF_UP is set
  net: remove duplicate fetch in sock_getsockopt
  tipc: fix issues with early FAILOVER_MSG from peer
  ...

5 years agoMerge branch 'PCI-let-pci_disable_link_state-propagate-errors'
David S. Miller [Sat, 22 Jun 2019 02:05:42 +0000 (22:05 -0400)]
Merge branch 'PCI-let-pci_disable_link_state-propagate-errors'

Heiner Kallweit says:

====================
PCI: let pci_disable_link_state propagate errors

Drivers like r8169 rely on pci_disable_link_state() having disabled
certain ASPM link states. If OS can't control ASPM then
pci_disable_link_state() turns into a no-op w/o informing the caller.
The driver therefore may falsely assume the respective ASPM link
states are disabled. Let pci_disable_link_state() propagate errors
to the caller, enabling the caller to react accordingly.

I'd propose to let this series go through the netdev tree if the PCI
core extension is acked by the PCI people.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agor8169: don't activate ASPM in chip if OS can't control ASPM
Heiner Kallweit [Tue, 18 Jun 2019 21:14:50 +0000 (23:14 +0200)]
r8169: don't activate ASPM in chip if OS can't control ASPM

Certain chip version / board combinations have massive problems if
ASPM is active. If BIOS enables ASPM and doesn't let OS control it,
then we may have a problem with the current code. Therefore check the
return code of pci_disable_link_state() and don't enable ASPM in the
network chip if OS can't control ASPM.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoPCI: let pci_disable_link_state propagate errors
Heiner Kallweit [Tue, 18 Jun 2019 21:13:48 +0000 (23:13 +0200)]
PCI: let pci_disable_link_state propagate errors

Drivers may rely on pci_disable_link_state() having disabled certain
ASPM link states. If OS can't control ASPM then pci_disable_link_state()
turns into a no-op w/o informing the caller. The driver therefore may
falsely assume the respective ASPM link states are disabled.
Let pci_disable_link_state() propagate errors to the caller, enabling
the caller to react accordingly.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agotcp: refine memory limit test in tcp_fragment()
Eric Dumazet [Fri, 21 Jun 2019 13:09:55 +0000 (06:09 -0700)]
tcp: refine memory limit test in tcp_fragment()

tcp_fragment() might be called for skbs in the write queue.

Memory limits might have been exceeded because tcp_sendmsg() only
checks limits at full skb (64KB) boundaries.

Therefore, we need to make sure tcp_fragment() wont punish applications
that might have setup very low SO_SNDBUF values.

Fixes: f070ef2ac667 ("tcp: tcp_fragment() should apply sane memory limits")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Christoph Paasch <cpaasch@apple.com>
Tested-by: Christoph Paasch <cpaasch@apple.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Linus Torvalds [Fri, 21 Jun 2019 21:47:09 +0000 (14:47 -0700)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma fixes from Doug Ledford:
 "This is probably our last -rc pull request. We don't have anything
  else outstanding at the moment anyway, and with the summer months on
  us and people taking trips, I expect the next weeks leading up to the
  merge window to be pretty calm and sedate.

  This has two simple, no brainer fixes for the EFA driver.

  Then it has ten not quite so simple fixes for the hfi1 driver. The
  problem with them is that they aren't simply one liner typo fixes.
  They're still fixes, but they're more complex issues like livelock
  under heavy load where the answer was to change work queue usage and
  spinlock usage to resolve the problem, or issues with orphaned
  requests during certain types of failures like link down which
  required some more complex work to fix too. They all look like
  legitimate fixes to me, they just aren't small like I wish they were.

  Summary:

   - 2 minor EFA fixes

   - 10 hfi1 fixes related to scaling issues"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  RDMA/efa: Handle mmap insertions overflow
  RDMA/efa: Fix success return value in case of error
  IB/hfi1: Handle port down properly in pio
  IB/hfi1: Handle wakeup of orphaned QPs for pio
  IB/hfi1: Wakeup QPs orphaned on wait list after flush
  IB/hfi1: Use aborts to trigger RC throttling
  IB/hfi1: Create inline to get extended headers
  IB/hfi1: Silence txreq allocation warnings
  IB/hfi1: Avoid hardlockup with flushlist_lock
  IB/hfi1: Correct tid qp rcd to match verbs context
  IB/hfi1: Close PSM sdma_progress sleep window
  IB/hfi1: Validate fault injection opcode user input

5 years agoMerge tag 'nfs-for-5.2-3' of git://git.linux-nfs.org/projects/anna/linux-nfs
Linus Torvalds [Fri, 21 Jun 2019 20:45:41 +0000 (13:45 -0700)]
Merge tag 'nfs-for-5.2-3' of git://git.linux-nfs.org/projects/anna/linux-nfs

Pull more NFS client fixes from Anna Schumaker:
 "These are mostly refcounting issues that people have found recently.
  The revert fixes a suspend recovery performance issue.

   - SUNRPC: Fix a credential refcount leak

   - Revert "SUNRPC: Declare RPC timers as TIMER_DEFERRABLE"

   - SUNRPC: Fix xps refcount imbalance on the error path

   - NFS4: Only set creation opendata if O_CREAT"

* tag 'nfs-for-5.2-3' of git://git.linux-nfs.org/projects/anna/linux-nfs:
  SUNRPC: Fix a credential refcount leak
  Revert "SUNRPC: Declare RPC timers as TIMER_DEFERRABLE"
  net :sunrpc :clnt :Fix xps refcount imbalance on the error path
  NFS4: Only set creation opendata if O_CREAT

5 years agox86/vdso: Prevent segfaults due to hoisted vclock reads
Andy Lutomirski [Fri, 21 Jun 2019 15:43:04 +0000 (08:43 -0700)]
x86/vdso: Prevent segfaults due to hoisted vclock reads

GCC 5.5.0 sometimes cleverly hoists reads of the pvclock and/or hvclock
pages before the vclock mode checks.  This creates a path through
vclock_gettime() in which no vclock is enabled at all (due to disabled
TSC on old CPUs, for example) but the pvclock or hvclock page
nevertheless read.  This will segfault on bare metal.

This fixes commit 459e3a21535a ("gcc-9: properly declare the
{pv,hv}clock_page storage") in the sense that, before that commit, GCC
didn't seem to generate the offending code.  There was nothing wrong
with that commit per se, and -stable maintainers should backport this to
all supported kernels regardless of whether the offending commit was
present, since the same crash could just as easily be triggered by the
phase of the moon.

On GCC 9.1.1, this doesn't seem to affect the generated code at all, so
I'm not too concerned about performance regressions from this fix.

Cc: stable@vger.kernel.org
Cc: x86@kernel.org
Cc: Borislav Petkov <bp@alien8.de>
Reported-by: Duncan Roe <duncan_roe@optusnet.com.au>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agoSUNRPC: Fix a credential refcount leak
Trond Myklebust [Thu, 20 Jun 2019 14:47:40 +0000 (10:47 -0400)]
SUNRPC: Fix a credential refcount leak

All callers of __rpc_clone_client() pass in a value for args->cred,
meaning that the credential gets assigned and referenced in
the call to rpc_new_client().

Reported-by: Ido Schimmel <idosch@idosch.org>
Fixes: 79caa5fad47c ("SUNRPC: Cache cred of process creating the rpc_client")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Tested-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
5 years agoRevert "SUNRPC: Declare RPC timers as TIMER_DEFERRABLE"
Anna Schumaker [Tue, 18 Jun 2019 18:57:33 +0000 (14:57 -0400)]
Revert "SUNRPC: Declare RPC timers as TIMER_DEFERRABLE"

Jon Hunter reports:
  "I have been noticing intermittent failures with a system suspend test on
   some of our machines that have a NFS mounted root file-system. Bisecting
   this issue points to your commit 431235818bc3 ("SUNRPC: Declare RPC
   timers as TIMER_DEFERRABLE") and reverting this on top of v5.2-rc3 does
   appear to resolve the problem.

   The cause of the suspend failure appears to be a long delay observed
   sometimes when resuming from suspend, and this is causing our test to
   timeout."

This reverts commit 431235818bc3a919ca7487500c67c3144feece80.

Reported-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
5 years agonet :sunrpc :clnt :Fix xps refcount imbalance on the error path
Lin Yi [Mon, 10 Jun 2019 02:16:56 +0000 (10:16 +0800)]
net :sunrpc :clnt :Fix xps refcount imbalance on the error path

rpc_clnt_add_xprt take a reference to struct rpc_xprt_switch, but forget
to release it before return, may lead to a memory leak.

Signed-off-by: Lin Yi <teroincn@163.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
5 years agoNFS4: Only set creation opendata if O_CREAT
Benjamin Coddington [Fri, 7 Jun 2019 10:37:30 +0000 (06:37 -0400)]
NFS4: Only set creation opendata if O_CREAT

We can end up in nfs4_opendata_alloc during task exit, in which case
current->fs has already been cleaned up.  This leads to a crash in
current_umask().

Fix this by only setting creation opendata if we are actually doing an open
with O_CREAT.  We can drop the check for NULL nfs4_open_createattrs, since
O_CREAT will never be set for the recovery path.

Suggested-by: Trond Myklebust <trondmy@hammerspace.com>
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
5 years agoMerge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm
Linus Torvalds [Fri, 21 Jun 2019 18:11:30 +0000 (11:11 -0700)]
Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm

Pull ARM fix from Russell King:
 "Just one ARM fix this time around for Jason Donenfeld, fixing a
  problem with the VDSO generation on big endian"

* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
  ARM: 8867/1: vdso: pass --be8 to linker if necessary

5 years agoMerge tag 'drm-fixes-2019-06-21' of git://anongit.freedesktop.org/drm/drm
Linus Torvalds [Fri, 21 Jun 2019 18:03:33 +0000 (11:03 -0700)]
Merge tag 'drm-fixes-2019-06-21' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "Just catching up on the week since back from holidays, everything
  seems quite sane.

  core:
   - copy_to_user fix for really legacy codepaths.

  vmwgfx:
   - two dma fixes
   - one virt hw interaction fix

  i915:
   - modesetting fix
   - gvt fix

  panfrost:
   - BO unmapping fix

  imx:
   - image converter fixes"

* tag 'drm-fixes-2019-06-21' of git://anongit.freedesktop.org/drm/drm:
  drm/i915: Don't clobber M/N values during fastset check
  drm: return -EFAULT if copy_to_user() fails
  drm/panfrost: Make sure a BO is only unmapped when appropriate
  drm/i915/gvt: ignore unexpected pvinfo write
  gpu: ipu-v3: image-convert: Fix image downsize coefficients
  gpu: ipu-v3: image-convert: Fix input bytesperline for packed formats
  gpu: ipu-v3: image-convert: Fix input bytesperline width/height align
  drm/vmwgfx: fix a warning due to missing dma_parms
  drm/vmwgfx: Honor the sg list segment size limitation
  drm/vmwgfx: Use the backdoor port if the HB port is not available

5 years agoMerge tag 'staging-5.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh...
Linus Torvalds [Fri, 21 Jun 2019 17:20:19 +0000 (10:20 -0700)]
Merge tag 'staging-5.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging

Pull staging/IIO/counter fixes from Greg KH:
 "Here are some small driver bugfixes for some staging/iio/counter
  drivers.

  Staging and IIO have been lumped together for a while, as those
  subsystems cross the areas a log, and counter is used by IIO, so
  that's why they are all in one pull request here.

  These are small fixes for reported issues in some iio drivers, the
  erofs filesystem, and a build issue for counter code.

  All have been in linux-next with no reported issues"

* tag 'staging-5.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  staging: erofs: add requirements field in superblock
  counter/ftm-quaddec: Add missing dependencies in Kconfig
  staging: iio: adt7316: Fix build errors when GPIOLIB is not set
  iio: temperature: mlx90632 Relax the compatibility check
  iio: imu: st_lsm6dsx: fix PM support for st_lsm6dsx i2c controller
  staging:iio:ad7150: fix threshold mode config bit

5 years agoMerge tag 'char-misc-5.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh...
Linus Torvalds [Fri, 21 Jun 2019 17:18:16 +0000 (10:18 -0700)]
Merge tag 'char-misc-5.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc driver fixes from Greg KH:
 "Here are a number of small driver fixes for 5.2-rc6

  Nothing major, just fixes for reported issues:
   - soundwire fixes
   - thunderbolt fixes
   - MAINTAINERS update for fpga maintainer change
   - binder bugfix
   - habanalabs 64bit pointer fix
   - documentation updates

  All of these have been in linux-next with no reported issues"

* tag 'char-misc-5.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  habanalabs: use u64_to_user_ptr() for reading user pointers
  doc: fix documentation about UIO_MEM_LOGICAL using
  MAINTAINERS / Documentation: Thorsten Scherer is the successor of Gavin Schenk
  docs: fb: Add TER16x32 to the available font names
  MAINTAINERS: fpga: hand off maintainership to Moritz
  thunderbolt: Implement CIO reset correctly for Titan Ridge
  binder: fix possible UAF when freeing buffer
  thunderbolt: Make sure device runtime resume completes before taking domain lock
  soundwire: intel: set dai min and max channels correctly
  soundwire: stream: fix bad unlock balance
  soundwire: stream: fix out of boundary access on port properties

5 years agoMerge tag 'usb-5.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Linus Torvalds [Fri, 21 Jun 2019 17:16:41 +0000 (10:16 -0700)]
Merge tag 'usb-5.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB fixes from Greg KH:
 "Here are four small USB fixes for 5.2-rc6.

  They include two xhci bugfixes, a chipidea fix, and a small dwc2 fix.
  Nothing major, just nice things to get resolved for reported issues.

  All have been in linux-next with no reported issues"

* tag 'usb-5.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  xhci: detect USB 3.2 capable host controllers correctly
  usb: xhci: Don't try to recover an endpoint if port is in error state.
  usb: dwc2: Use generic PHY width in params setup
  usb: chipidea: udc: workaround for endpoint conflict issue

5 years agoMerge tag 'spdx-5.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh...
Linus Torvalds [Fri, 21 Jun 2019 16:58:42 +0000 (09:58 -0700)]
Merge tag 'spdx-5.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx

Pull still more SPDX updates from Greg KH:
 "Another round of SPDX updates for 5.2-rc6

  Here is what I am guessing is going to be the last "big" SPDX update
  for 5.2. It contains all of the remaining GPLv2 and GPLv2+ updates
  that were "easy" to determine by pattern matching. The ones after this
  are going to be a bit more difficult and the people on the spdx list
  will be discussing them on a case-by-case basis now.

  Another 5000+ files are fixed up, so our overall totals are:
Files checked:            64545
Files with SPDX:          45529

  Compared to the 5.1 kernel which was:
Files checked:            63848
Files with SPDX:          22576

  This is a huge improvement.

  Also, we deleted another 20000 lines of boilerplate license crud,
  always nice to see in a diffstat"

* tag 'spdx-5.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx: (65 commits)
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 507
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 506
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 505
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 504
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 503
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 502
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 501
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 499
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 498
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 497
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 496
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 495
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 491
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 490
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 489
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 488
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 487
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 486
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 485
  ...

5 years agoMerge tag '5.2-rc5-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6
Linus Torvalds [Fri, 21 Jun 2019 16:51:44 +0000 (09:51 -0700)]
Merge tag '5.2-rc5-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull cifs fixes from Steve French:
 "Four small SMB3 fixes, all for stable"

* tag '5.2-rc5-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: fix GlobalMid_Lock bug in cifs_reconnect
  SMB3: retry on STATUS_INSUFFICIENT_RESOURCES instead of failing write
  cifs: add spinlock for the openFileList to cifsInodeInfo
  cifs: fix panic in smb2_reconnect

5 years agoMerge tag 'imx-drm-fixes-2019-06-20' of git://git.pengutronix.de/git/pza/linux into...
Dave Airlie [Fri, 21 Jun 2019 01:44:20 +0000 (11:44 +1000)]
Merge tag 'imx-drm-fixes-2019-06-20' of git://git.pengutronix.de/git/pza/linux into drm-fixes

drm/imx: ipu-v3 image converter fixes

This series fixes input buffer alignment and downsizer configuration
to adhere to IPU mem2mem CSC/scaler hardware restrictions in certain
downscaling ratios.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Philipp Zabel <p.zabel@pengutronix.de>
Link: https://patchwork.freedesktop.org/patch/msgid/1561040798.14349.20.camel@pengutronix.de
5 years agoMerge tag 'drm-intel-fixes-2019-06-20' of git://anongit.freedesktop.org/drm/drm-intel...
Dave Airlie [Fri, 21 Jun 2019 01:39:14 +0000 (11:39 +1000)]
Merge tag 'drm-intel-fixes-2019-06-20' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes

drm/i915 fixes for v5.2-rc6:
- GVT: Fix reserved PVINFO register write (Weinan)
- Avoid clobbering M/N values in fastset fuzzy checks (Ville)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/87pnn8sbdp.fsf@intel.com
5 years agoMerge tag 'drm-misc-fixes-2019-06-19' of git://anongit.freedesktop.org/drm/drm-misc...
Dave Airlie [Fri, 21 Jun 2019 01:35:12 +0000 (11:35 +1000)]
Merge tag 'drm-misc-fixes-2019-06-19' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes

panfrost- Only unmap BO's if they're mapped (Boris)
core- Handle buffer desc copy_to_user failure properly (Dan)

Cc: Boris Brezillon <boris.brezillon@collabora.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Sean Paul <sean@poorly.run>
Link: https://patchwork.freedesktop.org/patch/msgid/20190619192745.GA145841@art_vandelay
5 years agoMerge branch 'vmwgfx-fixes-5.2' of git://people.freedesktop.org/~thomash/linux into...
Dave Airlie [Fri, 21 Jun 2019 01:26:59 +0000 (11:26 +1000)]
Merge branch 'vmwgfx-fixes-5.2' of git://people.freedesktop.org/~thomash/linux into drm-fixes

A couple of fixes for vmwgfx. Two fixes for a DMA sg-list debug warning
message. These are not cc'd stable since there is no evidence of actual
breakage.
On fix for the high-bandwidth backdoor port which is cc'd stable due to
upcoming hardware, on which the code would otherwise break.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Hellstrom <VMware> <thomas@shipmail.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20190618072255.2720-1-thomas@shipmail.org
5 years agoARM: 8867/1: vdso: pass --be8 to linker if necessary
Jason A. Donenfeld [Mon, 17 Jun 2019 12:29:19 +0000 (13:29 +0100)]
ARM: 8867/1: vdso: pass --be8 to linker if necessary

The commit fe00e50b2db8 ("ARM: 8858/1: vdso: use $(LD) instead of $(CC)
to link VDSO") removed the passing of CFLAGS, since ld doesn't take
those directly. However, prior, big-endian ARM was relying on gcc to
translate its -mbe8 option into ld's --be8 option. Lacking this, ld
generated be32 code, making the VDSO generate SIGILL when called by
userspace.

This commit passes --be8 if CONFIG_CPU_ENDIAN_BE8 is enabled.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
5 years agoMerge tag 'ovl-fixes-5.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mszere...
Linus Torvalds [Thu, 20 Jun 2019 21:19:34 +0000 (14:19 -0700)]
Merge tag 'ovl-fixes-5.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs

Pull overlayfs fixes from Miklos Szeredi:
 "Fix two regressions in this cycle, and a couple of older bugs"

* tag 'ovl-fixes-5.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
  ovl: make i_ino consistent with st_ino in more cases
  ovl: fix typo in MODULE_PARM_DESC
  ovl: fix bogus -Wmaybe-unitialized warning
  ovl: don't fail with disconnected lower NFS
  ovl: fix wrong flags check in FS_IOC_FS[SG]ETXATTR ioctls

5 years agoMerge tag 'fuse-fixes-5.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mszer...
Linus Torvalds [Thu, 20 Jun 2019 21:16:16 +0000 (14:16 -0700)]
Merge tag 'fuse-fixes-5.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse

Pull fuse fix from Miklos Szeredi:
 "Just a single revert, fixing a regression in -rc1"

* tag 'fuse-fixes-5.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
  Revert "fuse: require /dev/fuse reads to have enough buffer capacity"

5 years agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds [Thu, 20 Jun 2019 20:50:37 +0000 (13:50 -0700)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "Fixes for ARM and x86, plus selftest patches and nicer structs for
  nested state save/restore"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: nVMX: reorganize initial steps of vmx_set_nested_state
  KVM: arm/arm64: Fix emulated ptimer irq injection
  tests: kvm: Check for a kernel warning
  kvm: tests: Sort tests in the Makefile alphabetically
  KVM: x86/mmu: Allocate PAE root array when using SVM's 32-bit NPT
  KVM: x86: Modify struct kvm_nested_state to have explicit fields for data
  KVM: fix typo in documentation
  KVM: nVMX: use correct clean fields when copying from eVMCS
  KVM: arm/arm64: vgic: Fix kvm_device leak in vgic_its_destroy
  KVM: arm64: Filter out invalid core register IDs in KVM_GET_REG_LIST
  KVM: arm64: Implement vq_present() as a macro

5 years agoMerge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Linus Torvalds [Thu, 20 Jun 2019 19:04:57 +0000 (12:04 -0700)]
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Will Deacon:
 "This is mainly a couple of email address updates to MAINTAINERS, but
  we've also fixed a UAPI build issue with musl libc and an accidental
  double-initialisation of our pgd_cache due to a naming conflict with a
  weak symbol.

  There are a couple of outstanding issues that have been reported, but
  it doesn't look like they're new and we're still a long way off from
  fully debugging them.

  Summary:

   - Fix use of #include in UAPI headers for compatability with musl libc

   - Update email addresses in MAINTAINERS

   - Fix initialisation of pgd_cache due to name collision with weak symbol"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64/mm: don't initialize pgd_cache twice
  MAINTAINERS: Update my email address
  arm64/sve: <uapi/asm/ptrace.h> should not depend on <uapi/linux/prctl.h>
  arm64: ssbd: explicitly depend on <linux/prctl.h>
  MAINTAINERS: Update my email address to use @kernel.org

5 years agoMerge tag 's390-5.2-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Linus Torvalds [Thu, 20 Jun 2019 19:03:41 +0000 (12:03 -0700)]
Merge tag 's390-5.2-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 fixes from Heiko Carstens:

 - Disable address-of-packed-member warning in s390 specific boot code
   to get rid of a gcc9 warning which otherwise is already disabled for
   the whole kernel.

 - Fix yet another compiler error seen with CONFIG_OPTIMIZE_INLINING
   enabled.

 - Fix memory leak in vfio-ccw code on module exit.

* tag 's390-5.2-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  vfio-ccw: Destroy kmem cache region on module exit
  s390/ctl_reg: mark __ctl_set_bit and __ctl_clear_bit as __always_inline
  s390/boot: disable address-of-packed-member warning

5 years agoMerge tag 'for_v5.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Linus Torvalds [Thu, 20 Jun 2019 17:12:53 +0000 (10:12 -0700)]
Merge tag 'for_v5.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs

Pull two misc vfs fixes from Jan Kara:
 "One small quota fix fixing spurious EDQUOT errors and one fanotify fix
  fixing a bug in the new fanotify FID reporting code"

* tag 'for_v5.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  fanotify: update connector fsid cache on add mark
  quota: fix a problem about transfer quota

5 years agoMerge tag 'mmc-v5.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Linus Torvalds [Thu, 20 Jun 2019 17:08:38 +0000 (10:08 -0700)]
Merge tag 'mmc-v5.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc

Pull MMC fixes from Ulf Hansson:
 "Here's quite a few MMC fixes intended for v5.2-rc6. This time it also
  contains fixes for a WiFi driver, which device is attached to the SDIO
  interface. Patches for the WiFi driver have been acked by the
  corresponding maintainers.

  Summary:

  MMC core:
   - Make switch to eMMC HS400 more robust for some controllers
   - Add two SDIO func API to manage re-tuning constraints
   - Prevent processing SDIO IRQs when the card is suspended

  MMC host:
   - sdhi: Disallow broken HS400 for M3-W ES1.2, RZ/G2M and V3H
   - mtk-sd: Fixup support for SDIO IRQs
   - sdhci-pci-o2micro: Fixup support for tuning

  Wireless BRCMFMAC (SDIO):
   - Deal with expected transmission errors related to the idle states
     (handled by the Always-On-Subsystem or AOS) on the SDIO-based WiFi
     on rk3288-veyron-minnie, rk3288-veyron-speedy and
     rk3288-veyron-mickey"

* tag 'mmc-v5.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  mmc: core: Prevent processing SDIO IRQs when the card is suspended
  mmc: sdhci: sdhci-pci-o2micro: Correctly set bus width when tuning
  brcmfmac: sdio: Don't tune while the card is off
  mmc: core: Add sdio_retune_hold_now() and sdio_retune_release()
  brcmfmac: sdio: Disable auto-tuning around commands expected to fail
  mmc: core: API to temporarily disable retuning for SDIO CRC errors
  Revert "brcmfmac: disable command decode in sdio_aos"
  mmc: mediatek: fix SDIO IRQ detection issue
  mmc: mediatek: fix SDIO IRQ interrupt handle flow
  mmc: core: complete HS400 before checking status
  mmc: sdhi: disallow HS400 for M3-W ES1.2, RZ/G2M, and V3H

5 years agoMerge tag 'for-linus-20190620' of git://git.kernel.dk/linux-block
Linus Torvalds [Thu, 20 Jun 2019 16:58:35 +0000 (09:58 -0700)]
Merge tag 'for-linus-20190620' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
 "Three fixes that should go into this series.

  One is a set of two patches from Christoph, fixing a page leak on same
  page merges. Boiled down version of a bigger fix, but this one is more
  appropriate for this late in the cycle (and easier to backport to
  stable).

  The last patch is for a divide error in MD, from Mariusz (via Song)"

* tag 'for-linus-20190620' of git://git.kernel.dk/linux-block:
  md: fix for divide error in status_resync
  block: fix page leak when merging to same page
  block: return from __bio_try_merge_page if merging occured in the same page

5 years agoMerge tag 'kvmarm-fixes-for-5.2-2' of git://git.kernel.org/pub/scm/linux/kernel/git...
Paolo Bonzini [Thu, 20 Jun 2019 16:24:18 +0000 (18:24 +0200)]
Merge tag 'kvmarm-fixes-for-5.2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm fixes for 5.2, take #2

- SVE cleanup killing a warning with ancient GCC versions
- Don't report non-existent system registers to userspace
- Fix memory leak when freeing the vgic ITS
- Properly lower the interrupt on the emulated physical timer

5 years agoKVM: nVMX: reorganize initial steps of vmx_set_nested_state
Paolo Bonzini [Wed, 19 Jun 2019 14:52:27 +0000 (16:52 +0200)]
KVM: nVMX: reorganize initial steps of vmx_set_nested_state

Commit 332d079735f5 ("KVM: nVMX: KVM_SET_NESTED_STATE - Tear down old EVMCS
state before setting new state", 2019-05-02) broke evmcs_test because the
eVMCS setup must be performed even if there is no VMXON region defined,
as long as the eVMCS bit is set in the assist page.

While the simplest possible fix would be to add a check on
kvm_state->flags & KVM_STATE_NESTED_EVMCS in the initial "if" that
covers kvm_state->hdr.vmx.vmxon_pa == -1ull, that is quite ugly.

Instead, this patch moves checks earlier in the function and
conditionalizes them on kvm_state->hdr.vmx.vmxon_pa, so that
vmx_set_nested_state always goes through vmx_leave_nested
and nested_enable_evmcs.

Fixes: 332d079735f5 ("KVM: nVMX: KVM_SET_NESTED_STATE - Tear down old EVMCS state before setting new state")
Cc: Aaron Lewis <aaronlewis@google.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
5 years agoMerge tag 'misc-habanalabs-fixes-2019-06-20' of git://people.freedesktop.org/~gabbayo...
Greg Kroah-Hartman [Thu, 20 Jun 2019 11:30:47 +0000 (13:30 +0200)]
Merge tag 'misc-habanalabs-fixes-2019-06-20' of git://people.freedesktop.org/~gabbayo/linux into char-misc-linus

Oded writes:

This tag contains the following fix:

- Casting warning of a 64-bit integer in 32-bit architecture. Use the
  macro that was defined for this purpose.

* tag 'misc-habanalabs-fixes-2019-06-20' of git://people.freedesktop.org/~gabbayo/linux:
  habanalabs: use u64_to_user_ptr() for reading user pointers

5 years agoMerge tag 'fixes-for-v5.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/balbi...
Greg Kroah-Hartman [Thu, 20 Jun 2019 09:56:35 +0000 (11:56 +0200)]
Merge tag 'fixes-for-v5.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb into usb-linus

Felipe writes:

usb: fixes for v5.2-rc5

A single fix to take into account the PHY width during initialization of
dwc2 driver. This change allows deviceTree to pass PHY width if
necessary.

Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
* tag 'fixes-for-v5.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb:
  usb: dwc2: Use generic PHY width in params setup

5 years agohabanalabs: use u64_to_user_ptr() for reading user pointers
Arnd Bergmann [Mon, 17 Jun 2019 12:41:33 +0000 (14:41 +0200)]
habanalabs: use u64_to_user_ptr() for reading user pointers

We cannot cast a 64-bit integer to a pointer on 32-bit architectures
without a warning:

drivers/misc/habanalabs/habanalabs_ioctl.c: In function 'debug_coresight':
drivers/misc/habanalabs/habanalabs_ioctl.c:143:23: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
   input = memdup_user((const void __user *) args->input_ptr,

Use the macro that was defined for this purpose.

Fixes: 315bc055ed56 ("habanalabs: add new IOCTL for debug, tracing and profiling")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
5 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
David S. Miller [Thu, 20 Jun 2019 04:06:27 +0000 (00:06 -0400)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Alexei Starovoitov says:

====================
pull-request: bpf-next 2019-06-19

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) new SO_REUSEPORT_DETACH_BPF setsocktopt, from Martin.

2) BTF based map definition, from Andrii.

3) support bpf_map_lookup_elem for xskmap, from Jonathan.

4) bounded loops and scalar precision logic in the verifier, from Alexei.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agopage_pool: fix compile warning when CONFIG_PAGE_POOL is disabled
Jesper Dangaard Brouer [Wed, 19 Jun 2019 22:15:52 +0000 (00:15 +0200)]
page_pool: fix compile warning when CONFIG_PAGE_POOL is disabled

Kbuild test robot reported compile warning:
 warning: no return statement in function returning non-void
in function page_pool_request_shutdown, when CONFIG_PAGE_POOL is disabled.

The fix makes the code a little more verbose, with a descriptive variable.

Fixes: 99c07c43c4ea ("xdp: tracking page_pool resources and safe removal")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonfsd: replace Jeff by Chuck as nfsd co-maintainer
J. Bruce Fields [Wed, 19 Jun 2019 21:06:24 +0000 (17:06 -0400)]
nfsd: replace Jeff by Chuck as nfsd co-maintainer

Jeff's picking up more responsibilities elsewhere, and Chuck's agreed to
take over.

For now, as before, nothing's changing day-to-day, but I want to have a
co-maintainer if only for bus factor.

Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agoinet: clear num_timeout reqsk_alloc()
Eric Dumazet [Wed, 19 Jun 2019 16:38:38 +0000 (09:38 -0700)]
inet: clear num_timeout reqsk_alloc()

KMSAN caught uninit-value in tcp_create_openreq_child() [1]
This is caused by a recent change, combined by the fact
that TCP cleared num_timeout, num_retrans and sk fields only
when a request socket was about to be queued.

Under syncookie mode, a temporary request socket is used,
and req->num_timeout could contain garbage.

Lets clear these three fields sooner, there is really no
point trying to defer this and risk other bugs.

[1]

BUG: KMSAN: uninit-value in tcp_create_openreq_child+0x157f/0x1cc0 net/ipv4/tcp_minisocks.c:526
CPU: 1 PID: 13357 Comm: syz-executor591 Not tainted 5.2.0-rc4+ #3
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x191/0x1f0 lib/dump_stack.c:113
 kmsan_report+0x162/0x2d0 mm/kmsan/kmsan.c:611
 __msan_warning+0x75/0xe0 mm/kmsan/kmsan_instr.c:304
 tcp_create_openreq_child+0x157f/0x1cc0 net/ipv4/tcp_minisocks.c:526
 tcp_v6_syn_recv_sock+0x761/0x2d80 net/ipv6/tcp_ipv6.c:1152
 tcp_get_cookie_sock+0x16e/0x6b0 net/ipv4/syncookies.c:209
 cookie_v6_check+0x27e0/0x29a0 net/ipv6/syncookies.c:252
 tcp_v6_cookie_check net/ipv6/tcp_ipv6.c:1039 [inline]
 tcp_v6_do_rcv+0xf1c/0x1ce0 net/ipv6/tcp_ipv6.c:1344
 tcp_v6_rcv+0x60b7/0x6a30 net/ipv6/tcp_ipv6.c:1554
 ip6_protocol_deliver_rcu+0x1433/0x22f0 net/ipv6/ip6_input.c:397
 ip6_input_finish net/ipv6/ip6_input.c:438 [inline]
 NF_HOOK include/linux/netfilter.h:305 [inline]
 ip6_input+0x2af/0x340 net/ipv6/ip6_input.c:447
 dst_input include/net/dst.h:439 [inline]
 ip6_rcv_finish net/ipv6/ip6_input.c:76 [inline]
 NF_HOOK include/linux/netfilter.h:305 [inline]
 ipv6_rcv+0x683/0x710 net/ipv6/ip6_input.c:272
 __netif_receive_skb_one_core net/core/dev.c:4981 [inline]
 __netif_receive_skb net/core/dev.c:5095 [inline]
 process_backlog+0x721/0x1410 net/core/dev.c:5906
 napi_poll net/core/dev.c:6329 [inline]
 net_rx_action+0x738/0x1940 net/core/dev.c:6395
 __do_softirq+0x4ad/0x858 kernel/softirq.c:293
 do_softirq_own_stack+0x49/0x80 arch/x86/entry/entry_64.S:1052
 </IRQ>
 do_softirq kernel/softirq.c:338 [inline]
 __local_bh_enable_ip+0x199/0x1e0 kernel/softirq.c:190
 local_bh_enable+0x36/0x40 include/linux/bottom_half.h:32
 rcu_read_unlock_bh include/linux/rcupdate.h:682 [inline]
 ip6_finish_output2+0x213f/0x2670 net/ipv6/ip6_output.c:117
 ip6_finish_output+0xae4/0xbc0 net/ipv6/ip6_output.c:150
 NF_HOOK_COND include/linux/netfilter.h:294 [inline]
 ip6_output+0x5d3/0x720 net/ipv6/ip6_output.c:167
 dst_output include/net/dst.h:433 [inline]
 NF_HOOK include/linux/netfilter.h:305 [inline]
 ip6_xmit+0x1f53/0x2650 net/ipv6/ip6_output.c:271
 inet6_csk_xmit+0x3df/0x4f0 net/ipv6/inet6_connection_sock.c:135
 __tcp_transmit_skb+0x4076/0x5b40 net/ipv4/tcp_output.c:1156
 tcp_transmit_skb net/ipv4/tcp_output.c:1172 [inline]
 tcp_write_xmit+0x39a9/0xa730 net/ipv4/tcp_output.c:2397
 __tcp_push_pending_frames+0x124/0x4e0 net/ipv4/tcp_output.c:2573
 tcp_send_fin+0xd43/0x1540 net/ipv4/tcp_output.c:3118
 tcp_close+0x16ba/0x1860 net/ipv4/tcp.c:2403
 inet_release+0x1f7/0x270 net/ipv4/af_inet.c:427
 inet6_release+0xaf/0x100 net/ipv6/af_inet6.c:470
 __sock_release net/socket.c:601 [inline]
 sock_close+0x156/0x490 net/socket.c:1273
 __fput+0x4c9/0xba0 fs/file_table.c:280
 ____fput+0x37/0x40 fs/file_table.c:313
 task_work_run+0x22e/0x2a0 kernel/task_work.c:113
 tracehook_notify_resume include/linux/tracehook.h:185 [inline]
 exit_to_usermode_loop arch/x86/entry/common.c:168 [inline]
 prepare_exit_to_usermode+0x39d/0x4d0 arch/x86/entry/common.c:199
 syscall_return_slowpath+0x90/0x5c0 arch/x86/entry/common.c:279
 do_syscall_64+0xe2/0xf0 arch/x86/entry/common.c:305
 entry_SYSCALL_64_after_hwframe+0x63/0xe7
RIP: 0033:0x401d50
Code: 01 f0 ff ff 0f 83 40 0d 00 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 83 3d dd 8d 2d 00 00 75 14 b8 03 00 00 00 0f 05 <48> 3d 01 f0 ff ff 0f 83 14 0d 00 00 c3 48 83 ec 08 e8 7a 02 00 00
RSP: 002b:00007fff1cf58cf8 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
RAX: 0000000000000000 RBX: 0000000000000004 RCX: 0000000000401d50
RDX: 000000000000001c RSI: 0000000000000000 RDI: 0000000000000003
RBP: 00000000004a9050 R08: 0000000020000040 R09: 000000000000001c
R10: 0000000020004004 R11: 0000000000000246 R12: 0000000000402ef0
R13: 0000000000402f80 R14: 0000000000000000 R15: 0000000000000000

Uninit was created at:
 kmsan_save_stack_with_flags mm/kmsan/kmsan.c:201 [inline]
 kmsan_internal_poison_shadow+0x53/0xa0 mm/kmsan/kmsan.c:160
 kmsan_kmalloc+0xa4/0x130 mm/kmsan/kmsan_hooks.c:177
 kmem_cache_alloc+0x534/0xb00 mm/slub.c:2781
 reqsk_alloc include/net/request_sock.h:84 [inline]
 inet_reqsk_alloc+0xa8/0x600 net/ipv4/tcp_input.c:6384
 cookie_v6_check+0xadb/0x29a0 net/ipv6/syncookies.c:173
 tcp_v6_cookie_check net/ipv6/tcp_ipv6.c:1039 [inline]
 tcp_v6_do_rcv+0xf1c/0x1ce0 net/ipv6/tcp_ipv6.c:1344
 tcp_v6_rcv+0x60b7/0x6a30 net/ipv6/tcp_ipv6.c:1554
 ip6_protocol_deliver_rcu+0x1433/0x22f0 net/ipv6/ip6_input.c:397
 ip6_input_finish net/ipv6/ip6_input.c:438 [inline]
 NF_HOOK include/linux/netfilter.h:305 [inline]
 ip6_input+0x2af/0x340 net/ipv6/ip6_input.c:447
 dst_input include/net/dst.h:439 [inline]
 ip6_rcv_finish net/ipv6/ip6_input.c:76 [inline]
 NF_HOOK include/linux/netfilter.h:305 [inline]
 ipv6_rcv+0x683/0x710 net/ipv6/ip6_input.c:272
 __netif_receive_skb_one_core net/core/dev.c:4981 [inline]
 __netif_receive_skb net/core/dev.c:5095 [inline]
 process_backlog+0x721/0x1410 net/core/dev.c:5906
 napi_poll net/core/dev.c:6329 [inline]
 net_rx_action+0x738/0x1940 net/core/dev.c:6395
 __do_softirq+0x4ad/0x858 kernel/softirq.c:293
 do_softirq_own_stack+0x49/0x80 arch/x86/entry/entry_64.S:1052
 do_softirq kernel/softirq.c:338 [inline]
 __local_bh_enable_ip+0x199/0x1e0 kernel/softirq.c:190
 local_bh_enable+0x36/0x40 include/linux/bottom_half.h:32
 rcu_read_unlock_bh include/linux/rcupdate.h:682 [inline]
 ip6_finish_output2+0x213f/0x2670 net/ipv6/ip6_output.c:117
 ip6_finish_output+0xae4/0xbc0 net/ipv6/ip6_output.c:150
 NF_HOOK_COND include/linux/netfilter.h:294 [inline]
 ip6_output+0x5d3/0x720 net/ipv6/ip6_output.c:167
 dst_output include/net/dst.h:433 [inline]
 NF_HOOK include/linux/netfilter.h:305 [inline]
 ip6_xmit+0x1f53/0x2650 net/ipv6/ip6_output.c:271
 inet6_csk_xmit+0x3df/0x4f0 net/ipv6/inet6_connection_sock.c:135
 __tcp_transmit_skb+0x4076/0x5b40 net/ipv4/tcp_output.c:1156
 tcp_transmit_skb net/ipv4/tcp_output.c:1172 [inline]
 tcp_write_xmit+0x39a9/0xa730 net/ipv4/tcp_output.c:2397
 __tcp_push_pending_frames+0x124/0x4e0 net/ipv4/tcp_output.c:2573
 tcp_send_fin+0xd43/0x1540 net/ipv4/tcp_output.c:3118
 tcp_close+0x16ba/0x1860 net/ipv4/tcp.c:2403
 inet_release+0x1f7/0x270 net/ipv4/af_inet.c:427
 inet6_release+0xaf/0x100 net/ipv6/af_inet6.c:470
 __sock_release net/socket.c:601 [inline]
 sock_close+0x156/0x490 net/socket.c:1273
 __fput+0x4c9/0xba0 fs/file_table.c:280
 ____fput+0x37/0x40 fs/file_table.c:313
 task_work_run+0x22e/0x2a0 kernel/task_work.c:113
 tracehook_notify_resume include/linux/tracehook.h:185 [inline]
 exit_to_usermode_loop arch/x86/entry/common.c:168 [inline]
 prepare_exit_to_usermode+0x39d/0x4d0 arch/x86/entry/common.c:199
 syscall_return_slowpath+0x90/0x5c0 arch/x86/entry/common.c:279
 do_syscall_64+0xe2/0xf0 arch/x86/entry/common.c:305
 entry_SYSCALL_64_after_hwframe+0x63/0xe7

Fixes: 336c39a03151 ("tcp: undo init congestion window on false SYNACK timeout")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Soheil Hassas Yeganeh <soheil@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: stmmac: initialize the reset delay array
Martin Blumenstingl [Tue, 18 Jun 2019 20:39:27 +0000 (22:39 +0200)]
net: stmmac: initialize the reset delay array

Commit ce4ab73ab0c27c ("net: stmmac: drop the reset delays from struct
stmmac_mdio_bus_data") moved the reset delay array from struct
stmmac_mdio_bus_data to a stack variable.
The values from the array inside struct stmmac_mdio_bus_data were
previously initialized to 0 because the struct was allocated using
devm_kzalloc(). The array on the stack has to be initialized
explicitly, else we might be reading garbage values.

Initialize all reset delays to 0 to ensure that the values are 0 if the
"snps,reset-delays-us" property is not defined.
This fixes booting at least two boards (MIPS pistachio marduk and ARM
sun8i H2+ Orange Pi Zero). These are hanging during boot when
initializing the stmmac Ethernet controller (as found by Kernel CI).
Both have in common that they don't define the "snps,reset-delays-us"
property.

Fixes: ce4ab73ab0c27c ("net: stmmac: drop the reset delays from struct stmmac_mdio_bus_data")
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Reported-by: "kernelci.org bot" <bot@kernelci.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoselftests/net: make udpgso_bench skip unsupported testcases
Willem de Bruijn [Tue, 18 Jun 2019 20:03:04 +0000 (16:03 -0400)]
selftests/net: make udpgso_bench skip unsupported testcases

Kselftest can be run against older kernels. Instead of failing hard
when a feature is unsupported, return the KSFT_SKIP exit code.

Specifically, do not fail hard on missing udp zerocopy.

The udp gso bench test runs multiple test cases from a single script.
Fail if any case fails, else return skip if any test is skipped.

Link: https://lore.kernel.org/lkml/20190618171516.GA17547@kroah.com/
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet/ipv4: fib_trie: Avoid cryptic ternary expressions
Matthias Kaehlcke [Tue, 18 Jun 2019 21:14:40 +0000 (14:14 -0700)]
net/ipv4: fib_trie: Avoid cryptic ternary expressions

empty_child_inc/dec() use the ternary operator for conditional
operations. The conditions involve the post/pre in/decrement
operator and the operation is only performed when the condition
is *not* true. This is hard to parse for humans, use a regular
'if' construct instead and perform the in/decrement separately.

This also fixes two warnings that are emitted about the value
of the ternary expression being unused, when building the kernel
with clang + "kbuild: Remove unnecessary -Wno-unused-value"
(https://lore.kernel.org/patchwork/patch/1089869/):

CC      net/ipv4/fib_trie.o
net/ipv4/fib_trie.c:351:2: error: expression result unused [-Werror,-Wunused-value]
        ++tn_info(n)->empty_children ? : ++tn_info(n)->full_children;

Fixes: 95f60ea3e99a ("fib_trie: Add collapse() and should_collapse() to resize")
Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Acked-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: mvpp2: debugfs: Add pmap to fs dump
Nathan Huckleberry [Wed, 19 Jun 2019 18:17:15 +0000 (11:17 -0700)]
net: mvpp2: debugfs: Add pmap to fs dump

There was an unused variable 'mvpp2_dbgfs_prs_pmap_fops'
Added a usage consistent with other fops to dump pmap
to userspace.

Cc: clang-built-linux@googlegroups.com
Link: https://github.com/ClangBuiltLinux/linux/issues/529
Signed-off-by: Nathan Huckleberry <nhuck@google.com>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoipv6: Default fib6_type to RTN_UNICAST when not set
David Ahern [Wed, 19 Jun 2019 17:50:24 +0000 (10:50 -0700)]
ipv6: Default fib6_type to RTN_UNICAST when not set

A user reported that routes are getting installed with type 0 (RTN_UNSPEC)
where before the routes were RTN_UNICAST. One example is from accel-ppp
which apparently still uses the ioctl interface and does not set
rtmsg_type. Another is the netlink interface where ipv6 does not require
rtm_type to be set (v4 does). Prior to the commit in the Fixes tag the
ipv6 stack converted type 0 to RTN_UNICAST, so restore that behavior.

Fixes: e8478e80e5a7 ("net/ipv6: Save route type in rt6_info")
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: sched: act_ctinfo: tidy UAPI definition
Kevin Darbyshire-Bryant [Wed, 19 Jun 2019 17:41:10 +0000 (18:41 +0100)]
net: sched: act_ctinfo: tidy UAPI definition

Remove some enums from the UAPI definition that were only used
internally and are NOT part of the UAPI.

Signed-off-by: Kevin Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: hns3: Fix inconsistent indenting
Krzysztof Kozlowski [Tue, 18 Jun 2019 18:54:22 +0000 (20:54 +0200)]
net: hns3: Fix inconsistent indenting

Fix wrong indentation of goto return.

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'af_iucv-fixes'
David S. Miller [Wed, 19 Jun 2019 20:26:33 +0000 (16:26 -0400)]
Merge branch 'af_iucv-fixes'

Julian Wiedmann says:

====================
net/af_iucv: fixes 2019-06-18

I spent a few cycles on transmit problems for af_iucv over regular
netdevices - please apply the following fixes to -net.

The first patch allows for skb allocations outside of GFP_DMA, while the
second patch respects that drivers might use skb_cow_head() and/or want
additional dev->needed_headroom.
Patch 3 is for a separate issue, where we didn't setup some of the
netdevice-specific infrastructure when running as a z/VM guest.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet/af_iucv: always register net_device notifier
Julian Wiedmann [Tue, 18 Jun 2019 18:43:01 +0000 (20:43 +0200)]
net/af_iucv: always register net_device notifier

Even when running as VM guest (ie pr_iucv != NULL), af_iucv can still
open HiperTransport-based connections. For robust operation these
connections require the af_iucv_netdev_notifier, so register it
unconditionally.

Also handle any error that register_netdevice_notifier() returns.

Fixes: 9fbd87d41392 ("af_iucv: handle netdev events")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet/af_iucv: build proper skbs for HiperTransport
Julian Wiedmann [Tue, 18 Jun 2019 18:43:00 +0000 (20:43 +0200)]
net/af_iucv: build proper skbs for HiperTransport

The HiperSockets-based transport path in af_iucv is still too closely
entangled with qeth.
With commit a647a02512ca ("s390/qeth: speed-up L3 IQD xmit"), the
relevant xmit code in qeth has begun to use skb_cow_head(). So to avoid
unnecessary skb head expansions, af_iucv must learn to
1) respect dev->needed_headroom when allocating skbs, and
2) drop the header reference before cloning the skb.

While at it, also stop hard-coding the LL-header creation stage and just
use the appropriate helper.

Fixes: a647a02512ca ("s390/qeth: speed-up L3 IQD xmit")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet/af_iucv: remove GFP_DMA restriction for HiperTransport
Julian Wiedmann [Tue, 18 Jun 2019 18:42:59 +0000 (20:42 +0200)]
net/af_iucv: remove GFP_DMA restriction for HiperTransport

af_iucv sockets over z/VM IUCV require that their skbs are allocated
in DMA memory. This restriction doesn't apply to connections over
HiperSockets. So only set this limit for z/VM IUCV sockets, thereby
increasing the likelihood that the large (and linear!) allocations for
HiperTransport messages succeed.

Fixes: 3881ac441f64 ("af_iucv: add HiperSockets transport")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge tag 'pm-5.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Linus Torvalds [Wed, 19 Jun 2019 18:44:04 +0000 (11:44 -0700)]
Merge tag 'pm-5.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management fix from Rafael Wysocki:
 "Prevent PCI bridges in general (and PCIe ports in particular) from
  being put into low-power states during system-wide suspend transitions
  if there are any devices in D0 below them and refine the handling of
  PCI devices in D0 during suspend-to-idle cycles"

* tag 'pm-5.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PCI: PM: Skip devices in D0 for suspend-to-idle

5 years agoMerge tag 'apparmor-pr-2019-06-18' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Wed, 19 Jun 2019 18:39:00 +0000 (11:39 -0700)]
Merge tag 'apparmor-pr-2019-06-18' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor

Pull apparmor bug fixes from John Johansen:

 - fix PROFILE_MEDIATES for untrusted input

 - enforce nullbyte at end of tag string

 - reset pos on failure to unpack for various functions

* tag 'apparmor-pr-2019-06-18' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor:
  apparmor: reset pos on failure to unpack for various functions
  apparmor: enforce nullbyte at end of tag string
  apparmor: fix PROFILE_MEDIATES for untrusted input

5 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Linus Torvalds [Wed, 19 Jun 2019 18:26:09 +0000 (11:26 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input

Pull input updates from Dmitry Torokhov:
 "Just a few small fixups and switching a couple of Thinkpads to SMBus
  for touchpads as PS/2 emulation is not working well"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: synaptics - enable SMBus on ThinkPad E480 and E580
  Input: imx_keypad - make sure keyboard can always wake up system
  Input: iqs5xx - get axis info before calling input_mt_init_slots()
  Input: uinput - add compat ioctl number translation for UI_*_FF_UPLOAD
  Input: silead - add MSSL0017 to acpi_device_id
  Input: elantech - enable middle button support on 2 ThinkPads
  Input: elan_i2c - increment wakeup count if wake source

5 years agodoc: fix documentation about UIO_MEM_LOGICAL using
Yang Yingliang [Sat, 15 Jun 2019 09:41:29 +0000 (17:41 +0800)]
doc: fix documentation about UIO_MEM_LOGICAL using

After commit d4fc5069a394 ("mm: switch s_mem and slab_cache in struct page")
page->mapping will be re-used by slab allocations and page->mapping->host
will be used in balance_dirty_pages_ratelimited() as an inode member but
it's not an inode in fact and leads an oops.

[  159.906493] Unable to handle kernel paging request at virtual address ffff200012d90be8
[  159.908029] Mem abort info:
[  159.908552]   ESR = 0x96000007
[  159.909138]   Exception class = DABT (current EL), IL = 32 bits
[  159.910155]   SET = 0, FnV = 0
[  159.910690]   EA = 0, S1PTW = 0
[  159.911241] Data abort info:
[  159.911846]   ISV = 0, ISS = 0x00000007
[  159.912567]   CM = 0, WnR = 0
[  159.913105] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000042acd000
[  159.914269] [ffff200012d90be8] pgd=000000043ffff003, pud=000000043fffe003, pmd=000000043fffa003, pte=0000000000000000
[  159.916280] Internal error: Oops: 96000007 [#1] SMP
[  159.917195] Dumping ftrace buffer:
[  159.917845]    (ftrace buffer empty)
[  159.918521] Modules linked in: uio_dev(OE)
[  159.919276] CPU: 1 PID: 295 Comm: uio_test Tainted: G           OE     5.2.0-rc4+ #46
[  159.920859] Hardware name: linux,dummy-virt (DT)
[  159.921815] pstate: 60000005 (nZCv daif -PAN -UAO)
[  159.922809] pc : balance_dirty_pages_ratelimited+0x68/0xc38
[  159.923965] lr : fault_dirty_shared_page.isra.8+0xe4/0x100
[  159.925134] sp : ffff800368a77ae0
[  159.925824] x29: ffff800368a77ae0 x28: 1ffff0006d14ce1a
[  159.926906] x27: ffff800368a670d0 x26: ffff800368a67120
[  159.927985] x25: 1ffff0006d10f5fe x24: ffff200012d90be8
[  159.929089] x23: ffff200013732000 x22: ffff80036ec03200
[  159.930172] x21: ffff200012d90bc0 x20: 1fffe400025b217d
[  159.931253] x19: ffff80036ec03200 x18: 0000000000000000
[  159.932348] x17: 0000000000000000 x16: 0ffffe0000010208
[  159.933439] x15: 0000000000000000 x14: 0000000000000000
[  159.934518] x13: 0000000000000000 x12: 0000000000000000
[  159.935596] x11: 1fffefc001b452c0 x10: ffff0fc001b452c0
[  159.936697] x9 : dfff200000000000 x8 : dfff200000000001
[  159.937781] x7 : ffff7e000da29607 x6 : ffff0fc001b452c1
[  159.938859] x5 : ffff0fc001b452c1 x4 : ffff0fc001b452c1
[  159.939944] x3 : ffff200010523ad4 x2 : 1fffe400026e659b
[  159.941065] x1 : dfff200000000000 x0 : ffff200013732cd8
[  159.942205] Call trace:
[  159.942732]  balance_dirty_pages_ratelimited+0x68/0xc38
[  159.943797]  fault_dirty_shared_page.isra.8+0xe4/0x100
[  159.944867]  do_fault+0x608/0x1250
[  159.945571]  __handle_mm_fault+0x93c/0xfb8
[  159.946412]  handle_mm_fault+0x1c0/0x360
[  159.947224]  do_page_fault+0x358/0x8d0
[  159.947993]  do_translation_fault+0xf8/0x124
[  159.948884]  do_mem_abort+0x70/0x190
[  159.949624]  el0_da+0x24/0x28

According another commit 5e901d0b15c0 ("scsi: qedi: Fix bad pte call trace
when iscsiuio is stopped."), using kmalloc also cause other problem.

But the documentation about UIO_MEM_LOGICAL allows using kmalloc(), remove
and don't allow using kmalloc() in documentation.

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoMAINTAINERS / Documentation: Thorsten Scherer is the successor of Gavin Schenk
Gavin Schenk [Tue, 18 Jun 2019 06:04:12 +0000 (08:04 +0200)]
MAINTAINERS / Documentation: Thorsten Scherer is the successor of Gavin Schenk

Due to new challenges in my life I can no longer take care of SIOX.
Thorsten takes over my SIOX tasks.

Signed-off-by: Gavin Schenk <g.schenk@eckelmann.de>
Acked-by: Thorsten Scherer <t.scherer@eckelmann.de>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agodocs: fb: Add TER16x32 to the available font names
Takashi Iwai [Wed, 19 Jun 2019 05:39:43 +0000 (07:39 +0200)]
docs: fb: Add TER16x32 to the available font names

The new font is available since recently.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoMerge tag 'thunderbolt-fixes-for-v5.2-rc6' of git://git.kernel.org/pub/scm/linux...
Greg Kroah-Hartman [Wed, 19 Jun 2019 17:02:52 +0000 (19:02 +0200)]
Merge tag 'thunderbolt-fixes-for-v5.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/westeri/thunderbolt into char-misc-linus

Mika writes:

thunderbolt: Fixes for v5.2-rc6

This includes two fixes for issues found during the current release
cycle:

  - Fix runtime PM regression when device is authorized after the
    controller is runtime suspended.

  - Correct CIO reset flow for Titan Ridge.

* tag 'thunderbolt-fixes-for-v5.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/westeri/thunderbolt:
  thunderbolt: Implement CIO reset correctly for Titan Ridge
  thunderbolt: Make sure device runtime resume completes before taking domain lock

5 years agoMAINTAINERS: fpga: hand off maintainership to Moritz
Alan Tull [Mon, 17 Jun 2019 03:11:13 +0000 (22:11 -0500)]
MAINTAINERS: fpga: hand off maintainership to Moritz

I'm moving on to a new position and stepping down as FPGA subsystem
maintainer.  Moritz has graciously agreed to take over the
maintainership.

Signed-off-by: Alan Tull <atull@kernel.org>
Acked-by: Moritz Fischer <mdf@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoMerge branch 'inet-fix-defrag-units-dismantle-races'
David S. Miller [Wed, 19 Jun 2019 15:37:48 +0000 (11:37 -0400)]
Merge branch 'inet-fix-defrag-units-dismantle-races'

Eric Dumazet says:

====================
inet: fix defrag units dismantle races

This series add a new pre_exit() method to struct pernet_operations
to solve a race in defrag units dismantle, without adding extra
delays to netns dismantles.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoinet: fix various use-after-free in defrags units
Eric Dumazet [Tue, 18 Jun 2019 18:09:00 +0000 (11:09 -0700)]
inet: fix various use-after-free in defrags units

syzbot reported another issue caused by my recent patches. [1]

The issue here is that fqdir_exit() is initiating a work queue
and immediately returns. A bit later cleanup_net() was able
to free the MIB (percpu data) and the whole struct net was freed,
but we had active frag timers that fired and triggered use-after-free.

We need to make sure that timers can catch fqdir->dead being set,
to bailout.

Since RCU is used for the reader side, this means
we want to respect an RCU grace period between these operations :

1) qfdir->dead = 1;

2) netns dismantle (freeing of various data structure)

This patch uses new new (struct pernet_operations)->pre_exit
infrastructure to ensures a full RCU grace period
happens between fqdir_pre_exit() and fqdir_exit()

This also means we can use a regular work queue, we no
longer need rcu_work.

Tested:

$ time for i in {1..1000}; do unshare -n /bin/false;done

real 0m2.585s
user 0m0.160s
sys 0m2.214s

[1]

BUG: KASAN: use-after-free in ip_expire+0x73e/0x800 net/ipv4/ip_fragment.c:152
Read of size 8 at addr ffff88808b9fe330 by task syz-executor.4/11860

CPU: 1 PID: 11860 Comm: syz-executor.4 Not tainted 5.2.0-rc2+ #22
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x172/0x1f0 lib/dump_stack.c:113
 print_address_description.cold+0x7c/0x20d mm/kasan/report.c:188
 __kasan_report.cold+0x1b/0x40 mm/kasan/report.c:317
 kasan_report+0x12/0x20 mm/kasan/common.c:614
 __asan_report_load8_noabort+0x14/0x20 mm/kasan/generic_report.c:132
 ip_expire+0x73e/0x800 net/ipv4/ip_fragment.c:152
 call_timer_fn+0x193/0x720 kernel/time/timer.c:1322
 expire_timers kernel/time/timer.c:1366 [inline]
 __run_timers kernel/time/timer.c:1685 [inline]
 __run_timers kernel/time/timer.c:1653 [inline]
 run_timer_softirq+0x66f/0x1740 kernel/time/timer.c:1698
 __do_softirq+0x25c/0x94c kernel/softirq.c:293
 invoke_softirq kernel/softirq.c:374 [inline]
 irq_exit+0x180/0x1d0 kernel/softirq.c:414
 exiting_irq arch/x86/include/asm/apic.h:536 [inline]
 smp_apic_timer_interrupt+0x13b/0x550 arch/x86/kernel/apic/apic.c:1068
 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:806
 </IRQ>
RIP: 0010:tomoyo_domain_quota_is_ok+0x131/0x540 security/tomoyo/util.c:1035
Code: 24 4c 3b 65 d0 0f 84 9c 00 00 00 e8 19 1d 73 fe 49 8d 7c 24 18 48 ba 00 00 00 00 00 fc ff df 48 89 f8 48 c1 e8 03 0f b6 04 10 <48> 89 fa 83 e2 07 38 d0 7f 08 84 c0 0f 85 69 03 00 00 41 0f b6 5c
RSP: 0018:ffff88806ae079c0 EFLAGS: 00000a02 ORIG_RAX: ffffffffffffff13
RAX: 0000000000000000 RBX: 0000000000000010 RCX: ffffc9000e655000
RDX: dffffc0000000000 RSI: ffffffff82fd88a7 RDI: ffff888086202398
RBP: ffff88806ae07a00 R08: ffff88808b6c8700 R09: ffffed100d5c0f4d
R10: ffffed100d5c0f4c R11: 0000000000000000 R12: ffff888086202380
R13: 0000000000000030 R14: 00000000000000d3 R15: 0000000000000000
 tomoyo_supervisor+0x2e8/0xef0 security/tomoyo/common.c:2087
 tomoyo_audit_path_number_log security/tomoyo/file.c:235 [inline]
 tomoyo_path_number_perm+0x42f/0x520 security/tomoyo/file.c:734
 tomoyo_file_ioctl+0x23/0x30 security/tomoyo/tomoyo.c:335
 security_file_ioctl+0x77/0xc0 security/security.c:1370
 ksys_ioctl+0x57/0xd0 fs/ioctl.c:711
 __do_sys_ioctl fs/ioctl.c:720 [inline]
 __se_sys_ioctl fs/ioctl.c:718 [inline]
 __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718
 do_syscall_64+0xfd/0x680 arch/x86/entry/common.c:301
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x4592c9
Code: fd b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 cb b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f8db5e44c78 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00000000004592c9
RDX: 0000000020000080 RSI: 00000000000089f1 RDI: 0000000000000006
RBP: 000000000075bf20 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f8db5e456d4
R13: 00000000004cc770 R14: 00000000004d5cd8 R15: 00000000ffffffff

Allocated by task 9047:
 save_stack+0x23/0x90 mm/kasan/common.c:71
 set_track mm/kasan/common.c:79 [inline]
 __kasan_kmalloc mm/kasan/common.c:489 [inline]
 __kasan_kmalloc.constprop.0+0xcf/0xe0 mm/kasan/common.c:462
 kasan_slab_alloc+0xf/0x20 mm/kasan/common.c:497
 slab_post_alloc_hook mm/slab.h:437 [inline]
 slab_alloc mm/slab.c:3326 [inline]
 kmem_cache_alloc+0x11a/0x6f0 mm/slab.c:3488
 kmem_cache_zalloc include/linux/slab.h:732 [inline]
 net_alloc net/core/net_namespace.c:386 [inline]
 copy_net_ns+0xed/0x340 net/core/net_namespace.c:426
 create_new_namespaces+0x400/0x7b0 kernel/nsproxy.c:107
 unshare_nsproxy_namespaces+0xc2/0x200 kernel/nsproxy.c:206
 ksys_unshare+0x440/0x980 kernel/fork.c:2692
 __do_sys_unshare kernel/fork.c:2760 [inline]
 __se_sys_unshare kernel/fork.c:2758 [inline]
 __x64_sys_unshare+0x31/0x40 kernel/fork.c:2758
 do_syscall_64+0xfd/0x680 arch/x86/entry/common.c:301
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Freed by task 2541:
 save_stack+0x23/0x90 mm/kasan/common.c:71
 set_track mm/kasan/common.c:79 [inline]
 __kasan_slab_free+0x102/0x150 mm/kasan/common.c:451
 kasan_slab_free+0xe/0x10 mm/kasan/common.c:459
 __cache_free mm/slab.c:3432 [inline]
 kmem_cache_free+0x86/0x260 mm/slab.c:3698
 net_free net/core/net_namespace.c:402 [inline]
 net_drop_ns.part.0+0x70/0x90 net/core/net_namespace.c:409
 net_drop_ns net/core/net_namespace.c:408 [inline]
 cleanup_net+0x538/0x960 net/core/net_namespace.c:571
 process_one_work+0x989/0x1790 kernel/workqueue.c:2269
 worker_thread+0x98/0xe40 kernel/workqueue.c:2415
 kthread+0x354/0x420 kernel/kthread.c:255
 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352

The buggy address belongs to the object at ffff88808b9fe100
 which belongs to the cache net_namespace of size 6784
The buggy address is located 560 bytes inside of
 6784-byte region [ffff88808b9fe100ffff88808b9ffb80)
The buggy address belongs to the page:
page:ffffea00022e7f80 refcount:1 mapcount:0 mapping:ffff88821b6f60c0 index:0x0 compound_mapcount: 0
flags: 0x1fffc0000010200(slab|head)
raw: 01fffc0000010200 ffffea000256f288 ffffea0001bbef08 ffff88821b6f60c0
raw: 0000000000000000 ffff88808b9fe100 0000000100000001 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff88808b9fe200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff88808b9fe280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff88808b9fe300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                     ^
 ffff88808b9fe380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff88808b9fe400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb

Fixes: 3c8fc8782044 ("inet: frags: rework rhashtable dismantle")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>