]> git.proxmox.com Git - mirror_ubuntu-eoan-kernel.git/log
mirror_ubuntu-eoan-kernel.git
6 years agotun: fix a memory leak for tfile->tx_array
Cong Wang [Mon, 15 Jan 2018 19:37:29 +0000 (11:37 -0800)]
tun: fix a memory leak for tfile->tx_array

tfile->tun could be detached before we close the tun fd,
via tun_detach_all(), so it should not be used to check for
tfile->tx_array.

As Jason suggested, we probably have to clean it up
unconditionally both in __tun_deatch() and tun_detach_all(),
but this requires to check if it is initialized or not.
Currently skb_array_cleanup() doesn't have such a check,
so I check it in the caller and introduce a helper function,
it is a bit ugly but we can always improve it in net-next.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Fixes: 1576d9860599 ("tun: switch to use skb array for tx")
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Linus Torvalds [Wed, 17 Jan 2018 00:47:40 +0000 (16:47 -0800)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma fixes from Doug Ledford:
 "We had a few more items creep up over the last week. Given we are in
  -rc8, these are obviously limited to bugs that have a big downside and
  for which we are certain of the fix.

  The first is a straight up oops bug that all you have to do is read
  the code to see it's a guaranteed 100% oops bug.

  The second is a use-after-free issue. We get away lucky if the queue
  we are shutting down is empty, but if it isn't, we can end up oopsing.
  We really need to drain the queue before destroying it.

  The final one is an issue with bad user input causing us to access our
  port array out of bounds. While fixing the array out of bounds issue,
  it was noticed that the original code did the same thing twice (the
  call to rdma_ah_set_port_num()), so its removal is not balanced by a
  readd elsewhere, it was already where it needed to be in addition to
  where it didn't need to be.

  Summary:

   - Oops fix in hfi1 driver

   - use-after-free issue in iser-target

   - use of user supplied array index without proper checking"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  RDMA/mlx5: Fix out-of-bound access while querying AH
  IB/hfi1: Prevent a NULL dereference
  iser-target: Fix possible use-after-free in connection establishment error

6 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Tue, 16 Jan 2018 20:45:30 +0000 (12:45 -0800)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:

 1) Two read past end of buffer fixes in AF_KEY, from Eric Biggers.

 2) Memory leak in key_notify_policy(), from Steffen Klassert.

 3) Fix overflow with bpf arrays, from Daniel Borkmann.

 4) Fix RDMA regression with mlx5 due to mlx5 no longer using
    pci_irq_get_affinity(), from Saeed Mahameed.

 5) Missing RCU read locking in nl80211_send_iface() when it calls
    ieee80211_bss_get_ie(), from Dominik Brodowski.

 6) cfg80211 should check dev_set_name()'s return value, from Johannes
    Berg.

 7) Missing module license tag in 9p protocol, from Stephen Hemminger.

 8) Fix crash due to too small MTU in udp ipv6 sendmsg, from Mike
    Maloney.

 9) Fix endless loop in netlink extack code, from David Ahern.

10) TLS socket layer sets inverted error codes, resulting in an endless
    loop. From Robert Hering.

11) Revert openvswitch erspan tunnel support, it's mis-designed and we
    need to kill it before it goes into a real release. From William Tu.

12) Fix lan78xx failures in full speed USB mode, from Yuiko Oshino.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (54 commits)
  net, sched: fix panic when updating miniq {b,q}stats
  qed: Fix potential use-after-free in qed_spq_post()
  nfp: use the correct index for link speed table
  lan78xx: Fix failure in USB Full Speed
  sctp: do not allow the v4 socket to bind a v4mapped v6 address
  sctp: return error if the asoc has been peeled off in sctp_wait_for_sndbuf
  sctp: reinit stream if stream outcnt has been change by sinit in sendmsg
  ibmvnic: Fix pending MAC address changes
  netlink: extack: avoid parenthesized string constant warning
  ipv4: Make neigh lookup keys for loopback/point-to-point devices be INADDR_ANY
  net: Allow neigh contructor functions ability to modify the primary_key
  sh_eth: fix dumping ARSTR
  Revert "openvswitch: Add erspan tunnel support."
  net/tls: Fix inverted error codes to avoid endless loop
  ipv6: ip6_make_skb() needs to clear cork.base.dst
  sctp: avoid compiler warning on implicit fallthru
  net: ipv4: Make "ip route get" match iif lo rules again.
  netlink: extack needs to be reset each time through loop
  tipc: fix a memory leak in tipc_nl_node_get_link()
  ipv6: fix udpv6 sendmsg crash caused by too small MTU
  ...

6 years agoMerge tag 'sound-4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Linus Torvalds [Tue, 16 Jan 2018 20:13:52 +0000 (12:13 -0800)]
Merge tag 'sound-4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "A few small last-minute fixes that should sneak into 4.15:

   - remove a spurious WARN_ON() triggered by syzkaller

   - fix for ioctl races in ALSA sequencer

   - two trivial HD-audio fixup entries"

* tag 'sound-4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: seq: Make ioctls race-free
  ALSA: pcm: Remove yet superfluous WARN_ON()
  ALSA: hda - Apply the existing quirk to iMac 14,1
  ALSA: hda - Apply headphone noise quirk for another Dell XPS 13 variant

6 years agoMerge tag 'trace-v4.15-rc4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rosted...
Linus Torvalds [Tue, 16 Jan 2018 20:09:36 +0000 (12:09 -0800)]
Merge tag 'trace-v4.15-rc4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing fixes from Steven Rostedt:

 - Bring back context level recursive protection in ring buffer.

   The simpler counter protection failed, due to a path when tracing
   with trace_clock_global() as it could not be reentrant and depended
   on the ring buffer recursive protection to keep that from happening.

 - Prevent branch profiling when FORTIFY_SOURCE is enabled.

   It causes 50 - 60 MB in warning messages. Branch profiling should
   never be run on production systems, so there's no reason that it
   needs to be enabled with FORTIFY_SOURCE.

* tag 'trace-v4.15-rc4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Prevent PROFILE_ALL_BRANCHES when FORTIFY_SOURCE=y
  ring-buffer: Bring back context level recursive checks

6 years agonet, sched: fix panic when updating miniq {b,q}stats
Daniel Borkmann [Mon, 15 Jan 2018 22:12:09 +0000 (23:12 +0100)]
net, sched: fix panic when updating miniq {b,q}stats

While working on fixing another bug, I ran into the following panic
on arm64 by simply attaching clsact qdisc, adding a filter and running
traffic on ingress to it:

  [...]
  [  178.188591] Unable to handle kernel read from unreadable memory at virtual address 810fb501f000
  [  178.197314] Mem abort info:
  [  178.200121]   ESR = 0x96000004
  [  178.203168]   Exception class = DABT (current EL), IL = 32 bits
  [  178.209095]   SET = 0, FnV = 0
  [  178.212157]   EA = 0, S1PTW = 0
  [  178.215288] Data abort info:
  [  178.218175]   ISV = 0, ISS = 0x00000004
  [  178.222019]   CM = 0, WnR = 0
  [  178.224997] user pgtable: 4k pages, 48-bit VAs, pgd = 0000000023cb3f33
  [  178.231531] [0000810fb501f000] *pgd=0000000000000000
  [  178.236508] Internal error: Oops: 96000004 [#1] SMP
  [...]
  [  178.311855] CPU: 73 PID: 2497 Comm: ping Tainted: G        W        4.15.0-rc7+ #5
  [  178.319413] Hardware name: FOXCONN R2-1221R-A4/C2U4N_MB, BIOS G31FB18A 03/31/2017
  [  178.326887] pstate: 60400005 (nZCv daif +PAN -UAO)
  [  178.331685] pc : __netif_receive_skb_core+0x49c/0xac8
  [  178.336728] lr : __netif_receive_skb+0x28/0x78
  [  178.341161] sp : ffff00002344b750
  [  178.344465] x29: ffff00002344b750 x28: ffff810fbdfd0580
  [  178.349769] x27: 0000000000000000 x26: ffff000009378000
  [...]
  [  178.418715] x1 : 0000000000000054 x0 : 0000000000000000
  [  178.424020] Process ping (pid: 2497, stack limit = 0x000000009f0a3ff4)
  [  178.430537] Call trace:
  [  178.432976]  __netif_receive_skb_core+0x49c/0xac8
  [  178.437670]  __netif_receive_skb+0x28/0x78
  [  178.441757]  process_backlog+0x9c/0x160
  [  178.445584]  net_rx_action+0x2f8/0x3f0
  [...]

Reason is that sch_ingress and sch_clsact are doing mini_qdisc_pair_init()
which sets up miniq pointers to cpu_{b,q}stats from the underlying qdisc.
Problem is that this cannot work since they are actually set up right after
the qdisc ->init() callback in qdisc_create(), so first packet going into
sch_handle_ingress() tries to call mini_qdisc_bstats_cpu_update() and we
therefore panic.

In order to fix this, allocation of {b,q}stats needs to happen before we
call into ->init(). In net-next, there's already such option through commit
d59f5ffa59d8 ("net: sched: a dflt qdisc may be used with per cpu stats").
However, the bug needs to be fixed in net still for 4.15. Thus, include
these bits to reduce any merge churn and reuse the static_flags field to
set TCQ_F_CPUSTATS, and remove the allocation from qdisc_create() since
there is no other user left. Prashant Bhole ran into the same issue but
for net-next, thus adding him below as well as co-author. Same issue was
also reported by Sandipan Das when using bcc.

Fixes: 46209401f8f6 ("net: core: introduce mini_Qdisc and eliminate usage of tp->q for clsact fastpath")
Reference: https://lists.iovisor.org/pipermail/iovisor-dev/2018-January/001190.html
Reported-by: Sandipan Das <sandipan@linux.vnet.ibm.com>
Co-authored-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
Co-authored-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoqed: Fix potential use-after-free in qed_spq_post()
Roland Dreier [Mon, 15 Jan 2018 20:24:49 +0000 (12:24 -0800)]
qed: Fix potential use-after-free in qed_spq_post()

We need to check if p_ent->comp_mode is QED_SPQ_MODE_EBLOCK before
calling qed_spq_add_entry().  The test is fine is the mode is EBLOCK,
but if it isn't then qed_spq_add_entry() might kfree(p_ent).

Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonfp: use the correct index for link speed table
Jakub Kicinski [Mon, 15 Jan 2018 19:47:53 +0000 (11:47 -0800)]
nfp: use the correct index for link speed table

sts variable is holding link speed as well as state.  We should
be using ls to index into ls_to_ethtool.

Fixes: 265aeb511bd5 ("nfp: add support for .get_link_ksettings()")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agolan78xx: Fix failure in USB Full Speed
Yuiko Oshino [Mon, 15 Jan 2018 18:24:28 +0000 (13:24 -0500)]
lan78xx: Fix failure in USB Full Speed

Fix initialize the uninitialized tx_qlen to an appropriate value when USB
Full Speed is used.

Fixes: 55d7de9de6c3 ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet device driver")
Signed-off-by: Yuiko Oshino <yuiko.oshino@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge tag 'mac80211-for-davem-2018-01-15' of git://git.kernel.org/pub/scm/linux/kerne...
David S. Miller [Tue, 16 Jan 2018 19:28:14 +0000 (14:28 -0500)]
Merge tag 'mac80211-for-davem-2018-01-15' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211

Johannes Berg says:

====================
More fixes:
 * hwsim:
    - properly flush deletion works at module unload
    - validate # of channels passed from userspace
 * cfg80211:
    - fix RCU locking regression
    - initialize on-stack channel data for nl80211 event
    - check dev_set_name() return value
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agosctp: do not allow the v4 socket to bind a v4mapped v6 address
Xin Long [Mon, 15 Jan 2018 09:02:00 +0000 (17:02 +0800)]
sctp: do not allow the v4 socket to bind a v4mapped v6 address

The check in sctp_sockaddr_af is not robust enough to forbid binding a
v4mapped v6 addr on a v4 socket.

The worse thing is that v4 socket's bind_verify would not convert this
v4mapped v6 addr to a v4 addr. syzbot even reported a crash as the v4
socket bound a v6 addr.

This patch is to fix it by doing the common sa.sa_family check first,
then AF_INET check for v4mapped v6 addrs.

Fixes: 7dab83de50c7 ("sctp: Support ipv6only AF_INET6 sockets.")
Reported-by: syzbot+7b7b518b1228d2743963@syzkaller.appspotmail.com
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agosctp: return error if the asoc has been peeled off in sctp_wait_for_sndbuf
Xin Long [Mon, 15 Jan 2018 09:01:36 +0000 (17:01 +0800)]
sctp: return error if the asoc has been peeled off in sctp_wait_for_sndbuf

After commit cea0cc80a677 ("sctp: use the right sk after waking up from
wait_buf sleep"), it may change to lock another sk if the asoc has been
peeled off in sctp_wait_for_sndbuf.

However, the asoc's new sk could be already closed elsewhere, as it's in
the sendmsg context of the old sk that can't avoid the new sk's closing.
If the sk's last one refcnt is held by this asoc, later on after putting
this asoc, the new sk will be freed, while under it's own lock.

This patch is to revert that commit, but fix the old issue by returning
error under the old sk's lock.

Fixes: cea0cc80a677 ("sctp: use the right sk after waking up from wait_buf sleep")
Reported-by: syzbot+ac6ea7baa4432811eb50@syzkaller.appspotmail.com
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agosctp: reinit stream if stream outcnt has been change by sinit in sendmsg
Xin Long [Mon, 15 Jan 2018 09:01:19 +0000 (17:01 +0800)]
sctp: reinit stream if stream outcnt has been change by sinit in sendmsg

After introducing sctp_stream structure, sctp uses stream->outcnt as the
out stream nums instead of c.sinit_num_ostreams.

However when users use sinit in cmsg, it only updates c.sinit_num_ostreams
in sctp_sendmsg. At that moment, stream->outcnt is still using previous
value. If it's value is not updated, the sinit_num_ostreams of sinit could
not really work.

This patch is to fix it by updating stream->outcnt and reiniting stream
if stream outcnt has been change by sinit in sendmsg.

Fixes: a83863174a61 ("sctp: prepare asoc stream for stream reconf")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoibmvnic: Fix pending MAC address changes
Thomas Falcon [Thu, 11 Jan 2018 01:39:52 +0000 (19:39 -0600)]
ibmvnic: Fix pending MAC address changes

Due to architecture limitations, the IBM VNIC client driver is unable
to perform MAC address changes unless the device has "logged in" to
its backing device. Currently, pending MAC changes are handled before
login, resulting in an error and failure to change the MAC address.
Moving that chunk to the end of the ibmvnic_login function, when we are
sure that it was successful, fixes that.

The MAC address can be changed when the device is up or down, so
only check if the device is in a "PROBED" state before setting the
MAC address.

Fixes: c26eba03e407 ("ibmvnic: Update reset infrastructure to support tunable parameters")
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Reviewed-by: John Allen <jallen@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoRDMA/mlx5: Fix out-of-bound access while querying AH
Leon Romanovsky [Fri, 12 Jan 2018 05:58:39 +0000 (07:58 +0200)]
RDMA/mlx5: Fix out-of-bound access while querying AH

The rdma_ah_find_type() accesses the port array based on an index
controlled by userspace. The existing bounds check is after the first use
of the index, so userspace can generate an out of bounds access, as shown
by the KASN report below.

==================================================================
BUG: KASAN: slab-out-of-bounds in to_rdma_ah_attr+0xa8/0x3b0
Read of size 4 at addr ffff880019ae2268 by task ibv_rc_pingpong/409

CPU: 0 PID: 409 Comm: ibv_rc_pingpong Not tainted 4.15.0-rc2-00031-gb60a3faf5b83-dirty #3
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
Call Trace:
 dump_stack+0xe9/0x18f
 print_address_description+0xa2/0x350
 kasan_report+0x3a5/0x400
 to_rdma_ah_attr+0xa8/0x3b0
 mlx5_ib_query_qp+0xd35/0x1330
 ib_query_qp+0x8a/0xb0
 ib_uverbs_query_qp+0x237/0x7f0
 ib_uverbs_write+0x617/0xd80
 __vfs_write+0xf7/0x500
 vfs_write+0x149/0x310
 SyS_write+0xca/0x190
 entry_SYSCALL_64_fastpath+0x18/0x85
RIP: 0033:0x7fe9c7a275a0
RSP: 002b:00007ffee5498738 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 00007fe9c7ce4b00 RCX: 00007fe9c7a275a0
RDX: 0000000000000018 RSI: 00007ffee5498800 RDI: 0000000000000003
RBP: 000055d0c8d3f010 R08: 00007ffee5498800 R09: 0000000000000018
R10: 00000000000000ba R11: 0000000000000246 R12: 0000000000008000
R13: 0000000000004fb0 R14: 000055d0c8d3f050 R15: 00007ffee5498560

Allocated by task 1:
 __kmalloc+0x3f9/0x430
 alloc_mad_private+0x25/0x50
 ib_mad_post_receive_mads+0x204/0xa60
 ib_mad_init_device+0xa59/0x1020
 ib_register_device+0x83a/0xbc0
 mlx5_ib_add+0x50e/0x5c0
 mlx5_add_device+0x142/0x410
 mlx5_register_interface+0x18f/0x210
 mlx5_ib_init+0x56/0x63
 do_one_initcall+0x15b/0x270
 kernel_init_freeable+0x2d8/0x3d0
 kernel_init+0x14/0x190
 ret_from_fork+0x24/0x30

Freed by task 0:
(stack is not available)

The buggy address belongs to the object at ffff880019ae2000
 which belongs to the cache kmalloc-512 of size 512
The buggy address is located 104 bytes to the right of
 512-byte region [ffff880019ae2000ffff880019ae2200)
The buggy address belongs to the page:
page:000000005d674e18 count:1 mapcount:0 mapping:          (null) index:0x0 compound_mapcount: 0
flags: 0x4000000000008100(slab|head)
raw: 4000000000008100 0000000000000000 0000000000000000 00000001000c000c
raw: dead000000000100 dead000000000200 ffff88001a402000 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff880019ae2100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 ffff880019ae2180: 00 00 00 00 00 00 00 00 00 00 00 00 00 fc fc fc
>ffff880019ae2200: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
                                                          ^
 ffff880019ae2280: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 ffff880019ae2300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
==================================================================
Disabling lock debugging due to kernel taint

Cc: <stable@vger.kernel.org>
Fixes: 44c58487d51a ("IB/core: Define 'ib' and 'roce' rdma_ah_attr types")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
6 years agonetlink: extack: avoid parenthesized string constant warning
Johannes Berg [Mon, 15 Jan 2018 11:42:25 +0000 (12:42 +0100)]
netlink: extack: avoid parenthesized string constant warning

NL_SET_ERR_MSG() and NL_SET_ERR_MSG_ATTR() lead to the following warning
in newer versions of gcc:
  warning: array initialized from parenthesized string constant

Just remove the parentheses, they're not needed in this context since
anyway since there can be no operator precendence issues or similar.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'ipv4-Make-neigh-lookup-keys-for-loopback-point-to-point-devices-be...
David S. Miller [Mon, 15 Jan 2018 19:53:44 +0000 (14:53 -0500)]
Merge branch 'ipv4-Make-neigh-lookup-keys-for-loopback-point-to-point-devices-be-INADDR_ANY'

Jim Westfall says:

====================
ipv4: Make neigh lookup keys for loopback/point-to-point devices be INADDR_ANY

This used to be the previous behavior in older kernels but became broken in
a263b3093641f (ipv4: Make neigh lookups directly in output packet path)
and then later removed because it was broken in 0bb4087cbec0 (ipv4: Fix neigh
lookup keying over loopback/point-to-point devices)

Not having this results in there being an arp entry for every remote ip
address that the device talks to.  Given a fairly active device it can
cause the arp table to become huge and/or having to add/purge large number
of entires to keep within table size thresholds.

$ ip -4 neigh show nud noarp | grep tun | wc -l
55850

$ lnstat -k arp_cache:entries,arp_cache:allocs,arp_cache:destroys -c 10
arp_cach|arp_cach|arp_cach|
 entries|  allocs|destroys|
   81493|620166816|620126069|
  101867|   10186|       0|
  113854|    5993|       0|
  118773|    2459|       0|
   27937|   18579|   63998|
   39256|    5659|       0|
   56231|    8487|       0|
   65602|    4685|       0|
   79697|    7047|       0|
   90733|    5517|       0|

v2:
 - fixes coding style issues
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoipv4: Make neigh lookup keys for loopback/point-to-point devices be INADDR_ANY
Jim Westfall [Sun, 14 Jan 2018 12:18:51 +0000 (04:18 -0800)]
ipv4: Make neigh lookup keys for loopback/point-to-point devices be INADDR_ANY

Map all lookup neigh keys to INADDR_ANY for loopback/point-to-point devices
to avoid making an entry for every remote ip the device needs to talk to.

This used the be the old behavior but became broken in a263b3093641f
(ipv4: Make neigh lookups directly in output packet path) and later removed
in 0bb4087cbec0 (ipv4: Fix neigh lookup keying over loopback/point-to-point
devices) because it was broken.

Signed-off-by: Jim Westfall <jwestfall@surrealistic.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: Allow neigh contructor functions ability to modify the primary_key
Jim Westfall [Sun, 14 Jan 2018 12:18:50 +0000 (04:18 -0800)]
net: Allow neigh contructor functions ability to modify the primary_key

Use n->primary_key instead of pkey to account for the possibility that a neigh
constructor function may have modified the primary_key value.

Signed-off-by: Jim Westfall <jwestfall@surrealistic.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agosh_eth: fix dumping ARSTR
Sergei Shtylyov [Sat, 13 Jan 2018 17:22:01 +0000 (20:22 +0300)]
sh_eth: fix dumping ARSTR

ARSTR  is always located at the start of the TSU register region, thus
using add_reg()  instead of add_tsu_reg() in __sh_eth_get_regs() to dump it
causes EDMR or EDSR (depending on the register layout) to be dumped instead
of ARSTR.  Use the correct condition/macro there...

Fixes: 6b4b4fead342 ("sh_eth: Implement ethtool register dump operations")
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoRevert "openvswitch: Add erspan tunnel support."
William Tu [Fri, 12 Jan 2018 20:29:22 +0000 (12:29 -0800)]
Revert "openvswitch: Add erspan tunnel support."

This reverts commit ceaa001a170e43608854d5290a48064f57b565ed.

The OVS_TUNNEL_KEY_ATTR_ERSPAN_OPTS attr should be designed
as a nested attribute to support all ERSPAN v1 and v2's fields.
The current attr is a be32 supporting only one field.  Thus, this
patch reverts it and later patch will redo it using nested attr.

Signed-off-by: William Tu <u9012063@gmail.com>
Cc: Jiri Benc <jbenc@redhat.com>
Cc: Pravin Shelar <pshelar@ovn.org>
Acked-by: Jiri Benc <jbenc@redhat.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet/tls: Fix inverted error codes to avoid endless loop
r.hering@avm.de [Fri, 12 Jan 2018 14:42:06 +0000 (15:42 +0100)]
net/tls: Fix inverted error codes to avoid endless loop

sendfile() calls can hang endless with using Kernel TLS if a socket error occurs.
Socket error codes must be inverted by Kernel TLS before returning because
they are stored with positive sign. If returned non-inverted they are
interpreted as number of bytes sent, causing endless looping of the
splice mechanic behind sendfile().

Signed-off-by: Robert Hering <r.hering@avm.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoipv6: ip6_make_skb() needs to clear cork.base.dst
Eric Dumazet [Fri, 12 Jan 2018 06:31:18 +0000 (22:31 -0800)]
ipv6: ip6_make_skb() needs to clear cork.base.dst

In my last patch, I missed fact that cork.base.dst was not initialized
in ip6_make_skb() :

If ip6_setup_cork() returns an error, we might attempt a dst_release()
on some random pointer.

Fixes: 862c03ee1deb ("ipv6: fix possible mem leaks in ipv6_make_skb()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agotracing: Prevent PROFILE_ALL_BRANCHES when FORTIFY_SOURCE=y
Randy Dunlap [Mon, 15 Jan 2018 19:07:27 +0000 (11:07 -0800)]
tracing: Prevent PROFILE_ALL_BRANCHES when FORTIFY_SOURCE=y

I regularly get 50 MB - 60 MB files during kernel randconfig builds.
These large files mostly contain (many repeats of; e.g., 124,594):

In file included from ../include/linux/string.h:6:0,
                 from ../include/linux/uuid.h:20,
                 from ../include/linux/mod_devicetable.h:13,
                 from ../scripts/mod/devicetable-offsets.c:3:
../include/linux/compiler.h:64:4: warning: '______f' is static but declared in inline function 'strcpy' which is not static [enabled by default]
    ______f = {     \
    ^
../include/linux/compiler.h:56:23: note: in expansion of macro '__trace_if'
                       ^
../include/linux/string.h:425:2: note: in expansion of macro 'if'
  if (p_size == (size_t)-1 && q_size == (size_t)-1)
  ^

This only happens when CONFIG_FORTIFY_SOURCE=y and
CONFIG_PROFILE_ALL_BRANCHES=y, so prevent PROFILE_ALL_BRANCHES if
FORTIFY_SOURCE=y.

Link: http://lkml.kernel.org/r/9199446b-a141-c0c3-9678-a3f9107f2750@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
6 years agosctp: avoid compiler warning on implicit fallthru
Marcelo Ricardo Leitner [Thu, 11 Jan 2018 16:22:06 +0000 (14:22 -0200)]
sctp: avoid compiler warning on implicit fallthru

These fall-through are expected.

Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: ipv4: Make "ip route get" match iif lo rules again.
Lorenzo Colitti [Thu, 11 Jan 2018 09:36:26 +0000 (18:36 +0900)]
net: ipv4: Make "ip route get" match iif lo rules again.

Commit 3765d35ed8b9 ("net: ipv4: Convert inet_rtm_getroute to rcu
versions of route lookup") broke "ip route get" in the presence
of rules that specify iif lo.

Host-originated traffic always has iif lo, because
ip_route_output_key_hash and ip6_route_output_flags set the flow
iif to LOOPBACK_IFINDEX. Thus, putting "iif lo" in an ip rule is a
convenient way to select only originated traffic and not forwarded
traffic.

inet_rtm_getroute used to match these rules correctly because
even though it sets the flow iif to 0, it called
ip_route_output_key which overwrites iif with LOOPBACK_IFINDEX.
But now that it calls ip_route_output_key_hash_rcu, the ifindex
will remain 0 and not match the iif lo in the rule. As a result,
"ip route get" will return ENETUNREACH.

Fixes: 3765d35ed8b9 ("net: ipv4: Convert inet_rtm_getroute to rcu versions of route lookup")
Tested: https://android.googlesource.com/kernel/tests/+/master/net/test/multinetwork_test.py passes again
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonetlink: extack needs to be reset each time through loop
David Ahern [Wed, 10 Jan 2018 21:00:39 +0000 (13:00 -0800)]
netlink: extack needs to be reset each time through loop

syzbot triggered the WARN_ON in netlink_ack testing the bad_attr value.
The problem is that netlink_rcv_skb loops over the skb repeatedly invoking
the callback and without resetting the extack leaving potentially stale
data. Initializing each time through avoids the WARN_ON.

Fixes: 2d4bc93368f5a ("netlink: extended ACK reporting")
Reported-by: syzbot+315fa6766d0f7c359327@syzkaller.appspotmail.com
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agotipc: fix a memory leak in tipc_nl_node_get_link()
Cong Wang [Wed, 10 Jan 2018 20:50:25 +0000 (12:50 -0800)]
tipc: fix a memory leak in tipc_nl_node_get_link()

When tipc_node_find_by_name() fails, the nlmsg is not
freed.

While on it, switch to a goto label to properly
free it.

Fixes: be9c086715c ("tipc: narrow down exposure of struct tipc_node")
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Jon Maloy <jon.maloy@ericsson.com>
Cc: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoipv6: fix udpv6 sendmsg crash caused by too small MTU
Mike Maloney [Wed, 10 Jan 2018 17:45:10 +0000 (12:45 -0500)]
ipv6: fix udpv6 sendmsg crash caused by too small MTU

The logic in __ip6_append_data() assumes that the MTU is at least large
enough for the headers.  A device's MTU may be adjusted after being
added while sendmsg() is processing data, resulting in
__ip6_append_data() seeing any MTU.  For an mtu smaller than the size of
the fragmentation header, the math results in a negative 'maxfraglen',
which causes problems when refragmenting any previous skb in the
skb_write_queue, leaving it possibly malformed.

Instead sendmsg returns EINVAL when the mtu is calculated to be less
than IPV6_MIN_MTU.

Found by syzkaller:
kernel BUG at ./include/linux/skbuff.h:2064!
invalid opcode: 0000 [#1] SMP KASAN
Dumping ftrace buffer:
   (ftrace buffer empty)
Modules linked in:
CPU: 1 PID: 14216 Comm: syz-executor5 Not tainted 4.13.0-rc4+ #2
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
task: ffff8801d0b68580 task.stack: ffff8801ac6b8000
RIP: 0010:__skb_pull include/linux/skbuff.h:2064 [inline]
RIP: 0010:__ip6_make_skb+0x18cf/0x1f70 net/ipv6/ip6_output.c:1617
RSP: 0018:ffff8801ac6bf570 EFLAGS: 00010216
RAX: 0000000000010000 RBX: 0000000000000028 RCX: ffffc90003cce000
RDX: 00000000000001b8 RSI: ffffffff839df06f RDI: ffff8801d9478ca0
RBP: ffff8801ac6bf780 R08: ffff8801cc3f1dbc R09: 0000000000000000
R10: ffff8801ac6bf7a0 R11: 43cb4b7b1948a9e7 R12: ffff8801cc3f1dc8
R13: ffff8801cc3f1d40 R14: 0000000000001036 R15: dffffc0000000000
FS:  00007f43d740c700(0000) GS:ffff8801dc100000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f7834984000 CR3: 00000001d79b9000 CR4: 00000000001406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 ip6_finish_skb include/net/ipv6.h:911 [inline]
 udp_v6_push_pending_frames+0x255/0x390 net/ipv6/udp.c:1093
 udpv6_sendmsg+0x280d/0x31a0 net/ipv6/udp.c:1363
 inet_sendmsg+0x11f/0x5e0 net/ipv4/af_inet.c:762
 sock_sendmsg_nosec net/socket.c:633 [inline]
 sock_sendmsg+0xca/0x110 net/socket.c:643
 SYSC_sendto+0x352/0x5a0 net/socket.c:1750
 SyS_sendto+0x40/0x50 net/socket.c:1718
 entry_SYSCALL_64_fastpath+0x1f/0xbe
RIP: 0033:0x4512e9
RSP: 002b:00007f43d740bc08 EFLAGS: 00000216 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 00000000007180a8 RCX: 00000000004512e9
RDX: 000000000000002e RSI: 0000000020d08000 RDI: 0000000000000005
RBP: 0000000000000086 R08: 00000000209c1000 R09: 000000000000001c
R10: 0000000000040800 R11: 0000000000000216 R12: 00000000004b9c69
R13: 00000000ffffffff R14: 0000000000000005 R15: 00000000202c2000
Code: 9e 01 fe e9 c5 e8 ff ff e8 7f 9e 01 fe e9 4a ea ff ff 48 89 f7 e8 52 9e 01 fe e9 aa eb ff ff e8 a8 b6 cf fd 0f 0b e8 a1 b6 cf fd <0f> 0b 49 8d 45 78 4d 8d 45 7c 48 89 85 78 fe ff ff 49 8d 85 ba
RIP: __skb_pull include/linux/skbuff.h:2064 [inline] RSP: ffff8801ac6bf570
RIP: __ip6_make_skb+0x18cf/0x1f70 net/ipv6/ip6_output.c:1617 RSP: ffff8801ac6bf570

Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Mike Maloney <maloney@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: cs89x0: add MODULE_LICENSE
Arnd Bergmann [Wed, 10 Jan 2018 16:30:22 +0000 (17:30 +0100)]
net: cs89x0: add MODULE_LICENSE

This driver lacks a MODULE_LICENSE tag, leading to a Kbuild warning:

WARNING: modpost: missing MODULE_LICENSE() in drivers/net/ethernet/cirrus/cs89x0.o

This adds license, author, and description according to the
comment block at the start of the file.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoppp: unlock all_ppp_mutex before registering device
Guillaume Nault [Wed, 10 Jan 2018 15:24:45 +0000 (16:24 +0100)]
ppp: unlock all_ppp_mutex before registering device

ppp_dev_uninit(), which is the .ndo_uninit() handler of PPP devices,
needs to lock pn->all_ppp_mutex. Therefore we mustn't call
register_netdevice() with pn->all_ppp_mutex already locked, or we'd
deadlock in case register_netdevice() fails and calls .ndo_uninit().

Fortunately, we can unlock pn->all_ppp_mutex before calling
register_netdevice(). This lock protects pn->units_idr, which isn't
used in the device registration process.

However, keeping pn->all_ppp_mutex locked during device registration
did ensure that no device in transient state would be published in
pn->units_idr. In practice, unlocking it before calling
register_netdevice() doesn't change this property: ppp_unit_register()
is called with 'ppp_mutex' locked and all searches done in
pn->units_idr hold this lock too.

Fixes: 8cb775bc0a34 ("ppp: fix device unregistration upon netns deletion")
Reported-and-tested-by: syzbot+367889b9c9e279219175@syzkaller.appspotmail.com
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoptr_ring: document usage around __ptr_ring_peek
Michael S. Tsirkin [Wed, 10 Jan 2018 14:03:05 +0000 (16:03 +0200)]
ptr_ring: document usage around __ptr_ring_peek

This explains why is the net usage of __ptr_ring_peek
actually ok without locks.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years ago9p: add missing module license for xen transport
Stephen Hemminger [Mon, 8 Jan 2018 16:23:18 +0000 (08:23 -0800)]
9p: add missing module license for xen transport

The 9P of Xen module is missing required license and module information.
See https://bugzilla.kernel.org/show_bug.cgi?id=198109

Reported-by: Alan Bartlett <ajb@elrepo.org>
Fixes: 868eb122739a ("xen/9pfs: introduce Xen 9pfs transport driver")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoring-buffer: Bring back context level recursive checks
Steven Rostedt (VMware) [Mon, 15 Jan 2018 15:47:09 +0000 (10:47 -0500)]
ring-buffer: Bring back context level recursive checks

Commit 1a149d7d3f45 ("ring-buffer: Rewrite trace_recursive_(un)lock() to be
simpler") replaced the context level recursion checks with a simple counter.
This would prevent the ring buffer code from recursively calling itself more
than the max number of contexts that exist (Normal, softirq, irq, nmi). But
this change caused a lockup in a specific case, which was during suspend and
resume using a global clock. Adding a stack dump to see where this occurred,
the issue was in the trace global clock itself:

  trace_buffer_lock_reserve+0x1c/0x50
  __trace_graph_entry+0x2d/0x90
  trace_graph_entry+0xe8/0x200
  prepare_ftrace_return+0x69/0xc0
  ftrace_graph_caller+0x78/0xa8
  queued_spin_lock_slowpath+0x5/0x1d0
  trace_clock_global+0xb0/0xc0
  ring_buffer_lock_reserve+0xf9/0x390

The function graph tracer traced queued_spin_lock_slowpath that was called
by trace_clock_global. This pointed out that the trace_clock_global() is not
reentrant, as it takes a spin lock. It depended on the ring buffer recursive
lock from letting that happen.

By removing the context detection and adding just a max number of allowable
recursions, it allowed the trace_clock_global() to be entered again and try
to retake the spinlock it already held, causing a deadlock.

Fixes: 1a149d7d3f45 ("ring-buffer: Rewrite trace_recursive_(un)lock() to be simpler")
Reported-by: David Weinehall <david.weinehall@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
6 years agocfg80211: check dev_set_name() return value
Johannes Berg [Mon, 15 Jan 2018 08:58:27 +0000 (09:58 +0100)]
cfg80211: check dev_set_name() return value

syzbot reported a warning from rfkill_alloc(), and after a while
I think that the reason is that it was doing fault injection and
the dev_set_name() failed, leaving the name NULL, and we didn't
check the return value and got to rfkill_alloc() with a NULL name.
Since we really don't want a NULL name, we ought to check the
return value.

Fixes: fb28ad35906a ("net: struct device - replace bus_id with dev_name(), dev_set_name()")
Reported-by: syzbot+1ddfb3357e1d7bb5b5d3@syzkaller.appspotmail.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
6 years agomac80211_hwsim: validate number of different channels
Johannes Berg [Mon, 15 Jan 2018 08:32:36 +0000 (09:32 +0100)]
mac80211_hwsim: validate number of different channels

When creating a new radio on the fly, hwsim allows this
to be done with an arbitrary number of channels, but
cfg80211 only supports a limited number of simultaneous
channels, leading to a warning.

Fix this by validating the number - this requires moving
the define for the maximum out to a visible header file.

Reported-by: syzbot+8dd9051ff19940290931@syzkaller.appspotmail.com
Fixes: b59ec8dd4394 ("mac80211_hwsim: fix number of channels in interface combinations")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
6 years agomac80211_hwsim: add workqueue to wait for deferred radio deletion on mod unload
Benjamin Beichler [Wed, 10 Jan 2018 16:42:51 +0000 (17:42 +0100)]
mac80211_hwsim: add workqueue to wait for deferred radio deletion on mod unload

When closing multiple wmediumd instances with many radios and try to
unload the  mac80211_hwsim module, it may happen that the work items live
longer than the module. To wait especially for this deletion work items,
add a work queue, otherwise flush_scheduled_work would be necessary.

Signed-off-by: Benjamin Beichler <benjamin.beichler@uni-rostock.de>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
6 years agonl80211: take RCU read lock when calling ieee80211_bss_get_ie()
Dominik Brodowski [Mon, 15 Jan 2018 07:12:15 +0000 (08:12 +0100)]
nl80211: take RCU read lock when calling ieee80211_bss_get_ie()

As ieee80211_bss_get_ie() derefences an RCU to return ssid_ie, both
the call to this function and any operation on this variable need
protection by the RCU read lock.

Fixes: 44905265bc15 ("nl80211: don't expose wdev->ssid for most interfaces")
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
6 years agocfg80211: fully initialize old channel for event
Johannes Berg [Mon, 15 Jan 2018 08:12:05 +0000 (09:12 +0100)]
cfg80211: fully initialize old channel for event

Paul reported that he got a report about undefined behaviour
that seems to me to originate in using uninitialized memory
when the channel structure here is used in the event code in
nl80211 later.

He never reported whether this fixed it, and I wasn't able
to trigger this so far, but we should do the right thing and
fully initialize the on-stack structure anyway.

Reported-by: Paul Menzel <pmenzel+linux-wireless@molgen.mpg.de>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
6 years agoLinux 4.15-rc8
Linus Torvalds [Sun, 14 Jan 2018 23:32:30 +0000 (15:32 -0800)]
Linux 4.15-rc8

6 years agoMerge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 14 Jan 2018 23:30:02 +0000 (15:30 -0800)]
Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixlet from Thomas Gleixner.

Remove a warning about lack of compiler support for retpoline that most
people can't do anything about, so it just annoys them needlessly.

* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/retpoline: Remove compile time warning

6 years agoMerge tag 'powerpc-4.15-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc...
Linus Torvalds [Sun, 14 Jan 2018 23:03:17 +0000 (15:03 -0800)]
Merge tag 'powerpc-4.15-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:
 "One fix for an oops at boot if we take a hotplug interrupt before we
  are ready to handle it.

  The bulk is patches to implement mitigation for Meltdown, see the
  change logs for more details.

  Thanks to: Nicholas Piggin, Michael Neuling, Oliver O'Halloran, Jon
  Masters, Jose Ricardo Ziviani, David Gibson"

* tag 'powerpc-4.15-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/powernv: Check device-tree for RFI flush settings
  powerpc/pseries: Query hypervisor for RFI flush settings
  powerpc/64s: Support disabling RFI flush with no_rfi_flush and nopti
  powerpc/64s: Add support for RFI flush of L1-D cache
  powerpc/64s: Convert slb_miss_common to use RFI_TO_USER/KERNEL
  powerpc/64: Convert fast_exception_return to use RFI_TO_USER/KERNEL
  powerpc/64: Convert the syscall exit path to use RFI_TO_USER/KERNEL
  powerpc/64s: Simple RFI macro conversions
  powerpc/64: Add macros for annotating the destination of rfid/hrfid
  powerpc/pseries: Add H_GET_CPU_CHARACTERISTICS flags & wrapper
  powerpc/pseries: Make RAS IRQ explicitly dependent on DLPAR WQ

6 years agox86/retpoline: Remove compile time warning
Thomas Gleixner [Sun, 14 Jan 2018 21:13:29 +0000 (22:13 +0100)]
x86/retpoline: Remove compile time warning

Remove the compile time warning when CONFIG_RETPOLINE=y and the compiler
does not have retpoline support. Linus rationale for this is:

  It's wrong because it will just make people turn off RETPOLINE, and the
  asm updates - and return stack clearing - that are independent of the
  compiler are likely the most important parts because they are likely the
  ones easiest to target.

  And it's annoying because most people won't be able to do anything about
  it. The number of people building their own compiler? Very small. So if
  their distro hasn't got a compiler yet (and pretty much nobody does), the
  warning is just annoying crap.

  It is already properly reported as part of the sysfs interface. The
  compile-time warning only encourages bad things.

Fixes: 76b043848fd2 ("x86/retpoline: Add initial retpoline support")
Requested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: thomas.lendacky@amd.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kees Cook <keescook@google.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Link: https://lkml.kernel.org/r/CA+55aFzWgquv4i6Mab6bASqYXg3ErV3XDFEYf=GEcCDQg5uAtw@mail.gmail.com
6 years agoMerge branch 'for-linus' of git://git.kernel.dk/linux-block
Linus Torvalds [Sun, 14 Jan 2018 18:22:45 +0000 (10:22 -0800)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block

Pull NVMe fix from Jens Axboe:
 "Just a single fix for nvme over fabrics that should go into 4.15"

* 'for-linus' of git://git.kernel.dk/linux-block:
  nvme-fabrics: initialize default host->id in nvmf_host_default()

6 years agoMerge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 14 Jan 2018 17:51:25 +0000 (09:51 -0800)]
Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 pti updates from Thomas Gleixner:
 "This contains:

   - a PTI bugfix to avoid setting reserved CR3 bits when PCID is
     disabled. This seems to cause issues on a virtual machine at least
     and is incorrect according to the AMD manual.

   - a PTI bugfix which disables the perf BTS facility if PTI is
     enabled. The BTS AUX buffer is not globally visible and causes the
     CPU to fault when the mapping disappears on switching CR3 to user
     space. A full fix which restores BTS on PTI is non trivial and will
     be worked on.

   - PTI bugfixes for EFI and trusted boot which make sure that the user
     space visible page table entries have the NX bit cleared

   - removal of dead code in the PTI pagetable setup functions

   - add PTI documentation

   - add a selftest for vsyscall to verify that the kernel actually
     implements what it advertises.

   - a sysfs interface to expose vulnerability and mitigation
     information so there is a coherent way for users to retrieve the
     status.

   - the initial spectre_v2 mitigations, aka retpoline:

      + The necessary ASM thunk and compiler support

      + The ASM variants of retpoline and the conversion of affected ASM
        code

      + Make LFENCE serializing on AMD so it can be used as speculation
        trap

      + The RSB fill after vmexit

   - initial objtool support for retpoline

  As I said in the status mail this is the most of the set of patches
  which should go into 4.15 except two straight forward patches still on
  hold:

   - the retpoline add on of LFENCE which waits for ACKs

   - the RSB fill after context switch

  Both should be ready to go early next week and with that we'll have
  covered the major holes of spectre_v2 and go back to normality"

* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (28 commits)
  x86,perf: Disable intel_bts when PTI
  security/Kconfig: Correct the Documentation reference for PTI
  x86/pti: Fix !PCID and sanitize defines
  selftests/x86: Add test_vsyscall
  x86/retpoline: Fill return stack buffer on vmexit
  x86/retpoline/irq32: Convert assembler indirect jumps
  x86/retpoline/checksum32: Convert assembler indirect jumps
  x86/retpoline/xen: Convert Xen hypercall indirect jumps
  x86/retpoline/hyperv: Convert assembler indirect jumps
  x86/retpoline/ftrace: Convert ftrace assembler indirect jumps
  x86/retpoline/entry: Convert entry assembler indirect jumps
  x86/retpoline/crypto: Convert crypto assembler indirect jumps
  x86/spectre: Add boot time option to select Spectre v2 mitigation
  x86/retpoline: Add initial retpoline support
  objtool: Allow alternatives to be ignored
  objtool: Detect jumps to retpoline thunks
  x86/pti: Make unpoison of pgd for trusted boot work for real
  x86/alternatives: Fix optimize_nops() checking
  sysfs/cpu: Fix typos in vulnerability documentation
  x86/cpu/AMD: Use LFENCE_RDTSC in preference to MFENCE_RDTSC
  ...

6 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
David S. Miller [Sun, 14 Jan 2018 16:01:33 +0000 (11:01 -0500)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Daniel Borkmann says:

====================
pull-request: bpf 2018-01-13

The following pull-request contains BPF updates for your *net* tree.

The main changes are:

1) Follow-up fix to the recent BPF out-of-bounds speculation
   fix that prevents max_entries overflows and an undefined
   behavior on 32 bit archs on index_mask calculation, from
   Daniel.

2) Reject unsupported BPF_ARSH opcode in 32 bit ALU mode that
   was otherwise throwing an unknown opcode warning in the
   interpreter, from Daniel.

3) Typo fix in one of the user facing verbose() messages that
   was added during the BPF out-of-bounds speculation fix,
   from Colin.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agox86,perf: Disable intel_bts when PTI
Peter Zijlstra [Sun, 14 Jan 2018 10:27:13 +0000 (11:27 +0100)]
x86,perf: Disable intel_bts when PTI

The intel_bts driver does not use the 'normal' BTS buffer which is exposed
through the cpu_entry_area but instead uses the memory allocated for the
perf AUX buffer.

This obviously comes apart when using PTI because then the kernel mapping;
which includes that AUX buffer memory; disappears. Fixing this requires to
expose a mapping which is visible in all context and that's not trivial.

As a quick fix disable this driver when PTI is enabled to prevent
malfunction.

Fixes: 385ce0ea4c07 ("x86/mm/pti: Add Kconfig")
Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Reported-by: Robert Święcki <robert@swiecki.net>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: greg@kroah.com
Cc: hughd@google.com
Cc: luto@amacapital.net
Cc: Vince Weaver <vince@deater.net>
Cc: torvalds@linux-foundation.org
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180114102713.GB6166@worktop.programming.kicks-ass.net
6 years agosecurity/Kconfig: Correct the Documentation reference for PTI
W. Trevor King [Fri, 12 Jan 2018 23:24:59 +0000 (15:24 -0800)]
security/Kconfig: Correct the Documentation reference for PTI

When the config option for PTI was added a reference to documentation was
added as well. But the documentation did not exist at that point. The final
documentation has a different file name.

Fix it up to point to the proper file.

Fixes: 385ce0ea ("x86/mm/pti: Add Kconfig")
Signed-off-by: W. Trevor King <wking@tremily.us>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: linux-mm@kvack.org
Cc: linux-security-module@vger.kernel.org
Cc: James Morris <james.l.morris@oracle.com>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/3009cc8ccbddcd897ec1e0cb6dda524929de0d14.1515799398.git.wking@tremily.us
6 years agox86/pti: Fix !PCID and sanitize defines
Thomas Gleixner [Sat, 13 Jan 2018 23:23:57 +0000 (00:23 +0100)]
x86/pti: Fix !PCID and sanitize defines

The switch to the user space page tables in the low level ASM code sets
unconditionally bit 12 and bit 11 of CR3. Bit 12 is switching the base
address of the page directory to the user part, bit 11 is switching the
PCID to the PCID associated with the user page tables.

This fails on a machine which lacks PCID support because bit 11 is set in
CR3. Bit 11 is reserved when PCID is inactive.

While the Intel SDM claims that the reserved bits are ignored when PCID is
disabled, the AMD APM states that they should be cleared.

This went unnoticed as the AMD APM was not checked when the code was
developed and reviewed and test systems with Intel CPUs never failed to
boot. The report is against a Centos 6 host where the guest fails to boot,
so it's not yet clear whether this is a virt issue or can happen on real
hardware too, but thats irrelevant as the AMD APM clearly ask for clearing
the reserved bits.

Make sure that on non PCID machines bit 11 is not set by the page table
switching code.

Andy suggested to rename the related bits and masks so they are clearly
describing what they should be used for, which is done as well for clarity.

That split could have been done with alternatives but the macro hell is
horrible and ugly. This can be done on top if someone cares to remove the
extra orq. For now it's a straight forward fix.

Fixes: 6fd166aae78c ("x86/mm: Use/Fix PCID to optimize user/kernel switches")
Reported-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable <stable@vger.kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Willy Tarreau <w@1wt.eu>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801140009150.2371@nanos
6 years agoMerge tag 'usb-4.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Linus Torvalds [Sat, 13 Jan 2018 22:10:32 +0000 (14:10 -0800)]
Merge tag 'usb-4.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB fixes from Greg KH:
 "Here are some small USB fixes and device ids for 4.15-rc8

  Nothing major, small fixes for various devices, some resolutions for
  bugs found by fuzzers, and the usual handful of new device ids.

  All of these have been in linux-next with no reported issues"

* tag 'usb-4.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  Documentation: usb: fix typo in UVC gadgetfs config command
  usb: misc: usb3503: make sure reset is low for at least 100us
  uas: ignore UAS for Norelsys NS1068(X) chips
  USB: UDC core: fix double-free in usb_add_gadget_udc_release
  USB: fix usbmon BUG trigger
  usbip: vudc_tx: fix v_send_ret_submit() vulnerability to null xfer buffer
  usbip: remove kernel addresses from usb device and urb debug msgs
  usbip: fix vudc_rx: harden CMD_SUBMIT path to handle malicious input
  USB: serial: cp210x: add new device ID ELV ALC 8xxx
  USB: serial: cp210x: add IDs for LifeScan OneTouch Verio IQ

6 years agoMerge tag 'staging-4.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh...
Linus Torvalds [Sat, 13 Jan 2018 22:04:06 +0000 (14:04 -0800)]
Merge tag 'staging-4.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging

Pull staging driver fix from Greg KH:
 "Here is a single android ashmem bugfix that resolves a reported issue
  in that interface. It's been in linux-next this week with no reported
  issues"

* tag 'staging-4.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  staging: android: ashmem: fix a race condition in ASHMEM_SET_SIZE ioctl

6 years agoMerge tag 'char-misc-4.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregk...
Linus Torvalds [Sat, 13 Jan 2018 22:01:59 +0000 (14:01 -0800)]
Merge tag 'char-misc-4.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc fixes from Greg KH:
 "Here are two bugfixes for some driver bugs for 4.15-rc8

  The first is a bluetooth security bug that has been ignored by the
  Bluetooth developers for months for no obvious reason at all, so I've
  taken it through my tree.

  The second is a simple double-free bug in the mux subsystem.

  Both have been in linux-next for a while with no reported issues"

* tag 'char-misc-4.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  mux: core: fix double get_device()
  Bluetooth: Prevent stack info leak from the EFS element.

6 years agoMerge tag 'kbuild-fixes-v4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/masah...
Linus Torvalds [Sat, 13 Jan 2018 21:24:56 +0000 (13:24 -0800)]
Merge tag 'kbuild-fixes-v4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild fixes from Masahiro Yamada:

 - fix cross-compilation for architectures that setup CROSS_COMPILE in
   their arch Makefile

 - fix Kconfig rational operators for bool / tristate

 - drop a gperf-generated file from .gitignore

* tag 'kbuild-fixes-v4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  genksyms: drop *.hash.c from .gitignore
  kconfig: fix relational operators for bool and tristate symbols
  kbuild: move cc-option and cc-disable-warning after incl. arch Makefile

6 years agoMerge tag 'apparmor-pr-2018-01-12' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 13 Jan 2018 21:18:15 +0000 (13:18 -0800)]
Merge tag 'apparmor-pr-2018-01-12' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor

Pull apparmor regression fixes from John Johansen:
 "This fixes a couple bugs I have been working with Matthew Garrett on
  this week. Specifically a regression in the handling of a conflicting
  profile attachment and label match restrictions for ptrace when
  profiles are stacked.

  Summary:

   - fix ptrace label match when matching stacked labels

   - fix regression in profile conflict logic"

* tag 'apparmor-pr-2018-01-12' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor:
  apparmor: Fix regression in profile conflict logic
  apparmor: fix ptrace label match when matching stacked labels

6 years agoMerge tag 'pci-v4.15-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaa...
Linus Torvalds [Sat, 13 Jan 2018 21:14:54 +0000 (13:14 -0800)]
Merge tag 'pci-v4.15-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI fixes from Bjorn Helgaas:
 "Fix AMD boot regression due to 64-bit window conflicting with system
  memory (Christian König)"

* tag 'pci-v4.15-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  x86/PCI: Move and shrink AMD 64-bit window to avoid conflict
  x86/PCI: Add "pci=big_root_window" option for AMD 64-bit windows

6 years agoMerge branch 'akpm' (patches from Andrew)
Linus Torvalds [Sat, 13 Jan 2018 19:07:55 +0000 (11:07 -0800)]
Merge branch 'akpm' (patches from Andrew)

Merge misc fixlets from Andrew Morton:
 "4 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  tools/objtool/Makefile: don't assume sync-check.sh is executable
  kdump: write correct address of mem_section into vmcoreinfo
  kmemleak: allow to coexist with fault injection
  MAINTAINERS, nilfs2: change project home URLs

6 years agotools/objtool/Makefile: don't assume sync-check.sh is executable
Andrew Morton [Sat, 13 Jan 2018 00:53:17 +0000 (16:53 -0800)]
tools/objtool/Makefile: don't assume sync-check.sh is executable

patch(1) loses the x bit.  So if a user follows our patching
instructions in Documentation/admin-guide/README.rst, their kernel will
not compile.

Fixes: 3bd51c5a371de ("objtool: Move kernel headers/code sync check to a script")
Reported-by: Nicolas Bock <nicolasbock@gentoo.org>
Reported-by Joakim Tjernlund <Joakim.Tjernlund@infinera.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agokdump: write correct address of mem_section into vmcoreinfo
Kirill A. Shutemov [Sat, 13 Jan 2018 00:53:14 +0000 (16:53 -0800)]
kdump: write correct address of mem_section into vmcoreinfo

Depending on configuration mem_section can now be an array or a pointer
to an array allocated dynamically.  In most cases, we can continue to
refer to it as 'mem_section' regardless of what it is.

But there's one exception: '&mem_section' means "address of the array"
if mem_section is an array, but if mem_section is a pointer, it would
mean "address of the pointer".

We've stepped onto this in kdump code.  VMCOREINFO_SYMBOL(mem_section)
writes down address of pointer into vmcoreinfo, not array as we wanted.

Let's introduce VMCOREINFO_SYMBOL_ARRAY() that would handle the
situation correctly for both cases.

Link: http://lkml.kernel.org/r/20180112162532.35896-1-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Fixes: 83e3c48729d9 ("mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y")
Acked-by: Baoquan He <bhe@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Dave Young <dyoung@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agokmemleak: allow to coexist with fault injection
Dmitry Vyukov [Sat, 13 Jan 2018 00:53:10 +0000 (16:53 -0800)]
kmemleak: allow to coexist with fault injection

kmemleak does one slab allocation per user allocation.  So if slab fault
injection is enabled to any degree, kmemleak instantly fails to allocate
and turns itself off.  However, it's useful to use kmemleak with fault
injection to find leaks on error paths.  On the other hand, checking
kmemleak itself is not so useful because (1) it's a debugging tool and
(2) it has a very regular allocation pattern (basically a single
allocation site, so it either works or not).

Turn off fault injection for kmemleak allocations.

Link: http://lkml.kernel.org/r/20180109192243.19316-1-dvyukov@google.com
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agoMAINTAINERS, nilfs2: change project home URLs
Ryusuke Konishi [Sat, 13 Jan 2018 00:53:07 +0000 (16:53 -0800)]
MAINTAINERS, nilfs2: change project home URLs

The domain of NILFS project home was changed to "nilfs.sourceforge.io"
to enable https access (the previous domain "nilfs.sourceforge.net" is
redirected to the new one).  Modify URLs of the project home to reflect
this change and to replace their protocol from http to https.

Link: http://lkml.kernel.org/r/1515416141-5614-1-git-send-email-konishi.ryusuke@lab.ntt.co.jp
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agogenksyms: drop *.hash.c from .gitignore
Masahiro Yamada [Thu, 11 Jan 2018 09:28:08 +0000 (18:28 +0900)]
genksyms: drop *.hash.c from .gitignore

This is a left-over of commit bb3290d91695 ("Remove gperf usage from
toolchain").

We do not generate a hash function any more.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
6 years agoselftests/x86: Add test_vsyscall
Andy Lutomirski [Fri, 12 Jan 2018 01:16:51 +0000 (17:16 -0800)]
selftests/x86: Add test_vsyscall

This tests that the vsyscall entries do what they're expected to do.
It also confirms that attempts to read the vsyscall page behave as
expected.

If changes are made to the vsyscall code or its memory map handling,
running this test in all three of vsyscall=none, vsyscall=emulate,
and vsyscall=native are helpful.

(Because it's easy, this also compares the vsyscall results to their
 vDSO equivalents.)

Note to KAISER backporters: please test this under all three
vsyscall modes.  Also, in the emulate and native modes, make sure
that test_vsyscall_64 agrees with the command line or config
option as to which mode you're in.  It's quite easy to mess up
the kernel such that native mode accidentally emulates
or vice versa.

Greg, etc: please backport this to all your Meltdown-patched
kernels.  It'll help make sure the patches didn't regress
vsyscalls.

CSigned-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/2b9c5a174c1d60fd7774461d518aa75598b1d8fd.1515719552.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
6 years agoapparmor: Fix regression in profile conflict logic
Matthew Garrett [Thu, 11 Jan 2018 21:07:54 +0000 (13:07 -0800)]
apparmor: Fix regression in profile conflict logic

The intended behaviour in apparmor profile matching is to flag a
conflict if two profiles match equally well. However, right now a
conflict is generated if another profile has the same match length even
if that profile doesn't actually match. Fix the logic so we only
generate a conflict if the profiles match.

Fixes: 844b8292b631 ("apparmor: ensure that undecidable profile attachments fail")
Cc: Stable <stable@vger.kernel.org>
Signed-off-by: Matthew Garrett <mjg59@google.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
6 years agoapparmor: fix ptrace label match when matching stacked labels
John Johansen [Sat, 9 Dec 2017 01:43:18 +0000 (17:43 -0800)]
apparmor: fix ptrace label match when matching stacked labels

Given a label with a profile stack of
  A//&B or A//&C ...

A ptrace rule should be able to specify a generic trace pattern with
a rule like

  ptrace trace A//&**,

however this is failing because while the correct label match routine
is called, it is being done post label decomposition so it is always
being done against a profile instead of the stacked label.

To fix this refactor the cross check to pass the full peer label in to
the label_match.

Fixes: 290f458a4f16 ("apparmor: allow ptrace checks to be finer grained than just capability")
Cc: Stable <stable@vger.kernel.org>
Reported-by: Matthew Garrett <mjg59@google.com>
Tested-by: Matthew Garrett <mjg59@google.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
6 years agoMerge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 12 Jan 2018 18:32:11 +0000 (10:32 -0800)]
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Ingo Molnar:
 "Two pending (non-PTI) x86 fixes:

   - an Intel-MID crash fix

   - and an Intel microcode loader blacklist quirk to avoid a
     problematic revision"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/platform/intel-mid: Revert "Make 'bt_sfi_data' const"
  x86/microcode/intel: Extend BDW late-loading with a revision check

6 years agoMerge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 12 Jan 2018 18:23:59 +0000 (10:23 -0800)]
Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler fixes from Ingo Molnar:
 "A Kconfig fix, a build fix and a membarrier bug fix"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  membarrier: Disable preemption when calling smp_call_function_many()
  sched/isolation: Make CONFIG_CPU_ISOLATION=y depend on SMP or COMPILE_TEST
  ia64, sched/cputime: Fix build error if CONFIG_VIRT_CPU_ACCOUNTING_NATIVE=y

6 years agoMerge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 12 Jan 2018 18:14:09 +0000 (10:14 -0800)]
Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking fixes from Ingo Molnar:
 "No functional effects intended: removes leftovers from recent lockdep
  and refcounts work"

* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/refcounts: Remove stale comment from the ARCH_HAS_REFCOUNT Kconfig entry
  locking/lockdep: Remove cross-release leftovers
  locking/Documentation: Remove stale crossrelease_fullstack parameter

6 years agoMerge tag 'for-linus-4.15-rc8-tag' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 12 Jan 2018 18:00:15 +0000 (10:00 -0800)]
Merge tag 'for-linus-4.15-rc8-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen fixes from Juergen Gross:
 "This contains two build fixes for clang and two fixes for rather
  unlikely situations in the Xen gntdev driver"

* tag 'for-linus-4.15-rc8-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/gntdev: Fix partial gntdev_mmap() cleanup
  xen/gntdev: Fix off-by-one error when unmapping with holes
  x86: xen: remove the use of VLAIS
  x86/xen/time: fix section mismatch for xen_init_time_ops()

6 years agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds [Fri, 12 Jan 2018 17:56:52 +0000 (09:56 -0800)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM fixes from Paolo Bonzini:
 "PPC:
   - user-triggerable use-after-free in HPT resizing
   - stale TLB entries in the guest
   - trap-and-emulate (PR) KVM guests failing to start under pHyp

  x86:
   - Another "Spectre" fix.
   - async pagefault fix
   - Revert an old fix for x86 nested virtualization, which turned out
     to do more harm than good
   - Check shrinker registration return code, to avoid warnings from
     upcoming 4.16 -mm patches"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86: Add memory barrier on vmcs field lookup
  KVM: x86: emulate #UD while in guest mode
  x86: kvm: propagate register_shrinker return code
  KVM MMU: check pending exception before injecting APF
  KVM: PPC: Book3S HV: Always flush TLB in kvmppc_alloc_reset_hpt()
  KVM: PPC: Book3S PR: Fix WIMG handling under pHyp
  KVM: PPC: Book3S HV: Fix use after free in case of multiple resize requests
  KVM: PPC: Book3S HV: Drop prepare_done from struct kvm_resize_hpt

6 years agoMerge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Linus Torvalds [Fri, 12 Jan 2018 17:47:58 +0000 (09:47 -0800)]
Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto fix from Herbert Xu:
 "This fixes a NULL pointer dereference in crypto_remove_spawns that can
  be triggered through af_alg"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: algapi - fix NULL dereference in crypto_remove_spawns()

6 years agoMerge branch 'nvme-4.15' of git://git.infradead.org/nvme into for-linus
Jens Axboe [Fri, 12 Jan 2018 17:42:36 +0000 (10:42 -0700)]
Merge branch 'nvme-4.15' of git://git.infradead.org/nvme into for-linus

Pull a single NVMe fix from Christoph for 4.15.

6 years agoMerge tag 'mmc-v4.15-rc2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Linus Torvalds [Fri, 12 Jan 2018 17:34:20 +0000 (09:34 -0800)]
Merge tag 'mmc-v4.15-rc2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc

Pull MMC host fixes from Ulf Hansson:

 - s3mci: mark debug_regs[] as static

 - renesas_sdhi: Add MODULE_LICENSE

* tag 'mmc-v4.15-rc2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  mmc: s3mci: mark debug_regs[] as static
  mmc: renesas_sdhi: Add MODULE_LICENSE

6 years agoMerge tag 'drm-fixes-for-v4.15-rc8' of git://people.freedesktop.org/~airlied/linux
Linus Torvalds [Fri, 12 Jan 2018 17:28:28 +0000 (09:28 -0800)]
Merge tag 'drm-fixes-for-v4.15-rc8' of git://people.freedesktop.org/~airlied/linux

Pull drm fixes from Dave Airlie:

 - Nouveau: regression fix

 - Tegra: regression fix

 - vmwgfx: crasher + freed data leak

 - i915: KASAN use after free fix, whitelist register to avoid hang fix,
   GVT fixes

 - vc4: irq/pm fix

* tag 'drm-fixes-for-v4.15-rc8' of git://people.freedesktop.org/~airlied/linux:
  drm/i915: Don't adjust priority on an already signaled fence
  drm/i915: Whitelist SLICE_COMMON_ECO_CHICKEN1 on Geminilake.
  drm/vmwgfx: Potential off by one in vmw_view_add()
  drm/tegra: sor: Fix hang on Tegra124 eDP
  drm/vmwgfx: Don't cache framebuffer maps
  drm/nouveau/disp/gf119: add missing drive vfunc ptr
  drm/i915/gvt: Fix stack-out-of-bounds bug in cmd parser
  drm/i915/gvt: Clear the shadow page table entry after post-sync
  drm/vc4: Move IRQ enable to PM path

6 years agoMerge tag 'mlx5-fixes-2018-01-11' of git://git.kernel.org/pub/scm/linux/kernel/git...
David S. Miller [Fri, 12 Jan 2018 15:40:48 +0000 (10:40 -0500)]
Merge tag 'mlx5-fixes-2018-01-11' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
Mellanox, mlx5 fixes 2018-01-11

The following series includes fixes to mlx5 core and netdev driver.
To highlight we have two critical fixes in this series:
1st patch from Eran to address a fix for Host2BMC Breakage.

2nd patch from Saeed to address the RDMA IRQ vector affinity settings query
issue, the patch provides the correct mlx5_core implementation for RDMA to
correctly  query vector affinity.
I sent this patch privately to Sagi a week a go, so he could to test it
but I didn't hear from him.

All other patches are trivial misc fixes.
Please pull and let me know if there's any problem.

for -stable v4.14-y and later:
("net/mlx5: Fix get vector affinity helper function")
("{net,ib}/mlx5: Don't disable local loopback multicast traffic when needed")

Note: Merging this series with net-next will produce the following conflict:
<<<<<<< HEAD
        u8         disable_local_lb[0x1];
        u8         reserved_at_3e2[0x1];
        u8         log_min_hairpin_wq_data_sz[0x5];
        u8         reserved_at_3e8[0x3];
=======
        u8         disable_local_lb_uc[0x1];
        u8         disable_local_lb_mc[0x1];
        u8         reserved_at_3e3[0x8];
>>>>>>> 359c96447ac2297fabe15ef30b60f3b4b71e7fd0

To resolve, use the following hunk:
i.e:
<<<<<<
        u8         disable_local_lb_uc[0x1];
        u8         disable_local_lb_mc[0x1];
        u8         log_min_hairpin_wq_data_sz[0x5];
        u8         reserved_at_3e8[0x3];
>>>>>>
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
David S. Miller [Fri, 12 Jan 2018 15:32:49 +0000 (10:32 -0500)]
Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec

Steffen Klassert says:

====================
pull request (net): ipsec 2018-01-11

1) Don't allow to change the encap type on state updates.
   The encap type is set on state initialization and
   should not change anymore. From Herbert Xu.

2) Skip dead policies when rehashing to fix a
   slab-out-of-bounds bug in xfrm_hash_rebuild.
   From Florian Westphal.

3) Two buffer overread fixes in pfkey.
   From Eric Biggers.

4) Fix rcu usage in xfrm_get_type_offload,
   request_module can sleep, so can't be used
   under rcu_read_lock. From Sabrina Dubroca.

5) Fix an uninitialized lock in xfrm_trans_queue.
   Use __skb_queue_tail instead of skb_queue_tail
   in xfrm_trans_queue as we don't need the lock.
   From Herbert Xu.

6) Currently it is possible to create an xfrm state with an
   unknown encap type in ESP IPv4. Fix this by returning an
   error on unknown encap types. Also from Herbert Xu.

7) Fix sleeping inside a spinlock in xfrm_policy_cache_flush.
   From Florian Westphal.

8) Fix ESP GRO when the headers not fully in the linear part
   of the skb. We need to pull before we can access them.

9) Fix a skb leak on error in key_notify_policy.

10) Fix a race in the xdst pcpu cache, we need to
    run the resolver routines with bottom halfes
    off like the old flowcache did.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agox86/retpoline: Fill return stack buffer on vmexit
David Woodhouse [Fri, 12 Jan 2018 11:11:27 +0000 (11:11 +0000)]
x86/retpoline: Fill return stack buffer on vmexit

In accordance with the Intel and AMD documentation, we need to overwrite
all entries in the RSB on exiting a guest, to prevent malicious branch
target predictions from affecting the host kernel. This is needed both
for retpoline and for IBRS.

[ak: numbers again for the RSB stuffing labels]

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: thomas.lendacky@amd.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kees Cook <keescook@google.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Link: https://lkml.kernel.org/r/1515755487-8524-1-git-send-email-dwmw@amazon.co.uk
6 years agoMerge tag 'drm-intel-fixes-2018-01-11-1' of git://anongit.freedesktop.org/drm/drm...
Dave Airlie [Fri, 12 Jan 2018 01:48:06 +0000 (11:48 +1000)]
Merge tag 'drm-intel-fixes-2018-01-11-1' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes

Hopefully final drm/i915 fixes for v4.15:
- Fix a KASAN reported use after free
- Whitelist a register to avoid hangs
- GVT fixes

* tag 'drm-intel-fixes-2018-01-11-1' of git://anongit.freedesktop.org/drm/drm-intel:
  drm/i915: Don't adjust priority on an already signaled fence
  drm/i915: Whitelist SLICE_COMMON_ECO_CHICKEN1 on Geminilake.
  drm/i915/gvt: Fix stack-out-of-bounds bug in cmd parser
  drm/i915/gvt: Clear the shadow page table entry after post-sync

6 years agoMerge branch 'vmwgfx-fixes-4.15' of git://people.freedesktop.org/~thomash/linux into...
Dave Airlie [Fri, 12 Jan 2018 01:47:40 +0000 (11:47 +1000)]
Merge branch 'vmwgfx-fixes-4.15' of git://people.freedesktop.org/~thomash/linux into drm-fixes

Two important fixes for vmwgfx.
The off-by-one fix could cause a malicious user to potentially crash the
kernel.
The framebuffer map cache fix can under some circumstances enable a user to
read from or write to freed pages.

* 'vmwgfx-fixes-4.15' of git://people.freedesktop.org/~thomash/linux:
  drm/vmwgfx: Potential off by one in vmw_view_add()
  drm/vmwgfx: Don't cache framebuffer maps

6 years agoMerge tag 'drm/tegra/for-4.15-rc8' of git://anongit.freedesktop.org/tegra/linux into...
Dave Airlie [Fri, 12 Jan 2018 01:47:11 +0000 (11:47 +1000)]
Merge tag 'drm/tegra/for-4.15-rc8' of git://anongit.freedesktop.org/tegra/linux into drm-fixes

drm/tegra: Fixes for v4.15-rc8

A single fix for a Tegra124 eDP regression introduced by the SOR changes
in v4.15-rc1.

* tag 'drm/tegra/for-4.15-rc8' of git://anongit.freedesktop.org/tegra/linux:
  drm/tegra: sor: Fix hang on Tegra124 eDP

6 years agoMerge tag 'ceph-for-4.15-rc8' of git://github.com/ceph/ceph-client
Linus Torvalds [Fri, 12 Jan 2018 00:57:32 +0000 (16:57 -0800)]
Merge tag 'ceph-for-4.15-rc8' of git://github.com/ceph/ceph-client

Pull ceph fixes from Ilya Dryomov:
 "Two rbd fixes for 4.12 and 4.2 issues respectively, marked for
  stable"

* tag 'ceph-for-4.15-rc8' of git://github.com/ceph/ceph-client:
  rbd: set max_segments to USHRT_MAX
  rbd: reacquire lock should update lock owner client id

6 years agoMerge tag 'gpio-v4.15-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw...
Linus Torvalds [Fri, 12 Jan 2018 00:54:35 +0000 (16:54 -0800)]
Merge tag 'gpio-v4.15-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio

Pull GPIO fix from Linus Walleij:
 "Fix a raw vs elaborate GPIO descriptor bug introduced by yours truly"

* tag 'gpio-v4.15-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
  gpio: Add missing open drain/source handling to gpiod_set_value_cansleep()

6 years agonet/mlx5e: Remove timestamp set from netdevice open flow
Feras Daoud [Mon, 8 Jan 2018 08:01:04 +0000 (10:01 +0200)]
net/mlx5e: Remove timestamp set from netdevice open flow

To avoid configuration override, timestamp set call will
be moved from the netdevice open flow to the init flow.
By this, a close-open procedure will not override the timestamp
configuration.
In addition, the change will rename mlx5e_timestamp_set function
to be mlx5e_timestamp_init.

Fixes: ef9814deafd0 ("net/mlx5e: Add HW timestamping (TS) support")
Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
6 years agonet/mlx5: Update ptp_clock_event foreach PPS event
Feras Daoud [Wed, 3 Jan 2018 15:23:55 +0000 (17:23 +0200)]
net/mlx5: Update ptp_clock_event foreach PPS event

PPS event did not update ptp_clock_event fields, therefore,
timestamp value was not updated correctly. This fix updates the
event source and the timestamp value for each PPS event.

Fixes: 7c39afb394c7 ("net/mlx5: PTP code migration to driver core section")
Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Reported-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
6 years agonet/mlx5e: Don't override netdev features field unless in error flow
Gal Pressman [Wed, 10 Jan 2018 15:11:11 +0000 (17:11 +0200)]
net/mlx5e: Don't override netdev features field unless in error flow

Set features function sets dev->features in order to keep track of which
features were successfully changed and which weren't (in case the user
asks for more than one change in a single command).

This breaks the logic in __netdev_update_features which assumes that
dev->features is not changed on success and checks for diffs between
features and dev->features (diffs that might not exist at this point
because of the driver override).

The solution is to keep track of successful/failed feature changes and
assign them to dev->features in case of failure only.

Fixes: 0e405443e803 ("net/mlx5e: Improve set features ndo resiliency")
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
6 years agonet/mlx5e: Check support before TC swap in ETS init
Tariq Toukan [Tue, 10 Oct 2017 13:54:30 +0000 (16:54 +0300)]
net/mlx5e: Check support before TC swap in ETS init

Should not do the following swap between TCs 0 and 1
when max num of TCs is 1:
tclass[prio=0]=1, tclass[prio=1]=0, tclass[prio=i]=i (for i>1)

Fixes: 08fb1dacdd76 ("net/mlx5e: Support DCBNL IEEE ETS")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
6 years agonet/mlx5e: Add error print in ETS init
Tariq Toukan [Tue, 10 Oct 2017 13:51:44 +0000 (16:51 +0300)]
net/mlx5e: Add error print in ETS init

ETS initialization might fail, add a print to indicate
such failures.

Fixes: 08fb1dacdd76 ("net/mlx5e: Support DCBNL IEEE ETS")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
6 years agonet/mlx5e: Keep updating ethtool statistics when the interface is down
Gal Pressman [Tue, 26 Dec 2017 11:44:49 +0000 (13:44 +0200)]
net/mlx5e: Keep updating ethtool statistics when the interface is down

ethtool statistics should be updated even when the interface is down
since it shows more than just netdev counters, which might change while
the logical link is down.
One useful use case, for example, is when running RoCE traffic over the
interface (while the logical link is down, but physical link is up) and
examining rx_prioX_bytes.

Fixes: f62b8bb8f2d3 ("net/mlx5: Extend mlx5_core to support ConnectX-4 Ethernet functionality")
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
6 years agonet/mlx5: Fix error handling in load one
Maor Gottlieb [Sun, 31 Dec 2017 09:31:34 +0000 (11:31 +0200)]
net/mlx5: Fix error handling in load one

We didn't store the result of mlx5_init_once, due to that
mlx5_load_one returned success on error.  Fix that.

Fixes: 59211bd3b632 ("net/mlx5: Split the load/unload flow into hardware and software flows")
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
6 years agonet/mlx5: Fix mlx5_get_uars_page to return error code
Eran Ben Elisha [Mon, 20 Nov 2017 07:58:01 +0000 (09:58 +0200)]
net/mlx5: Fix mlx5_get_uars_page to return error code

Change mlx5_get_uars_page to return ERR_PTR in case of
allocation failure. Change all callers accordingly to
check the IS_ERR(ptr) instead of NULL.

Fixes: 59211bd3b632 ("net/mlx5: Split the load/unload flow into hardware and software flows")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
6 years agonet/mlx5: Fix memory leak in bad flow of mlx5_alloc_irq_vectors
Alaa Hleihel [Thu, 14 Dec 2017 17:23:50 +0000 (19:23 +0200)]
net/mlx5: Fix memory leak in bad flow of mlx5_alloc_irq_vectors

Fix a memory leak where in case that pci_alloc_irq_vectors failed,
priv->irq_info was not released.

Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Alaa Hleihel <alaa@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
6 years agonet/mlx5: Fix get vector affinity helper function
Saeed Mahameed [Thu, 4 Jan 2018 02:35:51 +0000 (04:35 +0200)]
net/mlx5: Fix get vector affinity helper function

mlx5_get_vector_affinity used to call pci_irq_get_affinity and after
reverting the patch that sets the device affinity via PCI_IRQ_AFFINITY
API, calling pci_irq_get_affinity becomes useless and it breaks RDMA
mlx5 users.  To fix this, this patch provides an alternative way to
retrieve IRQ vector affinity using legacy IRQ API, following
smp_affinity read procfs implementation.

Fixes: 231243c82793 ("Revert mlx5: move affinity hints assignments to generic code")
Fixes: a435393acafb ("mlx5: move affinity hints assignments to generic code")
Cc: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
6 years agox86/retpoline/irq32: Convert assembler indirect jumps
Andi Kleen [Thu, 11 Jan 2018 21:46:33 +0000 (21:46 +0000)]
x86/retpoline/irq32: Convert assembler indirect jumps

Convert all indirect jumps in 32bit irq inline asm code to use non
speculative sequences.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: Rik van Riel <riel@redhat.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: thomas.lendacky@amd.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kees Cook <keescook@google.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Link: https://lkml.kernel.org/r/1515707194-20531-12-git-send-email-dwmw@amazon.co.uk
6 years agox86/retpoline/checksum32: Convert assembler indirect jumps
David Woodhouse [Thu, 11 Jan 2018 21:46:32 +0000 (21:46 +0000)]
x86/retpoline/checksum32: Convert assembler indirect jumps

Convert all indirect jumps in 32bit checksum assembler code to use
non-speculative sequences when CONFIG_RETPOLINE is enabled.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: thomas.lendacky@amd.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kees Cook <keescook@google.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Link: https://lkml.kernel.org/r/1515707194-20531-11-git-send-email-dwmw@amazon.co.uk
6 years agox86/retpoline/xen: Convert Xen hypercall indirect jumps
David Woodhouse [Thu, 11 Jan 2018 21:46:31 +0000 (21:46 +0000)]
x86/retpoline/xen: Convert Xen hypercall indirect jumps

Convert indirect call in Xen hypercall to use non-speculative sequence,
when CONFIG_RETPOLINE is enabled.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: thomas.lendacky@amd.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kees Cook <keescook@google.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Link: https://lkml.kernel.org/r/1515707194-20531-10-git-send-email-dwmw@amazon.co.uk
6 years agox86/retpoline/hyperv: Convert assembler indirect jumps
David Woodhouse [Thu, 11 Jan 2018 21:46:30 +0000 (21:46 +0000)]
x86/retpoline/hyperv: Convert assembler indirect jumps

Convert all indirect jumps in hyperv inline asm code to use non-speculative
sequences when CONFIG_RETPOLINE is enabled.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: thomas.lendacky@amd.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kees Cook <keescook@google.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Link: https://lkml.kernel.org/r/1515707194-20531-9-git-send-email-dwmw@amazon.co.uk
6 years agox86/retpoline/ftrace: Convert ftrace assembler indirect jumps
David Woodhouse [Thu, 11 Jan 2018 21:46:29 +0000 (21:46 +0000)]
x86/retpoline/ftrace: Convert ftrace assembler indirect jumps

Convert all indirect jumps in ftrace assembler code to use non-speculative
sequences when CONFIG_RETPOLINE is enabled.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: thomas.lendacky@amd.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kees Cook <keescook@google.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Link: https://lkml.kernel.org/r/1515707194-20531-8-git-send-email-dwmw@amazon.co.uk
6 years agox86/retpoline/entry: Convert entry assembler indirect jumps
David Woodhouse [Thu, 11 Jan 2018 21:46:28 +0000 (21:46 +0000)]
x86/retpoline/entry: Convert entry assembler indirect jumps

Convert indirect jumps in core 32/64bit entry assembler code to use
non-speculative sequences when CONFIG_RETPOLINE is enabled.

Don't use CALL_NOSPEC in entry_SYSCALL_64_fastpath because the return
address after the 'call' instruction must be *precisely* at the
.Lentry_SYSCALL_64_after_fastpath label for stub_ptregs_64 to work,
and the use of alternatives will mess that up unless we play horrid
games to prepend with NOPs and make the variants the same length. It's
not worth it; in the case where we ALTERNATIVE out the retpoline, the
first instruction at __x86.indirect_thunk.rax is going to be a bare
jmp *%rax anyway.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: thomas.lendacky@amd.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kees Cook <keescook@google.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Link: https://lkml.kernel.org/r/1515707194-20531-7-git-send-email-dwmw@amazon.co.uk
6 years agox86/retpoline/crypto: Convert crypto assembler indirect jumps
David Woodhouse [Thu, 11 Jan 2018 21:46:27 +0000 (21:46 +0000)]
x86/retpoline/crypto: Convert crypto assembler indirect jumps

Convert all indirect jumps in crypto assembler code to use non-speculative
sequences when CONFIG_RETPOLINE is enabled.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: thomas.lendacky@amd.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kees Cook <keescook@google.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Link: https://lkml.kernel.org/r/1515707194-20531-6-git-send-email-dwmw@amazon.co.uk
6 years agox86/spectre: Add boot time option to select Spectre v2 mitigation
David Woodhouse [Thu, 11 Jan 2018 21:46:26 +0000 (21:46 +0000)]
x86/spectre: Add boot time option to select Spectre v2 mitigation

Add a spectre_v2= option to select the mitigation used for the indirect
branch speculation vulnerability.

Currently, the only option available is retpoline, in its various forms.
This will be expanded to cover the new IBRS/IBPB microcode features.

The RETPOLINE_AMD feature relies on a serializing LFENCE for speculation
control. For AMD hardware, only set RETPOLINE_AMD if LFENCE is a
serializing instruction, which is indicated by the LFENCE_RDTSC feature.

[ tglx: Folded back the LFENCE/AMD fixes and reworked it so IBRS
   integration becomes simple ]

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: thomas.lendacky@amd.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kees Cook <keescook@google.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Link: https://lkml.kernel.org/r/1515707194-20531-5-git-send-email-dwmw@amazon.co.uk
6 years agox86/retpoline: Add initial retpoline support
David Woodhouse [Thu, 11 Jan 2018 21:46:25 +0000 (21:46 +0000)]
x86/retpoline: Add initial retpoline support

Enable the use of -mindirect-branch=thunk-extern in newer GCC, and provide
the corresponding thunks. Provide assembler macros for invoking the thunks
in the same way that GCC does, from native and inline assembler.

This adds X86_FEATURE_RETPOLINE and sets it by default on all CPUs. In
some circumstances, IBRS microcode features may be used instead, and the
retpoline can be disabled.

On AMD CPUs if lfence is serialising, the retpoline can be dramatically
simplified to a simple "lfence; jmp *\reg". A future patch, after it has
been verified that lfence really is serialising in all circumstances, can
enable this by setting the X86_FEATURE_RETPOLINE_AMD feature bit in addition
to X86_FEATURE_RETPOLINE.

Do not align the retpoline in the altinstr section, because there is no
guarantee that it stays aligned when it's copied over the oldinstr during
alternative patching.

[ Andi Kleen: Rename the macros, add CONFIG_RETPOLINE option, export thunks]
[ tglx: Put actual function CALL/JMP in front of the macros, convert to
   symbolic labels ]
[ dwmw2: Convert back to numeric labels, merge objtool fixes ]

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: thomas.lendacky@amd.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kees Cook <keescook@google.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Link: https://lkml.kernel.org/r/1515707194-20531-4-git-send-email-dwmw@amazon.co.uk