]> git.proxmox.com Git - mirror_ubuntu-hirsute-kernel.git/log
mirror_ubuntu-hirsute-kernel.git
3 years agonet: phy: broadcom: Add power down exit reset state delay
Florian Fainelli [Thu, 11 Mar 2021 04:53:42 +0000 (20:53 -0800)]
net: phy: broadcom: Add power down exit reset state delay

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit 7a1468ba0e02eee24ae1353e8933793a27198e20 ]

Per the datasheet, when we clear the power down bit, the PHY remains in
an internal reset state for 40us and then resume normal operation.
Account for that delay to avoid any issues in the future if
genphy_resume() changes.

Fixes: fe26821fa614 ("net: phy: broadcom: Wire suspend/resume for BCM54810")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agonet/qlcnic: Fix a use after free in qlcnic_83xx_get_minidump_template
Lv Yunlong [Thu, 11 Mar 2021 04:01:40 +0000 (20:01 -0800)]
net/qlcnic: Fix a use after free in qlcnic_83xx_get_minidump_template

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit db74623a3850db99cb9692fda9e836a56b74198d ]

In qlcnic_83xx_get_minidump_template, fw_dump->tmpl_hdr was freed by
vfree(). But unfortunately, it is used when extended is true.

Fixes: 7061b2bdd620e ("qlogic: Deletion of unnecessary checks before two function calls")
Signed-off-by: Lv Yunlong <lyl2019@mail.ustc.edu.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agokunit: tool: Disable PAGE_POISONING under --alltests
David Gow [Tue, 9 Feb 2021 07:10:34 +0000 (23:10 -0800)]
kunit: tool: Disable PAGE_POISONING under --alltests

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit 7fd53f41f771d250eb08db08650940f017e37c26 ]

kunit_tool maintains a list of config options which are broken under
UML, which we exclude from an otherwise 'make ARCH=um allyesconfig'
build used to run all tests with the --alltests option.

Something in UML allyesconfig is causing segfaults when page poisining
is enabled (and is poisoning with a non-zero value). Previously, this
didn't occur, as allyesconfig enabled the CONFIG_PAGE_POISONING_ZERO
option, which worked around the problem by zeroing memory. This option
has since been removed, and memory is now poisoned with 0xAA, which
triggers segfaults in many different codepaths, preventing UML from
booting.

Note that we have to disable both CONFIG_PAGE_POISONING and
CONFIG_DEBUG_PAGEALLOC, as the latter will 'select' the former on
architectures (such as UML) which don't implement __kernel_map_pages().

Ideally, we'd fix this properly by tracking down the real root cause,
but since this is breaking KUnit's --alltests feature, it's worth
disabling there in the meantime so the kernel can boot to the point
where tests can actually run.

Fixes: f289041ed4cf ("mm, page_poison: remove CONFIG_PAGE_POISONING_ZERO")
Signed-off-by: David Gow <davidgow@google.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agoe1000e: Fix error handling in e1000_set_d0_lplu_state_82571
Dinghao Liu [Sun, 28 Feb 2021 09:44:23 +0000 (17:44 +0800)]
e1000e: Fix error handling in e1000_set_d0_lplu_state_82571

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit b52912b8293f2c496f42583e65599aee606a0c18 ]

There is one e1e_wphy() call in e1000_set_d0_lplu_state_82571
that we have caught its return value but lack further handling.
Check and terminate the execution flow just like other e1e_wphy()
in this function.

Fixes: bc7f75fa9788 ("[E1000E]: New pci-express e1000 driver (currently for ICH9 devices only)")
Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
Acked-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Dvora Fuxbrumer <dvorax.fuxbrumer@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agoe1000e: add rtnl_lock() to e1000_reset_task
Vitaly Lifshits [Wed, 21 Oct 2020 11:59:37 +0000 (14:59 +0300)]
e1000e: add rtnl_lock() to e1000_reset_task

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit 21f857f0321d0d0ea9b1a758bd55dc63d1cb2437 ]

A possible race condition was found in e1000_reset_task,
after discovering a similar issue in igb driver via
commit 024a8168b749 ("igb: reinit_locked() should be called
with rtnl_lock").

Added rtnl_lock() and rtnl_unlock() to avoid this.

Fixes: bc7f75fa9788 ("[E1000E]: New pci-express e1000 driver (currently for ICH9 devices only)")
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Vitaly Lifshits <vitaly.lifshits@intel.com>
Tested-by: Dvora Fuxbrumer <dvorax.fuxbrumer@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agoigc: Fix igc_ptp_rx_pktstamp()
Andre Guedes [Wed, 10 Mar 2021 06:42:56 +0000 (22:42 -0800)]
igc: Fix igc_ptp_rx_pktstamp()

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit fc9e5020971d57d7d0b3fef9e2ab2108fcb5588b ]

The comment describing the timestamps layout in the packet buffer is
wrong and the code is actually retrieving the timestamp in Timer 1
reference instead of Timer 0. This hasn't been a big issue so far
because hardware is configured to report both timestamps using Timer 0
(see IGC_SRRCTL register configuration in igc_ptp_enable_rx_timestamp()
helper). This patch fixes the comment and the code so we retrieve the
timestamp in Timer 0 reference as expected.

This patch also takes the opportunity to get rid of the hw.mac.type check
since it is not required.

Fixes: 81b055205e8ba ("igc: Add support for RX timestamping")
Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Signed-off-by: Vedang Patel <vedang.patel@intel.com>
Signed-off-by: Jithu Joseph <jithu.joseph@intel.com>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Dvora Fuxbrumer <dvorax.fuxbrumer@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agoigc: Fix Supported Pause Frame Link Setting
Muhammad Husaini Zulkifli [Fri, 19 Feb 2021 16:36:48 +0000 (00:36 +0800)]
igc: Fix Supported Pause Frame Link Setting

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit 9a4a1cdc5ab52118c1f2b216f4240830b6528d32 ]

The Supported Pause Frame always display "No" even though the Advertised
pause frame showing the correct setting based on the pause parameters via
ethtool. Set bit in link_ksettings to "Supported" for Pause Frame.

Before output:
Supported pause frame use: No

Expected output:
Supported pause frame use: Symmetric

Fixes: 8c5ad0dae93c ("igc: Add ethtool support")
Signed-off-by: Muhammad Husaini Zulkifli <muhammad.husaini.zulkifli@intel.com>
Reviewed-by: Malli C <mallikarjuna.chilakala@intel.com>
Tested-by: Dvora Fuxbrumer <dvorax.fuxbrumer@linux.intel.com>
Acked-by: Sasha Neftin <sasha.neftin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agoigc: Fix Pause Frame Advertising
Muhammad Husaini Zulkifli [Fri, 19 Feb 2021 16:36:47 +0000 (00:36 +0800)]
igc: Fix Pause Frame Advertising

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit 8876529465c368beafd51a70f79d7a738f2aadf4 ]

Fix Pause Frame Advertising when getting the advertisement via ethtool.
Remove setting the "advertising" bit in link_ksettings during default
case when Tx and Rx are in off state with Auto Negotiate off.

Below is the original output of advertisement link during Tx and Rx off:
Advertised pause frame use: Symmetric Receive-only

Expected output:
Advertised pause frame use: No

Fixes: 8c5ad0dae93c ("igc: Add ethtool support")
Signed-off-by: Muhammad Husaini Zulkifli <muhammad.husaini.zulkifli@intel.com>
Reviewed-by: Malli C <mallikarjuna.chilakala@intel.com>
Acked-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Dvora Fuxbrumer <dvorax.fuxbrumer@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agoigc: reinit_locked() should be called with rtnl_lock
Sasha Neftin [Tue, 20 Oct 2020 13:34:00 +0000 (16:34 +0300)]
igc: reinit_locked() should be called with rtnl_lock

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit 6da262378c99b17b1a1ac2e42aa65acc1bd471c7 ]

This commit applies to the igc_reset_task the same changes that
were applied to the igb driver in commit 024a8168b749 ("igb:
reinit_locked() should be called with rtnl_lock")
and fix possible race in reset subtask.

Fixes: 0507ef8a0372 ("igc: Add transmit and receive fastpath and interrupt handlers")
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Dvora Fuxbrumer <dvorax.fuxbrumer@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agonet: dsa: bcm_sf2: Qualify phydev->dev_flags based on port
Florian Fainelli [Wed, 10 Mar 2021 22:17:58 +0000 (14:17 -0800)]
net: dsa: bcm_sf2: Qualify phydev->dev_flags based on port

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit 47142ed6c34d544ae9f0463e58d482289cbe0d46 ]

Similar to commit 92696286f3bb37ba50e4bd8d1beb24afb759a799 ("net:
bcmgenet: Set phydev->dev_flags only for internal PHYs") we need to
qualify the phydev->dev_flags based on whether the port is connected to
an internal or external PHY otherwise we risk having a flags collision
with a completely different interpretation depending on the driver.

Fixes: aa9aef77c761 ("net: dsa: bcm_sf2: communicate integrated PHY revision to PHY driver")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agonet: sched: validate stab values
Eric Dumazet [Wed, 10 Mar 2021 16:26:41 +0000 (08:26 -0800)]
net: sched: validate stab values

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit e323d865b36134e8c5c82c834df89109a5c60dab ]

iproute2 package is well behaved, but malicious user space can
provide illegal shift values and trigger UBSAN reports.

Add stab parameter to red_check_params() to validate user input.

syzbot reported:

UBSAN: shift-out-of-bounds in ./include/net/red.h:312:18
shift exponent 111 is too large for 64-bit type 'long unsigned int'
CPU: 1 PID: 14662 Comm: syz-executor.3 Not tainted 5.12.0-rc2-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0x141/0x1d7 lib/dump_stack.c:120
 ubsan_epilogue+0xb/0x5a lib/ubsan.c:148
 __ubsan_handle_shift_out_of_bounds.cold+0xb1/0x181 lib/ubsan.c:327
 red_calc_qavg_from_idle_time include/net/red.h:312 [inline]
 red_calc_qavg include/net/red.h:353 [inline]
 choke_enqueue.cold+0x18/0x3dd net/sched/sch_choke.c:221
 __dev_xmit_skb net/core/dev.c:3837 [inline]
 __dev_queue_xmit+0x1943/0x2e00 net/core/dev.c:4150
 neigh_hh_output include/net/neighbour.h:499 [inline]
 neigh_output include/net/neighbour.h:508 [inline]
 ip6_finish_output2+0x911/0x1700 net/ipv6/ip6_output.c:117
 __ip6_finish_output net/ipv6/ip6_output.c:182 [inline]
 __ip6_finish_output+0x4c1/0xe10 net/ipv6/ip6_output.c:161
 ip6_finish_output+0x35/0x200 net/ipv6/ip6_output.c:192
 NF_HOOK_COND include/linux/netfilter.h:290 [inline]
 ip6_output+0x1e4/0x530 net/ipv6/ip6_output.c:215
 dst_output include/net/dst.h:448 [inline]
 NF_HOOK include/linux/netfilter.h:301 [inline]
 NF_HOOK include/linux/netfilter.h:295 [inline]
 ip6_xmit+0x127e/0x1eb0 net/ipv6/ip6_output.c:320
 inet6_csk_xmit+0x358/0x630 net/ipv6/inet6_connection_sock.c:135
 dccp_transmit_skb+0x973/0x12c0 net/dccp/output.c:138
 dccp_send_reset+0x21b/0x2b0 net/dccp/output.c:535
 dccp_finish_passive_close net/dccp/proto.c:123 [inline]
 dccp_finish_passive_close+0xed/0x140 net/dccp/proto.c:118
 dccp_terminate_connection net/dccp/proto.c:958 [inline]
 dccp_close+0xb3c/0xe60 net/dccp/proto.c:1028
 inet_release+0x12e/0x280 net/ipv4/af_inet.c:431
 inet6_release+0x4c/0x70 net/ipv6/af_inet6.c:478
 __sock_release+0xcd/0x280 net/socket.c:599
 sock_close+0x18/0x20 net/socket.c:1258
 __fput+0x288/0x920 fs/file_table.c:280
 task_work_run+0xdd/0x1a0 kernel/task_work.c:140
 tracehook_notify_resume include/linux/tracehook.h:189 [inline]

Fixes: 8afa10cbe281 ("net_sched: red: Avoid illegal values")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agomacvlan: macvlan_count_rx() needs to be aware of preemption
Eric Dumazet [Wed, 10 Mar 2021 09:56:36 +0000 (01:56 -0800)]
macvlan: macvlan_count_rx() needs to be aware of preemption

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit dd4fa1dae9f4847cc1fd78ca468ad69e16e5db3e ]

macvlan_count_rx() can be called from process context, it is thus
necessary to disable preemption before calling u64_stats_update_begin()

syzbot was able to spot this on 32bit arch:

WARNING: CPU: 1 PID: 4632 at include/linux/seqlock.h:271 __seqprop_assert include/linux/seqlock.h:271 [inline]
WARNING: CPU: 1 PID: 4632 at include/linux/seqlock.h:271 __seqprop_assert.constprop.0+0xf0/0x11c include/linux/seqlock.h:269
Modules linked in:
Kernel panic - not syncing: panic_on_warn set ...
CPU: 1 PID: 4632 Comm: kworker/1:3 Not tainted 5.12.0-rc2-syzkaller #0
Hardware name: ARM-Versatile Express
Workqueue: events macvlan_process_broadcast
Backtrace:
[<82740468>] (dump_backtrace) from [<827406dc>] (show_stack+0x18/0x1c arch/arm/kernel/traps.c:252)
 r7:00000080 r6:60000093 r5:00000000 r4:8422a3c4
[<827406c4>] (show_stack) from [<82751b58>] (__dump_stack lib/dump_stack.c:79 [inline])
[<827406c4>] (show_stack) from [<82751b58>] (dump_stack+0xb8/0xe8 lib/dump_stack.c:120)
[<82751aa0>] (dump_stack) from [<82741270>] (panic+0x130/0x378 kernel/panic.c:231)
 r7:830209b4 r6:84069ea4 r5:00000000 r4:844350d0
[<82741140>] (panic) from [<80244924>] (__warn+0xb0/0x164 kernel/panic.c:605)
 r3:8404ec8c r2:00000000 r1:00000000 r0:830209b4
 r7:0000010f
[<80244874>] (__warn) from [<82741520>] (warn_slowpath_fmt+0x68/0xd4 kernel/panic.c:628)
 r7:81363f70 r6:0000010f r5:83018e50 r4:00000000
[<827414bc>] (warn_slowpath_fmt) from [<81363f70>] (__seqprop_assert include/linux/seqlock.h:271 [inline])
[<827414bc>] (warn_slowpath_fmt) from [<81363f70>] (__seqprop_assert.constprop.0+0xf0/0x11c include/linux/seqlock.h:269)
 r8:5a109000 r7:0000000f r6:a568dac0 r5:89802300 r4:00000001
[<81363e80>] (__seqprop_assert.constprop.0) from [<81364af0>] (u64_stats_update_begin include/linux/u64_stats_sync.h:128 [inline])
[<81363e80>] (__seqprop_assert.constprop.0) from [<81364af0>] (macvlan_count_rx include/linux/if_macvlan.h:47 [inline])
[<81363e80>] (__seqprop_assert.constprop.0) from [<81364af0>] (macvlan_broadcast+0x154/0x26c drivers/net/macvlan.c:291)
 r5:89802300 r4:8a927740
[<8136499c>] (macvlan_broadcast) from [<81365020>] (macvlan_process_broadcast+0x258/0x2d0 drivers/net/macvlan.c:317)
 r10:81364f78 r9:8a86d000 r8:8a9c7e7c r7:8413aa5c r6:00000000 r5:00000000
 r4:89802840
[<81364dc8>] (macvlan_process_broadcast) from [<802696a4>] (process_one_work+0x2d4/0x998 kernel/workqueue.c:2275)
 r10:00000008 r9:8404ec98 r8:84367a02 r7:ddfe6400 r6:ddfe2d40 r5:898dac80
 r4:8a86d43c
[<802693d0>] (process_one_work) from [<80269dcc>] (worker_thread+0x64/0x54c kernel/workqueue.c:2421)
 r10:00000008 r9:8a9c6000 r8:84006d00 r7:ddfe2d78 r6:898dac94 r5:ddfe2d40
 r4:898dac80
[<80269d68>] (worker_thread) from [<80271f40>] (kthread+0x184/0x1a4 kernel/kthread.c:292)
 r10:85247e64 r9:898dac80 r8:80269d68 r7:00000000 r6:8a9c6000 r5:89a2ee40
 r4:8a97bd00
[<80271dbc>] (kthread) from [<80200114>] (ret_from_fork+0x14/0x20 arch/arm/kernel/entry-common.S:158)
Exception stack(0x8a9c7fb0 to 0x8a9c7ff8)

Fixes: 412ca1550cbe ("macvlan: Move broadcasts into a work queue")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Reported-by: syzbot <syzkaller@googlegroups.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agodrop_monitor: Perform cleanup upon probe registration failure
Ido Schimmel [Wed, 10 Mar 2021 10:28:01 +0000 (12:28 +0200)]
drop_monitor: Perform cleanup upon probe registration failure

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit 9398e9c0b1d44eeb700e9e766c02bcc765c82570 ]

In the rare case that drop_monitor fails to register its probe on the
'napi_poll' tracepoint, it will not deactivate its hysteresis timer as
part of the error path. If the hysteresis timer was armed by the shortly
lived 'kfree_skb' probe and user space retries to initiate tracing, a
warning will be emitted for trying to initialize an active object [1].

Fix this by properly undoing all the operations that were done prior to
probe registration, in both software and hardware code paths.

Note that syzkaller managed to fail probe registration by injecting a
slab allocation failure [2].

[1]
ODEBUG: init active (active state 0) object type: timer_list hint: sched_send_work+0x0/0x60 include/linux/list.h:135
WARNING: CPU: 1 PID: 8649 at lib/debugobjects.c:505 debug_print_object+0x16e/0x250 lib/debugobjects.c:505
Modules linked in:
CPU: 1 PID: 8649 Comm: syz-executor.0 Not tainted 5.11.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:debug_print_object+0x16e/0x250 lib/debugobjects.c:505
[...]
Call Trace:
 __debug_object_init+0x524/0xd10 lib/debugobjects.c:588
 debug_timer_init kernel/time/timer.c:722 [inline]
 debug_init kernel/time/timer.c:770 [inline]
 init_timer_key+0x2d/0x340 kernel/time/timer.c:814
 net_dm_trace_on_set net/core/drop_monitor.c:1111 [inline]
 set_all_monitor_traces net/core/drop_monitor.c:1188 [inline]
 net_dm_monitor_start net/core/drop_monitor.c:1295 [inline]
 net_dm_cmd_trace+0x720/0x1220 net/core/drop_monitor.c:1339
 genl_family_rcv_msg_doit+0x228/0x320 net/netlink/genetlink.c:739
 genl_family_rcv_msg net/netlink/genetlink.c:783 [inline]
 genl_rcv_msg+0x328/0x580 net/netlink/genetlink.c:800
 netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2502
 genl_rcv+0x24/0x40 net/netlink/genetlink.c:811
 netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline]
 netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1338
 netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1927
 sock_sendmsg_nosec net/socket.c:652 [inline]
 sock_sendmsg+0xcf/0x120 net/socket.c:672
 ____sys_sendmsg+0x6e8/0x810 net/socket.c:2348
 ___sys_sendmsg+0xf3/0x170 net/socket.c:2402
 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2435
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xae

[2]
 FAULT_INJECTION: forcing a failure.
 name failslab, interval 1, probability 0, space 0, times 1
 CPU: 1 PID: 8645 Comm: syz-executor.0 Not tainted 5.11.0-syzkaller #0
 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
 Call Trace:
  dump_stack+0xfa/0x151
  should_fail.cold+0x5/0xa
  should_failslab+0x5/0x10
  __kmalloc+0x72/0x3f0
  tracepoint_add_func+0x378/0x990
  tracepoint_probe_register+0x9c/0xe0
  net_dm_cmd_trace+0x7fc/0x1220
  genl_family_rcv_msg_doit+0x228/0x320
  genl_rcv_msg+0x328/0x580
  netlink_rcv_skb+0x153/0x420
  genl_rcv+0x24/0x40
  netlink_unicast+0x533/0x7d0
  netlink_sendmsg+0x856/0xd90
  sock_sendmsg+0xcf/0x120
  ____sys_sendmsg+0x6e8/0x810
  ___sys_sendmsg+0xf3/0x170
  __sys_sendmsg+0xe5/0x1b0
  do_syscall_64+0x2d/0x70
  entry_SYSCALL_64_after_hwframe+0x44/0xae

Fixes: 70c69274f354 ("drop_monitor: Initialize timer and work item upon tracing enable")
Fixes: 8ee2267ad33e ("drop_monitor: Convert to using devlink tracepoint")
Reported-by: syzbot+779559d6503f3a56213d@syzkaller.appspotmail.com
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agoipv6: fix suspecious RCU usage warning
Wei Wang [Wed, 10 Mar 2021 02:20:35 +0000 (18:20 -0800)]
ipv6: fix suspecious RCU usage warning

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit 28259bac7f1dde06d8ba324e222bbec9d4e92f2b ]

Syzbot reported the suspecious RCU usage in nexthop_fib6_nh() when
called from ipv6_route_seq_show(). The reason is ipv6_route_seq_start()
calls rcu_read_lock_bh(), while nexthop_fib6_nh() calls
rcu_dereference_rtnl().
The fix proposed is to add a variant of nexthop_fib6_nh() to use
rcu_dereference_bh_rtnl() for ipv6_route_seq_show().

The reported trace is as follows:
./include/net/nexthop.h:416 suspicious rcu_dereference_check() usage!

other info that might help us debug this:

rcu_scheduler_active = 2, debug_locks = 1
2 locks held by syz-executor.0/17895:
     at: seq_read+0x71/0x12a0 fs/seq_file.c:169
     at: seq_file_net include/linux/seq_file_net.h:19 [inline]
     at: ipv6_route_seq_start+0xaf/0x300 net/ipv6/ip6_fib.c:2616

stack backtrace:
CPU: 1 PID: 17895 Comm: syz-executor.0 Not tainted 4.15.0-syzkaller #0
Call Trace:
 [<ffffffff849edf9e>] __dump_stack lib/dump_stack.c:17 [inline]
 [<ffffffff849edf9e>] dump_stack+0xd8/0x147 lib/dump_stack.c:53
 [<ffffffff8480b7fa>] lockdep_rcu_suspicious+0x153/0x15d kernel/locking/lockdep.c:5745
 [<ffffffff8459ada6>] nexthop_fib6_nh include/net/nexthop.h:416 [inline]
 [<ffffffff8459ada6>] ipv6_route_native_seq_show net/ipv6/ip6_fib.c:2488 [inline]
 [<ffffffff8459ada6>] ipv6_route_seq_show+0x436/0x7a0 net/ipv6/ip6_fib.c:2673
 [<ffffffff81c556df>] seq_read+0xccf/0x12a0 fs/seq_file.c:276
 [<ffffffff81dbc62c>] proc_reg_read+0x10c/0x1d0 fs/proc/inode.c:231
 [<ffffffff81bc28ae>] do_loop_readv_writev fs/read_write.c:714 [inline]
 [<ffffffff81bc28ae>] do_loop_readv_writev fs/read_write.c:701 [inline]
 [<ffffffff81bc28ae>] do_iter_read+0x49e/0x660 fs/read_write.c:935
 [<ffffffff81bc81ab>] vfs_readv+0xfb/0x170 fs/read_write.c:997
 [<ffffffff81c88847>] kernel_readv fs/splice.c:361 [inline]
 [<ffffffff81c88847>] default_file_splice_read+0x487/0x9c0 fs/splice.c:416
 [<ffffffff81c86189>] do_splice_to+0x129/0x190 fs/splice.c:879
 [<ffffffff81c86f66>] splice_direct_to_actor+0x256/0x890 fs/splice.c:951
 [<ffffffff81c8777d>] do_splice_direct+0x1dd/0x2b0 fs/splice.c:1060
 [<ffffffff81bc4747>] do_sendfile+0x597/0xce0 fs/read_write.c:1459
 [<ffffffff81bca205>] SYSC_sendfile64 fs/read_write.c:1520 [inline]
 [<ffffffff81bca205>] SyS_sendfile64+0x155/0x170 fs/read_write.c:1506
 [<ffffffff81015fcf>] do_syscall_64+0x1ff/0x310 arch/x86/entry/common.c:305
 [<ffffffff84a00076>] entry_SYSCALL_64_after_hwframe+0x42/0xb7

Fixes: f88d8ea67fbdb ("ipv6: Plumb support for nexthop object in a fib6_info")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Wei Wang <weiwan@google.com>
Cc: David Ahern <dsahern@kernel.org>
Cc: Ido Schimmel <idosch@idosch.org>
Cc: Petr Machata <petrm@nvidia.com>
Cc: Eric Dumazet <edumazet@google.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agonet/mlx5e: E-switch, Fix rate calculation division
Parav Pandit [Wed, 17 Feb 2021 06:23:29 +0000 (08:23 +0200)]
net/mlx5e: E-switch, Fix rate calculation division

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit 8b90d897823b28a51811931f3bdc79f8df79407e ]

do_div() returns reminder, while cited patch wanted to use
quotient.
Fix it by using quotient.

Fixes: 0e22bfb7c046 ("net/mlx5e: E-switch, Fix rate calculation for overflow")
Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agonet/mlx5e: Don't match on Geneve options in case option masks are all zero
Maor Dickman [Tue, 16 Feb 2021 11:39:18 +0000 (13:39 +0200)]
net/mlx5e: Don't match on Geneve options in case option masks are all zero

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit 385d40b042e60aa0b677d7b400a0fefb44bcbaf4 ]

The cited change added offload support for Geneve options without verifying
the validity of the options masks, this caused offload of rules with match
on Geneve options with class,type and data masks which are zero to fail.

Fix by ignoring the match on Geneve options in case option masks are
all zero.

Fixes: 9272e3df3023 ("net/mlx5e: Geneve, Add support for encap/decap flows offload")
Signed-off-by: Maor Dickman <maord@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agonet/mlx5e: Revert parameters on errors when changing PTP state without reset
Maxim Mikityanskiy [Fri, 29 Jan 2021 12:04:34 +0000 (14:04 +0200)]
net/mlx5e: Revert parameters on errors when changing PTP state without reset

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit 74640f09735f935437bd8df9fe61a66f03eabb34 ]

Port timestamping for PTP can be enabled/disabled while the channels are
closed. In that case mlx5e_safe_switch_channels is skipped, and the
preactivate hook is called directly. However, if that hook returns an
error, the channel parameters must be reverted back to their old values.
This commit adds missing handling on this case.

Fixes: 145e5637d941 ("net/mlx5e: Add TX PTP port object support")
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agonet/mlx5e: When changing XDP program without reset, take refs for XSK RQs
Maxim Mikityanskiy [Thu, 11 Feb 2021 13:51:11 +0000 (15:51 +0200)]
net/mlx5e: When changing XDP program without reset, take refs for XSK RQs

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit e5eb01344e9b09bb9d255b9727449186f7168df8 ]

Each RQ (including XSK RQs) takes a reference to the XDP program. When
an XDP program is attached or detached, the channels and queues are
recreated, however, there is a special flow for changing an active XDP
program to another one. In that flow, channels and queues stay alive,
but the refcounts of the old and new XDP programs are adjusted. This
flow didn't increment refcount by the number of active XSK RQs, and this
commit fixes it.

Fixes: db05815b36cb ("net/mlx5e: Add XSK zero-copy support")
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agonet/mlx5e: Set PTP channel pointer explicitly to NULL
Aya Levin [Tue, 12 Jan 2021 11:34:29 +0000 (13:34 +0200)]
net/mlx5e: Set PTP channel pointer explicitly to NULL

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit 1c2cdf0b603a3b0c763288ad92e9f3f1555925cf ]

When closing the PTP channel, set its pointer explicitly to NULL. PTP
channel is opened on demand, the code verify the pointer validity before
access. Nullify it when closing the PTP channel to avoid unexpected
behavior.

Fixes: 145e5637d941 ("net/mlx5e: Add TX PTP port object support")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agonet/mlx5e: RX, Mind the MPWQE gaps when calculating offsets
Tariq Toukan [Tue, 12 Jan 2021 11:21:17 +0000 (13:21 +0200)]
net/mlx5e: RX, Mind the MPWQE gaps when calculating offsets

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit d5dd03b26ba49c4ffe67ee1937add82293c19794 ]

Since cited patch, MLX5E_REQUIRED_WQE_MTTS is not a power of two.
Hence, usage of MLX5E_LOG_ALIGNED_MPWQE_PPW should be replaced,
as it lost some accuracy. Use the designated macro to calculate
the number of required MTTs.

This makes sure the solution in cited patch works properly.

While here, un-inline mlx5e_get_mpwqe_offset(), and remove the
unused RQ parameter.

Fixes: c3c9402373fe ("net/mlx5e: Add resiliency in Striding RQ mode for packets larger than MTU")
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agolibbpf: Fix INSTALL flag order
Georgi Valkov [Mon, 8 Mar 2021 18:30:38 +0000 (10:30 -0800)]
libbpf: Fix INSTALL flag order

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit e7fb6465d4c8e767e39cbee72464e0060ab3d20c ]

It was reported ([0]) that having optional -m flag between source and
destination arguments in install command breaks bpftools cross-build
on MacOS. Move -m to the front to fix this issue.

  [0] https://github.com/openwrt/openwrt/pull/3959

Fixes: 7110d80d53f4 ("libbpf: Makefile set specified permission mode")
Signed-off-by: Georgi Valkov <gvalkov@abv.bg>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210308183038.613432-1-andrii@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agobpf: Change inode_storage's lookup_elem return value from NULL to -EBADF
Tal Lossos [Sun, 7 Mar 2021 12:09:48 +0000 (14:09 +0200)]
bpf: Change inode_storage's lookup_elem return value from NULL to -EBADF

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit 769c18b254ca191b45047e1fcb3b2ce56fada0b6 ]

bpf_fd_inode_storage_lookup_elem() returned NULL when getting a bad FD,
which caused -ENOENT in bpf_map_copy_value. -EBADF error is better than
-ENOENT for a bad FD behaviour.

The patch was partially contributed by CyberArk Software, Inc.

Fixes: 8ea636848aca ("bpf: Implement bpf_local_storage for inodes")
Signed-off-by: Tal Lossos <tallossos@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: KP Singh <kpsingh@kernel.org>
Link: https://lore.kernel.org/bpf/20210307120948.61414-1-tallossos@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agobpf: Dont allow vmlinux BTF to be used in map_create and prog_load.
Alexei Starovoitov [Sun, 7 Mar 2021 22:52:48 +0000 (14:52 -0800)]
bpf: Dont allow vmlinux BTF to be used in map_create and prog_load.

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit 350a5c4dd2452ea999cc5e1d4a8dbf12de2f97ef ]

The syzbot got FD of vmlinux BTF and passed it into map_create which caused
crash in btf_type_id_size() when it tried to access resolved_ids. The vmlinux
BTF doesn't have 'resolved_ids' and 'resolved_sizes' initialized to save
memory. To avoid such issues disallow using vmlinux BTF in prog_load and
map_create commands.

Fixes: 5329722057d4 ("bpf: Assign ID to vmlinux BTF and return extra info for BTF in GET_OBJ_INFO")
Reported-by: syzbot+8bab8ed346746e7540e8@syzkaller.appspotmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210307225248.79031-1-alexei.starovoitov@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agoveth: Store queue_mapping independently of XDP prog presence
Maciej Fijalkowski [Wed, 3 Mar 2021 15:29:03 +0000 (16:29 +0100)]
veth: Store queue_mapping independently of XDP prog presence

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit edbea922025169c0e5cdca5ebf7bf5374cc5566c ]

Currently, veth_xmit() would call the skb_record_rx_queue() only when
there is XDP program loaded on peer interface in native mode.

If peer has XDP prog in generic mode, then netif_receive_generic_xdp()
has a call to netif_get_rxqueue(skb), so for multi-queue veth it will
not be possible to grab a correct rxq.

To fix that, store queue_mapping independently of XDP prog presence on
peer interface.

Fixes: 638264dc9022 ("veth: Support per queue XDP ring")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Toshiaki Makita <toshiaki.makita1@gmail.com>
Link: https://lore.kernel.org/bpf/20210303152903.11172-1-maciej.fijalkowski@intel.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agosoc: ti: omap-prm: Fix occasional abort on reset deassert for dra7 iva
Tony Lindgren [Thu, 18 Feb 2021 11:46:33 +0000 (13:46 +0200)]
soc: ti: omap-prm: Fix occasional abort on reset deassert for dra7 iva

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit effe89e40037038db7711bdab5d3401fe297d72c ]

On reset deassert, we must wait a bit after the rstst bit change before
we allow clockdomain autoidle again. Otherwise we get the following oops
sometimes on dra7 with iva:

Unhandled fault: imprecise external abort (0x1406) at 0x00000000
44000000.ocp:L3 Standard Error: MASTER MPU TARGET IVA_CONFIG (Read Link):
At Address: 0x0005A410 : Data Access in User mode during Functional access
Internal error: : 1406 [#1] SMP ARM
...
(sysc_write_sysconfig) from [<c0782cb0>] (sysc_enable_module+0xcc/0x260)
(sysc_enable_module) from [<c0782f0c>] (sysc_runtime_resume+0xc8/0x174)
(sysc_runtime_resume) from [<c0a3e1ac>] (genpd_runtime_resume+0x94/0x224)
(genpd_runtime_resume) from [<c0a33f0c>] (__rpm_callback+0xd8/0x180)

It is unclear what all devices this might affect, but presumably other
devices with the rstst bit too can be affected. So let's just enable the
delay for all the devices with rstst bit for now. Later on we may want to
limit the list to the know affected devices if needed.

Fixes: d30cd83f6853 ("soc: ti: omap-prm: add support for denying idle for reset clockdomain")
Reported-by: Yongqin Liu <yongqin.liu@linaro.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agoARM: OMAP2+: Fix smartreflex init regression after dropping legacy data
Tony Lindgren [Wed, 10 Feb 2021 08:53:48 +0000 (10:53 +0200)]
ARM: OMAP2+: Fix smartreflex init regression after dropping legacy data

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit fbfa463be8dc7957ee4f81556e9e1ea2a951807d ]

When I dropped legacy data for omap4 and dra7 smartreflex in favor of
device tree based data, it seems I only testd for the "SmartReflex Class3
initialized" line in dmesg. I missed the fact that there is also
omap_devinit_smartreflex() that happens later, and now it produces an
error on boot for "No Voltage table for the corresponding vdd. Cannot
create debugfs entries for n-values".

This happens as we no longer have the smartreflex instance legacy data,
and have not yet moved completely to device tree based booting for the
driver. Let's fix the issue by changing the smartreflex init to use names.
This should all eventually go away in favor of doing the init in the
driver based on devicetree compatible value.

Note that dra7xx_init_early() is not calling any voltage domain init like
omap54xx_voltagedomains_init(), or a dra7 specific voltagedomains init.
This means that on dra7 smartreflex is still not fully initialized, and
also seems to be missing the related devicetree nodes.

Fixes: a6b1e717e942 ("ARM: OMAP2+: Drop legacy platform data for omap4 smartreflex")
Fixes: e54740b4afe8 ("ARM: OMAP2+: Drop legacy platform data for dra7 smartreflex")
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agosoc: ti: omap-prm: Fix reboot issue with invalid pcie reset map for dra7
Tony Lindgren [Wed, 10 Feb 2021 08:37:51 +0000 (10:37 +0200)]
soc: ti: omap-prm: Fix reboot issue with invalid pcie reset map for dra7

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit a249ca66d15fa4b54dc6deaff4155df3db1308e1 ]

Yongqin Liu <yongqin.liu@linaro.org> reported an issue where reboot hangs
on beagleboard-x15. This started happening after commit 7078a5ba7a58
("soc: ti: omap-prm: Fix boot time errors for rst_map_012 bits 0 and 1").

We now assert any 012 type resets on init to prevent unconfigured
accelerator MMUs getting enabled on init depending on the bootloader or
kexec configured state.

Turns out that we now also wrongly assert dra7 l3init domain PCIe reset
bits causing a hang during reboot. Let's fix the l3init reset bits to
use a 01 map instead of 012 map. There are only two rstctrl bits and not
three. This is documented in TRM "Table 3-1647. RM_PCIESS_RSTCTRL".

Fixes: 5a68c87afde0 ("soc: ti: omap-prm: dra7: add genpd support for remaining PRM instances")
Fixes: 7078a5ba7a58 ("soc: ti: omap-prm: Fix boot time errors for rst_map_012 bits 0 and 1")
Cc: Kishon Vijay Abraham I <kishon@ti.com>
Reported-by: Yongqin Liu <yongqin.liu@linaro.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agobus: omap_l3_noc: mark l3 irqs as IRQF_NO_THREAD
Grygorii Strashko [Thu, 28 Jan 2021 19:15:48 +0000 (21:15 +0200)]
bus: omap_l3_noc: mark l3 irqs as IRQF_NO_THREAD

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit 7d7275b3e866cf8092bd12553ec53ba26864f7bb ]

The main purpose of l3 IRQs is to catch OCP bus access errors and identify
corresponding code places by showing call stack, so it's important to
handle L3 interconnect errors as fast as possible. On RT these IRQs will
became threaded and will be scheduled much more late from the moment actual
error occurred so showing completely useless information.

Hence, mark l3 IRQs as IRQF_NO_THREAD so they will not be forced threaded
on RT or if force_irqthreads = true.

Fixes: 0ee7261c9212 ("drivers: bus: Move the OMAP interconnect driver to drivers/bus/")
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agodm ioctl: fix out of bounds array access when no devices
Mikulas Patocka [Fri, 26 Mar 2021 18:32:32 +0000 (14:32 -0400)]
dm ioctl: fix out of bounds array access when no devices

BugLink: https://bugs.launchpad.net/bugs/1922601
commit 4edbe1d7bcffcd6269f3b5eb63f710393ff2ec7a upstream.

If there are not any dm devices, we need to zero the "dev" argument in
the first structure dm_name_list. However, this can cause out of
bounds write, because the "needed" variable is zero and len may be
less than eight.

Fix this bug by reporting DM_BUFFER_FULL_FLAG if the result buffer is
too small to hold the "nl->dev" value.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: stable@vger.kernel.org
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agodm: don't report "detected capacity change" on device creation
Mikulas Patocka [Mon, 22 Mar 2021 14:13:54 +0000 (10:13 -0400)]
dm: don't report "detected capacity change" on device creation

BugLink: https://bugs.launchpad.net/bugs/1922601
commit 5424a0b867e65f1ecf34ffe88d091a4fcbb35bc1 upstream.

When a DM device is first created it doesn't yet have an established
capacity, therefore the use of set_capacity_and_notify() should be
conditional given the potential for needless pr_info "detected
capacity change" noise even if capacity is 0.

One could argue that the pr_info() in set_capacity_and_notify() is
misplaced, but that position is not held uniformly.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Fixes: f64d9b2eacb9 ("dm: use set_capacity_and_notify")
Cc: stable@vger.kernel.org
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agodm verity: fix DM_VERITY_OPTS_MAX value
JeongHyeon Lee [Thu, 11 Mar 2021 12:10:50 +0000 (21:10 +0900)]
dm verity: fix DM_VERITY_OPTS_MAX value

BugLink: https://bugs.launchpad.net/bugs/1922601
commit 160f99db943224e55906dd83880da1a704c6e6b9 upstream.

Three optional parameters must be accepted at once in a DM verity table, e.g.:
  (verity_error_handling_mode) (ignore_zero_block) (check_at_most_once)
Fix this to be possible by incrementing DM_VERITY_OPTS_MAX.

Signed-off-by: JeongHyeon Lee <jhs2.lee@samsung.com>
Fixes: 843f38d382b1 ("dm verity: add 'check_at_most_once' option to only validate hashes once")
Cc: stable@vger.kernel.org
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agodrm/i915: Fix the GT fence revocation runtime PM logic
Imre Deak [Mon, 22 Mar 2021 20:28:17 +0000 (22:28 +0200)]
drm/i915: Fix the GT fence revocation runtime PM logic

BugLink: https://bugs.launchpad.net/bugs/1922601
commit 8840e3bd981f128846b01c12d3966d115e8617c9 upstream.

To optimize some task deferring it until runtime resume unless someone
holds a runtime PM reference (because in this case the task can be done
w/o the overhead of runtime resume), we have to use the runtime PM
get-if-active logic: If the runtime PM usage count is 0 (and so
get-if-in-use would return false) the runtime suspend handler is not
necessarily called yet (it could be just pending), so the device is not
necessarily powered down, and so the runtime resume handler is not
guaranteed to be called.

The fence revocation depends on the above deferral, so add a
get-if-active helper and use it during fence revocation.

v2:
- Add code comment explaining the fence reg programming deferral logic
  to i915_vma_revoke_fence(). (Chris)
- Add Cc: stable and Fixes: tags. (Chris)
- Fix the function docbook comment.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: <stable@vger.kernel.org> # v4.12+
Fixes: 181df2d458f3 ("drm/i915: Take rpm wakelock for releasing the fence on unbind")
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210322204223.919936-1-imre.deak@intel.com
(cherry picked from commit 9d58aa46291d4d696bb1eac3436d3118f7bf2573)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agodrm/i915/dsc: fix DSS CTL register usage for ICL DSI transcoders
Jani Nikula [Fri, 19 Mar 2021 11:53:33 +0000 (13:53 +0200)]
drm/i915/dsc: fix DSS CTL register usage for ICL DSI transcoders

BugLink: https://bugs.launchpad.net/bugs/1922601
commit b61fde1beb6b1847f1743e75f4d9839acebad76a upstream.

Use the correct DSS CTL registers for ICL DSI transcoders.

As a side effect, this also brings back the sanity check for trying to
use pipe DSC registers on pipe A on ICL.

Fixes: 8a029c113b17 ("drm/i915/dp: Modify VDSC helpers to configure DSC for Bigjoiner slave")
References: http://lore.kernel.org/r/87eegxq2lq.fsf@intel.com
Cc: Manasi Navare <manasi.d.navare@intel.com>
Cc: Animesh Manna <animesh.manna@intel.com>
Cc: Vandita Kulkarni <vandita.kulkarni@intel.com>
Cc: <stable@vger.kernel.org> # v5.11+
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210319115333.8330-1-jani.nikula@intel.com
(cherry picked from commit 5706d02871240fdba7ddd6ab1cc31672fc95a90f)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agodrm/amdgpu: Add additional Sienna Cichlid PCI ID
Alex Deucher [Thu, 18 Mar 2021 20:44:10 +0000 (16:44 -0400)]
drm/amdgpu: Add additional Sienna Cichlid PCI ID

BugLink: https://bugs.launchpad.net/bugs/1922601
commit c933b111094f2818571fc51b81b98ee0d370c035 upstream.

Add new DID.

Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agodrm/amdgpu: fix the hibernation suspend with s0ix
Prike Liang [Tue, 9 Mar 2021 01:34:00 +0000 (09:34 +0800)]
drm/amdgpu: fix the hibernation suspend with s0ix

BugLink: https://bugs.launchpad.net/bugs/1922601
commit 9aa26019c1a60013ea866d460de6392acb1712ee upstream.

During system hibernation suspend still need un-gate gfx CG/PG firstly to handle HW
status check before HW resource destory.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agodrm/amdgpu/display: restore AUX_DPHY_TX_CONTROL for DCN2.x
Alex Deucher [Tue, 16 Feb 2021 17:22:40 +0000 (12:22 -0500)]
drm/amdgpu/display: restore AUX_DPHY_TX_CONTROL for DCN2.x

BugLink: https://bugs.launchpad.net/bugs/1922601
commit 5c458585c0141754cdcbf25feebb547dd671b559 upstream.

Commit 098214999c8f added fetching of the AUX_DPHY register
values from the vbios, but it also changed the default values
in the case when there are no values in the vbios.  This causes
problems with displays with high refresh rates.  To fix this,
switch back to the original default value for AUX_DPHY_TX_CONTROL.

Fixes: 098214999c8f ("drm/amd/display: Read VBIOS Golden Settings Tbl")
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1426
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Igor Kravchenko <Igor.Kravchenko@amd.com>
Cc: Aric Cyr <Aric.Cyr@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agodrm/amd/pm: workaround for audio noise issue
Kenneth Feng [Thu, 11 Mar 2021 04:19:57 +0000 (12:19 +0800)]
drm/amd/pm: workaround for audio noise issue

BugLink: https://bugs.launchpad.net/bugs/1922601
commit 9d03730ecbc5afabfda26d4dbb014310bc4ea4d9 upstream.

On some Intel platforms, audio noise can be detected due to
high pcie speed switch latency.
This patch leaverages ppfeaturemask to fix to the highest pcie
speed then disable pcie switching.

v2:
coding style fix

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agodrm/etnaviv: Use FOLL_FORCE for userptr
Daniel Vetter [Mon, 1 Mar 2021 09:52:53 +0000 (10:52 +0100)]
drm/etnaviv: Use FOLL_FORCE for userptr

BugLink: https://bugs.launchpad.net/bugs/1922601
commit cd5297b0855f17c8b4e3ef1d20c6a3656209c7b3 upstream.

Nothing checks userptr.ro except this call to pup_fast, which means
there's nothing actually preventing userspace from writing to this.
Which means you can just read-only mmap any file you want, userptr it
and then write to it with the gpu. Not good.

The right way to handle this is FOLL_WRITE | FOLL_FORCE, which will
break any COW mappings and update tracking for MAY_WRITE mappings so
there's no exploit and the vm isn't confused about what's going on.
For any legit use case there's no difference from what userspace can
observe and do.

Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Cc: stable@vger.kernel.org
Cc: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Russell King <linux+etnaviv@armlinux.org.uk>
Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
Cc: etnaviv@lists.freedesktop.org
Link: https://patchwork.freedesktop.org/patch/msgid/20210301095254.1946084-1-daniel.vetter@ffwll.ch
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agodrm/nouveau/kms/nve4-nv108: Limit cursors to 128x128
Lyude Paul [Fri, 5 Mar 2021 01:52:41 +0000 (20:52 -0500)]
drm/nouveau/kms/nve4-nv108: Limit cursors to 128x128

BugLink: https://bugs.launchpad.net/bugs/1922601
commit d3999c1f7bbbc100c167d7ad3cd79c1d10446ba2 upstream.

While Kepler does technically support 256x256 cursors, it turns out that
Kepler actually has some additional requirements for scanout surfaces that
we're not enforcing correctly, which aren't present on Maxwell and later.
Cursor surfaces must always use small pages (4K), and overlay surfaces must
always use large pages (128K).

Fixing this correctly though will take a bit more work: as we'll need to
add some code in prepare_fb() to move cursor FBs in large pages to small
pages, and vice-versa for overlay FBs. So until we have the time to do
that, just limit cursor surfaces to 128x128 - a size small enough to always
default to small pages.

This means small ovlys are still broken on Kepler, but it is extremely
unlikely anyone cares about those anyway :).

Signed-off-by: Lyude Paul <lyude@redhat.com>
Fixes: d3b2f0f7921c ("drm/nouveau/kms/nv50-: Report max cursor size to userspace")
Cc: <stable@vger.kernel.org> # v5.11+
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agointegrity: double check iint_cache was initialized
Mimi Zohar [Fri, 19 Mar 2021 15:17:23 +0000 (11:17 -0400)]
integrity: double check iint_cache was initialized

BugLink: https://bugs.launchpad.net/bugs/1922601
commit 92063f3ca73aab794bd5408d3361fd5b5ea33079 upstream.

The kernel may be built with multiple LSMs, but only a subset may be
enabled on the boot command line by specifying "lsm=".  Not including
"integrity" on the ordered LSM list may result in a NULL deref.

As reported by Dmitry Vyukov:
in qemu:
qemu-system-x86_64       -enable-kvm     -machine q35,nvdimm -cpu
max,migratable=off -smp 4       -m 4G,slots=4,maxmem=16G        -hda
wheezy.img      -kernel arch/x86/boot/bzImage   -nographic -vga std
 -soundhw all     -usb -usbdevice tablet  -bt hci -bt device:keyboard
   -net user,host=10.0.2.10,hostfwd=tcp::10022-:22 -net
nic,model=virtio-net-pci   -object
memory-backend-file,id=pmem1,share=off,mem-path=/dev/zero,size=64M
  -device nvdimm,id=nvdimm1,memdev=pmem1  -append "console=ttyS0
root=/dev/sda earlyprintk=serial rodata=n oops=panic panic_on_warn=1
panic=86400 lsm=smack numa=fake=2 nopcid dummy_hcd.num=8"   -pidfile
vm_pid -m 2G -cpu host

But it crashes on NULL deref in integrity_inode_get during boot:

Run /sbin/init as init process
BUG: kernel NULL pointer dereference, address: 000000000000001c
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP KASAN
CPU: 3 PID: 1 Comm: swapper/0 Not tainted 5.12.0-rc2+ #97
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
rel-1.13.0-44-g88ab0c15525c-prebuilt.qemu.org 04/01/2014
RIP: 0010:kmem_cache_alloc+0x2b/0x370 mm/slub.c:2920
Code: 57 41 56 41 55 41 54 41 89 f4 55 48 89 fd 53 48 83 ec 10 44 8b
3d d9 1f 90 0b 65 48 8b 04 25 28 00 00 00 48 89 44 24 08 31 c0 <8b> 5f
1c 4cf
RSP: 0000:ffffc9000032f9d8 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff888017fc4f00 RCX: 0000000000000000
RDX: ffff888040220000 RSI: 0000000000000c40 RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000000000 R09: ffff888019263627
R10: ffffffff83937cd1 R11: 0000000000000000 R12: 0000000000000c40
R13: ffff888019263538 R14: 0000000000000000 R15: 0000000000ffffff
FS:  0000000000000000(0000) GS:ffff88802d180000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000000001c CR3: 000000000b48e000 CR4: 0000000000750ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
 integrity_inode_get+0x47/0x260 security/integrity/iint.c:105
 process_measurement+0x33d/0x17e0 security/integrity/ima/ima_main.c:237
 ima_bprm_check+0xde/0x210 security/integrity/ima/ima_main.c:474
 security_bprm_check+0x7d/0xa0 security/security.c:845
 search_binary_handler fs/exec.c:1708 [inline]
 exec_binprm fs/exec.c:1761 [inline]
 bprm_execve fs/exec.c:1830 [inline]
 bprm_execve+0x764/0x19a0 fs/exec.c:1792
 kernel_execve+0x370/0x460 fs/exec.c:1973
 try_to_run_init_process+0x14/0x4e init/main.c:1366
 kernel_init+0x11d/0x1b8 init/main.c:1477
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
Modules linked in:
CR2: 000000000000001c
---[ end trace 22d601a500de7d79 ]---

Since LSMs and IMA may be configured at build time, but not enabled at
run time, panic the system if "integrity" was not initialized before use.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Fixes: 79f7865d844c ("LSM: Introduce "lsm=" for boottime LSM selection")
Cc: stable@vger.kernel.org
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agoARM: dts: at91-sama5d27_som1: fix phy address to 7
Claudiu Beznea [Wed, 11 Apr 2018 16:05:03 +0000 (19:05 +0300)]
ARM: dts: at91-sama5d27_som1: fix phy address to 7

BugLink: https://bugs.launchpad.net/bugs/1922601
commit 221c3a09ddf70a0a51715e6c2878d8305e95c558 upstream.

Fix the phy address to 7 for Ethernet PHY on SAMA5D27 SOM1. No
connection established if phy address 0 is used.

The board uses the 24 pins version of the KSZ8081RNA part, KSZ8081RNA
pin 16 REFCLK as PHYAD bit [2] has weak internal pull-down.  But at
reset, connected to PD09 of the MPU it's connected with an internal
pull-up forming PHYAD[2:0] = 7.

Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Fixes: 2f61929eb10a ("ARM: dts: at91: at91-sama5d27_som1: fix PHY ID")
Cc: Ludovic Desroches <ludovic.desroches@microchip.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Cc: <stable@vger.kernel.org> # 4.14+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agoARM: dts: at91: sam9x60: fix mux-mask to match product's datasheet
Nicolas Ferre [Wed, 10 Mar 2021 15:20:06 +0000 (16:20 +0100)]
ARM: dts: at91: sam9x60: fix mux-mask to match product's datasheet

BugLink: https://bugs.launchpad.net/bugs/1922601
commit 2c69c8a1736eace8de491d480e6e577a27c2087c upstream.

Fix the whole mux-mask table according to datasheet for the sam9x60
product.  Too much functions for pins were disabled leading to
misunderstandings when enabling more peripherals or taking this table
as an example for another board.
Take advantage of this fix to move the mux-mask in the SoC file where it
belongs and use lower case letters for hex numbers like everywhere in
the file.

Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Fixes: 1e5f532c2737 ("ARM: dts: at91: sam9x60: add device tree for soc and board")
Cc: <stable@vger.kernel.org> # 5.6+
Cc: Sandeep Sheriker Mallikarjun <sandeepsheriker.mallikarjun@microchip.com>
Reviewed-by: Tudor Ambarus <tudor.ambarus@microchip.com>
Link: https://lore.kernel.org/r/20210310152006.15018-1-nicolas.ferre@microchip.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agoARM: dts: at91: sam9x60: fix mux-mask for PA7 so it can be set to A, B and C
Federico Pellegrin [Sun, 7 Feb 2021 05:00:22 +0000 (06:00 +0100)]
ARM: dts: at91: sam9x60: fix mux-mask for PA7 so it can be set to A, B and C

BugLink: https://bugs.launchpad.net/bugs/1922601
commit 664979bba8169d775959452def968d1a7c03901f upstream.

According to the datasheet PA7 can be set to either function A, B or
C (see table 6-2 of DS60001579D). The previous value would permit just
configuring with function C.

Signed-off-by: Federico Pellegrin <fede@evolware.org>
Fixes: 1e5f532c2737 ("ARM: dts: at91: sam9x60: add device tree for soc and board")
Cc: <stable@vger.kernel.org> # 5.6+
Cc: Sandeep Sheriker Mallikarjun <sandeepsheriker.mallikarjun@microchip.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agoarm64: dts: ls1043a: mark crypto engine dma coherent
Horia Geantă [Sun, 7 Mar 2021 20:47:36 +0000 (22:47 +0200)]
arm64: dts: ls1043a: mark crypto engine dma coherent

BugLink: https://bugs.launchpad.net/bugs/1922601
commit 4fb3a074755b7737c4081cffe0ccfa08c2f2d29d upstream.

Crypto engine (CAAM) on LS1043A platform is configured HW-coherent,
mark accordingly the DT node.

Lack of "dma-coherent" property for an IP that is configured HW-coherent
can lead to problems, similar to what has been reported for LS1046A.

Cc: <stable@vger.kernel.org> # v4.8+
Fixes: 63dac35b58f4 ("arm64: dts: ls1043a: add crypto node")
Link: https://lore.kernel.org/linux-crypto/fe6faa24-d8f7-d18f-adfa-44fa0caa1598@arm.com
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Acked-by: Li Yang <leoyang.li@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agoarm64: dts: ls1012a: mark crypto engine dma coherent
Horia Geantă [Sun, 7 Mar 2021 20:47:37 +0000 (22:47 +0200)]
arm64: dts: ls1012a: mark crypto engine dma coherent

BugLink: https://bugs.launchpad.net/bugs/1922601
commit ba8da03fa7dff59d9400250aebd38f94cde3cb0f upstream.

Crypto engine (CAAM) on LS1012A platform is configured HW-coherent,
mark accordingly the DT node.

Lack of "dma-coherent" property for an IP that is configured HW-coherent
can lead to problems, similar to what has been reported for LS1046A.

Cc: <stable@vger.kernel.org> # v4.12+
Fixes: 85b85c569507 ("arm64: dts: ls1012a: add crypto node")
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Acked-by: Li Yang <leoyang.li@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agoarm64: dts: ls1046a: mark crypto engine dma coherent
Horia Geantă [Sun, 7 Mar 2021 20:47:35 +0000 (22:47 +0200)]
arm64: dts: ls1046a: mark crypto engine dma coherent

BugLink: https://bugs.launchpad.net/bugs/1922601
commit 9c3a16f88385e671b63a0de7b82b85e604a80f42 upstream.

Crypto engine (CAAM) on LS1046A platform is configured HW-coherent,
mark accordingly the DT node.

As reported by Greg and Sascha, and explained by Robin, lack of
"dma-coherent" property for an IP that is configured HW-coherent
can lead to problems, e.g. on v5.11:

> kernel BUG at drivers/crypto/caam/jr.c:247!
> Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
> Modules linked in:
> CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.11.0-20210225-3-00039-g434215968816-dirty #12
> Hardware name: TQ TQMLS1046A SoM on Arkona AT1130 (C300) board (DT)
> pstate: 60000005 (nZCv daif -PAN -UAO -TCO BTYPE=--)
> pc : caam_jr_dequeue+0x98/0x57c
> lr : caam_jr_dequeue+0x98/0x57c
> sp : ffff800010003d50
> x29: ffff800010003d50 x28: ffff8000118d4000
> x27: ffff8000118d4328 x26: 00000000000001f0
> x25: ffff0008022be480 x24: ffff0008022c6410
> x23: 00000000000001f1 x22: ffff8000118d4329
> x21: 0000000000004d80 x20: 00000000000001f1
> x19: 0000000000000001 x18: 0000000000000020
> x17: 0000000000000000 x16: 0000000000000015
> x15: ffff800011690230 x14: 2e2e2e2e2e2e2e2e
> x13: 2e2e2e2e2e2e2020 x12: 3030303030303030
> x11: ffff800011700a38 x10: 00000000fffff000
> x9 : ffff8000100ada30 x8 : ffff8000116a8a38
> x7 : 0000000000000001 x6 : 0000000000000000
> x5 : 0000000000000000 x4 : 0000000000000000
> x3 : 00000000ffffffff x2 : 0000000000000000
> x1 : 0000000000000000 x0 : 0000000000001800
> Call trace:
>  caam_jr_dequeue+0x98/0x57c
>  tasklet_action_common.constprop.0+0x164/0x18c
>  tasklet_action+0x44/0x54
>  __do_softirq+0x160/0x454
>  __irq_exit_rcu+0x164/0x16c
>  irq_exit+0x1c/0x30
>  __handle_domain_irq+0xc0/0x13c
>  gic_handle_irq+0x5c/0xf0
>  el1_irq+0xb4/0x180
>  arch_cpu_idle+0x18/0x30
>  default_idle_call+0x3c/0x1c0
>  do_idle+0x23c/0x274
>  cpu_startup_entry+0x34/0x70
>  rest_init+0xdc/0xec
>  arch_call_rest_init+0x1c/0x28
>  start_kernel+0x4ac/0x4e4
> Code: 91392021 912c2000 d377d8c6 97f24d96 (d4210000)

Cc: <stable@vger.kernel.org> # v4.10+
Fixes: 8126d88162a5 ("arm64: dts: add QorIQ LS1046A SoC support")
Link: https://lore.kernel.org/linux-crypto/fe6faa24-d8f7-d18f-adfa-44fa0caa1598@arm.com
Reported-by: Greg Ungerer <gerg@kernel.org>
Reported-by: Sascha Hauer <s.hauer@pengutronix.de>
Tested-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Acked-by: Greg Ungerer <gerg@kernel.org>
Acked-by: Li Yang <leoyang.li@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agoarm64: stacktrace: don't trace arch_stack_walk()
Mark Rutland [Fri, 19 Mar 2021 18:41:06 +0000 (18:41 +0000)]
arm64: stacktrace: don't trace arch_stack_walk()

BugLink: https://bugs.launchpad.net/bugs/1922601
commit c607ab4f916d4d5259072eca34055d3f5a795c21 upstream.

We recently converted arm64 to use arch_stack_walk() in commit:

  5fc57df2f6fd ("arm64: stacktrace: Convert to ARCH_STACKWALK")

The core stacktrace code expects that (when tracing the current task)
arch_stack_walk() starts a trace at its caller, and does not include
itself in the trace. However, arm64's arch_stack_walk() includes itself,
and so traces include one more entry than callers expect. The core
stacktrace code which calls arch_stack_walk() tries to skip a number of
entries to prevent itself appearing in a trace, and the additional entry
prevents skipping one of the core stacktrace functions, leaving this in
the trace unexpectedly.

We can fix this by having arm64's arch_stack_walk() begin the trace with
its caller. The first value returned by the trace will be
__builtin_return_address(0), i.e. the caller of arch_stack_walk(). The
first frame record to be unwound will be __builtin_frame_address(1),
i.e. the caller's frame record. To prevent surprises, arch_stack_walk()
is also marked noinline.

While __builtin_frame_address(1) is not safe in portable code, local GCC
developers have confirmed that it is safe on arm64. To find the caller's
frame record, the builtin can safely dereference the current function's
frame record or (in theory) could stash the original FP into another GPR
at function entry time, neither of which are problematic.

Prior to this patch, the tracing code would unexpectedly show up in
traces of the current task, e.g.

| # cat /proc/self/stack
| [<0>] stack_trace_save_tsk+0x98/0x100
| [<0>] proc_pid_stack+0xb4/0x130
| [<0>] proc_single_show+0x60/0x110
| [<0>] seq_read_iter+0x230/0x4d0
| [<0>] seq_read+0xdc/0x130
| [<0>] vfs_read+0xac/0x1e0
| [<0>] ksys_read+0x6c/0xfc
| [<0>] __arm64_sys_read+0x20/0x30
| [<0>] el0_svc_common.constprop.0+0x60/0x120
| [<0>] do_el0_svc+0x24/0x90
| [<0>] el0_svc+0x2c/0x54
| [<0>] el0_sync_handler+0x1a4/0x1b0
| [<0>] el0_sync+0x170/0x180

After this patch, the tracing code will not show up in such traces:

| # cat /proc/self/stack
| [<0>] proc_pid_stack+0xb4/0x130
| [<0>] proc_single_show+0x60/0x110
| [<0>] seq_read_iter+0x230/0x4d0
| [<0>] seq_read+0xdc/0x130
| [<0>] vfs_read+0xac/0x1e0
| [<0>] ksys_read+0x6c/0xfc
| [<0>] __arm64_sys_read+0x20/0x30
| [<0>] el0_svc_common.constprop.0+0x60/0x120
| [<0>] do_el0_svc+0x24/0x90
| [<0>] el0_svc+0x2c/0x54
| [<0>] el0_sync_handler+0x1a4/0x1b0
| [<0>] el0_sync+0x170/0x180

Erring on the side of caution, I've given this a spin with a bunch of
toolchains, verifying the output of /proc/self/stack and checking that
the assembly looked sound. For GCC (where we require version 5.1.0 or
later) I tested with the kernel.org crosstool binares for versions
5.5.0, 6.4.0, 6.5.0, 7.3.0, 7.5.0, 8.1.0, 8.3.0, 8.4.0, 9.2.0, and
10.1.0. For clang (where we require version 10.0.1 or later) I tested
with the llvm.org binary releases of 11.0.0, and 11.0.1.

Fixes: 5fc57df2f6fd ("arm64: stacktrace: Convert to ARCH_STACKWALK")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chen Jun <chenjun102@huawei.com>
Cc: Marco Elver <elver@google.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: <stable@vger.kernel.org> # 5.10.x
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20210319184106.5688-1-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agoACPICA: Always create namespace nodes using acpi_ns_create_node()
Vegard Nossum [Tue, 23 Mar 2021 21:20:33 +0000 (14:20 -0700)]
ACPICA: Always create namespace nodes using acpi_ns_create_node()

BugLink: https://bugs.launchpad.net/bugs/1922601
commit 25928deeb1e4e2cdae1dccff349320c6841eb5f8 upstream.

ACPICA commit 29da9a2a3f5b2c60420893e5c6309a0586d7a329

ACPI is allocating an object using kmalloc(), but then frees it
using kmem_cache_free(<"Acpi-Namespace" kmem_cache>).

This is wrong and can lead to boot failures manifesting like this:

    hpet0: 3 comparators, 64-bit 100.000000 MHz counter
    clocksource: Switched to clocksource tsc-early
    BUG: unable to handle page fault for address: 000000003ffe0018
    #PF: supervisor read access in kernel mode
    #PF: error_code(0x0000) - not-present page
    PGD 0 P4D 0
    Oops: 0000 [#1] SMP PTI
    CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.6.0+ #211
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
Ubuntu-1.8.2-1ubuntu1 04/01/2014
    RIP: 0010:kmem_cache_alloc+0x70/0x1d0
    Code: 00 00 4c 8b 45 00 65 49 8b 50 08 65 4c 03 05 6f cc e7 7e 4d 8b
20 4d 85 e4 0f 84 3d 01 00 00 8b 45 20 48 8b 7d 00 48 8d 4a 01 <49> 8b
   1c 04 4c 89 e0 65 48 0f c7 0f 0f 94 c0 84 c0 74 c5 8b 45 20
    RSP: 0000:ffffc90000013df8 EFLAGS: 00010206
    RAX: 0000000000000018 RBX: ffffffff81c49200 RCX: 0000000000000002
    RDX: 0000000000000001 RSI: 0000000000000dc0 RDI: 000000000002b300
    RBP: ffff88803e403d00 R08: ffff88803ec2b300 R09: 0000000000000001
    R10: 0000000000000dc0 R11: 0000000000000006 R12: 000000003ffe0000
    R13: ffffffff8110a583 R14: 0000000000000dc0 R15: ffffffff81c49a80
    FS:  0000000000000000(0000) GS:ffff88803ec00000(0000)
knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 000000003ffe0018 CR3: 0000000001c0a001 CR4: 00000000003606f0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
     __trace_define_field+0x33/0xa0
     event_trace_init+0xeb/0x2b4
     tracer_init_tracefs+0x60/0x195
     ? register_tracer+0x1e7/0x1e7
     do_one_initcall+0x74/0x160
     kernel_init_freeable+0x190/0x1f0
     ? rest_init+0x9a/0x9a
     kernel_init+0x5/0xf6
     ret_from_fork+0x35/0x40
    CR2: 000000003ffe0018
    ---[ end trace 707efa023f2ee960 ]---
    RIP: 0010:kmem_cache_alloc+0x70/0x1d0

Bisection leads to unrelated changes in slab; Vlastimil Babka
suggests an unrelated layout or slab merge change merely exposed
the underlying bug.

Link: https://lore.kernel.org/lkml/4dc93ff8-f86e-f4c9-ebeb-6d3153a78d03@oracle.com/
Link: https://lore.kernel.org/r/a1461e21-c744-767d-6dfc-6641fd3e3ce2@siemens.com
Link: https://github.com/acpica/acpica/commit/29da9a2a
Fixes: f79c8e4136ea ("ACPICA: Namespace: simplify creation of the initial/default namespace")
Reported-by: Jan Kiszka <jan.kiszka@siemens.com>
Diagnosed-by: Vlastimil Babka <vbabka@suse.cz>
Diagnosed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Erik Kaneda <erik.kaneda@intel.com>
Cc: 5.10+ <stable@vger.kernel.org> # 5.10+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agoACPI: video: Add missing callback back for Sony VPCEH3U1E
Chris Chiu [Fri, 12 Mar 2021 03:24:30 +0000 (11:24 +0800)]
ACPI: video: Add missing callback back for Sony VPCEH3U1E

BugLink: https://bugs.launchpad.net/bugs/1922601
commit c1d1e25a8c542816ae8dee41b81a18d30c7519a0 upstream.

The .callback of the quirk for Sony VPCEH3U1E was unintetionally
removed by the commit 25417185e9b5 ("ACPI: video: Add DMI quirk
for GIGABYTE GB-BXBT-2807"). Add it back to make sure the quirk
for Sony VPCEH3U1E works as expected.

Fixes: 25417185e9b5 ("ACPI: video: Add DMI quirk for GIGABYTE GB-BXBT-2807")
Signed-off-by: Chris Chiu <chris.chiu@canonical.com>
Reported-by: Pavel Machek <pavel@ucw.cz>
Reviewed-by: Pavel Machek (CIP) <pavel@denx.de>
Cc: 5.11+ <stable@vger.kernel.org> # 5.11+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agomm/highmem: fix CONFIG_DEBUG_KMAP_LOCAL_FORCE_MAP
Ira Weiny [Thu, 25 Mar 2021 04:37:53 +0000 (21:37 -0700)]
mm/highmem: fix CONFIG_DEBUG_KMAP_LOCAL_FORCE_MAP

BugLink: https://bugs.launchpad.net/bugs/1922601
commit 487cfade12fae0eb707bdce71c4d585128238a7d upstream.

The kernel test robot found that __kmap_local_sched_out() was not
correctly skipping the guard pages when DEBUG_KMAP_LOCAL_FORCE_MAP was
set.[1] This was due to DEBUG_HIGHMEM check being used.

Change the configuration check to be correct.

[1] https://lore.kernel.org/lkml/20210304083825.GB17830@xsang-OptiPlex-9020/

Link: https://lkml.kernel.org/r/20210318230657.1497881-1-ira.weiny@intel.com
Fixes: 0e91a0c6984c ("mm/highmem: Provide CONFIG_DEBUG_KMAP_LOCAL_FORCE_MAP")
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Reported-by: kernel test robot <oliver.sang@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Oliver Sang <oliver.sang@intel.com>
Cc: Chaitanya Kulkarni <Chaitanya.Kulkarni@wdc.com>
Cc: David Sterba <dsterba@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agogcov: fix clang-11+ support
Nick Desaulniers [Thu, 25 Mar 2021 04:37:44 +0000 (21:37 -0700)]
gcov: fix clang-11+ support

BugLink: https://bugs.launchpad.net/bugs/1922601
commit 60bcf728ee7c60ac2a1f9a0eaceb3a7b3954cd2b upstream.

LLVM changed the expected function signatures for llvm_gcda_start_file()
and llvm_gcda_emit_function() in the clang-11 release.  Users of
clang-11 or newer may have noticed their kernels failing to boot due to
a panic when enabling CONFIG_GCOV_KERNEL=y +CONFIG_GCOV_PROFILE_ALL=y.
Fix up the function signatures so calling these functions doesn't panic
the kernel.

Link: https://reviews.llvm.org/rGcdd683b516d147925212724b09ec6fb792a40041
Link: https://reviews.llvm.org/rG13a633b438b6500ecad9e4f936ebadf3411d0f44
Link: https://lkml.kernel.org/r/20210312224132.3413602-2-ndesaulniers@google.com
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Reported-by: Prasad Sodagudi <psodagud@quicinc.com>
Suggested-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Fangrui Song <maskray@google.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Cc: <stable@vger.kernel.org> [5.4+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agokasan: fix per-page tags for non-page_alloc pages
Andrey Konovalov [Thu, 25 Mar 2021 04:37:20 +0000 (21:37 -0700)]
kasan: fix per-page tags for non-page_alloc pages

BugLink: https://bugs.launchpad.net/bugs/1922601
commit cf10bd4c4aff8dd64d1aa7f2a529d0c672bc16af upstream.

To allow performing tag checks on page_alloc addresses obtained via
page_address(), tag-based KASAN modes store tags for page_alloc
allocations in page->flags.

Currently, the default tag value stored in page->flags is 0x00.
Therefore, page_address() returns a 0x00ffff...  address for pages that
were not allocated via page_alloc.

This might cause problems.  A particular case we encountered is a
conflict with KFENCE.  If a KFENCE-allocated slab object is being freed
via kfree(page_address(page) + offset), the address passed to kfree()
will get tagged with 0x00 (as slab pages keep the default per-page
tags).  This leads to is_kfence_address() check failing, and a KFENCE
object ending up in normal slab freelist, which causes memory
corruptions.

This patch changes the way KASAN stores tag in page-flags: they are now
stored xor'ed with 0xff.  This way, KASAN doesn't need to initialize
per-page flags for every created page, which might be slow.

With this change, page_address() returns natively-tagged (with 0xff)
pointers for pages that didn't have tags set explicitly.

This patch fixes the encountered conflict with KFENCE and prevents more
similar issues that can occur in the future.

Link: https://lkml.kernel.org/r/1a41abb11c51b264511d9e71c303bb16d5cb367b.1615475452.git.andreyknvl@google.com
Fixes: 2813b9c02962 ("kasan, mm, arm64: tag non slab memory allocated via pagealloc")
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Peter Collingbourne <pcc@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Branislav Rankov <Branislav.Rankov@arm.com>
Cc: Kevin Brodsky <kevin.brodsky@arm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agohugetlb_cgroup: fix imbalanced css_get and css_put pair for shared mappings
Miaohe Lin [Thu, 25 Mar 2021 04:37:17 +0000 (21:37 -0700)]
hugetlb_cgroup: fix imbalanced css_get and css_put pair for shared mappings

BugLink: https://bugs.launchpad.net/bugs/1922601
commit d85aecf2844ff02a0e5f077252b2461d4f10c9f0 upstream.

The current implementation of hugetlb_cgroup for shared mappings could
have different behavior.  Consider the following two scenarios:

 1.Assume initial css reference count of hugetlb_cgroup is 1:
  1.1 Call hugetlb_reserve_pages with from = 1, to = 2. So css reference
      count is 2 associated with 1 file_region.
  1.2 Call hugetlb_reserve_pages with from = 2, to = 3. So css reference
      count is 3 associated with 2 file_region.
  1.3 coalesce_file_region will coalesce these two file_regions into
      one. So css reference count is 3 associated with 1 file_region
      now.

 2.Assume initial css reference count of hugetlb_cgroup is 1 again:
  2.1 Call hugetlb_reserve_pages with from = 1, to = 3. So css reference
      count is 2 associated with 1 file_region.

Therefore, we might have one file_region while holding one or more css
reference counts. This inconsistency could lead to imbalanced css_get()
and css_put() pair. If we do css_put one by one (i.g. hole punch case),
scenario 2 would put one more css reference. If we do css_put all
together (i.g. truncate case), scenario 1 will leak one css reference.

The imbalanced css_get() and css_put() pair would result in a non-zero
reference when we try to destroy the hugetlb cgroup. The hugetlb cgroup
directory is removed __but__ associated resource is not freed. This
might result in OOM or can not create a new hugetlb cgroup in a busy
workload ultimately.

In order to fix this, we have to make sure that one file_region must
hold exactly one css reference. So in coalesce_file_region case, we
should release one css reference before coalescence. Also only put css
reference when the entire file_region is removed.

The last thing to note is that the caller of region_add() will only hold
one reference to h_cg->css for the whole contiguous reservation region.
But this area might be scattered when there are already some
file_regions reside in it. As a result, many file_regions may share only
one h_cg->css reference. In order to ensure that one file_region must
hold exactly one css reference, we should do css_get() for each
file_region and release the reference held by caller when they are done.

[linmiaohe@huawei.com: fix imbalanced css_get and css_put pair for shared mappings]
Link: https://lkml.kernel.org/r/20210316023002.53921-1-linmiaohe@huawei.com
Link: https://lkml.kernel.org/r/20210301120540.37076-1-linmiaohe@huawei.com
Fixes: 075a61d07a8e ("hugetlb_cgroup: add accounting for shared mappings")
Reported-by: kernel test robot <lkp@intel.com> (auto build test ERROR)
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Wanpeng Li <liwp.linux@gmail.com>
Cc: Mina Almasry <almasrymina@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agosquashfs: fix xattr id and id lookup sanity checks
Phillip Lougher [Thu, 25 Mar 2021 04:37:35 +0000 (21:37 -0700)]
squashfs: fix xattr id and id lookup sanity checks

BugLink: https://bugs.launchpad.net/bugs/1922601
commit 8b44ca2b634527151af07447a8090a5f3a043321 upstream.

The checks for maximum metadata block size is missing
SQUASHFS_BLOCK_OFFSET (the two byte length count).

Link: https://lkml.kernel.org/r/2069685113.2081245.1614583677427@webmail.123-reg.co.uk
Fixes: f37aa4c7366e23f ("squashfs: add more sanity checks in id lookup")
Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
Cc: Sean Nyekjaer <sean@geanix.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agosquashfs: fix inode lookup sanity checks
Sean Nyekjaer [Thu, 25 Mar 2021 04:37:32 +0000 (21:37 -0700)]
squashfs: fix inode lookup sanity checks

BugLink: https://bugs.launchpad.net/bugs/1922601
commit c1b2028315c6b15e8d6725e0d5884b15887d3daa upstream.

When mouting a squashfs image created without inode compression it fails
with: "unable to read inode lookup table"

It turns out that the BLOCK_OFFSET is missing when checking the
SQUASHFS_METADATA_SIZE agaist the actual size.

Link: https://lkml.kernel.org/r/20210226092903.1473545-1-sean@geanix.com
Fixes: eabac19e40c0 ("squashfs: add more sanity checks in inode lookup")
Signed-off-by: Sean Nyekjaer <sean@geanix.com>
Acked-by: Phillip Lougher <phillip@squashfs.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agoz3fold: prevent reclaim/free race for headless pages
Thomas Hebb [Thu, 25 Mar 2021 04:37:29 +0000 (21:37 -0700)]
z3fold: prevent reclaim/free race for headless pages

BugLink: https://bugs.launchpad.net/bugs/1922601
commit 6d679578fe9c762c8fbc3d796a067cbba84a7884 upstream.

Commit ca0246bb97c2 ("z3fold: fix possible reclaim races") introduced
the PAGE_CLAIMED flag "to avoid racing on a z3fold 'headless' page
release." By atomically testing and setting the bit in each of
z3fold_free() and z3fold_reclaim_page(), a double-free was avoided.

However, commit dcf5aedb24f8 ("z3fold: stricter locking and more careful
reclaim") appears to have unintentionally broken this behavior by moving
the PAGE_CLAIMED check in z3fold_reclaim_page() to after the page lock
gets taken, which only happens for non-headless pages.  For headless
pages, the check is now skipped entirely and races can occur again.

I have observed such a race on my system:

    page:00000000ffbd76b7 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x165316
    flags: 0x2ffff0000000000()
    raw: 02ffff0000000000 ffffea0004535f48 ffff8881d553a170 0000000000000000
    raw: 0000000000000000 0000000000000011 00000000ffffffff 0000000000000000
    page dumped because: VM_BUG_ON_PAGE(page_ref_count(page) == 0)
    ------------[ cut here ]------------
    kernel BUG at include/linux/mm.h:707!
    invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
    CPU: 2 PID: 291928 Comm: kworker/2:0 Tainted: G    B             5.10.7-arch1-1-kasan #1
    Hardware name: Gigabyte Technology Co., Ltd. H97N-WIFI/H97N-WIFI, BIOS F9b 03/03/2016
    Workqueue: zswap-shrink shrink_worker
    RIP: 0010:__free_pages+0x10a/0x130
    Code: c1 e7 06 48 01 ef 45 85 e4 74 d1 44 89 e6 31 d2 41 83 ec 01 e8 e7 b0 ff ff eb da 48 c7 c6 e0 32 91 88 48 89 ef e8 a6 89 f8 ff <0f> 0b 4c 89 e7 e8 fc 79 07 00 e9 33 ff ff ff 48 89 ef e8 ff 79 07
    RSP: 0000:ffff88819a2ffb98 EFLAGS: 00010296
    RAX: 0000000000000000 RBX: ffffea000594c5a8 RCX: 0000000000000000
    RDX: 1ffffd4000b298b7 RSI: 0000000000000000 RDI: ffffea000594c5b8
    RBP: ffffea000594c580 R08: 000000000000003e R09: ffff8881d5520bbb
    R10: ffffed103aaa4177 R11: 0000000000000001 R12: ffffea000594c5b4
    R13: 0000000000000000 R14: ffff888165316000 R15: ffffea000594c588
    FS:  0000000000000000(0000) GS:ffff8881d5500000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00007f7c8c3654d8 CR3: 0000000103f42004 CR4: 00000000001706e0
    Call Trace:
     z3fold_zpool_shrink+0x9b6/0x1240
     shrink_worker+0x35/0x90
     process_one_work+0x70c/0x1210
     worker_thread+0x539/0x1200
     kthread+0x330/0x400
     ret_from_fork+0x22/0x30
    Modules linked in: rfcomm ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ccm algif_aead des_generic libdes ecb algif_skcipher cmac bnep md4 algif_hash af_alg vfat fat intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel iwlmvm hid_logitech_hidpp kvm at24 mac80211 snd_hda_codec_realtek iTCO_wdt snd_hda_codec_generic intel_pmc_bxt snd_hda_codec_hdmi ledtrig_audio iTCO_vendor_support mei_wdt mei_hdcp snd_hda_intel snd_intel_dspcfg libarc4 soundwire_intel irqbypass iwlwifi soundwire_generic_allocation rapl soundwire_cadence intel_cstate snd_hda_codec intel_uncore btusb joydev mousedev snd_usb_audio pcspkr btrtl uvcvideo nouveau btbcm i2c_i801 btintel snd_hda_core videobuf2_vmalloc i2c_smbus snd_usbmidi_lib videobuf2_memops bluetooth snd_hwdep soundwire_bus snd_soc_rt5640 videobuf2_v4l2 cfg80211 snd_soc_rl6231 videobuf2_common snd_rawmidi lpc_ich alx videodev mdio snd_seq_device snd_soc_core mc ecdh_generic mxm_wmi mei_me
     hid_logitech_dj wmi snd_compress e1000e ac97_bus mei ttm rfkill snd_pcm_dmaengine ecc snd_pcm snd_timer snd soundcore mac_hid acpi_pad pkcs8_key_parser it87 hwmon_vid crypto_user fuse ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 dm_crypt cbc encrypted_keys trusted tpm rng_core usbhid dm_mod crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd cryptd glue_helper xhci_pci xhci_pci_renesas i915 video intel_gtt i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec drm agpgart
    ---[ end trace 126d646fc3dc0ad8 ]---

To fix the issue, re-add the earlier test and set in the case where we
have a headless page.

Link: https://lkml.kernel.org/r/c8106dbe6d8390b290cd1d7f873a2942e805349e.1615452048.git.tommyhebb@gmail.com
Fixes: dcf5aedb24f8 ("z3fold: stricter locking and more careful reclaim")
Signed-off-by: Thomas Hebb <tommyhebb@gmail.com>
Reviewed-by: Vitaly Wool <vitaly.wool@konsulko.com>
Cc: Jongseok Kim <ks77sj@gmail.com>
Cc: Snild Dolkow <snild@sony.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agopsample: Fix user API breakage
Ido Schimmel [Wed, 24 Mar 2021 19:43:32 +0000 (21:43 +0200)]
psample: Fix user API breakage

BugLink: https://bugs.launchpad.net/bugs/1922601
commit e43accba9b071dcd106b5e7643b1b106a158cbb1 upstream.

Cited commit added a new attribute before the existing group reference
count attribute, thereby changing its value and breaking existing
applications on new kernels.

Before:

 # psample -l
 libpsample ERROR psample_group_foreach: failed to recv message: Operation not supported

After:

 # psample -l
 Group Num       Refcount        Group Seq
 1               1               0

Fix by restoring the value of the old attribute and remove the
misleading comments from the enumerator to avoid future bugs.

Cc: stable@vger.kernel.org
Fixes: d8bed686ab96 ("net: psample: Add tunnel support")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reported-by: Adiel Bidani <adielb@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agoplatform/x86: intel-vbtn: Stop reporting SW_DOCK events
Hans de Goede [Sun, 21 Mar 2021 16:35:13 +0000 (17:35 +0100)]
platform/x86: intel-vbtn: Stop reporting SW_DOCK events

BugLink: https://bugs.launchpad.net/bugs/1922601
commit 538d2dd0b9920334e6596977a664e9e7bac73703 upstream.

Stop reporting SW_DOCK events because this breaks suspend-on-lid-close.

SW_DOCK should only be reported for docking stations, but all the DSDTs in
my DSDT collection which use the intel-vbtn code, always seem to use this
for 2-in-1s / convertibles and set SW_DOCK=1 when in laptop-mode (in tandem
with setting SW_TABLET_MODE=0).

This causes userspace to think the laptop is docked to a port-replicator
and to disable suspend-on-lid-close, which is undesirable.

Map the dock events to KEY_IGNORE to avoid this broken SW_DOCK reporting.

Note this may theoretically cause us to stop reporting SW_DOCK on some
device where the 0xCA and 0xCB intel-vbtn events are actually used for
reporting docking to a classic docking-station / port-replicator but
I'm not aware of any such devices.

Also the most important thing is that we only report SW_DOCK when it
reliably reports being docked to a classic docking-station without any
false positives, which clearly is not the case here. If there is a
chance of reporting false positives then it is better to not report
SW_DOCK at all.

Cc: stable@vger.kernel.org
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20210321163513.72328-1-hdegoede@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agonetsec: restore phy power state after controller reset
Mian Yousaf Kaukab [Thu, 18 Mar 2021 08:50:26 +0000 (09:50 +0100)]
netsec: restore phy power state after controller reset

BugLink: https://bugs.launchpad.net/bugs/1922601
commit 804741ac7b9f2fdebe3740cb0579cb8d94d49e60 upstream.

Since commit 8e850f25b581 ("net: socionext: Stop PHY before resetting
netsec") netsec_netdev_init() power downs phy before resetting the
controller. However, the state is not restored once the reset is
complete. As a result it is not possible to bring up network on a
platform with Broadcom BCM5482 phy.

Fix the issue by restoring phy power state after controller reset is
complete.

Fixes: 8e850f25b581 ("net: socionext: Stop PHY before resetting netsec")
Cc: stable@vger.kernel.org
Signed-off-by: Mian Yousaf Kaukab <ykaukab@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agoselinux: fix variable scope issue in live sidtab conversion
Ondrej Mosnacek [Thu, 18 Mar 2021 21:53:02 +0000 (22:53 +0100)]
selinux: fix variable scope issue in live sidtab conversion

BugLink: https://bugs.launchpad.net/bugs/1922601
commit 6406887a12ee5dcdaffff1a8508d91113d545559 upstream.

Commit 02a52c5c8c3b ("selinux: move policy commit after updating
selinuxfs") moved the selinux_policy_commit() call out of
security_load_policy() into sel_write_load(), which caused a subtle yet
rather serious bug.

The problem is that security_load_policy() passes a reference to the
convert_params local variable to sidtab_convert(), which stores it in
the sidtab, where it may be accessed until the policy is swapped over
and RCU synchronized. Before 02a52c5c8c3b, selinux_policy_commit() was
called directly from security_load_policy(), so the convert_params
pointer remained valid all the way until the old sidtab was destroyed,
but now that's no longer the case and calls to sidtab_context_to_sid()
on the old sidtab after security_load_policy() returns may cause invalid
memory accesses.

This can be easily triggered using the stress test from commit
ee1a84fdfeed ("selinux: overhaul sidtab to fix bug and improve
performance"):
```
function rand_cat() {
echo $(( $RANDOM % 1024 ))
}

function do_work() {
while true; do
echo -n "system_u:system_r:kernel_t:s0:c$(rand_cat),c$(rand_cat)" \
>/sys/fs/selinux/context 2>/dev/null || true
done
}

do_work >/dev/null &
do_work >/dev/null &
do_work >/dev/null &

while load_policy; do echo -n .; sleep 0.1; done

kill %1
kill %2
kill %3
```

Fix this by allocating the temporary sidtab convert structures
dynamically and passing them among the
selinux_policy_{load,cancel,commit} functions.

Fixes: 02a52c5c8c3b ("selinux: move policy commit after updating selinuxfs")
Cc: stable@vger.kernel.org
Tested-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Reviewed-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
[PM: merge fuzz in security.h and services.c]
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agoselinux: don't log MAC_POLICY_LOAD record on failed policy load
Ondrej Mosnacek [Thu, 18 Mar 2021 21:53:01 +0000 (22:53 +0100)]
selinux: don't log MAC_POLICY_LOAD record on failed policy load

BugLink: https://bugs.launchpad.net/bugs/1922601
commit 519dad3bcd809dc1523bf80ab0310ddb3bf00ade upstream.

If sel_make_policy_nodes() fails, we should jump to 'out', not 'out1',
as the latter would incorrectly log an MAC_POLICY_LOAD audit record,
even though the policy hasn't actually been reloaded. The 'out1' jump
label now becomes unused and can be removed.

Fixes: 02a52c5c8c3b ("selinux: move policy commit after updating selinuxfs")
Cc: stable@vger.kernel.org
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agobtrfs: fix subvolume/snapshot deletion not triggered on mount
Filipe Manana [Tue, 16 Mar 2021 16:53:46 +0000 (16:53 +0000)]
btrfs: fix subvolume/snapshot deletion not triggered on mount

BugLink: https://bugs.launchpad.net/bugs/1922601
commit 8d488a8c7ba22d7112fbf6b0a82beb1cdea1c0d5 upstream.

During the mount procedure we are calling btrfs_orphan_cleanup() against
the root tree, which will find all orphans items in this tree. When an
orphan item corresponds to a deleted subvolume/snapshot (instead of an
inode space cache), it must not delete the orphan item, because that will
cause btrfs_find_orphan_roots() to not find the orphan item and therefore
not add the corresponding subvolume root to the list of dead roots, which
results in the subvolume's tree never being deleted by the cleanup thread.

The same applies to the remount from RO to RW path.

Fix this by making btrfs_find_orphan_roots() run before calling
btrfs_orphan_cleanup() against the root tree.

A test case for fstests will follow soon.

Reported-by: Robbie Ko <robbieko@synology.com>
Link: https://lore.kernel.org/linux-btrfs/b19f4310-35e0-606e-1eea-2dd84d28c5da@synology.com/
Fixes: 638331fa56caea ("btrfs: fix transaction leak and crash after cleaning up orphans on RO mount")
CC: stable@vger.kernel.org # 5.11+
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agobtrfs: fix sleep while in non-sleep context during qgroup removal
Filipe Manana [Thu, 18 Mar 2021 11:22:05 +0000 (11:22 +0000)]
btrfs: fix sleep while in non-sleep context during qgroup removal

BugLink: https://bugs.launchpad.net/bugs/1922601
commit 0bb788300990d3eb5582d3301a720f846c78925c upstream.

While removing a qgroup's sysfs entry we end up taking the kernfs_mutex,
through kobject_del(), while holding the fs_info->qgroup_lock spinlock,
producing the following trace:

  [821.843637] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:281
  [821.843641] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 28214, name: podman
  [821.843644] CPU: 3 PID: 28214 Comm: podman Tainted: G        W         5.11.6 #15
  [821.843646] Hardware name: Dell Inc. PowerEdge R330/084XW4, BIOS 2.11.0 12/08/2020
  [821.843647] Call Trace:
  [821.843650]  dump_stack+0xa1/0xfb
  [821.843656]  ___might_sleep+0x144/0x160
  [821.843659]  mutex_lock+0x17/0x40
  [821.843662]  kernfs_remove_by_name_ns+0x1f/0x80
  [821.843666]  sysfs_remove_group+0x7d/0xe0
  [821.843668]  sysfs_remove_groups+0x28/0x40
  [821.843670]  kobject_del+0x2a/0x80
  [821.843672]  btrfs_sysfs_del_one_qgroup+0x2b/0x40 [btrfs]
  [821.843685]  __del_qgroup_rb+0x12/0x150 [btrfs]
  [821.843696]  btrfs_remove_qgroup+0x288/0x2a0 [btrfs]
  [821.843707]  btrfs_ioctl+0x3129/0x36a0 [btrfs]
  [821.843717]  ? __mod_lruvec_page_state+0x5e/0xb0
  [821.843719]  ? page_add_new_anon_rmap+0xbc/0x150
  [821.843723]  ? kfree+0x1b4/0x300
  [821.843725]  ? mntput_no_expire+0x55/0x330
  [821.843728]  __x64_sys_ioctl+0x5a/0xa0
  [821.843731]  do_syscall_64+0x33/0x70
  [821.843733]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
  [821.843736] RIP: 0033:0x4cd3fb
  [821.843741] RSP: 002b:000000c000906b20 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
  [821.843744] RAX: ffffffffffffffda RBX: 000000c000050000 RCX: 00000000004cd3fb
  [821.843745] RDX: 000000c000906b98 RSI: 000000004010942a RDI: 000000000000000f
  [821.843747] RBP: 000000c000907cd0 R08: 000000c000622901 R09: 0000000000000000
  [821.843748] R10: 000000c000d992c0 R11: 0000000000000206 R12: 000000000000012d
  [821.843749] R13: 000000000000012c R14: 0000000000000200 R15: 0000000000000049

Fix this by removing the qgroup sysfs entry while not holding the spinlock,
since the spinlock is only meant for protection of the qgroup rbtree.

Reported-by: Stuart Shelton <srcshelton@gmail.com>
Link: https://lore.kernel.org/linux-btrfs/7A5485BB-0628-419D-A4D3-27B1AF47E25A@gmail.com/
Fixes: 49e5fb46211de0 ("btrfs: qgroup: export qgroups in sysfs")
CC: stable@vger.kernel.org # 5.10+
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agobtrfs: initialize device::fs_info always
Josef Bacik [Thu, 11 Mar 2021 16:23:14 +0000 (11:23 -0500)]
btrfs: initialize device::fs_info always

BugLink: https://bugs.launchpad.net/bugs/1922601
commit 820a49dafc3304de06f296c35c9ff1ebc1666343 upstream.

Neal reported a panic trying to use -o rescue=all

  BUG: kernel NULL pointer dereference, address: 0000000000000030
  PGD 0 P4D 0
  Oops: 0000 [#1] SMP NOPTI
  CPU: 0 PID: 696 Comm: mount Tainted: G        W         5.12.0-rc2+ #296
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014
  RIP: 0010:btrfs_device_init_dev_stats+0x1d/0x200
  RSP: 0018:ffffafaec1483bb8 EFLAGS: 00010286
  RAX: 0000000000000000 RBX: ffff9a5715bcb298 RCX: 0000000000000070
  RDX: ffff9a5703248000 RSI: ffff9a57052ea150 RDI: ffff9a5715bca400
  RBP: ffff9a57052ea150 R08: 0000000000000070 R09: ffff9a57052ea150
  R10: 000130faf0741c10 R11: 0000000000000000 R12: ffff9a5703700000
  R13: 0000000000000000 R14: ffff9a5715bcb278 R15: ffff9a57052ea150
  FS:  00007f600d122c40(0000) GS:ffff9a577bc00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000000000030 CR3: 0000000112a46005 CR4: 0000000000370ef0
  Call Trace:
   ? btrfs_init_dev_stats+0x1f/0xf0
   ? kmem_cache_alloc+0xef/0x1f0
   btrfs_init_dev_stats+0x5f/0xf0
   open_ctree+0x10cb/0x1720
   btrfs_mount_root.cold+0x12/0xea
   legacy_get_tree+0x27/0x40
   vfs_get_tree+0x25/0xb0
   vfs_kern_mount.part.0+0x71/0xb0
   btrfs_mount+0x10d/0x380
   legacy_get_tree+0x27/0x40
   vfs_get_tree+0x25/0xb0
   path_mount+0x433/0xa00
   __x64_sys_mount+0xe3/0x120
   do_syscall_64+0x33/0x40
   entry_SYSCALL_64_after_hwframe+0x44/0xae

This happens because when we call btrfs_init_dev_stats we do
device->fs_info->dev_root.  However device->fs_info isn't initialized
because we were only calling btrfs_init_devices_late() if we properly
read the device root.  However we don't actually need the device root to
init the devices, this function simply assigns the devices their
->fs_info pointer properly, so this needs to be done unconditionally
always so that we can properly dereference device->fs_info in rescue
cases.

Reported-by: Neal Gompa <ngompa13@gmail.com>
CC: stable@vger.kernel.org # 5.11+
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agobtrfs: fix check_data_csum() error message for direct I/O
Omar Sandoval [Tue, 16 Mar 2021 19:43:00 +0000 (12:43 -0700)]
btrfs: fix check_data_csum() error message for direct I/O

BugLink: https://bugs.launchpad.net/bugs/1922601
commit c1d6abdac46ca8127274bea195d804e3f2cec7ee upstream.

Commit 1dae796aabf6 ("btrfs: inode: sink parameter start and len to
check_data_csum()") replaced the start parameter to check_data_csum()
with page_offset(), but page_offset() is not meaningful for direct I/O
pages. Bring back the start parameter.

Fixes: 265d4ac03fdf ("btrfs: sink parameter start and len to check_data_csum")
CC: stable@vger.kernel.org # 5.11+
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agobtrfs: do not initialize dev replace for bad dev root
Josef Bacik [Thu, 11 Mar 2021 16:23:16 +0000 (11:23 -0500)]
btrfs: do not initialize dev replace for bad dev root

BugLink: https://bugs.launchpad.net/bugs/1922601
commit 3cb894972f1809aa8d087c42e5e8b26c64b7d508 upstream.

While helping Neal fix his broken file system I added a debug patch to
catch if we were calling btrfs_search_slot with a NULL root, and this
stack trace popped:

  we tried to search with a NULL root
  CPU: 0 PID: 1760 Comm: mount Not tainted 5.11.0-155.nealbtrfstest.1.fc34.x86_64 #1
  Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/22/2020
  Call Trace:
   dump_stack+0x6b/0x83
   btrfs_search_slot.cold+0x11/0x1b
   ? btrfs_init_dev_replace+0x36/0x450
   btrfs_init_dev_replace+0x71/0x450
   open_ctree+0x1054/0x1610
   btrfs_mount_root.cold+0x13/0xfa
   legacy_get_tree+0x27/0x40
   vfs_get_tree+0x25/0xb0
   vfs_kern_mount.part.0+0x71/0xb0
   btrfs_mount+0x131/0x3d0
   ? legacy_get_tree+0x27/0x40
   ? btrfs_show_options+0x640/0x640
   legacy_get_tree+0x27/0x40
   vfs_get_tree+0x25/0xb0
   path_mount+0x441/0xa80
   __x64_sys_mount+0xf4/0x130
   do_syscall_64+0x33/0x40
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
  RIP: 0033:0x7f644730352e

Fix this by not starting the device replace stuff if we do not have a
NULL dev root.

Reported-by: Neal Gompa <ngompa13@gmail.com>
CC: stable@vger.kernel.org # 5.11+
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agobtrfs: do not initialize dev stats if we have no dev_root
Josef Bacik [Thu, 11 Mar 2021 16:23:15 +0000 (11:23 -0500)]
btrfs: do not initialize dev stats if we have no dev_root

BugLink: https://bugs.launchpad.net/bugs/1922601
commit 82d62d06db404d03836cdabbca41d38646d97cbb upstream.

Neal reported a panic trying to use -o rescue=all

  BUG: kernel NULL pointer dereference, address: 0000000000000030
  PGD 0 P4D 0
  Oops: 0000 [#1] SMP PTI
  CPU: 0 PID: 4095 Comm: mount Not tainted 5.11.0-0.rc7.149.fc34.x86_64 #1
  RIP: 0010:btrfs_device_init_dev_stats+0x4c/0x1f0
  RSP: 0018:ffffa60285fbfb68 EFLAGS: 00010246
  RAX: 0000000000000000 RBX: ffff88b88f806498 RCX: ffff88b82e7a2a10
  RDX: ffffa60285fbfb97 RSI: ffff88b82e7a2a10 RDI: 0000000000000000
  RBP: ffff88b88f806b3c R08: 0000000000000000 R09: 0000000000000000
  R10: ffff88b82e7a2a10 R11: 0000000000000000 R12: ffff88b88f806a00
  R13: ffff88b88f806478 R14: ffff88b88f806a00 R15: ffff88b82e7a2a10
  FS:  00007f698be1ec40(0000) GS:ffff88b937e00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000000000030 CR3: 0000000092c9c006 CR4: 00000000003706f0
  Call Trace:
  ? btrfs_init_dev_stats+0x1f/0xf0
  btrfs_init_dev_stats+0x62/0xf0
  open_ctree+0x1019/0x15ff
  btrfs_mount_root.cold+0x13/0xfa
  legacy_get_tree+0x27/0x40
  vfs_get_tree+0x25/0xb0
  vfs_kern_mount.part.0+0x71/0xb0
  btrfs_mount+0x131/0x3d0
  ? legacy_get_tree+0x27/0x40
  ? btrfs_show_options+0x640/0x640
  legacy_get_tree+0x27/0x40
  vfs_get_tree+0x25/0xb0
  path_mount+0x441/0xa80
  __x64_sys_mount+0xf4/0x130
  do_syscall_64+0x33/0x40
  entry_SYSCALL_64_after_hwframe+0x44/0xa9
  RIP: 0033:0x7f698c04e52e

This happens because we unconditionally attempt to initialize device
stats on mount, but we may not have been able to read the device root.
Fix this by skipping initializing the device stats if we do not have a
device root.

Reported-by: Neal Gompa <ngompa13@gmail.com>
CC: stable@vger.kernel.org # 5.11+
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agoKVM: x86: Protect userspace MSR filter with SRCU, and set atomically-ish
Sean Christopherson [Tue, 16 Mar 2021 18:44:33 +0000 (11:44 -0700)]
KVM: x86: Protect userspace MSR filter with SRCU, and set atomically-ish

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit b318e8decf6b9ef1bcf4ca06fae6d6a2cb5d5c5c ]

Fix a plethora of issues with MSR filtering by installing the resulting
filter as an atomic bundle instead of updating the live filter one range
at a time.  The KVM_X86_SET_MSR_FILTER ioctl() isn't truly atomic, as
the hardware MSR bitmaps won't be updated until the next VM-Enter, but
the relevant software struct is atomically updated, which is what KVM
really needs.

Similar to the approach used for modifying memslots, make arch.msr_filter
a SRCU-protected pointer, do all the work configuring the new filter
outside of kvm->lock, and then acquire kvm->lock only when the new filter
has been vetted and created.  That way vCPU readers either see the old
filter or the new filter in their entirety, not some half-baked state.

Yuan Yao pointed out a use-after-free in ksm_msr_allowed() due to a
TOCTOU bug, but that's just the tip of the iceberg...

  - Nothing is __rcu annotated, making it nigh impossible to audit the
    code for correctness.
  - kvm_add_msr_filter() has an unpaired smp_wmb().  Violation of kernel
    coding style aside, the lack of a smb_rmb() anywhere casts all code
    into doubt.
  - kvm_clear_msr_filter() has a double free TOCTOU bug, as it grabs
    count before taking the lock.
  - kvm_clear_msr_filter() also has memory leak due to the same TOCTOU bug.

The entire approach of updating the live filter is also flawed.  While
installing a new filter is inherently racy if vCPUs are running, fixing
the above issues also makes it trivial to ensure certain behavior is
deterministic, e.g. KVM can provide deterministic behavior for MSRs with
identical settings in the old and new filters.  An atomic update of the
filter also prevents KVM from getting into a half-baked state, e.g. if
installing a filter fails, the existing approach would leave the filter
in a half-baked state, having already committed whatever bits of the
filter were already processed.

[*] https://lkml.kernel.org/r/20210312083157.25403-1-yaoyuan0329os@gmail.com

Fixes: 1a155254ff93 ("KVM: x86: Introduce MSR filtering")
Cc: stable@vger.kernel.org
Cc: Alexander Graf <graf@amazon.com>
Reported-by: Yuan Yao <yaoyuan0329os@gmail.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210316184436.2544875-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agostatic_call: Fix static_call_set_init()
Peter Zijlstra [Thu, 18 Mar 2021 10:27:19 +0000 (11:27 +0100)]
static_call: Fix static_call_set_init()

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit 68b1eddd421d2b16c6655eceb48918a1e896bbbc ]

It turns out that static_call_set_init() does not preserve the other
flags; IOW. it clears TAIL if it was set.

Fixes: 9183c3f9ed710 ("static_call: Add inline static call infrastructure")
Reported-by: Sumit Garg <sumit.garg@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Tested-by: Sumit Garg <sumit.garg@linaro.org>
Link: https://lkml.kernel.org/r/20210318113610.519406371@infradead.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agostatic_call: Fix the module key fixup
Peter Zijlstra [Thu, 25 Feb 2021 22:03:51 +0000 (23:03 +0100)]
static_call: Fix the module key fixup

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit 50bf8080a94d171e843fc013abec19d8ab9f50ae ]

Provided the target address of a R_X86_64_PC32 relocation is aligned,
the low two bits should be invariant between the relative and absolute
value.

Turns out the address is not aligned and things go sideways, ensure we
transfer the bits in the absolute form when fixing up the key address.

Fixes: 73f44fe19d35 ("static_call: Allow module use without exposing static_call_key")
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Link: https://lkml.kernel.org/r/20210225220351.GE4746@worktop.programming.kicks-ass.net
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agostatic_call: Allow module use without exposing static_call_key
Josh Poimboeuf [Wed, 27 Jan 2021 23:18:37 +0000 (17:18 -0600)]
static_call: Allow module use without exposing static_call_key

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit 73f44fe19d359635a607e8e8daa0da4001c1cfc2 ]

When exporting static_call_key; with EXPORT_STATIC_CALL*(), the module
can use static_call_update() to change the function called.  This is
not desirable in general.

Not exporting static_call_key however also disallows usage of
static_call(), since objtool needs the key to construct the
static_call_site.

Solve this by allowing objtool to create the static_call_site using
the trampoline address when it builds a module and cannot find the
static_call_key symbol. The module loader will then try and map the
trampole back to a key before it constructs the normal sites list.

Doing this requires a trampoline -> key associsation, so add another
magic section that keeps those.

Originally-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20210127231837.ifddpn7rhwdaepiu@treble
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agostatic_call: Pull some static_call declarations to the type headers
Peter Zijlstra [Mon, 18 Jan 2021 14:12:18 +0000 (15:12 +0100)]
static_call: Pull some static_call declarations to the type headers

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit 880cfed3a012d7863f42251791cea7fe78c39390 ]

Some static call declarations are going to be needed on low level header
files. Move the necessary material to the dedicated static call types
header to avoid inclusion dependency hell.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20210118141223.123667-4-frederic@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agoia64: fix ptrace(PTRACE_SYSCALL_INFO_EXIT) sign
Sergei Trofimovich [Sat, 13 Mar 2021 05:08:27 +0000 (21:08 -0800)]
ia64: fix ptrace(PTRACE_SYSCALL_INFO_EXIT) sign

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit 61bf318eac2c13356f7bd1c6a05421ef504ccc8a ]

In https://bugs.gentoo.org/769614 Dmitry noticed that
`ptrace(PTRACE_GET_SYSCALL_INFO)` does not return error sign properly.

The bug is in mismatch between get/set errors:

static inline long syscall_get_error(struct task_struct *task,
                                     struct pt_regs *regs)
{
        return regs->r10 == -1 ? regs->r8:0;
}

static inline long syscall_get_return_value(struct task_struct *task,
                                            struct pt_regs *regs)
{
        return regs->r8;
}

static inline void syscall_set_return_value(struct task_struct *task,
                                            struct pt_regs *regs,
                                            int error, long val)
{
        if (error) {
                /* error < 0, but ia64 uses > 0 return value */
                regs->r8 = -error;
                regs->r10 = -1;
        } else {
                regs->r8 = val;
                regs->r10 = 0;
        }
}

Tested on v5.10 on rx3600 machine (ia64 9040 CPU).

Link: https://lkml.kernel.org/r/20210221002554.333076-2-slyfox@gentoo.org
Link: https://bugs.gentoo.org/769614
Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
Reported-by: Dmitry V. Levin <ldv@altlinux.org>
Reviewed-by: Dmitry V. Levin <ldv@altlinux.org>
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agoia64: fix ia64_syscall_get_set_arguments() for break-based syscalls
Sergei Trofimovich [Sat, 13 Mar 2021 05:08:23 +0000 (21:08 -0800)]
ia64: fix ia64_syscall_get_set_arguments() for break-based syscalls

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit 0ceb1ace4a2778e34a5414e5349712ae4dc41d85 ]

In https://bugs.gentoo.org/769614 Dmitry noticed that
`ptrace(PTRACE_GET_SYSCALL_INFO)` does not work for syscalls called via
glibc's syscall() wrapper.

ia64 has two ways to call syscalls from userspace: via `break` and via
`eps` instructions.

The difference is in stack layout:

1. `eps` creates simple stack frame: no locals, in{0..7} == out{0..8}
2. `break` uses userspace stack frame: may be locals (glibc provides
   one), in{0..7} == out{0..8}.

Both work fine in syscall handling cde itself.

But `ptrace(PTRACE_GET_SYSCALL_INFO)` uses unwind mechanism to
re-extract syscall arguments but it does not account for locals.

The change always skips locals registers. It should not change `eps`
path as kernel's handler already enforces locals=0 and fixes `break`.

Tested on v5.10 on rx3600 machine (ia64 9040 CPU).

Link: https://lkml.kernel.org/r/20210221002554.333076-1-slyfox@gentoo.org
Link: https://bugs.gentoo.org/769614
Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
Reported-by: Dmitry V. Levin <ldv@altlinux.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agomm/fork: clear PASID for new mm
Fenghua Yu [Sat, 13 Mar 2021 05:07:15 +0000 (21:07 -0800)]
mm/fork: clear PASID for new mm

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit 82e69a121be4b1597ce758534816a8ee04c8b761 ]

When a new mm is created, its PASID should be cleared, i.e.  the PASID is
initialized to its init state 0 on both ARM and X86.

This patch was part of the series introducing mm->pasid, but got lost
along the way [1].  It still makes sense to have it, because each address
space has a different PASID.  And the IOMMU code in
iommu_sva_alloc_pasid() expects the pasid field of a new mm struct to be
cleared.

[1] https://lore.kernel.org/linux-iommu/YDgh53AcQHT+T3L0@otcwcpicx3.sc.intel.com/

Link: https://lkml.kernel.org/r/20210302103837.2562625-1-jean-philippe@linaro.org
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Cc: Jacob Pan <jacob.jun.pan@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agoio_uring: cancel deferred requests in try_cancel
Pavel Begunkov [Thu, 11 Mar 2021 23:29:35 +0000 (23:29 +0000)]
io_uring: cancel deferred requests in try_cancel

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit e1915f76a8981f0a750cf56515df42582a37c4b0 ]

As io_uring_cancel_files() and others let SQO to run between
io_uring_try_cancel_requests(), SQO may generate new deferred requests,
so it's safer to try to cancel them in it.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agoblock: Suppress uevent for hidden device when removed
Daniel Wagner [Thu, 11 Mar 2021 15:19:17 +0000 (16:19 +0100)]
block: Suppress uevent for hidden device when removed

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit 9ec491447b90ad6a4056a9656b13f0b3a1e83043 ]

register_disk() suppress uevents for devices with the GENHD_FL_HIDDEN
but enables uevents at the end again in order to announce disk after
possible partitions are created.

When the device is removed the uevents are still on and user land sees
'remove' messages for devices which were never 'add'ed to the system.

  KERNEL[95481.571887] remove   /devices/virtual/nvme-fabrics/ctl/nvme5/nvme0c5n1 (block)

Let's suppress the uevents for GENHD_FL_HIDDEN by not enabling the
uevents at all.

Signed-off-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin Wilck <mwilck@suse.com>
Link: https://lore.kernel.org/r/20210311151917.136091-1-dwagner@suse.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agonfs: we don't support removing system.nfs4_acl
J. Bruce Fields [Thu, 28 Jan 2021 22:36:38 +0000 (17:36 -0500)]
nfs: we don't support removing system.nfs4_acl

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit 4f8be1f53bf615102d103c0509ffa9596f65b718 ]

The NFSv4 protocol doesn't have any notion of reomoving an attribute, so
removexattr(path,"system.nfs4_acl") doesn't make sense.

There's no documented return value.  Arguably it could be EOPNOTSUPP but
I'm a little worried an application might take that to mean that we
don't support ACLs or xattrs.  How about EINVAL?

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agonvme-pci: add the DISABLE_WRITE_ZEROES quirk for a Samsung PM1725a
Dmitry Monakhov [Wed, 10 Mar 2021 12:06:41 +0000 (12:06 +0000)]
nvme-pci: add the DISABLE_WRITE_ZEROES quirk for a Samsung PM1725a

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit abbb5f5929ec6c52574c430c5475c158a65c2a8c ]

This adds a quirk for Samsung PM1725a drive which fixes timeouts and
I/O errors due to the fact that the controller does not properly
handle the Write Zeroes command, dmesg log:

nvme nvme0: I/O 528 QID 10 timeout, aborting
nvme nvme0: I/O 529 QID 10 timeout, aborting
nvme nvme0: I/O 530 QID 10 timeout, aborting
nvme nvme0: I/O 531 QID 10 timeout, aborting
nvme nvme0: I/O 532 QID 10 timeout, aborting
nvme nvme0: I/O 533 QID 10 timeout, aborting
nvme nvme0: I/O 534 QID 10 timeout, aborting
nvme nvme0: I/O 535 QID 10 timeout, aborting
nvme nvme0: Abort status: 0x0
nvme nvme0: Abort status: 0x0
nvme nvme0: Abort status: 0x0
nvme nvme0: Abort status: 0x0
nvme nvme0: Abort status: 0x0
nvme nvme0: Abort status: 0x0
nvme nvme0: Abort status: 0x0
nvme nvme0: Abort status: 0x0
nvme nvme0: I/O 528 QID 10 timeout, reset controller
nvme nvme0: controller is down; will reset: CSTS=0x3, PCI_STATUS=0x10
nvme nvme0: Device not ready; aborting reset, CSTS=0x3
nvme nvme0: Device not ready; aborting reset, CSTS=0x3
nvme nvme0: Removing after probe failure status: -19
nvme0n1: detected capacity change from 6251233968 to 0
blk_update_request: I/O error, dev nvme0n1, sector 32776 op 0x1:(WRITE) flags 0x3000 phys_seg 6 prio class 0
blk_update_request: I/O error, dev nvme0n1, sector 113319936 op 0x9:(WRITE_ZEROES) flags 0x800 phys_seg 0 prio class 0
Buffer I/O error on dev nvme0n1p2, logical block 1, lost async page write
blk_update_request: I/O error, dev nvme0n1, sector 113319680 op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
Buffer I/O error on dev nvme0n1p2, logical block 2, lost async page write
blk_update_request: I/O error, dev nvme0n1, sector 113319424 op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
Buffer I/O error on dev nvme0n1p2, logical block 3, lost async page write
blk_update_request: I/O error, dev nvme0n1, sector 113319168 op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
Buffer I/O error on dev nvme0n1p2, logical block 4, lost async page write
blk_update_request: I/O error, dev nvme0n1, sector 113318912 op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
Buffer I/O error on dev nvme0n1p2, logical block 5, lost async page write
blk_update_request: I/O error, dev nvme0n1, sector 113318656 op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
Buffer I/O error on dev nvme0n1p2, logical block 6, lost async page write
blk_update_request: I/O error, dev nvme0n1, sector 113318400 op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
blk_update_request: I/O error, dev nvme0n1, sector 113318144 op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
blk_update_request: I/O error, dev nvme0n1, sector 113317888 op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0

Signed-off-by: Dmitry Monakhov <dmtrmonakhov@yandex-team.ru>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agonvme-rdma: Fix a use after free in nvmet_rdma_write_data_done
Lv Yunlong [Thu, 11 Mar 2021 05:44:13 +0000 (21:44 -0800)]
nvme-rdma: Fix a use after free in nvmet_rdma_write_data_done

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit abec6561fc4e0fbb19591a0b35676d8c783b5493 ]

In nvmet_rdma_write_data_done, rsp is recoverd by wc->wr_cqe and freed by
nvmet_rdma_release_rsp(). But after that, pr_info() used the freed
chunk's member object and could leak the freed chunk address with
wc->wr_cqe by computing the offset.

Signed-off-by: Lv Yunlong <lyl2019@mail.ustc.edu.cn>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agonvme-core: check ctrl css before setting up zns
Chaitanya Kulkarni [Tue, 9 Mar 2021 04:58:21 +0000 (20:58 -0800)]
nvme-core: check ctrl css before setting up zns

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit 0ec84df4953bd42c6583a555773f1d4996a061eb ]

Ensure multiple Command Sets are supported before starting to setup a
ZNS namespace.

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
[hch: move the check around a bit]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agonvme-fc: return NVME_SC_HOST_ABORTED_CMD when a command has been aborted
Hannes Reinecke [Fri, 26 Feb 2021 07:17:28 +0000 (08:17 +0100)]
nvme-fc: return NVME_SC_HOST_ABORTED_CMD when a command has been aborted

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit ae3afe6308b43bbf49953101d4ba2c1c481133a8 ]

When a command has been aborted we should return NVME_SC_HOST_ABORTED_CMD
to be consistent with the other transports.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agonvme-fc: set NVME_REQ_CANCELLED in nvme_fc_terminate_exchange()
Hannes Reinecke [Fri, 26 Feb 2021 07:17:27 +0000 (08:17 +0100)]
nvme-fc: set NVME_REQ_CANCELLED in nvme_fc_terminate_exchange()

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit 3c7aafbc8d3d4d90430dfa126847a796c3e4ecfc ]

nvme_fc_terminate_exchange() is being called when exchanges are
being deleted, and as such we should be setting the NVME_REQ_CANCELLED
flag to have identical behaviour on all transports.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agonvme: add NVME_REQ_CANCELLED flag in nvme_cancel_request()
Hannes Reinecke [Fri, 26 Feb 2021 07:17:26 +0000 (08:17 +0100)]
nvme: add NVME_REQ_CANCELLED flag in nvme_cancel_request()

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit d3589381987ec879b03f8ce3039df57e87f05901 ]

NVME_REQ_CANCELLED is translated into -EINTR in nvme_submit_sync_cmd(),
so we should be setting this flags during nvme_cancel_request() to
ensure that the callers to nvme_submit_sync_cmd() will get the correct
error code when the controller is reset.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chao Leng <lengchao@huawei.com>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agonvme: simplify error logic in nvme_validate_ns()
Hannes Reinecke [Fri, 26 Feb 2021 07:17:25 +0000 (08:17 +0100)]
nvme: simplify error logic in nvme_validate_ns()

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit d95c1f4179a7f3ea8aa728ed00252a8ed0f8158f ]

We only should remove namespaces when we get fatal error back from
the device or when the namespace IDs have changed.
So instead of painfully masking out error numbers which might indicate
that the error should be ignored we could use an NVME status code
to indicated when the namespace should be removed.
That simplifies the final logic and makes it less error-prone.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agodrm/radeon: fix AGP dependency
Christian König [Mon, 8 Mar 2021 18:22:13 +0000 (19:22 +0100)]
drm/radeon: fix AGP dependency

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit cba2afb65cb05c3d197d17323fee4e3c9edef9cd ]

When AGP is compiled as module radeon must be compiled as module as
well.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agodrm/amdgpu: fb BO should be ttm_bo_type_device
Nirmoy Das [Mon, 8 Mar 2021 14:22:22 +0000 (15:22 +0100)]
drm/amdgpu: fb BO should be ttm_bo_type_device

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit 521f04f9e3ffc73ef96c776035f8a0a31b4cdd81 ]

FB BO should not be ttm_bo_type_kernel type and
amdgpufb_create_pinned_object() pins the FB BO anyway.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agodrm/amdgpu/display: Use wm_table.entries for dcn301 calculate_wm
Zhan Liu [Tue, 9 Mar 2021 01:28:22 +0000 (20:28 -0500)]
drm/amdgpu/display: Use wm_table.entries for dcn301 calculate_wm

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit eda29602f1a8b2b32d8c8c354232d9d1ee1c064d ]

[Why]
For DGPU Navi, the wm_table.nv_entries are used. These entires are not
populated for DCN301 Vangogh APU, but instead wm_table.entries are.

[How]
Use DCN21 Renoir style wm calculations.

Signed-off-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Zhan Liu <zhan.liu@amd.com>
Reviewed-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Acked-by: Zhan Liu <zhan.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agodrm/amd/display: Enabled pipe harvesting in dcn30
Dillon Varone [Fri, 19 Feb 2021 23:15:30 +0000 (18:15 -0500)]
drm/amd/display: Enabled pipe harvesting in dcn30

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit d2c91285958a3e77db99c352c136af4243f8f529 ]

[Why & How]
Ported logic from dcn21 for reading in pipe fusing to dcn30.
Supported configurations are 1 and 6 pipes. Invalid fusing
will revert to 1 pipe being enabled.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Dillon Varone <dillon.varone@amd.com>
Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Eryk Brol <eryk.brol@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agodrm/amd/display: Revert dram_clock_change_latency for DCN2.1
Sung Lee [Fri, 26 Feb 2021 18:20:43 +0000 (13:20 -0500)]
drm/amd/display: Revert dram_clock_change_latency for DCN2.1

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit b0075d114c33580f5c9fa9cee8e13d06db41471b ]

[WHY & HOW]
Using values provided by DF for latency may cause hangs in
multi display configurations. Revert change to previous value.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Sung Lee <sung.lee@amd.com>
Reviewed-by: Haonan Wang <Haonan.Wang2@amd.com>
Acked-by: Eryk Brol <eryk.brol@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agodrm/amd/display: Enable pflip interrupt upon pipe enable
Qingqing Zhuo [Fri, 19 Feb 2021 22:17:50 +0000 (17:17 -0500)]
drm/amd/display: Enable pflip interrupt upon pipe enable

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit 7afa0033d6f7fb8a84798ef99d1117661c4e696c ]

[Why]
pflip interrupt would not be enabled promptly if a pipe is disabled
and re-enabled, causing flip_done timeout error during DP
compliance tests

[How]
Enable pflip interrupt upon pipe enablement

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: Eryk Brol <eryk.brol@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agoblock: Fix REQ_OP_ZONE_RESET_ALL handling
Damien Le Moal [Wed, 10 Mar 2021 09:09:19 +0000 (18:09 +0900)]
block: Fix REQ_OP_ZONE_RESET_ALL handling

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit faa44c69daf9ccbd5b8a1aee13e0e0d037c0be17 ]

Similarly to a single zone reset operation (REQ_OP_ZONE_RESET), execute
REQ_OP_ZONE_RESET_ALL operations with REQ_SYNC set.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agoregulator: qcom-rpmh: Use correct buck for S1C regulator
satya priya [Wed, 24 Feb 2021 08:33:11 +0000 (14:03 +0530)]
regulator: qcom-rpmh: Use correct buck for S1C regulator

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit dfe03bca8db4957d4b60614ff7df4d136ba90f37 ]

Use correct buck, that is, pmic5_hfsmps515 for S1C regulator
of PM8350C PMIC.

Signed-off-by: satya priya <skakit@codeaurora.org>
Link: https://lore.kernel.org/r/1614155592-14060-7-git-send-email-skakit@codeaurora.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agoregulator: qcom-rpmh: Correct the pmic5_hfsmps515 buck
satya priya [Wed, 24 Feb 2021 08:33:08 +0000 (14:03 +0530)]
regulator: qcom-rpmh: Correct the pmic5_hfsmps515 buck

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit e610e072c87a30658479a7b4c51e1801cb3f450c ]

Correct the REGULATOR_LINEAR_RANGE and n_voltges for
pmic5_hfsmps515 buck.

Signed-off-by: satya priya <skakit@codeaurora.org>
Link: https://lore.kernel.org/r/1614155592-14060-4-git-send-email-skakit@codeaurora.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agokselftest: arm64: Fix exit code of sve-ptrace
Mark Brown [Tue, 9 Mar 2021 19:03:04 +0000 (19:03 +0000)]
kselftest: arm64: Fix exit code of sve-ptrace

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit 07e644885bf6727a48db109fad053cb43f3c9859 ]

We track if sve-ptrace encountered a failure in a variable but don't
actually use that value when we exit the program, do so.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20210309190304.39169-1-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agou64_stats,lockdep: Fix u64_stats_init() vs lockdep
Peter Zijlstra [Mon, 8 Mar 2021 08:38:12 +0000 (09:38 +0100)]
u64_stats,lockdep: Fix u64_stats_init() vs lockdep

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit d5b0e0677bfd5efd17c5bbb00156931f0d41cb85 ]

Jakub reported that:

    static struct net_device *rtl8139_init_board(struct pci_dev *pdev)
    {
    ...
    u64_stats_init(&tp->rx_stats.syncp);
    u64_stats_init(&tp->tx_stats.syncp);
    ...
    }

results in lockdep getting confused between the RX and TX stats lock.
This is because u64_stats_init() is an inline calling seqcount_init(),
which is a macro using a static variable to generate a lockdep class.

By wrapping that in an inline, we negate the effect of the macro and
fold the static key variable, hence the confusion.

Fix by also making u64_stats_init() a macro for the case where it
matters, leaving the other case an inline for argument validation
etc.

Reported-by: Jakub Kicinski <kuba@kernel.org>
Debugged-by: "Ahmed S. Darwish" <a.darwish@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: "Erhard F." <erhard_f@mailbox.org>
Link: https://lkml.kernel.org/r/YEXicy6+9MksdLZh@hirez.programming.kicks-ass.net
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agostaging: rtl8192e: fix kconfig dependency on CRYPTO
Julian Braha [Mon, 22 Feb 2021 18:06:07 +0000 (13:06 -0500)]
staging: rtl8192e: fix kconfig dependency on CRYPTO

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit 7c36194558cf49a86a53b5f60db8046c5e3013ae ]

When RTLLIB_CRYPTO_TKIP is enabled and CRYPTO is disabled,
Kbuild gives the following warning:

WARNING: unmet direct dependencies detected for CRYPTO_MICHAEL_MIC
  Depends on [n]: CRYPTO [=n]
  Selected by [m]:
  - RTLLIB_CRYPTO_TKIP [=m] && STAGING [=y] && RTLLIB [=m]

WARNING: unmet direct dependencies detected for CRYPTO_LIB_ARC4
  Depends on [n]: CRYPTO [=n]
  Selected by [m]:
  - RTLLIB_CRYPTO_TKIP [=m] && STAGING [=y] && RTLLIB [=m]
  - RTLLIB_CRYPTO_WEP [=m] && STAGING [=y] && RTLLIB [=m]

This is because RTLLIB_CRYPTO_TKIP selects CRYPTO_MICHAEL_MIC and
CRYPTO_LIB_ARC4, without depending on or selecting CRYPTO,
despite those config options being subordinate to CRYPTO.

Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Julian Braha <julianbraha@gmail.com>
Link: https://lore.kernel.org/r/20210222180607.399753-1-julianbraha@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agohabanalabs: Disable file operations after device is removed
Tomer Tayar [Mon, 1 Feb 2021 17:44:34 +0000 (19:44 +0200)]
habanalabs: Disable file operations after device is removed

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit ffd123fe839700366ea79b19ac3683bf56817372 ]

A device can be removed from the PCI subsystem while a process holds the
file descriptor opened.
In such a case, the driver attempts to kill the process, but as it is
still possible that the process will be alive after this step, the
device removal will complete, and we will end up with a process object
that points to a device object which was already released.

To prevent the usage of this released device object, disable the
following file operations for this process object, and avoid the cleanup
steps when the file descriptor is eventually closed.
The latter is just a best effort, as memory leak will occur.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agohabanalabs: Call put_pid() when releasing control device
Tomer Tayar [Fri, 19 Feb 2021 12:05:33 +0000 (14:05 +0200)]
habanalabs: Call put_pid() when releasing control device

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit 27ac5aada024e0821c86540ad18f37edadd77d5e ]

The refcount of the "hl_fpriv" structure is not used for the control
device, and thus hl_hpriv_put() is not called when releasing this
device.
This results with no call to put_pid(), so add it explicitly in
hl_device_release_ctrl().

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
3 years agosparc64: Fix opcode filtering in handling of no fault loads
Rob Gardner [Mon, 1 Mar 2021 05:48:16 +0000 (22:48 -0700)]
sparc64: Fix opcode filtering in handling of no fault loads

BugLink: https://bugs.launchpad.net/bugs/1922601
[ Upstream commit e5e8b80d352ec999d2bba3ea584f541c83f4ca3f ]

is_no_fault_exception() has two bugs which were discovered via random
opcode testing with stress-ng. Both are caused by improper filtering
of opcodes.

The first bug can be triggered by a floating point store with a no-fault
ASI, for instance "sta %f0, [%g0] #ASI_PNF", opcode C1A01040.

The code first tests op3[5] (0x1000000), which denotes a floating
point instruction, and then tests op3[2] (0x200000), which denotes a
store instruction. But these bits are not mutually exclusive, and the
above mentioned opcode has both bits set. The intent is to filter out
stores, so the test for stores must be done first in order to have
any effect.

The second bug can be triggered by a floating point load with one of
the invalid ASI values 0x8e or 0x8f, which pass this check in
is_no_fault_exception():
     if ((asi & 0xf2) == ASI_PNF)

An example instruction is "ldqa [%l7 + %o7] #ASI 0x8f, %f38",
opcode CF95D1EF. Asi values greater than 0x8b (ASI_SNFL) are fatal
in handle_ldf_stq(), and is_no_fault_exception() must not allow these
invalid asi values to make it that far.

In both of these cases, handle_ldf_stq() reacts by calling
sun4v_data_access_exception() or spitfire_data_access_exception(),
which call is_no_fault_exception() and results in an infinite
recursion.

Signed-off-by: Rob Gardner <rob.gardner@oracle.com>
Tested-by: Anatoly Pugachev <matorola@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>