]> git.proxmox.com Git - mirror_ubuntu-focal-kernel.git/log
mirror_ubuntu-focal-kernel.git
3 years agovideo: hyperv_fb: Fix the mmap() regression for v5.4.y and older
Dexuan Cui [Sat, 9 Jan 2021 22:53:58 +0000 (14:53 -0800)]
video: hyperv_fb: Fix the mmap() regression for v5.4.y and older

BugLink: https://bugs.launchpad.net/bugs/1913486
db49200b1dad is backported from the mainline commit
5f1251a48c17 ("video: hyperv_fb: Fix the cache type when mapping the VRAM"),
to v5.4.y and older stable branches, but unluckily db49200b1dad causes
mmap() to fail for /dev/fb0 due to EINVAL:

[ 5797.049560] x86/PAT: a.out:1910 map pfn expected mapping type
  uncached-minus for [mem 0xf8200000-0xf85cbfff], got write-back

This means the v5.4.y kernel detects an incompatibility issue about the
mapping type of the VRAM: db49200b1dad changes to use Write-Back when
mapping the VRAM, while the mmap() syscall tries to use Uncached-minus.
That’s to say, the kernel thinks Uncached-minus is incompatible with
Write-Back: see drivers/video/fbdev/core/fbmem.c: fb_mmap() ->
vm_iomap_memory() -> io_remap_pfn_range() -> ... -> track_pfn_remap() ->
reserve_pfn_range().

Note: any v5.5 and newer kernel doesn't have the issue, because they
have commit
d21987d709e8 ("video: hyperv: hyperv_fb: Support deferred IO for Hyper-V frame buffer driver")
, and when the hyperv_fb driver has the deferred_io support,
fb_deferred_io_init() overrides info->fbops->fb_mmap with
fb_deferred_io_mmap(), which doesn’t check the mapping type
incompatibility. Note: since it's VRAM here, the checking is not really
necessary.

Fix the regression by ioremap_wc(), which uses Write-combining. The kernel
thinks it's compatible with Uncached-minus. The VRAM mappped by
ioremap_wc() is slightly slower than mapped by ioremap_cache(), but is
still significantly faster than by ioremap().

Change the comment accordingly. Linux VM on ARM64 Hyper-V is still not
working in the latest mainline yet, and when it works in future, the ARM64
support is unlikely to be backported to v5.4 and older, so using
ioremap_wc() in v5.4 and older should be ok.

Note: this fix is only targeted at the stable branches:
v5.4.y, v4.19.y, v4.14.y, v4.9.y and v4.4.y.

Fixes: db49200b1dad ("video: hyperv_fb: Fix the cache type when mapping the VRAM")
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agoBluetooth: revert: hci_h5: close serdev device and free hu in h5_close
Hans de Goede [Sun, 22 Nov 2020 12:17:25 +0000 (13:17 +0100)]
Bluetooth: revert: hci_h5: close serdev device and free hu in h5_close

BugLink: https://bugs.launchpad.net/bugs/1913486
commit 5c3b5796866f85354a5ce76a28f8ffba0dcefc7e upstream.

There have been multiple revisions of the patch fix the h5->rx_skb
leak. Accidentally the first revision (which is buggy) and v5 have
both been merged:

v1 commit 70f259a3f427 ("Bluetooth: hci_h5: close serdev device and free
hu in h5_close");
v5 commit 855af2d74c87 ("Bluetooth: hci_h5: fix memory leak in h5_close")

The correct v5 makes changes slightly higher up in the h5_close()
function, which allowed both versions to get merged without conflict.

The changes from v1 unconditionally frees the h5 data struct, this
is wrong because in the serdev enumeration case the memory is
allocated in h5_serdev_probe() like this:

        h5 = devm_kzalloc(dev, sizeof(*h5), GFP_KERNEL);

So its lifetime is tied to the lifetime of the driver being bound
to the serdev and it is automatically freed when the driver gets
unbound. In the serdev case the same h5 struct is re-used over
h5_close() and h5_open() calls and thus MUST not be free-ed in
h5_close().

The serdev_device_close() added to h5_close() is incorrect in the
same way, serdev_device_close() is called on driver unbound too and
also MUST no be called from h5_close().

This reverts the changes made by merging v1 of the patch, so that
just the changes of the correct v5 remain.

Cc: Anant Thazhemadam <anant.thazhemadam@gmail.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agokbuild: don't hardcode depmod path
Dominique Martinet [Tue, 1 Dec 2020 13:17:30 +0000 (14:17 +0100)]
kbuild: don't hardcode depmod path

BugLink: https://bugs.launchpad.net/bugs/1913486
commit 436e980e2ed526832de822cbf13c317a458b78e1 upstream.

depmod is not guaranteed to be in /sbin, just let make look for
it in the path like all the other invoked programs

Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agonet/sched: sch_taprio: ensure to reset/destroy all child qdiscs
Davide Caratti [Thu, 17 Dec 2020 21:29:46 +0000 (22:29 +0100)]
net/sched: sch_taprio: ensure to reset/destroy all child qdiscs

BugLink: https://bugs.launchpad.net/bugs/1913486
[ Upstream commit 698285da79f5b0b099db15a37ac661ac408c80eb ]

taprio_graft() can insert a NULL element in the array of child qdiscs. As
a consquence, taprio_reset() might not reset child qdiscs completely, and
taprio_destroy() might leak resources. Fix it by ensuring that loops that
iterate over q->qdiscs[] don't end when they find the first NULL item.

Fixes: 44d4775ca518 ("net/sched: sch_taprio: reset child qdiscs before freeing them")
Fixes: 5a781ccbd19e ("tc: Add support for configuring the taprio scheduler")
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Link: https://lore.kernel.org/r/13edef6778fef03adc751582562fba4a13e06d6a.1608240532.git.dcaratti@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agoionic: account for vlan tag len in rx buffer len
Shannon Nelson [Fri, 18 Dec 2020 21:50:01 +0000 (13:50 -0800)]
ionic: account for vlan tag len in rx buffer len

BugLink: https://bugs.launchpad.net/bugs/1913486
[ Upstream commit 83469893204281ecf65d572bddf02de29a19787c ]

Let the FW know we have enough receive buffer space for the
vlan tag if it isn't stripped.

Fixes: 0f3154e6bcb3 ("ionic: Add Tx and Rx handling")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Link: https://lore.kernel.org/r/20201218215001.64696-1-snelson@pensando.io
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agovhost_net: fix ubuf refcount incorrectly when sendmsg fails
Yunjian Wang [Tue, 29 Dec 2020 02:01:48 +0000 (10:01 +0800)]
vhost_net: fix ubuf refcount incorrectly when sendmsg fails

BugLink: https://bugs.launchpad.net/bugs/1913486
[ Upstream commit 01e31bea7e622f1890c274f4aaaaf8bccd296aa5 ]

Currently the vhost_zerocopy_callback() maybe be called to decrease
the refcount when sendmsg fails in tun. The error handling in vhost
handle_tx_zerocopy() will try to decrease the same refcount again.
This is wrong. To fix this issue, we only call vhost_net_ubuf_put()
when vq->heads[nvq->desc].len == VHOST_DMA_IN_PROGRESS.

Fixes: bab632d69ee4 ("vhost: vhost TX zero-copy support")
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/1609207308-20544-1-git-send-email-wangyunjian@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agonet: usb: qmi_wwan: add Quectel EM160R-GL
Bjørn Mork [Wed, 30 Dec 2020 15:24:51 +0000 (16:24 +0100)]
net: usb: qmi_wwan: add Quectel EM160R-GL

BugLink: https://bugs.launchpad.net/bugs/1913486
[ Upstream commit cfd82dfc9799c53ef109343a23af006a0f6860a9 ]

New modem using ff/ff/30 for QCDM, ff/00/00 for  AT and NMEA,
and ff/ff/ff for RMNET/QMI.

T: Bus=02 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 2 Spd=5000 MxCh= 0
D: Ver= 3.20 Cls=ef(misc ) Sub=02 Prot=01 MxPS= 9 #Cfgs= 1
P: Vendor=2c7c ProdID=0620 Rev= 4.09
S: Manufacturer=Quectel
S: Product=EM160R-GL
S: SerialNumber=e31cedc1
C:* #Ifs= 5 Cfg#= 1 Atr=a0 MxPwr=896mA
I:* If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=(none)
E: Ad=81(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
E: Ad=01(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
I:* If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=(none)
E: Ad=83(I) Atr=03(Int.) MxPS= 10 Ivl=32ms
E: Ad=82(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
E: Ad=02(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=(none)
E: Ad=85(I) Atr=03(Int.) MxPS= 10 Ivl=32ms
E: Ad=84(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
E: Ad=03(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
I:* If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=(none)
E: Ad=87(I) Atr=03(Int.) MxPS= 10 Ivl=32ms
E: Ad=86(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
E: Ad=04(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
I:* If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=(none)
E: Ad=88(I) Atr=03(Int.) MxPS= 8 Ivl=32ms
E: Ad=8e(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
E: Ad=0f(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms

Signed-off-by: Bjørn Mork <bjorn@mork.no>
Link: https://lore.kernel.org/r/20201230152451.245271-1-bjorn@mork.no
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agoCDC-NCM: remove "connected" log message
Roland Dreier [Thu, 24 Dec 2020 03:21:16 +0000 (19:21 -0800)]
CDC-NCM: remove "connected" log message

BugLink: https://bugs.launchpad.net/bugs/1913486
[ Upstream commit 59b4a8fa27f5a895582ada1ae5034af7c94a57b5 ]

The cdc_ncm driver passes network connection notifications up to
usbnet_link_change(), which is the right place for any logging.
Remove the netdev_info() duplicating this from the driver itself.

This stops devices such as my "TRENDnet USB 10/100/1G/2.5G LAN"
(ID 20f4:e02b) adapter from spamming the kernel log with

    cdc_ncm 2-2:2.0 enp0s2u2c2: network connection: connected

messages every 60 msec or so.

Signed-off-by: Roland Dreier <roland@kernel.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20201224032116.2453938-1-roland@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agonet: dsa: lantiq_gswip: Fix GSWIP_MII_CFG(p) register access
Martin Blumenstingl [Sun, 3 Jan 2021 01:25:44 +0000 (02:25 +0100)]
net: dsa: lantiq_gswip: Fix GSWIP_MII_CFG(p) register access

BugLink: https://bugs.launchpad.net/bugs/1913486
[ Upstream commit 709a3c9dff2a639966ae7d8ba6239d2b8aba036d ]

There is one GSWIP_MII_CFG register for each switch-port except the CPU
port. The register offset for the first port is 0x0, 0x02 for the
second, 0x04 for the third and so on.

Update the driver to not only restrict the GSWIP_MII_CFG registers to
ports 0, 1 and 5. Handle ports 0..5 instead but skip the CPU port. This
means we are not overwriting the configuration for the third port (port
two since we start counting from zero) with the settings for the sixth
port (with number five) anymore.

The GSWIP_MII_PCDU(p) registers are not updated because there's really
only three (one for each of the following ports: 0, 1, 5).

Fixes: 14fceff4771e51 ("net: dsa: Add Lantiq / Intel DSA driver for vrx200")
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Acked-by: Hauke Mehrtens <hauke@hauke-m.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agonet: dsa: lantiq_gswip: Enable GSWIP_MII_CFG_EN also for internal PHYs
Martin Blumenstingl [Sun, 3 Jan 2021 01:25:43 +0000 (02:25 +0100)]
net: dsa: lantiq_gswip: Enable GSWIP_MII_CFG_EN also for internal PHYs

BugLink: https://bugs.launchpad.net/bugs/1913486
[ Upstream commit c1a9ec7e5d577a9391660800c806c53287fca991 ]

Enable GSWIP_MII_CFG_EN also for internal PHYs to make traffic flow.
Without this the PHY link is detected properly and ethtool statistics
for TX are increasing but there's no RX traffic coming in.

Fixes: 14fceff4771e51 ("net: dsa: Add Lantiq / Intel DSA driver for vrx200")
Suggested-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Acked-by: Hauke Mehrtens <hauke@hauke-m.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agor8169: work around power-saving bug on some chip versions
Heiner Kallweit [Wed, 30 Dec 2020 18:33:34 +0000 (19:33 +0100)]
r8169: work around power-saving bug on some chip versions

BugLink: https://bugs.launchpad.net/bugs/1913486
[ Upstream commit e80bd76fbf563cc7ed8c9e9f3bbcdf59b0897f69 ]

A user reported failing network with RTL8168dp (a quite rare chip
version). Realtek confirmed that few chip versions suffer from a PLL
power-down hw bug.

Fixes: 07df5bd874f0 ("r8169: power down chip in probe")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://lore.kernel.org/r/a1c39460-d533-7f9e-fa9d-2b8990b02426@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agonet: hdlc_ppp: Fix issues when mod_timer is called while timer is running
Xie He [Mon, 28 Dec 2020 02:53:39 +0000 (18:53 -0800)]
net: hdlc_ppp: Fix issues when mod_timer is called while timer is running

BugLink: https://bugs.launchpad.net/bugs/1913486
[ Upstream commit 1fef73597fa545c35fddc953979013882fbd4e55 ]

ppp_cp_event is called directly or indirectly by ppp_rx with "ppp->lock"
held. It may call mod_timer to add a new timer. However, at the same time
ppp_timer may be already running and waiting for "ppp->lock". In this
case, there's no need for ppp_timer to continue running and it can just
exit.

If we let ppp_timer continue running, it may call add_timer. This causes
kernel panic because add_timer can't be called with a timer pending.
This patch fixes this problem.

Fixes: e022c2f07ae5 ("WAN: new synchronous PPP implementation for generic HDLC.")
Cc: Krzysztof Halasa <khc@pm.waw.pl>
Signed-off-by: Xie He <xie.he.0141@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agoerspan: fix version 1 check in gre_parse_header()
Cong Wang [Sat, 26 Dec 2020 23:44:53 +0000 (15:44 -0800)]
erspan: fix version 1 check in gre_parse_header()

BugLink: https://bugs.launchpad.net/bugs/1913486
[ Upstream commit 085c7c4e1c0e50d90b7d90f61a12e12b317a91e2 ]

Both version 0 and version 1 use ETH_P_ERSPAN, but version 0 does not
have an erspan header. So the check in gre_parse_header() is wrong,
we have to distinguish version 1 from version 0.

We can just check the gre header length like is_erspan_type1().

Fixes: cb73ee40b1b3 ("net: ip_gre: use erspan key field for tunnel lookup")
Reported-by: syzbot+f583ce3d4ddf9836b27a@syzkaller.appspotmail.com
Cc: William Tu <u9012063@gmail.com>
Cc: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agonet: hns: fix return value check in __lb_other_process()
Yunjian Wang [Sat, 26 Dec 2020 08:10:05 +0000 (16:10 +0800)]
net: hns: fix return value check in __lb_other_process()

BugLink: https://bugs.launchpad.net/bugs/1913486
[ Upstream commit 5ede3ada3da7f050519112b81badc058190b9f9f ]

The function skb_copy() could return NULL, the return value
need to be checked.

Fixes: b5996f11ea54 ("net: add Hisilicon Network Subsystem basic ethernet support")
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agonet: sched: prevent invalid Scell_log shift count
Randy Dunlap [Fri, 25 Dec 2020 06:23:44 +0000 (22:23 -0800)]
net: sched: prevent invalid Scell_log shift count

BugLink: https://bugs.launchpad.net/bugs/1913486
[ Upstream commit bd1248f1ddbc48b0c30565fce897a3b6423313b8 ]

Check Scell_log shift size in red_check_params() and modify all callers
of red_check_params() to pass Scell_log.

This prevents a shift out-of-bounds as detected by UBSAN:
  UBSAN: shift-out-of-bounds in ./include/net/red.h:252:22
  shift exponent 72 is too large for 32-bit type 'int'

Fixes: 8afa10cbe281 ("net_sched: red: Avoid illegal values")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: syzbot+97c5bd9cc81eca63d36e@syzkaller.appspotmail.com
Cc: Nogah Frankel <nogahf@mellanox.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: netdev@vger.kernel.org
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agoipv4: Ignore ECN bits for fib lookups in fib_compute_spec_dst()
Guillaume Nault [Thu, 24 Dec 2020 19:01:09 +0000 (20:01 +0100)]
ipv4: Ignore ECN bits for fib lookups in fib_compute_spec_dst()

BugLink: https://bugs.launchpad.net/bugs/1913486
[ Upstream commit 21fdca22eb7df2a1e194b8adb812ce370748b733 ]

RT_TOS() only clears one of the ECN bits. Therefore, when
fib_compute_spec_dst() resorts to a fib lookup, it can return
different results depending on the value of the second ECN bit.

For example, ECT(0) and ECT(1) packets could be treated differently.

  $ ip netns add ns0
  $ ip netns add ns1
  $ ip link add name veth01 netns ns0 type veth peer name veth10 netns ns1
  $ ip -netns ns0 link set dev lo up
  $ ip -netns ns1 link set dev lo up
  $ ip -netns ns0 link set dev veth01 up
  $ ip -netns ns1 link set dev veth10 up

  $ ip -netns ns0 address add 192.0.2.10/24 dev veth01
  $ ip -netns ns1 address add 192.0.2.11/24 dev veth10

  $ ip -netns ns1 address add 192.0.2.21/32 dev lo
  $ ip -netns ns1 route add 192.0.2.10/32 tos 4 dev veth10 src 192.0.2.21
  $ ip netns exec ns1 sysctl -wq net.ipv4.icmp_echo_ignore_broadcasts=0

With TOS 4 and ECT(1), ns1 replies using source address 192.0.2.21
(ping uses -Q to set all TOS and ECN bits):

  $ ip netns exec ns0 ping -c 1 -b -Q 5 192.0.2.255
  [...]
  64 bytes from 192.0.2.21: icmp_seq=1 ttl=64 time=0.544 ms

But with TOS 4 and ECT(0), ns1 replies using source address 192.0.2.11
because the "tos 4" route isn't matched:

  $ ip netns exec ns0 ping -c 1 -b -Q 6 192.0.2.255
  [...]
  64 bytes from 192.0.2.11: icmp_seq=1 ttl=64 time=0.597 ms

After this patch the ECN bits don't affect the result anymore:

  $ ip netns exec ns0 ping -c 1 -b -Q 6 192.0.2.255
  [...]
  64 bytes from 192.0.2.21: icmp_seq=1 ttl=64 time=0.591 ms

Fixes: 35ebf65e851c ("ipv4: Create and use fib_compute_spec_dst() helper.")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agonet: mvpp2: fix pkt coalescing int-threshold configuration
Stefan Chulski [Wed, 23 Dec 2020 18:35:21 +0000 (20:35 +0200)]
net: mvpp2: fix pkt coalescing int-threshold configuration

BugLink: https://bugs.launchpad.net/bugs/1913486
[ Upstream commit 4f374d2c43a9e5e773f1dee56db63bd6b8a36276 ]

The packet coalescing interrupt threshold has separated registers
for different aggregated/cpu (sw-thread). The required value should
be loaded for every thread but not only for 1 current cpu.

Fixes: 213f428f5056 ("net: mvpp2: add support for TX interrupts and RX queue distribution modes")
Signed-off-by: Stefan Chulski <stefanc@marvell.com>
Link: https://lore.kernel.org/r/1608748521-11033-1-git-send-email-stefanc@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agotun: fix return value when the number of iovs exceeds MAX_SKB_FRAGS
Yunjian Wang [Fri, 25 Dec 2020 02:52:16 +0000 (10:52 +0800)]
tun: fix return value when the number of iovs exceeds MAX_SKB_FRAGS

BugLink: https://bugs.launchpad.net/bugs/1913486
[ Upstream commit 950271d7cc0b4546af3549d8143c4132d6e1f138 ]

Currently the tun_napi_alloc_frags() function returns -ENOMEM when the
number of iovs exceeds MAX_SKB_FRAGS + 1. However this is inappropriate,
we should use -EMSGSIZE instead of -ENOMEM.

The following distinctions are matters:
1. the caller need to drop the bad packet when -EMSGSIZE is returned,
   which means meeting a persistent failure.
2. the caller can try again when -ENOMEM is returned, which means
   meeting a transient failure.

Fixes: 90e33d459407 ("tun: enable napi_gro_frags() for TUN/TAP driver")
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://lore.kernel.org/r/1608864736-24332-1-git-send-email-wangyunjian@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agonet: ethernet: ti: cpts: fix ethtool output when no ptp_clock registered
Grygorii Strashko [Thu, 24 Dec 2020 16:24:05 +0000 (18:24 +0200)]
net: ethernet: ti: cpts: fix ethtool output when no ptp_clock registered

BugLink: https://bugs.launchpad.net/bugs/1913486
[ Upstream commit 4614792eebcbf81c60ad3604c1aeeb2b0899cea4 ]

The CPTS driver registers PTP PHC clock when first netif is going up and
unregister it when all netif are down. Now ethtool will show:
 - PTP PHC clock index 0 after boot until first netif is up;
 - the last assigned PTP PHC clock index even if PTP PHC clock is not
registered any more after all netifs are down.

This patch ensures that -1 is returned by ethtool when PTP PHC clock is not
registered any more.

Fixes: 8a2c9a5ab4b9 ("net: ethernet: ti: cpts: rework initialization/deinitialization")
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Link: https://lore.kernel.org/r/20201224162405.28032-1-grygorii.strashko@ti.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agonet-sysfs: take the rtnl lock when accessing xps_rxqs_map and num_tc
Antoine Tenart [Wed, 23 Dec 2020 21:23:23 +0000 (22:23 +0100)]
net-sysfs: take the rtnl lock when accessing xps_rxqs_map and num_tc

BugLink: https://bugs.launchpad.net/bugs/1913486
[ Upstream commit 4ae2bb81649dc03dfc95875f02126b14b773f7ab ]

Accesses to dev->xps_rxqs_map (when using dev->num_tc) should be
protected by the rtnl lock, like we do for netif_set_xps_queue. I didn't
see an actual bug being triggered, but let's be safe here and take the
rtnl lock while accessing the map in sysfs.

Fixes: 8af2c06ff4b1 ("net-sysfs: Add interface for Rx queue(s) map per Tx queue")
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agonet-sysfs: take the rtnl lock when storing xps_rxqs
Antoine Tenart [Wed, 23 Dec 2020 21:23:22 +0000 (22:23 +0100)]
net-sysfs: take the rtnl lock when storing xps_rxqs

BugLink: https://bugs.launchpad.net/bugs/1913486
[ Upstream commit 2d57b4f142e0b03e854612b8e28978935414bced ]

Two race conditions can be triggered when storing xps rxqs, resulting in
various oops and invalid memory accesses:

1. Calling netdev_set_num_tc while netif_set_xps_queue:

   - netif_set_xps_queue uses dev->tc_num as one of the parameters to
     compute the size of new_dev_maps when allocating it. dev->tc_num is
     also used to access the map, and the compiler may generate code to
     retrieve this field multiple times in the function.

   - netdev_set_num_tc sets dev->tc_num.

   If new_dev_maps is allocated using dev->tc_num and then dev->tc_num
   is set to a higher value through netdev_set_num_tc, later accesses to
   new_dev_maps in netif_set_xps_queue could lead to accessing memory
   outside of new_dev_maps; triggering an oops.

2. Calling netif_set_xps_queue while netdev_set_num_tc is running:

   2.1. netdev_set_num_tc starts by resetting the xps queues,
        dev->tc_num isn't updated yet.

   2.2. netif_set_xps_queue is called, setting up the map with the
        *old* dev->num_tc.

   2.3. netdev_set_num_tc updates dev->tc_num.

   2.4. Later accesses to the map lead to out of bound accesses and
        oops.

   A similar issue can be found with netdev_reset_tc.

One way of triggering this is to set an iface up (for which the driver
uses netdev_set_num_tc in the open path, such as bnx2x) and writing to
xps_rxqs in a concurrent thread. With the right timing an oops is
triggered.

Both issues have the same fix: netif_set_xps_queue, netdev_set_num_tc
and netdev_reset_tc should be mutually exclusive. We do that by taking
the rtnl lock in xps_rxqs_store.

Fixes: 8af2c06ff4b1 ("net-sysfs: Add interface for Rx queue(s) map per Tx queue")
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agonet-sysfs: take the rtnl lock when accessing xps_cpus_map and num_tc
Antoine Tenart [Wed, 23 Dec 2020 21:23:21 +0000 (22:23 +0100)]
net-sysfs: take the rtnl lock when accessing xps_cpus_map and num_tc

BugLink: https://bugs.launchpad.net/bugs/1913486
[ Upstream commit fb25038586d0064123e393cadf1fadd70a9df97a ]

Accesses to dev->xps_cpus_map (when using dev->num_tc) should be
protected by the rtnl lock, like we do for netif_set_xps_queue. I didn't
see an actual bug being triggered, but let's be safe here and take the
rtnl lock while accessing the map in sysfs.

Fixes: 184c449f91fe ("net: Add support for XPS with QoS via traffic classes")
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agonet-sysfs: take the rtnl lock when storing xps_cpus
Antoine Tenart [Wed, 23 Dec 2020 21:23:20 +0000 (22:23 +0100)]
net-sysfs: take the rtnl lock when storing xps_cpus

BugLink: https://bugs.launchpad.net/bugs/1913486
[ Upstream commit 1ad58225dba3f2f598d2c6daed4323f24547168f ]

Two race conditions can be triggered when storing xps cpus, resulting in
various oops and invalid memory accesses:

1. Calling netdev_set_num_tc while netif_set_xps_queue:

   - netif_set_xps_queue uses dev->tc_num as one of the parameters to
     compute the size of new_dev_maps when allocating it. dev->tc_num is
     also used to access the map, and the compiler may generate code to
     retrieve this field multiple times in the function.

   - netdev_set_num_tc sets dev->tc_num.

   If new_dev_maps is allocated using dev->tc_num and then dev->tc_num
   is set to a higher value through netdev_set_num_tc, later accesses to
   new_dev_maps in netif_set_xps_queue could lead to accessing memory
   outside of new_dev_maps; triggering an oops.

2. Calling netif_set_xps_queue while netdev_set_num_tc is running:

   2.1. netdev_set_num_tc starts by resetting the xps queues,
        dev->tc_num isn't updated yet.

   2.2. netif_set_xps_queue is called, setting up the map with the
        *old* dev->num_tc.

   2.3. netdev_set_num_tc updates dev->tc_num.

   2.4. Later accesses to the map lead to out of bound accesses and
        oops.

   A similar issue can be found with netdev_reset_tc.

One way of triggering this is to set an iface up (for which the driver
uses netdev_set_num_tc in the open path, such as bnx2x) and writing to
xps_cpus in a concurrent thread. With the right timing an oops is
triggered.

Both issues have the same fix: netif_set_xps_queue, netdev_set_num_tc
and netdev_reset_tc should be mutually exclusive. We do that by taking
the rtnl lock in xps_cpus_store.

Fixes: 184c449f91fe ("net: Add support for XPS with QoS via traffic classes")
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agonet: ethernet: Fix memleak in ethoc_probe
Dinghao Liu [Wed, 23 Dec 2020 11:06:12 +0000 (19:06 +0800)]
net: ethernet: Fix memleak in ethoc_probe

BugLink: https://bugs.launchpad.net/bugs/1913486
[ Upstream commit 5d41f9b7ee7a5a5138894f58846a4ffed601498a ]

When mdiobus_register() fails, priv->mdio allocated
by mdiobus_alloc() has not been freed, which leads
to memleak.

Fixes: e7f4dc3536a4 ("mdio: Move allocation of interrupts into core")
Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20201223110615.31389-1-dinghao.liu@zju.edu.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agonet/ncsi: Use real net-device for response handler
John Wang [Wed, 23 Dec 2020 05:55:23 +0000 (13:55 +0800)]
net/ncsi: Use real net-device for response handler

BugLink: https://bugs.launchpad.net/bugs/1913486
[ Upstream commit 427c940558560bff2583d07fc119a21094675982 ]

When aggregating ncsi interfaces and dedicated interfaces to bond
interfaces, the ncsi response handler will use the wrong net device to
find ncsi_dev, so that the ncsi interface will not work properly.
Here, we use the original net device to fix it.

Fixes: 138635cc27c9 ("net/ncsi: NCSI response packet handler")
Signed-off-by: John Wang <wangzhiqiang.bj@bytedance.com>
Link: https://lore.kernel.org/r/20201223055523.2069-1-wangzhiqiang.bj@bytedance.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agovirtio_net: Fix recursive call to cpus_read_lock()
Jeff Dike [Wed, 23 Dec 2020 02:54:21 +0000 (21:54 -0500)]
virtio_net: Fix recursive call to cpus_read_lock()

BugLink: https://bugs.launchpad.net/bugs/1913486
[ Upstream commit de33212f768c5d9e2fe791b008cb26f92f0aa31c ]

virtnet_set_channels can recursively call cpus_read_lock if CONFIG_XPS
and CONFIG_HOTPLUG are enabled.

The path is:
    virtnet_set_channels - calls get_online_cpus(), which is a trivial
wrapper around cpus_read_lock()
    netif_set_real_num_tx_queues
    netif_reset_xps_queues_gt
    netif_reset_xps_queues - calls cpus_read_lock()

This call chain and potential deadlock happens when the number of TX
queues is reduced.

This commit the removes netif_set_real_num_[tr]x_queues calls from
inside the get/put_online_cpus section, as they don't require that it
be held.

Fixes: 47be24796c13 ("virtio-net: fix the set affinity bug when CPU IDs are not consecutive")
Signed-off-by: Jeff Dike <jdike@akamai.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://lore.kernel.org/r/20201223025421.671-1-jdike@akamai.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agonet: ethernet: mvneta: Fix error handling in mvneta_probe
Dinghao Liu [Sun, 20 Dec 2020 08:29:30 +0000 (16:29 +0800)]
net: ethernet: mvneta: Fix error handling in mvneta_probe

BugLink: https://bugs.launchpad.net/bugs/1913486
[ Upstream commit 58f60329a6be35a5653edb3fd2023ccef9eb9943 ]

When mvneta_port_power_up() fails, we should execute
cleanup functions after label err_netdev to avoid memleak.

Fixes: 41c2b6b4f0f80 ("net: ethernet: mvneta: Add back interface mode validation")
Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
Link: https://lore.kernel.org/r/20201220082930.21623-1-dinghao.liu@zju.edu.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agoibmvnic: continue fatal error reset after passive init
Lijun Pan [Sat, 19 Dec 2020 21:40:34 +0000 (15:40 -0600)]
ibmvnic: continue fatal error reset after passive init

BugLink: https://bugs.launchpad.net/bugs/1913486
[ Upstream commit 1f45dc22066797479072978feeada0852502e180 ]

Commit f9c6cea0b385 ("ibmvnic: Skip fatal error reset after passive init")
says "If the passive
CRQ initialization occurs before the FATAL reset task is processed,
the FATAL error reset task would try to access a CRQ message queue
that was freed, causing an oops. The problem may be most likely to
occur during DLPAR add vNIC with a non-default MTU, because the DLPAR
process will automatically issue a change MTU request.
Fix this by not processing fatal error reset if CRQ is passively
initialized after client-driven CRQ initialization fails."

Even with this commit, we still see similar kernel crashes. In order
to completely solve this problem, we'd better continue the fatal error
reset, capture the kernel crash, and try to fix it from that end.

Fixes: f9c6cea0b385 ("ibmvnic: Skip fatal error reset after passive init")
Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Link: https://lore.kernel.org/r/20201219214034.21123-1-ljp@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agonet: mvpp2: Fix GoP port 3 Networking Complex Control configurations
Stefan Chulski [Sun, 20 Dec 2020 11:02:29 +0000 (13:02 +0200)]
net: mvpp2: Fix GoP port 3 Networking Complex Control configurations

BugLink: https://bugs.launchpad.net/bugs/1913486
[ Upstream commit 2575bc1aa9d52a62342b57a0b7d0a12146cf6aed ]

During GoP port 2 Networking Complex Control mode of operation configurations,
also GoP port 3 mode of operation was wrongly set.
Patch removes these configurations.

Fixes: f84bf386f395 ("net: mvpp2: initialize the GoP")
Acked-by: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: Stefan Chulski <stefanc@marvell.com>
Link: https://lore.kernel.org/r/1608462149-1702-1-git-send-email-stefanc@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agoatm: idt77252: call pci_disable_device() on error path
Dan Carpenter [Sat, 19 Dec 2020 11:01:44 +0000 (14:01 +0300)]
atm: idt77252: call pci_disable_device() on error path

BugLink: https://bugs.launchpad.net/bugs/1913486
[ Upstream commit 8df66af5c1e5f80562fe728db5ec069b21810144 ]

This error path needs to disable the pci device before returning.

Fixes: ede58ef28e10 ("atm: remove deprecated use of pci api")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/X93dmC4NX0vbTpGp@mwanda
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agoethernet: ucc_geth: set dev->max_mtu to 1518
Rasmus Villemoes [Fri, 18 Dec 2020 10:55:36 +0000 (11:55 +0100)]
ethernet: ucc_geth: set dev->max_mtu to 1518

BugLink: https://bugs.launchpad.net/bugs/1913486
[ Upstream commit 1385ae5c30f238f81bc6528d897c6d7a0816783f ]

All the buffers and registers are already set up appropriately for an
MTU slightly above 1500, so we just need to expose this to the
networking stack. AFAICT, there's no need to implement .ndo_change_mtu
when the receive buffers are always set up to support the max_mtu.

This fixes several warnings during boot on our mpc8309-board with an
embedded mv88e6250 switch:

mv88e6085 mdio@e0102120:10: nonfatal error -34 setting MTU 1500 on port 0
...
mv88e6085 mdio@e0102120:10: nonfatal error -34 setting MTU 1500 on port 4
ucc_geth e0102000.ethernet eth1: error -22 setting MTU to 1504 to include DSA overhead

The last line explains what the DSA stack tries to do: achieving an MTU
of 1500 on-the-wire requires that the master netdevice connected to
the CPU port supports an MTU of 1500+the tagging overhead.

Fixes: bfcb813203e6 ("net: dsa: configure the MTU for switch ports")
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agoethernet: ucc_geth: fix use-after-free in ucc_geth_remove()
Rasmus Villemoes [Fri, 18 Dec 2020 10:55:38 +0000 (11:55 +0100)]
ethernet: ucc_geth: fix use-after-free in ucc_geth_remove()

BugLink: https://bugs.launchpad.net/bugs/1913486
[ Upstream commit e925e0cd2a705aaacb0b907bb3691fcac3a973a4 ]

ugeth is the netdiv_priv() part of the netdevice. Accessing the memory
pointed to by ugeth (such as done by ucc_geth_memclean() and the two
of_node_puts) after free_netdev() is thus use-after-free.

Fixes: 80a9fad8e89a ("ucc_geth: fix module removal")
Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agonet: systemport: set dev->max_mtu to UMAC_MAX_MTU_SIZE
Florian Fainelli [Fri, 18 Dec 2020 17:38:43 +0000 (09:38 -0800)]
net: systemport: set dev->max_mtu to UMAC_MAX_MTU_SIZE

BugLink: https://bugs.launchpad.net/bugs/1913486
[ Upstream commit 54ddbdb024882e226055cc4c3c246592ddde2ee5 ]

The driver is already allocating receive buffers of 2KiB and the
Ethernet MAC is configured to accept frames up to UMAC_MAX_MTU_SIZE.

Fixes: bfcb813203e6 ("net: dsa: configure the MTU for switch ports")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://lore.kernel.org/r/20201218173843.141046-1-f.fainelli@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agonet: mvpp2: prs: fix PPPoE with ipv6 packet parse
Stefan Chulski [Thu, 17 Dec 2020 18:37:46 +0000 (20:37 +0200)]
net: mvpp2: prs: fix PPPoE with ipv6 packet parse

BugLink: https://bugs.launchpad.net/bugs/1913486
[ Upstream commit fec6079b2eeab319d9e3d074f54d3b6f623e9701 ]

Current PPPoE+IPv6 entry is jumping to 'next-hdr'
field and not to 'DIP' field as done for IPv4.

Fixes: 3f518509dedc ("ethernet: Add new driver for Marvell Armada 375 network unit")
Reported-by: Liron Himi <lironh@marvell.com>
Signed-off-by: Stefan Chulski <stefanc@marvell.com>
Link: https://lore.kernel.org/r/1608230266-22111-1-git-send-email-stefanc@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agonet: mvpp2: Add TCAM entry to drop flow control pause frames
Stefan Chulski [Thu, 17 Dec 2020 18:30:17 +0000 (20:30 +0200)]
net: mvpp2: Add TCAM entry to drop flow control pause frames

BugLink: https://bugs.launchpad.net/bugs/1913486
[ Upstream commit 3f48fab62bb81a7f9d01e9d43c40395fad011dd5 ]

Issue:
Flow control frame used to pause GoP(MAC) was delivered to the CPU
and created a load on the CPU. Since XOFF/XON frames are used only
by MAC, these frames should be dropped inside MAC.

Fix:
According to 802.3-2012 - IEEE Standard for Ethernet pause frame
has unique destination MAC address 01-80-C2-00-00-01.
Add TCAM parser entry to track and drop pause frames by destination MAC.

Fixes: 3f518509dedc ("ethernet: Add new driver for Marvell Armada 375 network unit")
Signed-off-by: Stefan Chulski <stefanc@marvell.com>
Link: https://lore.kernel.org/r/1608229817-21951-1-git-send-email-stefanc@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agoiavf: fix double-release of rtnl_lock
Jakub Kicinski [Thu, 3 Dec 2020 02:18:06 +0000 (18:18 -0800)]
iavf: fix double-release of rtnl_lock

BugLink: https://bugs.launchpad.net/bugs/1913486
[ Upstream commit f1340265726e0edf8a8cef28e665b28ad6302ce9 ]

This code does not jump to exit on an error in iavf_lan_add_device(),
so the rtnl_unlock() from the normal path will follow.

Fixes: b66c7bc1cd4d ("iavf: Refactor init state machine")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agoi40e: Fix Error I40E_AQ_RC_EINVAL when removing VFs
Sylwester Dziedziuch [Thu, 22 Oct 2020 10:39:36 +0000 (12:39 +0200)]
i40e: Fix Error I40E_AQ_RC_EINVAL when removing VFs

BugLink: https://bugs.launchpad.net/bugs/1913486
[ Upstream commit 3ac874fa84d1baaf0c0175f2a1499f5d88d528b2 ]

When removing VFs for PF added to bridge there was
an error I40E_AQ_RC_EINVAL. It was caused by not properly
resetting and reinitializing PF when adding/removing VFs.
Changed how reset is performed when adding/removing VFs
to properly reinitialize PFs VSI.

Fixes: fc60861e9b00 ("i40e: start up in VEPA mode by default")
Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agoproc: fix lookup in /proc/net subdirectories after setns(2)
Alexey Dobriyan [Wed, 16 Dec 2020 04:42:39 +0000 (20:42 -0800)]
proc: fix lookup in /proc/net subdirectories after setns(2)

BugLink: https://bugs.launchpad.net/bugs/1913486
[ Upstream commit c6c75deda81344c3a95d1d1f606d5cee109e5d54 ]

Commit 1fde6f21d90f ("proc: fix /proc/net/* after setns(2)") only forced
revalidation of regular files under /proc/net/

However, /proc/net/ is unusual in the sense of /proc/net/foo handlers
take netns pointer from parent directory which is old netns.

Steps to reproduce:

(void)open("/proc/net/sctp/snmp", O_RDONLY);
unshare(CLONE_NEWNET);

int fd = open("/proc/net/sctp/snmp", O_RDONLY);
read(fd, &c, 1);

Read will read wrong data from original netns.

Patch forces lookup on every directory under /proc/net .

Link: https://lkml.kernel.org/r/20201205160916.GA109739@localhost.localdomain
Fixes: 1da4d377f943 ("proc: revalidate misc dentries")
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Reported-by: "Rantala, Tommi T. (Nokia - FI/Espoo)" <tommi.t.rantala@nokia.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agoproc: change ->nlink under proc_subdir_lock
Alexey Dobriyan [Thu, 5 Dec 2019 00:49:59 +0000 (16:49 -0800)]
proc: change ->nlink under proc_subdir_lock

BugLink: https://bugs.launchpad.net/bugs/1913486
[ Upstream commit e06689bf57017ac022ccf0f2a5071f760821ce0f ]

Currently gluing PDE into global /proc tree is done under lock, but
changing ->nlink is not.  Additionally struct proc_dir_entry::nlink is
not atomic so updates can be lost.

Link: http://lkml.kernel.org/r/20190925202436.GA17388@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agodepmod: handle the case of /sbin/depmod without /sbin in PATH
Linus Torvalds [Mon, 28 Dec 2020 19:40:22 +0000 (11:40 -0800)]
depmod: handle the case of /sbin/depmod without /sbin in PATH

BugLink: https://bugs.launchpad.net/bugs/1913486
[ Upstream commit cedd1862be7e666be87ec824dabc6a2b05618f36 ]

Commit 436e980e2ed5 ("kbuild: don't hardcode depmod path") stopped
hard-coding the path of depmod, but in the process caused trouble for
distributions that had that /sbin location, but didn't have it in the
PATH (generally because /sbin is limited to the super-user path).

Work around it for now by just adding /sbin to the end of PATH in the
depmod.sh script.

Reported-and-tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agolib/genalloc: fix the overflow when size is too big
Huang Shijie [Tue, 29 Dec 2020 23:14:58 +0000 (15:14 -0800)]
lib/genalloc: fix the overflow when size is too big

BugLink: https://bugs.launchpad.net/bugs/1913486
[ Upstream commit 36845663843fc59c5d794e3dc0641472e3e572da ]

Some graphic card has very big memory on chip, such as 32G bytes.

In the following case, it will cause overflow:

    pool = gen_pool_create(PAGE_SHIFT, NUMA_NO_NODE);
    ret = gen_pool_add(pool, 0x1000000, SZ_32G, NUMA_NO_NODE);

    va = gen_pool_alloc(pool, SZ_4G);

The overflow occurs in gen_pool_alloc_algo_owner():

....
size = nbits << order;
....

The @nbits is "int" type, so it will overflow.
Then the gen_pool_avail() will return the wrong value.

This patch converts some "int" to "unsigned long", and
changes the compare code in while.

Link: https://lkml.kernel.org/r/20201229060657.3389-1-sjhuang@iluvatar.ai
Signed-off-by: Huang Shijie <sjhuang@iluvatar.ai>
Reported-by: Shi Jiasheng <jiasheng.shi@iluvatar.ai>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agoscsi: scsi_transport_spi: Set RQF_PM for domain validation commands
Bart Van Assche [Wed, 9 Dec 2020 05:29:48 +0000 (21:29 -0800)]
scsi: scsi_transport_spi: Set RQF_PM for domain validation commands

BugLink: https://bugs.launchpad.net/bugs/1913486
[ Upstream commit cfefd9f8240a7b9fdd96fcd54cb029870b6d8d88 ]

Disable runtime power management during domain validation. Since a later
patch removes RQF_PREEMPT, set RQF_PM for domain validation commands such
that these are executed in the quiesced SCSI device state.

Link: https://lore.kernel.org/r/20201209052951.16136-6-bvanassche@acm.org
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Woody Suwalski <terraluna977@gmail.com>
Cc: Can Guo <cang@codeaurora.org>
Cc: Stanley Chu <stanley.chu@mediatek.com>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Stan Johnson <userm57@yahoo.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agoscsi: ide: Do not set the RQF_PREEMPT flag for sense requests
Bart Van Assche [Wed, 9 Dec 2020 05:29:46 +0000 (21:29 -0800)]
scsi: ide: Do not set the RQF_PREEMPT flag for sense requests

BugLink: https://bugs.launchpad.net/bugs/1913486
[ Upstream commit 96d86e6a80a3ab9aff81d12f9f1f2a0da2917d38 ]

RQF_PREEMPT is used for two different purposes in the legacy IDE code:

 1. To mark power management requests.

 2. To mark requests that should preempt another request. An (old)
    explanation of that feature is as follows: "The IDE driver in the Linux
    kernel normally uses a series of busywait delays during its
    initialization. When the driver executes these busywaits, the kernel
    does nothing for the duration of the wait. The time spent in these
    waits could be used for other initialization activities, if they could
    be run concurrently with these waits.

    More specifically, busywait-style delays such as udelay() in module
    init functions inhibit kernel preemption because the Big Kernel Lock is
    held, while yielding APIs such as schedule_timeout() allow
    preemption. This is true because the kernel handles the BKL specially
    and releases and reacquires it across reschedules allowed by the
    current thread.

    This IDE-preempt specification requires that the driver eliminate these
    busywaits and replace them with a mechanism that allows other work to
    proceed while the IDE driver is initializing."

Since I haven't found an implementation of (2), do not set the PREEMPT flag
for sense requests. This patch causes sense requests to be postponed while
a drive is suspended instead of being submitted to ide_queue_rq().

If it would ever be necessary to restore the IDE PREEMPT functionality,
that can be done by introducing a new flag in struct ide_request.

Link: https://lore.kernel.org/r/20201209052951.16136-4-bvanassche@acm.org
Cc: David S. Miller <davem@davemloft.net>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Can Guo <cang@codeaurora.org>
Cc: Stanley Chu <stanley.chu@mediatek.com>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agoscsi: ufs-pci: Ensure UFS device is in PowerDown mode for suspend-to-disk ->poweroff()
Adrian Hunter [Mon, 7 Dec 2020 08:31:18 +0000 (10:31 +0200)]
scsi: ufs-pci: Ensure UFS device is in PowerDown mode for suspend-to-disk ->poweroff()

BugLink: https://bugs.launchpad.net/bugs/1913486
[ Upstream commit af423534d2de86cd0db729a5ac41f056ca8717de ]

The expectation for suspend-to-disk is that devices will be powered-off, so
the UFS device should be put in PowerDown mode. If spm_lvl is not 5, then
that will not happen. Change the pm callbacks to force spm_lvl 5 for
suspend-to-disk poweroff.

Link: https://lore.kernel.org/r/20201207083120.26732-3-adrian.hunter@intel.com
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agoscsi: ufs: Fix wrong print message in dev_err()
Bean Huo [Mon, 7 Dec 2020 19:01:37 +0000 (20:01 +0100)]
scsi: ufs: Fix wrong print message in dev_err()

BugLink: https://bugs.launchpad.net/bugs/1913486
[ Upstream commit 1fa0570002e3f66db9b58c32c60de4183b857a19 ]

Change dev_err() print message from "dme-reset" to "dme_enable" in function
ufshcd_dme_enable().

Link: https://lore.kernel.org/r/20201207190137.6858-3-huobean@gmail.com
Acked-by: Alim Akhtar <alim.akhtar@samsung.com>
Acked-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agoworkqueue: Kick a worker based on the actual activation of delayed works
Yunfeng Ye [Thu, 19 Nov 2020 06:21:25 +0000 (14:21 +0800)]
workqueue: Kick a worker based on the actual activation of delayed works

BugLink: https://bugs.launchpad.net/bugs/1913486
[ Upstream commit 01341fbd0d8d4e717fc1231cdffe00343088ce0b ]

In realtime scenario, We do not want to have interference on the
isolated cpu cores. but when invoking alloc_workqueue() for percpu wq
on the housekeeping cpu, it kick a kworker on the isolated cpu.

  alloc_workqueue
    pwq_adjust_max_active
      wake_up_worker

The comment in pwq_adjust_max_active() said:
  "Need to kick a worker after thawed or an unbound wq's
   max_active is bumped"

So it is unnecessary to kick a kworker for percpu's wq when invoking
alloc_workqueue(). this patch only kick a worker based on the actual
activation of delayed works.

Signed-off-by: Yunfeng Ye <yeyunfeng@huawei.com>
Reviewed-by: Lai Jiangshan <jiangshanlai@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agoUBUNTU: upstream stable to v5.4.88
Kamal Mostafa [Mon, 25 Jan 2021 20:47:59 +0000 (12:47 -0800)]
UBUNTU: upstream stable to v5.4.88

BugLink: https://bugs.launchpad.net/bugs/1913223
Ignore: yes
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agoLinux 5.4.88
Greg Kroah-Hartman [Sat, 9 Jan 2021 12:44:55 +0000 (13:44 +0100)]
Linux 5.4.88

BugLink: https://bugs.launchpad.net/bugs/1913223
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Tested-by: Shuah Khan <skhan@linuxfoundation.org>
Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20210107143049.929352526@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agomwifiex: Fix possible buffer overflows in mwifiex_cmd_802_11_ad_hoc_start
Zhang Xiaohui [Sun, 6 Dec 2020 08:48:01 +0000 (16:48 +0800)]
mwifiex: Fix possible buffer overflows in mwifiex_cmd_802_11_ad_hoc_start

BugLink: https://bugs.launchpad.net/bugs/1913223
[ Upstream commit 5c455c5ab332773464d02ba17015acdca198f03d ]

mwifiex_cmd_802_11_ad_hoc_start() calls memcpy() without checking
the destination size may trigger a buffer overflower,
which a local user could use to cause denial of service
or the execution of arbitrary code.
Fix it by putting the length check before calling memcpy().

Signed-off-by: Zhang Xiaohui <ruc_zhangxiaohui@163.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20201206084801.26479-1-ruc_zhangxiaohui@163.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agoexec: Transform exec_update_mutex into a rw_semaphore
Eric W. Biederman [Thu, 3 Dec 2020 20:12:00 +0000 (14:12 -0600)]
exec: Transform exec_update_mutex into a rw_semaphore

BugLink: https://bugs.launchpad.net/bugs/1913223
[ Upstream commit f7cfd871ae0c5008d94b6f66834e7845caa93c15 ]

Recently syzbot reported[0] that there is a deadlock amongst the users
of exec_update_mutex.  The problematic lock ordering found by lockdep
was:

   perf_event_open  (exec_update_mutex -> ovl_i_mutex)
   chown            (ovl_i_mutex       -> sb_writes)
   sendfile         (sb_writes         -> p->lock)
     by reading from a proc file and writing to overlayfs
   proc_pid_syscall (p->lock           -> exec_update_mutex)

While looking at possible solutions it occured to me that all of the
users and possible users involved only wanted to state of the given
process to remain the same.  They are all readers.  The only writer is
exec.

There is no reason for readers to block on each other.  So fix
this deadlock by transforming exec_update_mutex into a rw_semaphore
named exec_update_lock that only exec takes for writing.

Cc: Jann Horn <jannh@google.com>
Cc: Vasiliy Kulikov <segoon@openwall.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Bernd Edlinger <bernd.edlinger@hotmail.de>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Christopher Yeoh <cyeoh@au1.ibm.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Sargun Dhillon <sargun@sargun.me>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Fixes: eea9673250db ("exec: Add exec_update_mutex to replace cred_guard_mutex")
[0] https://lkml.kernel.org/r/00000000000063640c05ade8e3de@google.com
Reported-by: syzbot+db9cdf3dd1f64252c6ef@syzkaller.appspotmail.com
Link: https://lkml.kernel.org/r/87ft4mbqen.fsf@x220.int.ebiederm.org
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agorwsem: Implement down_read_interruptible
Eric W. Biederman [Thu, 3 Dec 2020 20:11:13 +0000 (14:11 -0600)]
rwsem: Implement down_read_interruptible

BugLink: https://bugs.launchpad.net/bugs/1913223
[ Upstream commit 31784cff7ee073b34d6eddabb95e3be2880a425c ]

In preparation for converting exec_update_mutex to a rwsem so that
multiple readers can execute in parallel and not deadlock, add
down_read_interruptible.  This is needed for perf_event_open to be
converted (with no semantic changes) from working on a mutex to
wroking on a rwsem.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/87k0tybqfy.fsf@x220.int.ebiederm.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agorwsem: Implement down_read_killable_nested
Eric W. Biederman [Thu, 3 Dec 2020 20:10:32 +0000 (14:10 -0600)]
rwsem: Implement down_read_killable_nested

BugLink: https://bugs.launchpad.net/bugs/1913223
[ Upstream commit 0f9368b5bf6db0c04afc5454b1be79022a681615 ]

In preparation for converting exec_update_mutex to a rwsem so that
multiple readers can execute in parallel and not deadlock, add
down_read_killable_nested.  This is needed so that kcmp_lock
can be converted from working on a mutexes to working on rw_semaphores.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/87o8jabqh3.fsf@x220.int.ebiederm.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agoperf: Break deadlock involving exec_update_mutex
peterz@infradead.org [Fri, 28 Aug 2020 12:37:20 +0000 (14:37 +0200)]
perf: Break deadlock involving exec_update_mutex

BugLink: https://bugs.launchpad.net/bugs/1913223
[ Upstream commit 78af4dc949daaa37b3fcd5f348f373085b4e858f ]

Syzbot reported a lock inversion involving perf. The sore point being
perf holding exec_update_mutex() for a very long time, specifically
across a whole bunch of filesystem ops in pmu::event_init() (uprobes)
and anon_inode_getfile().

This then inverts against procfs code trying to take
exec_update_mutex.

Move the permission checks later, such that we need to hold the mutex
over less code.

Reported-by: syzbot+db9cdf3dd1f64252c6ef@syzkaller.appspotmail.com
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agofuse: fix bad inode
Miklos Szeredi [Thu, 10 Dec 2020 14:33:14 +0000 (15:33 +0100)]
fuse: fix bad inode

BugLink: https://bugs.launchpad.net/bugs/1913223
[ Upstream commit 5d069dbe8aaf2a197142558b6fb2978189ba3454 ]

Jan Kara's analysis of the syzbot report (edited):

  The reproducer opens a directory on FUSE filesystem, it then attaches
  dnotify mark to the open directory.  After that a fuse_do_getattr() call
  finds that attributes returned by the server are inconsistent, and calls
  make_bad_inode() which, among other things does:

          inode->i_mode = S_IFREG;

  This then confuses dnotify which doesn't tear down its structures
  properly and eventually crashes.

Avoid calling make_bad_inode() on a live inode: switch to a private flag on
the fuse inode.  Also add the test to ops which the bad_inode_ops would
have caught.

This bug goes back to the initial merge of fuse in 2.6.14...

Reported-by: syzbot+f427adf9324b92652ccc@syzkaller.appspotmail.com
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Tested-by: Jan Kara <jack@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agoiio:imu:bmi160: Fix alignment and data leak issues
Jonathan Cameron [Sun, 20 Sep 2020 11:27:39 +0000 (12:27 +0100)]
iio:imu:bmi160: Fix alignment and data leak issues

BugLink: https://bugs.launchpad.net/bugs/1913223
commit 7b6b51234df6cd8b04fe736b0b89c25612d896b8 upstream

One of a class of bugs pointed out by Lars in a recent review.
iio_push_to_buffers_with_timestamp assumes the buffer used is aligned
to the size of the timestamp (8 bytes).  This is not guaranteed in
this driver which uses an array of smaller elements on the stack.
As Lars also noted this anti pattern can involve a leak of data to
userspace and that indeed can happen here.  We close both issues by
moving to a suitable array in the iio_priv() data with alignment
explicitly requested.  This data is allocated with kzalloc() so no
data can leak apart from previous readings.

In this driver, depending on which channels are enabled, the timestamp
can be in a number of locations.  Hence we cannot use a structure
to specify the data layout without it being misleading.

Fixes: 77c4ad2d6a9b ("iio: imu: Add initial support for Bosch BMI160")
Reported-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Cc: Daniel Baluta <daniel.baluta@gmail.com>
Cc: Daniel Baluta <daniel.baluta@oss.nxp.com>
Cc: <Stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20200920112742.170751-6-jic23@kernel.org
[sudip: adjust context]
Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agokdev_t: always inline major/minor helper functions
Josh Poimboeuf [Tue, 29 Dec 2020 23:14:55 +0000 (15:14 -0800)]
kdev_t: always inline major/minor helper functions

BugLink: https://bugs.launchpad.net/bugs/1913223
commit aa8c7db494d0a83ecae583aa193f1134ef25d506 upstream.

Silly GCC doesn't always inline these trivial functions.

Fixes the following warning:

  arch/x86/kernel/sys_ia32.o: warning: objtool: cp_stat64()+0xd8: call to new_encode_dev() with UACCESS enabled

Link: https://lkml.kernel.org/r/984353b44a4484d86ba9f73884b7306232e25e30.1608737428.git.jpoimboe@redhat.com
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org> [build-tested]
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agodmaengine: at_hdmac: add missing kfree() call in at_dma_xlate()
Yu Kuai [Mon, 17 Aug 2020 11:57:28 +0000 (19:57 +0800)]
dmaengine: at_hdmac: add missing kfree() call in at_dma_xlate()

BugLink: https://bugs.launchpad.net/bugs/1913223
commit e097eb7473d9e70da9e03276f61cd392ccb9d79f upstream.

If memory allocation for 'atslave' succeed, at_dma_xlate() doesn't have a
corresponding kfree() in exception handling. Thus add kfree() for this
function implementation.

Fixes: bbe89c8e3d59 ("at_hdmac: move to generic DMA binding")
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Link: https://lore.kernel.org/r/20200817115728.1706719-4-yukuai3@huawei.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agodmaengine: at_hdmac: add missing put_device() call in at_dma_xlate()
Yu Kuai [Mon, 17 Aug 2020 11:57:27 +0000 (19:57 +0800)]
dmaengine: at_hdmac: add missing put_device() call in at_dma_xlate()

BugLink: https://bugs.launchpad.net/bugs/1913223
commit 3832b78b3ec2cf51e07102f9b4480e343459b20f upstream.

If of_find_device_by_node() succeed, at_dma_xlate() doesn't have a
corresponding put_device(). Thus add put_device() to fix the exception
handling for this function implementation.

Fixes: bbe89c8e3d59 ("at_hdmac: move to generic DMA binding")
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Link: https://lore.kernel.org/r/20200817115728.1706719-3-yukuai3@huawei.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agodmaengine: at_hdmac: Substitute kzalloc with kmalloc
Tudor Ambarus [Thu, 23 Jan 2020 14:03:02 +0000 (14:03 +0000)]
dmaengine: at_hdmac: Substitute kzalloc with kmalloc

BugLink: https://bugs.launchpad.net/bugs/1913223
commit a6e7f19c910068cb54983f36acebedb376f3a9ac upstream.

All members of the structure are initialized below in the function,
there is no need to use kzalloc.

Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com>
Acked-by: Ludovic Desroches <ludovic.desroches@microchip.com>
Link: https://lore.kernel.org/r/20200123140237.125799-1-tudor.ambarus@microchip.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agoRevert "mtd: spinand: Fix OOB read"
Felix Fietkau [Tue, 5 Jan 2021 10:18:21 +0000 (11:18 +0100)]
Revert "mtd: spinand: Fix OOB read"

BugLink: https://bugs.launchpad.net/bugs/1913223
This reverts stable commit baad618d078c857f99cc286ea249e9629159901f.

This commit is adding lines to spinand_write_to_cache_op, wheras the upstream
commit 868cbe2a6dcee451bd8f87cbbb2a73cf463b57e5 that this was supposed to
backport was touching spinand_read_from_cache_op.
It causes a crash on writing OOB data by attempting to write to read-only
kernel memory.

Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agoRevert "drm/amd/display: Fix memory leaks in S3 resume"
Alex Deucher [Tue, 5 Jan 2021 16:45:45 +0000 (11:45 -0500)]
Revert "drm/amd/display: Fix memory leaks in S3 resume"

BugLink: https://bugs.launchpad.net/bugs/1913223
This reverts commit a135a1b4c4db1f3b8cbed9676a40ede39feb3362.

This leads to blank screens on some boards after replugging a
display.  Revert until we understand the root cause and can
fix both the leak and the blank screen after replug.

Cc: Stylon Wang <stylon.wang@amd.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Cc: Andre Tomt <andre@tomt.net>
Cc: Oleksandr Natalenko <oleksandr@natalenko.name>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agoUBUNTU: upstream stable to v5.4.87
Kamal Mostafa [Thu, 21 Jan 2021 18:13:00 +0000 (10:13 -0800)]
UBUNTU: upstream stable to v5.4.87

BugLink: https://bugs.launchpad.net/bugs/1912681
Ignore: Yes
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agoLinux 5.4.87
Greg Kroah-Hartman [Wed, 6 Jan 2021 13:48:41 +0000 (14:48 +0100)]
Linux 5.4.87

BugLink: https://bugs.launchpad.net/bugs/1912681
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
Tested-by: Shuah Khan <skhan@linuxfoundation.org>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20210104155705.740576914@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agodm verity: skip verity work if I/O error when system is shutting down
Hyeongseok Kim [Thu, 3 Dec 2020 00:46:59 +0000 (09:46 +0900)]
dm verity: skip verity work if I/O error when system is shutting down

BugLink: https://bugs.launchpad.net/bugs/1912681
[ Upstream commit 252bd1256396cebc6fc3526127fdb0b317601318 ]

If emergency system shutdown is called, like by thermal shutdown,
a dm device could be alive when the block device couldn't process
I/O requests anymore. In this state, the handling of I/O errors
by new dm I/O requests or by those already in-flight can lead to
a verity corruption state, which is a misjudgment.

So, skip verity work in response to I/O error when system is shutting
down.

Signed-off-by: Hyeongseok Kim <hyeongseok@gmail.com>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agoALSA: pcm: Clear the full allocated memory at hw_params
Takashi Iwai [Fri, 18 Dec 2020 14:56:25 +0000 (15:56 +0100)]
ALSA: pcm: Clear the full allocated memory at hw_params

BugLink: https://bugs.launchpad.net/bugs/1912681
[ Upstream commit 618de0f4ef11acd8cf26902e65493d46cc20cc89 ]

The PCM hw_params core function tries to clear up the PCM buffer
before actually using for avoiding the information leak from the
previous usages or the usage before a new allocation.  It performs the
memset() with runtime->dma_bytes, but this might still leave some
remaining bytes untouched; namely, the PCM buffer size is aligned in
page size for mmap, hence runtime->dma_bytes doesn't necessarily cover
all PCM buffer pages, and the remaining bytes are exposed via mmap.

This patch changes the memory clearance to cover the all buffer pages
if the stream is supposed to be mmap-ready (that guarantees that the
buffer size is aligned in page size).

Reviewed-by: Lars-Peter Clausen <lars@metafoo.de>
Link: https://lore.kernel.org/r/20201218145625.2045-3-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agotick/sched: Remove bogus boot "safety" check
Thomas Gleixner [Sun, 6 Dec 2020 21:12:55 +0000 (22:12 +0100)]
tick/sched: Remove bogus boot "safety" check

BugLink: https://bugs.launchpad.net/bugs/1912681
[ Upstream commit ba8ea8e7dd6e1662e34e730eadfc52aa6816f9dd ]

can_stop_idle_tick() checks whether the do_timer() duty has been taken over
by a CPU on boot. That's silly because the boot CPU always takes over with
the initial clockevent device.

But even if no CPU would have installed a clockevent and taken over the
duty then the question whether the tick on the current CPU can be stopped
or not is moot. In that case the current CPU would have no clockevent
either, so there would be nothing to keep ticking.

Remove it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
Link: https://lore.kernel.org/r/20201206212002.725238293@linutronix.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agoum: ubd: Submit all data segments atomically
Gabriel Krisman Bertazi [Sun, 22 Nov 2020 04:13:56 +0000 (23:13 -0500)]
um: ubd: Submit all data segments atomically

BugLink: https://bugs.launchpad.net/bugs/1912681
[ Upstream commit fc6b6a872dcd48c6f39c7975836d75113db67d37 ]

Internally, UBD treats each physical IO segment as a separate command to
be submitted in the execution pipe.  If the pipe returns a transient
error after a few segments have already been written, UBD will tell the
block layer to requeue the request, but there is no way to reclaim the
segments already submitted.  When a new attempt to dispatch the request
is done, those segments already submitted will get duplicated, causing
the WARN_ON below in the best case, and potentially data corruption.

In my system, running a UML instance with 2GB of RAM and a 50M UBD disk,
I can reproduce the WARN_ON by simply running mkfs.fvat against the
disk on a freshly booted system.

There are a few ways to around this, like reducing the pressure on
the pipe by reducing the queue depth, which almost eliminates the
occurrence of the problem, increasing the pipe buffer size on the host
system, or by limiting the request to one physical segment, which causes
the block layer to submit way more requests to resolve a single
operation.

Instead, this patch modifies the format of a UBD command, such that all
segments are sent through a single element in the communication pipe,
turning the command submission atomic from the point of view of the
block layer.  The new format has a variable size, depending on the
number of elements, and looks like this:

+------------+-----------+-----------+------------
| cmd_header | segment 0 | segment 1 | segment ...
+------------+-----------+-----------+------------

With this format, we push a pointer to cmd_header in the submission
pipe.

This has the advantage of reducing the memory footprint of executing a
single request, since it allow us to merge some fields in the header.
It is possible to reduce even further each segment memory footprint, by
merging bitmap_words and cow_offset, for instance, but this is not the
focus of this patch and is left as future work.  One issue with the
patch is that for a big number of segments, we now perform one big
memory allocation instead of multiple small ones, but I wasn't able to
trigger any real issues or -ENOMEM because of this change, that wouldn't
be reproduced otherwise.

This was tested using fio with the verify-crc32 option, and by running
an ext4 filesystem over this UBD device.

The original WARN_ON was:

------------[ cut here ]------------
WARNING: CPU: 0 PID: 0 at lib/refcount.c:28 refcount_warn_saturate+0x13f/0x141
refcount_t: underflow; use-after-free.
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted 5.5.0-rc6-00002-g2a5bb2cf75c8 #346
Stack:
 6084eed0 6063dc77 00000009 6084ef60
 00000000 604b8d9f 6084eee0 6063dcbc
 6084ef40 6006ab8d e013d780 1c00000000
Call Trace:
 [<600a0c1c>] ? printk+0x0/0x94
 [<6004a888>] show_stack+0x13b/0x155
 [<6063dc77>] ? dump_stack_print_info+0xdf/0xe8
 [<604b8d9f>] ? refcount_warn_saturate+0x13f/0x141
 [<6063dcbc>] dump_stack+0x2a/0x2c
 [<6006ab8d>] __warn+0x107/0x134
 [<6008da6c>] ? wake_up_process+0x17/0x19
 [<60487628>] ? blk_queue_max_discard_sectors+0x0/0xd
 [<6006b05f>] warn_slowpath_fmt+0xd1/0xdf
 [<6006af8e>] ? warn_slowpath_fmt+0x0/0xdf
 [<600acc14>] ? raw_read_seqcount_begin.constprop.0+0x0/0x15
 [<600619ae>] ? os_nsecs+0x1d/0x2b
 [<604b8d9f>] refcount_warn_saturate+0x13f/0x141
 [<6048bc8f>] refcount_sub_and_test.constprop.0+0x2f/0x37
 [<6048c8de>] blk_mq_free_request+0xf1/0x10d
 [<6048ca06>] __blk_mq_end_request+0x10c/0x114
 [<6005ac0f>] ubd_intr+0xb5/0x169
 [<600a1a37>] __handle_irq_event_percpu+0x6b/0x17e
 [<600a1b70>] handle_irq_event_percpu+0x26/0x69
 [<600a1bd9>] handle_irq_event+0x26/0x34
 [<600a1bb3>] ? handle_irq_event+0x0/0x34
 [<600a5186>] ? unmask_irq+0x0/0x37
 [<600a57e6>] handle_edge_irq+0xbc/0xd6
 [<600a131a>] generic_handle_irq+0x21/0x29
 [<60048f6e>] do_IRQ+0x39/0x54
 [...]
---[ end trace c6e7444e55386c0f ]---

Cc: Christopher Obbard <chris.obbard@collabora.com>
Reported-by: Martyn Welch <martyn@collabora.com>
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Tested-by: Christopher Obbard <chris.obbard@collabora.com>
Acked-by: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agofs/namespace.c: WARN if mnt_count has become negative
Eric Biggers [Sun, 1 Nov 2020 04:40:21 +0000 (21:40 -0700)]
fs/namespace.c: WARN if mnt_count has become negative

BugLink: https://bugs.launchpad.net/bugs/1912681
[ Upstream commit edf7ddbf1c5eb98b720b063b73e20e8a4a1ce673 ]

Missing calls to mntget() (or equivalently, too many calls to mntput())
are hard to detect because mntput() delays freeing mounts using
task_work_add(), then again using call_rcu().  As a result, mnt_count
can often be decremented to -1 without getting a KASAN use-after-free
report.  Such cases are still bugs though, and they point to real
use-after-frees being possible.

For an example of this, see the bug fixed by commit 1b0b9cc8d379
("vfs: fsmount: add missing mntget()"), discussed at
https://lkml.kernel.org/linux-fsdevel/20190605135401.GB30925@xxxxxxxxxxxxxxxxxxxxxxxxx/T/#u.
This bug *should* have been trivial to find.  But actually, it wasn't
found until syzkaller happened to use fchdir() to manipulate the
reference count just right for the bug to be noticeable.

Address this by making mntput_no_expire() issue a WARN if mnt_count has
become negative.

Suggested-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agomodule: delay kobject uevent until after module init call
Jessica Yu [Fri, 27 Nov 2020 09:09:39 +0000 (10:09 +0100)]
module: delay kobject uevent until after module init call

BugLink: https://bugs.launchpad.net/bugs/1912681
[ Upstream commit 38dc717e97153e46375ee21797aa54777e5498f3 ]

Apparently there has been a longstanding race between udev/systemd and
the module loader. Currently, the module loader sends a uevent right
after sysfs initialization, but before the module calls its init
function. However, some udev rules expect that the module has
initialized already upon receiving the uevent.

This race has been triggered recently (see link in references) in some
systemd mount unit files. For instance, the configfs module creates the
/sys/kernel/config mount point in its init function, however the module
loader issues the uevent before this happens. sys-kernel-config.mount
expects to be able to mount /sys/kernel/config upon receipt of the
module loading uevent, but if the configfs module has not called its
init function yet, then this directory will not exist and the mount unit
fails. A similar situation exists for sys-fs-fuse-connections.mount, as
the fuse sysfs mount point is created during the fuse module's init
function. If udev is faster than module initialization then the mount
unit would fail in a similar fashion.

To fix this race, delay the module KOBJ_ADD uevent until after the
module has finished calling its init routine.

References: https://github.com/systemd/systemd/issues/17586
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Tested-By: Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com>
Signed-off-by: Jessica Yu <jeyu@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agof2fs: avoid race condition for shrinker count
Jaegeuk Kim [Fri, 6 Nov 2020 21:22:05 +0000 (13:22 -0800)]
f2fs: avoid race condition for shrinker count

BugLink: https://bugs.launchpad.net/bugs/1912681
[ Upstream commit a95ba66ac1457b76fe472c8e092ab1006271f16c ]

Light reported sometimes shinker gets nat_cnt < dirty_nat_cnt resulting in
wrong do_shinker work. Let's avoid to return insane overflowed value by adding
single tracking value.

Reported-by: Light Hsieh <Light.Hsieh@mediatek.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agoNFSv4: Fix a pNFS layout related use-after-free race when freeing the inode
Trond Myklebust [Wed, 25 Nov 2020 17:06:14 +0000 (12:06 -0500)]
NFSv4: Fix a pNFS layout related use-after-free race when freeing the inode

BugLink: https://bugs.launchpad.net/bugs/1912681
[ Upstream commit b6d49ecd1081740b6e632366428b960461f8158b ]

When returning the layout in nfs4_evict_inode(), we need to ensure that
the layout is actually done being freed before we can proceed to free the
inode itself.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agoi3c master: fix missing destroy_workqueue() on error in i3c_master_register
Qinglang Miao [Wed, 28 Oct 2020 09:15:43 +0000 (17:15 +0800)]
i3c master: fix missing destroy_workqueue() on error in i3c_master_register

BugLink: https://bugs.launchpad.net/bugs/1912681
[ Upstream commit 59165d16c699182b86b5c65181013f1fd88feb62 ]

Add the missing destroy_workqueue() before return from
i3c_master_register in the error handling case.

Signed-off-by: Qinglang Miao <miaoqinglang@huawei.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Link: https://lore.kernel.org/linux-i3c/20201028091543.136167-1-miaoqinglang@huawei.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agopowerpc: sysdev: add missing iounmap() on error in mpic_msgr_probe()
Qinglang Miao [Wed, 28 Oct 2020 09:15:51 +0000 (17:15 +0800)]
powerpc: sysdev: add missing iounmap() on error in mpic_msgr_probe()

BugLink: https://bugs.launchpad.net/bugs/1912681
[ Upstream commit ffa1797040c5da391859a9556be7b735acbe1242 ]

I noticed that iounmap() of msgr_block_addr before return from
mpic_msgr_probe() in the error handling case is missing. So use
devm_ioremap() instead of just ioremap() when remapping the message
register block, so the mapping will be automatically released on
probe failure.

Signed-off-by: Qinglang Miao <miaoqinglang@huawei.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201028091551.136400-1-miaoqinglang@huawei.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agortc: pl031: fix resource leak in pl031_probe
Zheng Liang [Thu, 12 Nov 2020 09:31:39 +0000 (17:31 +0800)]
rtc: pl031: fix resource leak in pl031_probe

BugLink: https://bugs.launchpad.net/bugs/1912681
[ Upstream commit 1eab0fea2514b269e384c117f5b5772b882761f0 ]

When devm_rtc_allocate_device is failed in pl031_probe, it should release
mem regions with device.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zheng Liang <zhengliang6@huawei.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20201112093139.32566-1-zhengliang6@huawei.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agoquota: Don't overflow quota file offsets
Jan Kara [Mon, 2 Nov 2020 15:32:10 +0000 (16:32 +0100)]
quota: Don't overflow quota file offsets

BugLink: https://bugs.launchpad.net/bugs/1912681
[ Upstream commit 10f04d40a9fa29785206c619f80d8beedb778837 ]

The on-disk quota format supports quota files with upto 2^32 blocks. Be
careful when computing quota file offsets in the quota files from block
numbers as they can overflow 32-bit types. Since quota files larger than
4GB would require ~26 millions of quota users, this is mostly a
theoretical concern now but better be careful, fuzzers would find the
problem sooner or later anyway...

Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agomodule: set MODULE_STATE_GOING state when a module fails to load
Miroslav Benes [Tue, 27 Oct 2020 14:03:36 +0000 (15:03 +0100)]
module: set MODULE_STATE_GOING state when a module fails to load

BugLink: https://bugs.launchpad.net/bugs/1912681
[ Upstream commit 5e8ed280dab9eeabc1ba0b2db5dbe9fe6debb6b5 ]

If a module fails to load due to an error in prepare_coming_module(),
the following error handling in load_module() runs with
MODULE_STATE_COMING in module's state. Fix it by correctly setting
MODULE_STATE_GOING under "bug_cleanup" label.

Signed-off-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Jessica Yu <jeyu@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agortc: sun6i: Fix memleak in sun6i_rtc_clk_init
Dinghao Liu [Tue, 20 Oct 2020 06:12:26 +0000 (14:12 +0800)]
rtc: sun6i: Fix memleak in sun6i_rtc_clk_init

BugLink: https://bugs.launchpad.net/bugs/1912681
[ Upstream commit 28d211919e422f58c1e6c900e5810eee4f1ce4c8 ]

When clk_hw_register_fixed_rate_with_accuracy() fails,
clk_data should be freed. It's the same for the subsequent
two error paths, but we should also unregister the already
registered clocks in them.

Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Link: https://lore.kernel.org/r/20201020061226.6572-1-dinghao.liu@zju.edu.cn
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agofcntl: Fix potential deadlock in send_sig{io, urg}()
Boqun Feng [Thu, 5 Nov 2020 06:23:51 +0000 (14:23 +0800)]
fcntl: Fix potential deadlock in send_sig{io, urg}()

BugLink: https://bugs.launchpad.net/bugs/1912681
commit 8d1ddb5e79374fb277985a6b3faa2ed8631c5b4c upstream.

Syzbot reports a potential deadlock found by the newly added recursive
read deadlock detection in lockdep:

[...] ========================================================
[...] WARNING: possible irq lock inversion dependency detected
[...] 5.9.0-rc2-syzkaller #0 Not tainted
[...] --------------------------------------------------------
[...] syz-executor.1/10214 just changed the state of lock:
[...] ffff88811f506338 (&f->f_owner.lock){.+..}-{2:2}, at: send_sigurg+0x1d/0x200
[...] but this lock was taken by another, HARDIRQ-safe lock in the past:
[...]  (&dev->event_lock){-...}-{2:2}
[...]
[...]
[...] and interrupts could create inverse lock ordering between them.
[...]
[...]
[...] other info that might help us debug this:
[...] Chain exists of:
[...]   &dev->event_lock --> &new->fa_lock --> &f->f_owner.lock
[...]
[...]  Possible interrupt unsafe locking scenario:
[...]
[...]        CPU0                    CPU1
[...]        ----                    ----
[...]   lock(&f->f_owner.lock);
[...]                                local_irq_disable();
[...]                                lock(&dev->event_lock);
[...]                                lock(&new->fa_lock);
[...]   <Interrupt>
[...]     lock(&dev->event_lock);
[...]
[...]  *** DEADLOCK ***

The corresponding deadlock case is as followed:

CPU 0 CPU 1 CPU 2
read_lock(&fown->lock);
spin_lock_irqsave(&dev->event_lock, ...)
write_lock_irq(&filp->f_owner.lock); // wait for the lock
read_lock(&fown-lock); // have to wait until the writer release
       // due to the fairness
<interrupted>
spin_lock_irqsave(&dev->event_lock); // wait for the lock

The lock dependency on CPU 1 happens if there exists a call sequence:

input_inject_event():
  spin_lock_irqsave(&dev->event_lock,...);
  input_handle_event():
    input_pass_values():
      input_to_handler():
        handler->event(): // evdev_event()
          evdev_pass_values():
            spin_lock(&client->buffer_lock);
            __pass_event():
              kill_fasync():
                kill_fasync_rcu():
                  read_lock(&fa->fa_lock);
                  send_sigio():
                    read_lock(&fown->lock);

To fix this, make the reader in send_sigurg() and send_sigio() use
read_lock_irqsave() and read_lock_irqrestore().

Reported-by: syzbot+22e87cdf94021b984aa6@syzkaller.appspotmail.com
Reported-by: syzbot+c5e32344981ad9f33750@syzkaller.appspotmail.com
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agobfs: don't use WARNING: string when it's just info.
Randy Dunlap [Wed, 16 Dec 2020 04:45:44 +0000 (20:45 -0800)]
bfs: don't use WARNING: string when it's just info.

BugLink: https://bugs.launchpad.net/bugs/1912681
commit dc889b8d4a8122549feabe99eead04e6b23b6513 upstream.

Make the printk() [bfs "printf" macro] seem less severe by changing
"WARNING:" to "NOTE:".

<asm-generic/bug.h> warns us about using WARNING or BUG in a format string
other than in WARN() or BUG() family macros.  bfs/inode.c is doing just
that in a normal printk() call, so change the "WARNING" string to be
"NOTE".

Link: https://lkml.kernel.org/r/20201203212634.17278-1-rdunlap@infradead.org
Reported-by: syzbot+3fd34060f26e766536ff@syzkaller.appspotmail.com
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: "Tigran A. Aivazian" <aivazian.tigran@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agoALSA: rawmidi: Access runtime->avail always in spinlock
Takashi Iwai [Sun, 6 Dec 2020 08:35:27 +0000 (09:35 +0100)]
ALSA: rawmidi: Access runtime->avail always in spinlock

BugLink: https://bugs.launchpad.net/bugs/1912681
commit 88a06d6fd6b369d88cec46c62db3e2604a2f50d5 upstream.

The runtime->avail field may be accessed concurrently while some
places refer to it without taking the runtime->lock spinlock, as
detected by KCSAN.  Usually this isn't a big problem, but for
consistency and safety, we should take the spinlock at each place
referencing this field.

Reported-by: syzbot+a23a6f1215c84756577c@syzkaller.appspotmail.com
Reported-by: syzbot+3d367d1df1d2b67f5c19@syzkaller.appspotmail.com
Link: https://lore.kernel.org/r/20201206083527.21163-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agoALSA: seq: Use bool for snd_seq_queue internal flags
Takashi Iwai [Sun, 6 Dec 2020 08:34:56 +0000 (09:34 +0100)]
ALSA: seq: Use bool for snd_seq_queue internal flags

BugLink: https://bugs.launchpad.net/bugs/1912681
commit 4ebd47037027c4beae99680bff3b20fdee5d7c1e upstream.

The snd_seq_queue struct contains various flags in the bit fields.
Those are categorized to two different use cases, both of which are
protected by different spinlocks.  That implies that there are still
potential risks of the bad operations for bit fields by concurrent
accesses.

For addressing the problem, this patch rearranges those flags to be
a standard bool instead of a bit field.

Reported-by: syzbot+63cbe31877bb80ef58f5@syzkaller.appspotmail.com
Link: https://lore.kernel.org/r/20201206083456.21110-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agof2fs: fix shift-out-of-bounds in sanity_check_raw_super()
Chao Yu [Wed, 9 Dec 2020 08:49:36 +0000 (16:49 +0800)]
f2fs: fix shift-out-of-bounds in sanity_check_raw_super()

BugLink: https://bugs.launchpad.net/bugs/1912681
commit e584bbe821229a3e7cc409eecd51df66f9268c21 upstream.

syzbot reported a bug which could cause shift-out-of-bounds issue,
fix it.

Call Trace:
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0x107/0x163 lib/dump_stack.c:120
 ubsan_epilogue+0xb/0x5a lib/ubsan.c:148
 __ubsan_handle_shift_out_of_bounds.cold+0xb1/0x181 lib/ubsan.c:395
 sanity_check_raw_super fs/f2fs/super.c:2812 [inline]
 read_raw_super_block fs/f2fs/super.c:3267 [inline]
 f2fs_fill_super.cold+0x16c9/0x16f6 fs/f2fs/super.c:3519
 mount_bdev+0x34d/0x410 fs/super.c:1366
 legacy_get_tree+0x105/0x220 fs/fs_context.c:592
 vfs_get_tree+0x89/0x2f0 fs/super.c:1496
 do_new_mount fs/namespace.c:2896 [inline]
 path_mount+0x12ae/0x1e70 fs/namespace.c:3227
 do_mount fs/namespace.c:3240 [inline]
 __do_sys_mount fs/namespace.c:3448 [inline]
 __se_sys_mount fs/namespace.c:3425 [inline]
 __x64_sys_mount+0x27f/0x300 fs/namespace.c:3425
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Reported-by: syzbot+ca9a785f8ac472085994@syzkaller.appspotmail.com
Signed-off-by: Anant Thazhemadam <anant.thazhemadam@gmail.com>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agomedia: gp8psk: initialize stats at power control logic
Mauro Carvalho Chehab [Fri, 27 Nov 2020 06:40:21 +0000 (07:40 +0100)]
media: gp8psk: initialize stats at power control logic

BugLink: https://bugs.launchpad.net/bugs/1912681
commit d0ac1a26ed5943127cb0156148735f5f52a07075 upstream.

As reported on:
https://lore.kernel.org/linux-media/20190627222020.45909-1-willemdebruijn.kernel@gmail.com/

if gp8psk_usb_in_op() returns an error, the status var is not
initialized. Yet, this var is used later on, in order to
identify:
- if the device was already started;
- if firmware has loaded;
- if the LNBf was powered on.

Using status = 0 seems to ensure that everything will be
properly powered up.

So, instead of the proposed solution, let's just set
status = 0.

Reported-by: syzbot <syzkaller@googlegroups.com>
Reported-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agomisc: vmw_vmci: fix kernel info-leak by initializing dbells in vmci_ctx_get_chkpt_doo...
Anant Thazhemadam [Sun, 22 Nov 2020 22:45:34 +0000 (04:15 +0530)]
misc: vmw_vmci: fix kernel info-leak by initializing dbells in vmci_ctx_get_chkpt_doorbells()

BugLink: https://bugs.launchpad.net/bugs/1912681
commit 31dcb6c30a26d32650ce134820f27de3c675a45a upstream.

A kernel-infoleak was reported by syzbot, which was caused because
dbells was left uninitialized.
Using kzalloc() instead of kmalloc() fixes this issue.

Reported-by: syzbot+a79e17c39564bedf0930@syzkaller.appspotmail.com
Tested-by: syzbot+a79e17c39564bedf0930@syzkaller.appspotmail.com
Signed-off-by: Anant Thazhemadam <anant.thazhemadam@gmail.com>
Link: https://lore.kernel.org/r/20201122224534.333471-1-anant.thazhemadam@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agoreiserfs: add check for an invalid ih_entry_count
Rustam Kovhaev [Sun, 1 Nov 2020 14:09:58 +0000 (06:09 -0800)]
reiserfs: add check for an invalid ih_entry_count

BugLink: https://bugs.launchpad.net/bugs/1912681
commit d24396c5290ba8ab04ba505176874c4e04a2d53c upstream.

when directory item has an invalid value set for ih_entry_count it might
trigger use-after-free or out-of-bounds read in bin_search_in_dir_item()

ih_entry_count * IH_SIZE for directory item should not be larger than
ih_item_len

Link: https://lore.kernel.org/r/20201101140958.3650143-1-rkovhaev@gmail.com
Reported-and-tested-by: syzbot+83b6f7cf9922cae5c4d7@syzkaller.appspotmail.com
Link: https://syzkaller.appspot.com/bug?extid=83b6f7cf9922cae5c4d7
Signed-off-by: Rustam Kovhaev <rkovhaev@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agoBluetooth: hci_h5: close serdev device and free hu in h5_close
Anant Thazhemadam [Tue, 29 Sep 2020 18:58:15 +0000 (00:28 +0530)]
Bluetooth: hci_h5: close serdev device and free hu in h5_close

BugLink: https://bugs.launchpad.net/bugs/1912681
commit 70f259a3f4276b71db365b1d6ff1eab805ea6ec3 upstream.

When h5_close() gets called, the memory allocated for the hu gets
freed only if hu->serdev doesn't exist. This leads to a memory leak.
So when h5_close() is requested, close the serdev device instance and
free the memory allocated to the hu entirely instead.

Fixes: https://syzkaller.appspot.com/bug?extid=6ce141c55b2f7aafd1c4
Reported-by: syzbot+6ce141c55b2f7aafd1c4@syzkaller.appspotmail.com
Tested-by: syzbot+6ce141c55b2f7aafd1c4@syzkaller.appspotmail.com
Signed-off-by: Anant Thazhemadam <anant.thazhemadam@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agoscsi: cxgb4i: Fix TLS dependency
Randy Dunlap [Tue, 8 Dec 2020 22:05:05 +0000 (14:05 -0800)]
scsi: cxgb4i: Fix TLS dependency

BugLink: https://bugs.launchpad.net/bugs/1912681
commit cb5253198f10a4cd79b7523c581e6173c7d49ddb upstream.

SCSI_CXGB4_ISCSI selects CHELSIO_T4. The latter depends on TLS || TLS=n, so
since 'select' does not check dependencies of the selected symbol,
SCSI_CXGB4_ISCSI should also depend on TLS || TLS=n.

This prevents the following kconfig warning and restricts SCSI_CXGB4_ISCSI
to 'm' whenever TLS=m.

WARNING: unmet direct dependencies detected for CHELSIO_T4
  Depends on [m]: NETDEVICES [=y] && ETHERNET [=y] && NET_VENDOR_CHELSIO [=y] && PCI [=y] && (IPV6 [=y] || IPV6 [=y]=n) && (TLS [=m] || TLS [=m]=n)
  Selected by [y]:
  - SCSI_CXGB4_ISCSI [=y] && SCSI_LOWLEVEL [=y] && SCSI [=y] && PCI [=y] && INET [=y] && (IPV6 [=y] || IPV6 [=y]=n) && ETHERNET [=y]

Link: https://lore.kernel.org/r/20201208220505.24488-1-rdunlap@infradead.org
Fixes: 7b36b6e03b0d ("[SCSI] cxgb4i v5: iscsi driver")
Cc: Karen Xie <kxie@chelsio.com>
Cc: linux-scsi@vger.kernel.org
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agocgroup: Fix memory leak when parsing multiple source parameters
Qinglang Miao [Thu, 10 Dec 2020 01:29:43 +0000 (09:29 +0800)]
cgroup: Fix memory leak when parsing multiple source parameters

BugLink: https://bugs.launchpad.net/bugs/1912681
commit 2d18e54dd8662442ef5898c6bdadeaf90b3cebbc upstream.

A memory leak is found in cgroup1_parse_param() when multiple source
parameters overwrite fc->source in the fs_context struct without free.

unreferenced object 0xffff888100d930e0 (size 16):
  comm "mount", pid 520, jiffies 4303326831 (age 152.783s)
  hex dump (first 16 bytes):
    74 65 73 74 6c 65 61 6b 00 00 00 00 00 00 00 00  testleak........
  backtrace:
    [<000000003e5023ec>] kmemdup_nul+0x2d/0xa0
    [<00000000377dbdaa>] vfs_parse_fs_string+0xc0/0x150
    [<00000000cb2b4882>] generic_parse_monolithic+0x15a/0x1d0
    [<000000000f750198>] path_mount+0xee1/0x1820
    [<0000000004756de2>] do_mount+0xea/0x100
    [<0000000094cafb0a>] __x64_sys_mount+0x14b/0x1f0

Fix this bug by permitting a single source parameter and rejecting with
an error all subsequent ones.

Fixes: 8d2451f4994f ("cgroup1: switch to option-by-option parsing")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Qinglang Miao <miaoqinglang@huawei.com>
Reviewed-by: Zefan Li <lizefan@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agoof: fix linker-section match-table corruption
Johan Hovold [Mon, 23 Nov 2020 10:23:12 +0000 (11:23 +0100)]
of: fix linker-section match-table corruption

BugLink: https://bugs.launchpad.net/bugs/1912681
commit 5812b32e01c6d86ba7a84110702b46d8a8531fe9 upstream.

Specify type alignment when declaring linker-section match-table entries
to prevent gcc from increasing alignment and corrupting the various
tables with padding (e.g. timers, irqchips, clocks, reserved memory).

This is specifically needed on x86 where gcc (typically) aligns larger
objects like struct of_device_id with static extent on 32-byte
boundaries which at best prevents matching on anything but the first
entry. Specifying alignment when declaring variables suppresses this
optimisation.

Here's a 64-bit example where all entries are corrupt as 16 bytes of
padding has been inserted before the first entry:

ffffffff8266b4b0 D __clk_of_table
ffffffff8266b4c0 d __of_table_fixed_factor_clk
ffffffff8266b5a0 d __of_table_fixed_clk
ffffffff8266b680 d __clk_of_table_sentinel

And here's a 32-bit example where the 8-byte-aligned table happens to be
placed on a 32-byte boundary so that all but the first entry are corrupt
due to the 28 bytes of padding inserted between entries:

812b3ec0 D __irqchip_of_table
812b3ec0 d __of_table_irqchip1
812b3fa0 d __of_table_irqchip2
812b4080 d __of_table_irqchip3
812b4160 d irqchip_of_match_end

Verified on x86 using gcc-9.3 and gcc-4.9 (which uses 64-byte
alignment), and on arm using gcc-7.2.

Note that there are no in-tree users of these tables on x86 currently
(even if they are included in the image).

Fixes: 54196ccbe0ba ("of: consolidate linker section OF match table declarations")
Fixes: f6e916b82022 ("irqchip: add basic infrastructure")
Cc: stable <stable@vger.kernel.org> # 3.9
Signed-off-by: Johan Hovold <johan@kernel.org>
Link: https://lore.kernel.org/r/20201123102319.8090-2-johan@kernel.org
[ johan: adjust context to 5.4 ]
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agonull_blk: Fix zone size initialization
Damien Le Moal [Fri, 20 Nov 2020 01:55:11 +0000 (10:55 +0900)]
null_blk: Fix zone size initialization

BugLink: https://bugs.launchpad.net/bugs/1912681
commit 0ebcdd702f49aeb0ad2e2d894f8c124a0acc6e23 upstream.

For a null_blk device with zoned mode enabled is currently initialized
with a number of zones equal to the device capacity divided by the zone
size, without considering if the device capacity is a multiple of the
zone size. If the zone size is not a divisor of the capacity, the zones
end up not covering the entire capacity, potentially resulting is out
of bounds accesses to the zone array.

Fix this by adding one last smaller zone with a size equal to the
remainder of the disk capacity divided by the zone size if the capacity
is not a multiple of the zone size. For such smaller last zone, the zone
capacity is also checked so that it does not exceed the smaller zone
size.

Reported-by: Naohiro Aota <naohiro.aota@wdc.com>
Fixes: ca4b2a011948 ("null_blk: add zone support")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agotools headers UAPI: Sync linux/const.h with the kernel headers
Arnaldo Carvalho de Melo [Thu, 17 Dec 2020 17:55:01 +0000 (14:55 -0300)]
tools headers UAPI: Sync linux/const.h with the kernel headers

BugLink: https://bugs.launchpad.net/bugs/1912681
commit 7ddcdea5b54492f54700f427f58690cf1e187e5e upstream.

To pick up the changes in:

  a85cbe6159ffc973 ("uapi: move constants from <linux/kernel.h> to <linux/const.h>")

That causes no changes in tooling, just addresses this perf build
warning:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/const.h' differs from latest version at 'include/uapi/linux/const.h'
  diff -u tools/include/uapi/linux/const.h include/uapi/linux/const.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Petr Vorel <petr.vorel@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agouapi: move constants from <linux/kernel.h> to <linux/const.h>
Petr Vorel [Tue, 15 Dec 2020 03:03:21 +0000 (19:03 -0800)]
uapi: move constants from <linux/kernel.h> to <linux/const.h>

BugLink: https://bugs.launchpad.net/bugs/1912681
commit a85cbe6159ffc973e5702f70a3bd5185f8f3c38d upstream.

and include <linux/const.h> in UAPI headers instead of <linux/kernel.h>.

The reason is to avoid indirect <linux/sysinfo.h> include when using
some network headers: <linux/netlink.h> or others -> <linux/kernel.h>
-> <linux/sysinfo.h>.

This indirect include causes on MUSL redefinition of struct sysinfo when
included both <sys/sysinfo.h> and some of UAPI headers:

    In file included from x86_64-buildroot-linux-musl/sysroot/usr/include/linux/kernel.h:5,
                     from x86_64-buildroot-linux-musl/sysroot/usr/include/linux/netlink.h:5,
                     from ../include/tst_netlink.h:14,
                     from tst_crypto.c:13:
    x86_64-buildroot-linux-musl/sysroot/usr/include/linux/sysinfo.h:8:8: error: redefinition of `struct sysinfo'
     struct sysinfo {
            ^~~~~~~
    In file included from ../include/tst_safe_macros.h:15,
                     from ../include/tst_test.h:93,
                     from tst_crypto.c:11:
    x86_64-buildroot-linux-musl/sysroot/usr/include/sys/sysinfo.h:10:8: note: originally defined here

Link: https://lkml.kernel.org/r/20201015190013.8901-1-petr.vorel@gmail.com
Signed-off-by: Petr Vorel <petr.vorel@gmail.com>
Suggested-by: Rich Felker <dalias@aerifal.cx>
Acked-by: Rich Felker <dalias@libc.org>
Cc: Peter Korsgaard <peter@korsgaard.com>
Cc: Baruch Siach <baruch@tkos.co.il>
Cc: Florian Weimer <fweimer@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agoscsi: block: Fix a race in the runtime power management code
Bart Van Assche [Wed, 9 Dec 2020 05:29:44 +0000 (21:29 -0800)]
scsi: block: Fix a race in the runtime power management code

BugLink: https://bugs.launchpad.net/bugs/1912681
commit fa4d0f1992a96f6d7c988ef423e3127e613f6ac9 upstream.

With the current implementation the following race can happen:

 * blk_pre_runtime_suspend() calls blk_freeze_queue_start() and
   blk_mq_unfreeze_queue().

 * blk_queue_enter() calls blk_queue_pm_only() and that function returns
   true.

 * blk_queue_enter() calls blk_pm_request_resume() and that function does
   not call pm_request_resume() because the queue runtime status is
   RPM_ACTIVE.

 * blk_pre_runtime_suspend() changes the queue status into RPM_SUSPENDING.

Fix this race by changing the queue runtime status into RPM_SUSPENDING
before switching q_usage_counter to atomic mode.

Link: https://lore.kernel.org/r/20201209052951.16136-2-bvanassche@acm.org
Fixes: 986d413b7c15 ("blk-mq: Enable support for runtime power management")
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: stable <stable@vger.kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Stanley Chu <stanley.chu@mediatek.com>
Co-developed-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agojffs2: Fix NULL pointer dereference in rp_size fs option parsing
Jamie Iles [Mon, 12 Oct 2020 13:12:04 +0000 (14:12 +0100)]
jffs2: Fix NULL pointer dereference in rp_size fs option parsing

BugLink: https://bugs.launchpad.net/bugs/1912681
[ Upstream commit a61df3c413e49b0042f9caf774c58512d1cc71b7 ]

syzkaller found the following JFFS2 splat:

  Unable to handle kernel paging request at virtual address dfffa00000000001
  Mem abort info:
    ESR = 0x96000004
    EC = 0x25: DABT (current EL), IL = 32 bits
    SET = 0, FnV = 0
    EA = 0, S1PTW = 0
  Data abort info:
    ISV = 0, ISS = 0x00000004
    CM = 0, WnR = 0
  [dfffa00000000001] address between user and kernel address ranges
  Internal error: Oops: 96000004 [#1] SMP
  Dumping ftrace buffer:
     (ftrace buffer empty)
  Modules linked in:
  CPU: 0 PID: 12745 Comm: syz-executor.5 Tainted: G S                5.9.0-rc8+ #98
  Hardware name: linux,dummy-virt (DT)
  pstate: 20400005 (nzCv daif +PAN -UAO BTYPE=--)
  pc : jffs2_parse_param+0x138/0x308 fs/jffs2/super.c:206
  lr : jffs2_parse_param+0x108/0x308 fs/jffs2/super.c:205
  sp : ffff000022a57910
  x29: ffff000022a57910 x28: 0000000000000000
  x27: ffff000057634008 x26: 000000000000d800
  x25: 000000000000d800 x24: ffff0000271a9000
  x23: ffffa0001adb5dc0 x22: ffff000023fdcf00
  x21: 1fffe0000454af2c x20: ffff000024cc9400
  x19: 0000000000000000 x18: 0000000000000000
  x17: 0000000000000000 x16: ffffa000102dbdd0
  x15: 0000000000000000 x14: ffffa000109e44bc
  x13: ffffa00010a3a26c x12: ffff80000476e0b3
  x11: 1fffe0000476e0b2 x10: ffff80000476e0b2
  x9 : ffffa00010a3ad60 x8 : ffff000023b70593
  x7 : 0000000000000003 x6 : 00000000f1f1f1f1
  x5 : ffff000023fdcf00 x4 : 0000000000000002
  x3 : ffffa00010000000 x2 : 0000000000000001
  x1 : dfffa00000000000 x0 : 0000000000000008
  Call trace:
   jffs2_parse_param+0x138/0x308 fs/jffs2/super.c:206
   vfs_parse_fs_param+0x234/0x4e8 fs/fs_context.c:117
   vfs_parse_fs_string+0xe8/0x148 fs/fs_context.c:161
   generic_parse_monolithic+0x17c/0x208 fs/fs_context.c:201
   parse_monolithic_mount_data+0x7c/0xa8 fs/fs_context.c:649
   do_new_mount fs/namespace.c:2871 [inline]
   path_mount+0x548/0x1da8 fs/namespace.c:3192
   do_mount+0x124/0x138 fs/namespace.c:3205
   __do_sys_mount fs/namespace.c:3413 [inline]
   __se_sys_mount fs/namespace.c:3390 [inline]
   __arm64_sys_mount+0x164/0x238 fs/namespace.c:3390
   __invoke_syscall arch/arm64/kernel/syscall.c:36 [inline]
   invoke_syscall arch/arm64/kernel/syscall.c:48 [inline]
   el0_svc_common.constprop.0+0x15c/0x598 arch/arm64/kernel/syscall.c:149
   do_el0_svc+0x60/0x150 arch/arm64/kernel/syscall.c:195
   el0_svc+0x34/0xb0 arch/arm64/kernel/entry-common.c:226
   el0_sync_handler+0xc8/0x5b4 arch/arm64/kernel/entry-common.c:236
   el0_sync+0x15c/0x180 arch/arm64/kernel/entry.S:663
  Code: d2d40001 f2fbffe1 91002260 d343fc02 (38e16841)
  ---[ end trace 4edf690313deda44 ]---

This is because since ec10a24f10c8, the option parsing happens before
fill_super and so the MTD device isn't associated with the filesystem.
Defer the size check until there is a valid association.

Fixes: ec10a24f10c8 ("vfs: Convert jffs2 to use the new mount API")
Cc: <stable@vger.kernel.org>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Jamie Iles <jamie@nuviainc.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agojffs2: Allow setting rp_size to zero during remounting
lizhe [Wed, 14 Oct 2020 06:54:43 +0000 (14:54 +0800)]
jffs2: Allow setting rp_size to zero during remounting

BugLink: https://bugs.launchpad.net/bugs/1912681
[ Upstream commit cd3ed3c73ac671ff6b0230ccb72b8300292d3643 ]

Set rp_size to zero will be ignore during remounting.

The method to identify whether we input a remounting option of
rp_size is to check if the rp_size input is zero. It can not work
well if we pass "rp_size=0".

This patch add a bool variable "set_rp_size" to fix this problem.

Reported-by: Jubin Zhong <zhongjubin@huawei.com>
Signed-off-by: lizhe <lizhe67@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agopowerpc/bitops: Fix possible undefined behaviour with fls() and fls64()
Christophe Leroy [Thu, 22 Oct 2020 14:05:46 +0000 (14:05 +0000)]
powerpc/bitops: Fix possible undefined behaviour with fls() and fls64()

BugLink: https://bugs.launchpad.net/bugs/1912681
[ Upstream commit 1891ef21d92c4801ea082ee8ed478e304ddc6749 ]

fls() and fls64() are using __builtin_ctz() and _builtin_ctzll().
On powerpc, those builtins trivially use ctlzw and ctlzd power
instructions.

Allthough those instructions provide the expected result with
input argument 0, __builtin_ctz() and __builtin_ctzll() are
documented as undefined for value 0.

The easiest fix would be to use fls() and fls64() functions
defined in include/asm-generic/bitops/builtin-fls.h and
include/asm-generic/bitops/fls64.h, but GCC output is not optimal:

00000388 <testfls>:
 388:   2c 03 00 00     cmpwi   r3,0
 38c:   41 82 00 10     beq     39c <testfls+0x14>
 390:   7c 63 00 34     cntlzw  r3,r3
 394:   20 63 00 20     subfic  r3,r3,32
 398:   4e 80 00 20     blr
 39c:   38 60 00 00     li      r3,0
 3a0:   4e 80 00 20     blr

000003b0 <testfls64>:
 3b0:   2c 03 00 00     cmpwi   r3,0
 3b4:   40 82 00 1c     bne     3d0 <testfls64+0x20>
 3b8:   2f 84 00 00     cmpwi   cr7,r4,0
 3bc:   38 60 00 00     li      r3,0
 3c0:   4d 9e 00 20     beqlr   cr7
 3c4:   7c 83 00 34     cntlzw  r3,r4
 3c8:   20 63 00 20     subfic  r3,r3,32
 3cc:   4e 80 00 20     blr
 3d0:   7c 63 00 34     cntlzw  r3,r3
 3d4:   20 63 00 40     subfic  r3,r3,64
 3d8:   4e 80 00 20     blr

When the input of fls(x) is a constant, just check x for nullity and
return either 0 or __builtin_clz(x). Otherwise, use cntlzw instruction
directly.

For fls64() on PPC64, do the same but with __builtin_clzll() and
cntlzd instruction. On PPC32, lets take the generic fls64() which
will use our fls(). The result is as expected:

00000388 <testfls>:
 388:   7c 63 00 34     cntlzw  r3,r3
 38c:   20 63 00 20     subfic  r3,r3,32
 390:   4e 80 00 20     blr

000003a0 <testfls64>:
 3a0:   2c 03 00 00     cmpwi   r3,0
 3a4:   40 82 00 10     bne     3b4 <testfls64+0x14>
 3a8:   7c 83 00 34     cntlzw  r3,r4
 3ac:   20 63 00 20     subfic  r3,r3,32
 3b0:   4e 80 00 20     blr
 3b4:   7c 63 00 34     cntlzw  r3,r3
 3b8:   20 63 00 40     subfic  r3,r3,64
 3bc:   4e 80 00 20     blr

Fixes: 2fcff790dcb4 ("powerpc: Use builtin functions for fls()/__fls()/fls64()")
Cc: stable@vger.kernel.org
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/348c2d3f19ffcff8abe50d52513f989c4581d000.1603375524.git.christophe.leroy@csgroup.eu
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agoKVM: x86: reinstate vendor-agnostic check on SPEC_CTRL cpuid bits
Paolo Bonzini [Thu, 3 Dec 2020 14:40:15 +0000 (09:40 -0500)]
KVM: x86: reinstate vendor-agnostic check on SPEC_CTRL cpuid bits

BugLink: https://bugs.launchpad.net/bugs/1912681
[ Upstream commit 39485ed95d6b83b62fa75c06c2c4d33992e0d971 ]

Until commit e7c587da1252 ("x86/speculation: Use synthetic bits for
IBRS/IBPB/STIBP"), KVM was testing both Intel and AMD CPUID bits before
allowing the guest to write MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD.
Testing only Intel bits on VMX processors, or only AMD bits on SVM
processors, fails if the guests are created with the "opposite" vendor
as the host.

While at it, also tweak the host CPU check to use the vendor-agnostic
feature bit X86_FEATURE_IBPB, since we only care about the availability
of the MSR on the host here and not about specific CPUID bits.

Fixes: e7c587da1252 ("x86/speculation: Use synthetic bits for IBRS/IBPB/STIBP")
Cc: stable@vger.kernel.org
Reported-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agoKVM: SVM: relax conditions for allowing MSR_IA32_SPEC_CTRL accesses
Paolo Bonzini [Wed, 5 Feb 2020 15:10:52 +0000 (16:10 +0100)]
KVM: SVM: relax conditions for allowing MSR_IA32_SPEC_CTRL accesses

BugLink: https://bugs.launchpad.net/bugs/1912681
[ Upstream commit df7e8818926eb4712b67421442acf7d568fe2645 ]

Userspace that does not know about the AMD_IBRS bit might still
allow the guest to protect itself with MSR_IA32_SPEC_CTRL using
the Intel SPEC_CTRL bit.  However, svm.c disallows this and will
cause a #GP in the guest when writing to the MSR.  Fix this by
loosening the test and allowing the Intel CPUID bit, and in fact
allow the AMD_STIBP bit as well since it allows writing to
MSR_IA32_SPEC_CTRL too.

Reported-by: Zhiyi Guo <zhguo@redhat.com>
Analyzed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Analyzed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agoKVM: x86: avoid incorrect writes to host MSR_IA32_SPEC_CTRL
Paolo Bonzini [Mon, 20 Jan 2020 15:33:06 +0000 (16:33 +0100)]
KVM: x86: avoid incorrect writes to host MSR_IA32_SPEC_CTRL

BugLink: https://bugs.launchpad.net/bugs/1912681
[ Upstream commit 6441fa6178f5456d1d4b512c08798888f99db185 ]

If the guest is configured to have SPEC_CTRL but the host does not
(which is a nonsensical configuration but these are not explicitly
forbidden) then a host-initiated MSR write can write vmx->spec_ctrl
(respectively svm->spec_ctrl) and trigger a #GP when KVM tries to
restore the host value of the MSR.  Add a more comprehensive check
for valid bits of SPEC_CTRL, covering host CPUID flags and,
since we are at it and it is more correct that way, guest CPUID
flags too.

For AMD, remove the unnecessary is_guest_mode check around setting
the MSR interception bitmap, so that the code looks the same as
for Intel.

Cc: Jim Mattson <jmattson@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
3 years agoext4: don't remount read-only with errors=continue on reboot
Jan Kara [Fri, 27 Nov 2020 11:33:54 +0000 (12:33 +0100)]
ext4: don't remount read-only with errors=continue on reboot

BugLink: https://bugs.launchpad.net/bugs/1912681
[ Upstream commit b08070eca9e247f60ab39d79b2c25d274750441f ]

ext4_handle_error() with errors=continue mount option can accidentally
remount the filesystem read-only when the system is rebooting. Fix that.

Fixes: 1dc1097ff60e ("ext4: avoid panic during forced reboot")
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Cc: stable@kernel.org
Link: https://lore.kernel.org/r/20201127113405.26867-2-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>