git.proxmox.com Git - mirror_ubuntu-focal-kernel.git/log

drm/exynos: hdmi: don't leak enable HDMI_EN regulator if probe fails

BugLink: https://bugs.launchpad.net/bugs/1869061
[ Upstream commit 3b6a9b19ab652efac7ad4c392add6f1235019568 ]

Move enabling and disabling HDMI_EN optional regulator to probe() function
to keep track on the regulator status. This fixes following warning if
probe() fails (for example when I2C DDC adapter cannot be yet gathered
due to the missing driver). This fixes following warning observed on
Arndale5250 board with multi_v7_defconfig:

[drm] Failed to get ddc i2c adapter by node
------------[ cut here ]------------
WARNING: CPU: 0 PID: 214 at drivers/regulator/core.c:2051 _regulator_put+0x16c/0x184
Modules linked in: ...
CPU: 0 PID: 214 Comm: systemd-udevd Not tainted 5.6.0-rc2-next-20200219-00040-g38af1dfafdbb #7570
Hardware name: Samsung Exynos (Flattened Device Tree)
[<c0312258>] (unwind_backtrace) from [<c030cc10>] (show_stack+0x10/0x14)
[<c030cc10>] (show_stack) from [<c0f0d3a0>] (dump_stack+0xcc/0xe0)
[<c0f0d3a0>] (dump_stack) from [<c0346a58>] (__warn+0xe0/0xf8)
[<c0346a58>] (__warn) from [<c0346b20>] (warn_slowpath_fmt+0xb0/0xb8)
[<c0346b20>] (warn_slowpath_fmt) from [<c0893f58>] (_regulator_put+0x16c/0x184)
[<c0893f58>] (_regulator_put) from [<c0893f8c>] (regulator_put+0x1c/0x2c)
[<c0893f8c>] (regulator_put) from [<c09b2664>] (release_nodes+0x17c/0x200)
[<c09b2664>] (release_nodes) from [<c09aebe8>] (really_probe+0x10c/0x350)
[<c09aebe8>] (really_probe) from [<c09aefa8>] (driver_probe_device+0x60/0x1a0)
[<c09aefa8>] (driver_probe_device) from [<c09af288>] (device_driver_attach+0x58/0x60)
[<c09af288>] (device_driver_attach) from [<c09af310>] (__driver_attach+0x80/0xbc)
[<c09af310>] (__driver_attach) from [<c09ace34>] (bus_for_each_dev+0x68/0xb4)
[<c09ace34>] (bus_for_each_dev) from [<c09ae00c>] (bus_add_driver+0x130/0x1e8)
[<c09ae00c>] (bus_add_driver) from [<c09afd98>] (driver_register+0x78/0x110)
[<c09afd98>] (driver_register) from [<bf139558>] (exynos_drm_init+0xe8/0x11c [exynosdrm])
[<bf139558>] (exynos_drm_init [exynosdrm]) from [<c0302fa8>] (do_one_initcall+0x50/0x220)
[<c0302fa8>] (do_one_initcall) from [<c03dc02c>] (do_init_module+0x60/0x210)
[<c03dc02c>] (do_init_module) from [<c03daf44>] (load_module+0x1c0c/0x2310)
[<c03daf44>] (load_module) from [<c03db85c>] (sys_finit_module+0xac/0xbc)
[<c03db85c>] (sys_finit_module) from [<c0301000>] (ret_fast_syscall+0x0/0x54)
Exception stack(0xecca3fa8 to 0xecca3ff0)
...
---[ end trace 276c91214635905c ]---

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Andrzej Hajda <a.hajda@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

drm/exynos: dsi: fix workaround for the legacy clock name

BugLink: https://bugs.launchpad.net/bugs/1869061
[ Upstream commit c0fd99d659ba5582e09625c7a985d63fc2ca74b5 ]

Writing to the built-in strings arrays doesn't work if driver is loaded
as kernel module. This is also considered as a bad pattern. Fix this by
adding a call to clk_get() with legacy clock name. This fixes following
kernel oops if driver is loaded as module:

Unable to handle kernel paging request at virtual address bf047978
pgd = (ptrval)
[bf047978] *pgd=59344811, *pte=5903c6df, *ppte=5903c65f
Internal error: Oops: 80f [#1] SMP ARM
Modules linked in: mc exynosdrm(+) analogix_dp rtc_s3c exynos_ppmu i2c_gpio
CPU: 1 PID: 212 Comm: systemd-udevd Not tainted 5.6.0-rc2-next-20200219 #326
videodev: Linux video capture interface: v2.00
Hardware name: Samsung Exynos (Flattened Device Tree)
PC is at exynos_dsi_probe+0x1f0/0x384 [exynosdrm]
LR is at exynos_dsi_probe+0x1dc/0x384 [exynosdrm]
...
Process systemd-udevd (pid: 212, stack limit = 0x(ptrval))
...
[<bf03cf14>] (exynos_dsi_probe [exynosdrm]) from [<c09b1ca0>] (platform_drv_probe+0x6c/0xa4)
[<c09b1ca0>] (platform_drv_probe) from [<c09afcb8>] (really_probe+0x210/0x350)
[<c09afcb8>] (really_probe) from [<c09aff74>] (driver_probe_device+0x60/0x1a0)
[<c09aff74>] (driver_probe_device) from [<c09b0254>] (device_driver_attach+0x58/0x60)
[<c09b0254>] (device_driver_attach) from [<c09b02dc>] (__driver_attach+0x80/0xbc)
[<c09b02dc>] (__driver_attach) from [<c09ade00>] (bus_for_each_dev+0x68/0xb4)
[<c09ade00>] (bus_for_each_dev) from [<c09aefd8>] (bus_add_driver+0x130/0x1e8)
[<c09aefd8>] (bus_add_driver) from [<c09b0d64>] (driver_register+0x78/0x110)
[<c09b0d64>] (driver_register) from [<bf038558>] (exynos_drm_init+0xe8/0x11c [exynosdrm])
[<bf038558>] (exynos_drm_init [exynosdrm]) from [<c0302fa8>] (do_one_initcall+0x50/0x220)
[<c0302fa8>] (do_one_initcall) from [<c03dd02c>] (do_init_module+0x60/0x210)
[<c03dd02c>] (do_init_module) from [<c03dbf44>] (load_module+0x1c0c/0x2310)
[<c03dbf44>] (load_module) from [<c03dc85c>] (sys_finit_module+0xac/0xbc)
[<c03dc85c>] (sys_finit_module) from [<c0301000>] (ret_fast_syscall+0x0/0x54)
Exception stack(0xd979bfa8 to 0xd979bff0)
...
---[ end trace db16efe05faab470 ]---

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Andrzej Hajda <a.hajda@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

drm/exynos: dsi: propagate error value and silence meaningless warning

BugLink: https://bugs.launchpad.net/bugs/1869061
[ Upstream commit 0a9d1e3f3f038785ebc72d53f1c409d07f6b4ff5 ]

Properly propagate error value from devm_regulator_bulk_get() and don't
confuse user with meaningless warning about failure in getting regulators
in case of deferred probe.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

spi/zynqmp: remove entry that causes a cs glitch

BugLink: https://bugs.launchpad.net/bugs/1869061
[ Upstream commit 5dd8304981ecffa77bb72b1c57c4be5dfe6cfae9 ]

In the public interface for chipselect, there is always an entry
commented as "Dummy generic FIFO entry" pushed down to the fifo right
after the activate/deactivate command. The dummy entry is 0x0,
irregardless if the intention was to activate or deactive the cs. This
causes the cs line to glitch rather than beeing activated in the case
when there was an activate command.

This has been observed on oscilloscope, and have caused problems for at
least one specific flash device type connected to the qspi port. After
the change the glitch is gone and cs goes active when intended.

The reason why this worked before (except for the glitch) was because
when sending the actual data, the CS bits are once again set. Since
most flashes uses mode 0, there is always a half clk period anyway for
cs to clk active setup time. If someone would rely on timing from a
chip_select call to a transfer_one, it would fail though.

It is unknown why the dummy entry was there in the first place, git log
seems to be of no help in this case. The reference manual gives no
indication of the necessity of this. In fact the lower 8 bits are a
setup (or hold in case of deactivate) time expressed in cycles. So this
should not be needed to fulfill any setup/hold timings.

Signed-off-by: Thommy Jakobsson <thommyj@gmail.com>
Reviewed-by: Naga Sureshkumar Relli <naga.sureshkumar.relli@xilinx.com>
Link: https://lore.kernel.org/r/20200224162643.29102-1-thommyj@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

spi: pxa2xx: Add CS control clock quirk

BugLink: https://bugs.launchpad.net/bugs/1869061
[ Upstream commit 683f65ded66a9a7ff01ed7280804d2132ebfdf7e ]

In some circumstances on Intel LPSS controllers, toggling the LPSS
CS control register doesn't actually cause the CS line to toggle.
This seems to be failure of dynamic clock gating that occurs after
going through a suspend/resume transition, where the controller
is sent through a reset transition. This ruins SPI transactions
that either rely on delay_usecs, or toggle the CS line without
sending data.

Whenever CS is toggled, momentarily set the clock gating register
to "Force On" to poke the controller into acting on CS.

Signed-off-by: Rajat Jain <rajatja@google.com>
Signed-off-by: Evan Green <evgreen@chromium.org>
Link: https://lore.kernel.org/r/20200211223700.110252-1-rajatja@google.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

ARM: dts: dra7: Add "dma-ranges" property to PCIe RC DT nodes

BugLink: https://bugs.launchpad.net/bugs/1869061
[ Upstream commit 27f13774654ea6bd0b6fc9b97cce8d19e5735661 ]

'dma-ranges' in a PCI bridge node does correctly set dma masks for PCI
devices not described in the DT. Certain DRA7 platforms (e.g., DRA76)
has RAM above 32-bit boundary (accessible with LPAE config) though the
PCIe bridge will be able to access only 32-bits. Add 'dma-ranges'
property in PCIe RC DT nodes to indicate the host bridge can access
only 32 bits.

Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

cifs: add missing mount option to /proc/mounts

BugLink: https://bugs.launchpad.net/bugs/1869061
[ Upstream commit ec57010acd03428a749d2600bf09bd537eaae993 ]

We were not displaying the mount option "signloosely" in /proc/mounts
for cifs mounts which some users found confusing recently

Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

cifs: fix potential mismatch of UNC paths

BugLink: https://bugs.launchpad.net/bugs/1869061
[ Upstream commit 154255233830e1e4dd0d99ac929a5dce588c0b81 ]

Ensure that full_path is an UNC path that contains '\\' as delimiter,
which is required by cifs_build_devname().

The build_path_from_dentry_optional_prefix() function may return a
path with '/' as delimiter when using SMB1 UNIX extensions, for
example.

Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
Acked-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

powerpc: Include .BTF section

BugLink: https://bugs.launchpad.net/bugs/1869061
[ Upstream commit cb0cc635c7a9fa8a3a0f75d4d896721819c63add ]

Selecting CONFIG_DEBUG_INFO_BTF results in the below warning from ld:
ld: warning: orphan section `.BTF' from `.btf.vmlinux.bin.o' being placed in section `.BTF'

Include .BTF section in vmlinux explicitly to fix the same.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200220113132.857132-1-naveen.n.rao@linux.vnet.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

spi: qup: call spi_qup_pm_resume_runtime before suspending

BugLink: https://bugs.launchpad.net/bugs/1869061
[ Upstream commit 136b5cd2e2f97581ae560cff0db2a3b5369112da ]

spi_qup_suspend() will cause synchronous external abort when
runtime suspend is enabled and applied, as it tries to
access SPI controller register while clock is already disabled
in spi_qup_pm_suspend_runtime().

Signed-off-by: Yuji sasaki <sasakiy@chromium.org>
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Link: https://lore.kernel.org/r/20200214074340.2286170-1-vkoul@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

ARM: dts: dra7-l4: mark timer13-16 as pwm capable

BugLink: https://bugs.launchpad.net/bugs/1869061
[ Upstream commit 00a39c92c8ab94727f021297d1748531af113fcd ]

DMTimers 13 - 16 are PWM capable and also can be used for CPTS input
signals generation. Hence, mark them as "ti,timer-pwm".

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Reviewed-by: Lokesh Vutla <lokeshvutla@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

phy: ti: gmii-sel: do not fail in case of gmii

BugLink: https://bugs.launchpad.net/bugs/1869061
[ Upstream commit 58aa7729310db04ffcc022c98002dd8fcb486c58 ]

The "gmii" PHY interface mode is supported on TI AM335x/437x/5xx SoCs, so
don't fail if it's selected.

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

phy: ti: gmii-sel: fix set of copy-paste errors

BugLink: https://bugs.launchpad.net/bugs/1869061
[ Upstream commit eefed634eb61e4094b9fb8183cb8d43b26838517 ]

- under PHY_INTERFACE_MODE_MII the 'mode' func parameter is assigned
instead of 'gmii_sel_mode' and it's working only because the default value
'gmii_sel_mode' is set to 0.

- console outputs use 'rgmii_id' and 'mode' values to print PHY mode
instead of using 'submode' value which is representing PHY interface mode
now.

This patch fixes above two cases.

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

drm/mediatek: Find the cursor plane instead of hard coding it

BugLink: https://bugs.launchpad.net/bugs/1869061
[ Upstream commit 318caac7c81cdf5806df30c3d72385659a5f0f53 ]

The cursor and primary planes were hard coded.
Now search for them for passing to drm_crtc_init_with_planes

Signed-off-by: Evan Benn <evanbenn@chromium.org>
Reviewed-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: CK Hu <ck.hu@mediatek.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

spi: spi-omap2-mcspi: Support probe deferral for DMA channels

BugLink: https://bugs.launchpad.net/bugs/1869061
[ Upstream commit 32f2fc5dc3992b4b60cc6b1a6a31be605cc9c3a2 ]

dma_request_channel() can return -EPROBE_DEFER, if DMA driver is not
ready. Currently driver just falls back to PIO mode on probe deferral.
Fix this by requesting all required channels during probe and
propagating EPROBE_DEFER error code.

Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
Link: https://lore.kernel.org/r/20200204124816.16735-3-vigneshr@ti.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

locks: reinstate locks_delete_block optimization

BugLink: https://bugs.launchpad.net/bugs/1869061
[ Upstream commit dcf23ac3e846ca0cf626c155a0e3fcbbcf4fae8a ]

There is measurable performance impact in some synthetic tests due to
commit 6d390e4b5d48 (locks: fix a potential use-after-free problem when
wakeup a waiter). Fix the race condition instead by clearing the
fl_blocker pointer after the wake_up, using explicit acquire/release
semantics.

This does mean that we can no longer use the clearing of fl_blocker as
the wait condition, so switch the waiters over to checking whether the
fl_blocked_member list_head is empty.

Reviewed-by: yangerkun <yangerkun@huawei.com>
Reviewed-by: NeilBrown <neilb@suse.de>
Fixes: 6d390e4b5d48 (locks: fix a potential use-after-free problem when wakeup a waiter)
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

locks: fix a potential use-after-free problem when wakeup a waiter

BugLink: https://bugs.launchpad.net/bugs/1869061
[ Upstream commit 6d390e4b5d48ec03bb87e63cf0a2bff5f4e116da ]

'16306a61d3b7 ("fs/locks: always delete_block after waiting.")' add the
logic to check waiter->fl_blocker without blocked_lock_lock. And it will
trigger a UAF when we try to wakeup some waiter：

Thread 1 has create a write flock a on file, and now thread 2 try to
unlock and delete flock a, thread 3 try to add flock b on the same file.

Thread2                         Thread3
                                flock syscall(create flock b)
                        ...flock_lock_inode_wait
    flock_lock_inode(will insert
    our fl_blocked_member list
    to flock a's fl_blocked_requests)
   sleep
flock syscall(unlock)
...flock_lock_inode_wait
    locks_delete_lock_ctx
    ...__locks_wake_up_blocks
        __locks_delete_blocks(
b->fl_blocker = NULL)
...
                                   break by a signal
   locks_delete_block
    b->fl_blocker == NULL &&
    list_empty(&b->fl_blocked_requests)
                            success, return directly
locks_free_lock b
wake_up(&b->fl_waiter)
trigger UAF

Fix it by remove this logic, and this patch may also fix CVE-2019-19769.

Cc: stable@vger.kernel.org
Fixes: 16306a61d3b7 ("fs/locks: always delete_block after waiting.")
Signed-off-by: yangerkun <yangerkun@huawei.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

EDAC/amd64: Drop some family checks for newer systems

BugLink: https://bugs.launchpad.net/bugs/1869061
BugLink: https://bugs.launchpad.net/bugs/1867363
In general, "pvt->umc != NULL" is used to check if the system is Family
17h+. However, there are a few places that are using direct family
checks.

Replace the remaining family checks with a check for "pvt->umc != NULL".

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200110015651.14887-6-Yazen.Ghannam@amd.com
(cherry picked from commit dcd01394ce7cd7d25bb15c81ad2e804d8090611f)
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

x86/amd_nb: Add Family 19h PCI IDs

BugLink: https://bugs.launchpad.net/bugs/1869061
BugLink: https://bugs.launchpad.net/bugs/1867363
Add the new PCI Device 18h IDs for AMD Family 19h systems. Note that
Family 19h systems will not have a new PCI root device ID.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200110015651.14887-4-Yazen.Ghannam@amd.com
(cherry picked from commit b3f79ae45904ae987a7c06a9e8d6084d7b73e67f)
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

EDAC/mce_amd: Always load on SMCA systems

BugLink: https://bugs.launchpad.net/bugs/1869061
BugLink: https://bugs.launchpad.net/bugs/1867363
MCA error decoding on SMCA systems is not dependent on family. Return
success early if the system supports the SMCA feature.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200110015651.14887-3-Yazen.Ghannam@amd.com
(cherry picked from commit 9f6aef86315ac31481a288ba1b3f43b2aac93757)
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

x86/MCE/AMD, EDAC/mce_amd: Add new Load Store unit McaType

BugLink: https://bugs.launchpad.net/bugs/1869061
BugLink: https://bugs.launchpad.net/bugs/1867363
Add support for a new version of the Load Store unit bank type as
indicated by its McaType value, which will be present in future SMCA
systems.

Add the new (HWID, MCATYPE) tuple. Reuse the same name, since this is
logically the same to the user.

Also, add the new error descriptions to edac_mce_amd.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200110015651.14887-2-Yazen.Ghannam@amd.com
(cherry picked from commit 89a76171bf50bd20d44338408b8c09433c302956)
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

EDAC/amd64: Add family ops for Family 19h Models 00h-0Fh

BugLink: https://bugs.launchpad.net/bugs/1869061
BugLink: https://bugs.launchpad.net/bugs/1867363
Add family ops to support AMD Family 19h systems. Existing Family 17h
functions can be used. Also, add Family 19h to the list of families to
automatically load the module.

kamal@canonical.com: For backport to v5.4, omit the ".max_mcs" value.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200110015651.14887-5-Yazen.Ghannam@amd.com
(backported from commit 2eb61c91c3e2738218e55f2eaf7e78a4435c233d)
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

perf/x86/amd: Add support for Large Increment per Cycle Events

BugLink: https://bugs.launchpad.net/bugs/1869061
BugLink: https://bugs.launchpad.net/bugs/1867363
Description of hardware operation
---------------------------------

The core AMD PMU has a 4-bit wide per-cycle increment for each
performance monitor counter.  That works for most events, but
now with AMD Family 17h and above processors, some events can
occur more than 15 times in a cycle.  Those events are called
"Large Increment per Cycle" events. In order to count these
events, two adjacent h/w PMCs get their count signals merged
to form 8 bits per cycle total.  In addition, the PERF_CTR count
registers are merged to be able to count up to 64 bits.

Normally, events like instructions retired, get programmed on a single
counter like so:

PERF_CTL0 (MSR 0xc0010200) 0x000000000053ff0c # event 0x0c, umask 0xff
PERF_CTR0 (MSR 0xc0010201) 0x0000800000000001 # r/w 48-bit count

The next counter at MSRs 0xc0010202-3 remains unused, or can be used
independently to count something else.

When counting Large Increment per Cycle events, such as FLOPs,
however, we now have to reserve the next counter and program the
PERF_CTL (config) register with the Merge event (0xFFF), like so:

PERF_CTL0 (msr 0xc0010200) 0x000000000053ff03 # FLOPs event, umask 0xff
PERF_CTR0 (msr 0xc0010201) 0x0000800000000001 # rd 64-bit cnt, wr lo 48b
PERF_CTL1 (msr 0xc0010202) 0x0000000f004000ff # Merge event, enable bit
PERF_CTR1 (msr 0xc0010203) 0x0000000000000000 # wr hi 16-bits count

The count is widened from the normal 48-bits to 64 bits by having the
second counter carry the higher 16 bits of the count in its lower 16
bits of its counter register.

The odd counter, e.g., PERF_CTL1, is programmed with the enabled Merge
event before the even counter, PERF_CTL0.

The Large Increment feature is available starting with Family 17h.
For more details, search any Family 17h PPR for the "Large Increment
per Cycle Events" section, e.g., section 2.1.15.3 on p. 173 in this
version:

https://www.amd.com/system/files/TechDocs/56176_ppr_Family_17h_Model_71h_B0_pub_Rev_3.06.zip

Description of software operation
---------------------------------

The following steps are taken in order to support reserving and
enabling the extra counter for Large Increment per Cycle events:

1. In the main x86 scheduler, we reduce the number of available
counters by the number of Large Increment per Cycle events being
scheduled, tracked by a new cpuc variable 'n_pair' and a new
amd_put_event_constraints_f17h().  This improves the counter
scheduler success rate.

2. In perf_assign_events(), if a counter is assigned to a Large
Increment event, we increment the current counter variable, so the
counter used for the Merge event is removed from assignment
consideration by upcoming event assignments.

3. In find_counter(), if a counter has been found for the Large
Increment event, we set the next counter as used, to prevent other
events from using it.

4. We perform steps 2 & 3 also in the x86 scheduler fastpath, i.e.,
we add Merge event accounting to the existing used_mask logic.

5. Finally, we add on the programming of Merge event to the
neighbouring PMC counters in the counter enable/disable{_all}
code paths.

Currently, software does not support a single PMU with mixed 48- and
64-bit counting, so Large increment event counts are limited to 48
bits.  In set_period, we zero-out the upper 16 bits of the count, so
the hardware doesn't copy them to the even counter's higher bits.

Simple invocation example showing counting 8 FLOPs per 256-bit/%ymm
vaddps instruction executed in a loop 100 million times:

perf stat -e cpu/fp_ret_sse_avx_ops.all/,cpu/instructions/ <workload>

Performance counter stats for '<workload>':

       800,000,000      cpu/fp_ret_sse_avx_ops.all/u
       300,042,101      cpu/instructions/u

Prior to this patch, the reported SSE/AVX FLOPs retired count would
be wrong.

[peterz: lots of renames and edits to the code]

Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
(cherry picked from commit 5738891229a25e9e678122a843cbf0466a456d0c)

Signed-off-by: Kamal Mostafa <kamal@canonical.com>

UBUNTU: SAUCE: selftests: net: ip_defrag: limit packet to 1000 fragments

The ip_defrag selftest will fail when run with a conntrack rule because
it might push more than a 1000 fragments through loopback. This will hit
the backlog limit, causing fragments to be dropped, leading to test
failures.

This is considered a real bug by Eric Dumazet, so the test change is
just a workaround so we can keep testing for other regressions while
avoiding this particular failure.

BugLink: https://bugs.launchpad.net/bugs/1870543
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

scsi: mpt3sas: Update drive version to 33.100.00.00

Update mpt3sas driver version from 32.100.00.00 to 33.100.00.00

Link: https://lore.kernel.org/r/20191226111333.26131-11-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit c53cf10ef6d9faeee9baa1fab824139c6f10a134)
Signed-off-by: Michael Reed <Michael.Reed@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/1863574
Update mpt3sas Driver to 33.100.00.00

Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

scsi: mpt3sas: Remove usage of device_busy counter

Remove usage of device_busy counter from driver. Instead of device_busy
counter now driver uses 'nr_active' counter of request_queue to get the
number of inflight request for a LUN.

Link: https://lore.kernel.org/r/20191226111333.26131-10-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit c50ed99cd56ee725d9e14dffec8e8f1641b8ca30)
Signed-off-by: Michael Reed <Michael.Reed@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/1863574
Update mpt3sas Driver to 33.100.00.00

Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

scsi: mpt3sas: Print function name in which cmd timed out

Print the function name in which MPT command got timed out. This will
facilitate debugging in which path corresponding MPT command got timeout in
first failure instance of log itself.

Link: https://lore.kernel.org/r/20191226111333.26131-9-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit c6bdb6a10892d1130638a5e28d1523a813e45d5e)
Signed-off-by: Michael Reed <Michael.Reed@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/1863574
Update mpt3sas Driver to 33.100.00.00

Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

scsi: mpt3sas: Optimize mpt3sas driver logging

This improves mpt3sas driver default debug information collection and
allows for a higher percentage of issues being able to be resolved with a
first-time data capture.  However, this improvement to balance the amount
of debug data captured with the performance of driver.

Enabled below print messages with out affecting the IO performance,

1. When task abort TM is received then print IO commands's timeout value
   and how much time this command has been outstanding.

2. Whenever hard reset occurs then print from where this hard reset has
   been issued.

3. Failure message should be displayed for failure scenarios without any
   logging level.

4. Added a print after driver successfully register or unregistered a
   target drive with the SML. This print will be useful for debugging the
   issue where the drive addition or deletion is hanging at SML.

5. During driver load time print request, reply, sense and config page
   pool's information such as its address, length and size. Also printed
   sg_tablesize information.

Link: https://lore.kernel.org/r/20191226111333.26131-8-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 5b061980e362820894d7d884370b37005bed23ec)
Signed-off-by: Michael Reed <Michael.Reed@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/1863574
Update mpt3sas Driver to 33.100.00.00

Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

scsi: mpt3sas: print in which path firmware fault occurred

When Firmware fault occurs then print in which path firmware fault has
occurred. This will be useful while debugging the firmware fault issues.

Link: https://lore.kernel.org/r/20191226111333.26131-7-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit c59777189433621392f6f5c82ecfc62f00a1232d)
Signed-off-by: Michael Reed <Michael.Reed@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/1863574
Update mpt3sas Driver to 33.100.00.00

Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

scsi: mpt3sas: Handle CoreDump state from watchdog thread

Watchdog thread polls for IOC state every 1 second. If it detects that IOC
state is in CoreDump state then it immediately stops the IOs and also
clears the outstanding commands issued to the HBA firmware and then it will
poll for IOC state to be out of CoreDump state and once it detects that IOC
state is changed from CoreDump state to Fault state (or) CoreDumpTOSec
number of seconds are elapsed then it will issue host reset operation and
moves the IOC state to Operational state and resumes the IOs.

Whenever any TM is received from SML then if driver detects the IOC state
is in CoreDump state then it will wait for CoreDump state to be cleared and
will host reset operation.

Link: https://lore.kernel.org/r/20191226111333.26131-6-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit fce0aa08792b3ae725395fa25d44507dee0b603b)
Signed-off-by: Michael Reed <Michael.Reed@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/1863574
Update mpt3sas Driver to 33.100.00.00

Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

scsi: mpt3sas: Add support IOCs new state named COREDUMP

New feature is added in HBA firmware where it copies the collected firmware
logs in flash region named 'CoreDump' whenever HBA firmware faults occur.

For copying the logs to CoreDump flash region firmware needs some time and
hence it has introduced a new IOC state named "CoreDump" State.

Whenever driver detects the CoreDump state then it means that some firmware
fault has occurred and firmware is copying the logs to the coredump flash
region. During this time driver should not perform any operation with the
HBA, driver should wait for HBA firmware to move the IOC state from
'CoreDump' state to 'Fault' state once it's done with copying the logs to
coredump region. Once driver detects the Fault state then it will issue the
diag reset/host reset operation to move the IOC state from Fault to
Operational state.

Here the valid IOC state transactions w.r.t to this CoreDump state feature,

Operational -> Fault:
The IOC transitions to the Fault state when an operational error occurs AND
CoreDump is not supported (or disabled) by the firmware(FW).

Operational -> CoreDump:
The IOC transitions to the CoreDump state when an operational error occurs
AND CoreDump is supported & enabled by the FW.

CoreDump -> Fault:
A transition from CoreDump state to Fault state happens when the FW
completes the CoreDump collection.

CoreDump -> Reset:
A transition out of the CoreDump state happens when the host sets the Reset
Adapter bit in the System Diagnostic Register (Hard Reset). This reset
action indicates that CoreDump took longer than the host time out.

Firmware informs the driver about the maximum time that driver has to wait
for firmware to transition the IOC state from 'CoreDump' to 'FAULT' state
through 'CoreDumpTOSec' field of ManufacturingPage11 page. if this
'CoreDumpTOSec' field value is zero then driver will wait for max 15
seconds.

Driver informs the HBA firmware that it supports this new IOC state named
'CoreDump' state by enabling COREDUMP_ENABLE flag in ConfigurationFlags
field of ioc init request message.

Current patch handles the CoreDump state only during HBA initialization and
release scenarios where watchdog thread (which polls the IOC state in every
one second) is disabled. Next subsequent patch handle the CoreDump state
when watchdog thread is enabled.

During HBA initialization or release execution time if driver detects the
CoreDump state then driver will wait for maximum CoreDumpTOSec value
seconds for FW to copy the logs. After that it will issue the diag reset
operation to move the IOC state to Operational state.

Link: https://lore.kernel.org/r/20191226111333.26131-5-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit e8c2307e6a690db9aaff84153b2857c5c4dfd969)
Signed-off-by: Michael Reed <Michael.Reed@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/1863574
Update mpt3sas Driver to 33.100.00.00

Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

scsi: mpt3sas: renamed _base_after_reset_handler function

Renamed _base_after_reset_handler function to
_base_clear_outstanding_commands so that it can be used in multiple
scenarios with suitable name which matches with the operation it does.

Also renamed its child functions. No functional changes.

Link: https://lore.kernel.org/r/20191226111333.26131-4-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 36c6c7f75b0998f5a4b5c79cbb94ee1ab4ee35c0)
Signed-off-by: Michael Reed <Michael.Reed@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/1863574
Update mpt3sas Driver to 33.100.00.00

Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

scsi: mpt3sas: Add support for NVMe shutdown

Introduce function _scsih_nvme_shutdown() to issue IO Unit Control message
to IOC firmware with operation code 'shutdown'. This causes IOC firmware to
issue NVMe shutdown commands to all NVMe drives attached to it.

NVMe Shutdown:

NVMe devices need to have a specific shutdown sequence performed before
power is removed. For this, the IOC firmware needs to be notified when the
system is being shutdown. So during the system shutdown time, driver issues
an IO Unit Control request with operation code MPI26_CTRL_OP_SHUTDOWN to
inform firmware that a shutdown is initiated.

This shutdown command is issued only if NVMe devices are attached to the
controller.

During each NVMe device addition, driver reads pcie device page2 to get
shutdown latency (e.g. drive's RTD3 Entry Latency) and updates the max
latency value among the added NVMe drives in ioc->max_shutdown_latency.
This is used as the timeout value for IO Unit Control command at the time
of shutdown.

When a NVMe drive is removed and its shutdown latency matches which
ioc->max_shutdown_latency then ioc->max_shutdown_latency is updated to next
max value (by iterating over the list of available devices). If the
shutdown latency is 0, then default timeout is set to six seconds.

Link: https://lore.kernel.org/r/20191226111333.26131-3-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit d3f623ae8e0323ca434ee9029100312a8be37773)
Signed-off-by: Michael Reed <Michael.Reed@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/1863574
Update mpt3sas Driver to 33.100.00.00

Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

scsi: mpt3sas: Update MPI Headers to v02.00.57

Update MPI Headers to version 02.00.57.

Link: https://lore.kernel.org/r/20191226111333.26131-2-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 1ade26b616cc2da0b7277a97e3799c99bae0655b)
Signed-off-by: Michael Reed <Michael.Reed@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/1863574
Update mpt3sas Driver to 33.100.00.00

Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

scsi: mpt3sas: change allocation option

From an interrupt handler path memory may be allocated using
GFP_KERNEL, replace it with GFP_ATOMIC.
_base_interrupt->_scsih_io_done->_scsih_smart_predicted_fault

Link: https://lore.kernel.org/r/20191024152835.6177-1-thenzl@redhat.com
Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 5bb2f743cdaa6da618e77a6aab5c38b46072365b)
Signed-off-by: Michael Reed <Michael.Reed@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/1863574
Update mpt3sas Driver to 33.100.00.00

Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

scsi: mpt3sas: Clean up some indenting

This line is indented too far so it's a bit confusing.

Link: https://lore.kernel.org/r/20191004100615.GA823@mwanda
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 3524a38e594dd5f090cbc3226e5f47cb4067fac7)
Signed-off-by: Michael Reed <Michael.Reed@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/1863574
Update mpt3sas Driver to 33.100.00.00

Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

scsi: mpt3sas: Bump mpt3sas driver version to 32.100.00.00

Bump mpt3sas driver version to 32.100.00.00

Link: https://lore.kernel.org/r/1568379890-18347-14-git-send-email-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit d8b2625f4699a36b1753d87e96b0c50a4531c065)
Signed-off-by: Michael Reed <Michael.Reed@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/1863574
Update mpt3sas Driver to 33.100.00.00

Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

scsi: mpt3sas: Fix module parameter max_msix_vectors

Load driver with module parameter "max_msix_vectors". Value provided in
module parameter is not used by mpt3sas driver. Driver loads with max
controller supported MSI-X value.

In _base_alloc_irq_vectors use reply_queue_count which is determined using
user provided msix value insted of ioc->msix_vector_count which tells max
supported msix value of the controller.

Link: https://lore.kernel.org/r/1568379890-18347-13-git-send-email-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 9e64fd1e65f72d229f96fcf390576e772770a01c)
Signed-off-by: Michael Reed <Michael.Reed@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/1863574
Update mpt3sas Driver to 33.100.00.00

Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

scsi: mpt3sas: Use Component img header to get Package ver

The firmware image layout has been changed for Aero controllers. All
compatible HBAs have to get Firmware Package version from Component Image
Header layout.

The Signature field in FW header is set to 0xEB000042 for products
compatible with Component Image Header.

For compatible controllers, driver fetches firmware package version from
ApplicationSpecific field of Component Image Header.

Link: https://lore.kernel.org/r/1568379890-18347-11-git-send-email-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit b06ff10249036eec74fa8565503ac40d0ee92213)
Signed-off-by: Michael Reed <Michael.Reed@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/1863574
Update mpt3sas Driver to 33.100.00.00

Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

scsi: mpt3sas: Fail release cmnd if diag buffer is released

Return the diag buffer release command with -EINVAL status if the buffer is
already released.

Link: https://lore.kernel.org/r/1568379890-18347-10-git-send-email-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 29f571f8b4cc652cae9244630f714a610549d301)
Signed-off-by: Michael Reed <Michael.Reed@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/1863574
Update mpt3sas Driver to 33.100.00.00

Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

scsi: mpt3sas: Add app owned flag support for diag buffer

Added a new status flag named MPT3_DIAG_BUFFER_IS_APP_OWNED and it will set
whenever application registers the diag buffer & it will be cleared when
application unregisters the buffer.

When this flag is enabled, and if application issues diag buffer register
command without releasing the buffer, then register command will be failed
with -EINVAL status by saying that this buffer is already registered by the
application.

When user issues a trace buffer register command through sysfs parameter,
and if trace buffer is in released stated but not yet unregistered by the
application which was owning it, then driver will unregister the buffer by
itself and freshly register the 1MB sized trace buffer with the HBA
firmware.

Link: https://lore.kernel.org/r/1568379890-18347-9-git-send-email-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit a8a6cbcd038de4ee3722c17edd7a4d84ce423f7d)
Signed-off-by: Michael Reed <Michael.Reed@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/1863574
Update mpt3sas Driver to 33.100.00.00

Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

scsi: mpt3sas: Reuse diag buffer allocated at load time

The diag buffer which is allocated during driver load time or through sysfs
parameter is marked as driver allocated diag buffer.
MPT3_DIAG_BUFFER_IS_DRIVER_ALLOCATED bit will be set for this buffer.

This buffer won't be de-allocated even when application issues unregister
command, driver just clears the registered status bit. Same buffer will be
reused while re-registering the same diag buffer type by any application.
While re-registering the same diag buffer type application has to register
with the same size that the buffer was allocated during driver load
time. This buffer size can be read by the application by issuing diag
'query' command.

This always makes sure that the memory is available for applications for
collecting the firmware logs. Only thing is that this won't allow the
application to re-register the diag buffer with different size, but the
buffer size which is allocated during driver load time will be enough for
most of the cases for collecting the firmware logs.

Link: https://lore.kernel.org/r/1568379890-18347-8-git-send-email-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit a066f4c31359d07b1a2c5144b4b9a29901365fd0)
Signed-off-by: Michael Reed <Michael.Reed@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/1863574
Update mpt3sas Driver to 33.100.00.00

Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

scsi: mpt3sas: clear release bit when buffer reregistered

Clear MPT3_DIAG_BUFFER_IS_RELEASED bit once diag buffer is re-registered
after reading the buffer, else driver won't release the buffer and return
the 'diag release' command with -EINVAL status saying that buffer is
already released.

Link: https://lore.kernel.org/r/1568379890-18347-7-git-send-email-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit dd180e4eedfd85b80020a6fd566601f4765a9d69)
Signed-off-by: Michael Reed <Michael.Reed@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/1863574
Update mpt3sas Driver to 33.100.00.00

Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

scsi: mpt3sas: Maintain owner of buffer through UniqueID

Application A has registered a diag buffer and looking for particular event
to happen to release & read the trace buffer. Meanwhile application B has
unregistered the diag buffer and now Application A can't get the required
diag buffer. So proper diag buffer ownership is missing.

Each application has to maintain its own Unique ID. Now driver has to save
the Application's UniqueID for each diag buffer type when diag buffer is
registered. And driver has to allow 'release', 'read' & 'unregister' diag
commands only if application's UniqueID matches with saved UniqueID for the
corresponding diag buffer type.

When diag buffer is registered by the driver, then the UniqueID saved by
the driver is "BRCM" (i.e. 0x4252434D) for SAS3 and above generations HBA
devices. For SAS2 HBAs, driver keeps the legacy UniqueID 0x07075900 for
maintaining compatibility with the legacy SAS2 application and this
improvement won't be applicable for SAS2 HBA devices.

Any application can own the buffer registered by the driver by sending
diag register request to driver with same buffer type and size
(Application can get the buffer size by sending 'query' command). Then
driver changes the ownership of the buffer by saving application's
UniqueID for that corresponding buffer type.

Also, application can re-register the diag buffer with same size without
un-registering it, but diag buffer should be released before re-registering
it. By allowing this, driver no need to deallocate and allocate a new
buffer for re-register command, same buffer can be re-used.

Link: https://lore.kernel.org/r/1568379890-18347-6-git-send-email-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 08e7378ee331a803cfdd91c512a3dea040f1da79)
Signed-off-by: Michael Reed <Michael.Reed@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/1863574
Update mpt3sas Driver to 33.100.00.00

Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

scsi: mpt3sas: Free diag buffer without any status check

Memory leak can happen when diag buffer is released but not unregistered
(where buffer is deallocated) by the user. During module unload time driver
is not deallocating the buffer if the buffer is in released state.

Deallocate the diag buffer during module unload time without any diag
buffer status checks.

Link: https://lore.kernel.org/r/1568379890-18347-5-git-send-email-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 764f472ba4a7a0c18107ebfbe1a9f1f5f5a1e411)
Signed-off-by: Michael Reed <Michael.Reed@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/1863574
Update mpt3sas Driver to 33.100.00.00

Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

scsi: mpt3sas: Display message before releasing diag buffer

Display message before releasing the diag buffer so that user knows which
event caused the release of diag buffer.

Releasing of diag buffer means HBA firmware stops posting the firmware logs
on the registered diag buffer.

Link: https://lore.kernel.org/r/1568379890-18347-3-git-send-email-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 4bc50dc1afb7b77dfc80a1f8f0742b14d6f6e376)
Signed-off-by: Michael Reed <Michael.Reed@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/1863574
Update mpt3sas Driver to 33.100.00.00

Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

scsi: mpt3sas: Register trace buffer based on NVDATA settings

Currently if user wishes to enable the host trace buffer during driver load
time, then user has to load the driver with module parameter
'diag_buffer_enable' set to one.

Alternatively now the user can enable host trace buffer by enabling the
following fields in manufacturing page11 in NVDATA (nvdata xml is used
while building HBA firmware image):

* HostTraceBufferMaxSizeKB - Maximum trace buffer size in KB that host can
                              allocate,

* HostTraceBufferMinSizeKB - Minimum trace buffer size in KB atleast host
                              should allocate,

* HostTraceBufferDecrementSizeKB - size by which host can reduce from
                              buffer size and retry the buffer allocation
                              when buffer allocation failed with previous
                              calculated buffer size.

The driver will register the trace buffer automatically without any module
parameter during boot time when above fields are enabled in manufacturing
page11 in HBA firmware.

Driver follows the following algorithm for enabling the host trace buffer
during driver load time:

* If user has loaded the driver with module parameter 'diag_buffer_enable'
  set to one, then driver allocates 2MB buffer and registers this buffer
  with HBA firmware for capturing the firmware trace logs.

* Else driver reads manufacture page11 data and checks whether
  HostTraceBufferMaxSizeKB filed is zero or not?

  - If HostTraceBufferMaxSizeKB is non-zero then driver tries to allocate
    HostTraceBufferMaxSizeKB size of memory. If the buffer allocation is
    successful, then it will register this buffer with HBA firmware, else
    in a loop the driver will try again by reducing the current buffer size
    with HostTraceBufferDecrementSizeKB size until memory allocation is
    successful or buffer size falls below HostTraceBufferMinSizeKB. If the
    memory allocation is successful, then the buffer will be registered
    with the firmware. Else, if the buffer size falls below the
    HostTraceBufferMinSizeKB, then driver won't register trace buffer with
    HBA firmware.

  - If HostTraceBufferMaxSizeKB is zero, then driver won't register trace
    buffer with HBA firmware.

Link: https://lore.kernel.org/r/1568379890-18347-2-git-send-email-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit d04a6edfed0b2e99d9a8385503737944db3d5649)
Signed-off-by: Michael Reed <Michael.Reed@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/1863574
Update mpt3sas Driver to 33.100.00.0

Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

scsi: megaraid_sas: fixup MSIx interrupt setup during resume

Streamline resume workflow by using the same functions for enabling MSIx
interrupts as used during initialisation. Without it the driver might
crash during resume with:

WARNING: CPU: 2 PID: 4306 at ../drivers/pci/msi.c:1303 pci_irq_get_affinity+0x3b/0x90

Link: https://lore.kernel.org/r/20200113132609.69536-1-hare@suse.de
Signed-off-by: Hannes Reinecke <hare@suse.de>
Acked-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 92b4f9d150593a7a78d9872c2d5dc05ffae4521b)
Signed-off-by: Michael Reed <Michael.Reed@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/1863581
Update the Megaraid_sas driver to version 07.713.01.00-rc1 from 07.710.50.00-rc1

Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

scsi: megaraid_sas: Update driver version to 07.713.01.00-rc1

Link: https://lore.kernel.org/r/1579000882-20246-12-git-send-email-anand.lodnoor@broadcom.com
Signed-off-by: Anand Lodnoor <anand.lodnoor@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 824b72db5086b84d944cea595a62ef1158f63af1)
Signed-off-by: Michael Reed <Michael.Reed@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/1863581
Update the Megaraid_sas driver to version 07.713.01.00-rc1 from 07.710.50.00-rc1

Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

scsi: megaraid_sas: Use Block layer API to check SCSI device in-flight IO requests

Remove usage of device_busy counter from driver. Instead of device_busy
counter now driver uses 'nr_active' counter of request_queue to get the
number of inflight request for a LUN.

Link: https://lore.kernel.org/r/1579000882-20246-11-git-send-email-anand.lodnoor@broadcom.com
Link : https://patchwork.kernel.org/patch/11249297/
Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com>
Signed-off-by: Anand Lodnoor <anand.lodnoor@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 4d1634b8d12ecb844f0d1af0b79c703cf9911484)
Signed-off-by: Michael Reed <Michael.Reed@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/1863581
Update the Megaraid_sas driver to version 07.713.01.00-rc1 from 07.710.50.00-rc1

Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

scsi: megaraid_sas: Limit the number of retries for the IOCTLs causing firmware fault

IOCTLs causing firmware fault may end up in failed controller resets and
finally killing the adapter.

This patch fixes this problem as stated below:

In OCR sequence, driver will attempt refiring pended IOCTLs upto two times.
If first two attempts fail, then in third attempt driver will return pended
IOCTLs with EBUSY status to application. These changes are done to ensure
if any of pended IOCTLs is causing firmware fault and resulting into OCR
failure, then in last attempt of OCR driver will refrain firing it to
firmware and saving adapter from being killed due to faulty IOCTL.

Link: https://lore.kernel.org/r/1579000882-20246-10-git-send-email-anand.lodnoor@broadcom.com
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com>
Signed-off-by: Anand Lodnoor <anand.lodnoor@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 56ee0c585602d32058d19da0d3b664be5bc374ba)
Signed-off-by: Michael Reed <Michael.Reed@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/1863581
Update the Megaraid_sas driver to version 07.713.01.00-rc1 from 07.710.50.00-rc1

Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

scsi: megaraid_sas: Re-Define enum DCMD_RETURN_STATUS

DCMD_INIT is introduced to indicate the initial DCMD status, which was
earlier set to MFI status. DCMD_BUSY indicates the resource is busy or
locked.

Link: https://lore.kernel.org/r/1579000882-20246-8-git-send-email-anand.lodnoor@broadcom.com
Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Anand Lodnoor <anand.lodnoor@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 201a810cc188af87d5798a383dc5115509e752ae)
Signed-off-by: Michael Reed <Michael.Reed@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/1863581
Update the Megaraid_sas driver to version 07.713.01.00-rc1 from 07.710.50.00-rc1

Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

scsi: megaraid_sas: Do not set HBA Operational if FW is not in operational state

After issuing a adapter reset, driver blindly used to set adprecovery flag
to OPERATIONAL state. Add a check to see if the FW is operational before
setting the flag and marking reset adapter successful.

Link: https://lore.kernel.org/r/1579000882-20246-7-git-send-email-anand.lodnoor@broadcom.com
Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Anand Lodnoor <anand.lodnoor@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit eeb63c23ffe1704990202af279400bf2b448ad89)
Signed-off-by: Michael Reed <Michael.Reed@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/1863581
Update the Megaraid_sas driver to version 07.713.01.00-rc1 from 07.710.50.00-rc1

Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

scsi: megaraid_sas: Do not kill HBA if JBOD Seqence map or RAID map is disabled

At the time of firmware initialization, if JBOD map or RAID map is not
available, driver can function without these features in a limited
functionality mode.

Link: https://lore.kernel.org/r/1579000882-20246-6-git-send-email-anand.lodnoor@broadcom.com
Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Anand Lodnoor <anand.lodnoor@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 9330a0fd827a02234ebdb536810d6adbbb6a2aaa)
Signed-off-by: Michael Reed <Michael.Reed@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/1863581
Update the Megaraid_sas driver to version 07.713.01.00-rc1 from 07.710.50.00-rc1

Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

scsi: megaraid_sas: Do not kill host bus adapter, if adapter is already dead

Link: https://lore.kernel.org/r/1579000882-20246-5-git-send-email-anand.lodnoor@broadcom.com
Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Anand Lodnoor <anand.lodnoor@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit eb974f34bb9daa4b7309800beb596ee173f74eda)
Signed-off-by: Michael Reed <Michael.Reed@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/1863581
Update the Megaraid_sas driver to version 07.713.01.00-rc1 from 07.710.50.00-rc1

Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

scsi: megaraid_sas: Update optimal queue depth for SAS and NVMe devices

Ideally, optimal queue depth will be provided by firmware. The driver
defines will be used as a fallback mechanism in case the FW assisted QD is
not supported. The driver defined values provide optimal queue depth for
most of the drives and the workloads, as is learned from the firmware
assisted QD results.

Link: https://lore.kernel.org/r/1579000882-20246-4-git-send-email-anand.lodnoor@broadcom.com
Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com>
Signed-off-by: Anand Lodnoor <anand.lodnoor@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 6e73550670ed1c07779706bb6cf61b99c871fc42)
Signed-off-by: Michael Reed <Michael.Reed@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/1863581
Update the Megaraid_sas driver to version 07.713.01.00-rc1 from 07.710.50.00-rc1

Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

scsi: megaraid_sas: Set no_write_same only for Virtual Disk

Disable WRITE_SAME (no_write_same) for Virtual Disks only. For System PDs
and EPDs (Enhanced PDs), WRITE_SAME need not be disabled by default.

Link: https://lore.kernel.org/r/1579000882-20246-3-git-send-email-anand.lodnoor@broadcom.com
Signed-off-by: Anand Lodnoor <anand.lodnoor@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit a7faf81d7858b504279713d6cb98053f0ff00082)
Signed-off-by: Michael Reed <Michael.Reed@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/1863581
Update the Megaraid_sas driver to version 07.713.01.00-rc1 from 07.710.50.00-rc1

Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

scsi: megaraid_sas: Reset adapter if FW is not in READY state after device resume

After device resume we expect the firmware to be in READY state.
Transition to READY might fail due to unhandled exceptions, such as an
internal error or a hardware failure. Retry initiating chip reset and wait
for the controller to come to ready state.

Link: https://lore.kernel.org/r/1579000882-20246-2-git-send-email-anand.lodnoor@broadcom.com
Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com>
Signed-off-by: Anand Lodnoor <anand.lodnoor@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 499e7246d6daa7c2655958e81febfbd76af1bc75)
Signed-off-by: Michael Reed <Michael.Reed@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/1863581
Update the Megaraid_sas driver to version 07.713.01.00-rc1 from 07.710.50.00-rc1

Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

scsi: megaraid_sas: Make poll_aen_lock static

Fix sparse warning:

drivers/scsi/megaraid/megaraid_sas_base.c:187:12:
warning: symbol 'poll_aen_lock' was not declared. Should it be static?

Link: https://lore.kernel.org/r/20191125144454.22680-1-yuehaibing@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 73374b39b01e91b90ff683dc259b89f528c4f367)
Signed-off-by: Michael Reed <Michael.Reed@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/1863581
Update the Megaraid_sas driver to version 07.713.01.00-rc1 from 07.710.50.00-rc1

Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

compat_ioctl: use correct compat_ptr() translation in drivers

A handful of drivers all have a trivial wrapper around their ioctl
handler, but don't call the compat_ptr() conversion function at the
moment. In practice this does not matter, since none of them are used
on the s390 architecture and for all other architectures, compat_ptr()
does not do anything, but using the new compat_ptr_ioctl()
helper makes it more correct in theory, and simplifies the code.

I checked that all ioctl handlers in these files are compatible
and take either pointer arguments or no argument.

Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
(cherry picked from commit 01b8bca81e181ccca475e1fdb92ebb00d9d9b547)
Signed-off-by: Michael Reed <Michael.Reed@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/1863581
Update the Megaraid_sas driver to version 07.713.01.00-rc1 from 07.710.50.00-rc1

Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

scsi: megaraid_sas: remove unused variables 'debugBlk','fusion'

Fixes gcc '-Wunused-but-set-variable' warning:

drivers/scsi/megaraid/megaraid_sas_fp.c: In function MR_GetSpanBlock:
drivers/scsi/megaraid/megaraid_sas_fp.c:400:16: warning: variable debugBlk set but not used [-Wunused-but-set-variable]
drivers/scsi/megaraid/megaraid_sas_fp.c: In function mr_spanset_get_phy_params:
drivers/scsi/megaraid/megaraid_sas_fp.c:713:25: warning: variable fusion set but not used [-Wunused-but-set-variable]
drivers/scsi/megaraid/megaraid_sas_fp.c: In function MR_GetPhyParams:
drivers/scsi/megaraid/megaraid_sas_fp.c:815:25: warning: variable fusion set but not used [-Wunused-but-set-variable]

'debugBlk' is introduced by commit 9c915a8c99bc ("[SCSI] megaraid_sas:
Add 9565/9285 specific code"), but never used, so remove it

'fusion' is not used since commit c365178f3147 ("scsi: megaraid_sas:
use adapter_type for all gen controllers")

Link: https://lore.kernel.org/r/1570605824-89133-1-git-send-email-zhengbin13@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 8cfb8e40d686a756428c627dccfaff28ca70dd8b)
Signed-off-by: Michael Reed <Michael.Reed@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/1863581
Update the Megaraid_sas driver to version 07.713.01.00-rc1 from 07.710.50.00-rc1

Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

scsi: megaraid_sas: Unique names for MSI-X vectors

Currently, MSI-X vectors name appears in /proc/interrupts is "megasas"
which is same for all the vectors. This patch provides a unique name for
all megaraid_sas controllers and their associated MSI-X interrupts.

Link: https://lore.kernel.org/r/20191007051828.12294-1-chandrakanth.patil@broadcom.com
Suggested-by: Konstantin Shalygin <k0ste@k0ste.ru>
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit ff7ca7fd03ce7dc225b3fc653d6877d2485c5716)
Signed-off-by: Michael Reed <Michael.Reed@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/1863581
Update the Megaraid_sas driver to version 07.713.01.00-rc1 from 07.710.50.00-rc1

Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

UBUNTU: Start new release

Ignore: yes
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

UBUNTU: Ubuntu-5.4.0-21.25

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

UBUNTU: SAUCE: bpf: undo incorrect __reg_bound_offset32 handling

Anatoly has been fuzzing with kBdysch harness and reported a hang in one
of the outcomes again:

  0: (b7) r0 = 808464432
  1: (7f) r0 >>= r0
  2: (14) w0 -= 808464432
  3: (07) r0 += 808464432
  4: (b7) r1 = 808464432
  5: (de) if w1 s<= w0 goto pc+0
   R0_w=invP(id=0,umin_value=808464432,umax_value=5103431727,var_off=(0x30303020;0x10000001f)) R1_w=invP808464432 R10=fp0
  6: (07) r0 += -2144337872
  7: (14) w0 -= -1607454672
  8: (25) if r0 > 0x30303030 goto pc+0
   R0_w=invP(id=0,umin_value=271581184,umax_value=271581311,var_off=(0x10300000;0x7f)) R1_w=invP808464432 R10=fp0
  9: (76) if w0 s>= 0x303030 goto pc+2
  12: (95) exit

  from 8 to 9: safe

  from 5 to 6: R0_w=invP(id=0,umin_value=808464432,umax_value=5103431727,var_off=(0x30303020;0x10000001f)) R1_w=invP808464432 R10=fp0
  6: (07) r0 += -2144337872
  7: (14) w0 -= -1607454672
  8: (25) if r0 > 0x30303030 goto pc+0
   R0_w=invP(id=0,umin_value=271581184,umax_value=271581311,var_off=(0x10300000;0x7f)) R1_w=invP808464432 R10=fp0
  9: safe

  from 8 to 9: safe
  verification time 589 usec
  stack depth 0
  processed 17 insns (limit 1000000) [...]

The underlying program was xlated as follows:

  # bpftool p d x i 9
   0: (b7) r0 = 808464432
   1: (7f) r0 >>= r0
   2: (14) w0 -= 808464432
   3: (07) r0 += 808464432
   4: (b7) r1 = 808464432
   5: (de) if w1 s<= w0 goto pc+0
   6: (07) r0 += -2144337872
   7: (14) w0 -= -1607454672
   8: (25) if r0 > 0x30303030 goto pc+0
   9: (76) if w0 s>= 0x303030 goto pc+2
  10: (05) goto pc-1
  11: (05) goto pc-1
  12: (95) exit

The verifier rewrote original instructions it recognized as dead code with
'goto pc-1', but reality differs from verifier simulation in that we're
actually able to trigger a hang due to hitting the 'goto pc-1' instructions.

Taking different examples to make the issue more obvious: in this example
we're probing bounds on a completely unknown scalar variable in r1:

  [...]
  5: R0_w=inv1 R1_w=inv(id=0) R10=fp0
  5: (18) r2 = 0x4000000000
  7: R0_w=inv1 R1_w=inv(id=0) R2_w=inv274877906944 R10=fp0
  7: (18) r3 = 0x2000000000
  9: R0_w=inv1 R1_w=inv(id=0) R2_w=inv274877906944 R3_w=inv137438953472 R10=fp0
  9: (18) r4 = 0x400
  11: R0_w=inv1 R1_w=inv(id=0) R2_w=inv274877906944 R3_w=inv137438953472 R4_w=inv1024 R10=fp0
  11: (18) r5 = 0x200
  13: R0_w=inv1 R1_w=inv(id=0) R2_w=inv274877906944 R3_w=inv137438953472 R4_w=inv1024 R5_w=inv512 R10=fp0
  13: (2d) if r1 > r2 goto pc+4
   R0_w=inv1 R1_w=inv(id=0,umax_value=274877906944,var_off=(0x0; 0x7fffffffff)) R2_w=inv274877906944 R3_w=inv137438953472 R4_w=inv1024 R5_w=inv512 R10=fp0
  14: R0_w=inv1 R1_w=inv(id=0,umax_value=274877906944,var_off=(0x0; 0x7fffffffff)) R2_w=inv274877906944 R3_w=inv137438953472 R4_w=inv1024 R5_w=inv512 R10=fp0
  14: (ad) if r1 < r3 goto pc+3
   R0_w=inv1 R1_w=inv(id=0,umin_value=137438953472,umax_value=274877906944,var_off=(0x0; 0x7fffffffff)) R2_w=inv274877906944 R3_w=inv137438953472 R4_w=inv1024 R5_w=inv512 R10=fp0
  15: R0=inv1 R1=inv(id=0,umin_value=137438953472,umax_value=274877906944,var_off=(0x0; 0x7fffffffff)) R2=inv274877906944 R3=inv137438953472 R4=inv1024 R5=inv512 R10=fp0
  15: (2e) if w1 > w4 goto pc+2
   R0=inv1 R1=inv(id=0,umin_value=137438953472,umax_value=274877906944,var_off=(0x0; 0x7f00000000)) R2=inv274877906944 R3=inv137438953472 R4=inv1024 R5=inv512 R10=fp0
  16: R0=inv1 R1=inv(id=0,umin_value=137438953472,umax_value=274877906944,var_off=(0x0; 0x7f00000000)) R2=inv274877906944 R3=inv137438953472 R4=inv1024 R5=inv512 R10=fp0
  16: (ae) if w1 < w5 goto pc+1
   R0=inv1 R1=inv(id=0,umin_value=137438953472,umax_value=274877906944,var_off=(0x0; 0x7f00000000)) R2=inv274877906944 R3=inv137438953472 R4=inv1024 R5=inv512 R10=fp0
  [...]

We're first probing lower/upper bounds via jmp64, later we do a similar
check via jmp32 and examine the resulting var_off there. After fall-through
in insn 14, we get the following bounded r1 with 0x7fffffffff unknown marked
bits in the variable section.

Thus, after knowing r1 <= 0x4000000000 and r1 >= 0x2000000000:

  max: 0b100000000000000000000000000000000000000 / 0x4000000000
  var: 0b111111111111111111111111111111111111111 / 0x7fffffffff
  min: 0b010000000000000000000000000000000000000 / 0x2000000000

Now, in insn 15 and 16, we perform a similar probe with lower/upper bounds
in jmp32.

Thus, after knowing r1 <= 0x4000000000 and r1 >= 0x2000000000 and
                    w1 <= 0x400        and w1 >= 0x200:

  max: 0b100000000000000000000000000000000000000 / 0x4000000000
  var: 0b111111100000000000000000000000000000000 / 0x7f00000000
  min: 0b010000000000000000000000000000000000000 / 0x2000000000

The lower/upper bounds haven't changed since they have high bits set in
u64 space and the jmp32 tests can only refine bounds in the low bits.

However, for the var part the expectation would have been 0x7f000007ff
or something less precise up to 0x7fffffffff. A outcome of 0x7f00000000
is not correct since it would contradict the earlier probed bounds
where we know that the result should have been in [0x200,0x400] in u32
space. Therefore, tests with such info will lead to wrong verifier
assumptions later on like falsely predicting conditional jumps to be
always taken, etc.

The issue here is that __reg_bound_offset32()'s implementation from
commit 581738a681b6 ("bpf: Provide better register bounds after jmp32
instructions") makes an incorrect range assumption:

  static void __reg_bound_offset32(struct bpf_reg_state *reg)
  {
        u64 mask = 0xffffFFFF;
        struct tnum range = tnum_range(reg->umin_value & mask,
                                       reg->umax_value & mask);
        struct tnum lo32 = tnum_cast(reg->var_off, 4);
        struct tnum hi32 = tnum_lshift(tnum_rshift(reg->var_off, 32), 32);

        reg->var_off = tnum_or(hi32, tnum_intersect(lo32, range));
  }

In the above walk-through example, __reg_bound_offset32() as-is chose
a range after masking with 0xffffffff of [0x0,0x0] since umin:0x2000000000
and umax:0x4000000000 and therefore the lo32 part was clamped to 0x0 as
well. However, in the umin:0x2000000000 and umax:0x4000000000 range above
we'd end up with an actual possible interval of [0x0,0xffffffff] for u32
space instead.

In case of the original reproducer, the situation looked as follows at
insn 5 for r0:

  [...]
  5: R0_w=invP(id=0,umin_value=808464432,umax_value=5103431727,var_off=(0x0; 0x1ffffffff)) R1_w=invP808464432 R10=fp0
                               0x30303030           0x13030302f
  5: (de) if w1 s<= w0 goto pc+0
   R0_w=invP(id=0,umin_value=808464432,umax_value=5103431727,var_off=(0x30303020; 0x10000001f)) R1_w=invP808464432 R10=fp0
                             0x30303030           0x13030302f
  [...]

After the fall-through, we similarly forced the var_off result into
the wrong range [0x30303030,0x3030302f] suggesting later on that fixed
bits must only be of 0x30303020 with 0x10000001f unknowns whereas such
assumption can only be made when both bounds in hi32 range match.

Originally, I was thinking to fix this by moving reg into a temp reg and
use proper coerce_reg_to_size() helper on the temp reg where we can then
based on that define the range tnum for later intersection:

  static void __reg_bound_offset32(struct bpf_reg_state *reg)
  {
        struct bpf_reg_state tmp = *reg;
        struct tnum lo32, hi32, range;

        coerce_reg_to_size(&tmp, 4);
        range = tnum_range(tmp.umin_value, tmp.umax_value);
        lo32 = tnum_cast(reg->var_off, 4);
        hi32 = tnum_lshift(tnum_rshift(reg->var_off, 32), 32);
        reg->var_off = tnum_or(hi32, tnum_intersect(lo32, range));
  }

In the case of the concrete example, this gives us a more conservative unknown
section. Thus, after knowing r1 <= 0x4000000000 and r1 >= 0x2000000000 and
                             w1 <= 0x400        and w1 >= 0x200:

  max: 0b100000000000000000000000000000000000000 / 0x4000000000
  var: 0b111111111111111111111111111111111111111 / 0x7fffffffff
  min: 0b010000000000000000000000000000000000000 / 0x2000000000

However, above new __reg_bound_offset32() has no effect on refining the
knowledge of the register contents. Meaning, if the bounds in hi32 range
mismatch we'll get the identity function given the range reg spans
[0x0,0xffffffff] and we cast var_off into lo32 only to later on binary
or it again with the hi32.

Likewise, if the bounds in hi32 range match, then we mask both bounds
with 0xffffffff, use the resulting umin/umax for the range to later
intersect the lo32 with it. However, _prior_ called __reg_bound_offset()
did already such intersection on the full reg and we therefore would only
repeat the same operation on the lo32 part twice.

Given this has no effect and the original commit had false assumptions,
this patch reverts the code entirely. The bounds refinement would need
a significantly more complex approach which is currently being worked on
via [0] (but far from stable material). After the revert, the original
reported program gets correctly rejected as follows:

  1: (7f) r0 >>= r0
  2: (14) w0 -= 808464432
  3: (07) r0 += 808464432
  4: (b7) r1 = 808464432
  5: (de) if w1 s<= w0 goto pc+0
   R0_w=invP(id=0,umin_value=808464432,umax_value=5103431727,var_off=(0x0; 0x1ffffffff)) R1_w=invP808464432 R10=fp0
  6: (07) r0 += -2144337872
  7: (14) w0 -= -1607454672
  8: (25) if r0 > 0x30303030 goto pc+0
   R0_w=invP(id=0,umax_value=808464432,var_off=(0x0; 0x3fffffff)) R1_w=invP808464432 R10=fp0
  9: (76) if w0 s>= 0x303030 goto pc+2
   R0=invP(id=0,umax_value=3158063,var_off=(0x0; 0x3fffff)) R1=invP808464432 R10=fp0
  10: (30) r0 = *(u8 *)skb[808464432]
  BPF_LD_[ABS|IND] uses reserved fields
  processed 11 insns (limit 1000000) [...]

  [0] https://lore.kernel.org/bpf/158507130343.15666.8018068546764556975.stgit@john-Precision-5820-Tower/T/

Fixes: 581738a681b6 ("bpf: Provide better register bounds after jmp32 instructions")
Reported-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
CVE-2020-8835
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

UBUNTU: Start new release

Ignore: yes
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

UBUNTU: Ubuntu-5.4.0-20.24

Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

UBUNTU: SAUCE: (lockdown) Reduce lockdown level to INTEGRITY for secure boot

Setting this to CONFIDENTIALITY under secure boot restricts many
useful features. Reduce this to INTEGRITY, which provides the
necessary level of restriction to protect the running kernel.

Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

UBUNTU: Start new release

Ignore: yes
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

UBUNTU: Ubuntu-5.4.0-19.23

Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

UBUNTU: [Config] gcc version updateconfigs

Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

UBUNTU: link-to-tracker: update tracking bug

BugLink: https://bugs.launchpad.net/bugs/1868347
Properties: no-test-build
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

UBUNTU: update dkms package versions

BugLink: https://bugs.launchpad.net/bugs/1786013
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

UBUNTU: [Packaging] update helper scripts

BugLink: https://bugs.launchpad.net/bugs/1786013
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

Linux 5.4.27

BugLink: https://bugs.launchpad.net/bugs/1868538
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

ipv4: ensure rcu_read_lock() in cipso_v4_error()

BugLink: https://bugs.launchpad.net/bugs/1868538
commit 3e72dfdf8227b052393f71d820ec7599909dddc2 upstream.

Similarly to commit c543cb4a5f07 ("ipv4: ensure rcu_read_lock() in
ipv4_link_failure()"), __ip_options_compile() must be called under rcu
protection.

Fixes: 3da1ed7ac398 ("net: avoid use IPCB in cipso_v4_error")
Suggested-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

ARM: 8961/2: Fix Kbuild issue caused by per-task stack protector GCC plugin

BugLink: https://bugs.launchpad.net/bugs/1868538
commit 89604523a76eb3e13014b2bdab7f8870becee284 upstream.

When using plugins, GCC requires that the -fplugin= options precedes
any of its plugin arguments appearing on the command line as well.
This is usually not a concern, but as it turns out, this requirement
is causing some issues with ARM's per-task stack protector plugin
and Kbuild's implementation of $(cc-option).

When the per-task stack protector plugin is enabled, and we tweak
the implementation of cc-option not to pipe the stderr output of
GCC to /dev/null, the following output is generated when GCC is
executed in the context of cc-option:

  cc1: error: plugin arm_ssp_per_task_plugin should be specified before \
         -fplugin-arg-arm_ssp_per_task_plugin-tso=1 in the command line
  cc1: error: plugin arm_ssp_per_task_plugin should be specified before \
         -fplugin-arg-arm_ssp_per_task_plugin-offset=24 in the command line

These errors will cause any option passed to cc-option to be treated
as unsupported, which is obviously incorrect.

The cause of this issue is the fact that the -fplugin= argument is
added to GCC_PLUGINS_CFLAGS, whereas the arguments above are added
to KBUILD_CFLAGS, and the contents of the former get filtered out of
the latter before being passed to the GCC running the cc-option test,
and so the -fplugin= option does not appear at all on the GCC command
line.

Adding the arguments to GCC_PLUGINS_CFLAGS instead of KBUILD_CFLAGS
would be the correct approach here, if it weren't for the fact that we
are using $(eval) to defer the moment that they are added until after
asm-offsets.h is generated, which is after the point where the contents
of GCC_PLUGINS_CFLAGS are added to KBUILD_CFLAGS. So instead, we have
to add our plugin arguments to both.

For similar reasons, we cannot append DISABLE_ARM_SSP_PER_TASK_PLUGIN
to KBUILD_CFLAGS, as it will be passed to GCC when executing in the
context of cc-option, whereas the other plugin arguments will have
been filtered out, resulting in a similar error and false negative
result as above. So add it to ccflags-y instead.

Fixes: 189af4657186da08 ("ARM: smp: add support for per-task stack canaries")
Reported-by: Merlijn Wajer <merlijn@wizzup.org>
Tested-by: Tony Lindgren <tony@atomide.com>
Acked-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

HID: add ALWAYS_POLL quirk to lenovo pixart mouse

BugLink: https://bugs.launchpad.net/bugs/1868538
commit 819d578d51d0ce73f06e35d69395ef55cd683a74 upstream.

A lenovo pixart mouse (17ef:608d) is afflicted common the the malfunction
where it disconnects and reconnects every minute--each time incrementing
the device number. This patch adds the device id of the device and
specifies that it needs the HID_QUIRK_ALWAYS_POLL quirk in order to
work properly.

Signed-off-by: Tony Fischetti <tony.fischetti@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

HID: google: add moonball USB id

BugLink: https://bugs.launchpad.net/bugs/1868538
commit 58322a1590fc189a8e1e349d309637d4a4942840 upstream.

Add 1 additional hammer-like device.

Signed-off-by: Chen-Tsung Hsieh <chentsung@chromium.org>
Reviewed-by: Nicolas Boichat <drinkcat@chromium.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

mm: slub: add missing TID bump in kmem_cache_alloc_bulk()

BugLink: https://bugs.launchpad.net/bugs/1868538
commit fd4d9c7d0c71866ec0c2825189ebd2ce35bd95b8 upstream.

When kmem_cache_alloc_bulk() attempts to allocate N objects from a percpu
freelist of length M, and N > M > 0, it will first remove the M elements
from the percpu freelist, then call ___slab_alloc() to allocate the next
element and repopulate the percpu freelist. ___slab_alloc() can re-enable
IRQs via allocate_slab(), so the TID must be bumped before ___slab_alloc()
to properly commit the freelist head change.

Fix it by unconditionally bumping c->tid when entering the slowpath.

Cc: stable@vger.kernel.org
Fixes: ebe909e0fdb3 ("slub: improve bulk alloc strategy")
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

ARM: 8958/1: rename missed uaccess .fixup section

BugLink: https://bugs.launchpad.net/bugs/1868538
commit f87b1c49bc675da30d8e1e8f4b60b800312c7b90 upstream.

When the uaccess .fixup section was renamed to .text.fixup, one case was
missed. Under ld.bfd, the orphaned section was moved close to .text
(since they share the "ax" bits), so things would work normally on
uaccess faults. Under ld.lld, the orphaned section was placed outside
the .text section, making it unreachable.

Link: https://github.com/ClangBuiltLinux/linux/issues/282
Link: https://bugs.chromium.org/p/chromium/issues/detail?id=1020633#c44
Link: https://lore.kernel.org/r/nycvar.YSQ.7.76.1912032147340.17114@knanqh.ubzr
Link: https://lore.kernel.org/lkml/202002071754.F5F073F1D@keescook/
Fixes: c4a84ae39b4a5 ("ARM: 8322/1: keep .text and .fixup regions closer together")
Cc: stable@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

ARM: 8957/1: VDSO: Match ARMv8 timer in cntvct_functional()

BugLink: https://bugs.launchpad.net/bugs/1868538
commit 45939ce292b4b11159719faaf60aba7d58d5fe33 upstream.

It is possible for a system with an ARMv8 timer to run a 32-bit kernel.
When this happens we will unconditionally have the vDSO code remove the
__vdso_gettimeofday and __vdso_clock_gettime symbols because
cntvct_functional() returns false since it does not match that
compatibility string.

Fixes: ecf99a439105 ("ARM: 8331/1: VDSO initialization, mapping, and synchronization")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

net: qrtr: fix len of skb_put_padto in qrtr_node_enqueue

BugLink: https://bugs.launchpad.net/bugs/1868538
commit ce57785bf91b1ceaef4f4bffed8a47dc0919c8da upstream.

The len used for skb_put_padto is wrong, it need to add len of hdr.

In qrtr_node_enqueue, local variable size_t len is assign with
skb->len, then skb_push(skb, sizeof(*hdr)) will add skb->len with
sizeof(*hdr), so local variable size_t len is not same with skb->len
after skb_push(skb, sizeof(*hdr)).

Then the purpose of skb_put_padto(skb, ALIGN(len, 4)) is to add add
pad to the end of the skb's data if skb->len is not aligned to 4, but
unfortunately it use len instead of skb->len, at this line, skb->len
is 32 bytes(sizeof(*hdr)) more than len, for example, len is 3 bytes,
then skb->len is 35 bytes(3 + 32), and ALIGN(len, 4) is 4 bytes, so
__skb_put_padto will do nothing after check size(35) < len(4), the
correct value should be 36(sizeof(*hdr) + ALIGN(len, 4) = 32 + 4),
then __skb_put_padto will pass check size(35) < len(36) and add 1 byte
to the end of skb's data, then logic is correct.

function of skb_push:
void *skb_push(struct sk_buff *skb, unsigned int len)
{
skb->data -= len;
skb->len += len;
if (unlikely(skb->data < skb->head))
skb_under_panic(skb, len, __builtin_return_address(0));
return skb->data;
}

function of skb_put_padto
static inline int skb_put_padto(struct sk_buff *skb, unsigned int len)
{
return __skb_put_padto(skb, len, true);
}

function of __skb_put_padto
static inline int __skb_put_padto(struct sk_buff *skb, unsigned int len,
bool free_on_error)
{
unsigned int size = skb->len;

if (unlikely(size < len)) {
len -= size;
if (__skb_pad(skb, len, free_on_error))
return -ENOMEM;
__skb_put(skb, len);
}
return 0;
}

Signed-off-by: Carl Huang <cjhuang@codeaurora.org>
Signed-off-by: Wen Gong <wgong@codeaurora.org>
Cc: Doug Anderson <dianders@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

blk-mq: insert flush request to the front of dispatch queue

BugLink: https://bugs.launchpad.net/bugs/1868538
[ Upstream commit cc3200eac4c5eb11c3f34848a014d1f286316310 ]

commit 01e99aeca397 ("blk-mq: insert passthrough request into
hctx->dispatch directly") may change to add flush request to the tail
of dispatch by applying the 'add_head' parameter of
blk_mq_sched_insert_request.

Turns out this way causes performance regression on NCQ controller because
flush is non-NCQ command, which can't be queued when there is any in-flight
NCQ command. When adding flush rq to the front of hctx->dispatch, it is
easier to introduce extra time to flush rq's latency compared with adding
to the tail of dispatch queue because of S_SCHED_RESTART, then chance of
flush merge is increased, and less flush requests may be issued to
controller.

So always insert flush request to the front of dispatch queue just like
before applying commit 01e99aeca397 ("blk-mq: insert passthrough request
into hctx->dispatch directly").

Cc: Damien Le Moal <Damien.LeMoal@wdc.com>
Cc: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Fixes: 01e99aeca397 ("blk-mq: insert passthrough request into hctx->dispatch directly")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

jbd2: fix data races at struct journal_head

BugLink: https://bugs.launchpad.net/bugs/1868538
[ Upstream commit 6c5d911249290f41f7b50b43344a7520605b1acb ]

journal_head::b_transaction and journal_head::b_next_transaction could
be accessed concurrently as noticed by KCSAN,

LTP: starting fsync04
/dev/zero: Can't open blockdev
EXT4-fs (loop0): mounting ext3 file system using the ext4 subsystem
EXT4-fs (loop0): mounted filesystem with ordered data mode. Opts: (null)
==================================================================
BUG: KCSAN: data-race in __jbd2_journal_refile_buffer [jbd2] / jbd2_write_access_granted [jbd2]

write to 0xffff99f9b1bd0e30 of 8 bytes by task 25721 on cpu 70:
  __jbd2_journal_refile_buffer+0xdd/0x210 [jbd2]
  __jbd2_journal_refile_buffer at fs/jbd2/transaction.c:2569
  jbd2_journal_commit_transaction+0x2d15/0x3f20 [jbd2]
  (inlined by) jbd2_journal_commit_transaction at fs/jbd2/commit.c:1034
  kjournald2+0x13b/0x450 [jbd2]
  kthread+0x1cd/0x1f0
  ret_from_fork+0x27/0x50

read to 0xffff99f9b1bd0e30 of 8 bytes by task 25724 on cpu 68:
  jbd2_write_access_granted+0x1b2/0x250 [jbd2]
  jbd2_write_access_granted at fs/jbd2/transaction.c:1155
  jbd2_journal_get_write_access+0x2c/0x60 [jbd2]
  __ext4_journal_get_write_access+0x50/0x90 [ext4]
  ext4_mb_mark_diskspace_used+0x158/0x620 [ext4]
  ext4_mb_new_blocks+0x54f/0xca0 [ext4]
  ext4_ind_map_blocks+0xc79/0x1b40 [ext4]
  ext4_map_blocks+0x3b4/0x950 [ext4]
  _ext4_get_block+0xfc/0x270 [ext4]
  ext4_get_block+0x3b/0x50 [ext4]
  __block_write_begin_int+0x22e/0xae0
  __block_write_begin+0x39/0x50
  ext4_write_begin+0x388/0xb50 [ext4]
  generic_perform_write+0x15d/0x290
  ext4_buffered_write_iter+0x11f/0x210 [ext4]
  ext4_file_write_iter+0xce/0x9e0 [ext4]
  new_sync_write+0x29c/0x3b0
  __vfs_write+0x92/0xa0
  vfs_write+0x103/0x260
  ksys_write+0x9d/0x130
  __x64_sys_write+0x4c/0x60
  do_syscall_64+0x91/0xb05
  entry_SYSCALL_64_after_hwframe+0x49/0xbe

5 locks held by fsync04/25724:
  #0: ffff99f9911093f8 (sb_writers#13){.+.+}, at: vfs_write+0x21c/0x260
  #1: ffff99f9db4c0348 (&sb->s_type->i_mutex_key#15){+.+.}, at: ext4_buffered_write_iter+0x65/0x210 [ext4]
  #2: ffff99f5e7dfcf58 (jbd2_handle){++++}, at: start_this_handle+0x1c1/0x9d0 [jbd2]
  #3: ffff99f9db4c0168 (&ei->i_data_sem){++++}, at: ext4_map_blocks+0x176/0x950 [ext4]
  #4: ffffffff99086b40 (rcu_read_lock){....}, at: jbd2_write_access_granted+0x4e/0x250 [jbd2]
irq event stamp: 1407125
hardirqs last  enabled at (1407125): [<ffffffff980da9b7>] __find_get_block+0x107/0x790
hardirqs last disabled at (1407124): [<ffffffff980da8f9>] __find_get_block+0x49/0x790
softirqs last  enabled at (1405528): [<ffffffff98a0034c>] __do_softirq+0x34c/0x57c
softirqs last disabled at (1405521): [<ffffffff97cc67a2>] irq_exit+0xa2/0xc0

Reported by Kernel Concurrency Sanitizer on:
CPU: 68 PID: 25724 Comm: fsync04 Tainted: G L 5.6.0-rc2-next-20200221+ #7
Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019

The plain reads are outside of jh->b_state_lock critical section which result
in data races. Fix them by adding pairs of READ|WRITE_ONCE().

Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Qian Cai <cai@lca.pw>
Link: https://lore.kernel.org/r/20200222043111.2227-1-cai@lca.pw
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

sfc: fix timestamp reconstruction at 16-bit rollover points

BugLink: https://bugs.launchpad.net/bugs/1868538
[ Upstream commit 23797b98909f34b75fd130369bde86f760db69d0 ]

We can't just use the top bits of the last sync event as they could be
off-by-one every 65,536 seconds, giving an error in reconstruction of
65,536 seconds.

This patch uses the difference in the bottom 16 bits (mod 2^16) to
calculate an offset that needs to be applied to the last sync event to
get to the current time.

Signed-off-by: Alexandru-Mihai Maftei <amaftei@solarflare.com>
Acked-by: Martin Habets <mhabets@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

net: rmnet: fix packet forwarding in rmnet bridge mode

BugLink: https://bugs.launchpad.net/bugs/1868538
[ Upstream commit ad3cc31b599ea80f06b29ebdc18b3a39878a48d6 ]

Packet forwarding is not working in rmnet bridge mode.
Because when a packet is forwarded, skb_push() for an ethernet header
is needed. But it doesn't call skb_push().
So, the ethernet header will be lost.

Test commands:
    modprobe rmnet
    ip netns add nst
    ip netns add nst2
    ip link add veth0 type veth peer name veth1
    ip link add veth2 type veth peer name veth3
    ip link set veth1 netns nst
    ip link set veth3 netns nst2

    ip link add rmnet0 link veth0 type rmnet mux_id 1
    ip link set veth2 master rmnet0
    ip link set veth0 up
    ip link set veth2 up
    ip link set rmnet0 up
    ip a a 192.168.100.1/24 dev rmnet0

    ip netns exec nst ip link set veth1 up
    ip netns exec nst ip a a 192.168.100.2/24 dev veth1
    ip netns exec nst2 ip link set veth3 up
    ip netns exec nst2 ip a a 192.168.100.3/24 dev veth3
    ip netns exec nst2 ping 192.168.100.2

Fixes: 60d58f971c10 ("net: qualcomm: rmnet: Implement bridge mode")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

net: rmnet: fix bridge mode bugs

BugLink: https://bugs.launchpad.net/bugs/1868538
[ Upstream commit d939b6d30bea1a2322bc536b12be0a7c4c2bccd7 ]

In order to attach a bridge interface to the rmnet interface,
"master" operation is used.
(e.g. ip link set dummy1 master rmnet0)
But, in the rmnet_add_bridge(), which is a callback of ->ndo_add_slave()
doesn't register lower interface.
So, ->ndo_del_slave() doesn't work.
There are other problems too.
1. It couldn't detect circular upper/lower interface relationship.
2. It couldn't prevent stack overflow because of too deep depth
of upper/lower interface
3. It doesn't check the number of lower interfaces.
4. Panics because of several reasons.

The root problem of these issues is actually the same.
So, in this patch, these all problems will be fixed.

Test commands:
    modprobe rmnet
    ip link add dummy0 type dummy
    ip link add rmnet0 link dummy0 type rmnet mux_id 1
    ip link add dummy1 master rmnet0 type dummy
    ip link add dummy2 master rmnet0 type dummy
    ip link del rmnet0
    ip link del dummy2
    ip link del dummy1

Splat looks like:
[   41.867595][ T1164] general protection fault, probably for non-canonical address 0xdffffc0000000101I
[   41.869993][ T1164] KASAN: null-ptr-deref in range [0x0000000000000808-0x000000000000080f]
[   41.872950][ T1164] CPU: 0 PID: 1164 Comm: ip Not tainted 5.6.0-rc1+ #447
[   41.873915][ T1164] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[   41.875161][ T1164] RIP: 0010:rmnet_unregister_bridge.isra.6+0x71/0xf0 [rmnet]
[   41.876178][ T1164] Code: 48 89 ef 48 89 c6 5b 5d e9 fc fe ff ff e8 f7 f3 ff ff 48 8d b8 08 08 00 00 48 ba 00 7
[   41.878925][ T1164] RSP: 0018:ffff8880c4d0f188 EFLAGS: 00010202
[   41.879774][ T1164] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000101
[   41.887689][ T1164] RDX: dffffc0000000000 RSI: ffffffffb8cf64f0 RDI: 0000000000000808
[   41.888727][ T1164] RBP: ffff8880c40e4000 R08: ffffed101b3c0e3c R09: 0000000000000001
[   41.889749][ T1164] R10: 0000000000000001 R11: ffffed101b3c0e3b R12: 1ffff110189a1e3c
[   41.890783][ T1164] R13: ffff8880c4d0f200 R14: ffffffffb8d56160 R15: ffff8880ccc2c000
[   41.891794][ T1164] FS:  00007f4300edc0c0(0000) GS:ffff8880d9c00000(0000) knlGS:0000000000000000
[   41.892953][ T1164] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   41.893800][ T1164] CR2: 00007f43003bc8c0 CR3: 00000000ca53e001 CR4: 00000000000606f0
[   41.894824][ T1164] Call Trace:
[   41.895274][ T1164]  ? rcu_is_watching+0x2c/0x80
[   41.895895][ T1164]  rmnet_config_notify_cb+0x1f7/0x590 [rmnet]
[   41.896687][ T1164]  ? rmnet_unregister_bridge.isra.6+0xf0/0xf0 [rmnet]
[   41.897611][ T1164]  ? rmnet_unregister_bridge.isra.6+0xf0/0xf0 [rmnet]
[   41.898508][ T1164]  ? __module_text_address+0x13/0x140
[   41.899162][ T1164]  notifier_call_chain+0x90/0x160
[   41.899814][ T1164]  rollback_registered_many+0x660/0xcf0
[   41.900544][ T1164]  ? netif_set_real_num_tx_queues+0x780/0x780
[   41.901316][ T1164]  ? __lock_acquire+0xdfe/0x3de0
[   41.901958][ T1164]  ? memset+0x1f/0x40
[   41.902468][ T1164]  ? __nla_validate_parse+0x98/0x1ab0
[   41.903166][ T1164]  unregister_netdevice_many.part.133+0x13/0x1b0
[   41.903988][ T1164]  rtnl_delete_link+0xbc/0x100
[ ... ]

Fixes: 60d58f971c10 ("net: qualcomm: rmnet: Implement bridge mode")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

net: rmnet: use upper/lower device infrastructure

BugLink: https://bugs.launchpad.net/bugs/1868538
[ Upstream commit 037f9cdf72fb8a7ff9ec2b5dd05336ec1492bdf1 ]

netdev_upper_dev_link() is useful to manage lower/upper interfaces.
And this function internally validates looping, maximum depth.
All or most virtual interfaces that could have a real interface
(e.g. macsec, macvlan, ipvlan etc.) use lower/upper infrastructure.

Test commands:
    modprobe rmnet
    ip link add dummy0 type dummy
    ip link add rmnet1 link dummy0 type rmnet mux_id 1
    for i in {2..100}
    do
        let A=$i-1
        ip link add rmnet$i link rmnet$A type rmnet mux_id $i
    done
    ip link del dummy0

The purpose of the test commands is to make stack overflow.

Splat looks like:
[   52.411438][ T1395] BUG: KASAN: slab-out-of-bounds in find_busiest_group+0x27e/0x2c00
[   52.413218][ T1395] Write of size 64 at addr ffff8880c774bde0 by task ip/1395
[   52.414841][ T1395]
[   52.430720][ T1395] CPU: 1 PID: 1395 Comm: ip Not tainted 5.6.0-rc1+ #447
[   52.496511][ T1395] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[   52.513597][ T1395] Call Trace:
[   52.546516][ T1395]
[   52.558773][ T1395] Allocated by task 3171537984:
[   52.588290][ T1395] BUG: unable to handle page fault for address: ffffffffb999e260
[   52.589311][ T1395] #PF: supervisor read access in kernel mode
[   52.590529][ T1395] #PF: error_code(0x0000) - not-present page
[   52.591374][ T1395] PGD d6818067 P4D d6818067 PUD d6819063 PMD 0
[   52.592288][ T1395] Thread overran stack, or stack corrupted
[   52.604980][ T1395] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
[   52.605856][ T1395] CPU: 1 PID: 1395 Comm: ip Not tainted 5.6.0-rc1+ #447
[   52.611764][ T1395] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[   52.621520][ T1395] RIP: 0010:stack_depot_fetch+0x10/0x30
[   52.622296][ T1395] Code: ff e9 f9 fe ff ff 48 89 df e8 9c 1d 91 ff e9 ca fe ff ff cc cc cc cc cc cc cc 89 f8 0
[   52.627887][ T1395] RSP: 0018:ffff8880c774bb60 EFLAGS: 00010006
[   52.628735][ T1395] RAX: 00000000001f8880 RBX: ffff8880c774d140 RCX: 0000000000000000
[   52.631773][ T1395] RDX: 000000000000001d RSI: ffff8880c774bb68 RDI: 0000000000003ff0
[   52.649584][ T1395] RBP: ffffea00031dd200 R08: ffffed101b43e403 R09: ffffed101b43e403
[   52.674857][ T1395] R10: 0000000000000001 R11: ffffed101b43e402 R12: ffff8880d900e5c0
[   52.678257][ T1395] R13: ffff8880c774c000 R14: 0000000000000000 R15: dffffc0000000000
[   52.694541][ T1395] FS:  00007fe867f6e0c0(0000) GS:ffff8880da000000(0000) knlGS:0000000000000000
[   52.764039][ T1395] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   52.815008][ T1395] CR2: ffffffffb999e260 CR3: 00000000c26aa005 CR4: 00000000000606e0
[   52.862312][ T1395] Call Trace:
[   52.887133][ T1395] Modules linked in: dummy rmnet veth openvswitch nsh nf_conncount nf_nat nf_conntrack nf_dex
[   52.936749][ T1395] CR2: ffffffffb999e260
[   52.965695][ T1395] ---[ end trace 7e32ca99482dbb31 ]---
[   52.966556][ T1395] RIP: 0010:stack_depot_fetch+0x10/0x30
[   52.971083][ T1395] Code: ff e9 f9 fe ff ff 48 89 df e8 9c 1d 91 ff e9 ca fe ff ff cc cc cc cc cc cc cc 89 f8 0
[   53.003650][ T1395] RSP: 0018:ffff8880c774bb60 EFLAGS: 00010006
[   53.043183][ T1395] RAX: 00000000001f8880 RBX: ffff8880c774d140 RCX: 0000000000000000
[   53.076480][ T1395] RDX: 000000000000001d RSI: ffff8880c774bb68 RDI: 0000000000003ff0
[   53.093858][ T1395] RBP: ffffea00031dd200 R08: ffffed101b43e403 R09: ffffed101b43e403
[   53.112795][ T1395] R10: 0000000000000001 R11: ffffed101b43e402 R12: ffff8880d900e5c0
[   53.139837][ T1395] R13: ffff8880c774c000 R14: 0000000000000000 R15: dffffc0000000000
[   53.141500][ T1395] FS:  00007fe867f6e0c0(0000) GS:ffff8880da000000(0000) knlGS:0000000000000000
[   53.143343][ T1395] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   53.152007][ T1395] CR2: ffffffffb999e260 CR3: 00000000c26aa005 CR4: 00000000000606e0
[   53.156459][ T1395] Kernel panic - not syncing: Fatal exception
[   54.213570][ T1395] Shutting down cpus with NMI
[   54.354112][ T1395] Kernel Offset: 0x33000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0x)
[   54.355687][ T1395] Rebooting in 5 seconds..

Fixes: b37f78f234bf ("net: qualcomm: rmnet: Fix crash on real dev unregistration")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

net: rmnet: do not allow to change mux id if mux id is duplicated

BugLink: https://bugs.launchpad.net/bugs/1868538
[ Upstream commit 1dc49e9d164cd7e11c81279c83db84a147e14740 ]

Basically, duplicate mux id isn't be allowed.
So, the creation of rmnet will be failed if there is duplicate mux id
is existing.
But, changelink routine doesn't check duplicate mux id.

Test commands:
    modprobe rmnet
    ip link add dummy0 type dummy
    ip link add rmnet0 link dummy0 type rmnet mux_id 1
    ip link add rmnet1 link dummy0 type rmnet mux_id 2
    ip link set rmnet1 type rmnet mux_id 1

Fixes: 23790ef12082 ("net: qualcomm: rmnet: Allow to configure flags for existing devices")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

net: rmnet: remove rcu_read_lock in rmnet_force_unassociate_device()

BugLink: https://bugs.launchpad.net/bugs/1868538
[ Upstream commit c026d970102e9af9958edefb4a015702c6aab636 ]

The notifier_call() of the slave interface removes rmnet interface with
unregister_netdevice_queue().
But, before calling unregister_netdevice_queue(), it acquires
rcu readlock.
In the RCU critical section, sleeping isn't be allowed.
But, unregister_netdevice_queue() internally calls synchronize_net(),
which would sleep.
So, suspicious RCU usage warning occurs.

Test commands:
    modprobe rmnet
    ip link add dummy0 type dummy
    ip link add dummy1 type dummy
    ip link add rmnet0 link dummy0 type rmnet mux_id 1
    ip link set dummy1 master rmnet0
    ip link del dummy0

Splat looks like:
[   79.639245][ T1195] =============================
[   79.640134][ T1195] WARNING: suspicious RCU usage
[   79.640852][ T1195] 5.6.0-rc1+ #447 Not tainted
[   79.641657][ T1195] -----------------------------
[   79.642472][ T1195] ./include/linux/rcupdate.h:273 Illegal context switch in RCU read-side critical section!
[   79.644043][ T1195]
[   79.644043][ T1195] other info that might help us debug this:
[   79.644043][ T1195]
[   79.645682][ T1195]
[   79.645682][ T1195] rcu_scheduler_active = 2, debug_locks = 1
[   79.646980][ T1195] 2 locks held by ip/1195:
[   79.647629][ T1195]  #0: ffffffffa3cf64f0 (rtnl_mutex){+.+.}, at: rtnetlink_rcv_msg+0x457/0x890
[   79.649312][ T1195]  #1: ffffffffa39256c0 (rcu_read_lock){....}, at: rmnet_config_notify_cb+0xf0/0x590 [rmnet]
[   79.651717][ T1195]
[   79.651717][ T1195] stack backtrace:
[   79.652650][ T1195] CPU: 3 PID: 1195 Comm: ip Not tainted 5.6.0-rc1+ #447
[   79.653702][ T1195] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[   79.655037][ T1195] Call Trace:
[   79.655560][ T1195]  dump_stack+0x96/0xdb
[   79.656252][ T1195]  ___might_sleep+0x345/0x440
[   79.656994][ T1195]  synchronize_net+0x18/0x30
[   79.661132][ T1195]  netdev_rx_handler_unregister+0x40/0xb0
[   79.666266][ T1195]  rmnet_unregister_real_device+0x42/0xb0 [rmnet]
[   79.667211][ T1195]  rmnet_config_notify_cb+0x1f7/0x590 [rmnet]
[   79.668121][ T1195]  ? rmnet_unregister_bridge.isra.6+0xf0/0xf0 [rmnet]
[   79.669166][ T1195]  ? rmnet_unregister_bridge.isra.6+0xf0/0xf0 [rmnet]
[   79.670286][ T1195]  ? __module_text_address+0x13/0x140
[   79.671139][ T1195]  notifier_call_chain+0x90/0x160
[   79.671973][ T1195]  rollback_registered_many+0x660/0xcf0
[   79.672893][ T1195]  ? netif_set_real_num_tx_queues+0x780/0x780
[   79.675091][ T1195]  ? __lock_acquire+0xdfe/0x3de0
[   79.675825][ T1195]  ? memset+0x1f/0x40
[   79.676367][ T1195]  ? __nla_validate_parse+0x98/0x1ab0
[   79.677290][ T1195]  unregister_netdevice_many.part.133+0x13/0x1b0
[   79.678163][ T1195]  rtnl_delete_link+0xbc/0x100
[ ... ]

Fixes: ceed73a2cf4a ("drivers: net: ethernet: qualcomm: rmnet: Initial implementation")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

net: rmnet: fix suspicious RCU usage

BugLink: https://bugs.launchpad.net/bugs/1868538
[ Upstream commit 102210f7664442d8c0ce332c006ea90626df745b ]

rmnet_get_port() internally calls rcu_dereference_rtnl(),
which checks RTNL.
But rmnet_get_port() could be called by packet path.
The packet path is not protected by RTNL.
So, the suspicious RCU usage problem occurs.

Test commands:
    modprobe rmnet
    ip netns add nst
    ip link add veth0 type veth peer name veth1
    ip link set veth1 netns nst
    ip link add rmnet0 link veth0 type rmnet mux_id 1
    ip netns exec nst ip link add rmnet1 link veth1 type rmnet mux_id 1
    ip netns exec nst ip link set veth1 up
    ip netns exec nst ip link set rmnet1 up
    ip netns exec nst ip a a 192.168.100.2/24 dev rmnet1
    ip link set veth0 up
    ip link set rmnet0 up
    ip a a 192.168.100.1/24 dev rmnet0
    ping 192.168.100.2

Splat looks like:
[  146.630958][ T1174] WARNING: suspicious RCU usage
[  146.631735][ T1174] 5.6.0-rc1+ #447 Not tainted
[  146.632387][ T1174] -----------------------------
[  146.633151][ T1174] drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c:386 suspicious rcu_dereference_check() !
[  146.634742][ T1174]
[  146.634742][ T1174] other info that might help us debug this:
[  146.634742][ T1174]
[  146.645992][ T1174]
[  146.645992][ T1174] rcu_scheduler_active = 2, debug_locks = 1
[  146.646937][ T1174] 5 locks held by ping/1174:
[  146.647609][ T1174]  #0: ffff8880c31dea70 (sk_lock-AF_INET){+.+.}, at: raw_sendmsg+0xab8/0x2980
[  146.662463][ T1174]  #1: ffffffff93925660 (rcu_read_lock_bh){....}, at: ip_finish_output2+0x243/0x2150
[  146.671696][ T1174]  #2: ffffffff93925660 (rcu_read_lock_bh){....}, at: __dev_queue_xmit+0x213/0x2940
[  146.673064][ T1174]  #3: ffff8880c19ecd58 (&dev->qdisc_running_key#7){+...}, at: ip_finish_output2+0x714/0x2150
[  146.690358][ T1174]  #4: ffff8880c5796898 (&dev->qdisc_xmit_lock_key#3){+.-.}, at: sch_direct_xmit+0x1e2/0x1020
[  146.699875][ T1174]
[  146.699875][ T1174] stack backtrace:
[  146.701091][ T1174] CPU: 0 PID: 1174 Comm: ping Not tainted 5.6.0-rc1+ #447
[  146.705215][ T1174] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[  146.706565][ T1174] Call Trace:
[  146.707102][ T1174]  dump_stack+0x96/0xdb
[  146.708007][ T1174]  rmnet_get_port.part.9+0x76/0x80 [rmnet]
[  146.709233][ T1174]  rmnet_egress_handler+0x107/0x420 [rmnet]
[  146.710492][ T1174]  ? sch_direct_xmit+0x1e2/0x1020
[  146.716193][ T1174]  rmnet_vnd_start_xmit+0x3d/0xa0 [rmnet]
[  146.717012][ T1174]  dev_hard_start_xmit+0x160/0x740
[  146.717854][ T1174]  sch_direct_xmit+0x265/0x1020
[  146.718577][ T1174]  ? register_lock_class+0x14d0/0x14d0
[  146.719429][ T1174]  ? dev_watchdog+0xac0/0xac0
[  146.723738][ T1174]  ? __dev_queue_xmit+0x15fd/0x2940
[  146.724469][ T1174]  ? lock_acquire+0x164/0x3b0
[  146.725172][ T1174]  __dev_queue_xmit+0x20c7/0x2940
[ ... ]

Fixes: ceed73a2cf4a ("drivers: net: ethernet: qualcomm: rmnet: Initial implementation")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

net: rmnet: fix NULL pointer dereference in rmnet_changelink()

BugLink: https://bugs.launchpad.net/bugs/1868538
[ Upstream commit 1eb1f43a6e37282348a41e3d68f5e9a6a4359212 ]

In the rmnet_changelink(), it uses IFLA_LINK without checking
NULL pointer.
tb[IFLA_LINK] could be NULL pointer.
So, NULL-ptr-deref could occur.

rmnet already has a lower interface (real_dev).
So, after this patch, rmnet_changelink() does not use IFLA_LINK anymore.

Test commands:
    modprobe rmnet
    ip link add dummy0 type dummy
    ip link add rmnet0 link dummy0 type rmnet mux_id 1
    ip link set rmnet0 type rmnet mux_id 2

Splat looks like:
[   90.578726][ T1131] general protection fault, probably for non-canonical address 0xdffffc0000000000I
[   90.581121][ T1131] KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
[   90.582380][ T1131] CPU: 2 PID: 1131 Comm: ip Not tainted 5.6.0-rc1+ #447
[   90.584285][ T1131] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[   90.587506][ T1131] RIP: 0010:rmnet_changelink+0x5a/0x8a0 [rmnet]
[   90.588546][ T1131] Code: 83 ec 20 48 c1 ea 03 80 3c 02 00 0f 85 6f 07 00 00 48 8b 5e 28 48 b8 00 00 00 00 00 0
[   90.591447][ T1131] RSP: 0018:ffff8880ce78f1b8 EFLAGS: 00010247
[   90.592329][ T1131] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffff8880ce78f8b0
[   90.593253][ T1131] RDX: 0000000000000000 RSI: ffff8880ce78f4a0 RDI: 0000000000000004
[   90.594058][ T1131] RBP: ffff8880cf543e00 R08: 0000000000000002 R09: 0000000000000002
[   90.594859][ T1131] R10: ffffffffc0586a40 R11: 0000000000000000 R12: ffff8880ca47c000
[   90.595690][ T1131] R13: ffff8880ca47c000 R14: ffff8880cf545000 R15: 0000000000000000
[   90.596553][ T1131] FS:  00007f21f6c7e0c0(0000) GS:ffff8880da400000(0000) knlGS:0000000000000000
[   90.597504][ T1131] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   90.599418][ T1131] CR2: 0000556e413db458 CR3: 00000000c917a002 CR4: 00000000000606e0
[   90.600289][ T1131] Call Trace:
[   90.600631][ T1131]  __rtnl_newlink+0x922/0x1270
[   90.601194][ T1131]  ? lock_downgrade+0x6e0/0x6e0
[   90.601724][ T1131]  ? rtnl_link_unregister+0x220/0x220
[   90.602309][ T1131]  ? lock_acquire+0x164/0x3b0
[   90.602784][ T1131]  ? is_bpf_image_address+0xff/0x1d0
[   90.603331][ T1131]  ? rtnl_newlink+0x4c/0x90
[   90.603810][ T1131]  ? kernel_text_address+0x111/0x140
[   90.604419][ T1131]  ? __kernel_text_address+0xe/0x30
[   90.604981][ T1131]  ? unwind_get_return_address+0x5f/0xa0
[   90.605616][ T1131]  ? create_prof_cpu_mask+0x20/0x20
[   90.606304][ T1131]  ? arch_stack_walk+0x83/0xb0
[   90.606985][ T1131]  ? stack_trace_save+0x82/0xb0
[   90.607656][ T1131]  ? stack_trace_consume_entry+0x160/0x160
[   90.608503][ T1131]  ? deactivate_slab.isra.78+0x2c5/0x800
[   90.609336][ T1131]  ? kasan_unpoison_shadow+0x30/0x40
[   90.610096][ T1131]  ? kmem_cache_alloc_trace+0x135/0x350
[   90.610889][ T1131]  ? rtnl_newlink+0x4c/0x90
[   90.611512][ T1131]  rtnl_newlink+0x65/0x90
[ ... ]

Fixes: 23790ef12082 ("net: qualcomm: rmnet: Allow to configure flags for existing devices")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

net: rmnet: fix NULL pointer dereference in rmnet_newlink()

BugLink: https://bugs.launchpad.net/bugs/1868538
[ Upstream commit 93b5cbfa9636d385126f211dca9efa7e3f683202 ]

rmnet registers IFLA_LINK interface as a lower interface.
But, IFLA_LINK could be NULL.
In the current code, rmnet doesn't check IFLA_LINK.
So, panic would occur.

Test commands:
    modprobe rmnet
    ip link add rmnet0 type rmnet mux_id 1

Splat looks like:
[   36.826109][ T1115] general protection fault, probably for non-canonical address 0xdffffc0000000000I
[   36.838817][ T1115] KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
[   36.839908][ T1115] CPU: 1 PID: 1115 Comm: ip Not tainted 5.6.0-rc1+ #447
[   36.840569][ T1115] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[   36.841408][ T1115] RIP: 0010:rmnet_newlink+0x54/0x510 [rmnet]
[   36.841986][ T1115] Code: 83 ec 18 48 c1 e9 03 80 3c 01 00 0f 85 d4 03 00 00 48 8b 6a 28 48 b8 00 00 00 00 00 c
[   36.843923][ T1115] RSP: 0018:ffff8880b7e0f1c0 EFLAGS: 00010247
[   36.844756][ T1115] RAX: dffffc0000000000 RBX: ffff8880d14cca00 RCX: 1ffff11016fc1e99
[   36.845859][ T1115] RDX: 0000000000000000 RSI: ffff8880c3d04000 RDI: 0000000000000004
[   36.846961][ T1115] RBP: 0000000000000000 R08: ffff8880b7e0f8b0 R09: ffff8880b6ac2d90
[   36.848020][ T1115] R10: ffffffffc0589a40 R11: ffffed1016d585b7 R12: ffffffff88ceaf80
[   36.848788][ T1115] R13: ffff8880c3d04000 R14: ffff8880b7e0f8b0 R15: ffff8880c3d04000
[   36.849546][ T1115] FS:  00007f50ab3360c0(0000) GS:ffff8880da000000(0000) knlGS:0000000000000000
[   36.851784][ T1115] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   36.852422][ T1115] CR2: 000055871afe5ab0 CR3: 00000000ae246001 CR4: 00000000000606e0
[   36.853181][ T1115] Call Trace:
[   36.853514][ T1115]  __rtnl_newlink+0xbdb/0x1270
[   36.853967][ T1115]  ? lock_downgrade+0x6e0/0x6e0
[   36.854420][ T1115]  ? rtnl_link_unregister+0x220/0x220
[   36.854936][ T1115]  ? lock_acquire+0x164/0x3b0
[   36.855376][ T1115]  ? is_bpf_image_address+0xff/0x1d0
[   36.855884][ T1115]  ? rtnl_newlink+0x4c/0x90
[   36.856304][ T1115]  ? kernel_text_address+0x111/0x140
[   36.856857][ T1115]  ? __kernel_text_address+0xe/0x30
[   36.857440][ T1115]  ? unwind_get_return_address+0x5f/0xa0
[   36.858063][ T1115]  ? create_prof_cpu_mask+0x20/0x20
[   36.858644][ T1115]  ? arch_stack_walk+0x83/0xb0
[   36.859171][ T1115]  ? stack_trace_save+0x82/0xb0
[   36.859710][ T1115]  ? stack_trace_consume_entry+0x160/0x160
[   36.860357][ T1115]  ? deactivate_slab.isra.78+0x2c5/0x800
[   36.860928][ T1115]  ? kasan_unpoison_shadow+0x30/0x40
[   36.861520][ T1115]  ? kmem_cache_alloc_trace+0x135/0x350
[   36.862125][ T1115]  ? rtnl_newlink+0x4c/0x90
[   36.864073][ T1115]  rtnl_newlink+0x65/0x90
[ ... ]

Fixes: ceed73a2cf4a ("drivers: net: ethernet: qualcomm: rmnet: Initial implementation")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

hinic: fix a bug of rss configuration

BugLink: https://bugs.launchpad.net/bugs/1868538
[ Upstream commit 386d4716fd91869e07c731657f2cde5a33086516 ]

should use real receive queue number to configure hw rss
indirect table rather than maximal queue number

Signed-off-by: Luo bin <luobin9@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

hinic: fix a bug of setting hw_ioctxt

BugLink: https://bugs.launchpad.net/bugs/1868538
[ Upstream commit d2ed69ce9ed3477e2a9527e6b89fe4689d99510e ]

a reserved field is used to signify prime physical function index
in the latest firmware version, so we must assign a value to it
correctly

Signed-off-by: Luo bin <luobin9@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

hinic: fix a irq affinity bug

BugLink: https://bugs.launchpad.net/bugs/1868538
[ Upstream commit 0bff777bd0cba73ad4cd0145696ad284d7e6a99f ]

can not use a local variable as an input parameter of
irq_set_affinity_hint

Signed-off-by: Luo bin <luobin9@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

net: phy: mscc: fix firmware paths

BugLink: https://bugs.launchpad.net/bugs/1868538
[ Upstream commit c87a9d6fc6d555e4981f2ded77d9a8cce950743e ]

The firmware paths for the VSC8584 PHYs not not contain the leading
'microchip/' directory, as used in linux-firmware, resulting in an
error when probing the driver. This patch fixes it.

Fixes: a5afc1678044 ("net: phy: mscc: add support for VSC8584 PHY")
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

slip: not call free_netdev before rtnl_unlock in slip_open

BugLink: https://bugs.launchpad.net/bugs/1868538
[ Upstream commit f596c87005f7b1baeb7d62d9a9e25d68c3dfae10 ]

As the description before netdev_run_todo, we cannot call free_netdev
before rtnl_unlock, fix it by reorder the code.

Signed-off-by: yangerkun <yangerkun@huawei.com>
Reviewed-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

signal: avoid double atomic counter increments for user accounting

BugLink: https://bugs.launchpad.net/bugs/1868538
[ Upstream commit fda31c50292a5062332fa0343c084bd9f46604d9 ]

When queueing a signal, we increment both the users count of pending
signals (for RLIMIT_SIGPENDING tracking) and we increment the refcount
of the user struct itself (because we keep a reference to the user in
the signal structure in order to correctly account for it when freeing).

That turns out to be fairly expensive, because both of them are atomic
updates, and particularly under extreme signal handling pressure on big
machines, you can get a lot of cache contention on the user struct.
That can then cause horrid cacheline ping-pong when you do these
multiple accesses.

So change the reference counting to only pin the user for the _first_
pending signal, and to unpin it when the last pending signal is
dequeued.  That means that when a user sees a lot of concurrent signal
queuing - which is the only situation when this matters - the only
atomic access needed is generally the 'sigpending' count update.

This was noticed because of a particularly odd timing artifact on a
dual-socket 96C/192T Cascade Lake platform: when you get into bad
contention, on that machine for some reason seems to be much worse when
the contention happens in the upper 32-byte half of the cacheline.

As a result, the kernel test robot will-it-scale 'signal1' benchmark had
an odd performance regression simply due to random alignment of the
'struct user_struct' (and pointed to a completely unrelated and
apparently nonsensical commit for the regression).

Avoiding the double increments (and decrements on the dequeueing side,
of course) makes for much less contention and hugely improved
performance on that will-it-scale microbenchmark.

Quoting Feng Tang:

"It makes a big difference, that the performance score is tripled! bump
  from original 17000 to 54000. Also the gap between 5.0-rc6 and
  5.0-rc6+Jiri's patch is reduced to around 2%"

[ The "2% gap" is the odd cacheline placement difference on that
  platform: under the extreme contention case, the effect of which half
  of the cacheline was hot was 5%, so with the reduced contention the
  odd timing artifact is reduced too ]

It does help in the non-contended case too, but is not nearly as
noticeable.

Reported-and-tested-by: Feng Tang <feng.tang@intel.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Huang, Ying <ying.huang@intel.com>
Cc: Philip Li <philip.li@intel.com>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>