In functions regmap_encx24j600_phy_reg_read() and
regmap_encx24j600_phy_reg_write() in the conditions of the waiting
cycles for filling the variable 'ret' it is necessary to add parentheses
to prevent wrong assignment due to logical operations precedence.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: d70e53262f5c ("net: Microchip encx24j600 driver") Signed-off-by: Valentina Goncharenko <goncharenko.vp@ispras.ru> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 8aae746d065c45338f0c5ba47d836eea61edd87f) Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
ieee802154_if_add() allocates wpan_dev as netdev's private data, but not
init the list in struct wpan_dev. cfg802154_netdev_notifier_call() manage
the list when device register/unregister, and may lead to null-ptr-deref.
Use INIT_LIST_HEAD() on it to initialize it correctly.
Fixes: fcf39e6e88e9 ("ieee802154: add wpan_dev_list") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-by: Alexander Aring <aahringo@redhat.com> Link: https://lore.kernel.org/r/20221130091705.1831140-1-weiyongjun@huaweicloud.com Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit a110287ef4a423980309490df632e1c1e73b3dc9) Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
When testing in kci_test_ipsec_offload, srcip is configured as $dstip,
it should add xfrm policy rule in instead of out.
The test result of this patch is as follows:
PASS: ipsec_offload
Fixes: 2766a11161cc ("selftests: rtnetlink: add ipsec offload API test") Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Acked-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://lore.kernel.org/r/20221201082246.14131-1-shaozhengchao@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit e046421bed5a5b89b292e164a08c9c0a836fe57b) Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
hci_get_route() takes reference, we should use hci_dev_put() to release
it when not need anymore.
Fixes: 6b8d4a6a0314 ("Bluetooth: 6LoWPAN: Use connected oriented channel instead of fixed one") Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 3322193949ac83b45263c4831ae68870e11d255d) Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Commit 39f9895a00f4 ("vmxnet3: add support for 32 Tx/Rx queues")
added support for 32Tx/Rx queues. As a part of this patch, intrConf
structure was extended to incorporate increased queues.
This patch fixes the issue where incorrect reference is being used.
Fixes: 39f9895a00f4 ("vmxnet3: add support for 32 Tx/Rx queues") Signed-off-by: Ronak Doshi <doshir@vmware.com> Acked-by: Guolin Yang <gyang@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 6c88c764e0c426d11b07a5b2606527ccaca3ca02) Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Commit dacce2be3312 ("vmxnet3: add geneve and vxlan tunnel offload
support") added support for encapsulation offload. However, the
pathc did not report correctly the encapsulated packet which is
LRO'ed by the hypervisor.
This patch fixes this issue by using correct callback for the LRO'ed
encapsulated packet.
Fixes: dacce2be3312 ("vmxnet3: add geneve and vxlan tunnel offload support") Signed-off-by: Ronak Doshi <doshir@vmware.com> Acked-by: Guolin Yang <gyang@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 5ad0d85757f8673acc4c0cda8a052e26f735c1b7) Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Wei Chen reported a NULL deref in sk_user_ns() [0][1], and Paolo diagnosed
the root cause: in unix_diag_get_exact(), the newly allocated skb does not
have sk. [2]
We must get the user_ns from the NETLINK_CB(in_skb).sk and pass it to
sk_diag_fill().
Cheap monitors sometimes advertise YUV modes they don't really have
(HDMI specification mandates YUV support so even monitors without actual
support will often wrongfully advertise it) which results in YUV matches
and user forum complaints of a red tint to light colour display areas in
common desktop environments.
Moving the default RGB fall-back before YUV selection results in RGB
mode matching in most cases, reducing complaints.
Fixes: 6c3c719936da ("drm/bridge: synopsys: dw-hdmi: add bus format negociation") Signed-off-by: Guillaume BRUN <the.cheaterman@gmail.com> Tested-by: Christian Hewitt <christianshewitt@gmail.com> Reviewed-by: Robert Foss <robert.foss@linaro.org> Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20221116143523.2126-1-the.cheaterman@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 807a01a3297947074543016320255076efcec3d2) Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
commit 8d820bc9d12b ("net: broadcom: Fix BCMGENET Kconfig") fixes the build
that contain 99addbe31f55 ("net: broadcom: Select BROADCOM_PHY for BCMGENET")
and enable BCMGENET=y but PTP_1588_CLOCK_OPTIONAL=m, which otherwise
leads to a link failure. However this may trigger a runtime failure.
Fix the original issue by propagating the PTP_1588_CLOCK_OPTIONAL dependency
of BROADCOM_PHY down to BCMGENET.
Without this change, the interrupt test fail with MSI-X environment:
$ sudo ethtool -t enp0s2 offline
[ 43.921783] igb 0000:00:02.0: offline testing starting
[ 44.855824] igb 0000:00:02.0 enp0s2: igb: enp0s2 NIC Link is Down
[ 44.961249] igb 0000:00:02.0 enp0s2: igb: enp0s2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
[ 51.272202] igb 0000:00:02.0: testing shared interrupt
[ 56.996975] igb 0000:00:02.0 enp0s2: igb: enp0s2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
The test result is FAIL
The test extra info:
Register test (offline) 0
Eeprom test (offline) 0
Interrupt test (offline) 4
Loopback test (offline) 0
Link test (on/offline) 0
Here, "4" means an expected interrupt was not delivered.
To fix this, route IRQs correctly to the first MSI-X vector by setting
IVAR_MISC. Also, set bit 0 of EIMS so that the vector will not be
masked. The interrupt test now runs properly with this change:
$ sudo ethtool -t enp0s2 offline
[ 42.762985] igb 0000:00:02.0: offline testing starting
[ 50.141967] igb 0000:00:02.0: testing shared interrupt
[ 56.163957] igb 0000:00:02.0 enp0s2: igb: enp0s2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
The test result is PASS
The test extra info:
Register test (offline) 0
Eeprom test (offline) 0
Interrupt test (offline) 0
Loopback test (offline) 0
Link test (on/offline) 0
Fixes: 4eefa8f01314 ("igb: add single vector msi-x testing to interrupt test") Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 16eb678bca8e154b23084a5ffd24319e3274bf01) Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
e1000_xmit_frame is expected to stop the queue and dispatch frames to
hardware if there is not sufficient space for the next frame in the
buffer, but sometimes it failed to do so because the estimated maximum
size of frame was wrong. As the consequence, the later invocation of
e1000_xmit_frame failed with NETDEV_TX_BUSY, and the frame in the buffer
remained forever, resulting in a watchdog failure.
This change fixes the estimated size by making it match with the
condition for NETDEV_TX_BUSY. Apparently, the old estimation failed to
account for the following lines which determines the space requirement
for not causing NETDEV_TX_BUSY:
```
/* reserve a descriptor for the offload context */
if ((mss) || (skb->ip_summed == CHECKSUM_PARTIAL))
count++;
count++;
This issue was found when running http-stress02 test included in Linux
Test Project 20220930 on QEMU with the following commandline:
```
qemu-system-x86_64 -M q35,accel=kvm -m 8G -smp 8
-drive if=virtio,format=raw,file=root.img,file.locking=on
-device e1000e,netdev=netdev
-netdev tap,script=ifup,downscript=no,id=netdev
```
Fixes: bc7f75fa9788 ("[E1000E]: New pci-express e1000 driver (currently for ICH9 devices only)") Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel) Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 34c6367c946cbe604cd54bf1a074105d44165154) Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
for_each_pci_dev() is implemented by pci_get_device(). The comment of
pci_get_device() says that it will increase the reference count for the
returned pci_dev and also decrease the reference count for the input
pci_dev @from if it is not NULL.
If we break for_each_pci_dev() loop with pdev not NULL, we need to call
pci_dev_put() to decrease the reference count. Add the missing
pci_dev_put() after the 'out' label. Since pci_dev_put() can handle NULL
input parameter, there is no problem for the 'Device not found' branch.
For the normal path, add pci_dev_put() in amd_gpio_exit().
Fixes: f942a7de047d ("gpio: add a driver for GPIO pins found on AMD-8111 south bridge chips") Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 4271515f189bd5fe2ec86b4089dab7cb804625d2) Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
According to the description in ti-sn65dsi86's datasheet:
CHA_HSYNC_POLARITY:
0 = Active High Pulse. Synchronization signal is high for the sync
pulse width. (default)
1 = Active Low Pulse. Synchronization signal is low for the sync
pulse width.
CHA_VSYNC_POLARITY:
0 = Active High Pulse. Synchronization signal is high for the sync
pulse width. (Default)
1 = Active Low Pulse. Synchronization signal is low for the sync
pulse width.
We should only set these bits when the polarity is negative.
Fixes: a095f15c00e2 ("drm/bridge: add support for sn65dsi86 bridge driver") Signed-off-by: Qiqi Zhang <eddy.zhang@rock-chips.com> Reviewed-by: Douglas Anderson <dianders@chromium.org> Tested-by: Douglas Anderson <dianders@chromium.org> Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://patchwork.freedesktop.org/patch/msgid/20221125104558.84616-1-eddy.zhang@rock-chips.com Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit d57b60e9b35539d93a74e6de429de007b192c1bb) Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
The struct cas_control embeds multiple generic SPI structures and we
have to make sure these structures are initialized to default values.
This driver does not set all attributes. When using kmalloc before some
attributes were not initialized and contained random data which caused
random crashes at bootup.
Embarrassingly, nft_pipapo_insert() checked for interval validity in
the first field only.
The start_p and end_p pointers were reset to key data from the first
field at every iteration of the loop which was supposed to go over
the set fields.
Fixes: 3c4287f62044 ("nf_tables: Add set type for arbitrary concatenation of ranges") Reported-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 7b09ba9036b26a7ed41c62b1ce7cad378549021f) Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Here is a backtrace report about memory leak detected in
gpiochip_setup_dev():
unreferenced object 0xffff88810b406400 (size 512):
comm "python3", pid 1682, jiffies 4295346908 (age 24.090s)
backtrace:
kmalloc_trace
device_add device_private_init at drivers/base/core.c:3361
(inlined by) device_add at drivers/base/core.c:3411
cdev_device_add
gpiolib_cdev_register
gpiochip_setup_dev
gpiochip_add_data_with_key
gcdev_register() & gcdev_unregister() would call device_add() &
device_del() (no matter CONFIG_GPIO_CDEV is enabled or not) to
register/unregister device.
However, if device_add() succeeds, some resource (like
struct device_private allocated by device_private_init())
is not released by device_del().
Therefore, after device_add() succeeds by gcdev_register(), it
needs to call put_device() to release resource in the error handle
path.
Here we move forward the register of release function, and let it
release every piece of resource by put_device() instead of kfree().
While at it, fix another subtle issue, i.e. when gc->ngpio is equal
to 0, we still call kcalloc() and, in case of further error, kfree()
on the ZERO_PTR pointer, which is not NULL. It's not a bug per se,
but rather waste of the resources and potentially wrong expectation
about contents of the gdev->descs variable.
Fixes: 159f3cd92f17 ("gpiolib: Defer gpio device setup until after gpiolib initialization") Signed-off-by: Zeng Heng <zengheng4@huawei.com> Co-developed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 6daaa84b621485fe28c401be18debf92ae8ef04a) Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Several drivers read the 'ngpios' device property on their own, but
since it's defined as a standard GPIO property in the device tree bindings
anyway, it's a good candidate for generalization. If the driver didn't
set its gc->ngpio, try to read the 'ngpios' property from the GPIO
device's firmware node before bailing out.
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl> Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Stable-dep-of: ec851b23084b ("gpiolib: fix memory leak in gpiochip_setup_dev()") Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 1a1075d371082bda15892e203a9e464c459c4a7c) Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
The release function is called with a pointer to the memory returned by
devres_alloc(). I was confused about that by the code before the
generalization that used a struct clk **ptr.
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com> Fixes: abae8e57e49a ("clk: generalize devm_clk_get() a bit") Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Link: https://lore.kernel.org/r/20220620171815.114212-1-u.kleine-koenig@pengutronix.de Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Tested-by: Linux Kernel Functional Testing <lkft@linaro.org> Signed-off-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 3b714f25fc591641b92865b85933993c19923e47) Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Analogue to commit 8aa59e355949 ("can: af_can: fix NULL pointer
dereference in can_rx_register()") we need to check for a missing
initialization of ml_priv in the receive path of CAN frames.
Since commit 4e096a18867a ("net: introduce CAN specific pointer in the
struct net_device") the check for dev->type to be ARPHRD_CAN is not
sufficient anymore since bonding or tun netdevices claim to be CAN
devices but do not initialize ml_priv accordingly.
Fixes: 4e096a18867a ("net: introduce CAN specific pointer in the struct net_device") Reported-by: syzbot+2d7f58292cb5b29eb5ad@syzkaller.appspotmail.com Reported-by: Wei Chen <harperchen1110@gmail.com> Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Link: https://lore.kernel.org/all/20221206201259.3028-1-socketcan@hartkopp.net Cc: stable@vger.kernel.org Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit c142cba37de29f740a3852f01f59876af8ae462a) Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Syzbot reported shift-out-of-bounds in hid_report_raw_event.
microsoft 0003:045E:07DA.0001: hid_field_extract() called with n (128) >
32! (swapper/0)
======================================================================
UBSAN: shift-out-of-bounds in drivers/hid/hid-core.c:1323:20
shift exponent 127 is too large for 32-bit type 'int'
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.1.0-rc4-syzkaller-00159-g4bbf3422df78 #0
Hardware name: Google Compute Engine/Google Compute Engine, BIOS
Google 10/26/2022
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x1e3/0x2cb lib/dump_stack.c:106
ubsan_epilogue lib/ubsan.c:151 [inline]
__ubsan_handle_shift_out_of_bounds+0x3a6/0x420 lib/ubsan.c:322
snto32 drivers/hid/hid-core.c:1323 [inline]
hid_input_fetch_field drivers/hid/hid-core.c:1572 [inline]
hid_process_report drivers/hid/hid-core.c:1665 [inline]
hid_report_raw_event+0xd56/0x18b0 drivers/hid/hid-core.c:1998
hid_input_report+0x408/0x4f0 drivers/hid/hid-core.c:2066
hid_irq_in+0x459/0x690 drivers/hid/usbhid/hid-core.c:284
__usb_hcd_giveback_urb+0x369/0x530 drivers/usb/core/hcd.c:1671
dummy_timer+0x86b/0x3110 drivers/usb/gadget/udc/dummy_hcd.c:1988
call_timer_fn+0xf5/0x210 kernel/time/timer.c:1474
expire_timers kernel/time/timer.c:1519 [inline]
__run_timers+0x76a/0x980 kernel/time/timer.c:1790
run_timer_softirq+0x63/0xf0 kernel/time/timer.c:1803
__do_softirq+0x277/0x75b kernel/softirq.c:571
__irq_exit_rcu+0xec/0x170 kernel/softirq.c:650
irq_exit_rcu+0x5/0x20 kernel/softirq.c:662
sysvec_apic_timer_interrupt+0x91/0xb0 arch/x86/kernel/apic/apic.c:1107
======================================================================
If the size of the integer (unsigned n) is bigger than 32 in snto32(),
shift exponent will be too large for 32-bit type 'int', resulting in a
shift-out-of-bounds bug.
Fix this by adding a check on the size of the integer (unsigned n) in
snto32(). To add support for n greater than 32 bits, set n to 32, if n
is greater than 32.
The SJA1105 family has 45 L2 policing table entries
(SJA1105_MAX_L2_POLICING_COUNT) and SJA1110 has 110
(SJA1110_MAX_L2_POLICING_COUNT). Keeping the table structure but
accounting for the difference in port count (5 in SJA1105 vs 10 in
SJA1110) does not fully explain the difference. Rather, the SJA1110 also
has L2 ingress policers for multicast traffic. If a packet is classified
as multicast, it will be processed by the policer index 99 + SRCPORT.
The sja1105_init_l2_policing() function initializes all L2 policers such
that they don't interfere with normal packet reception by default. To have
a common code between SJA1105 and SJA1110, the index of the multicast
policer for the port is calculated because it's an index that is out of
bounds for SJA1105 but in bounds for SJA1110, and a bounds check is
performed.
The code fails to do the proper thing when determining what to do with the
multicast policer of port 0 on SJA1105 (ds->num_ports = 5). The "mcast"
index will be equal to 45, which is also equal to
table->ops->max_entry_count (SJA1105_MAX_L2_POLICING_COUNT). So it passes
through the check. But at the same time, SJA1105 doesn't have multicast
policers. So the code programs the SHARINDX field of an out-of-bounds
element in the L2 Policing table of the static config.
The comparison between index 45 and 45 entries should have determined the
code to not access this policer index on SJA1105, since its memory wasn't
even allocated.
With enough bad luck, the out-of-bounds write could even overwrite other
valid kernel data, but in this case, the issue was detected using KASAN.
Kernel log:
sja1105 spi5.0: Probed switch chip: SJA1105Q
==================================================================
BUG: KASAN: slab-out-of-bounds in sja1105_setup+0x1cbc/0x2340
Write of size 8 at addr ffffff880bd57708 by task kworker/u8:0/8
...
Workqueue: events_unbound deferred_probe_work_func
Call trace:
...
sja1105_setup+0x1cbc/0x2340
dsa_register_switch+0x1284/0x18d0
sja1105_probe+0x748/0x840
...
Allocated by task 8:
...
sja1105_setup+0x1bcc/0x2340
dsa_register_switch+0x1284/0x18d0
sja1105_probe+0x748/0x840
...
Fixes: 38fbe91f2287 ("net: dsa: sja1105: configure the multicast policers, if present") CC: stable@vger.kernel.org # 5.15+ Signed-off-by: Radu Nicolae Pirea (OSS) <radu-nicolae.pirea@oss.nxp.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Link: https://lore.kernel.org/r/20221207132347.38698-1-radu-nicolae.pirea@oss.nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 5e88c6f4aaa70c542e59e5a9d2244bcc99cd245d) Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
vm_open() is not allowed to fail. Fortunately we are guaranteed that
the pages are already pinned, thanks to the initial mmap which is now
being cloned into a forked process, and only need to increment the
refcnt. So just increment it directly. Previously if a signal was
delivered at the wrong time to the forking process, the
mutex_lock_interruptible() could fail resulting in the pages_use_count
not being incremented.
Fixes: 2194a63a818d ("drm: Add library for shmem backed GEM objects") Cc: stable@vger.kernel.org Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Javier Martinez Canillas <javierm@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221130185748.357410-3-robdclark@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 1074fefce9cc6730fe5f602c0287850c2698e315) Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
In the SDMA s0ix save process requires to turn off SDMA ring buffer for
avoiding the SDMA in-flight request, otherwise will suffer from SDMA page
fault which causes by page request from in-flight SDMA ring accessing at
SDMA restore phase.
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2248 Cc: stable@vger.kernel.org # 6.0,5.15+ Fixes: f8f4e2a51834 ("drm/amdgpu: skipping SDMA hw_init and hw_fini for S0ix.") Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Tested-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 249011f4c3b8a636f019c9740d5c4356d11031be) Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
When SEV is enabled gmr's and mob's are explicitly disabled because
the encrypted system memory can not be used by the hypervisor.
The driver was disabling GMR's but the presentation code, which depends
on GMR's, wasn't honoring it which lead to black screen on hosts
with SEV enabled.
Make sure screen objects presentation is not used when guest memory
regions have been disabled to fix presentation on SEV enabled hosts.
Fixes: 3b0d6458c705 ("drm/vmwgfx: Refuse DMA operation when SEV encryption is active") Cc: <stable@vger.kernel.org> # v5.7+ Signed-off-by: Zack Rusin <zackr@vmware.com> Reported-by: Nicholas Hunt <nhunt@vmware.com> Reviewed-by: Martin Krastev <krastevm@vmware.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221201175341.491884-1-zack@kde.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 1e4fe9a1546ffd3d261028fa042a160f5d91d281) Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
We recently experienced some weird huge time jumps in nested guests when
rebooting them in certain cases. After adding some debug code to the epoch
handling in vsie.c (thanks to David Hildenbrand for the idea!), it was
obvious that the "epdx" field (the multi-epoch extension) did not get set
to 0xff in case the "epoch" field was negative.
Seems like the code misses to copy the value from the epdx field from
the guest to the shadow control block. By doing so, the weird time
jumps are gone in our scenarios.
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2140899 Fixes: 8fa1696ea781 ("KVM: s390: Multiple Epoch Facility support") Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com> Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Reviewed-by: Janosch Frank <frankja@linux.ibm.com> Cc: stable@vger.kernel.org # 4.19+ Link: https://lore.kernel.org/r/20221123090833.292938-1-thuth@redhat.com
Message-Id: <20221123090833.292938-1-thuth@redhat.com> Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit f6550976fe22db551b53fc36d7eecb3297420f85) Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
After calling napi_complete_done(), the NAPIF_STATE_SCHED bit may be
cleared, and another CPU can start napi thread and access per-CQ variable,
cq->work_done. If the other thread (for example, from busy_poll) sets
it to a value >= budget, this thread will continue to run when it should
stop, and cause memory corruption and panic.
To fix this issue, save the per-CQ work_done variable in a local variable
before napi_complete_done(), so it won't be corrupted by a possible
concurrent thread after napi_complete_done().
Also, add a flag bit to advertise to the NIC firmware: the NAPI work_done
variable race is fixed, so the driver is able to reliably support features
like busy_poll.
The rationale of showing this is that it's potentially critical
information to diagnose and find more CSR compatibility bugs in the
future and it will save a lot of headaches.
Given that clones come from a wide array of vendors (some are actually
Barrot, some are something else) and these numbers are what let us find
differences between actual and fake ones, it will be immensely helpful
to scour the Internet looking for this pattern and building an actual
database to find correlations and improve the checks.
Cc: stable@vger.kernel.org Cc: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Ismael Ferreras Morezuelas <swyterzone@gmail.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 1dee2b5047718a8ec725be639c1ab8e6809bf337) Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
For dax pud, pud_huge() returns true on x86. So the function works as long
as hugetlb is configured. However, dax doesn't depend on hugetlb.
Commit 414fd080d125 ("mm/gup: fix gup_pmd_range() for dax") fixed
devmap-backed huge PMDs, but missed devmap-backed huge PUDs. Fix this as
well.
memcg_write_event_control() accesses the dentry->d_name of the specified
control fd to route the write call. As a cgroup interface file can't be
renamed, it's safe to access d_name as long as the specified file is a
regular cgroup file. Also, as these cgroup interface files can't be
removed before the directory, it's safe to access the parent too.
Prior to 347c4a874710 ("memcg: remove cgroup_event->cft"), there was a
call to __file_cft() which verified that the specified file is a regular
cgroupfs file before further accesses. The cftype pointer returned from
__file_cft() was no longer necessary and the commit inadvertently dropped
the file type check with it allowing any file to slip through. With the
invarients broken, the d_name and parent accesses can now race against
renames and removals of arbitrary files and cause use-after-free's.
Fix the bug by resurrecting the file type check in __file_cft(). Now that
cgroupfs is implemented through kernfs, checking the file operations needs
to go through a layer of indirection. Instead, let's check the superblock
and dentry type.
Link: https://lkml.kernel.org/r/Y5FRm/cfcKPGzWwl@slm.duckdns.org Fixes: 347c4a874710 ("memcg: remove cgroup_event->cft") Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Jann Horn <jannh@google.com> Acked-by: Roman Gushchin <roman.gushchin@linux.dev> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: <stable@vger.kernel.org> [3.14+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit aad8bbd17a1d586005feb9226c2e9cfce1432e13) Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Sanity checks were added to verify the v4l2_bt_timings blanking fields
in order to avoid integer overflows when userspace passes weird values.
But that assumed that userspace would correctly fill in the front porch,
backporch and sync values, but sometimes all you know is the total
blanking, which is then assigned to just one of these fields.
And that can fail with these checks.
So instead set a maximum for the total horizontal and vertical
blanking and check that each field remains below that.
That is still sufficient to avoid integer overflows, but it also
allows for more flexibility in how userspace fills in these fields.
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Fixes: 4b6d66a45ed3 ("media: v4l2-dv-timings: add sanity checks for blanking values") Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 6fb8bc29bfa80707994a63cc97e2f9920e0b0608) Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
It introduced a boot regression on colibri-imx7, and potentially any
other i.MX7 boards with MTD partition list generated into the fdt by
U-Boot.
While the commit we are reverting here is not obviously wrong, it fixes
only a dt binding checker warning that is non-functional, while it
introduces a boot regression and there is no obvious fix ready.
The bus->clk_stop_timeout member is only initialized to a non-zero value
during the codec driver probe. This can lead to corner cases where this
value remains pegged at zero when the bus suspends, which results in an
endless loop in sdw_bus_wait_for_clk_prep_deprep().
Corner cases include configurations with no codecs described in the
firmware, or delays in probing codec drivers.
Initializing the default timeout to the smallest non-zero value avoid this
problem and allows for the existing logic to be preserved: the
bus->clk_stop_timeout is set as the maximum required by all codecs
connected on the bus.
While vb2_mmap took the mmap_lock mutex, vb2_get_unmapped_area didn't.
Add this.
Also take this opportunity to move the 'q->memory != VB2_MEMORY_MMAP'
check and vb2_fileio_is_active() check into __find_plane_by_offset() so
both vb2_mmap and vb2_get_unmapped_area do the same checks.
Since q->memory is checked while mmap_lock is held, also take that lock
in reqbufs and create_bufs when it is set, and set it back to
MEMORY_UNKNOWN on error.
It is not allowed to call kfree_skb() from hardware interrupt
context or with interrupts being disabled. So remove kfree_skb()
from the spin_lock_irqsave() section and use the already existing
"drop" label in xenvif_start_xmit() for dropping the SKB. At the
same time replace the dev_kfree_skb() call there with a call of
dev_kfree_skb_any(), as xenvif_start_xmit() can be called with
disabled interrupts.
This is XSA-424 / CVE-2022-42328 / CVE-2022-42329.
Fixes: be81992f9086 ("xen/netback: don't queue unlimited number of packages") Reported-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 5d0fa6fc8899fe842329c0109f8ddd01144b1ed8) Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
In some cases, the frontend may send a packet where the protocol headers
are spread across multiple slots. This would result in netback creating
an skb where the protocol headers spill over into the non-linear area.
Some drivers and NICs don't handle this properly resulting in an
interface reset or worse.
This issue was introduced by the removal of an unconditional skb pull in
the tx path to improve performance. Fix this without reintroducing the
pull by setting up grant copy ops for as many slots as needed to reach
the XEN_NETBACK_TX_COPY_LEN size. Adjust the rest of the code to handle
multiple copy operations per skb.
This is XSA-423 / CVE-2022-3643.
Fixes: 7e5d7753956b ("xen-netback: remove unconditional __pskb_pull_tail() in guest Tx path") Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Reviewed-by: Paul Durrant <paul@xen.org> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 0fe29bd92594a747a2561589bd452c259451929e) Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
edid_read() was assumed to return 0 on success. After commit 7f16d0f3b8e2("drm/bridge: anx7625: Propagate errors from sp_tx_rst_aux()"),
the function will return > 0 for successful case, representing the i2c
read bytes. Otherwise -EIO on failure cases. Update the g_edid_break
break condition accordingly.
Fixes: 7f16d0f3b8e2("drm/bridge: anx7625: Propagate errors from sp_tx_rst_aux()") Signed-off-by: Hsin-Yi Wang <hsinyi@chromium.org> Reviewed-by: Robert Foss <robert.foss@linaro.org> Reviewed-by: Xin Ji <xji@analogixsemi.com> Signed-off-by: Robert Foss <robert.foss@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20211118193002.407168-1-hsinyi@chromium.org Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit f01677be31d1d52f2330f6f94d6bd943b9317692) Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
`hostname` needs to be set as null-pointer after free in
`cifs_put_tcp_session` function, or when `cifsd` thread attempts
to resolve hostname and reconnect the host, the thread would deref
the invalid pointer.
Here is one of practical backtrace examples as reference:
cifsd
---------------------------
kthread
cifs_demultiplex_thread
cifs_reconnect
reconn_set_ipaddr_from_hostname
--> if (!server->hostname)
--> if (server->hostname[0] == '\0') // !! UAF fault here
CIFS: VFS: cifs_mount failed w/return code = -112
mount error(112): Host is down
BUG: KASAN: use-after-free in reconn_set_ipaddr_from_hostname+0x2ba/0x310
Read of size 1 at addr ffff888108f35380 by task cifsd/480
CPU: 2 PID: 480 Comm: cifsd Not tainted 6.1.0-rc2-00106-gf705792f89dd-dirty #25
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0x68/0x85
print_report+0x16c/0x4a3
kasan_report+0x95/0x190
reconn_set_ipaddr_from_hostname+0x2ba/0x310
__cifs_reconnect.part.0+0x241/0x800
cifs_reconnect+0x65f/0xb60
cifs_demultiplex_thread+0x1570/0x2570
kthread+0x2c5/0x380
ret_from_fork+0x22/0x30
</TASK>
Allocated by task 477:
kasan_save_stack+0x1e/0x40
kasan_set_track+0x21/0x30
__kasan_kmalloc+0x7e/0x90
__kmalloc_node_track_caller+0x52/0x1b0
kstrdup+0x3b/0x70
cifs_get_tcp_session+0xbc/0x19b0
mount_get_conns+0xa9/0x10c0
cifs_mount+0xdf/0x1970
cifs_smb3_do_mount+0x295/0x1660
smb3_get_tree+0x352/0x5e0
vfs_get_tree+0x8e/0x2e0
path_mount+0xf8c/0x1990
do_mount+0xee/0x110
__x64_sys_mount+0x14b/0x1f0
do_syscall_64+0x3b/0x90
entry_SYSCALL_64_after_hwframe+0x63/0xcd
Freed by task 477:
kasan_save_stack+0x1e/0x40
kasan_set_track+0x21/0x30
kasan_save_free_info+0x2a/0x50
__kasan_slab_free+0x10a/0x190
__kmem_cache_free+0xca/0x3f0
cifs_put_tcp_session+0x30c/0x450
cifs_mount+0xf95/0x1970
cifs_smb3_do_mount+0x295/0x1660
smb3_get_tree+0x352/0x5e0
vfs_get_tree+0x8e/0x2e0
path_mount+0xf8c/0x1990
do_mount+0xee/0x110
__x64_sys_mount+0x14b/0x1f0
do_syscall_64+0x3b/0x90
entry_SYSCALL_64_after_hwframe+0x63/0xcd
The buggy address belongs to the object at ffff888108f35380
which belongs to the cache kmalloc-16 of size 16
The buggy address is located 0 bytes inside of
16-byte region [ffff888108f35380, ffff888108f35390)
The buggy address belongs to the physical page:
page:00000000333f8e58 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888108f350e0 pfn:0x108f35
flags: 0x200000000000200(slab|node=0|zone=2)
raw: 02000000000002000000000000000000dead000000000122ffff8881000423c0
raw: ffff888108f350e0000000008080007a00000001ffffffff0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address: ffff888108f35280: fa fb fc fc fa fb fc fc fa fb fc fc fa fb fc fc ffff888108f35300: fa fb fc fc fa fb fc fc fa fb fc fc fa fb fc fc
>ffff888108f35380: fa fb fc fc fa fb fc fc fa fb fc fc fa fb fc fc
^ ffff888108f35400: fa fb fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff888108f35480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
Fixes: 7be3248f3139 ("cifs: To match file servers, make sure the server hostname matches") Signed-off-by: Zeng Heng <zengheng4@huawei.com> Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit ee2536830b161d16859b2771effdde6b819c253f) Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Some Intel chipsets disconnect the time and date RTC registers when the
clock update is in progress: during this time reads may return bogus
values and writes fail silently. This includes the RTC alarm registers.
[1]
cmos_read_alarm() did not take account for that, which caused alarm time
reads to sometimes return bogus values. This can be shown with a test
patch that I am attaching to this patch series.
Fix this, by using mc146818_avoid_UIP().
[1] 7th Generation Intel ® Processor Family I/O for U/Y Platforms [...]
Datasheet, Volume 1 of 2 (Intel's Document Number: 334658-006)
Page 208
https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/7th-and-8th-gen-core-family-mobile-u-y-processor-lines-i-o-datasheet-vol-1.pdf
"If a RAM read from the ten time and date bytes is attempted
during an update cycle, the value read do not necessarily
represent the true contents of those locations. Any RAM writes
under the same conditions are ignored."
Some Intel chipsets disconnect the time and date RTC registers when the
clock update is in progress: during this time reads may return bogus
values and writes fail silently. This includes the RTC alarm registers.
[1]
cmos_set_alarm() did not take account for that, fix it.
[1] 7th Generation Intel ® Processor Family I/O for U/Y Platforms [...]
Datasheet, Volume 1 of 2 (Intel's Document Number: 334658-006)
Page 208
https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/7th-and-8th-gen-core-family-mobile-u-y-processor-lines-i-o-datasheet-vol-1.pdf
"If a RAM read from the ten time and date bytes is attempted
during an update cycle, the value read do not necessarily
represent the true contents of those locations. Any RAM writes
under the same conditions are ignored."
Function mc146818_get_time() contains an elaborate mechanism of reading
the RTC time while no RTC update is in progress. It turns out that
reading the RTC alarm clock also requires avoiding the RTC update.
Therefore, the mechanism in mc146818_get_time() should be reused - so
extract it into a separate function.
The logic in mc146818_avoid_UIP() is same as in mc146818_get_time()
except that after every
if (CMOS_READ(RTC_FREQ_SELECT) & RTC_UIP) {
there is now "mdelay(1)".
To avoid producing a very unreadable patch, mc146818_get_time() will be
refactored to use mc146818_avoid_UIP() in the next patch.
Any codepath that zaps page table entries must invoke MMU notifiers to
ensure that secondary MMUs (like KVM) don't keep accessing pages which
aren't mapped anymore. Secondary MMUs don't hold their own references to
pages that are mirrored over, so failing to notify them can lead to page
use-after-free.
I'm marking this as addressing an issue introduced in commit f3f0e1d2150b
("khugepaged: add support of collapse for tmpfs/shmem pages"), but most of
the security impact of this only came in commit 27e1f8273113 ("khugepaged:
enable collapse pmd for pte-mapped THP"), which actually omitted flushes
for the removal of present PTEs, not just for the removal of empty page
tables.
Since commit 70cbc3cc78a99 ("mm: gup: fix the fast GUP race against THP
collapse"), the lockless_pages_from_mm() fastpath rechecks the pmd_t to
ensure that the page table was not removed by khugepaged in between.
However, lockless_pages_from_mm() still requires that the page table is
not concurrently freed. Fix it by sending IPIs (if the architecture uses
semi-RCU-style page table freeing) before freeing/reusing page tables.
pagetable walks on address ranges mapped by VMAs can be done under the
mmap lock, the lock of an anon_vma attached to the VMA, or the lock of the
VMA's address_space. Only one of these needs to be held, and it does not
need to be held in exclusive mode.
Under those circumstances, the rules for concurrent access to page table
entries are:
- Terminal page table entries (entries that don't point to another page
table) can be arbitrarily changed under the page table lock, with the
exception that they always need to be consistent for
hardware page table walks and lockless_pages_from_mm().
This includes that they can be changed into non-terminal entries.
- Non-terminal page table entries (which point to another page table)
can not be modified; readers are allowed to READ_ONCE() an entry, verify
that it is non-terminal, and then assume that its value will stay as-is.
Retracting a page table involves modifying a non-terminal entry, so
page-table-level locks are insufficient to protect against concurrent page
table traversal; it requires taking all the higher-level locks under which
it is possible to start a page walk in the relevant range in exclusive
mode.
The collapse_huge_page() path for anonymous THP already follows this rule,
but the shmem/file THP path was getting it wrong, making it possible for
concurrent rmap-based operations to cause corruption.
LARA-L6 module can be configured (by AT interface) in three different
USB modes:
* Default mode (Vendor ID: 0x1546 Product ID: 0x1341) with 4 serial
interfaces
* RmNet mode (Vendor ID: 0x1546 Product ID: 0x1342) with 4 serial
interfaces and 1 RmNet virtual network interface
* CDC-ECM mode (Vendor ID: 0x1546 Product ID: 0x1343) with 4 serial
interface and 1 CDC-ECM virtual network interface
In RmNet mode LARA-L6 exposes the following interfaces:
If 0: Diagnostic
If 1: AT parser
If 2: AT parser
If 3: AT parset/alternative functions
If 4: RMNET interface
Signed-off-by: Davide Tronchin <davide.tronchin.94@gmail.com> Acked-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 26f084e554669b203ef33c2f753a9629d77c2823) Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
trans_xen did not check the data fits into the buffer before copying
from the xen ring, but we probably should.
Add a check that just skips the request and return an error to
userspace if it did not fit
If there's a disconnection while operating in eSS, there may be a delay
in VBUS drop response from the connector. In that case, the internal
link state may drop to operate in usb2 speed while the controller thinks
the VBUS is still high. The driver must make sure to disable
GUSB2PHYCFG.SUSPHY when sending endpoint command while in usb2 speed.
The End Transfer command may be called, and only that command needs to
go through at this point. Let's keep it simple and unconditionally
disable GUSB2PHYCFG.SUSPHY whenever we issue the command.
This scenario is not seen in real hardware. In a rare case, our
prototype type-c controller/interface may have a slow response
triggerring this issue.
Status is reported as always off in the 6032 case. Status
reporting now matches the logic in the setters. Once of
the differences to the 6030 is that there are no groups,
therefore the state needs to be read out in the lower bits.
When trying to see if we can clone a file range, there are cases where we
end up sending two write operations in case the inode from the source root
has an i_size that is not sector size aligned and the length from the
current offset to its i_size is less than the remaining length we are
trying to clone.
Issuing two write operations when we could instead issue a single write
operation is not incorrect. However it is not optimal, specially if the
extents are compressed and the flag BTRFS_SEND_FLAG_COMPRESSED was passed
to the send ioctl. In that case we can end up sending an encoded write
with an offset that is not sector size aligned, which makes the receiver
fallback to decompressing the data and writing it using regular buffered
IO (so re-compressing the data in case the fs is mounted with compression
enabled), because encoded writes fail with -EINVAL when an offset is not
sector size aligned.
The following example, which triggered a bug in the receiver code for the
fallback logic of decompressing + regular buffer IO and is fixed by the
patchset referred in a Link at the bottom of this changelog, is an example
where we have the non-optimal behaviour due to an unaligned encoded write:
$ cat test.sh
#!/bin/bash
DEV=/dev/sdj
MNT=/mnt/sdj
mkfs.btrfs -f $DEV > /dev/null
mount -o compress $DEV $MNT
# File foo has a size of 33K, not aligned to the sector size.
xfs_io -f -c "pwrite -S 0xab 0 33K" $MNT/foo
xfs_io -f -c "pwrite -S 0xcd 0 64K" $MNT/bar
# Now clone the first 32K of file bar into foo at offset 0.
xfs_io -c "reflink $MNT/bar 0 0 32K" $MNT/foo
# Snapshot the default subvolume and create a full send stream (v2).
btrfs subvolume snapshot -r $MNT $MNT/snap
echo -e "\nFile bar in the original filesystem:"
od -A d -t x1 $MNT/snap/bar
umount $MNT
mkfs.btrfs -f $DEV > /dev/null
mount $DEV $MNT
echo -e "\nReceiving stream in a new filesystem..."
btrfs receive -f /tmp/test.send $MNT
echo -e "\nFile bar in the new filesystem:"
od -A d -t x1 $MNT/snap/bar
umount $MNT
Before this patch, the send stream included one regular write and one
encoded write for file 'bar', with the later being not sector size aligned
and causing the receiver to fallback to decompression + buffered writes.
The output of the btrfs receive command in verbose mode (-vvv):
(...)
mkfile o258-7-0
rename o258-7-0 -> bar
utimes
clone bar - source=foo source offset=0 offset=0 length=32768
write bar - offset=32768 length=1024
encoded_write bar - offset=33792, len=4096, unencoded_offset=33792, unencoded_file_len=31744, unencoded_len=65536, compression=1, encryption=0
encoded_write bar - falling back to decompress and write due to errno 22 ("Invalid argument")
(...)
This patch avoids the regular write followed by an unaligned encoded write
so that we end up sending a single encoded write that is aligned. So after
this patch the stream content is (output of btrfs receive -vvv):
(...)
mkfile o258-7-0
rename o258-7-0 -> bar
utimes
clone bar - source=foo source offset=0 offset=0 length=32768
encoded_write bar - offset=32768, len=4096, unencoded_offset=32768, unencoded_file_len=32768, unencoded_len=65536, compression=1, encryption=0
(...)
So we get more optimal behaviour and avoid the silent data loss bug in
versions of btrfs-progs affected by the bug referred by the Link tag
below (btrfs-progs v5.19, v5.19.1, v6.0 and v6.0.1).
The `nettest` binary, built from `selftests/net/nettest.c`,
was expected to be found in the path during test execution of
`fcnal-test.sh` and `pmtu.sh`, leading to tests getting
skipped when the binary is not installed in the system, as can
be seen in these logs found in the wild [1]:
# TEST: vti4: PMTU exceptions [SKIP]
[ 350.600250] IPv6: ADDRCONF(NETDEV_CHANGE): veth_b: link becomes ready
[ 350.607421] IPv6: ADDRCONF(NETDEV_CHANGE): veth_a: link becomes ready
# 'nettest' command not found; skipping tests
# xfrm6udp not supported
# TEST: vti6: PMTU exceptions (ESP-in-UDP) [SKIP]
[ 351.605102] IPv6: ADDRCONF(NETDEV_CHANGE): veth_b: link becomes ready
[ 351.612243] IPv6: ADDRCONF(NETDEV_CHANGE): veth_a: link becomes ready
# 'nettest' command not found; skipping tests
# xfrm4udp not supported
The `unicast_extensions.sh` tests also rely on `nettest`, but
it runs fine there because it looks for the binary in the
current working directory [2]:
The same mechanism that works for the Unicast extensions tests
is here copied over to the PMTU and functional tests.
Signed-off-by: Daniel Díaz <daniel.diaz@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit f54e1edf579d7f58476c57de97f85abaa7b97bc0) Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
With clang's kernel control flow integrity (kCFI, CONFIG_CFI_CLANG),
indirect call targets are validated against the expected function
pointer prototype to make sure the call target is valid to help mitigate
ROP attacks. If they are not identical, there is a failure at run time,
which manifests as either a kernel panic or thread getting killed.
seq_copy_in_user() and seq_copy_in_kernel() did not have prototypes
matching snd_seq_dump_func_t. Adjust this and remove the casts. There
are not resulting binary output differences.
This was found as a result of Clang's new -Wcast-function-type-strict
flag, which is more sensitive than the simpler -Wcast-function-type,
which only checks for type width mismatches.
Sony's downstream driver [1], among some other changes, adds a
seemingly random 10ms usleep_range, which turned out to be necessary
for the hardware to function properly on at least Sony Xperia 1 IV.
Without this, I2C transactions with the SLG51000 straight up fail.
Relax (10-10ms -> 10-11ms) and add the aforementioned sleep to make
sure the hardware has some time to wake up.
Due to the hardware behavior, it takes some time for CBJ detection/impedance sensing/de-bounce.
The ClockStop_NotFinished flag will be raised until these functions are completed.
In ClockStopMode0 mode case, the SdW controller might check this flag from D3 to D0 when the
jack detection interrupt happened.
The clock source and the sched_clock provided by the arm_global_timer
on Rockchip rk3066a/rk3188 are quite unstable because their rates
depend on the CPU frequency.
Recent changes to the arm_global_timer driver makes it impossible to use.
On the other side, the arm_global_timer has a higher rating than the
ROCKCHIP_TIMER, it will be selected by default by the time framework
while we want to use the stable Rockchip clock source.
Keep the arm_global_timer disabled in order to have the
DW_APB_TIMER (rk3066a) or ROCKCHIP_TIMER (rk3188) selected by default.
A DEVAPC violation occurs when removing the module
due to accessing HW registers without base clock.
To fix this bug, the correct method is:
1. Call the runtime resume function to enable the
clock;
2. Operate the registers to reset the HW;
3. Turn off the clocks and disable the device
RPM mechanism.
DSPCLK_DIV field in WM8962_CLOCKING1 register is used to generate
correct frequency of LRCLK and BCLK. Sometimes the read-only value
can't be updated timely after enabling SYSCLK. This results in wrong
calculation values. Delay is introduced here to wait for newest value
from register. The time of the delay should be at least 500~1000us
according to test.
Actually in no-MMU SoCs(i.e. i.MXRT) ZERO_PAGE(vaddr) expands to
```
virt_to_page(0)
```
that in order expands to:
```
pfn_to_page(virt_to_pfn(0))
```
and then virt_to_pfn(0) to:
```
((((unsigned long)(0) - PAGE_OFFSET) >> PAGE_SHIFT) +
PHYS_PFN_OFFSET)
```
where PAGE_OFFSET and PHYS_PFN_OFFSET are the DRAM offset(0x80000000) and
PAGE_SHIFT is 12. This way we obtain 16MB(0x01000000) summed to the base of
DRAM(0x80000000).
When ZERO_PAGE(0) is then used, for example in bio_add_page(), the page
gets an address that is out of DRAM bounds.
So instead of using fake virtual page 0 let's allocate a dedicated
zero_page during paging_init() and assign it to a global 'struct page *
empty_zero_page' the same way mmu.c does and it's the same approach used
in m68k with commit dc068f462179 as discussed here[0]. Then let's move
ZERO_PAGE() definition to the top of pgtable.h to be in common between
mmu.c and nommu.c.
Store the frame address where arm_get_current_stackframe() looks for it
(ARM_r7 instead of ARM_fp if CONFIG_THUMB2_KERNEL=y). Otherwise frame->fp
gets set to 0, causing unwind_frame() to fail.
We must prevent the CPU from reordering the files->count read with the
FD table access like this, on architectures where read-read reordering is
possible:
I would like to mark this for stable, but the stable rules explicitly say
"no theoretical races", and given that the FD table pointer and
files->count are explicitly stored in the same cacheline, this sort of
reordering seems quite unlikely in practice...
Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 66717ad03b80802752c9931afeedfcbd6806a8c2) Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
The clk_disable_unprepare() should be called in the error handling
of devm_clk_bulk_get_optional, fix it by replacing devm_clk_get_optional
and clk_prepare_enable by devm_clk_get_optional_enabled.
Fixes: f5eccd94b63f ("mmc: mediatek: Add subsys clock control for MT8192 msdc") Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20221125090141.3626747-1-cuigaosheng1@huawei.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 5a1122e1a896117485bd58634ca39ed2d4950a66) Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Allow to add an exit hook to devm managed clocks. Also use
clk_get_optional() in devm_clk_get_optional instead of open coding it.
The generalisation will be used in the next commit to add some more
devm_clk helpers.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Alexandru Ardelean <aardelean@deviqon.com> Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Link: https://lore.kernel.org/r/20220520075737.758761-3-u.kleine-koenig@pengutronix.de Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Stable-dep-of: c61bfb1cb63d ("mmc: mtk-sd: Fix missing clk_disable_unprepare in msdc_of_clock_parse()") Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit eb94a7a20f3c8dee1eb3a4f47ac21a3ec8229d97) Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
proc_skip_spaces() seems to think it is working on C strings, and ends
up being just a wrapper around skip_spaces() with a really odd calling
convention.
Instead of basing it on skip_spaces(), it should have looked more like
proc_skip_char(), which really is the exact same function (except it
skips a particular character, rather than whitespace). So use that as
inspiration, odd coding and all.
Now the calling convention actually makes sense and works for the
intended purpose.
Reported-and-tested-by: Kyle Zeng <zengyhkyle@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 48642f94311b0cf9667aa6833f9f5e3a87d2a0ce) Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
proc_get_long() is passed a size_t, but then assigns it to an 'int'
variable for the length. Let's not do that, even if our IO paths are
limited to MAX_RW_COUNT (exactly because of these kinds of type errors).
When __do_semtimedop() goes to sleep because it has to wait for a
semaphore value becoming zero or becoming bigger than some threshold, it
links the on-stack sem_queue to the sem_array, then goes to sleep
without holding a reference on the sem_array.
When __do_semtimedop() comes back out of sleep, one of two things must
happen:
a) We prove that the on-stack sem_queue has been disconnected from the
(possibly freed) sem_array, making it safe to return from the stack
frame that the sem_queue exists in.
b) We stabilize our reference to the sem_array, lock the sem_array, and
detach the sem_queue from the sem_array ourselves.
sem_array has RCU lifetime, so for case (b), the reference can be
stabilized inside an RCU read-side critical section by locklessly
checking whether the sem_queue is still connected to the sem_array.
However, the current code does the lockless check on sem_queue before
starting an RCU read-side critical section, so the result of the
lockless check immediately becomes useless.
Fix it by doing rcu_read_lock() before the lockless check. Now RCU
ensures that if we observe the object being on our queue, the object
can't be freed until rcu_read_unlock().
This bug is only hittable on kernel builds with full preemption support
(either CONFIG_PREEMPT or PREEMPT_DYNAMIC with preempt=full).
Fixes: 370b262c896e ("ipc/sem: avoid idr tree lookup for interrupted semop") Cc: stable@vger.kernel.org Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 4a4073a2e2fe392db08c00c39fdc5e2f8f198547) Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Currently tpm transactions are executed unconditionally in
tpm_pm_suspend() function, which may lead to races with other tpm
accessors in the system.
Specifically, the hw_random tpm driver makes use of tpm_get_random(),
and this function is called in a loop from a kthread, which means it's
not frozen alongside userspace, and so can race with the work done
during system suspend:
On the subject of suspend, the RISC-V SBI spec states:
This does not cover whether any given events actually reach the hart or
not, just what the hart will do if it receives an event. On PolarFire
SoC, and potentially other SiFive based implementations, events from the
RISC-V timer do reach a hart during suspend. This is not the case for the
implementation on the Allwinner D1 - there timer events are not received
during suspend.
To fix this, the CLOCK_EVT_FEAT_C3STOP (mis)feature was enabled for the
timer driver - but this has broken both RCU stall detection and timers
generally on PolarFire SoC and potentially other SiFive based
implementations.
If an AXI read to the PCIe controller on PolarFire SoC times out, the
system will stall, however, with CLOCK_EVT_FEAT_C3STOP active, the system
just locks up without RCU stalling:
Similarly issues were reported with clock_nanosleep() - with a test app
that sleeps each cpu for 6, 5, 4, 3 ms respectively, HZ=250 & the blamed
commit in place, the sleep times are rounded up to the next jiffy:
Fortunately, the D1 has a second timer, which is "currently used in
preference to the RISC-V/SBI timer driver" so a revert here does not
hurt operation of D1 in its current form.
Ultimately, a DeviceTree property (or node) will be added to encode the
behaviour of the timers, but until then revert the addition of
CLOCK_EVT_FEAT_C3STOP.
In a system with a single initiator node, and one or more memory-only
'target' nodes, the memory-only node(s) would fail to register their
initiator node correctly. i.e. in sysfs:
# ls /sys/devices/system/node/node0/access0/targets/
node0
Where as the correct behavior should be:
# ls /sys/devices/system/node/node0/access0/targets/
node0 node1
This happened because hmat_register_target_initiators() uses list_sort()
to sort the initiator list, but the sort comparision function
(initiator_cmp()) is overloaded to also set the node mask's bits.
In a system with a single initiator, the list is singular, and list_sort
elides the comparision helper call. Thus the node mask never gets set,
and the subsequent search for the best initiator comes up empty.
Add a new helper to consume the sorted initiator list, and generate the
nodemask, decoupling it from the overloaded initiator_cmp() comparision
callback. This prevents the singular list corner case naturally, and
makes the code easier to follow as well.
Cc: <stable@vger.kernel.org> Cc: Rafael J. Wysocki <rafael@kernel.org> Cc: Liu Shixin <liushixin2@huawei.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reported-by: Chris Piper <chris.d.piper@intel.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Link: https://lore.kernel.org/r/20221116-acpi_hmat_fix-v2-2-3712569be691@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit a759057af7285263093ab10b0bac3e14f35e5fea) Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
In hmat_register_target_initiators(), the variable 'best' gets
initialized in the outer per-locality-type for loop. The initialization
just before setting up 'Access 1' targets was unnecessary. Remove it.
Cc: Rafael J. Wysocki <rafael@kernel.org> Cc: Liu Shixin <liushixin2@huawei.com> Cc: Dan Williams <dan.j.williams@intel.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> Link: https://lore.kernel.org/r/20221116-acpi_hmat_fix-v2-1-3712569be691@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Stable-dep-of: 48d4180939e1 ("ACPI: HMAT: Fix initiator registration for single-initiator systems") Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit da8a794d71ec99ee92bde4e174d6a6169c00b381) Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Recent changes to the DMA code has resulting in the IMX driver failing
I2C transfers when the buffer has been vmalloc. Only perform DMA
transfers if the message has the I2C_M_DMA_SAFE flag set, indicating
the client is providing a buffer which is DMA safe.
This is a minimal fix for stable. The I2C core provides helpers to
allocate a bounce buffer. For a fuller fix the master should make use
of these helpers.
Fixes: 4544b9f25e70 ("dma-mapping: Add vmap checks to dma_map_single()") Signed-off-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: Wolfram Sang <wsa@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 2d16161a2c98730dc56d456e6dd1702a0a9fe735) Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
A problem about i2c-npcm7xx create debugfs failed is triggered with the
following log given:
[ 173.827310] debugfs: Directory 'npcm_i2c' with parent '/' already present!
The reason is that npcm_i2c_init() returns platform_driver_register()
directly without checking its return value, if platform_driver_register()
failed, it returns without destroy the newly created debugfs, resulting
the debugfs of npcm_i2c can never be created later.
The STM32 USART can control RS-485 Transmit Enable in hardware. Since
commit 7df5081cbf5e ("serial: stm32: Add RS485 RTS GPIO control"),
it can alternatively be controlled in software. That was done to allow
RS-485 even if the RTS pin is unavailable because it's pinmuxed to a
different function.
However the commit neglected to deassert Transmit Enable upon invocation
of the ->rs485_config() callback. Fix it.
Avoid forward declarations by moving stm32_usart_tx_empty(),
stm32_usart_rs485_rts_enable() and stm32_usart_rs485_rts_disable()
further up in the driver.
In case the RS485 mode is emulated using GPIO RTS, use the TC interrupt
to deassert the GPIO RTS, otherwise the GPIO RTS stays asserted after a
transmission ended and the RS485 cannot work.
Signed-off-by: Marek Vasut <marex@denx.de> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Erwan Le Ray <erwan.leray@foss.st.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jean Philippe Romain <jean-philippe.romain@foss.st.com> Cc: Valentin Caron <valentin.caron@foss.st.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-stm32@st-md-mailman.stormreply.com
To: linux-serial@vger.kernel.org Link: https://lore.kernel.org/r/20220430162845.244655-2-marex@denx.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 45f628f4fd816f4843f6860996fcaa2b461165d1) Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>