- dsa: microchip: fix kernel oops on ksz8 switches
- dsa: qca8k: fix NULL pointer dereference for
of_device_get_match_data
Previous releases - regressions:
- netfilter: clean up hook list when offload flags check fails
- wifi: mt76: fix crash in chip reset fail
- rxrpc: fix ICMP/ICMP6 error handling
- ice: fix DMA mappings leak
- i40e: fix kernel crash during module removal
Previous releases - always broken:
- ipv6: sr: fix out-of-bounds read when setting HMAC data.
- tcp: TX zerocopy should not sense pfmemalloc status
- sch_sfb: don't assume the skb is still around after
enqueueing to child
- netfilter: drop dst references before setting
- wifi: wilc1000: fix DMA on stack objects
- rxrpc: fix an insufficiently large sglist in
rxkad_verify_packet_2()
- fec: use a spinlock to guard `fep->ptp_clk_on`
Misc:
- usb: qmi_wwan: add Quectel RM520N"
* tag 'net-6.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (50 commits)
sch_sfb: Also store skb len before calling child enqueue
net: phy: lan87xx: change interrupt src of link_up to comm_ready
net/smc: Fix possible access to freed memory in link clear
net: ethernet: mtk_eth_soc: check max allowed hash in mtk_ppe_check_skb
net: skb: export skb drop reaons to user by TRACE_DEFINE_ENUM
net: ethernet: mtk_eth_soc: fix typo in __mtk_foe_entry_clear
net: dsa: felix: access QSYS_TAG_CONFIG under tas_lock in vsc9959_sched_speed_set
net: dsa: felix: disable cut-through forwarding for frames oversized for tc-taprio
net: dsa: felix: tc-taprio intervals smaller than MTU should send at least one packet
net: usb: qmi_wwan: add Quectel RM520N
net: dsa: qca8k: fix NULL pointer dereference for of_device_get_match_data
tcp: fix early ETIMEDOUT after spurious non-SACK RTO
stmmac: intel: Simplify intel_eth_pci_remove()
net: mvpp2: debugfs: fix memory leak when using debugfs_lookup()
ipv6: sr: fix out-of-bounds read when setting HMAC data.
bonding: accept unsolicited NA message
bonding: add all node mcast address when slave up
bonding: use unspecified address if no available link local address
wifi: use struct_group to copy addresses
wifi: mac80211_hwsim: check length for virtio packets
...
Linus Torvalds [Wed, 31 Aug 2022 16:46:12 +0000 (09:46 -0700)]
fs: only do a memory barrier for the first set_buffer_uptodate()
Commit d4252071b97d ("add barriers to buffer_uptodate and
set_buffer_uptodate") added proper memory barriers to the buffer head
BH_Uptodate bit, so that anybody who tests a buffer for being up-to-date
will be guaranteed to actually see initialized state.
However, that commit didn't _just_ add the memory barrier, it also ended
up dropping the "was it already set" logic that the BUFFER_FNS() macro
had.
That's conceptually the right thing for a generic "this is a memory
barrier" operation, but in the case of the buffer contents, we really
only care about the memory barrier for the _first_ time we set the bit,
in that the only memory ordering protection we need is to avoid anybody
seeing uninitialized memory contents.
Any other access ordering wouldn't be about the BH_Uptodate bit anyway,
and would require some other proper lock (typically BH_Lock or the folio
lock). A reader that races with somebody invalidating the buffer head
isn't an issue wrt the memory ordering, it's a serialization issue.
Now, you'd think that the buffer head operations don't matter in this
day and age (and I certainly thought so), but apparently some loads
still end up being heavy users of buffer heads. In particular, the
kernel test robot reported that not having this bit access optimization
in place caused a noticeable direct IO performance regression on ext4:
sch_sfb: Also store skb len before calling child enqueue
Cong Wang noticed that the previous fix for sch_sfb accessing the queued
skb after enqueueing it to a child qdisc was incomplete: the SFB enqueue
function was also calling qdisc_qstats_backlog_inc() after enqueue, which
reads the pkt len from the skb cb field. Fix this by also storing the skb
len, and using the stored value to increment the backlog after enqueueing.
Fixes: 9efd23297cca ("sch_sfb: Don't assume the skb is still around after enqueueing to child") Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk> Acked-by: Cong Wang <cong.wang@bytedance.com> Link: https://lore.kernel.org/r/20220905192137.965549-1-toke@toke.dk Signed-off-by: Paolo Abeni <pabeni@redhat.com>
net: phy: lan87xx: change interrupt src of link_up to comm_ready
Currently phy link up/down interrupt is enabled using the
LAN87xx_INTERRUPT_MASK register. In the lan87xx_read_status function,
phy link is determined using the T1_MODE_STAT_REG register comm_ready bit.
comm_ready bit is set using the loc_rcvr_status & rem_rcvr_status.
Whenever the phy link is up, LAN87xx_INTERRUPT_SOURCE link_up bit is set
first but comm_ready bit takes some time to set based on local and
remote receiver status.
As per the current implementation, interrupt is triggered using link_up
but the comm_ready bit is still cleared in the read_status function. So,
link is always down. Initially tested with the shared interrupt
mechanism with switch and internal phy which is working, but after
implementing interrupt controller it is not working.
It can fixed either by updating the read_status function to read from
LAN87XX_INTERRUPT_SOURCE register or enable the interrupt mask for
comm_ready bit. But the validation team recommends the use of comm_ready
for link detection.
This patch fixes by enabling the comm_ready bit for link_up in the
LAN87XX_INTERRUPT_MASK_2 register (MISC Bank) and link_down in
LAN87xx_INTERRUPT_MASK register.
Hyunwoo Kim [Wed, 7 Sep 2022 16:07:14 +0000 (09:07 -0700)]
efi: capsule-loader: Fix use-after-free in efi_capsule_write
A race condition may occur if the user calls close() on another thread
during a write() operation on the device node of the efi capsule.
This is a race condition that occurs between the efi_capsule_write() and
efi_capsule_flush() functions of efi_capsule_fops, which ultimately
results in UAF.
So, the page freeing process is modified to be done in
efi_capsule_release() instead of efi_capsule_flush().
Yacan Liu [Tue, 6 Sep 2022 13:01:39 +0000 (21:01 +0800)]
net/smc: Fix possible access to freed memory in link clear
After modifying the QP to the Error state, all RX WR would be completed
with WC in IB_WC_WR_FLUSH_ERR status. Current implementation does not
wait for it is done, but destroy the QP and free the link group directly.
So there is a risk that accessing the freed memory in tasklet context.
Fixes: bd4ad57718cc ("smc: initialize IB transport incl. PD, MR, QP, CQ, event, WR") Signed-off-by: Yacan Liu <liuyacan@corp.netease.com> Reviewed-by: Tony Lu <tonylu@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: ethernet: mtk_eth_soc: check max allowed hash in mtk_ppe_check_skb
Even if max hash configured in hw in mtk_ppe_hash_entry is
MTK_PPE_ENTRIES - 1, check theoretical OOB accesses in
mtk_ppe_check_skb routine
Fixes: c4f033d9e03e9 ("net: ethernet: mtk_eth_soc: rework hardware flow table management") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
net: skb: export skb drop reaons to user by TRACE_DEFINE_ENUM
As Eric reported, the 'reason' field is not presented when trace the
kfree_skb event by perf:
$ perf record -e skb:kfree_skb -a sleep 10
$ perf script
ip_defrag 14605 [021] 221.614303: skb:kfree_skb:
skbaddr=0xffff9d2851242700 protocol=34525 location=0xffffffffa39346b1
reason:
The cause seems to be passing kernel address directly to TP_printk(),
which is not right. As the enum 'skb_drop_reason' is not exported to
user space through TRACE_DEFINE_ENUM(), perf can't get the drop reason
string from the 'reason' field, which is a number.
Therefore, we introduce the macro DEFINE_DROP_REASON(), which is used
to define the trace enum by TRACE_DEFINE_ENUM(). With the help of
DEFINE_DROP_REASON(), now we can remove the auto-generate that we
introduced in the commit ec43908dd556
("net: skb: use auto-generation to convert skb drop reason to string"),
and define the string array 'drop_reasons'.
Hmmmm...now we come back to the situation that have to maintain drop
reasons in both enum skb_drop_reason and DEFINE_DROP_REASON. But they
are both in dropreason.h, which makes it easier.
After this commit, now the format of kfree_skb is like this:
net: ethernet: mtk_eth_soc: fix typo in __mtk_foe_entry_clear
Set ib1 state to MTK_FOE_STATE_UNBIND in __mtk_foe_entry_clear routine.
Fixes: 33fc42de33278 ("net: ethernet: mtk_eth_soc: support creating mac address based offload entries") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 7 Sep 2022 12:44:04 +0000 (13:44 +0100)]
Merge branch 'dsa-felix-fixes'
Vladimir Oltean says:
====================
Fixes for Felix DSA driver calculation of tc-taprio guard bands
This series fixes some bugs which are not quite new, but date from v5.13
when static guard bands were enabled by Michael Walle to prevent
tc-taprio overruns.
The investigation started when Xiaoliang asked privately what is the
expected max SDU for a traffic class when its minimum gate interval is
10 us. The answer, as it turns out, is not an L1 size of 1250 octets,
but 1245 octets, since otherwise, the switch will not consider frames
for egress scheduling, because the static guard band is exactly as large
as the time interval. The switch needs a minimum of 33 ns outside of the
guard band to consider a frame for scheduling, and the reduction of the
max SDU by 5 provides exactly for that.
The fix for that (patch 1/3) is relatively small, but during testing, it
became apparent that cut-through forwarding prevents oversized frame
dropping from working properly. This is solved through the larger patch
2/3. Finally, patch 3/3 fixes one more tc-taprio locking problem found
through code inspection.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Mon, 5 Sep 2022 17:01:25 +0000 (20:01 +0300)]
net: dsa: felix: access QSYS_TAG_CONFIG under tas_lock in vsc9959_sched_speed_set
The read-modify-write of QSYS_TAG_CONFIG from vsc9959_sched_speed_set()
runs unlocked with respect to the other functions that access it, which
are vsc9959_tas_guard_bands_update(), vsc9959_qos_port_tas_set() and
vsc9959_tas_clock_adjust(). All the others are under ocelot->tas_lock,
so move the vsc9959_sched_speed_set() access under that lock as well, to
resolve the concurrency.
Fixes: 55a515b1f5a9 ("net: dsa: felix: drop oversized frames with tc-taprio instead of hanging the port") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Mon, 5 Sep 2022 17:01:24 +0000 (20:01 +0300)]
net: dsa: felix: disable cut-through forwarding for frames oversized for tc-taprio
Experimentally, it looks like when QSYS_QMAXSDU_CFG_7 is set to 605,
frames even way larger than 601 octets are transmitted even though these
should be considered as oversized, according to the documentation, and
dropped.
Since oversized frame dropping depends on frame size, which is only
known at the EOF stage, and therefore not at SOF when cut-through
forwarding begins, it means that the switch cannot take QSYS_QMAXSDU_CFG_*
into consideration for traffic classes that are cut-through.
Since cut-through forwarding has no UAPI to control it, and the driver
enables it based on the mantra "if we can, then why not", the strategy
is to alter vsc9959_cut_through_fwd() to take into consideration which
tc's have oversize frame dropping enabled, and disable cut-through for
them. Then, from vsc9959_tas_guard_bands_update(), we re-trigger the
cut-through determination process.
There are 2 strategies for vsc9959_cut_through_fwd() to determine
whether a tc has oversized dropping enabled or not. One is to keep a bit
mask of traffic classes per port, and the other is to read back from the
hardware registers (a non-zero value of QSYS_QMAXSDU_CFG_* means the
feature is enabled). We choose reading back from registers, because
struct ocelot_port is shared with drivers (ocelot, seville) that don't
support either cut-through nor tc-taprio, and we don't have a felix
specific extension of struct ocelot_port. Furthermore, reading registers
from the Felix hardware is quite cheap, since they are memory-mapped.
Fixes: 55a515b1f5a9 ("net: dsa: felix: drop oversized frames with tc-taprio instead of hanging the port") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
because the gate entry for TC 7 (S 0x80 10000 ns) now has a static guard
band added earlier than its 'gate close' event, such that packet
overruns won't occur in the worst case of the largest packet possible.
Since guard bands are statically determined based on the per-tc
QSYS_QMAXSDU_CFG_* with a fallback on the port-based QSYS_PORT_MAX_SDU,
we need to discuss what happens with TC 7 depending on kernel version,
since the driver, prior to commit 55a515b1f5a9 ("net: dsa: felix: drop
oversized frames with tc-taprio instead of hanging the port"), did not
touch QSYS_QMAXSDU_CFG_*, and therefore relied on QSYS_PORT_MAX_SDU.
1 (before vsc9959_tas_guard_bands_update): QSYS_PORT_MAX_SDU defaults to
1518, and at gigabit this introduces a static guard band (independent
of packet sizes) of 12144 ns, plus QSYS::HSCH_MISC_CFG.FRM_ADJ (bit
time of 20 octets => 160 ns). But this is larger than the time window
itself, of 10000 ns. So, the queue system never considers a frame with
TC 7 as eligible for transmission, since the gate practically never
opens, and these frames are forever stuck in the TX queues and hang
the port.
2 (after vsc9959_tas_guard_bands_update): Under the sole goal of
enabling oversized frame dropping, we make an effort to set
QSYS_QMAXSDU_CFG_7 to 1230 bytes. But QSYS_QMAXSDU_CFG_7 plays
one more role, which we did not take into account: per-tc static guard
band, expressed in L2 byte time (auto-adjusted for FCS and L1 overhead).
There is a discrepancy between what the driver thinks (that there is
no guard band, and 100% of min_gate_len[tc] is available for egress
scheduling) and what the hardware actually does (crops the equivalent
of QSYS_QMAXSDU_CFG_7 ns out of min_gate_len[tc]). In practice, this
means that the hardware thinks it has exactly 0 ns for scheduling tc 7.
In both cases, even minimum sized Ethernet frames are stuck on egress
rather than being considered for scheduling on TC 7, even if they would
fit given a proper configuration. Considering the current situation,
with vsc9959_tas_guard_bands_update(), frames between 60 octets and 1230
octets in size are not eligible for oversized dropping (because they are
smaller than QSYS_QMAXSDU_CFG_7), but won't be considered as eligible
for scheduling either, because the min_gate_len[7] (10000 ns) minus the
guard band determined by QSYS_QMAXSDU_CFG_7 (1230 octets * 8 ns per
octet == 9840 ns) minus the guard band auto-added for L1 overhead by
QSYS::HSCH_MISC_CFG.FRM_ADJ (20 octets * 8 ns per octet == 160 octets)
leaves 0 ns for scheduling in the queue system proper.
Investigating the hardware behavior, it becomes apparent that the queue
system needs precisely 33 ns of 'gate open' time in order to consider a
frame as eligible for scheduling to a tc. So the solution to this
problem is to amend vsc9959_tas_guard_bands_update(), by giving the
per-tc guard bands less space by exactly 33 ns, just enough for one
frame to be scheduled in that interval. This allows the queue system to
make forward progress for that port-tc, and prevents it from hanging.
Fixes: 297c4de6f780 ("net: dsa: felix: re-enable TAS guard band mode") Reported-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David Howells [Tue, 6 Sep 2022 21:09:11 +0000 (22:09 +0100)]
afs: Return -EAGAIN, not -EREMOTEIO, when a file already locked
When trying to get a file lock on an AFS file, the server may return
UAEAGAIN to indicate that the lock is already held. This is currently
translated by the default path to -EREMOTEIO.
Translate it instead to -EAGAIN so that we know we can retry it.
Merge tag 'erofs-for-6.0-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs
Pull erofs fixes from Gao Xiang:
- Fix return codes in erofs_fscache_{meta_,}read_folio error paths
- Fix potential wrong pcluster sizes for later non-4K lclusters
- Fix in-memory pcluster use-after-free on UP platforms
* tag 'erofs-for-6.0-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
erofs: fix pcluster use-after-free on UP platforms
erofs: avoid the potentially wrong m_plen for big pcluster
erofs: fix error return code in erofs_fscache_{meta_,}read_folio
net: dsa: qca8k: fix NULL pointer dereference for of_device_get_match_data
of_device_get_match_data is called on priv->dev before priv->dev is
actually set. Move of_device_get_match_data after priv->dev is correctly
set to fix this kernel panic.
Fixes: 3bb0844e7bcd ("net: dsa: qca8k: cache match data to speed up access") Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Link: https://lore.kernel.org/r/20220904215319.13070-1-ansuelsmth@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
tcp: fix early ETIMEDOUT after spurious non-SACK RTO
Fix a bug reported and analyzed by Nagaraj Arankal, where the handling
of a spurious non-SACK RTO could cause a connection to fail to clear
retrans_stamp, causing a later RTO to very prematurely time out the
connection with ETIMEDOUT.
Here is the buggy scenario, expanding upon Nagaraj Arankal's excellent
report:
(*1) Send one data packet on a non-SACK connection
(*2) Because no ACK packet is received, the packet is retransmitted
and we enter CA_Loss; but this retransmission is spurious.
(*3) The ACK for the original data is received. The transmitted packet
is acknowledged. The TCP timestamp is before the retrans_stamp,
so tcp_may_undo() returns true, and tcp_try_undo_loss() returns
true without changing state to Open (because tcp_is_sack() is
false), and tcp_process_loss() returns without calling
tcp_try_undo_recovery(). Normally after undoing a CA_Loss
episode, tcp_fastretrans_alert() would see that the connection
has returned to CA_Open and fall through and call
tcp_try_to_open(), which would set retrans_stamp to 0. However,
for non-SACK connections we hold the connection in CA_Loss, so do
not fall through to call tcp_try_to_open() and do not set
retrans_stamp to 0. So retrans_stamp is (erroneously) still
non-zero.
At this point the first "retransmission event" has passed and
been recovered from. Any future retransmission is a completely
new "event". However, retrans_stamp is erroneously still
set. (And we are still in CA_Loss, which is correct.)
(*4) After 16 minutes (to correspond with tcp_retries2=15), a new data
packet is sent. Note: No data is transmitted between (*3) and
(*4) and we disabled keep alives.
The socket's timeout SHOULD be calculated from this point in
time, but instead it's calculated from the prior "event" 16
minutes ago (step (*2)).
(*5) Because no ACK packet is received, the packet is retransmitted.
(*6) At the time of the 2nd retransmission, the socket returns
ETIMEDOUT, prematurely, because retrans_stamp is (erroneously)
too far in the past (set at the time of (*2)).
This commit fixes this bug by ensuring that we reuse in
tcp_try_undo_loss() the same careful logic for non-SACK connections
that we have in tcp_try_undo_recovery(). To avoid duplicating logic,
we factor out that logic into a new
tcp_is_non_sack_preventing_reopen() helper and call that helper from
both undo functions.
Merge tag 'soc-fixes-6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC fixes from Arnd Bergmann:
"These are the expected fixes for the SoC tree. I have let the patches
pile up a little too long, so this is bigger than I would have liked.
- Minor build fixes for Broadcom STB and NXP i.MX8M SoCs as well\ as
TEE firmware
- Updates to the MAINTAINERS file for the PolarFire SoC
- Minor DT fixes for Renesas White Hawk and Arm Versatile and Juno
platforms
- A fix for a missing dependnecy in the NXP DPIO driver
- Broadcom BCA fixes to the newly added devicetree files
- Multiple fixes for Microchip AT91 based SoCs, dealing with
self-refresh timings and regulator settings in DT
- Several DT fixes for NXP i.MX platforms, dealing with incorrect
GPIO settings, extraneous nodes, and a wrong clock setting"
* tag 'soc-fixes-6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (45 commits)
soc: fsl: select FSL_GUTS driver for DPIO
ARM: dts: at91: sama5d2_icp: don't keep vdd_other enabled all the time
ARM: dts: at91: sama5d27_wlsom1: don't keep ldo2 enabled all the time
ARM: dts: at91: sama7g5ek: specify proper regulator output ranges
ARM: dts: at91: sama5d2_icp: specify proper regulator output ranges
ARM: dts: at91: sama5d27_wlsom1: specify proper regulator output ranges
ARM: at91: pm: fix DDR recalibration when resuming from backup and self-refresh
ARM: at91: pm: fix self-refresh for sama7g5
soc: brcmstb: pm-arm: Fix refcount leak and __iomem leak bugs
ARM: configs: at91: remove CONFIG_MICROCHIP_PIT64B
ARM: ixp4xx: fix typos in comments
arm64: dts: renesas: r8a779g0: Fix HSCIF0 interrupt number
tee: fix compiler warning in tee_shm_register()
arm64: dts: freescale: verdin-imx8mp: fix atmel_mxt_ts reset polarity
arm64: dts: freescale: verdin-imx8mm: fix atmel_mxt_ts reset polarity
arm64: dts: imx8mp: Fix I2C5 GPIO assignment on i.MX8M Plus DHCOM
arm64: dts: imx8mm-venice-gw7901: fix port/phy validation
arm64: dts: verdin-imx8mm: add otg2 pd to usbphy
soc: imx: gpcv2: Assert reset before ungating clock
arm64: dts: ls1028a-qds-65bb: don't use in-band autoneg for 2500base-x
...
erofs: fix pcluster use-after-free on UP platforms
During stress testing with CONFIG_SMP disabled, KASAN reports as below:
==================================================================
BUG: KASAN: use-after-free in __mutex_lock+0xe5/0xc30
Read of size 8 at addr ffff8881094223f8 by task stress/7789
CPU: 0 PID: 7789 Comm: stress Not tainted 6.0.0-rc1-00002-g0d53d2e882f9 #3
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
Call Trace:
<TASK>
..
__mutex_lock+0xe5/0xc30
..
z_erofs_do_read_page+0x8ce/0x1560
..
z_erofs_readahead+0x31c/0x580
..
Freed by task 7787
kasan_save_stack+0x1e/0x40
kasan_set_track+0x20/0x30
kasan_set_free_info+0x20/0x40
__kasan_slab_free+0x10c/0x190
kmem_cache_free+0xed/0x380
rcu_core+0x3d5/0xc90
__do_softirq+0x12d/0x389
Last potentially related work creation:
kasan_save_stack+0x1e/0x40
__kasan_record_aux_stack+0x97/0xb0
call_rcu+0x3d/0x3f0
erofs_shrink_workstation+0x11f/0x210
erofs_shrink_scan+0xdc/0x170
shrink_slab.constprop.0+0x296/0x530
drop_slab+0x1c/0x70
drop_caches_sysctl_handler+0x70/0x80
proc_sys_call_handler+0x20a/0x2f0
vfs_write+0x555/0x6c0
ksys_write+0xbe/0x160
do_syscall_64+0x3b/0x90
The root cause is that erofs_workgroup_unfreeze() doesn't reset to
orig_val thus it causes a race that the pcluster reuses unexpectedly
before freeing.
Since UP platforms are quite rare now, such path becomes unnecessary.
Let's drop such specific-designed path directly instead.
Yue Hu [Fri, 12 Aug 2022 06:01:50 +0000 (14:01 +0800)]
erofs: avoid the potentially wrong m_plen for big pcluster
Actually, 'compressedlcs' stores compressed block count rather than
lcluster count. Therefore, the number of bits for shifting the count
should be 'LOG_BLOCK_SIZE' rather than 'lclusterbits' although current
lcluster size is 4K.
The value of 'm_plen' will be wrong once we enable the non 4K-sized
lcluster.
There is no point to call pcim_iounmap_regions() in the remove function,
this frees a managed resource that would be release by the framework
anyway.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: mvpp2: debugfs: fix memory leak when using debugfs_lookup()
When calling debugfs_lookup() the result must have dput() called on it,
otherwise the memory will leak over time. Fix this up to be much
simpler logic and only create the root debugfs directory once when the
driver is first accessed. That resolves the memory leak and makes
things more obvious as to what the intent is.
Cc: Marcin Wojtas <mw@semihalf.com> Cc: Russell King <linux@armlinux.org.uk> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: netdev@vger.kernel.org Cc: stable <stable@kernel.org> Fixes: 21da57a23125 ("net: mvpp2: add a debugfs interface for the Header Parser") Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
David Lebrun [Fri, 2 Sep 2022 09:45:06 +0000 (10:45 +0100)]
ipv6: sr: fix out-of-bounds read when setting HMAC data.
The SRv6 layer allows defining HMAC data that can later be used to sign IPv6
Segment Routing Headers. This configuration is realised via netlink through
four attributes: SEG6_ATTR_HMACKEYID, SEG6_ATTR_SECRET, SEG6_ATTR_SECRETLEN and
SEG6_ATTR_ALGID. Because the SECRETLEN attribute is decoupled from the actual
length of the SECRET attribute, it is possible to provide invalid combinations
(e.g., secret = "", secretlen = 64). This case is not checked in the code and
with an appropriately crafted netlink message, an out-of-bounds read of up
to 64 bytes (max secret length) can occur past the skb end pointer and into
skb_shared_info:
Breakpoint 1, seg6_genl_sethmac (skb=<optimized out>, info=<optimized out>) at net/ipv6/seg6.c:208
208 memcpy(hinfo->secret, secret, slen);
(gdb) bt
#0 seg6_genl_sethmac (skb=<optimized out>, info=<optimized out>) at net/ipv6/seg6.c:208
#1 0xffffffff81e012e9 in genl_family_rcv_msg_doit (skb=skb@entry=0xffff88800b1f9f00, nlh=nlh@entry=0xffff88800b1b7600,
extack=extack@entry=0xffffc90000ba7af0, ops=ops@entry=0xffffc90000ba7a80, hdrlen=4, net=0xffffffff84237580 <init_net>, family=<optimized out>,
family=<optimized out>) at net/netlink/genetlink.c:731
#2 0xffffffff81e01435 in genl_family_rcv_msg (extack=0xffffc90000ba7af0, nlh=0xffff88800b1b7600, skb=0xffff88800b1f9f00,
family=0xffffffff82fef6c0 <seg6_genl_family>) at net/netlink/genetlink.c:775
#3 genl_rcv_msg (skb=0xffff88800b1f9f00, nlh=0xffff88800b1b7600, extack=0xffffc90000ba7af0) at net/netlink/genetlink.c:792
#4 0xffffffff81dfffc3 in netlink_rcv_skb (skb=skb@entry=0xffff88800b1f9f00, cb=cb@entry=0xffffffff81e01350 <genl_rcv_msg>)
at net/netlink/af_netlink.c:2501
#5 0xffffffff81e00919 in genl_rcv (skb=0xffff88800b1f9f00) at net/netlink/genetlink.c:803
#6 0xffffffff81dff6ae in netlink_unicast_kernel (ssk=0xffff888010eec800, skb=0xffff88800b1f9f00, sk=0xffff888004aed000)
at net/netlink/af_netlink.c:1319
#7 netlink_unicast (ssk=ssk@entry=0xffff888010eec800, skb=skb@entry=0xffff88800b1f9f00, portid=portid@entry=0, nonblock=<optimized out>)
at net/netlink/af_netlink.c:1345
#8 0xffffffff81dff9a4 in netlink_sendmsg (sock=<optimized out>, msg=0xffffc90000ba7e48, len=<optimized out>) at net/netlink/af_netlink.c:1921
...
(gdb) p/x ((struct sk_buff *)0xffff88800b1f9f00)->head + ((struct sk_buff *)0xffff88800b1f9f00)->end
$1 = 0xffff88800b1b76c0
(gdb) p/x secret
$2 = 0xffff88800b1b76c0
(gdb) p slen
$3 = 64 '@'
The OOB data can then be read back from userspace by dumping HMAC state. This
commit fixes this by ensuring SECRETLEN cannot exceed the actual length of
SECRET.
Reported-by: Lucas Leong <wmliang.tw@gmail.com>
Tested: verified that EINVAL is correctly returned when secretlen > len(secret) Fixes: 4f4853dc1c9c1 ("ipv6: sr: implement API to control SR HMAC structure") Signed-off-by: David Lebrun <dlebrun@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Hangbin Liu [Tue, 30 Aug 2022 09:37:22 +0000 (17:37 +0800)]
bonding: accept unsolicited NA message
The unsolicited NA message with all-nodes multicast dest address should
be valid, as this also means the link could reach the target.
Also rename bond_validate_ns() to bond_validate_na().
Reported-by: LiLiang <liali@redhat.com> Fixes: 5e1eeef69c0f ("bonding: NS target should accept link local address") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Hangbin Liu [Tue, 30 Aug 2022 09:37:21 +0000 (17:37 +0800)]
bonding: add all node mcast address when slave up
When a link is enslave to bond, it need to set the interface down first.
This makes the slave remove mac multicast address 33:33:00:00:00:01(The
IPv6 multicast address ff02::1 is kept even when the interface down). When
bond set the slave up, ipv6_mc_up() was not called due to commit c2edacf80e15
("bonding / ipv6: no addrconf for slaves separately from master").
This is not an issue before we adding the lladdr target feature for bonding,
as the mac multicast address will be added back when bond interface up and
join group ff02::1.
But after adding lladdr target feature for bonding. When user set a lladdr
target, the unsolicited NA message with all-nodes multicast dest will be
dropped as the slave interface never add 33:33:00:00:00:01 back.
Fix this by calling ipv6_mc_up() to add 33:33:00:00:00:01 back when
the slave interface up.
Reported-by: LiLiang <liali@redhat.com> Fixes: 5e1eeef69c0f ("bonding: NS target should accept link local address") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Hangbin Liu [Tue, 30 Aug 2022 09:37:20 +0000 (17:37 +0800)]
bonding: use unspecified address if no available link local address
When ns_ip6_target was set, the ipv6_dev_get_saddr() will be called to get
available source address and send IPv6 neighbor solicit message.
If the target is global address, ipv6_dev_get_saddr() will get any
available src address. But if the target is link local address,
ipv6_dev_get_saddr() will only get available address from our interface,
i.e. the corresponding bond interface.
But before bond interface up, all the address is tentative, while
ipv6_dev_get_saddr() will ignore tentative address. This makes we can't
find available link local src address, then bond_ns_send() will not be
called and no NS message was sent. Finally bond interface will keep in
down state.
Fix this by sending NS with unspecified address if there is no available
source address.
Reported-by: LiLiang <liali@redhat.com> Fixes: 5e1eeef69c0f ("bonding: NS target should accept link local address") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Merge tag 'powerpc-6.0-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
- Fix handling of PCI domains in /proc on 32-bit systems using the
recently added support for numbering buses from zero for each domain.
- A fix and a revert for some changes to use READ/WRITE_ONCE() which
caused problems with KASAN enabled due to sanitisation calls being
introduced in low-level paths that can't cope with it.
- Fix build errors on 32-bit caused by the syscall table being
misaligned sometimes.
- Two fixes to get IBM Cell native machines booting again, which had
bit-rotted while my QS22 was temporarily out of action.
- Fix the papr_scm driver to not assume the order of events returned by
the hypervisor is stable, and a related compile fix.
Thanks to Aneesh Kumar K.V, Christophe Leroy, Jordan Niethe, Kajol Jain,
Masahiro Yamada, Nathan Chancellor, Pali Rohár, Vaibhav Jain, and Zhouyi
Zhou.
* tag 'powerpc-6.0-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/papr_scm: Ensure rc is always initialized in papr_scm_pmu_register()
Revert "powerpc/irq: Don't open code irq_soft_mask helpers"
powerpc: Fix hard_irq_disable() with sanitizer
powerpc/rtas: Fix RTAS MSR[HV] handling for Cell
Revert "powerpc: Remove unused FW_FEATURE_NATIVE references"
powerpc: align syscall table for ppc32
powerpc/pci: Enable PCI domains in /proc when PCI bus numbers are not unique
powerpc/papr_scm: Fix nvdimm event mappings
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"s390:
- PCI interpretation compile fixes
RISC-V:
- fix unused variable warnings in vcpu_timer.c
- move extern sbi_ext declarations to a header
x86:
- check validity of argument to KVM_SET_MP_STATE
- use guest's global_ctrl to completely disable guest PEBS
- fix a memory leak on memory allocation failure
- mask off unsupported and unknown bits of IA32_ARCH_CAPABILITIES
- fix build failure with Clang integrated assembler
- fix MSR interception
- always flush TLBs when enabling dirty logging"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: check validity of argument to KVM_SET_MP_STATE
perf/x86/core: Completely disable guest PEBS via guest's global_ctrl
KVM: x86: fix memoryleak in kvm_arch_vcpu_create()
KVM: x86: Mask off unsupported and unknown bits of IA32_ARCH_CAPABILITIES
KVM: s390: pci: Hook to access KVM lowlevel from VFIO
riscv: kvm: move extern sbi_ext declarations to a header
riscv: kvm: vcpu_timer: fix unused variable warnings
KVM: selftests: Fix ambiguous mov in KVM_ASM_SAFE()
KVM: selftests: Fix KVM_EXCEPTION_MAGIC build with Clang
KVM: VMX: Heed the 'msr' argument in msr_write_intercepted()
kvm: x86: mmu: Always flush TLBs when enabling dirty logging
kvm: x86: mmu: Drop the need_remote_flush() function
Makefile.extrawarn: re-enable -Wformat for clang; take 2
-Wformat was recently re-enabled for builds with clang, then quickly
re-disabled, due to concerns stemming from the frequency of default
argument promotion related warning instances.
commit 258fafcd0683 ("Makefile.extrawarn: re-enable -Wformat for clang")
commit 21f9c8a13bb2 ("Revert "Makefile.extrawarn: re-enable -Wformat for clang"")
ISO WG14 has ratified N2562 to address default argument promotion
explicitly for printf, as part of the upcoming ISO C2X standard.
The behavior of clang was changed in clang-16 to not warn for the cited
cases in all language modes.
Add a version check, so that users of clang-16 now get the full effect
of -Wformat. For older clang versions, re-enable flags under the
-Wformat group that way users still get some useful checks related to
format strings, without noisy default argument promotion warnings. I
intentionally omitted -Wformat-y2k and -Wformat-security from being
re-enabled, which are also part of -Wformat in clang-16.
David S. Miller [Sun, 4 Sep 2022 10:23:11 +0000 (11:23 +0100)]
Merge tag 'wireless-2022-09-03' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless
Johannes berg says:
====================
We have a handful of fixes:
- fix DMA from stack in wilc1000 driver
- fix crash on chip reset failure in mt7921e
- fix for the reported warning on aggregation timer expiry
- check packet lengths in hwsim virtio paths
- fix compiler warnings/errors with AAD construction by
using struct_group
- fix Intel 4965 driver rate scale operation
- release channel contexts correctly in mac80211 mlme code
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Merge tag 'gpio-fixes-for-v6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:
"A a set of fixes from the GPIO subsystem.
Most are small driver fixes except the realtek-otto driver patch which
is pretty big but addresses a significant flaw that can cause the CPU
to stay infinitely busy on uncleared ISR on some platforms.
Summary:
- MAINTAINERS update
- fix resource leaks in gpio-mockup and gpio-pxa
- add missing locking in gpio-pca953x
- use 32-bit I/O in gpio-realtek-otto
- make irq_chip structures immutable in four more drivers"
* tag 'gpio-fixes-for-v6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
gpio: ws16c48: Make irq_chip immutable
gpio: 104-idio-16: Make irq_chip immutable
gpio: 104-idi-48: Make irq_chip immutable
gpio: 104-dio-48e: Make irq_chip immutable
gpio: realtek-otto: switch to 32-bit I/O
gpio: pca953x: Add mutex_lock for regcache sync in PM
gpio: mockup: remove gpio debugfs when remove device
gpio: pxa: use devres for the clock struct
MAINTAINERS: rectify entry for XILINX GPIO DRIVER
Merge tag 'for-linus-6.0-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
- a minor fix for the Xen grant driver
- a small series fixing a recently introduced problem in the Xen
blkfront/blkback drivers with negotiation of feature usage
* tag 'for-linus-6.0-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/grants: prevent integer overflow in gnttab_dma_alloc_pages()
xen-blkfront: Cache feature_persistent value before advertisement
xen-blkfront: Advertise feature-persistent as user requested
xen-blkback: Advertise feature-persistent as user requested
Merge tag 'loongarch-fixes-6.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch fixes from Huacai Chen:
"Fix several build errors or warnings, cleanup some code, and adjust
arch_do_signal_or_restart() to adapt generic entry"
* tag 'loongarch-fixes-6.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
LoongArch: mm: Remove the unneeded result variable
LoongArch: Fix arch_remove_memory() undefined build error
LoongArch: Fix section mismatch due to acpi_os_ioremap()
LoongArch: Improve dump_tlb() output messages
LoongArch: Adjust arch_do_signal_or_restart() to adapt generic entry
LoongArch: Avoid orphan input sections
Merge tag 'input-for-v6.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input fixes from Dmitry Torokhov:
- GT1158 ID added to Goodix touchscreen driver
- Boeder Force Feedback Wheel USB added to iforce joystick driver
- fixup for iforce driver to avoid hangups
- fix autoloading of rk805-pwrkey driver.
* tag 'input-for-v6.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: iforce - add support for Boeder Force Feedback Wheel
Input: iforce - wake up after clearing IFORCE_XMIT_RUNNING flag
Input: goodix - add compatible string for GT1158
MAINTAINERS: add include/dt-bindings/input to INPUT DRIVERS
Input: rk805-pwrkey - fix module autoloading
Input: goodix - add support for GT1158
dt-bindings: input: touchscreen: add compatible string for Goodix GT1158
Merge tag 'tty-6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial driver fixes from Greg KH:
"Here are some small tty/serial/vt driver fixes for 6.0-rc4 that
resolve a number of reported issues:
- n_gsm fixups for previous changes that caused problems
- much-reported serdev crash fix that showed up in 6.0-rc1
- vt font selection bugfix
- kerneldoc build warning fixes
- other tiny serial core fixes
All of these have been in linux-next for a while with no reported
problems"
* tag 'tty-6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
tty: n_gsm: avoid call of sleeping functions from atomic context
tty: n_gsm: replace kicktimer with delayed_work
tty: n_gsm: initialize more members at gsm_alloc_mux()
tty: n_gsm: add sanity check for gsm->receive in gsm_receive_buf()
tty: serial: atmel: Preserve previous USART mode if RS485 disabled
tty: serial: lpuart: disable flow control while waiting for the transmit engine to complete
tty: Fix lookahead_buf crash with serdev
serial: fsl_lpuart: RS485 RTS polariy is inverse
vt: Clear selection before changing the font
serial: document start_rx member at struct uart_ops
Merge tag 'staging-6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging driver fixes from Greg KH:
"Here are three small staging driver fixes for 6.0-rc4 that resolve
some reported problems and add some a device id:
- new device id for r8188eu driver
- use-after-free bugfixes for the rtl8712 driver
- fix up firmware dependency problem for the r8188eu driver
All of these have been in linux-next for a while with no reported
problems"
* tag 'staging-6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
staging: rtl8712: fix use after free bugs
staging: r8188eu: Add Rosewill USB-N150 Nano to device tables
staging: r8188eu: add firmware dependency
Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk fixes from Stephen Boyd:
"Here's a collection of primarily clk driver fixes, with a couple fixes
to the core framework.
We had to revert out a commit that affected boot on some devices that
have the CLK_OPS_PARENT_ENABLE flag set. It isn't critical to have
that fix so we'll try again next time.
Driver side fixes include:
- Plug an OF-node refcount bug in the TI clk driver
- Fix the error handling in the raspberry pi firmware get_rate so
that errors don't look like valid frequencies
- Avoid going out of bounds in the raspberry pi driver too if the
video firmware returns something we're not expecting"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
Revert "clk: core: Honor CLK_OPS_PARENT_ENABLE for clk gate ops"
clk: bcm: rpi: Show clock id limit in error case
clk: bcm: rpi: Add missing newline
clk: bcm: rpi: Prevent out-of-bounds access
clk: bcm: rpi: Fix error handling of raspberrypi_fw_get_rate
clk: core: Fix runtime PM sequence in clk_core_unprepare()
clk: core: Honor CLK_OPS_PARENT_ENABLE for clk gate ops
clk: ti: Fix missing of_node_get() ti_find_clock_provider()
Merge tag 'hwmon-for-v6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull hwmon fixes from Guenter Roeck:
- Fix out of bounds access in gpio-fan driver
- Fix VOUT margin caching in PMBus core
- Avoid error message after -EPROBE_DEFER from devm_regulator_register()
* tag 'hwmon-for-v6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (gpio-fan) Fix array out of bounds access
hwmon: (pmbus) Fix vout margin caching
hwmon: (pmbus) Use dev_err_probe() to filter -EPROBE_DEFER error messages
Steven Price [Fri, 2 Sep 2022 11:26:12 +0000 (12:26 +0100)]
mm: pagewalk: Fix race between unmap and page walker
The mmap lock protects the page walker from changes to the page tables
during the walk. However a read lock is insufficient to protect those
areas which don't have a VMA as munmap() detaches the VMAs before
downgrading to a read lock and actually tearing down PTEs/page tables.
For users of walk_page_range() the solution is to simply call pte_hole()
immediately without checking the actual page tables when a VMA is not
present. We now never call __walk_page_range() without a valid vma.
For walk_page_range_novma() the locking requirements are tightened to
require the mmap write lock to be taken, and then walking the pgd
directly with 'no_vma' set.
This in turn means that all page walkers either have a valid vma, or
it's that special 'novma' case for page table debugging. As a result,
all the odd '(!walk->vma && !walk->no_vma)' tests can be removed.
Fixes: dd2283f2605e ("mm: mmap: zap pages with read mmap_sem in munmap") Reported-by: Jann Horn <jannh@google.com> Signed-off-by: Steven Price <steven.price@arm.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Johannes Berg [Mon, 29 Aug 2022 09:46:38 +0000 (11:46 +0200)]
wifi: use struct_group to copy addresses
We sometimes copy all the addresses from the 802.11 header
for the AAD, which may cause complaints from fortify checks.
Use struct_group() to avoid the compiler warnings/errors.
Change-Id: Ic3ea389105e7813b22095b295079eecdabde5045 Signed-off-by: Johannes Berg <johannes.berg@intel.com>
wifi: mac80211_hwsim: check length for virtio packets
An invalid packet with a length shorter than the specified length in the
netlink header can lead to use-after-frees and slab-out-of-bounds in the
processing of the netlink attributes, such as the following:
BUG: KASAN: slab-out-of-bounds in __nla_validate_parse+0x1258/0x2010
Read of size 2 at addr ffff88800ac7952c by task kworker/0:1/12
Johannes Berg [Fri, 2 Sep 2022 14:11:14 +0000 (16:11 +0200)]
wifi: mac80211: fix locking in auth/assoc timeout
If we hit an authentication or association timeout, we only
release the chanctx for the deflink, and the other link(s)
are released later by ieee80211_vif_set_links(), but we're
not locking this correctly.
Fix the locking here while releasing the channels and links.
Change-Id: I9e08c1a5434592bdc75253c1abfa6c788f9f39b1 Fixes: 81151ce462e5 ("wifi: mac80211: support MLO authentication/association with one link") Signed-off-by: Johannes Berg <johannes.berg@intel.com>
wifi: mac80211: fix link warning in RX agg timer expiry
The rx data link pointer isn't set from the RX aggregation timer,
resulting in a later warning. Fix that by setting it to the first
valid link for now, with a FIXME to worry about statistics later,
it's not very important since it's just the timeout case.
Reported-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/498d714c-76be-9d04-26db-a1206878de5e@redhat.com Fixes: 56057da4569b ("wifi: mac80211: rx: track link in RX data") Signed-off-by: Mukesh Sisodiya <mukesh.sisodiya@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
LoongArch: Fix section mismatch due to acpi_os_ioremap()
Now acpi_os_ioremap() is marked with __init because it calls memblock_
is_memory() which is also marked with __init in the !ARCH_KEEP_MEMBLOCK
case. However, acpi_os_ioremap() is called by ordinary functions such
as acpi_os_{read, write}_memory() and causes section mismatch warnings:
Fix these warnings by selecting ARCH_KEEP_MEMBLOCK unconditionally and
removing the __init modifier of acpi_os_ioremap(). This can also give a
chance to track "memory" and "reserved" memblocks after early boot.
Huacai Chen [Wed, 31 Aug 2022 03:19:27 +0000 (11:19 +0800)]
LoongArch: Adjust arch_do_signal_or_restart() to adapt generic entry
Commit 8ba62d37949e248c69 ("task_work: Call tracehook_notify_signal from
get_signal on all architectures") adjust arch_do_signal_or_restart() for
all architectures. LoongArch hasn't been upstream yet at that time and
can be still built successfully without adjustment because this function
has a weak version with the correct prototype. It is obviously that we
should convert LoongArch to use new API, otherwise some signal handlings
will be lost.
Ard Biesheuvel [Wed, 24 Aug 2022 15:31:10 +0000 (17:31 +0200)]
LoongArch: Avoid orphan input sections
Ensure that all input sections are listed explicitly in the linker
script, and issue a warning otherwise. This ensures that the binary
image matches the PE/COFF and other image metadata exactly, which is
important for things like code signing.
David S. Miller [Sat, 3 Sep 2022 09:46:24 +0000 (10:46 +0100)]
Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2022-09-02 (i40e, iavf)
This series contains updates to i40e and iavf drivers.
Przemyslaw adds reset to ADQ configuration to allow for setting of rate
limit beyond TC0 for i40e.
Ivan Vecera does not free client on failure to open which could cause
NULL pointer dereference to occur on i40e. He also detaches device
during reset to prevent NDO calls with could cause races for iavf.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
net: dsa: microchip: fix kernel oops on ksz8 switches
After driver refactoring we was running ksz9477 specific CPU port
configuration on ksz8 family which ended with kernel oops. So, make sure
we run this code only on ksz9477 compatible devices.
Tested on KSZ8873 and KSZ9477.
Fixes: da8cd08520f3 ("net: dsa: microchip: add support for common phylink mac link up") Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Paul Durrant [Thu, 1 Sep 2022 11:55:54 +0000 (12:55 +0100)]
xen-netback: only remove 'hotplug-status' when the vif is actually destroyed
Removing 'hotplug-status' in backend_disconnected() means that it will be
removed even in the case that the frontend unilaterally disconnects (which
it is free to do at any time). The consequence of this is that, when the
frontend attempts to re-connect, the backend gets stuck in 'InitWait'
rather than moving straight to 'Connected' (which it can do because the
hotplug script has already run).
Instead, the 'hotplug-status' mode should be removed in netback_remove()
i.e. when the vif really is going away.
Fixes: 0f4558ae9187 ("Revert "xen-netback: remove 'hotplug-status' once it has served its purpose"") Signed-off-by: Paul Durrant <pdurrant@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Wei Fang [Tue, 30 Aug 2022 07:01:48 +0000 (15:01 +0800)]
net: fec: add pm_qos support on imx6q platform
There is a very low probability that tx timeout will occur during
suspend and resume stress test on imx6q platform. So we add pm_qos
support to prevent system from entering low level idles which may
affect the transmission of tx.
1. Fix IP address check in irc DCC conntrack helper, this should check
the opposite direction rather than the destination address of the
packets' direction, from David Leadbeater.
2. bridge netfilter needs to drop dst references, from Harsh Modi.
This was fine back in the day the code was originally written,
but nowadays various tunnels can pre-set metadata dsts on packets.
3. Remove nf_conntrack_helper sysctl and the modparam toggle, users
need to explicitily assign the helpers to use via nftables or
iptables. Conntrack helpers, by design, may be used to add dynamic
port redirections to internal machines, so its necessary to restrict
which hosts/peers are allowed to use them.
It was discovered that improper checking in the irc DCC helper makes
it possible to trigger the 'please do dynamic port forward'
from outside by embedding a 'DCC' in a PING request; if the client
echos that back a expectation/port forward gets added.
The auto-assign-for-everything mechanism has been in "please don't do this"
territory since 2012. From Pablo.
4. Fix a memory leak in the netdev hook error unwind path, also from Pablo.
* git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
netfilter: nf_conntrack_irc: Fix forged IP logic
netfilter: nf_tables: clean up hook list when offload flags check fails
netfilter: br_netfilter: Drop dst references before setting.
netfilter: remove nf_conntrack_helper sysctl and modparam toggles
====================
Merge tag 'block-6.0-2022-09-02' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
- NVMe pull request via Christoph:
- error handling fix for the new auth code (Hannes Reinecke)
- fix unhandled tcp states in nvmet_tcp_state_change (Maurizio
Lombardi)
- add NVME_QUIRK_BOGUS_NID for Lexar NM610 (Shyamin Ayesh)
- Add documentation for the ublk driver merged in this merge window
(Ming)
* tag 'block-6.0-2022-09-02' of git://git.kernel.dk/linux-block:
Documentation: document ublk
nvmet-tcp: fix unhandled tcp states in nvmet_tcp_state_change()
nvmet-auth: add missing goto in nvmet_setup_auth()
nvme-pci: add NVME_QUIRK_BOGUS_NID for Lexar NM610
Merge tag 'io_uring-6.0-2022-09-02' of git://git.kernel.dk/linux-block
Pull io_uring fixes from Jens Axboe:
- A single fix for over-eager retries for networking (Pavel)
- Revert the notification slot support for zerocopy sends.
It turns out that even after more than a year or development and
testing, there's not full agreement on whether just using plain
ordered notifications is Good Enough to avoid the complexity of using
the notifications slots. Because of that, we decided that it's best
left to a future final decision.
We can always bring back this feature, but we can't really change it
or remove it once we've released 6.0 with it enabled. The reverts
leave the usual CQE notifications as the primary interface for
knowing when data was sent, and when it was acked. (Pavel)
* tag 'io_uring-6.0-2022-09-02' of git://git.kernel.dk/linux-block:
selftests/net: return back io_uring zc send tests
io_uring/net: simplify zerocopy send user API
io_uring/notif: remove notif registration
Revert "io_uring: rename IORING_OP_FILES_UPDATE"
Revert "io_uring: add zc notification flush requests"
selftests/net: temporarily disable io_uring zc test
io_uring/net: fix overexcessive retries
Merge tag '6.0-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes from Steve French:
"Five fixes, all also marked for stable:
- fixes for collapse range and insert range (also fixes xfstest
generic/031)
- memory leak fix"
* tag '6.0-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: fix small mempool leak in SMB2_negotiate()
smb3: use filemap_write_and_wait_range instead of filemap_write_and_wait
smb3: fix temporary data corruption in insert range
smb3: fix temporary data corruption in collapse range
smb3: Move the flush out of smb2_copychunk_range() into its callers
Merge tag 'landlock-6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux
Pull landlock fix from Mickaël Salaün:
"This fixes a mis-handling of the LANDLOCK_ACCESS_FS_REFER right when
multiple rulesets/domains are stacked.
The expected behaviour was that an additional ruleset can only
restrict the set of permitted operations, but in this particular case,
it was potentially possible to re-gain the LANDLOCK_ACCESS_FS_REFER
right"
* tag 'landlock-6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux:
landlock: Fix file reparenting without explicit LANDLOCK_ACCESS_FS_REFER
Merge tag 'drm-fixes-2022-09-02' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"Regular fixes pull. One core dma-buf fix, then two weeks of i915
fixes, a lot of amdgpu fixes mostly for new IP, and a bunch of msm
fixes, mostly modesetting ones.
Nothing seems too bad at this point.
dma-buf/dma-resv:
- Fence-handling fix
i915:
- GVT fixes including fix for a CommetLake regression in mmio table
and misc doc and typo fixes
- Fix CCS handling
- Fix for guc requests after reset
- Display DSI related fixes
- Display backlight related fixes
- Fix for a null pointer dereference
- HDMI related quirk for ECS Liva Q2 with GLK graphics
- Skip wm/ddb readout for disabled pipes
amdgpu:
- FRU error message fix
- MES 11 updates
- DCN 3.2.x fixes
- DCN 3.1.4 fixes
- Fix possible use after free in CS IOCTL
- SMU 13.0.x fixes
- Fix iolink reporting on devices with direct connections to CPU
- GFX10 tap delay firmware fixes
msm:
- Fix for inconsistent indenting in msm_dsi_dphy_timing_calc_v3().
- Fix to make eDP the first connector in the connected list.
- Fix to populate intf_cfg correctly before calling reset_intf_cfg().
- Specify the correct number of DSI regulators for SDM660.
- Specify the correct number of DSI regulators for MSM8996.
- Fix for removing DP_RECOVERED_CLOCK_OUT_EN bit for tps4 link training
- Fix probe-deferral crash in gpu devfreq
- Fix gpu debugfs deadlock"
* tag 'drm-fixes-2022-09-02' of git://anongit.freedesktop.org/drm/drm: (51 commits)
drm/amd/amdgpu: skip ucode loading if ucode_size == 0
drm/amdgpu: only init tap_delay ucode when it's included in ucode binary
drm/amd/display: Fix black flash when switching from ODM2to1 to ODMBypass
drm/amd/display: Fix check for stream and plane
drm/amd/display: Re-initialize viewport after pipe merge
drm/amd/display: Use correct plane for CAB cursor size allocation
drm/amdgpu: ensure no PCIe peer access for CPU XGMI iolinks
drm/amd/pm: bump SMU 13.0.0 driver_if header version
drm/amd/pm: use vbios carried pptable for all SMU13.0.7 SKUs
drm/amd/pm: use vbios carried pptable for those supported SKUs
drm/amd/display: fix wrong register access
drm/amd/display: use actual cursor size instead of max for CAB allocation
drm/amd/display: disable display fresh from MALL on an edge case for DCN321
drm/amd/display: Fix CAB cursor size allocation for DCN32/321
drm/amd/display: Missing HPO instance added
drm/amd/display: set dig fifo read start level to 7 before dig fifo reset
drm/amdgpu: Fix use-after-free in amdgpu_cs_ioctl
drm/amd/display: Fix OTG H timing reset for dcn314
drm/amd/display: Fix DCN32 DPSTREAMCLK_CNTL programming
drm/amdgpu: Update mes_v11_api_def.h
...
hci_read_buffer_size_sync shall not use HCI_OP_LE_READ_BUFFER_SIZE_V2
sinze that is LE specific, instead it is hci_le_read_buffer_size_sync
version that shall use it.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216382 Fixes: 26afbd826ee3 ("Bluetooth: Add initial implementation of CIS connections") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Ivan Vecera [Tue, 30 Aug 2022 08:16:27 +0000 (10:16 +0200)]
iavf: Detach device during reset task
iavf_reset_task() takes crit_lock at the beginning and holds
it during whole call. The function subsequently calls
iavf_init_interrupt_scheme() that grabs RTNL. Problem occurs
when userspace initiates during the reset task any ndo callback
that runs under RTNL like iavf_open() because some of that
functions tries to take crit_lock. This leads to classic A-B B-A
deadlock scenario.
To resolve this situation the device should be detached in
iavf_reset_task() prior taking crit_lock to avoid subsequent
ndos running under RTNL and reattach the device at the end.
Fixes: 62fe2a865e6d ("i40evf: add missing rtnl_lock() around i40evf_set_interrupt_capability") Cc: Jacob Keller <jacob.e.keller@intel.com> Cc: Patryk Piotrowski <patryk.piotrowski@intel.com> Cc: SlawomirX Laba <slawomirx.laba@intel.com> Tested-by: Vitaly Grinberg <vgrinber@redhat.com> Signed-off-by: Ivan Vecera <ivecera@redhat.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Ivan Vecera [Tue, 16 Aug 2022 16:22:30 +0000 (18:22 +0200)]
i40e: Fix kernel crash during module removal
The driver incorrectly frees client instance and subsequent
i40e module removal leads to kernel crash.
Reproducer:
1. Do ethtool offline test followed immediately by another one
host# ethtool -t eth0 offline; ethtool -t eth0 offline
2. Remove recursively irdma module that also removes i40e module
host# modprobe -r irdma
Two offline tests cause IRDMA driver failure (ETIMEDOUT) and this
failure is indicated back to i40e_client_subtask() that calls
i40e_client_del_instance() to free client instance referenced
by pf->cinst and sets this pointer to NULL. During the module
removal i40e_remove() calls i40e_lan_del_device() that dereferences
pf->cinst that is NULL -> crash.
Do not remove client instance when client open callbacks fails and
just clear __I40E_CLIENT_INSTANCE_OPENED bit. The driver also needs
to take care about this situation (when netdev is up and client
is NOT opened) in i40e_notify_client_of_netdev_close() and
calls client close callback only when __I40E_CLIENT_INSTANCE_OPENED
is set.
Fixes: 0ef2d5afb12d ("i40e: KISS the client interface") Signed-off-by: Ivan Vecera <ivecera@redhat.com> Tested-by: Helena Anna Dubel <helena.anna.dubel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Fix HW rate limiting for ADQ.
Fallback to kernel queue selection for ADQ, as it is network stack
that decides which queue to use for transmit with ADQ configured.
Reset PF after creation of VMDq2 VSIs required for ADQ, as to
reprogram TX queue contexts in i40e_configure_tx_ring.
Without this patch PF would limit TX rate only according to TC0.
Fixes: a9ce82f744dc ("i40e: Enable 'channel' mode in mqprio for TC configs") Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com> Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com> Tested-by: Bharathi Sreenivas <bharathi.sreenivas@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Merge tag 'char-misc-6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg KH:
"Here are some small char/misc and other driver fixes for 6.0-rc4.
Included in here are:
- binder fixes for previous fixes, and a few more fixes uncovered by
them.
- iio driver fixes
- soundwire driver fixes
- fastrpc driver fixes for memory corruption on some hardware
- peci driver fix
- mhi driver fix
All of these have been in linux-next with no reported problems"
* tag 'char-misc-6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
binder: fix alloc->vma_vm_mm null-ptr dereference
misc: fastrpc: increase maximum session count
misc: fastrpc: fix memory corruption on open
misc: fastrpc: fix memory corruption on probe
soundwire: qcom: fix device status array range
bus: mhi: host: Fix up null pointer access in mhi_irq_handler
soundwire: qcom: remove duplicate reset control get
iio: light: cm32181: make cm32181_pm_ops static
iio: ad7292: Prevent regulator double disable
dt-bindings: iio: gyroscope: bosch,bmg160: correct number of pins
iio: adc: mcp3911: use correct formula for AD conversion
iio: adc: mcp3911: correct "microchip,device-addr" property
Revert "binder_alloc: Add missing mmap_lock calls when using the VMA"
binder_alloc: Add missing mmap_lock calls when using the VMA
binder: fix UAF of ref->proc caused by race condition
iio: light: cm3605: Fix an error handling path in cm3605_probe()
iio: adc: mcp3911: make use of the sign bit
peci: cpu: Fix use-after-free in adev_release()
peci: aspeed: fix error check return value of platform_get_irq()
Merge tag 'usb-6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB/Thunderbolt driver fixes from Greg KH:
"Here are a lot of small USB and Thunderbolt driver fixes for 6.0-rc4
for reported problems. Included in here are:
- new usb-serial driver ids
- dwc3 driver bugfixes for reported problems with 6.0-rc1
- new device quirks, and reverts of some quirks that were incorrect
- gadget driver bugfixes for reported problems
- USB host controller bugfixes (xhci and others)
- other small USB fixes, details in the shortlog
- small thunderbolt driver fixes
All of these have been in linux-next with no reported issues"
* tag 'usb-6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (51 commits)
Revert "usb: gadget: udc-xilinx: replace memcpy with memcpy_toio"
usb: storage: Add ASUS <0x0b05:0x1932> to IGNORE_UAS
USB: serial: ch341: fix disabled rx timer on older devices
USB: serial: ch341: fix lost character on LCR updates
USB: serial: cp210x: add Decagon UCA device id
Revert "usb: add quirks for Lenovo OneLink+ Dock"
usb: cdns3: fix issue with rearming ISO OUT endpoint
usb: cdns3: fix incorrect handling TRB_SMM flag for ISOC transfer
usb: gadget: mass_storage: Fix cdrom data transfers on MAC-OS
media: mceusb: Use new usb_control_msg_*() routines
USB: core: Prevent nested device-reset calls
USB: gadget: Fix obscure lockdep violation for udc_mutex
usb: dwc2: fix wrong order of phy_power_on and phy_init
usb: gadget: udc-xilinx: replace memcpy with memcpy_toio
usb: typec: Remove retimers properly
usb: dwc3: disable USB core PHY management
usb: add quirks for Lenovo OneLink+ Dock
USB: serial: option: add support for Cinterion MV32-WA/WB RmNet mode
USB: serial: ftdi_sio: add Omron CS1W-CIF31 device id
USB: serial: option: add Quectel EM060K modem
...
Merge tag 'platform-drivers-x86-v6.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform driver fixes from Hans de Goede:
"Various small fixes and hardware-id additions"
* tag 'platform-drivers-x86-v6.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
platform/x86: p2sb: Fix UAF when caller uses resource name
platform/x86: asus-wmi: Increase FAN_CURVE_BUF_LEN to 32
platform/mellanox: Remove redundant 'NULL' check
platform/mellanox: Remove unnecessary code
platform/mellanox: mlxreg-lc: Fix locking issue
platform/mellanox: mlxreg-lc: Fix coverity warning
platform/x86: acer-wmi: Acer Aspire One AOD270/Packard Bell Dot keymap fixes
platform/x86: thinkpad_acpi: Explicitly set to balanced mode on startup
platform/x86: asus-wmi: Fix the name of the mic-mute LED classdev
platform/surface: aggregator_registry: Add HID devices for sensors and UCSI client to SP8
platform/surface: aggregator_registry: Rename HID device nodes based on new findings
platform/surface: aggregator_registry: Rename HID device nodes based on their function
platform/surface: aggregator_registry: Add support for Surface Laptop Go 2
platform/x86: x86-android-tablets: Fix broken touchscreen on Chuwi Hi8 with Windows BIOS
platform/x86: pmc_atom: Fix SLP_TYPx bitfield mask
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"It's a lot smaller than last week, with the star of the show being a
couple of fixes to head.S addressing a boot regression introduced by
the recent overhaul of that code in non-default configurations (i.e.
KASLR disabled).
The first of those two resolves the issue reported (and bisected) by
Mikulus in the wait_on_bit() thread.
Summary:
- Fix two boot issues caused by the recent head.S rework when !KASLR
- Fix calculation of crashkernel memory reservation
- Fix bogus error check in PMU IRQ probing code"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: mm: Reserve enough pages for the initial ID map
perf/arm_pmu_platform: fix tests for platform_get_irq() failure
arm64: head: Ignore bogus KASLR displacement on non-relocatable kernels
arm64/kexec: Fix missing extra range for crashkres_low.
Fix leak, when user changes ring parameters.
During reallocation of RX buffers, new DMA mappings are created for
those buffers. New buffers with different RX ring count should
substitute older ones, but those buffers were freed in ice_vsi_cfg_rxq
and reallocated again with ice_alloc_rx_buf. kfree on rx_buf caused
leak of already mapped DMA.
Reallocate ZC with xdp_buf struct, when BPF program loads. Reallocate
back to rx_buf, when BPF program unloads.
If BPF program is loaded/unloaded and XSK pools are created, reallocate
RX queues accordingly in XDP_SETUP_XSK_POOL handler.
Steps for reproduction:
while :
do
for ((i=0; i<=8160; i=i+32))
do
ethtool -G enp130s0f0 rx $i tx $i
sleep 0.5
ethtool -g enp130s0f0
done
done
Fixes: 617f3e1b588c ("ice: xsk: allocate separate memory for XDP SW ring") Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Chandan <chandanx.rout@intel.com> (A Contingent Worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Mickaël Salaün [Wed, 31 Aug 2022 20:38:40 +0000 (22:38 +0200)]
landlock: Fix file reparenting without explicit LANDLOCK_ACCESS_FS_REFER
This change fixes a mis-handling of the LANDLOCK_ACCESS_FS_REFER right
when multiple rulesets/domains are stacked. The expected behaviour was
that an additional ruleset can only restrict the set of permitted
operations, but in this particular case, it was potentially possible to
re-gain the LANDLOCK_ACCESS_FS_REFER right.
With the introduction of LANDLOCK_ACCESS_FS_REFER, we added the first
globally denied-by-default access right. Indeed, this lifted an initial
Landlock limitation to rename and link files, which was initially always
denied when the source or the destination were different directories.
This led to an inconsistent backward compatibility behavior which was
only taken into account if no domain layer were using the new
LANDLOCK_ACCESS_FS_REFER right. However, when restricting a thread with
a new ruleset handling LANDLOCK_ACCESS_FS_REFER, all inherited parent
rulesets/layers not explicitly handling LANDLOCK_ACCESS_FS_REFER would
behave as if they were handling this access right and with all their
rules allowing it. This means that renaming and linking files could
became allowed by these parent layers, but all the other required
accesses must also be granted: all layers must allow file removal or
creation, and renaming and linking operations cannot lead to privilege
escalation according to the Landlock policy. See detailed explanation
in commit b91c3e4ea756 ("landlock: Add support for file reparenting with
LANDLOCK_ACCESS_FS_REFER").
To say it another way, this bug may lift the renaming and linking
limitations of the initial Landlock version, and a same ruleset can
enforce different restrictions depending on previous or next enforced
ruleset (i.e. inconsistent behavior). The LANDLOCK_ACCESS_FS_REFER right
cannot give access to data not already allowed, but this doesn't follow
the contract of the first Landlock ABI. This fix puts back the
limitation for sandboxes that didn't opt-in for this additional right.
For instance, if a first ruleset allows LANDLOCK_ACCESS_FS_MAKE_REG on
/dst and LANDLOCK_ACCESS_FS_REMOVE_FILE on /src, renaming /src/file to
/dst/file is denied. However, without this fix, stacking a new ruleset
which allows LANDLOCK_ACCESS_FS_REFER on / would now permit the
sandboxed thread to rename /src/file to /dst/file .
This change fixes the (absolute) rule access rights, which now always
forbid LANDLOCK_ACCESS_FS_REFER except when it is explicitly allowed
when creating a rule.
Making all domain handle LANDLOCK_ACCESS_FS_REFER was an initial
approach but there is two downsides:
* it makes the code more complex because we still want to check that a
rule allowing LANDLOCK_ACCESS_FS_REFER is legitimate according to the
ruleset's handled access rights (i.e. ABI v1 != ABI v2);
* it would not allow to identify if the user created a ruleset
explicitly handling LANDLOCK_ACCESS_FS_REFER or not, which will be an
issue to audit Landlock.
Instead, this change adds an ACCESS_INITIALLY_DENIED list of
denied-by-default rights, which (only) contains
LANDLOCK_ACCESS_FS_REFER. All domains are treated as if they are also
handling this list, but without modifying their fs_access_masks field.
A side effect is that the errno code returned by rename(2) or link(2)
*may* be changed from EXDEV to EACCES according to the enforced
restrictions. Indeed, we now have the mechanic to identify if an access
is denied because of a required right (e.g. LANDLOCK_ACCESS_FS_MAKE_REG,
LANDLOCK_ACCESS_FS_REMOVE_FILE) or if it is denied because of missing
LANDLOCK_ACCESS_FS_REFER rights. This may result in different errno
codes than for the initial Landlock version, but this approach is more
consistent and better for rename/link compatibility reasons, and it
wasn't possible before (hence no backport to ABI v1). The
layout1.rename_file test reflects this change.
Add 4 layout1.refer_denied_by_default* test suites to check that the
behavior of a ruleset not handling LANDLOCK_ACCESS_FS_REFER (ABI v1) is
unchanged even if another layer handles LANDLOCK_ACCESS_FS_REFER (i.e.
ABI v1 precedence). Make sure rule's absolute access rights are correct
by testing with and without a matching path. Add test_rename() and
test_exchange() helpers.
Extend layout1.inval tests to check that a denied-by-default access
right is not necessarily part of a domain's handled access rights.
Test coverage for security/landlock is 95.3% of 599 lines according to
gcc/gcov-11.
Fixes: b91c3e4ea756 ("landlock: Add support for file reparenting with LANDLOCK_ACCESS_FS_REFER") Reviewed-by: Paul Moore <paul@paul-moore.com> Reviewed-by: Günther Noack <gnoack3000@gmail.com> Link: https://lore.kernel.org/r/20220831203840.1370732-1-mic@digikod.net Cc: stable@vger.kernel.org
[mic: Constify and slightly simplify test helpers] Signed-off-by: Mickaël Salaün <mic@digikod.net>
David S. Miller [Fri, 2 Sep 2022 11:45:32 +0000 (12:45 +0100)]
Merge tag 'rxrpc-fixes-20220901' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
David Howells says:
====================
rxrpc fixes
Here are some fixes for AF_RXRPC:
(1) Fix the handling of ICMP/ICMP6 packets. This is a problem due to
rxrpc being switched to acting as a UDP tunnel, thereby allowing it to
steal the packets before they go through the UDP Rx queue. UDP
tunnels can't get ICMP/ICMP6 packets, however. This patch adds an
additional encap hook so that they can.
(2) Fix the encryption routines in rxkad to handle packets that have more
than three parts correctly. The problem is that ->nr_frags doesn't
count the initial fragment, so the sglist ends up too short.
(3) Fix a problem with destruction of the local endpoint potentially
getting repeated.
(4) Fix the calculation of the time at which to resend.
jiffies_to_usecs() gives microseconds, not nanoseconds.
(5) Fix AFS to work out when callback promises and locks expire based on
the time an op was issued rather than the time the first reply packet
arrives. We don't know how long the server took between calculating
the expiry interval and transmitting the reply.
(6) Given (5), rxrpc_get_reply_time() is no longer used, so remove it.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Wed, 31 Aug 2022 23:38:09 +0000 (23:38 +0000)]
tcp: TX zerocopy should not sense pfmemalloc status
We got a recent syzbot report [1] showing a possible misuse
of pfmemalloc page status in TCP zerocopy paths.
Indeed, for pages coming from user space or other layers,
using page_is_pfmemalloc() is moot, and possibly could give
false positives.
There has been attempts to make page_is_pfmemalloc() more robust,
but not using it in the first place in this context is probably better,
removing cpu cycles.
Note to stable teams :
You need to backport 84ce071e38a6 ("net: introduce
__skb_fill_page_desc_noacc") as a prereq.
Race is more probable after commit c07aea3ef4d4
("mm: add a signature in struct page") because page_is_pfmemalloc()
is now using low order bit from page->lru.next, which can change
more often than page->index.
Low order bit should never be set for lru.next (when used as an anchor
in LRU list), so KCSAN report is mostly a false positive.
Backporting to older kernel versions seems not necessary.
[1]
BUG: KCSAN: data-race in lru_add_fn / tcp_build_frag
value changed: 0x0000000000000000 -> 0xffffea0004a1d288
Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 18611 Comm: syz-executor.4 Not tainted 6.0.0-rc2-syzkaller-00248-ge022620b5d05-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/22/2022
Fixes: c07aea3ef4d4 ("mm: add a signature in struct page") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Shakeel Butt <shakeelb@google.com> Reviewed-by: Shakeel Butt <shakeelb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Wed, 31 Aug 2022 14:47:56 +0000 (17:47 +0300)]
tipc: fix shift wrapping bug in map_get()
There is a shift wrapping bug in this code so anything thing above
31 will return false.
Fixes: 35c55c9877f8 ("tipc: add neighbor monitoring framework") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
sch_sfb: Don't assume the skb is still around after enqueueing to child
The sch_sfb enqueue() routine assumes the skb is still alive after it has
been enqueued into a child qdisc, using the data in the skb cb field in the
increment_qlen() routine after enqueue. However, the skb may in fact have
been freed, causing a use-after-free in this case. In particular, this
happens if sch_cake is used as a child of sfb, and the GSO splitting mode
of CAKE is enabled (in which case the skb will be split into segments and
the original skb freed).
Fix this by copying the sfb cb data to the stack before enqueueing the skb,
and using this stack copy in increment_qlen() instead of the skb pointer
itself.
Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-18231 Fixes: e13e02a3c68d ("net_sched: SFB flow scheduler") Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
Merge tag 'renesas-fixes-for-v6.0-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-devel into arm/fixes
Renesas fixes for v6.0
- Fix the serial console on the Renesas White Hawk development board.
* tag 'renesas-fixes-for-v6.0-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-devel:
arm64: dts: renesas: r8a779g0: Fix HSCIF0 interrupt number
Merge tag 'at91-fixes-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/at91/linux into arm/fixes
AT91 fixes for 6.0
It contains:
- fixes for self-refresh on SAMA7G5 while in AT91 power management modes:
one disabling a DDR PHY controller DLL which has been proved to be buggy
and can introduce glitches that can cause unexpected behavior; one
fixing the DDR PHY recalibration which cannot work for all possible
cases (due to hardware bug) while using backup and self-refresh AT91
power management mode;
- one defconfig fix to remove CONFIG_MICROCHIP_PIT64B from all AT91
defconfigs;
- multiple device tree fixes for regulators to avoid having some of them
enabled all the time and to describe min and max output ranges
according to board capabilities.
* tag 'at91-fixes-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/at91/linux:
ARM: dts: at91: sama5d2_icp: don't keep vdd_other enabled all the time
ARM: dts: at91: sama5d27_wlsom1: don't keep ldo2 enabled all the time
ARM: dts: at91: sama7g5ek: specify proper regulator output ranges
ARM: dts: at91: sama5d2_icp: specify proper regulator output ranges
ARM: dts: at91: sama5d27_wlsom1: specify proper regulator output ranges
ARM: at91: pm: fix DDR recalibration when resuming from backup and self-refresh
ARM: at91: pm: fix self-refresh for sama7g5
ARM: configs: at91: remove CONFIG_MICROCHIP_PIT64B
The soc/fsl/dpio driver will perform a soc_device_match()
to determine the optimal cache settings for a given CPU core.
If FSL_GUTS is not enabled, this search will fail and
the driver will not configure cache stashing for the given
DPIO, and a string of "unknown SoC" messages will appear:
fsl_mc_dpio dpio.7: unknown SoC version
fsl_mc_dpio dpio.6: unknown SoC version
fsl_mc_dpio dpio.5: unknown SoC version
Merge tag 'arm-soc/for-6.0/devicetree' of https://github.com/Broadcom/stblinux into arm/fixes
This pull request contains Broadcom ARM-based SoCs Device Tree fixes for
6.0, please pull the following:
- William fixes a number of the recently submitted DTS files for 63178,
6846, 6878 to have correct PSCI node propertie as well as correct timer
CPU masks
* tag 'arm-soc/for-6.0/devicetree' of https://github.com/Broadcom/stblinux:
ARM: dts: bcmbca: bcm6878: cosmetic change
ARM: dts: bcmbca: bcm6878: fix timer node cpu mask flag
ARM: dts: bcmbca: bcm6846: fix interrupt controller node
ARM: dts: bcmbca: bcm6846: clean up psci node
ARM: dts: bcmbca: bcm6846: fix timer node cpu mask flag
ARM: dts: bcmbca: bcm63178: cosmetic change
ARM: dts: bcmbca: bcm63178: fix interrupt controller node
ARM: dts: bcmbca: bcm63178: clean up psci node
ARM: dts: bcmbca: bcm63178: fix timer node cpu mask flag
Dan Carpenter [Thu, 1 Sep 2022 15:35:20 +0000 (18:35 +0300)]
xen/grants: prevent integer overflow in gnttab_dma_alloc_pages()
The change from kcalloc() to kvmalloc() means that arg->nr_pages
might now be large enough that the "args->nr_pages << PAGE_SHIFT" can
result in an integer overflow.
Fixes: b3f7931f5c61 ("xen/gntdev: switch from kcalloc() to kvcalloc()") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/YxDROJqu/RPvR0bi@kili Signed-off-by: Juergen Gross <jgross@suse.com>