The max inline mtt count supported is ERDMA_MAX_INLINE_MTT_ENTRIES.
When mr->mem.mtt_nents == ERDMA_MAX_INLINE_MTT_ENTRIES, inline mtt
is also supported, fix it.
Fixes: 155055771704 ("RDMA/erdma: Add verbs implementation") Signed-off-by: Cheng Xu <chengyou@linux.alibaba.com> Link: https://lore.kernel.org/r/20230320084652.16807-4-chengyou@linux.alibaba.com Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Max EQ depth of hardware is 32K, the current default EQ depth is too small
for some applications, so change the default depth to 4096.
Max send WRs the hardware can support is 8K, but the driver limits the
value to 4K. Remove this limitation.
Fixes: be3cff0f242d ("RDMA/erdma: Add the hardware related definitions") Fixes: db23ae64caac ("RDMA/erdma: Add verbs header file") Signed-off-by: Cheng Xu <chengyou@linux.alibaba.com> Link: https://lore.kernel.org/r/20230320084652.16807-3-chengyou@linux.alibaba.com Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Currently, when driver queries PTYS to report which link speed is being
used on its RoCE ports, it does not check the case of having 400Gbps
transmitted over 8 lanes. Thus it fails to report the said speed and
instead it defaults to report 10G over 4 lanes.
Add a check for the said speed when querying PTYS and report it back
correctly when needed.
Fixes: 08e8676f1607 ("IB/mlx5: Add support for 50Gbps per lane link modes") Signed-off-by: Maher Sanalla <msanalla@nvidia.com> Reviewed-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/ec9040548d119d22557d6a4b4070d6f421701fd4.1678973994.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Add ipv4 check to irdma_find_listener(). Otherwise the function
incorrectly finds and returns a listener with a different addr family for
the zero IP addr, if a listener with a zero IP addr and the same port as
the one searched for has already been created.
Fixes: 146b9756f14c ("RDMA/irdma: Add connection manager") Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20230315145231.931-5-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
When running perftest with large number of connections in iWARP mode, the
passive side could be slow to respond. Increase the rexmit counter default
to allow scaling connections.
Fixes: 146b9756f14c ("RDMA/irdma: Add connection manager") Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20230315145231.931-4-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
On rmmod of irdma, the PBLE object memory is not being freed. PBLE object
memory are not statically pre-allocated at function initialization time
unlike other HMC objects. PBLEs objects and the Segment Descriptors (SD)
for it can be dynamically allocated during scale up and SD's remain
allocated till function deinitialization.
Fix this leak by adding IRDMA_HMC_IW_PBLE to the iw_hmc_obj_types[] table
and skip pbles in irdma_create_hmc_obj but not in irdma_del_hmc_objects().
Fixes: 44d9e52977a1 ("RDMA/irdma: Implement device initialization definitions") Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20230315145231.931-3-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Currently, artificial SW completions are generated for NOP wqes which can
generate unexpected completions with wr_id = 0. Skip the generation of
artificial completions for NOPs.
Fixes: 81091d7696ae ("RDMA/irdma: Add SW mechanism to generate completions on error") Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20230315145231.931-2-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
In sprd clock driver, regmap_config.max_register was set to a fixed value
which is likely larger than the address range configured in device tree,
when reading registers through debugfs it would cause access violation.
Fixes: d41f59fd92f2 ("clk: sprd: Add common infrastructure") Signed-off-by: Chunyan Zhang <chunyan.zhang@unisoc.com> Link: https://lore.kernel.org/r/20230316023624.758204-1-chunyan.zhang@unisoc.com Signed-off-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
So far io_req_complete_post() only covers DEFER_TASKRUN by completing
request via task work when the request is completed from IOWQ.
However, uring command could be completed from any context, and if io
uring is setup with DEFER_TASKRUN, the command is required to be
completed from current context, otherwise wait on IORING_ENTER_GETEVENTS
can't be wakeup, and may hang forever.
The issue can be observed on removing ublk device, but turns out it is
one generic issue for uring command & DEFER_TASKRUN, so solve it in
io_uring core code.
Fixes: e6aeb2721d3b ("io_uring: complete all requests in task context") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/linux-block/b3fc9991-4c53-9218-a8cc-5b4dd3952108@kernel.dk/ Reported-by: Jens Axboe <axboe@kernel.dk> Cc: Kanchan Joshi <joshi.k@samsung.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
On TGL+ the DSS control registers are at different offsets, and there's
one per pipe. Fix the offsets to fix dual link DSI for TGL+.
There would be helpers for this in the DSC code, but just do the quick
fix now for DSI. Long term, we should probably move all the DSS handling
into intel_vdsc.c, so exporting the helpers seems counter-productive.
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/8232 Cc: Ville Syrjala <ville.syrjala@linux.intel.com> Cc: stable@vger.kernel.org Signed-off-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230301151409.1581574-1-jani.nikula@intel.com
(cherry picked from commit 1a62dd9895dca78bee28bba3a36f08836fdd143d) Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
I got really badly confused in d443d9386472 ("fbcon: move more common
code into fb_open()") because we set the con2fb_map before the failure
points, which didn't look good.
But in trying to fix that I moved the assignment into the wrong path -
we need to do it for _all_ vc we take over, not just the first one
(which additionally requires the call to con2fb_acquire_newinfo).
I've figured this out because of a KASAN bug report, where the
fbcon_registered_fb and fbcon_display arrays went out of sync in
fbcon_mode_deleted() because the con2fb_map pointed at the old
fb_info, but the modes and everything was updated for the new one.
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Acked-by: Helge Deller <deller@gmx.de> Tested-by: Xingyuan Mo <hdthky0@gmail.com> Fixes: d443d9386472 ("fbcon: move more common code into fb_open()") Reported-by: Xingyuan Mo <hdthky0@gmail.com> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Xingyuan Mo <hdthky0@gmail.com> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Helge Deller <deller@gmx.de> Cc: <stable@vger.kernel.org> # v5.19+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
This is a regressoin introduced in b07db3958485 ("fbcon: Ditch error
handling for con2fb_release_oldinfo"). I failed to realize what the if
(!err) checks. The mentioned commit was dropping the
con2fb_release_oldinfo() return value but the if (!err) was also
checking whether the con2fb_acquire_newinfo() function call above
failed or not.
Fix this with an early return statement.
Note that there's still a difference compared to the orginal state of
the code, the below lines are now also skipped on error:
if (!search_fb_in_map(info_idx))
info_idx = newidx;
These are only needed when we've actually thrown out an old fb_info
from the console mappings, which only happens later on.
Also move the fbcon_add_cursor_work() call into the same if block,
it's all protected by console_lock so doesn't matter when we set up
the blinking cursor delayed work anyway. This further simplifies the
control flow and allows us to ditch the found local variable.
v2: Clarify commit message (Javier)
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Acked-by: Helge Deller <deller@gmx.de> Tested-by: Xingyuan Mo <hdthky0@gmail.com> Fixes: b07db3958485 ("fbcon: Ditch error handling for con2fb_release_oldinfo") Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Xingyuan Mo <hdthky0@gmail.com> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Helge Deller <deller@gmx.de> Cc: <stable@vger.kernel.org> # v5.19+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Currently, with VHE, KVM enables the EL0 event counting for the
guest on vcpu_load() or KVM enables it as a part of the PMU
register emulation process, when needed. However, in the migration
case (with VHE), the same handling is lacking, as vPMU register
values that were restored by userspace haven't been propagated yet
(the PMU events haven't been created) at the vcpu load-time on the
first KVM_RUN (kvm_vcpu_pmu_restore_guest() called from vcpu_load()
on the first KVM_RUN won't do anything as events_{guest,host} of
kvm_pmu_events are still zero).
So, with VHE, enable the guest's EL0 event counting on the first
KVM_RUN (after the migration) when needed. More specifically,
have kvm_pmu_handle_pmcr() call kvm_vcpu_pmu_restore_guest()
so that kvm_pmu_handle_pmcr() on the first KVM_RUN can take
care of it.
Fixes: d0c94c49792c ("KVM: arm64: Restore PMU configuration on first run") Cc: stable@vger.kernel.org Reviewed-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Reiji Watanabe <reijiw@google.com> Link: https://lore.kernel.org/r/20230329023944.2488484-1-reijiw@google.com Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
mtd_read() may return -EUCLEAN in case of corrected bit-flips.This
particular condition should not be treated like an error.
Signed-off-by: Bang Li <libang.linuxer@gmail.com> Fixes: e47f68587b82 ("mtd: check for max_bitflips in mtd_read_oob()") Cc: <stable@vger.kernel.org> # v3.7 Acked-by: Richard Weinberger <richard@nod.at> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20230328163012.4264-1-libang.linuxer@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
This is an oversight from dc5bdb68b5b3 ("drm/fb-helper: Fix vt
restore") - I failed to realize that nasty userspace could set this.
It's not pretty to mix up kernel-internal and userspace uapi flags
like this, but since the entire fb_var_screeninfo structure is uapi
we'd need to either add a new parameter to the ->fb_set_par callback
and fb_set_par() function, which has a _lot_ of users. Or some other
fairly ugly side-channel int fb_info. Neither is a pretty prospect.
Instead just correct the issue at hand by filtering out this
kernel-internal flag in the ioctl handling code.
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Fixes: dc5bdb68b5b3 ("drm/fb-helper: Fix vt restore") Cc: Alex Deucher <alexander.deucher@amd.com> Cc: shlomo@fastmail.com Cc: Michel Dänzer <michel@daenzer.net> Cc: Noralf Trønnes <noralf@tronnes.org> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Daniel Vetter <daniel.vetter@intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Maxime Ripard <mripard@kernel.org> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: dri-devel@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v5.7+ Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Nathan Chancellor <natechancellor@gmail.com> Cc: Qiujun Huang <hqjagain@gmail.com> Cc: Peter Rosin <peda@axentia.se> Cc: linux-fbdev@vger.kernel.org Cc: Helge Deller <deller@gmx.de> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Geert Uytterhoeven <geert+renesas@glider.be> Cc: Samuel Thibault <samuel.thibault@ens-lyon.org> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Shigeru Yoshida <syoshida@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230404193934.472457-1-daniel.vetter@ffwll.ch Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The BTRFS_FS_CSUM_IMPL_FAST flag is currently set whenever a non-generic
crc32c is detected, which is the incorrect check if the file system uses
a different checksumming algorithm. Refactor the code to only check
this if crc32c is actually used. Note that in an ideal world the
information if an algorithm is hardware accelerated or not should be
provided by the crypto API instead, but that's left for another day.
CC: stable@vger.kernel.org # 5.4.x: c8a5f8ca9a9c: btrfs: print checksum type and implementation at mount time CC: stable@vger.kernel.org # 5.4.x Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Commit d7b9416fe5c5 ("btrfs: remove btrfs_end_io_wq") converted the read
and I/O handling from btrfs_workqueues to Linux workqueues, and as part
of that lost the code to apply the thread_pool= based max_active limit
on remount. Restore it.
Fixes: d7b9416fe5c5 ("btrfs: remove btrfs_end_io_wq") CC: stable@vger.kernel.org # 6.0+ Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
==================================================================
BUG: KASAN: slab-use-after-free in hci_conn_del+0xba/0x3a0
Write of size 8 at addr ffff88800208e9c8 by task iso-tester/31
When it happens, hci_cs_setup_sync_conn won't be able to obtain the
reference to the SCO connection, so it will be stuck and potentially
hinder subsequent connections to the same device.
This patch prevents that by also deleting the SCO connection if it is
still not established when the corresponding ACL connection is deleted.
Signed-off-by: Archie Pusaka <apusaka@chromium.org> Reviewed-by: Ying Hsu <yinghsu@chromium.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
This patch fixes an incorrect loop exit condition in code that replaces
'/' symbols in the board name. There might also be a memory corruption
issue here, but it is unlikely to be a real problem.
Cc: <stable@vger.kernel.org> Signed-off-by: Sasha Finkelstein <fnkl.kernel@gmail.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
There is a potential race condition in hidp_session_thread that may
lead to use-after-free. For instance, the timer is active while
hidp_del_timer is called in hidp_session_thread(). After hidp_session_put,
then 'session' will be freed, causing kernel panic when hidp_idle_timeout
is running.
The solution is to use del_timer_sync instead of del_timer.
Cc: stable@vger.kernel.org Signed-off-by: Min Li <lm0963hack@gmail.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Similar to commit d0be8347c623 ("Bluetooth: L2CAP: Fix use-after-free
caused by l2cap_chan_put"), just use l2cap_chan_hold_unless_zero to
prevent referencing a channel that is about to be destroyed.
Cc: stable@kernel.org Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Min Li <lm0963hack@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Use of keep-alive (KAE) has resulted in loss of audio on some A750/770
cards as the transition from keep-alive to stream playback is not
working as expected. As there is limited benefit of the new KAE mode
on discrete cards, revert back to older silent-stream implementation
on these systems.
The BIOS botches this one completely - it says the 2nd S/PDIF output is
used, while in fact it's the 1st one. This is tested on DP45SG, but I'm
assuming it's valid for the other boards in the series as well.
Also add some comments regarding the pins.
FWIW, the codec is apparently still sold by Tempo Semiconductor, Inc.,
where one can download the documentation.
Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20230405201220.2197826-2-oswald.buddenhagen@gmx.de Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
It could have never worked, as snd_emu10k1_fx8010_playback_prepare() and
snd_emu10k1_fx8010_playback_hw_free() assume the emu10k1 offset for the
ETRAM, and the default DSP code includes no handler for it. It also
wouldn't make a lot of sense to make it work, as Audigy has an own, much
simpler, pass-through mechanism. So just skip creation of the device.
Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20230405201220.2197938-1-oswald.buddenhagen@gmx.de Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The direct return will cause the stream list of "&tscm->domain" unemptied
and the session in "tscm" unfinished if amdtp_domain_start() returns with
an error.
Fix this by changing the direct return to a goto which will empty the
stream list of "&tscm->domain" and finish the session in "tscm".
The snd_tscm_stream_start_duplex() function is called in the prepare
callback of PCM. According to "ALSA Kernel API Documentation", the prepare
callback of PCM will be called many times at each setup. So, if the
"&d->streams" list is not emptied, when the prepare callback is called
next time, snd_tscm_stream_start_duplex() will receive -EBUSY from
amdtp_domain_add_stream() that tries to add an existing stream to the
domain. The error handling code after the "error" label will be executed
in this case, and the "&d->streams" list will be emptied. So not emptying
the "&d->streams" list will not cause an issue. But it is more efficient
and readable to empty it on the first error by changing the direct return
to a goto statement.
The session in "tscm" has been begun before amdtp_domain_start(), so it
needs to be finished when amdtp_domain_start() fails.
Fixes: c281d46a51e3 ("ALSA: firewire-tascam: support AMDTP domain") Signed-off-by: Xu Biang <xubiang@hust.edu.cn> Reviewed-by: Dan Carpenter <error27@gmail.com> Acked-by: Takashi Sakamoto <o-takashi@sakamocchi.jp> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20230406132801.105108-1-xubiang@hust.edu.cn Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Add pins and verbs needed to enable speakers and jack.
The pins and verbs configurations were identified by snooping the
Windows driver commands, with a nice write-up here:
https://brakkee.org/site/2023/02/07/fixing-sound-on-the-asus-n7601zm/
Reported-by: Erik Brakkee <erik@brakkee.org> Link: https://github.com/thesofproject/linux/issues/4176 Tested-by: Erik Brakkee <erik@brakkee.org> Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Kai Vehmanen <kai.vehmanen@linux.intel.com> Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20230406152725.15191-1-pierre-louis.bossart@linux.intel.com Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Due to two copy/pastos, closing the MIC or EFX capture device would
make a running ADC capture hang due to unsetting its interrupt handler.
In principle, this would have also allowed dereferencing dangling
pointers, but we're actually rather thorough at disabling and flushing
the ints.
While it may sound like one, this actually wasn't a hypothetical bug:
PortAudio will open a capture stream at startup (and close it right
away) even if not asked to. If the first device is busy, it will just
proceed with the next one ... thus killing a concurrent capture.
Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20230405201220.2197923-1-oswald.buddenhagen@gmx.de Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Unlike SKL/GLK the ICL CSC unit suffers from a new issue where
CSC_MODE arming is sticky. That is, once armed it remains armed
causing the CSC coeff/offset registers to become effectively
self-arming.
CSC coeff/offset registers writes no longer disarm the CSC,
but fortunately register read still do. So we can use that
to disarm the CSC unit once the registers for the current
frame have been latched. This avoid s the self-arming behaviour
from persisting into the next frame's .color_commit_noarm()
call.
Cc: <stable@vger.kernel.org> #v5.19+ Cc: Manasi Navare <navaremanasi@google.com> Cc: Drew Davenport <ddavenport@chromium.org> Cc: Imre Deak <imre.deak@intel.com> Cc: Jouni Högander <jouni.hogander@intel.com> Fixes: d13dde449580 ("drm/i915: Split pipe+output CSC programming to noarm+arm pair") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230320095438.17328-5-ville.syrjala@linux.intel.com Reviewed-by: Imre Deak <imre.deak@intel.com>
(cherry picked from commit 92736f1b452bbb8a66bdb5b1d263ad00e04dd3b8) Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
[Why & How]
drm_dp_remove_payload() interface was changed. Correct amdgpu dm code
to pass the right parameter to the drm helper function.
Reviewed-by: Jerry Zuo <Jerry.Zuo@amd.com> Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com> Signed-off-by: Wayne Lin <Wayne.Lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry-picked from b8ca445f550a9a079134f836466ddda3bfad6108)
[Hand modified due to missing f0127cb11299df80df45583b216e13f27c408545 which
failed to apply due to missing 94dfeaa46925bb6b4d43645bbb6234e846dec257] Reported-and-tested-by: Veronika Schwan <veronika@pisquaredover6.de> Fixes: d7b5638bd337 ("drm/amd/display: Take FEC Overhead into Timeslot Calculation") Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
This patch introduces a regression on Lenovo Z13, which can't wake
from the lid with it applied; and some unspecified AMD based Dell
platforms are unable to wake from hitting the power button
Signed-off-by: Kornel Dulęba <korneld@chromium.org> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://lore.kernel.org/r/20230411134932.292287-1-korneld@chromium.org Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/1685291
This enables initrd-less boot on generic kernel, enabling any Ubuntu
installation to be lifted from any hardware and converted into a VM
which will, even with corrupted initrd and missing modules, be able to
boot by default. This is an extremly useful property to have on all
ubuntu kernels. Separately, this also removes the need for linux-kvm
kernel flavour, as all the changes it has are now incorporated into
generic flavour & cloud-images.
Other related bugs in related projects, that this change alone
improves, but doesn't resolve in full alone:
Reference: https://bugs.launchpad.net/bugs/1931841
Reference: https://bugs.launchpad.net/bugs/2006488
Signed-off-by: Dimitri John Ledkov <dimitri.ledkov@canonical.com> Acked-by: Tim Gardner <tim.gardner@canonical.com> Acked-by: Andrea Righi <andrea.righi@canonical.com> Acked-by: Andrei Gherzan <andrei.gherzan@canonical.com> Acked-by: Roxana Nicolescu <roxana.nicolescu@canonical.com> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
If the platform support the control of speaker-mute led, register it
Change-Id: I70ac3007e715a0410dd692f362e51b347ef4488e Signed-off-by: Koba Ko <koba.ko@canonical.com> Link: https://lore.kernel.org/r/20230308062414.1048913-1-koba.ko@canonical.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
(cherry picked from commit 3ed0b880b71489d59f89b0bc1323ad34a136cca0
linux-next) Signed-off-by: Koba Ko <koba.ko@canonical.com> Acked-by: Andrei Gherzan <andrei.gherzan@canonical.com> Acked-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
UBUNTU: [Debian] autoreconstruct - fix restoration of execute permissions
BugLink: http://bugs.launchpad.net/bugs/2015498
Debian source package diffs cannot represent that a file should be
executable.
gen-auto-reconstruct detects the situations where a file has changed
its permissions and restores the execute permissions in the
reconstruct script, assuming the file has always had execute permission.
This does not work in case a file removed the execute permission because
the script will change it back.
The script now adds either `chmod +x` or `chmod -x` based on the actual
permission change.
Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Roxana Nicolescu <roxana.nicolescu@canonical.com> Acked-by: Andrei Gherzan <andrei.gherzan@canonical.com> Acked-by: Tim Gardner <tim.gardner@canonical.com> Acked-by: Cory Todd <cory.todd@canonical.com> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Ville Syrjälä [Tue, 28 Mar 2023 08:32:46 +0000 (16:32 +0800)]
drm/i915: Improve PPS debugs
BugLink: https://launchpad.net/bugs/2001599
Always include both the encoder and PPS instance information
in the debug prints so that we know what piece of hardware
we're actually dealing with.
v2: Make sure pps is selected before debug prints/etc. in
intel_pps_vdd_on_unlocked() on vlv/chv
There is no pps on pipe C on chv
v3: Allow PPS=INVALID_PIPE for vlv/chv
Ville Syrjälä [Tue, 28 Mar 2023 08:32:43 +0000 (16:32 +0800)]
drm/i915: Reject unusable power sequencers
BugLink: https://launchpad.net/bugs/2001599
On ICP-ADP the pins used by the second PPS can be alternatively
muxed to some other function. In that case the second power
sequencer is unusable.
Unfortunately (on my ADL Thinkpad T14 gen3 at least) the
BIOS still likes to enable the VDD on the second PPS (due
to the VBT declaring the second bogus eDP panel) even when
not correctly muxed, so we need to deal with it somehow.
For now let's just initialize the PPS as normal, and then
use the normal eDP probe failure VDD off path to turn it off
(and release the wakeref the PPS init grabbed). The
alternative of just declaring that the platform has a single
PPS doesn't really work since it would cause the second eDP
probe to also try to use the first PPS and thus clobber the
state for the first (real) eDP panel.
Ville Syrjälä [Tue, 28 Mar 2023 08:32:42 +0000 (16:32 +0800)]
drm/i915: Extend dual PPS handlind for ICP+
BugLink: https://launchpad.net/bugs/2001599
On the PCH side the second PPS was introduced in ICP. Let's
make sure we examine both power sequencer on ICP+ as well.
Note that DG1/2 south block only has the single PPS, so need
to exclude the fake DG1/2 PCHs.
Ville Syrjälä [Tue, 28 Mar 2023 08:32:41 +0000 (16:32 +0800)]
drm/i915: Try to use the correct power sequencer intiially on bxt/glk
BugLink: https://launchpad.net/bugs/2001599
Currently on bxt/glk we just grab the power sequencer index from
the VBT data even though it may not have been parsed yet. That
could lead us to using the incorrect power sequencer during the
initial panel probe.
To avoid that let's try to read out the current state of the
power sequencer from the hardware. Unfortunately the power
sequencer no longer has anything in its registers to associate
it with the port, so the best we can do is just iterate through
the power sequencers and pick the first one. This should be
sufficient for single panel cases.
For the dual panel cases we probably need to go back to
parsing the VBT before the panel probe (and hope that
panel_type=0xff is never a thing in those cases). To that
end the code always prefers the VBT panel sequencer, if
available.
v2: Restructure a bit for upcoming icp+ dual PPS support
Ville Syrjälä [Tue, 28 Mar 2023 08:32:40 +0000 (16:32 +0800)]
drm/i915: Generalize the PPS vlv_pipe_check() stuff
BugLink: https://launchpad.net/bugs/2001599
Restate the vlv_pipe_check() stuff in terms of PPS index
(rather than pipe, which it is on VLV/CHV) so that we can
reuse this same mechanim on other platforms as well.
Isaac True [Mon, 27 Mar 2023 12:23:25 +0000 (14:23 +0200)]
UBUNTU: [Config] arm64: Enable PCI_MESON module
BugLink: https://bugs.launchpad.net/bugs/2007745
This driver is important for some Amlogic systems like the ODROID HC4, which
uses a PCIE-SATA bridge to provide a storage interface.
Signed-off-by: Isaac True <isaac.true@canonical.com> Acked-by: Tim Gardner <tim.gardner@canonical.com> Acked-by: Paolo Pisati <paolo.pisati@canonical.com> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Yury Norov [Fri, 17 Feb 2023 01:39:08 +0000 (17:39 -0800)]
sched/topology: fix KASAN warning in hop_cmp()
BugLink: https://bugs.launchpad.net/bugs/2008824
Despite that prev_hop is used conditionally on cur_hop
is not the first hop, it's initialized unconditionally.
Because initialization implies dereferencing, it might happen
that the code dereferences uninitialized memory, which has been
spotted by KASAN. Fix it by reorganizing hop_cmp() logic.
Reported-by: Bruno Goncalves <bgoncalv@redhat.com> Fixes: cd7f55359c90 ("sched: add sched_numa_find_nth_cpu()") Signed-off-by: Yury Norov <yury.norov@gmail.com> Link: https://lore.kernel.org/r/Y+7avK6V9SyAWsXi@yury-laptop/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
(cherry picked from commit 01bb11ad828b320749764fa93ad078db20d08a9e) Signed-off-by: dann frazier <dann.frazier@canonical.com> Acked-by: Tim Gardner <tim.gardner@canonical.com> Acked-by: Ian May <ian.may@canonical.com> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Yury Norov [Sat, 21 Jan 2023 04:24:36 +0000 (20:24 -0800)]
lib/cpumask: update comment for cpumask_local_spread()
BugLink: https://bugs.launchpad.net/bugs/2008824
Now that we have an iterator-based alternative for a very common case
of using cpumask_local_spread for all cpus in a row, it's worth to
mention that in comment to cpumask_local_spread().
Signed-off-by: Yury Norov <yury.norov@gmail.com> Reviewed-by: Valentin Schneider <vschneid@redhat.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
(cherry picked from commit 2ac4980c57f54db7c5b416f7946d2921fc16d9d2) Signed-off-by: dann frazier <dann.frazier@canonical.com> Acked-by: Tim Gardner <tim.gardner@canonical.com> Acked-by: Ian May <ian.may@canonical.com> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Tariq Toukan [Sat, 21 Jan 2023 04:24:35 +0000 (20:24 -0800)]
net/mlx5e: Improve remote NUMA preferences used for the IRQ affinity hints
BugLink: https://bugs.launchpad.net/bugs/2008824
In the IRQ affinity hints, replace the binary NUMA preference (local /
remote) with the improved for_each_numa_hop_cpu() API that minds the
actual distances, so that remote NUMAs with short distance are preferred
over farther ones.
This has significant performance implications when using NUMA-aware
allocated memory (follow [1] and derivatives for example).
[1]
drivers/net/ethernet/mellanox/mlx5/core/en_main.c :: mlx5e_open_channel()
int cpu = cpumask_first(mlx5_comp_irq_get_affinity_mask(priv->mdev, ix));
Performance tests:
TCP multi-stream, using 16 iperf3 instances pinned to 16 cores (with aRFS on).
Active cores: 64,65,72,73,80,81,88,89,96,97,104,105,112,113,120,121
+-------------------------+-----------+------------------+------------------+
| | BW (Gbps) | TX side CPU util | RX side CPU util |
+-------------------------+-----------+------------------+------------------+
| Baseline | 52.3 | 6.4 % | 17.9 % |
+-------------------------+-----------+------------------+------------------+
| Applied on TX side only | 52.6 | 5.2 % | 18.5 % |
+-------------------------+-----------+------------------+------------------+
| Applied on RX side only | 94.9 | 11.9 % | 27.2 % |
+-------------------------+-----------+------------------+------------------+
| Applied on both sides | 95.1 | 8.4 % | 27.3 % |
+-------------------------+-----------+------------------+------------------+
Bottleneck in RX side is released, reached linerate (~1.8x speedup).
~30% less cpu util on TX.
* CPU util on active cores only.
Setups details (similar for both sides):
NIC: ConnectX6-DX dual port, 100 Gbps each.
Single port used in the tests.
$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 256
On-line CPU(s) list: 0-255
Thread(s) per core: 2
Core(s) per socket: 64
Socket(s): 2
NUMA node(s): 16
Vendor ID: AuthenticAMD
CPU family: 25
Model: 1
Model name: AMD EPYC 7763 64-Core Processor
Stepping: 1
CPU MHz: 2594.804
BogoMIPS: 4890.73
Virtualization: AMD-V
L1d cache: 32K
L1i cache: 32K
L2 cache: 512K
L3 cache: 32768K
NUMA node0 CPU(s): 0-7,128-135
NUMA node1 CPU(s): 8-15,136-143
NUMA node2 CPU(s): 16-23,144-151
NUMA node3 CPU(s): 24-31,152-159
NUMA node4 CPU(s): 32-39,160-167
NUMA node5 CPU(s): 40-47,168-175
NUMA node6 CPU(s): 48-55,176-183
NUMA node7 CPU(s): 56-63,184-191
NUMA node8 CPU(s): 64-71,192-199
NUMA node9 CPU(s): 72-79,200-207
NUMA node10 CPU(s): 80-87,208-215
NUMA node11 CPU(s): 88-95,216-223
NUMA node12 CPU(s): 96-103,224-231
NUMA node13 CPU(s): 104-111,232-239
NUMA node14 CPU(s): 112-119,240-247
NUMA node15 CPU(s): 120-127,248-255
..
BugLink: https://bugs.launchpad.net/bugs/2008824
The recently introduced sched_numa_hop_mask() exposes cpumasks of CPUs
reachable within a given distance budget, wrap the logic for iterating over
all (distance, mask) values inside an iterator macro.
Signed-off-by: Valentin Schneider <vschneid@redhat.com> Reviewed-by: Yury Norov <yury.norov@gmail.com> Signed-off-by: Yury Norov <yury.norov@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
(cherry picked from commit 06ac01721f7d07da722abe0ec6f147b90bfc8c77) Signed-off-by: dann frazier <dann.frazier@canonical.com> Acked-by: Tim Gardner <tim.gardner@canonical.com> Acked-by: Ian May <ian.may@canonical.com> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/2008824
Tariq has pointed out that drivers allocating IRQ vectors would benefit
from having smarter NUMA-awareness - cpumask_local_spread() only knows
about the local node and everything outside is in the same bucket.
sched_domains_numa_masks is pretty much what we want to hand out (a cpumask
of CPUs reachable within a given distance budget), introduce
sched_numa_hop_mask() to export those cpumasks.
Link: http://lore.kernel.org/r/20220728191203.4055-1-tariqt@nvidia.com Signed-off-by: Valentin Schneider <vschneid@redhat.com> Reviewed-by: Yury Norov <yury.norov@gmail.com> Signed-off-by: Yury Norov <yury.norov@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
(cherry picked from commit 9feae65845f7b16376716fe70b7d4b9bf8721848) Signed-off-by: dann frazier <dann.frazier@canonical.com> Acked-by: Tim Gardner <tim.gardner@canonical.com> Acked-by: Ian May <ian.may@canonical.com> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/2008824
Now after moving all NUMA logic into sched_numa_find_nth_cpu(),
else-branch of cpumask_local_spread() is just a function call, and
we can simplify logic by using ternary operator.
While here, replace BUG() with WARN_ON().
Signed-off-by: Yury Norov <yury.norov@gmail.com> Acked-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Peter Lafreniere <peter@n8pjl.ca> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
(cherry picked from commit b1beed72b8b75d365fdbc925da856c212195051b) Signed-off-by: dann frazier <dann.frazier@canonical.com> Acked-by: Tim Gardner <tim.gardner@canonical.com> Acked-by: Ian May <ian.may@canonical.com> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Yury Norov [Sat, 21 Jan 2023 04:24:31 +0000 (20:24 -0800)]
cpumask: improve on cpumask_local_spread() locality
BugLink: https://bugs.launchpad.net/bugs/2008824
Switch cpumask_local_spread() to use newly added sched_numa_find_nth_cpu(),
which takes into account distances to each node in the system.
Signed-off-by: Yury Norov <yury.norov@gmail.com> Acked-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Peter Lafreniere <peter@n8pjl.ca> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
(cherry picked from commit 406d394abfcd8f16dc1dbcc8fc1b828252befb6d) Signed-off-by: dann frazier <dann.frazier@canonical.com> Acked-by: Tim Gardner <tim.gardner@canonical.com> Acked-by: Ian May <ian.may@canonical.com> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Leveraging the fact that each hop in sched_domains_numa_masks includes the
same or greater number of CPUs than the previous one, we can use binary
search on hops instead of linear walk, which makes the overall complexity
of O(log n) in terms of number of cpumask_weight() calls.
Signed-off-by: Yury Norov <yury.norov@gmail.com> Acked-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Peter Lafreniere <peter@n8pjl.ca> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
(cherry picked from commit cd7f55359c90a4108e6528e326b8623fce1ad72a) Signed-off-by: dann frazier <dann.frazier@canonical.com> Acked-by: Tim Gardner <tim.gardner@canonical.com> Acked-by: Ian May <ian.may@canonical.com> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Yury Norov [Sat, 21 Jan 2023 04:24:29 +0000 (20:24 -0800)]
cpumask: introduce cpumask_nth_and_andnot
BugLink: https://bugs.launchpad.net/bugs/2008824
Introduce cpumask_nth_and_andnot() based on find_nth_and_andnot_bit().
It's used in the following patch to traverse cpumasks without storing
intermediate result in temporary cpumask.
Signed-off-by: Yury Norov <yury.norov@gmail.com> Acked-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Peter Lafreniere <peter@n8pjl.ca> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
(cherry picked from commit 62f4386e564d31c7d0ed7d835843e2685f99ae71) Signed-off-by: dann frazier <dann.frazier@canonical.com> Acked-by: Tim Gardner <tim.gardner@canonical.com> Acked-by: Ian May <ian.may@canonical.com> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Yury Norov [Sat, 21 Jan 2023 04:24:28 +0000 (20:24 -0800)]
lib/find: introduce find_nth_and_andnot_bit
BugLink: https://bugs.launchpad.net/bugs/2008824
In the following patches the function is used to implement in-place bitmaps
traversing without storing intermediate result in temporary bitmaps.
Signed-off-by: Yury Norov <yury.norov@gmail.com> Acked-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Peter Lafreniere <peter@n8pjl.ca> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
(cherry picked from commit 43245117806ff8914e37327b610fc08b5ddedc91) Signed-off-by: dann frazier <dann.frazier@canonical.com> Acked-by: Tim Gardner <tim.gardner@canonical.com> Acked-by: Ian May <ian.may@canonical.com> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Ping-Ke Shih [Fri, 17 Mar 2023 07:09:51 +0000 (15:09 +0800)]
wifi: rtw89: release RX standby timer of beamformee CSI to save power
BugLink: https://bugs.launchpad.net/bugs/2012019
Originally, we keep RX standby timer to handle beamformee CSI, but this
spends power and causes system not entering power save mode. To improve
power consumption, release the timer if throughput becomes low.
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20230307141848.26403-1-pkshih@realtek.com
(cherry picked from commit 8a66293e73a520a42a7653d2ca32074ba323ff56
wireless-next) Signed-off-by: Koba Ko <koba.ko@canonical.com> Acked-by: Tim Gardner <tim.gardner@canonical.com> Acked-by: Paolo Pisati <paolo.pisati@canonical.com> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Signed-off-by: Chin-Yen Lee <timlee@realtek.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20230126112715.5811-1-pkshih@realtek.com
(cherry picked from commit 3712888e3dba5df2b4f3fb3ba87e20bac6afc7c0
wireless-next) Signed-off-by: Koba Ko <koba.ko@canonical.com> Acked-by: Tim Gardner <tim.gardner@canonical.com> Acked-by: Paolo Pisati <paolo.pisati@canonical.com> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The maple tree tracks the stack and is able to update the pivot
(lower/upper boundary) in-place to allow the page fault handler to write
to the tree while holding just the mmap read lock. This is safe as the
writes to the stack have a guard VMA which ensures there will always be
a NULL in the direction of the growth and thus will only update a pivot.
It is possible, but not recommended, to have VMAs that grow up/down
without guard VMAs. syzbot has constructed a testcase which sets up a
VMA to grow and consume the empty space. Overwriting the entire NULL
entry causes the tree to be altered in a way that is not safe for
concurrent readers; the readers may see a node being rewritten or one
that does not match the maple state they are using.
Enabling RCU mode allows the concurrent readers to see a stable node and
will return the expected result.
Link: https://lkml.kernel.org/r/20230227173632.3292573-9-surenb@google.com Cc: stable@vger.kernel.org Fixes: d4af56c5c7c6 ("mm: start tracking VMAs with maple tree") Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reported-by: syzbot+8d95422d3537159ca390@syzkaller.appspotmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Dereferencing RCU objects within the RCU callback without the RCU check
has caused lockdep to complain. Fix the RCU dereferencing by using the
RCU callback lock to ensure the operation is safe.
Also stop creating a new lock to use for dereferencing during destruction
of the tree or subtree. Instead, pass through a pointer to the tree that
has the lock that is held for RCU dereferencing checking. It also does
not make sense to use the maple state in the freeing scenario as the tree
walk is a special case where the tree no longer has the normal encodings
and parent pointers.
Link: https://lkml.kernel.org/r/20230227173632.3292573-8-surenb@google.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Cc: stable@vger.kernel.org Reported-by: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Add an smp_rmb() before reading the parent pointer to ensure that anything
read from the node prior to the parent pointer hasn't been reordered ahead
of this check.
The is necessary for RCU mode.
Link: https://lkml.kernel.org/r/20230227173632.3292573-7-surenb@google.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Cc: stable@vger.kernel.org Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The call to mte_set_dead_node() before the smp_wmb() already calls
smp_wmb() so this is not needed. This is an optimization for the RCU mode
of the maple tree.
Link: https://lkml.kernel.org/r/20230227173632.3292573-5-surenb@google.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Cc: stable@vger.kernel.org Signed-off-by: Liam Howlett <Liam.Howlett@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The walk to destroy the nodes was not always setting the node type and
would result in a destroy method potentially using the values as nodes.
Avoid this by setting the correct node types. This is necessary for the
RCU mode of the maple tree.
Link: https://lkml.kernel.org/r/20230227173632.3292573-4-surenb@google.com Cc: <Stable@vger.kernel.org> Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Liam Howlett <Liam.Howlett@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
When initially starting a search, the root node may already be in the
process of being replaced in RCU mode. Detect and restart the walk if
this is the case. This is necessary for RCU mode of the maple tree.
Link: https://lkml.kernel.org/r/20230227173632.3292573-3-surenb@google.com Cc: <Stable@vger.kernel.org> Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Liam Howlett <Liam.Howlett@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
If mas->node is an MAS_START, there are three cases, and they all assign
different values to mas->node and mas->offset. So there is no need to set
them to a default value before updating.
Update them directly to make them easier to understand and for better
readability.
Link: https://lkml.kernel.org/r/20221221060058.609003-7-vernon2gm@gmail.com Cc: <Stable@vger.kernel.org> Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Vernon Yang <vernon2gm@gmail.com> Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
If an invalidated maple state is encountered during write, reset the maple
state to MAS_START. This will result in a re-walk of the tree to the
correct location for the write.
When iterating, a user may operate on the tree and cause the maple state
to be altered and left in an unintuitive state. Detect this scenario and
correct it by setting to the limit and invalidating the state.
Link: https://lkml.kernel.org/r/20230120162650.984577-4-Liam.Howlett@oracle.com Cc: <Stable@vger.kernel.org> Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Ensure the node isn't dead after reading the node end.
Link: https://lkml.kernel.org/r/20230120162650.984577-3-Liam.Howlett@oracle.com Cc: <Stable@vger.kernel.org> Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Preallocations are common in the VMA code to avoid allocating under
certain locking conditions. The preallocations must also cover the
worst-case scenario. Removing the GFP_ZERO flag from the
kmem_cache_alloc() (and bulk variant) calls will reduce the amount of time
spent zeroing memory that may not be used. Only zero out the necessary
area to keep track of the allocations in the maple state. Zero the entire
node prior to using it in the tree.
This required internal changes to node counting on allocation, so the test
code is also updated.
This restores some micro-benchmark performance: up to +9% in mmtests mmap1
by my testing +10% to +20% in mmap, mmapaddr, mmapmany tests reported by
Red Hat
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2149636 Link: https://lkml.kernel.org/r/20230105160427.2988454-1-Liam.Howlett@oracle.com Cc: stable@vger.kernel.org Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Liam Howlett <Liam.Howlett@oracle.com> Reported-by: Jirka Hladky <jhladky@redhat.com> Suggested-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Device exclusive page table entries are used to prevent CPU access to a
page whilst it is being accessed from a device. Typically this is used to
implement atomic operations when the underlying bus does not support
atomic access. When a CPU thread encounters a device exclusive entry it
locks the page and restores the original entry after calling mmu notifiers
to signal drivers that exclusive access is no longer available.
The device exclusive entry holds a reference to the page making it safe to
access the struct page whilst the entry is present. However the fault
handling code does not hold the PTL when taking the page lock. This means
if there are multiple threads faulting concurrently on the device
exclusive entry one will remove the entry whilst others will wait on the
page lock without holding a reference.
This can lead to threads locking or waiting on a folio with a zero
refcount. Whilst mmap_lock prevents the pages getting freed via munmap()
they may still be freed by a migration. This leads to warnings such as
PAGE_FLAGS_CHECK_AT_FREE due to the page being locked when the refcount
drops to zero.
Fix this by trying to take a reference on the folio before locking it.
The code already checks the PTE under the PTL and aborts if the entry is
no longer there. It is also possible the folio has been unmapped, freed
and re-allocated allowing a reference to be taken on an unrelated folio.
This case is also detected by the PTE check and the folio is unlocked
without further changes.
Link: https://lkml.kernel.org/r/20230330012519.804116-1-apopple@nvidia.com Fixes: b756a3b5e7ea ("mm: device exclusive memory access") Signed-off-by: Alistair Popple <apopple@nvidia.com> Reviewed-by: Ralph Campbell <rcampbell@nvidia.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Acked-by: David Hildenbrand <david@redhat.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
This fixes PLL being unable to lock, and is derived from an equivalent
downstream commit.
Available LT9611 documentation does not list this register, neither does
LT9611UXC (which is a different chip).
This commit has been confirmed to fix HDMI output on DragonBoard 845c.
Suggested-by: Amit Pundir <amit.pundir@linaro.org> Reviewed-by: Amit Pundir <amit.pundir@linaro.org> Signed-off-by: Robert Foss <robert.foss@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20221213150304.4189760-1-robert.foss@linaro.org Signed-off-by: Amit Pundir <amit.pundir@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The psp suspend & resume should be skipped to avoid destroy
the TMR and reload FWs again for IMU enabled APU ASICs.
Signed-off-by: Tim Huang <tim.huang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
[Why]
In case of failure to resume MST topology after suspend, an emtpty
mst tree prevents further mst hub detection on the same connector.
That causes the issue with MST hub hotplug after it's been unplug in
suspend.
[How]
Stop topology manager on the connector after detecting DM_MST failure.
Reviewed-by: Wayne Lin <Wayne.Lin@amd.com> Acked-by: Jasdeep Dhillon <jdhillon@amd.com> Signed-off-by: Roman Li <roman.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: "Limonciello, Mario" <Mario.Limonciello@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
When CPU1 is about to execute the instruction pointed to by IP, the
ma_data_end() executed by CPU2 may return the wrong end position, which
will cause the value loaded by mtree_load() to be wrong.
An example of triggering the bug:
Add mdelay(100) between rcu_assign_pointer() and ma_set_meta() in
mas_root_expand().
static DEFINE_MTREE(tree);
int work(void *p) {
unsigned long val;
for (int i = 0 ; i< 30; ++i) {
val = (unsigned long)mtree_load(&tree, 8);
mdelay(5);
pr_info("%lu",val);
}
return 0;
}
In RCU mode, mtree_load() should always return the value before or after
the data structure is modified, and in this example mtree_load(&tree, 8)
may return 56789 which is not expected, it should always return NULL. Fix
it by put ma_set_meta() before rcu_assign_pointer().
Link: https://lkml.kernel.org/r/20230314124203.91572-4-zhangpeng.00@bytedance.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The above code should be changed to if (likely(offset < end)), which is
correct. This affects the correctness of ma_data_end(). Now it seems
that the final result will not be wrong, but it is best to change it.
This patch does not change the code as above, because it simplifies the
code by the way.
This patch fixes an issue that a hugetlb uffd-wr-protected mapping can be
writable even with uffd-wp bit set. It only happens with hugetlb private
mappings, when someone firstly wr-protects a missing pte (which will
install a pte marker), then a write to the same page without any prior
access to the page.
Userfaultfd-wp trap for hugetlb was implemented in hugetlb_fault() before
reaching hugetlb_wp() to avoid taking more locks that userfault won't
need. However there's one CoW optimization path that can trigger
hugetlb_wp() inside hugetlb_no_page(), which will bypass the trap.
This patch skips hugetlb_wp() for CoW and retries the fault if uffd-wp bit
is detected. The new path will only trigger in the CoW optimization path
because generic hugetlb_fault() (e.g. when a present pte was
wr-protected) will resolve the uffd-wp bit already. Also make sure
anonymous UNSHARE won't be affected and can still be resolved, IOW only
skip CoW not CoR.
This patch will be needed for v5.19+ hence copy stable.
[peterx@redhat.com: v2] Link: https://lkml.kernel.org/r/ZBzOqwF2wrHgBVZb@x1n
[peterx@redhat.com: v3] Link: https://lkml.kernel.org/r/20230324142620.2344140-1-peterx@redhat.com Link: https://lkml.kernel.org/r/20230321191840.1897940-1-peterx@redhat.com Fixes: 166f3ecc0daf ("mm/hugetlb: hook page faults for uffd write protection") Signed-off-by: Peter Xu <peterx@redhat.com> Reported-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Tested-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Nadav Amit <nadav.amit@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The si->lock must be held when deleting the si from the available list.
Otherwise, another thread can re-add the si to the available list, which
can lead to memory corruption. The only place we have found where this
happens is in the swapoff path. This case can be described as below:
core 0 core 1
swapoff
del_from_avail_list(si) waiting
try lock si->lock acquire swap_avail_lock
and re-add si into
swap_avail_head
acquire si->lock but missing si already being added again, and continuing
to clear SWP_WRITEOK, etc.
It can be easily found that a massive warning messages can be triggered
inside get_swap_pages() by some special cases, for example, we call
madvise(MADV_PAGEOUT) on blocks of touched memory concurrently, meanwhile,
run much swapon-swapoff operations (e.g. stress-ng-swap).
However, in the worst case, panic can be caused by the above scene. In
swapoff(), the memory used by si could be kept in swap_info[] after
turning off a swap. This means memory corruption will not be caused
immediately until allocated and reset for a new swap in the swapon path.
A panic message caused: (with CONFIG_PLIST_DEBUG enabled)
Now, si->lock locked before calling 'del_from_avail_list()' to make sure
other thread see the si had been deleted and SWP_WRITEOK cleared together,
will not reinsert again.
This problem exists in versions after stable 5.10.y.
Link: https://lkml.kernel.org/r/20230404154716.23058-1-rongwei.wang@linux.alibaba.com Fixes: a2468cc9bfdff ("swap: choose swap device according to numa node") Tested-by: Yongchen Yin <wb-yyc939293@alibaba-inc.com> Signed-off-by: Rongwei Wang <rongwei.wang@linux.alibaba.com> Cc: Bagas Sanjaya <bagasdotme@gmail.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Aaron Lu <aaron.lu@intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
When user reads file 'trace_pipe', kernel keeps printing following logs
that warn at "cpu_buffer->reader_page->read > rb_page_size(reader)" in
rb_get_reader_page(). It just looks like there's an infinite loop in
tracing_read_pipe(). This problem occurs several times on arm64 platform
when testing v5.10 and below.
Then I dump the vmcore and look into the problematic per_cpu ring_buffer,
I found that tail_page/commit_page/reader_page are on the same page while
reader_page->read is obviously abnormal:
tail_page == commit_page == reader_page == {
.write = 0x100d20,
.read = 0x8f9f4805, // Far greater than 0xd20, obviously abnormal!!!
.entries = 0x10004c,
.real_end = 0x0,
.page = {
.time_stamp = 0x857257416af0,
.commit = 0xd20, // This page hasn't been full filled.
// .data[0...0xd20] seems normal.
}
}
The root cause is most likely the race that reader and writer are on the
same page while reader saw an event that not fully committed by writer.
To fix this, add memory barriers to make sure the reader can see the
content of what is committed. Since commit a0fcaaed0c46 ("ring-buffer: Fix
race between reset page and reading page") has added the read barrier in
rb_get_reader_page(), here we just need to add the write barrier.
Link: https://lore.kernel.org/linux-trace-kernel/20230325021247.2923907-1-zhengyejian1@huawei.com Cc: stable@vger.kernel.org Fixes: 77ae365eca89 ("ring-buffer: make lockless") Suggested-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Userspace can guess the id value and try to race oa_config object creation
with config remove, resulting in a use-after-free if we dereference the
object after unlocking the metrics_lock. For that reason, unlocking the
metrics_lock must be done after we are done dereferencing the object.
Signed-off-by: Min Li <lm0963hack@gmail.com> Fixes: f89823c21224 ("drm/i915/perf: Implement I915_PERF_ADD/REMOVE_CONFIG interface") Cc: <stable@vger.kernel.org> # v4.14+ Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230328093627.5067-1-lm0963hack@gmail.com
[tursulin: Manually added stable tag.]
(cherry picked from commit 49f6f6483b652108bcb73accd0204a464b922395) Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
When considering whether to mark one context as stopped and another as
started we need to look at whether the previous and new _contexts_ are
different and not just requests. Otherwise the software tracked context
start time was incorrectly updated to the most recent lite-restore time-
stamp, which was in some cases resulting in active time going backward,
until the context switch (typically the heartbeat pulse) would synchronise
with the hardware tracked context runtime. Easiest use case to observe
this behaviour was with a full screen clients with close to 100% engine
load.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Fixes: bb6287cb1886 ("drm/i915: Track context current active time") Cc: <stable@vger.kernel.org> # v5.19+ Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230320151423.1708436-1-tvrtko.ursulin@linux.intel.com
[tursulin: Fix spelling in commit msg.]
(cherry picked from commit b3e70051879c665acdd3a1ab50d0ed58d6a8001f) Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Since SQE memory is shared with userspace, we should only be reading it
once. We cannot read it multiple times, particularly when it's read once
for validation and then read again for the actual use.
ublk_ch_uring_cmd() is safe when called as a retry operation, as the
memory backing is stable at that point. But for normal issue, we want
to ensure that we only read ublksrv_io_cmd once. Wrap the function in
a helper that reads the value into an on-stack copy of the struct.
Cc: stable@vger.kernel.org # 6.0+ Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
This helps avoid cleartext leakage of already queued or powersave buffered
packets, when a reassoc triggers the key deletion.
Cc: stable@vger.kernel.org Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20230330091259.61378-1-nbd@nbd.name Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
This failure really confuse me as there're still lots of available pages.
Finally I figured out it was caused by a fatal signal. When a process is
allocating memory via vm_area_alloc_pages(), it will break directly even
if it hasn't allocated the requested pages when it receives a fatal
signal. In that case, we shouldn't show this warn_alloc, as it is
useless. We only need to show this warning when there're really no enough
pages.
Link: https://lkml.kernel.org/r/20230330162625.13604-1-laoar.shao@gmail.com Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Reviewed-by: Lorenzo Stoakes <lstoakes@gmail.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Uladzislau Rezki (Sony) <urezki@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The same task check in perf_event_set_output has some potential issues
for some usages.
For the current perf code, there is a problem if using of
perf_event_open() to have multiple samples getting into the same mmap’d
memory when they are both attached to the same process.
https://lore.kernel.org/all/92645262-D319-4068-9C44-2409EF44888E@gmail.com/
Because the event->ctx is not ready when the perf_event_set_output() is
invoked in the perf_event_open().
Besides the above issue, before the commit bd2756811766 ("perf: Rewrite
core context handling"), perf record can errors out when sampling with
a hardware event and a software event as below.
$ perf record -e cycles,dummy --per-thread ls
failed to mmap with 22 (Invalid argument)
That's because that prior to the commit a hardware event and a software
event are from different task context.
The problem should be a long time issue since commit c3f00c70276d
("perk: Separate find_get_context() from event initialization").
The task struct is stored in the event->hw.target for each per-thread
event. It is a more reliable way to determine whether two events are
attached to the same task.
The event->hw.target was also introduced several years ago by the
commit 50f16a8bf9d7 ("perf: Remove type specific target pointers"). It
can not only be used to fix the issue with the current code, but also
back port to fix the issues with an older kernel.
Note: The event->hw.target was introduced later than commit c3f00c70276d. The patch may cannot be applied between the commit c3f00c70276d and commit 50f16a8bf9d7. Anybody that wants to back-port
this at that period may have to find other solutions.
Fixes: c3f00c70276d ("perf: Separate find_get_context() from event initialization") Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Zhengjun Xing <zhengjun.xing@linux.intel.com> Link: https://lkml.kernel.org/r/20230322202449.512091-1-kan.liang@linux.intel.com Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Thomas reported that offlining CPUs spends a lot of time in
synchronize_rcu() as called from perf_pmu_migrate_context() even though
he's not actually using uncore events.
Turns out, the thing is unconditionally waiting for RCU, even if there's
no actual events to migrate.
Fixes: 0cda4c023132 ("perf: Introduce perf_pmu_migrate_context()") Reported-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Paul E. McKenney <paulmck@kernel.org> Link: https://lkml.kernel.org/r/20230403090858.GT4253@hirez.programming.kicks-ass.net Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>