Ville Syrjälä [Tue, 5 Mar 2024 08:36:59 +0000 (10:36 +0200)]
drm/i915/dsi: Go back to the previous INIT_OTP/DISPLAY_ON order, mostly
Reinstate commit 88b065943cb5 ("drm/i915/dsi: Do display on
sequence later on icl+"), for the most part. Turns out some
machines (eg. Chuwi Minibook X) really do need that updated order.
It is also the order the Windows driver uses.
However we can't just undo the revert since that would again
break Lenovo 82TQ. After staring at the VBT sequences for both
machines I've concluded that the Lenovo 82TQ sequences look
somewhat broken:
- INIT_OTP is not present at all
- what should be in INIT_OTP is found in DISPLAY_ON
- what should be in DISPLAY_ON is found in BACKLIGHT_ON
(along with the actual backlight stuff)
The Chuwi Minibook X on the other hand has a full complement
of sequences in its VBT.
So let's try to deal with the broken sequences in the
Lenovo 82TQ VBT by simply swapping the (non-existent)
INIT_OTP sequence with the DISPLAY_ON sequence. Thus we
execute DISPLAY_ON when intending to execute INIT_OTP,
and execute nothing at all when intending to execute
DISPLAY_ON. That should be 100% equivalent to the
revert, for such broken VBTs.
Cc: stable@vger.kernel.org Fixes: dc524d05974f ("Revert "drm/i915/dsi: Do display on sequence later on icl+"")
References: https://gitlab.freedesktop.org/drm/intel/-/issues/10071 Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/10334 Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240305083659.8396-1-ville.syrjala@linux.intel.com Acked-by: Jani Nikula <jani.nikula@intel.com>
drm/i915/display: Disable AuxCCS framebuffers if built for Xe
AuxCCS framebuffers don't work on Xe driver hence disable them
from plane capabilities until they are fixed. FlatCCS framebuffers
work and they are left enabled. CCS is left untouched for i915
driver.
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/933 Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Tested-by: José Roberto de Souza <jose.souza@intel.com> Acked-by: Jani Nikula <jani.nikula@intel.com> Fixes: 44e694958b95 ("drm/xe/display: Implement display support") Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240228140225.858145-1-juhapekka.heikkila@gmail.com
Ville Syrjälä [Mon, 26 Feb 2024 19:32:50 +0000 (21:32 +0200)]
drm/i915: Stop doing double audio enable/disable on SDVO and g4x+ DP
Looks like I misplaced a few hunks when I moved the audio
enable/disable out from the encoder enable/disable hooks.
So we are now doing a double audio enable/disable on SDVO
and g4x+ DP. Probably harmless as doing it twice shouldn't
really change anything, but let's do it just once, as intended.
Imre Deak [Mon, 5 Feb 2024 13:26:31 +0000 (15:26 +0200)]
drm/i915/dp: Fix connector DSC HW state readout
The DSC HW state of DP connectors is read out during driver loading and
system resume in intel_modeset_update_connector_atomic_state(). This
function is called for all connectors though and so the state of DSI
connectors will also get updated incorrectly, triggering a WARN there
wrt. the DSC decompression AUX device.
Fix the above by moving the DSC state readout to a new DP connector
specific sync_state() hook. This is anyway the logical place to update
the connector object's state vs. the connector's atomic state.
Fixes: b2608c6b3212 ("drm/i915/dp_mst: Enable MST DSC decompression for all streams") Reported-and-tested-by: Drew Davenport <ddavenport@chromium.org> Closes: https://lore.kernel.org/all/Zb0q8IDVXS0HxJyj@chromium.org Reviewed-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240205132631.1588577-1-imre.deak@intel.com
Imre Deak [Wed, 28 Feb 2024 16:46:36 +0000 (18:46 +0200)]
drm/dp: Fix documentation of DP tunnel functions
Fix the documentation issues below, also reported by 'make htmldocs':
drivers/gpu/drm/display/drm_dp_tunnel.c:447: warning: Function parameter or struct member 'tunnel' not described in 'drm_dp_tunnel_put'
drivers/gpu/drm/display/drm_dp_tunnel.c:447: warning: Function parameter or struct member 'tracker' not described in 'drm_dp_tunnel_put'
drivers/gpu/drm/display/drm_dp_tunnel.c:1185: warning: expecting prototype for drm_dp_tunnel_atomic_get_allocated_bw(). Prototype was for drm_dp_tunnel_get_allocated_bw() instead
drivers/gpu/drm/display/drm_dp_tunnel.c:1903: warning: Function parameter or struct member 'max_group_count' not described in 'drm_dp_tunnel_mgr_create'
Animesh Manna [Thu, 29 Feb 2024 04:37:16 +0000 (10:07 +0530)]
drm/i915/panelreplay: Move out psr_init_dpcd() from init_connector()
Move psr_init_dpcd() from init-connector to connector-detect
function. The dpcd probe for checking panel replay capability
for external dp connector is causing delay during boot which can
be optimized by moving dpcd probe to connector specific detect().
v1: Initial version.
v2: Add details in commit description. [Jani]
Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/10284 Signed-off-by: Animesh Manna <animesh.manna@intel.com> Fixes: cceeaa312d39 ("drm/i915/panelreplay: Enable panel replay dpcd initialization for DP") Reviewed-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240229043716.4065760-1-animesh.manna@intel.com
Ville Syrjälä [Fri, 23 Feb 2024 20:32:16 +0000 (22:32 +0200)]
drm/i915: Simplify aux_ch_to_digital_port()
Just return the correct thing from within the loop to make
the code more readable. We have no ref counts/etc. to deal
with here so no point in breaking from the loop just to return
something.
Ville Syrjälä [Fri, 23 Feb 2024 20:32:15 +0000 (22:32 +0200)]
drm/i915: Don't explode when the dig port we don't have an AUX CH
The icl+ power well code currently assumes that every AUX power
well maps to an encoder which is using said power well. That is
by no menas guaranteed as we:
- only register encoders for ports declared in the VBT
- combo PHY HDMI-only encoder no longer get an AUX CH since
commit 9856308c94ca ("drm/i915: Only populate aux_ch if really needed")
However we have places such as intel_power_domains_sanitize_state()
that blindly traverse all the possible power wells. So these bits
of code may very well encounbter an aux power well with no associated
encoder.
In this particular case the BIOS seems to have left one AUX power
well enabled even though we're dealing with a HDMI only encoder
on a combo PHY. We then proceed to turn off said power well and
explode when we can't find a matching encoder. As a short term fix
we should be able to just skip the PHY related parts of the power
well programming since we know this situation can only happen with
combo PHYs.
Another option might be to go back to always picking an AUX CH for
all encoders. However I'm a bit wary about that since we might in
theory end up conflicting with the VBT AUX CH assignment. Also
that wouldn't help with encoders not declared in the VBT, should
we ever need to poke the corresponding power wells.
Longer term we need to figure out what the actual relationship
is between the PHY vs. AUX CH vs. AUX power well. Currently this
is entirely unclear.
drm/i915/display: Save a few bytes of memory in intel_backlight_device_register()
'name' may still be "intel_backlight" when backlight_device_register()
is called. In such a case, using kstrdup_const() saves a memory
duplication when dev_set_name() is called in
backlight_device_register().
Add a function to return the expected child device size. Flip the if
ladder around and use the same versions as in documentation to make it
easier to verify. Return an error for unknown versions. No functional
changes.
v2: Move BUILD_BUG_ON() next to the expected sizes
Gustavo Sousa [Wed, 14 Feb 2024 20:27:20 +0000 (17:27 -0300)]
drm/i915/cdclk: Rename intel_cdclk_needs_modeset to intel_cdclk_clock_changed
Looks like the name and description of intel_cdclk_needs_modeset()
became inaccurate as of commit 59f9e9cab3a1 ("drm/i915: Skip modeset for
cdclk changes if possible"), when it became possible to update the cdclk
without requiring disabling the pipes when only changing the cd2x
divider was enough.
Later on we also added the same type of support with squash and crawling
with commit 25e0e5ae5610 ("drm/i915/display: Do both crawl and squash
when changing cdclk"), commit d4a23930490d ("drm/i915: Allow cdclk
squasher to be reconfigured live") and commit d62686ba3b54
("drm/i915/adl_p: CDCLK crawl support for ADL").
As such, update that function's name and documentation to something more
appropriate, since the real checks for requiring modeset are done
elsewhere.
v2:
- Rename to intel_cdclk_clock_changed instead of
intel_cdclk_params_changed. (Ville)
Dave Airlie [Wed, 28 Feb 2024 01:02:54 +0000 (11:02 +1000)]
Merge tag 'drm-intel-next-2024-02-27-1' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
drm/i915 feature pull #2 for v6.9:
Features and functionality:
- DP tunneling and bandwidth allocation support (Imre)
- Add more ADL-N PCI IDs (Gustavo)
- Enable fastboot also on older platforms (Ville)
- Bigjoiner force enable debugfs option for testing (Stan)
Refactoring and cleanups:
- Remove unused structs and struct members (Jiri Slaby)
- Use per-device debug logging (Ville)
- State check improvements (Ville)
- Hardcoded cd2x divider cleanups (Ville)
- CDCLK documentation updates (Ville, Rodrigo)
Fixes:
- HDCP MST Type1 fixes (Suraj)
- Fix MTL C20 PHY PLL values (Ravi)
- More hardware access prevention during init (Imre)
- Always enable decompression with tile4 on Xe2 (Juha-Pekka)
- Improve LNL package C residency (Suraj)
Imre Deak [Tue, 20 Feb 2024 21:18:40 +0000 (23:18 +0200)]
drm/i915/dp: Read DPRX for all long HPD pulses
The TBT DP tunnel BW request logic in the Thunderbolt Connection Manager
depends on the GFX driver reading out the sink's DPRX capabilities in
response to a long HPD pulse. Since in i915 this read-out can be blocked
by another connector's/encoder's hotplug event handling (which is
serialized by drm_mode_config::connection_mutex), do a dummy DPRX read-out
in the encoder's HPD pulse handler (which is not blocked by other
encoders).
Imre Deak [Tue, 20 Feb 2024 21:18:39 +0000 (23:18 +0200)]
drm/i915/dp: Suspend/resume DP tunnels
Suspend and resume DP tunnels during system suspend/resume, disabling
the BW allocation mode during suspend, re-enabling it after resume. This
reflects the link's BW management component (Thunderbolt CM) disabling
BWA during suspend. Before any BW requests the driver must read the
sink's DPRX capabilities (since the BW manager requires this
information, so snoops for it on AUX), so ensure this read takes place.
Imre Deak [Tue, 20 Feb 2024 21:18:38 +0000 (23:18 +0200)]
drm/i915/dp: Call intel_dp_sync_state() always for DDI DP encoders
A follow-up change will need to resume DP tunnels during system resume,
so call intel_dp_sync_state() always for DDI encoders, so this function
can resume the tunnels for all DP connectors.
Imre Deak [Tue, 20 Feb 2024 21:18:37 +0000 (23:18 +0200)]
drm/i915/dp: Handle DP tunnel IRQs
Handle DP tunnel IRQs a sink (or rather a BW management component like
the Thunderbolt Connection Manager) raises to signal the completion of a
BW request by the driver, or to signal any state change related to the
link BW.
Imre Deak [Tue, 20 Feb 2024 21:18:35 +0000 (23:18 +0200)]
drm/i915/dp: Compute DP tunnel BW during encoder state computation
Compute the BW required through a DP tunnel on links with such tunnels
detected and add the corresponding atomic state during a modeset.
v2:
- Fix error check of intel_dp_tunnel_compute_stream_bw(). (Ville)
- Move intel_dp_tunnel_atomic_cleanup_inherited_state() to this patch.
(Ville)
- Move intel_dp_tunnel_atomic_clear_stream_bw() to this patch.
Imre Deak [Tue, 20 Feb 2024 21:18:34 +0000 (23:18 +0200)]
drm/i915/dp: Account for tunnel BW limit in intel_dp_max_link_data_rate()
Take any link BW limitation into account in
intel_dp_max_link_data_rate(). Such a limitation can be due to multiple
displays on (Thunderbolt) links with DP tunnels sharing the link BW.
Imre Deak [Tue, 20 Feb 2024 21:18:33 +0000 (23:18 +0200)]
drm/i915/dp: Add DP tunnel atomic state and check BW limit
Add the atomic state during a modeset required to enable the DP tunnel
BW allocation mode on links where such a tunnel was detected. This state
applies to an already enabled output, the state added for a newly
enabled output will be computed and added/cleared to/from the atomic
state in a follow-up patch.
v2:
- s/old_crtc_state/crtc_state in intel_crtc_duplicate_state().
- Move intel_dp_tunnel_atomic_cleanup_inherited_state() to a follow-up
patch adding the corresponding state. (Ville)
- Move intel_dp_tunnel_atomic_clear_stream_bw() to a follow-up
patch adding the corresponding state.
Imre Deak [Mon, 26 Feb 2024 18:52:46 +0000 (20:52 +0200)]
drm/i915/dp: Add support for DP tunnel BW allocation
Add support to enable the DP tunnel BW allocation mode. Follow-up
patches will call the required helpers added here to prepare for a
modeset on a link with DP tunnels, the last change in the patchset
actually enabling BWA.
With BWA enabled, the driver will expose the full mode list a display
supports, regardless of any BW limitation on a shared (Thunderbolt)
link. Such BW limits will be checked against only during a modeset, when
the driver has the full knowledge of each display's BW requirement.
If the link BW changes in a way that a connector's modelist may also
change, userspace will get a hotplug notification for all the connectors
sharing the same link (so it can adjust the mode used for a display).
The BW limitation can change at any point, asynchronously to modesets
on a given connector, so a modeset can fail even though the atomic check
for it passed. In such scenarios userspace will get a bad link
notification and in response is supposed to retry the modeset.
v2:
- Fix old vs. new connector state in intel_dp_tunnel_atomic_check_state().
(Ville)
- Fix propagating the error from
intel_dp_tunnel_atomic_compute_stream_bw(). (Ville)
- Move tunnel==NULL checks from driver to DRM core helpers. (Ville)
- Simplify return flow from intel_dp_tunnel_detect(). (Ville)
- s/dp_tunnel_state/inherited_dp_tunnels (Ville)
- Simplify struct intel_dp_tunnel_inherited_state. (Ville)
- Unconstify object pointers (vs. states) where possible. (Ville)
- Init crtc_state while declaring it in check_group_state(). (Ville)
- Join obj->base.id, obj->name arg lines in debug prints to reduce LOC.
(Ville)
- Add/rework intel_dp_tunnel_atomic_alloc_bw() to prepare for moving the
BW allocation from encoder hooks up to intel_atomic_commit_tail()
later in the patchset.
- Disable BW alloc mode during system suspend.
- Allocate the required BW for all tunnels during system resume.
- Add intel_dp_tunnel_atomic_clear_stream_bw() instead of the open-coded
sequence in a follow-up patch.
- Add function documentation to all exported functions.
- Add CONFIG_USB4 dependency to CONFIG_DRM_I915_DP_TUNNEL.
v3:
- Rebase on intel_dp_get_active_pipes() change in previous patch.
Imre Deak [Mon, 26 Feb 2024 18:52:45 +0000 (20:52 +0200)]
drm/i915/dp: Sync instead of try-sync commits when getting active pipes
Sync instead of only try-sync non-blocking commits when getting the
active pipes through a given DP port. Atm intel_dp_get_active_pipes()
will only try to sync a given pipe and if that would block ignore the
pipe. This was supposed to avoid link retraining in case a pending
modeset would do that anyway, however that could incorrectly ignore
fastset pipes as well for instance (which don't retraing the link).
The TC port reset path needs to handle all pipes, even if a waiting for
a pending commit would block. To account for the above cases sync all
the pipes unconditionally.
This also prepares for a follow-up change enabling the DP tunnel BW
allocation mode which needs to ensure that all active pipes are synced
and returned from intel_dp_get_active_pipes().
v2:
- Add a separate function to try-sync the pipes. (Ville)
v3:
- Just sync the pipes unconditionally in intel_dp_get_active_pipes().
(Ville)
Imre Deak [Tue, 20 Feb 2024 21:18:30 +0000 (23:18 +0200)]
drm/i915/dp: Add intel_dp_max_link_data_rate()
Add intel_dp_max_link_data_rate() to get the link BW vs. the sink DPRX
BW used by a follow-up patch enabling the DP tunnel BW allocation mode.
The link BW can be below the DPRX BW due to a BW limitation on a link
shared by multiple sinks.
Imre Deak [Tue, 20 Feb 2024 21:18:24 +0000 (23:18 +0200)]
drm/i915/dp: Add support to notify MST connectors to retry modesets
On shared (Thunderbolt) links with DP tunnels, the modeset may need to
be retried on all connectors on the link due to a link BW limitation
arising only after the atomic check phase. To support this add a helper
function queuing a work to retry the modeset on a given port's connector
and at the same time any MST connector with streams through the same
port. A follow-up change enabling the DP tunnel Bandwidth Allocation
Mode will take this into use.
v2:
- Send the uevent only to enabled MST connectors. (Jouni)
Imre Deak [Tue, 20 Feb 2024 21:18:23 +0000 (23:18 +0200)]
drm/i915: Fix display bpp limit computation during system resume
The system resume display mode restoration should happen with an output
configuration matching that of the suspend time saved mode. Since the
restored mode configuration is subject to the bpp fallback logic,
starting out with an unlimited bpp and reducing the bpp as required by
any (MST) link BW limit, the resulting bpp will match the one during
suspend only if the BW limit checks during suspend and resume are
applied in an identical way. The latter is not guaranteed at the moment,
since the pre-suspend MST topology may not be in place during resume
(for instance if the MST sink was disconnected while being suspended),
which makes the MST link BW check accept the unlimited bpp mode
configuration unconditionally without ensuring that the required BW fits
into the available MST link BW.
To fix the above, initialize the bpp fallback logic with the max link
bpp / force-FEC limits left behind by the suspend time mode save.
Imre Deak [Mon, 26 Feb 2024 18:52:44 +0000 (20:52 +0200)]
drm/dp: Add support for DP tunneling
Add support for Display Port tunneling. For now this includes the
support for Bandwidth Allocation Mode (BWA), leaving adding Panel Replay
support for later.
BWA allows using displays that share the same (Thunderbolt) link with
their maximum resolution. Atm, this may not be possible due to the
coarse granularity of partitioning the link BW among the displays on the
link: the BW allocation policy is in a SW/FW/HW component on the link
(on Thunderbolt it's the SW or FW Connection Manager), independent of
the driver. This policy will set the DPRX maximum rate and lane count
DPCD registers the GFX driver will see (0x00000, 0x00001, 0x02200,
0x02201) based on the available link BW.
The granularity of the current BW allocation policy is coarse, based on
the required link rate in the 1.62Gbs..8.1Gbps range and it may prevent
using higher resolutions all together: the display connected first will
get a share of the link BW which corresponds to its full DPRX capability
(regardless of the actual mode it uses). A subsequent display connected
will only get the remaining BW, which could be well below its full
capability.
BWA solves the above coarse granularity (reducing it to a 250Mbs..1Gps
range) and first-come/first-served issues by letting the driver request
the BW for each display on a link which reflects the actual modes the
displays use.
This patch adds the DRM core helper functions, while a follow-up change
in the patchset takes them into use in the i915 driver.
v2:
- Fix prepare_to_wait vs. wake-up cond check order in
allocate_tunnel_bw(). (Ville)
- Move tunnel==NULL checks from callers in drivers to here. (Ville)
- Avoid var inits in declaration blocks that can fail or have
side-effects. (Ville)
- Use u8 for driver and group IDs. (Ville)
- Simplify API removing drm_dp_tunnel_get/put_untracked(). (Ville)
- Reuse str_yes_no() instead of a local yes_no_chr(). (Ville)
- s/drm_dp_tunnel_atomic_clear_state()/free_tunnel_state() and unexport
the function. (Ville)
- s/clear_tunnel_group_state()/free_group_state() and move kfree() to
this function. (Ville)
- Add separate group_free_bw() helper and describe what the tunnel
estimated BW includes. (Ville)
- Improve help text for CONFIG_DRM_DISPLAY_DP_TUNNEL. (Ville)
- Add code comment explaining the purpose of DPCD reg read helpers.
(Ville)
- Add code comment describing the tunnel group name prefix format.
(Ville)
- Report the allocated BW as undetermined until the first allocation
request.
- Skip allocation requests matching the previous request.
- Clear any stale BW request status flags before a new request.
- Add missing error return check of drm_dp_tunnel_atomic_get_group_state()
in drm_dp_tunnel_atomic_set_stream_bw().
- Add drm_dp_tunnel_get_allocated_bw().
- s/drm_dp_tunnel_atomic_get_tunnel_bw/drm_dp_tunnel_atomic_get_required_bw
- Fix return value description in function doc of drm_dp_tunnel_detect().
- Add function documentation to all exported functions.
v3:
- Improve grouping of fields in drm_dp_tunnel_group struct. (Uma)
- Fix validating the BW granularity DPCD reg value. (Uma)
- Document return value of check_and_clear_status_change(). (Uma)
- Fix resetting drm_dp_tunnel_ref::tunnel in drm_dp_tunnel_ref_put().
(Ville)
- Allow for ALLOCATED_BW to change after a BWA enable/disable sequence.
Imre Deak [Mon, 26 Feb 2024 18:52:43 +0000 (20:52 +0200)]
drm/dp: Add drm_dp_max_dprx_data_rate()
Copy intel_dp_max_data_rate() to DRM core. It will be needed by a
follow-up DP tunnel patch, checking the maximum rate the DPRX (sink)
supports. Accordingly use the drm_dp_max_dprx_data_rate() name for
clarity. This patchset will also switch calling the new DRM function
in i915 instead of intel_dp_max_data_rate().
While at it simplify the function documentation/comments, removing
parts described already by drm_dp_bw_channel_coding_efficiency().
v2: (Ville)
- Remove max_link_rate_kbps.
- Simplify the function documentation.
v3:
- Rebased on latest drm-tip.
Suraj Kandpal [Mon, 26 Feb 2024 06:30:52 +0000 (12:00 +0530)]
drm/i915/hdcp: Read Rxcaps for robustibility
We see some monitors and docks report incorrect hdcp version
and capability in first few reads so we read rx_caps three times
before we conclude the monitor's or docks HDCP capability
--v2
-Add comment to justify the 3 time read loop for hdcp capability[Ankit]
Suraj Kandpal [Mon, 26 Feb 2024 06:30:51 +0000 (12:00 +0530)]
drm/i915/hdcp: Allocate stream id after HDCP AKE stage
Allocate stream id after HDCP AKE stage and not before so that it
can also be done during link integrity check.
Right now for MST scenarios LIC fails after hdcp enablement for this
reason.
--v2
-no need for else block in prepare_streams function [Ankit]
--v3
-remove intel_hdcp argument from required_content_stream function
[Ankit]
Suraj Kandpal [Mon, 26 Feb 2024 06:30:50 +0000 (12:00 +0530)]
drm/i915/hdcp: Don't enable HDCP1.4 directly from check_link
Whenever LIC fails instead of moving from ENABLED to DESIRED
CP property we directly enable HDCP1.4 without informing the userspace
of this failure in link integrity check.
Now we will just update the value to DESIRED send the event to
userspace and then continue with the normal flow of HDCP enablement.
Suraj Kandpal [Mon, 26 Feb 2024 06:30:49 +0000 (12:00 +0530)]
drm/i915/hdcp: Don't enable HDCP2.2 directly from check_link
Whenever LIC fails instead of moving from ENABLED to DESIRED
CP property we directly enable HDCP2.2 without informing the userspace
of this failure in link integrity check.
Now we will just update the value to DESIRED send the event to
userspace and then continue with the normal flow of HDCP enablement.
--v2
-Don't change the function prototype in this function [Ankit]
Daniel Vetter [Mon, 26 Feb 2024 10:41:07 +0000 (11:41 +0100)]
Merge v6.8-rc6 into drm-next
Thomas Zimmermann asked to backmerge -rc6 for drm-misc branches,
there's a few same-area-changed conflicts (xe and amdgpu mostly) that
are getting a bit too annoying.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Daniel Vetter [Mon, 26 Feb 2024 10:06:19 +0000 (11:06 +0100)]
Merge tag 'drm-habanalabs-next-2024-02-26' of https://git.kernel.org/pub/scm/linux/kernel/git/ogabbay/linux into drm-next
This tag contains habanalabs driver and accel changes for v6.9.
The notable changes are:
- New features and improvements:
- Configure interrupt affinity according to NUMA nodes for the MSI-X interrupts that are
assigned to the userspace application which acquires the device.
- Move the HBM MMU page tables to reside inside the HBM to minimize latency when doing
page-walks.
- Improve the device reset mechanism when consecutive heartbeat failures occur (firmware
fails to ack on heartbeat message).
- Check also extended errors in the PCIe addr_dec interrupt information.
- Rate limit the error messages that can be printed to dmesg log by userspace actions.
- Firmware related fixes:
- Handle requests from firmware to reserve device memory
- Bug fixes and code cleanups:
- constify the struct device_type usage in accel (accel_sysfs_device_minor).
- Fix the PCI health check by reading uncached register.
- Fix reporting of drain events.
- Fix debugfs files permissions.
- Fix calculation of DRAM BAR base address.
Daniel Vetter [Mon, 26 Feb 2024 09:49:09 +0000 (10:49 +0100)]
Merge tag 'drm-xe-next-2024-02-25' of ssh://gitlab.freedesktop.org/drm/xe/kernel into drm-next
drm/xe feature pull for v6.9:
UAPI Changes:
- New query to the GuC firmware submission version. (José Roberto de Souza)
- Remove unused persistent exec_queues (Thomas Hellström)
- Add vram frequency sysfs attributes (Sujaritha Sundaresan, Rodrigo Vivi)
- Add the flag XE_VM_BIND_FLAG_DUMPABLE to notify devcoredump that mapping
should be dumped (Maarten Lankhorst)
Cross-drivers Changes:
- Make sure intel_wakeref_t is treated as opaque type on i915-display
and fix its type on xe
Driver Changes:
- Drop pre-production workarounds (Matt Roper)
- Drop kunit tests for unsuported platforms: PVC and pre-production DG2 (Lucas De Marchi)
- Start pumbling SR-IOV support with memory based interrupts
for VF (Michal Wajdeczko)
- Allow to map BO in GGTT with PAT index corresponding to
XE_CACHE_UC to work with memory based interrupts (Michal Wajdeczko)
- Improve logging with GT-oriented drm_printers (Michal Wajdeczko)
- Add GuC Doorbells Manager as prep work SR-IOV during
VF provisioning ((Michal Wajdeczko)
- Refactor fake device handling in kunit integration ((Michal Wajdeczko)
- Implement additional workarounds for xe2 and MTL (Tejas Upadhyay,
Lucas De Marchi, Shekhar Chauhan, Karthik Poosa)
- Program a few registers according to perfomance guide spec for Xe2 (Shekhar Chauhan)
- Add error handling for non-blocking communication with GuC (Daniele Ceraolo Spurio)
- Fix remaining 32b build issues and enable it back (Lucas De Marchi)
- Fix build with CONFIG_DEBUG_FS=n (Jani Nikula)
- Fix warnings from GuC ABI headers (Matthew Brost)
- Introduce Relay Communication for SR-IOV for VF <-> GuC <-> PF (Michal Wajdeczko)
- Add mocs reset kunit (Ruthuvikas Ravikumar)
- Fix spellings (Colin Ian King)
- Disable mid-thread preemption when not properly supported by hardware (Nirmoy Das)
- Release mmap mappings on rpm suspend (Badal Nilawar)
- Fix BUG_ON on xe_exec by moving fence reservation to the validate stage (Matthew Auld)
- Fix xe_exec by reserving extra fence slot for CPU bind (Matthew Brost)
- Fix xe_exec with full long running exec queue, now returning
-EWOULDBLOCK to userspace (Matthew Brost)
- Fix CT irq handler when CT is disabled (Matthew Brost)
- Fix VM_BIND_OP_UNMAP_ALL without any bound vmas (Thomas Hellström)
- Fix missing __iomem annotations (Thomas Hellström)
- Fix exec queue priority handling with GuC (Brian Welty)
- Fix setting SLPC flag to GuC when it's not supported (Vinay Belgaumkar)
- Fix C6 disabling without SLPC (Matt Roper)
- Drop -Wstringop-overflow to fix build with GCC11 (Paul E. McKenney)
- Circumvent bogus -Wstringop-overflow in one case (Arnd Bergmann)
- Refactor exec_queue user extensions handling and fix USM attributes
being applied too late (Brian Welty)
- Use circ_buf head/tail convention (Matthew Brost)
- Fail build if circ_buf-related defines are modified with incompatible values
(Matthew Brost)
- Fix several error paths (Dan Carpenter)
- Fix CCS copy for small VRAM copy chunks (Thomas Hellström)
- Rework driver initialization order and paths to account for driver running
in VF mode (Michal Wajdeczko)
- Initialize GuC earlier during probe to handle driver in VF mode (Michał Winiarski)
- Fix migration use of MI_STORE_DATA_IMM to write PTEs (Matt Roper)
- Fix bounds checking in __xe_bo_placement_for_flags (Brian Welty)
- Drop display dependency on CONFIG_EXPERT (Jani Nikula)
- Do not hand-roll kstrdup when creating snapshot (Michal Wajdeczko)
- Stop creating one kunit module per kunit suite (Lucas De Marchi)
- Reduce scope and constify variables (Thomas Hellström, Jani Nikula, Michal Wajdeczko)
- Improve and document xe_guc_ct_send_recv() (Michal Wajdeczko)
- Add proxy communication between CSME and GSC uC (Daniele Ceraolo Spurio)
- Fix size calculation when writing pgtable (Fei Yang)
- Make sure cfb is page size aligned in stolen memory (Vinod Govindapillai)
- Stop printing guc log to dmesg when waiting for GuC fails (Rodrigo Vivi)
- Use XE_CACHE_WB instead of XE_CACHE_NONE for cpu coherency on migration
(Himal Prasad Ghimiray)
- Fix error path in xe_vm_create (Moti Haimovski)
- Fix warnings in doc generation (Thomas Hellström, Badal Nilawar)
- Improve devcoredump content for mesa debugging (José Roberto de Souza)
- Fix crash in trace_dma_fence_init() (José Roberto de Souza)
- Improve CT state change handling (Matthew Brost)
- Toggle USM support for Xe2 (Lucas De Marchi)
- Reduces code duplication to emit PIPE_CONTROL (José Roberto de Souza)
- Canonicalize addresses where needed for Xe2 and add to devcoredump
(José Roberto de Souza)
- Only allow 1 ufence per exec / bind IOCTL (Matthew Brost)
- Move all display code to display/ (Jani Nikula)
- Fix sparse warnings by correctly using annotations (Thomas Hellström)
- Warn on job timeouts instead of using asserts (Matt Roper)
- Prefix macros to avoid clashes with sparc (Matthew Brost)
- Fix -Walloc-size by subclassing instead of allocating size smaller than struct (Thomas Hellström)
- Add status check during gsc header readout (Suraj Kandpal)
- Fix infinite loop in vm_bind_ioctl_ops_unwind() (Matthew Brost)
- Fix fence refcounting (Matthew Brost)
- Fix picking incorrect userptr VMA (Matthew Brost)
- Fix USM on integrated by mapping both mem.kernel_bb_pool and usm.bb_pool (Matthew Brost)
- Fix double initialization of display power domains (Xiaoming Wang)
- Check expected uC versions by major.minor.patch instead of just major.minor (John Harrison)
- Bump minimum GuC version to 70.19.2 for all platforms under force-probe
(John Harrison)
- Add GuC firmware loading for Lunar Lake (John Harrison)
- Use kzalloc() instead of hand-rolled alloc + memset (Nirmoy Das)
- Fix max page size of VMA during a REMAP (Matthew Brost)
- Don't ignore error when pinning pages in kthread (Matthew Auld)
- Refactor xe hwmon (Karthik Poosa)
- Add debug logs for D3cold (Riana Tauro)
- Remove broken TEST_VM_ASYNC_OPS_ERROR (Matthew Brost)
- Always allow to override firmware blob with module param and improve
log when no firmware is found (Lucas De Marchi)
- Fix shift-out-of-bounds due to xe_vm_prepare_vma() accepting zero fences (Thomas Hellström)
- Fix shift-out-of-bounds by distinguishing xe_pt/xe_pt_dir subclass (Thomas Hellström)
- Fail driver bind if platform supports MSIX, but fails to allocate all of them (Dani Liberman)
- Fix intel_fbdev thinking memory is backed by shmem (Matthew Auld)
- Prefer drm_dbg() over dev_dbg() (Jani Nikula)
- Avoid function cast warnings with clang-16 (Arnd Bergmann)
- Enhance xe_bo_move trace (Priyanka Dandamudi)
- Fix xe_vma_set_pte_size() not setting the right gpuva.flags for 4K size (Matthew Brost)
- Add XE_VMA_PTE_64K VMA flag (Matthew Brost)
- Return 2MB page size for compact 64k PTEs (Matthew Brost)
- Remove usage of the deprecated ida_simple_xx() API (Christophe JAILLET)
- Fix modpost warning on xe_mocs live kunit module (Ashutosh Dixit)
- Drop extra newline in from sysfs files (Ashutosh Dixit)
- Implement VM snapshot support for BO's and userptr (Maarten Lankhorst)
- Add debug logs when skipping rebinds (Matthew Brost)
- Fix code generation when mixing build directories (Dafna Hirschfeld)
- Prefer struct_size over open coded arithmetic (Erick Archer)
Since commit aed65af1cc2f ("drivers: make device_type const"), the driver
core can properly handle constant struct device_type. Move the
accel_sysfs_device_minor variable to be a constant structure as well,
placing it into read-only memory which can not be modified at runtime.
Ofir Bitton [Mon, 12 Feb 2024 12:35:24 +0000 (14:35 +0200)]
accel/habanalabs: modify pci health check
Today we read PCI VENDOR-ID in order to make sure PCI link is
healthy. Apparently the VENDOR-ID might be stored on host and
hence, when we read it we might not access the PCI bus.
In order to make sure PCI health check is reliable, we will start
checking the DEVICE-ID instead.
Tomer Tayar [Tue, 30 Jan 2024 07:57:32 +0000 (09:57 +0200)]
accel/habanalabs: keep explicit size of reserved memory for FW
The reserved memory for FW is currently saved in an ASIC property in
units of MB, just like the value that comes from FW.
Except the fact that it is not clear from the property's name, it means
also that a calculation to actual size is required everywhere that it is
used.
Modify the property to hold the size in bytes.
Tomer Tayar [Mon, 29 Jan 2024 15:26:17 +0000 (17:26 +0200)]
accel/habanalabs: handle reserved memory request when working with full FW
Currently the reserved memory request from FW is handled when running
with preboot only, but this request is relevant also when running with
full FW.
Modify to always handle this reservation request.
Tomer Tayar [Thu, 18 Jan 2024 17:18:43 +0000 (19:18 +0200)]
accel/habanalabs: modify print for skip loading linux FW to debug log
Skip loading a linux FW image into the device with the current supported
ASICs is done for test purposes only.
Moreover, for future supported ASICs it is possible that there won't be
a need to load such an image.
The print in such a case is therefore not needed in most cases, so
replace the used dev_info() with dev_dbg().
Erick Archer [Sat, 20 Jan 2024 15:10:28 +0000 (16:10 +0100)]
accel/habanalabs: use kcalloc() instead of kzalloc()
As noted in the "Deprecated Interfaces, Language Features, Attributes,
and Conventions" documentation [1], size calculations (especially
multiplication) should not be performed in memory allocator (or similar)
function arguments due to the risk of them overflowing. This could lead
to values wrapping around and a smaller allocation being made than the
caller was expecting. Using those allocations could lead to linear
overflows of heap memory and other misbehaviors.
So, use the purpose specific kcalloc() function instead of the argument
size * count in the kzalloc() function.
Colin Ian King [Sat, 6 Jan 2024 12:42:13 +0000 (12:42 +0000)]
accel/habanalabs/goya: remove redundant assignment to pointer 'input'
The pointer input is assigned a value that is not read, it is
being re-assigned again later with the same value. Resolve this
by moving the declaration to input into the if block.
Cleans up clang scan build warning:
warning: Value stored to 'input' during its initialization is never
read [deadcode.DeadStores]
Signed-off-by: Colin Ian King <colin.i.king@intel.com> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Malkoot Khan [Thu, 28 Dec 2023 21:08:58 +0000 (21:08 +0000)]
accel/habanalabs: Remove unnecessary braces from if statement
The coding style in the Linux kernel prefers not to use
braces for single-statement if conditions.
This patch removes the unnecessary braces from an if statement
in the file drivers/accel/habanalabs/common/command_submission.c,
which also resolves a coding style warning.
Farah Kassabri [Thu, 2 Nov 2023 09:53:29 +0000 (11:53 +0200)]
accel/habanalabs/gaudi2: move HMMU page tables to device memory
Currently the HMMU page tables reside in the host memory,
which will cause host access from the device for every page walk.
This can affect PCIe bandwidth in certain scenarios.
To prevent that problem, HMMU page tables will be moved to the device
memory so the miss transaction will read the hops from there instead of
going to the host.
Tomer Tayar [Sun, 24 Dec 2023 22:28:36 +0000 (00:28 +0200)]
accel/habanalabs: abort device reset for consecutive heartbeat failures
The mechanism of aborting device reset for consecutive fatal errors is
currently only for fatal errors that are reported by FW.
A non-responsive FW and consecutive heartbeat failures is also
considered fatal, so add them as well to this mechanism to avoid
recurring device reset in such a case.
Tomer Tayar [Thu, 14 Dec 2023 08:38:06 +0000 (10:38 +0200)]
accel/habanalabs: fix DRAM BAR base address calculation
When the DRAM region size in the BAR is not a power of 2, calculating
the corresponding BAR base address should be done using the offset from
the DRAM start address, and not using directly the DRAM address.
accel/habanalabs/gaudi2: add interrupt affinity for user interrupts
User interrupts are MSIx interrupts coming from Gaudi2, that have
specific range of IDs and are assigned to the sole use of the user
process that opened the Gaudi2 device (reminder: there can be only
a single user process running on Gaudi2 at any given time).
The interrupts are allocated and managed by the driver and therefore,
the user expects the driver to initialize them properly, which also
includes setting the affinity to the related CPU cores of the
device's NUMA node to get maximum performance.
Suraj Kandpal [Fri, 23 Feb 2024 08:14:48 +0000 (13:44 +0530)]
drm/i915/hdcp: HDCP Capability for the downstream device
Currently we are only checking capability of remote device and not
immediate downstream device but during capability check we need are
concerned with only the HDCP capability of downstream device.
During i915_display_info reporting we need HDCP Capability for both
the monitors and downstream branch device if any this patch adds that.
--v2
-Use MST Hub HDCP version [Ankit]
--v3
-Redefined how we seprate remote and direct read to make sure HDMI
shim functions are not touched [Ankit]
--v4
- Fix the conditions so that hdcp_info with remote_req true is sent
only when encoder is mst [Ankit]
--v5
-No need to have the MST Hub version in i915_hdcp_sink_capability[Ankit]
Suraj Kandpal [Fri, 23 Feb 2024 08:14:42 +0000 (13:44 +0530)]
drm/i915/hdcp: Move to direct reads for HDCP
Even for MST scenarios we need to do direct reads only on the
immediate downstream device the rest of the authentication is taken
care by that device. Remote reads will only be used to check
capability of the monitors in MST topology.
--v2
-Add fixes tag [Ankit]
-Derive aux where needed rather than through a function [Ankit]
drm/i915/display/debugfs: New entry "DRRS capable" to i915_drrs_status
If the connected panel supports both DRRS & PSR, driver gives preference
to PSR ("DRRS enabled: no"). Even though the hardware supports DRRS,
IGT treats ("DRRS enabled: yes") as not capable.
Introduce a new entry "DRRS capable" to debugfs i915_drrs_status, so
that IGT will read the DRRS capability as "DRRS capable: yes".
Linus Torvalds [Sun, 25 Feb 2024 23:31:57 +0000 (15:31 -0800)]
Merge tag 'bcachefs-2024-02-25' of https://evilpiepirate.org/git/bcachefs
Pull bcachefs fixes from Kent Overstreet:
"Some more mostly boring fixes, but some not
User reported ones:
- the BTREE_ITER_FILTER_SNAPSHOTS one fixes a really nasty
performance bug; user reported an untar initially taking two
seconds and then ~2 minutes
- kill a __GFP_NOFAIL in the buffered read path; this was a leftover
from the trickier fix to kill __GFP_NOFAIL in readahead, where we
can't return errors (and have to silently truncate the read
ourselves).
bcachefs can't use GFP_NOFAIL for folio state unlike iomap based
filesystems because our folio state is just barely too big, 2MB
hugepages cause us to exceed the 2 page threshhold for GFP_NOFAIL.
additionally, the flags argument was just buggy, we weren't
supplying GFP_KERNEL previously (!)"
* tag 'bcachefs-2024-02-25' of https://evilpiepirate.org/git/bcachefs:
bcachefs: fix bch2_save_backtrace()
bcachefs: Fix check_snapshot() memcpy
bcachefs: Fix bch2_journal_flush_device_pins()
bcachefs: fix iov_iter count underflow on sub-block dio read
bcachefs: Fix BTREE_ITER_FILTER_SNAPSHOTS on inodes btree
bcachefs: Kill __GFP_NOFAIL in buffered read path
bcachefs: fix backpointer_to_text() when dev does not exist
Linus Torvalds [Sun, 25 Feb 2024 18:58:12 +0000 (10:58 -0800)]
Merge tag 'docs-6.8-fixes3' of git://git.lwn.net/linux
Pull two documentation build fixes from Jonathan Corbet:
- The XFS online fsck documentation uses incredibly deeply nested
subsection and list nesting; that broke the PDF docs build. Tweak a
parameter to tell LaTeX to allow the deeper nesting.
- Fix a 6.8 PDF-build regression
* tag 'docs-6.8-fixes3' of git://git.lwn.net/linux:
docs: translations: use attribute to store current language
docs: Instruct LaTeX to cope with deeper nesting
Linus Torvalds [Sun, 25 Feb 2024 18:41:57 +0000 (10:41 -0800)]
Merge tag 'usb-6.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are some small USB fixes for 6.8-rc6 to resolve some reported
problems. These include:
- regression fixes with typec tpcm code as reported by many
- cdnsp and cdns3 driver fixes
- usb role setting code bugfixes
- build fix for uhci driver
- ncm gadget driver bugfix
- MAINTAINERS entry update
All of these have been in linux-next all week with no reported issues
and there is at least one fix in here that is in Thorsten's regression
list that is being tracked"
* tag 'usb-6.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
usb: typec: tpcm: Fix issues with power being removed during reset
MAINTAINERS: Drop myself as maintainer of TYPEC port controller drivers
usb: gadget: ncm: Avoid dropping datagrams of properly parsed NTBs
Revert "usb: typec: tcpm: reset counter when enter into unattached state after try role"
usb: gadget: omap_udc: fix USB gadget regression on Palm TE
usb: dwc3: gadget: Don't disconnect if not started
usb: cdns3: fix memory double free when handle zero packet
usb: cdns3: fixed memory use after free at cdns3_gadget_ep_disable()
usb: roles: don't get/set_role() when usb_role_switch is unregistered
usb: roles: fix NULL pointer issue when put module's reference
usb: cdnsp: fixed issue with incorrect detecting CDNSP family controllers
usb: cdnsp: blocked some cdns3 specific code
usb: uhci-grlib: Explicitly include linux/platform_device.h
Linus Torvalds [Sun, 25 Feb 2024 18:35:41 +0000 (10:35 -0800)]
Merge tag 'tty-6.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial driver fixes from Greg KH:
"Here are three small serial/tty driver fixes for 6.8-rc6 that resolve
the following reported errors:
- riscv hvc console driver fix that was reported by many
- amba-pl011 serial driver fix for RS485 mode
- stm32 serial driver fix for RS485 mode
All of these have been in linux-next all week with no reported
problems"
* tag 'tty-6.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
serial: amba-pl011: Fix DMA transmission in RS485 mode
serial: stm32: do not always set SER_RS485_RX_DURING_TX if RS485 is enabled
tty: hvc: Don't enable the RISC-V SBI console by default
Linus Torvalds [Sun, 25 Feb 2024 18:22:21 +0000 (10:22 -0800)]
Merge tag 'x86_urgent_for_v6.8_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:
- Make sure clearing CPU buffers using VERW happens at the latest
possible point in the return-to-userspace path, otherwise memory
accesses after the VERW execution could cause data to land in CPU
buffers again
* tag 'x86_urgent_for_v6.8_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
KVM/VMX: Move VERW closer to VMentry for MDS mitigation
KVM/VMX: Use BT+JNC, i.e. EFLAGS.CF to select VMRESUME vs. VMLAUNCH
x86/bugs: Use ALTERNATIVE() instead of mds_user_clear static key
x86/entry_32: Add VERW just before userspace transition
x86/entry_64: Add VERW just before userspace transition
x86/bugs: Add asm helpers for executing VERW
Linus Torvalds [Sun, 25 Feb 2024 18:14:12 +0000 (10:14 -0800)]
Merge tag 'irq_urgent_for_v6.8_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Borislav Petkov:
- Make sure GICv4 always gets initialized to prevent a kexec-ed kernel
from silently failing to set it up
- Do not call bus_get_dev_root() for the mbigen irqchip as it always
returns NULL - use NULL directly
- Fix hardware interrupt number truncation when assigning MSI
interrupts
- Correct sending end-of-interrupt messages to disabled interrupts
lines on RISC-V PLIC
* tag 'irq_urgent_for_v6.8_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/gic-v3-its: Do not assume vPE tables are preallocated
irqchip/mbigen: Don't use bus_get_dev_root() to find the parent
PCI/MSI: Prevent MSI hardware interrupt number truncation
irqchip/sifive-plic: Enable interrupt if needed before EOI
Linus Torvalds [Sun, 25 Feb 2024 17:29:05 +0000 (09:29 -0800)]
Merge tag 'pull-fixes.pathwalk-rcu-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull RCU pathwalk fixes from Al Viro:
"We still have some races in filesystem methods when exposed to RCU
pathwalk. This series is a result of code audit (the second round of
it) and it should deal with most of that stuff.
Still pending: ntfs3 ->d_hash()/->d_compare() and ceph_d_revalidate().
Up to maintainers (a note for NTFS folks - when documentation says
that a method may not block, it *does* imply that blocking allocations
are to be avoided. Really)"
[ More explanations for people who aren't familiar with the vagaries of
RCU path walking: most of it is hidden from filesystems, but if a
filesystem actively participates in the low-level path walking it
needs to make sure the fields involved in that walk are RCU-safe.
That "actively participate in low-level path walking" includes things
like having its own ->d_hash()/->d_compare() routines, or by having
its own directory permission function that doesn't just use the common
helpers. Having a ->d_revalidate() function will also have this issue.
Note that instead of making everything RCU safe you can also choose to
abort the RCU pathwalk if your operation cannot be done safely under
RCU, but that obviously comes with a performance penalty. One common
pattern is to allow the simple cases under RCU, and abort only if you
need to do something more complicated.
So not everything needs to be RCU-safe, and things like the inode etc
that the VFS itself maintains obviously already are. But these fixes
tend to be about properly RCU-delaying things like ->s_fs_info that
are maintained by the filesystem and that got potentially released too
early. - Linus ]
* tag 'pull-fixes.pathwalk-rcu-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
ext4_get_link(): fix breakage in RCU mode
cifs_get_link(): bail out in unsafe case
fuse: fix UAF in rcu pathwalks
procfs: make freeing proc_fs_info rcu-delayed
procfs: move dropping pde and pid from ->evict_inode() to ->free_inode()
nfs: fix UAF on pathwalk running into umount
nfs: make nfs_set_verifier() safe for use in RCU pathwalk
afs: fix __afs_break_callback() / afs_drop_open_mmap() race
hfsplus: switch to rcu-delayed unloading of nls and freeing ->s_fs_info
exfat: move freeing sbi, upcase table and dropping nls into rcu-delayed helper
affs: free affs_sb_info with kfree_rcu()
rcu pathwalk: prevent bogus hard errors from may_lookup()
fs/super.c: don't drop ->s_user_ns until we free struct super_block itself
Linus Torvalds [Sun, 25 Feb 2024 17:17:15 +0000 (09:17 -0800)]
Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs fixes from Al Viro:
"A couple of fixes - revert of regression from this cycle and a fix for
erofs failure exit breakage (had been there since way back)"
* tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
erofs: fix handling kern_mount() failure
Revert "get rid of DCACHE_GENOCIDE"
Al Viro [Sat, 3 Feb 2024 06:17:34 +0000 (01:17 -0500)]
ext4_get_link(): fix breakage in RCU mode
1) errors from ext4_getblk() should not be propagated to caller
unless we are really sure that we would've gotten the same error
in non-RCU pathwalk.
2) we leak buffer_heads if ext4_getblk() is successful, but bh is
not uptodate.
Al Viro [Wed, 20 Sep 2023 02:28:16 +0000 (22:28 -0400)]
cifs_get_link(): bail out in unsafe case
->d_revalidate() bails out there, anyway. It's not enough
to prevent getting into ->get_link() in RCU mode, but that
could happen only in a very contrieved setup. Not worth
trying to do anything fancy here unless ->d_revalidate()
stops kicking out of RCU mode at least in some cases.
Reviewed-by: Christian Brauner <brauner@kernel.org> Acked-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Thu, 28 Sep 2023 04:19:39 +0000 (00:19 -0400)]
fuse: fix UAF in rcu pathwalks
->permission(), ->get_link() and ->inode_get_acl() might dereference
->s_fs_info (and, in case of ->permission(), ->s_fs_info->fc->user_ns
as well) when called from rcu pathwalk.
Freeing ->s_fs_info->fc is rcu-delayed; we need to make freeing ->s_fs_info
and dropping ->user_ns rcu-delayed too.
Al Viro [Wed, 20 Sep 2023 04:12:00 +0000 (00:12 -0400)]
procfs: make freeing proc_fs_info rcu-delayed
makes proc_pid_ns() safe from rcu pathwalk (put_pid_ns()
is still synchronous, but that's not a problem - it does
rcu-delay everything that needs to be)
Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Thu, 28 Sep 2023 02:11:26 +0000 (22:11 -0400)]
nfs: fix UAF on pathwalk running into umount
NFS ->d_revalidate(), ->permission() and ->get_link() need to access
some parts of nfs_server when called in RCU mode:
server->flags
server->caps
*(server->io_stats)
and, worst of all, call
server->nfs_client->rpc_ops->have_delegation
(the last one - as NFS_PROTO(inode)->have_delegation()). We really
don't want to RCU-delay the entire nfs_free_server() (it would have
to be done with schedule_work() from RCU callback, since it can't
be made to run from interrupt context), but actual freeing of
nfs_server and ->io_stats can be done via call_rcu() just fine.
nfs_client part is handled simply by making nfs_free_client() use
kfree_rcu().
Acked-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Thu, 28 Sep 2023 01:50:25 +0000 (21:50 -0400)]
nfs: make nfs_set_verifier() safe for use in RCU pathwalk
nfs_set_verifier() relies upon dentry being pinned; if that's
the case, grabbing ->d_lock stabilizes ->d_parent and guarantees
that ->d_parent points to a positive dentry. For something
we'd run into in RCU mode that is *not* true - dentry might've
been through dentry_kill() just as we grabbed ->d_lock, with
its parent going through the same just as we get to into
nfs_set_verifier_locked(). It might get to detaching inode
(and zeroing ->d_inode) before nfs_set_verifier_locked() gets
to fetching that; we get an oops as the result.
That can happen in nfs{,4} ->d_revalidate(); the call chain in
question is nfs_set_verifier_locked() <- nfs_set_verifier() <-
nfs_lookup_revalidate_delegated() <- nfs{,4}_do_lookup_revalidate().
We have checked that the parent had been positive, but that's
done before we get to nfs_set_verifier() and it's possible for
memory pressure to pick our dentry as eviction candidate by that
time. If that happens, back-to-back attempts to kill dentry and
its parent are quite normal. Sure, in case of eviction we'll
fail the ->d_seq check in the caller, but we need to survive
until we return there...
Acked-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>