git.proxmox.com Git - mirror_ubuntu-kernels.git/log

drm/i915: Remove unwanted ghost obj check

vm_fault_ttm() should not expect ttm ghost obj so remove that check.

Suggested-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221024144558.27747-1-nirmoy.das@intel.com

drm/i915/selftests: add igt_vma_move_to_active_unlocked

All calls to i915_vma_move_to_active are surrounded by vma lock
and/or there are multiple local helpers for it in particular tests.
Let's replace it by common helper.
The patch should not introduce functional changes.

Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221019215906.295296-3-andrzej.hajda@intel.com

drm/i915: call i915_request_await_object from _i915_vma_move_to_active

Since almost all calls to i915_vma_move_to_active are prepended with
i915_request_await_object, let's call the latter from
_i915_vma_move_to_active by default and add flag allowing bypassing it.
Adjust all callers accordingly.
The patch should not introduce functional changes.

Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221019215906.295296-2-andrzej.hajda@intel.com

drm/i915: Update workaround documentation

There were several updates in the driver on how the workarounds are
handled since its documentation was written. Update the documentation to
reflect the current reality.

v2:
  - Remove footnote that was wrongly referenced, adding back the
    reference in the correct paragraph.
  - Remove "Display workarounds" and just mention "display IP" under
    "Other" category since all of them are peppered around the driver.

Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> # v1
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221115192611.179981-1-lucas.demarchi@intel.com

drm/i915: Fix vma allocator debug

Add a missing colon which I accidentally removed in the recent logging
changes.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: a10234fda466 ("drm/i915: Partial abandonment of legacy DRM logging macros")
Cc: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221115101730.394880-1-tvrtko.ursulin@linux.intel.com

Documentation/gpu: Fix section in the wrong scope

That section should still be inside "DRM client usage stats" rather than
as a sibling.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221107173209.2219571-2-lucas.demarchi@intel.com

drm/i915/guc: handle interrupts from media GuC

The render and media GuCs share the same interrupt enable register, so
we can no longer disable interrupts when we disable communication for
one of the GuCs as this would impact the other GuC. Instead, we keep the
interrupts always enabled in HW and use a variable in the GuC structure
to determine if we want to service the received interrupts or not.

v2: use MTL_ prefix for reg definition (Matt)

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221108020600.3575467-7-daniele.ceraolospurio@intel.com

drm/i915/guc: define media GT GuC send regs

The media GT shares the G-unit with the root GT, so a second set of
communication registers is required for the media GuC.

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221108020600.3575467-6-daniele.ceraolospurio@intel.com

drm/i915/mtl: Handle wopcm per-GT and limit calculations.

With MTL standalone media architecture the wopcm layout has changed,
with separate partitioning in WOPCM for the root GT GuC and the media
GT GuC. The size of WOPCM is 4MB with the lower 2MB reserved for the
media GT and the upper 2MB for the root GT.

Given that MTL has GuC deprivilege, the WOPCM registers are pre-locked
by the bios. Therefore, we can skip all the math for the partitioning
and just limit ourselves to sanity-checking the values.

v2: fix makefile file ordering (Jani)
v3: drop XELPM_SAMEDIA_WOPCM_SIZE, check huc instead of VDBOX (John)
v4: further clarify commit message, remove blank line (John)

Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: John Harrison <john.c.harrison@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221108020600.3575467-5-daniele.ceraolospurio@intel.com

drm/i915/guc: Add GuC deprivilege feature to MTL

MTL supports GuC deprivilege. Add the feature flag to this platform.

Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Cc: John Harrison <john.c.harrison@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221108020600.3575467-4-daniele.ceraolospurio@intel.com

drm/i915/uc: use different ggtt pin offsets for uc loads

Our current FW loading process is the same for all FWs:

- Pin FW to GGTT at the start of the ggtt->uc_fw node
- Load the FW
- Unpin

This worked because we didn't have a case where 2 FWs would be loaded on
the same GGTT at the same time. On MTL, however, this can happen if both
GTs are reset at the same time, so we can't pin everything in the same
spot and we need to use separate offset. For simplicity, instead of
calculating the exact required size, we reserve a 2MB slot for each fw.

v2: fail fetch if FW is > 2MBs, improve comments (John)
v3: more comment improvements (John)

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <john.c.harrison@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221108020600.3575467-3-daniele.ceraolospurio@intel.com

drm/i915/uc: fetch uc firmwares for each GT

The FW binaries are independently loaded on each GT. On MTL, the memory
is shared so we could potentially re-use a single allocation, but on
discrete multi-gt platforms we are going to need independent copies,
so it is easier to do the same on MTL as well, given that the amount
of duplicated memory is relatively small (~500K).

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <john.c.harrison@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221108020600.3575467-2-daniele.ceraolospurio@intel.com

drm/i915/huc: only load HuC on GTs that have VCS engines

On MTL the primary GT doesn't have any media capabilities, so no video
engines and no HuC. We must therefore skip the HuC fetch and load on
that specific case. Given that other multi-GT platforms might have HuC
on the primary GT, we can't just check for that and it is easier to
instead check for the lack of VCS engines.

Based on code from Aravind Iddamsetty

v2: clarify which engine_mask is used for each GT and why (Tvrtko)

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Cc: John Harrison <john.c.harrison@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221108020600.3575467-1-daniele.ceraolospurio@intel.com

drm/i915: Simplify internal helper function signature

Since we are now storing the GT backpointer in the wa list we can drop the
explicit struct intel_gt * argument to wa_list_apply.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221110124633.3135026-1-tvrtko.ursulin@linux.intel.com

drm/i915/mtl: Add Wa_14017073508 for SAMedia

This workaround is added for Media tile of MTL A step. It is to help
pcode workaround which handles the hardware issue seen during package C2/C3
transitions due to RC6 entry/exit transitions on Media tile. As a part of
workaround pcode expect kmd to send mailbox message "media busy" when
components of Media tile are in use and "media idle" otherwise.
As per workaround description gucrc need to be disabled so enabled
host based RC for Media tile.

v2:
- Correct workaround id (Matt)
- Fix review comments (Rodrigo)

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Badal Nilawar <badal.nilawar@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221103184559.2306481-1-badal.nilawar@intel.com

drm/i915/guc/slpc: Add selftest for slpc tile-tile interaction

Run a workload on tiles simultaneously by requesting for RP0 frequency.
Pcode can however limit the frequency being granted due to throttling
reasons. This test checks if there is any throttling but does not fail
if RP0 is not granted due to throttle reasons

v2: Fix build error
v3: Use IS_ERR_OR_NULL to check worker
Addressed cosmetic review comments (Tvrtko)
v4: do not skip test on media engines if gt type is GT_MEDIA.
Use correct PERF_LIMIT_REASONS register for MTL (Vinay)

Signed-off-by: Riana Tauro <riana.tauro@intel.com>
Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221109112541.275021-2-riana.tauro@intel.com

drm/i915/perf: Fix kernel-doc warning

Fix kernel-doc issue from a previous commit.

Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Fixes: 2db609c01495 ("drm/i915/perf: Replace gt->perf.lock with stream->lock for file ops")
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221107202410.1976895-1-umesh.nerlige.ramappa@intel.com

drm/i915: Partial abandonment of legacy DRM logging macros

Convert some usages of legacy DRM logging macros into versions which tell
us on which device have the events occurred.

v2:
* Don't have struct drm_device as local. (Jani, Ville)

v3:
* Store gt, not i915, in workaround list. (John)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221109104633.2579245-1-tvrtko.ursulin@linux.intel.com

drm/i915: use i915_sg_dma_sizes() for all backends

We rely on page_sizes.sg in setup_scratch_page() reporting the correct
value if the underlying sgl is not contiguous, however in
get_pages_internal() we are only looking at the layout of the created
pages when calculating the sg_page_sizes, and not the final sgl, which
could in theory be completely different. In such a situation we might
incorrectly think we have a 64K scratch page, when it is actually only
4K or similar split over multiple non-contiguous entries, which could
lead to broken behaviour when touching the scratch space within the
padding of a 64K GTT page-table. For most of the other backends we
already just call i915_sg_dma_sizes() on the final mapping, so rather
just move that into __i915_gem_object_set_pages() to avoid such issues
coming back to bite us later.

v2: Update missing conversion in gvt

Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Stuart Summers <stuart.summers@intel.com>
Cc: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221108103238.165447-1-matthew.auld@intel.com

drm/i915/ttm: add some sanity checks for lmem_userfault_list

Rather than getting some hard to debug uaf, add some warns to hopefully
catch issues with userfault_count being non-zero when destroying the
object. Also if we somehow add an object to lmem_userfault_list that
somehow doesn't map lmem.

References: https://gitlab.freedesktop.org/drm/intel/-/issues/7469
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Anshuman Gupta <anshuman.gupta@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Andrzej Hajda <andrzej.hajda@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221107165414.56970-2-matthew.auld@intel.com

drm/i915/ttm: fix uaf with lmem_userfault_list handling

In the fault handler, make sure we check if the BO maps lmem after
we schedule the migration, since the current resource might change from
lmem to smem, if the pages are in the non-cpu visible portion of lmem.
This then leads to adding the object to the lmem_userfault_list even
though the current resource is no longer lmem. If we then destroy the
object, the list might still contain a link to the now free object, since
we only remove it if the object is still in lmem.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/7469
Fixes: ad74457a6b5a ("drm/i915/dgfx: Release mmap on rpm suspend")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Anshuman Gupta <anshuman.gupta@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Andrzej Hajda <andrzej.hajda@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221107165414.56970-1-matthew.auld@intel.com

drm/i915/pxp: use <> instead of "" for headers in include/

Headers in include/ should be included using the system header #include
syntax.

Fixes: 887a193b4fb1 ("drm/i915/pxp: add huc authentication and loading command")
Cc: Tomas Winkler <tomas.winkler@intel.com>
Cc: Vitaly Lubart <vitaly.lubart@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221107140454.2680954-1-jani.nikula@intel.com

drm/i915/guc: don't hardcode BCS0 in guc_hang selftest

On MTL there are no BCS engines on the media GT, so we can't always use
BCS0 in the test. There is no actual reason to use a BCS engine over an
engine of a different class, so switch to using any available engine.

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Acked-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221102214310.2829310-1-daniele.ceraolospurio@intel.com

drm/i915/mtl: don't expose GSC command streamer to the user

There is no userspace user for this CS yet, we only need it for internal
kernel ops (e.g. HuC, PXP), so don't expose it.

v2: even if it's not exposed, rename the engine so it is easier to
identify in the debug logs (Matt)

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221102171047.2787951-6-daniele.ceraolospurio@intel.com

drm/i915/mtl: add GSC CS reset support

The GSC CS has its own dedicated bit in the GDRST register.

Bspec: 52549
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221102171047.2787951-5-daniele.ceraolospurio@intel.com

drm/i915/mtl: add GSC CS interrupt support

The GSC CS re-uses the same interrupt bits that the GSC used in older
platforms. This means that we can now have an engine interrupt coming
out of OTHER_CLASS, so we need to handle that appropriately.

v2: clean up the if statement for the engine irq (Tvrtko)

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221102171047.2787951-4-daniele.ceraolospurio@intel.com

drm/i915/mtl: pass the GSC CS info to the GuC

We need to tell the GuC that the GSC CS is there.

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221102171047.2787951-3-daniele.ceraolospurio@intel.com

drm/i915/mtl: add initial definitions for GSC CS

Starting on MTL, the GSC is no longer managed with direct MMIO access,
but we instead have a dedicated command streamer for it. As a first step
for adding support for this CS, add the required definitions.
Note that, although it is now a CS, the GSC retains its old
class:instance value (OTHER_CLASS instance 6)

Bspec: 65308, 45605
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221102171047.2787951-2-daniele.ceraolospurio@intel.com

drm/i915/guc: Don't deadlock busyness stats vs reset

The engine busyness stats has a worker function to do things like
64bit extend the 32bit hardware counters. The GuC's reset prepare
function flushes out this worker function to ensure no corruption
happens during the reset. Unforunately, the worker function has an
infinite wait for active resets to finish before doing its work. Thus
a deadlock would occur if the worker function had actually started
just as the reset starts.

The function being used to lock the reset-in-progress mutex is called
intel_gt_reset_trylock(). However, as noted it does not follow
standard 'trylock' conventions and exit if already locked. So rename
the current _trylock function to intel_gt_reset_lock_interruptible(),
which is the behaviour it actually provides. In addition, add a new
implementation of _trylock and call that from the busyness stats
worker instead.

v2: Rename existing trylock to interruptible rather than trying to
preserve the existing (confusing) naming scheme (review comments from
Tvrtko).

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221102192109.2492625-3-John.C.Harrison@Intel.com

drm/i915/guc: Properly initialise kernel contexts

If a context has already been registered prior to first submission
then context init code was not being called. The noticeable effect of
that was the scheduling priority was left at zero (meaning super high
priority) instead of being set to normal. This would occur with
kernel contexts at start of day as they are manually pinned up front
rather than on first submission. So add a call to initialise those
when they are pinned.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221102192109.2492625-2-John.C.Harrison@Intel.com

drm/i915/guc: Remove excessive line feeds in state dumps

Some of the GuC state dump messages were adding extra line feeds. When
printing via a DRM printer to dmesg, for example, that messes up the
log formatting as it loses any prefixing from the printer. Given that
the extra line feeds are just in the middle of random bits of GuC
state, there isn't any real need for them. So just remove them
completely.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221031220007.4176835-1-John.C.Harrison@Intel.com

drm/i915/userptr: restore probe_range behaviour

The conversion looks harmless, however the addr value is updated inside
the loop with the previous vm_end, which then incorrectly leads to
for_each_vma_range() iterating over stuff outside the range we care
about. Fix this by storing the end value separately. Also fix the case
where the range doesn't intersect with any vma, or if the vma itself
doesn't extend the entire range, which must mean we have hole at the
end. Both should result in an error, as per the previous behaviour.

v2: Fix the cases where the range is empty, or if there's a hole at
the end of the range

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/7247
Testcase: igt@gem_userptr_blits@probe
Fixes: f683b9d61319 ("i915: use the VMA iterator")
Reported-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Liam R. Howlett <Liam.Howlett@Oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yu Zhao <yuzhao@google.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221028130635.465839-1-matthew.auld@intel.com

Merge drm/drm-next into drm-intel-gt-next

Needed to bring in v6.1-rc1 which contains commit f683b9d61319 ("i915: use the VMA iterator")
which is needed for series https://patchwork.freedesktop.org/series/110083/ .

Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

Merge tag 'drm-intel-gt-next-2022-11-03' of git://anongit.freedesktop.org/drm/drm-intel into drm-next

Driver Changes:

- Fix for #7306: [Arc A380] white flickering when using arc as a
secondary gpu (Matt A)
- Add Wa_18017747507 for DG2 (Wayne)
- Avoid spurious WARN on DG1 due to incorrect cache_dirty flag
(Niranjana, Matt A)
- Corrections to CS timestamp support for Gen5 and earlier (Ville)

- Fix a build error used with clang compiler on hwmon (GG)
- Improvements to LMEM handling with RPM (Anshuman, Matt A)
- Cleanups in dmabuf code (Mike)

- Selftest improvements (Matt A)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Y2N11wu175p6qeEN@jlahtine-mobl.ger.corp.intel.com

Merge tag 'drm-misc-next-2022-11-03' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for 6.2:

UAPI Changes:

Cross-subsystem Changes:
- dma-buf: locking improvements
- firmware: New API in the RaspberryPi firmware driver used by vc4

Core Changes:
- client: Null pointer dereference fix in drm_client_buffer_delete()
- mm/buddy: Add back random seed log
- ttm: Convert ttm_resource to use size_t for its size, fix for an
  undefined behaviour

Driver Changes:
- bridge:
  - adv7511: use dev_err_probe
  - it6505: Fix return value check of pm_runtime_get_sync
- panel:
  - sitronix: Fixes and clean-ups
- lcdif: Increase DMA burst size
- rockchip: runtime_pm improvements
- vc4: Fix for a regression preventing the use of 4k @ 60Hz, and
  further HDMI rate constraints check.
- vmwgfx: Cursor improvements

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20221103083437.ksrh3hcdvxaof62l@houat

drm/i915/selftests: Reduce oversaturation of request smoketesting

The goal in launching the request smoketest is to have sufficient tasks
running across the system such that we are likely to detect concurrency
issues. We aim to have 2 tasks using the same engine, gt, device (each
level of locking around submission and signaling) running at the same
time. While tasks may not be running all the time as they synchronise
with the gpu, they will be running most of the time, in which case
having many more tasks than cores available is wasteful (and
dramatically increases the workload causing excess runtime). Aim to
limit the number of tasks such that there is at least 2 running per
engine, spreading surplus cores around the engines (rather than running
a task per core per engine.)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Tested-by: Nirmoy Das <nirmoy.das@intel.com>
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221102155709.31717-1-nirmoy.das@intel.com

drm/i915/hwmon: Fix a build error used with clang compiler

Use REG_FIELD_PREP() and a constant value for hwm_field_scale_and_write()

If the first argument of FIELD_PREP() is not a compile-time constant value
or unsigned long long type, this routine of the __BF_FIELD_CHECK() macro
used internally by the FIELD_PREP() macro always returns false.

BUILD_BUG_ON_MSG(__bf_cast_unsigned(_mask, _mask) >      \
                  __bf_cast_unsigned(_reg, ~0ull),        \
                  _pfx "type of reg too small for mask"); \

And it returns a build error by the option among the clang
compilation options. [-Werror,-Wtautological-constant-out-of-range-compare]

Reported build error while using clang compiler:

drivers/gpu/drm/i915/i915_hwmon.c:115:16: error: result of comparison of
constant 18446744073709551615 with expression of type 'typeof (_Generic((field_msk),
char: (unsigned char)0, unsigned char: (unsigned char)0, signed char: (unsigned char)0,
unsigned short: (unsigned short)0, short: (unsigned short)0, unsigned int:
(unsigned int)0, int: (unsigned int)0, unsigned long: (unsigned long)0, long:
(unsigned long)0, unsigned long long: (unsigned long long)0, long long:
(unsigned long long)0, default: (field_msk)))' (aka 'unsigned int') is always false
[-Werror,-Wtautological-constant-out-of-range-compare]
        bits_to_set = FIELD_PREP(field_msk, nval);
                      ^~~~~~~~~~~~~~~~~~~~~~~~~~~
./include/linux/bitfield.h:114:3: note: expanded from macro 'FIELD_PREP'
                __BF_FIELD_CHECK(_mask, 0ULL, _val, "FIELD_PREP: ");    \
                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./include/linux/bitfield.h:71:53: note: expanded from macro '__BF_FIELD_CHECK'
                BUILD_BUG_ON_MSG(__bf_cast_unsigned(_mask, _mask) >     \
                ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~
./include/linux/build_bug.h:39:58: note: expanded from macro 'BUILD_BUG_ON_MSG'
                                    ~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~
./include/linux/compiler_types.h:357:22: note: expanded from macro 'compiletime_assert'
        _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
        ~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./include/linux/compiler_types.h:345:23: note: expanded from macro '_compiletime_assert'
        __compiletime_assert(condition, msg, prefix, suffix)
        ~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./include/linux/compiler_types.h:337:9: note: expanded from macro '__compiletime_assert'
                if (!(condition))                                       \

v2: Use REG_FIELD_PREP() macro instead of FIELD_PREP() (Jani)

Fixes: 99f55efb7911 ("drm/i915/hwmon: Power PL1 limit and TDP setting")
Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
Cc: Anshuman Gupta <anshuman.gupta@intel.com>
Cc: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
[Joonas: Wrapped commit message error line length to be more reasonable]
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221029044230.32128-1-gwan-gyeong.mun@intel.com

drm/i915: Do not set cache_dirty for DGFX

Currently on DG1, which does not have LLC, we hit the below
warning while rebinding an userptr invalidated object.

WARNING: CPU: 4 PID: 13008 at drivers/gpu/drm/i915/gem/i915_gem_pages.c:34 __i915_gem_object_set_pages+0x296/0x2d0 [i915]
...
RIP: 0010:__i915_gem_object_set_pages+0x296/0x2d0 [i915]
...
Call Trace:
<TASK>
i915_gem_userptr_get_pages+0x175/0x1a0 [i915]
____i915_gem_object_get_pages+0x32/0xb0 [i915]
i915_gem_object_userptr_submit_init+0x286/0x470 [i915]
eb_lookup_vmas+0x2ff/0xcf0 [i915]
? __intel_wakeref_get_first+0x55/0xb0 [i915]
i915_gem_do_execbuffer+0x785/0x21d0 [i915]
i915_gem_execbuffer2_ioctl+0xe7/0x3d0 [i915]

We shouldn't be setting the obj->cache_dirty for DGFX,
fix it.

Fixes: d70af57944a1 ("drm/i915/shmem: ensure flush during swap-in on non-LLC")
Suggested-by: Matthew Auld <matthew.auld@intel.com>
Reported-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Acked-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221102051416.27327-1-niranjana.vishwanathapura@intel.com

drm/tests: Add back seed value information

As reported by Michał, the drm_mm and drm_buddy unit tests lost the
printk with seed value after they were refactored into KUnit.

Add kunit_info with seed value information to assure reproducibility.

Reported-by: Michał Winiarski <michal.winiarski@intel.com>
Signed-off-by: Arthur Grillo <arthurgrillo@riseup.net>
Signed-off-by: Maíra Canal <mairacanal@riseup.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20221028221755.340487-1-arthurgrillo@riseup.net

drm/client: Prevent NULL dereference in drm_client_buffer_delete()

The drm_gem_vunmap() will crash with a NULL dereference if the passed
object pointer is NULL. It wasn't a problem before we added the locking
support to drm_gem_vunmap function because the mapping argument was always
NULL together with the object. Make drm_client_buffer_delete() to check
whether GEM is NULL before trying to unmap the GEM, it will happen on
framebuffer creation error.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://lore.kernel.org/dri-devel/Y1kFEGxT8MVlf32V@kili/
Fixes: 79e2cf2e7a19 ("drm/gem: Take reservation lock for vmap/vunmap operations")
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221030154412.8320-3-dmitry.osipenko@collabora.com

dma-buf: Make locking consistent in dma_buf_detach()

The dma_buf_detach() locks attach->dmabuf->resv and then unlocks
dmabuf->resv, which could be a two different locks from a static
code checker perspective. In particular this triggers Smatch to
report the "double unlock" error. Make the locking pointers consistent.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://lore.kernel.org/dri-devel/Y1fLfsccW3AS%2Fo+%2F@kili/
Fixes: 809d9c72c2f8 ("dma-buf: Move dma_buf_attach() to dynamic locking specification")
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221030154412.8320-2-dmitry.osipenko@collabora.com

drm/ttm: fix undefined behavior in bit shift for TTM_TT_FLAG_PRIV_POPULATED

Shifting signed 32-bit value by 31 bits is undefined, so changing
significant bit to unsigned. The UBSAN warning calltrace like below:

UBSAN: shift-out-of-bounds in ./include/drm/ttm/ttm_tt.h:122:26
left shift of 1 by 31 places cannot be represented in type 'int'
Call Trace:
<TASK>
dump_stack_lvl+0x7d/0xa5
dump_stack+0x15/0x1b
ubsan_epilogue+0xe/0x4e
__ubsan_handle_shift_out_of_bounds+0x1e7/0x20c
ttm_bo_move_memcpy+0x3b4/0x460 [ttm]
bo_driver_move+0x32/0x40 [drm_vram_helper]
ttm_bo_handle_move_mem+0x118/0x200 [ttm]
ttm_bo_validate+0xfa/0x220 [ttm]
drm_gem_vram_pin_locked+0x70/0x1b0 [drm_vram_helper]
drm_gem_vram_pin+0x48/0xb0 [drm_vram_helper]
drm_gem_vram_plane_helper_prepare_fb+0x53/0xe0 [drm_vram_helper]
drm_gem_vram_simple_display_pipe_prepare_fb+0x26/0x30 [drm_vram_helper]
drm_simple_kms_plane_prepare_fb+0x4d/0xe0 [drm_kms_helper]
drm_atomic_helper_prepare_planes+0xda/0x210 [drm_kms_helper]
drm_atomic_helper_commit+0xc3/0x1e0 [drm_kms_helper]
drm_atomic_commit+0x9c/0x160 [drm]
drm_client_modeset_commit_atomic+0x33a/0x380 [drm]
drm_client_modeset_commit_locked+0x77/0x220 [drm]
drm_client_modeset_commit+0x31/0x60 [drm]
__drm_fb_helper_restore_fbdev_mode_unlocked+0xa7/0x170 [drm_kms_helper]
drm_fb_helper_set_par+0x51/0x90 [drm_kms_helper]
fbcon_init+0x316/0x790
visual_init+0x113/0x1d0
do_bind_con_driver+0x2a3/0x5c0
do_take_over_console+0xa9/0x270
do_fbcon_takeover+0xa1/0x170
do_fb_registered+0x2a8/0x340
fbcon_fb_registered+0x47/0xe0
register_framebuffer+0x294/0x4a0
__drm_fb_helper_initial_config_and_unlock+0x43c/0x880 [drm_kms_helper]
drm_fb_helper_initial_config+0x52/0x80 [drm_kms_helper]
drm_fbdev_client_hotplug+0x156/0x1b0 [drm_kms_helper]
drm_fbdev_generic_setup+0xfc/0x290 [drm_kms_helper]
bochs_pci_probe+0x6ca/0x772 [bochs]
local_pci_probe+0x4d/0xb0
pci_device_probe+0x119/0x320
really_probe+0x181/0x550
__driver_probe_device+0xc6/0x220
driver_probe_device+0x32/0x100
__driver_attach+0x195/0x200
bus_for_each_dev+0xbb/0x120
driver_attach+0x27/0x30
bus_add_driver+0x22e/0x2f0
driver_register+0xa9/0x190
__pci_register_driver+0x90/0xa0
bochs_pci_driver_init+0x52/0x1000 [bochs]
do_one_initcall+0x76/0x430
do_init_module+0x61/0x28a
load_module+0x1f82/0x2e50
__do_sys_finit_module+0xf8/0x190
__x64_sys_finit_module+0x23/0x30
do_syscall_64+0x58/0x80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
</TASK>

Fixes: 3312be8f6fc8 ("drm/ttm: move populated state into page flags")
Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221031113350.4180975-1-cuigaosheng1@huawei.com
Signed-off-by: Christian König <christian.koenig@amd.com>

drm/i915/selftests: Run the perf MI_BB tests on gen4/5

Now that we know the ring timestamp frequency on gen4/5 we
can run the perf tests that depend on sampling the timestamp.

On g4x/ilk we must read the udw of the 64bit timestamp
register. Details in {g4x,gen5)_read_clock_frequency().

When executing the read via the CS i965 doesn't seem to need
the double read trick that CPU mmio reads need.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221031135703.14670-7-ville.syrjala@linux.intel.com
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

drm/i915/selftests: Test RING_TIMESTAMP on gen4/5

Now that we actually know the cs timestamp frequency on gen4/5
let's run the corresponding test.

On g4x/ilk we must read the udw of the 64bit timestamp
register. Details in {g4x,gen5)_read_clock_frequency().

The one extra caveat is that on i965 (or at least CL, don't
recall if I ever tested on BW) we must read the register
twice to get an up to date value. For some unknown reason
the first read tends to return a stale value.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221031135703.14670-6-ville.syrjala@linux.intel.com
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

drm/i915/selftests: Run MI_BB perf selftests on SNB

SNB does have the RING_TIMESTAMP register on the RCS engine.
Run the MI_BB perf tests on it.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221031135703.14670-5-ville.syrjala@linux.intel.com
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

drm/i915: Fix cs timestamp frequency for cl/bw

Despite what the spec says the TIMESTAMP register seems to
tick once every hrawclk (confirmed on i965gm and g35).

v2: Rebase

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221031135703.14670-4-ville.syrjala@linux.intel.com
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

drm/i915: Stop claiming cs timestamp frquency on gen2/3

Gen2/3 have no TIMESTAMP registers to sample so no point in thinking
we have any frequency for it either.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221031135703.14670-3-ville.syrjala@linux.intel.com
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

drm/i915: Fix cs timestamp frequency for ctg/elk/ilk

On ilk the UDW of TIMESTAMP increments every 1000 ns,
LDW is mbz. In order to represent that we'd need 52 bits,
but we only have 32 bits. Even worse most things want to
only deal with 32 bits of timestamp. So let's just set
up the timestamp frequency as if we only had the UDW.

On ctg/elk 63:20 of TIMESTAMP increments every 1/4 ns, 19:0
are mbz. To make life simpler let's ignore the LDW and set up
timestamp frequency based on the UDW only (increments every
1024 ns).

v2: Rebase

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221031135703.14670-2-ville.syrjala@linux.intel.com
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

drm/panel/panel-sitronix-st7701: Remove panel on DSI attach failure

In case mipi_dsi_attach() fails, call drm_panel_remove() to
avoid memory leak.

Fixes: 849b2e3ff969 ("drm/panel: Add Sitronix ST7701 panel driver")
Signed-off-by: Marek Vasut <marex@denx.de>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20221014231106.468063-1-marex@denx.de

drm/panel/panel-sitronix-st7701: Clean up CMDnBKx selection

There are two command register files, CMD1 and CMD2, where only the CMD2
contains additional register sub-files BK0..3 . Pull the register file
selection call into separate function instead of duplicating it all over
the driver. The CMD2BK2 file is undocumented in datasheet, and is used
for BIST. No functional change.

Signed-off-by: Marek Vasut <marex@denx.de>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20221014231042.468033-1-marex@denx.de

drm/i915/dg2: Introduce Wa_18017747507

WA 18017747507 applies to all DG2 skus.

BSpec: 56035, 46121, 68173

Signed-off-by: Wayne Boyer <wayne.boyer@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221031131509.3411195-1-wayne.boyer@intel.com

drm/panel/panel-sitronix-st7701: Fix RTNI calculation

The RTNI field is multiplied by 16 and incremented by 512 before being
used as the minimum number of pixel clock per horizontal line, hence
it is necessary to subtract those 512 bytes from htotal and then divide
the result by 16 before writing the value into the RTNI field. Fix the
calculation.

Fixes: de2b4917843c ("drm/panel/panel-sitronix-st7701: Infer horizontal pixel count from TFT mode")
Signed-off-by: Marek Vasut <marex@denx.de>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20221012221159.88397-1-marex@denx.de

drm: lcdif: change burst size to 256B

If a axi bus master with a higher priority do a lot of memory access
FIFO underruns can be inspected. Increase the burst size to 256B to
avoid such underruns and to improve the memory access efficiency.

Fixes: 9db35bb349a0 ("drm: lcdif: Add support for i.MX8MP LCDIF variant")
Signed-off-by: Marco Felsch <m.felsch@pengutronix.de>
Reviewed-by: Marek Vasut <marex@denx.de>
Signed-off-by: Marek Vasut <marex@denx.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20221101164615.778299-1-m.felsch@pengutronix.de

Merge tag 'drm-intel-next-2022-10-28' of git://anongit.freedesktop.org/drm/drm-intel into drm-next

- Hotplug code clean-up and organization (Jani, Gustavo)
- More VBT specific code clean-up, doc, organization,
and improvements (Ville)
- More MTL enabling work (Matt, RK, Anusha, Jose)
- FBC related clean-ups and improvements (Ville)
- Removing unused sw_fence_await_reservation (Niranjana)
- Big chunch of display house clean-up (Ville)
- Many Watermark fixes and clean-ups (Ville)
- Fix device info for devices without display (Jani)
- Fix TC port PLLs after readout (Ville)
- DPLL ID clean-ups (Ville)
- Prep work for finishing (de)gamma readout (Ville)
- PSR fixes and improvements (Jouni, Jose)
- Reject excessive dotclocks early (Ville)
- DRRS related improvements (Ville)
- Simplify uncore register updates (Andrzej)
- Fix simulated GPU reset wrt. encoder HW readout (Imre)
- Add a ADL-P workaround (Jose)
- Fix clear mask in GEN7_MISCCPCTL update (Andrzej)
- Temporarily disable runtime_pm for discrete (Anshuman)
- Improve fbdev debugs (Nirmoy)
- Fix DP FRL link training status (Ankit)
- Other small display fixes (Ankit, Suraj)
- Allow panel fixed modes to have differing sync
polarities (Ville)
- Clean up crtc state flag checks (Ville)
- Fix race conditions during DKL PHY accesses (Imre)
- Prep-work for cdclock squash and crawl modes (Anusha)
- ELD precompute and readout (Ville)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Y1wd6ZJ8LdJpCfZL@intel.com

drm/vmwgfx: Cleanup the cursor snooping code

Cursor snooping depended on implicit size and format which made debugging
quite difficult. Make the code easier to following by making everything
explicit and instead of using magic numbers predefine all the
parameters the code depends on.

Also fixes incorrectly computed pitches for non-aligned cursor snoops.
Fix which has no practical effect because non-aligned cursor snoops
are not used by the X11 driver and Wayland cursors will go through
mob cursors, instead of surface dma's.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Michael Banack <banackm@vmware.com>
Reviewed-by: Martin Krastev <krastevm@vmware.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221026031936.1004280-2-zack@kde.org

drm/vmwgfx: Validate the box size for the snooped cursor

Invalid userspace dma surface copies could potentially overflow
the memcpy from the surface to the snooped image leading to crashes.
To fix it the dimensions of the copybox have to be validated
against the expected size of the snooped cursor.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Fixes: 2ac863719e51 ("vmwgfx: Snoop DMA transfers with non-covering sizes")
Cc: <stable@vger.kernel.org> # v3.2+
Reviewed-by: Michael Banack <banackm@vmware.com>
Reviewed-by: Martin Krastev <krastevm@vmware.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221026031936.1004280-1-zack@kde.org

drm/i915/dmabuf: Use scatterlist for_each_sg API

Update open coded for loop to use the standard scatterlist
for_each_sg API.

Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221028155029.494736-4-matthew.auld@intel.com

drm/i915/dmabuf: dmabuf cleanup

Some minor cleanup of some variables for consistency.

Normalize struct sg_table to sgt.
Normalize struct dma_buf_attachment to attach.
checkpatch issues sizeof(), !NULL updates.

Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221028155029.494736-3-matthew.auld@intel.com

drm/i915/selftests: exercise GPU access from the importer

Using PAGE_SIZE here potentially hides issues so bump that to something
larger. This should also make it possible for iommu to coalesce entries
for us. With that in place verify we can write from the GPU using the
importers sg_table, followed by checking that our writes match when read
from the CPU side.

v2: Switch over to igt_gpu_fill_dw(), which looks to be more widely
supported than the migrate stuff (at least OOTB).

References: https://gitlab.freedesktop.org/drm/intel/-/issues/7306
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Michael J. Ruhl <michael.j.ruhl@intel.com>
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221028155029.494736-2-matthew.auld@intel.com

drm/i915/dmabuf: fix sg_table handling in map_dma_buf

We need to iterate over the original entries here for the sg_table,
pulling out the struct page for each one, to be remapped. However
currently this incorrectly iterates over the final dma mapped entries,
which is likely just one gigantic sg entry if the iommu is enabled,
leading to us only mapping the first struct page (and any physically
contiguous pages following it), even if there is potentially lots more
data to follow.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/7306
Fixes: 1286ff739773 ("i915: add dmabuf/prime buffer sharing support.")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Michael J. Ruhl <michael.j.ruhl@intel.com>
Cc: <stable@vger.kernel.org> # v3.5+
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221028155029.494736-1-matthew.auld@intel.com

drm/i915/dgfx: Grab wakeref at i915_ttm_unmap_virtual

We had already grabbed the rpm wakeref at obj destruction path,
but it also required to grab the wakeref when object moves.
When i915_gem_object_release_mmap_offset() gets called by
i915_ttm_move_notify(), it will release the mmap offset without
grabbing the wakeref. We want to avoid that therefore,
grab the wakeref at i915_ttm_unmap_virtual() accordingly.

While doing that also changed the lmem_userfault_lock from
mutex to spinlock, as spinlock widely used for list.

Also changed if (obj->userfault_count) to
GEM_BUG_ON(!obj->userfault_count).

v2:
- Removed lmem_userfault_{list,lock} from intel_gt. [Matt Auld]

Fixes: ad74457a6b5a ("drm/i915/dgfx: Release mmap on rpm suspend")
Suggested-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221027092242.1476080-3-anshuman.gupta@intel.com

drm/i915: Encapsulate lmem rpm stuff in intel_runtime_pm

Runtime pm is not really per GT, therefore it make sense to
move lmem_userfault_list, lmem_userfault_lock and
userfault_wakeref from intel_gt to intel_runtime_pm structure,
which is embedded to i915.

No functional change.

v2:
- Fixes the code comment nit. [Matt Auld]

Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221027092242.1476080-2-anshuman.gupta@intel.com

drm/rockchip: lvds: fix PM usage counter unbalance in poweron

pm_runtime_get_sync will increment pm usage counter even it failed.
Forgetting to putting operation will result in reference leak here.
We fix it by replacing it with the newest pm_runtime_resume_and_get
to keep usage counter balanced.

Fixes: 34cc0aa25456 ("drm/rockchip: Add support for Rockchip Soc LVDS")
Fixes: cca1705c3d89 ("drm/rockchip: lvds: Add PX30 support")
Signed-off-by: Zhang Qilong <zhangqilong3@huawei.com>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20220922132107.105419-3-zhangqilong3@huawei.com

drm/rockchip: use pm_runtime_resume_and_get() instead of pm_runtime_get_sync()

Replace pm_runtime_get_sync() with pm_runtime_resume_and_get() to avoid
device usage counter leak.

Signed-off-by: Yuan Can <yuancan@huawei.com>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20220615062644.96837-1-yuancan@huawei.com

drm/rockchip: dsi: Remove the unused function dsi_update_bits()

The function dsi_update_bits() is defined in the dw-mipi-dsi-rockchip.c
file, but not called elsewhere, so delete this unused function.

drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c:367:20: warning: unused function 'dsi_update_bits'.

https://bugzilla.openanolis.cn/show_bug.cgi?id=2414

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20221017084330.94117-1-jiapeng.chong@linux.alibaba.com

drm/rockchip: vop2: Register Esmart0-win0 as primary plane

Esmart0-win0 could serve as primary plane, so mark it as such. On
RK3568 this window will never be used as primary plane, because the
three windows at the beginning of the rk3568_vop_win_data[] array
will be used. On RK3566 however, two of the windows at the beginning
of the rk3568_vop_win_data[] array cannot not be used due to hardware
limitations, so without this patch we end up with CRTCs without primary
planes when multiple VPs are active.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Tested-by: Chris Morgan <macromorgan@hotmail.com>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20220926081643.304759-1-s.hauer@pengutronix.de

drm: rockchip: remove rockchip_drm_framebuffer_init() function

The function rockchip_drm_framebuffer_init() was in use
in the rockchip_drm_fbdev.c file, but that is now replaced
by a generic fbdev setup. Reduce the image size by
removing the rockchip_drm_framebuffer_init() and sub function
rockchip_fb_alloc() and cleanup the rockchip_drm_fb.h header file.

Signed-off-by: Johan Jonker <jbx6244@gmail.com>
Reviewed-by: John Keeping <john@metanate.com>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/ebe91504-c5df-99e4-635f-832218584051@gmail.com

drm/i915/mtl: Add missing steering table terminators

The termination entries were missing for a couple of the recently-added
MTL steering tables.

Fixes: f32898c94a10 ("drm/i915/xelpg: Add multicast steering")
Fixes: a7ec65fc7e83 ("drm/i915/xelpmp: Add multicast steering for media GT")
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221028224022.964997-1-matthew.d.roper@intel.com

drm: bridge: adv7511: use dev_err_probe in probe function

adv7511 probe may need to be attempted multiple times before no
-EPROBE_DEFER is returned. Currently, every such probe results in
an error message:

[ 4.534229] adv7511 1-003d: failed to find dsi host
[ 4.580288] adv7511 1-003d: failed to find dsi host

This is misleading, as there is no error and probe deferral is normal
behavior. Fix this by using dev_err_probe that will suppress
-EPROBE_DEFER errors. While at it, we touch all dev_err in the probe
path. This makes the code more concise and included the error code
everywhere to aid user in debugging.

Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20221026125246.3188260-1-a.fatoum@pengutronix.de

drm/i915/sdvo: Fix debug print

Correctly indicate which outputs we support in the debug print.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221026101134.20865-9-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>

drm/i915/sdvo: Reduce copy-pasta in output setup

Avoid having to call the output init function for each
output type separately. We can just call the right one
based on the "class" of the output.

Technically we could just walk the bits of the bitmask
but that could change the order in which we initialize
the outputs. To avoid any behavioural changes keep to
the same explicit probe order as before.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221026101134.20865-8-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>

drm/i915/sdvo: Get rid of the output type<->device index stuff

Get rid of this silly output type<->device index back and
forth and just pass the output type directly to the corresponding
output init function. This was already being done for TV outputs
anyway.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221026101134.20865-7-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>

drm/i915/sdvo: Don't add DDC modes for LVDS

Stop enumerating the DDC modes for SDVO LVDS outputs (outside
the initial fixed mode setup). intel_panel_mode_valid() will
just reject most of them anyway, and any left over are entirely
pointless as they'll match the fixed mode hdisp+vdisp+vrefresh
so no user visible effect from using them instead of the fixed
mode.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221026101134.20865-6-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>

drm/i915/sdvo: Simplify output setup debugs

Get rid of this funny byte based dumping of invalid output
flags and just dump it as a single hex numbers. Also do that
early since all the rest is going to get skipped anyway of
the thing is zero.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221026101134.20865-5-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>

drm/i915/sdvo: Grab mode_config.mutex during LVDS init to avoid WARNs

drm_mode_probed_add() is unhappy about being called w/o
mode_config.mutex. Grab it during LVDS fixed mode setup
to silence the WARNs.

Cc: stable@vger.kernel.org
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/7301
Fixes: aa2b88074a56 ("drm/i915/sdvo: Fix multi function encoder stuff")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221026101134.20865-4-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>

drm/i915/sdvo: Setup DDC fully before output init

Call intel_sdvo_select_ddc_bus() before initializing any
of the outputs. And before that is functional (assuming no VBT)
we have to set up the controlled_outputs thing. Otherwise DDC
won't be functional during the output init but LVDS really
needs it for the fixed mode setup.

Note that the whole multi output support still looks very
bogus, and more work will be needed to make it correct.
But for now this should at least fix the LVDS EDID fixed mode
setup.

Cc: stable@vger.kernel.org
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/7301
Fixes: aa2b88074a56 ("drm/i915/sdvo: Fix multi function encoder stuff")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221026101134.20865-3-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>

drm/i915/sdvo: Filter out invalid outputs more sensibly

We try to filter out the corresponding xxx1 output
if the xxx0 output is not present. But the way that is
being done is pretty awkward. Make it less so.

Cc: stable@vger.kernel.org
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221026101134.20865-2-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>

drm/vc4: Make sure we don't end up with a core clock too high

Following the clock rate range improvements to the clock framework,
trying to set a disjoint range on a clock will now result in an error.

Thus, we can't set a minimum rate higher than the maximum reported by
the firmware, or clk_set_min_rate() will fail.

Thus we need to clamp the rate we are about to ask for to the maximum
rate possible on that clock.

Reviewed-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Link: https://lore.kernel.org/r/20220815-rpi-fix-4k-60-v5-7-fe9e7ac8b111@cerno.tech
Signed-off-by: Maxime Ripard <maxime@cerno.tech>

drm/vc4: hdmi: Add more checks for 4k resolutions

At least the 4096x2160@60Hz mode requires some overclocking that isn't
available by default, even if hdmi_enable_4kp60 is enabled.

Let's add some logic to detect whether we can satisfy the core clock
requirements for that mode, and prevent it from being used otherwise.

Signed-off-by: Dom Cobley <popcornmix@gmail.com>
Reviewed-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Link: https://lore.kernel.org/r/20220815-rpi-fix-4k-60-v5-6-fe9e7ac8b111@cerno.tech
Signed-off-by: Maxime Ripard <maxime@cerno.tech>

drm/vc4: hdmi: Rework hdmi_enable_4kp60 detection code

In order to support higher HDMI frequencies, users have to set the
hdmi_enable_4kp60 parameter in their config.txt file.

This will have the side-effect of raising the maximum of the core clock,
tied to the HVS, and managed by the HVS driver.

However, we are querying this in the HDMI driver by poking into the HVS
structure to get our struct clk handle.

Let's make this part of the HVS bind implementation to have all the core
clock related setup in the same place.

Reviewed-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Link: https://lore.kernel.org/r/20220815-rpi-fix-4k-60-v5-5-fe9e7ac8b111@cerno.tech
Signed-off-by: Maxime Ripard <maxime@cerno.tech>

drm/vc4: hdmi: Fix hdmi_enable_4kp60 detection

In order to support higher HDMI frequencies, users have to set the
hdmi_enable_4kp60 parameter in their config.txt file.

We were detecting this so far by calling clk_round_rate() on the core
clock with the frequency we're supposed to run at when one of those
modes is enabled. Whether or not the parameter was enabled could then be
inferred by the returned rate since the maximum clock rate reported by
the firmware was one of the side effect of setting that parameter.

However, the recent clock rework we did changed what clk_round_rate()
was returning to always return the minimum allowed, and thus this test
wasn't reliable anymore.

Let's use the new clk_get_max_rate() function to reliably determine the
maximum rate allowed on that clock and fix the 4k@60Hz output.

Fixes: e9d6cea2af1c ("clk: bcm: rpi: Run some clocks at the minimum rate allowed")
Reviewed-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Link: https://lore.kernel.org/r/20220815-rpi-fix-4k-60-v5-4-fe9e7ac8b111@cerno.tech
Signed-off-by: Maxime Ripard <maxime@cerno.tech>

firmware: raspberrypi: Provide a helper to query a clock max rate

The firmware allows to query for its clocks the operating range of a
given clock. We'll need this for some drivers (KMS, in particular) to
infer the state of some configuration options, so let's create a
function to do so.

Acked-by: Stephen Boyd <sboyd@kernel.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20220815-rpi-fix-4k-60-v5-3-fe9e7ac8b111@cerno.tech
Signed-off-by: Maxime Ripard <maxime@cerno.tech>

firmware: raspberrypi: Move the clock IDs to the firmware header

We'll need the clock IDs in more drivers than just the clock driver from
now on, so let's move them in the firmware header.

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Stephen Boyd <sboyd@kernel.org>
Link: https://lore.kernel.org/r/20220815-rpi-fix-4k-60-v5-2-fe9e7ac8b111@cerno.tech
Signed-off-by: Maxime Ripard <maxime@cerno.tech>

firmware: raspberrypi: Introduce rpi_firmware_find_node()

A significant number of RaspberryPi drivers using the firmware don't
have a phandle to it, so end up scanning the device tree to find a node
with the firmware compatible.

That code is duplicated everywhere, so let's introduce a helper instead.

Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20220815-rpi-fix-4k-60-v5-1-fe9e7ac8b111@cerno.tech
Signed-off-by: Maxime Ripard <maxime@cerno.tech>

Merge tag 'drm-misc-next-2022-10-27' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for 6.2:

UAPI Changes:

Cross-subsystem Changes:

Core Changes:
- connector: Send hotplug event on cleanup
- edid: logging/debug improvements
- plane_helper: Improve tests

Driver Changes:
- bridge:
- it6505: Synchronization improvements
- panel:
- panel-edp: Add INX N116BGE-EA2 C2 and C4 support.
- nouveau: Fix page-fault handling
- vmwgfx: fb and cursor refactoring, convert to generic hashtable

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20221027073407.c2tlaczvzjrnzazi@houat

drm/i915/perf: Enable OA for DG2

OA was disabled for DG2 as support was missing. Enable it back now.

Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221026222102.5526-17-umesh.nerlige.ramappa@intel.com

drm/i915/perf: complete programming whitelisting for XEHPSDV

We have an additional register to select which slices contribute to
OAG/OAG counter increments.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221026222102.5526-16-umesh.nerlige.ramappa@intel.com

drm/i915/guc: Support OA when Wa_16011777198 is enabled

On DG2, a w/a resets RCS/CCS before it goes into RC6. This breaks OA
since OA does not expect engine resets during its use. Fix it by
disabling RC6.

v2: (Ashutosh)
- Bring back slpc_unset_param helper
- Update commit msg
- Use with_intel_runtime_pm helper for set/unset

v3: (Ashutosh)
- Just use intel_uc_uses_guc_rc

Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221026222102.5526-15-umesh.nerlige.ramappa@intel.com

drm/i915/perf: Save/restore EU flex counters across reset

If a drm client is killed, then hw contexts used by the client are reset
immediately. This reset clears the EU flex counter configuration. If an
OA use case is running in parallel, it would start seeing zeroed eu
counter values following the reset even if the drm client is restarted.
Save/restore the EU flex counter config so that the EU counters can be
monitored continuously across resets.

v2:
- Save/restore eu flex config only for gen12, as for pre-gen12, these
are saved and restored in the context image.

Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221026222102.5526-14-umesh.nerlige.ramappa@intel.com

drm/i915/perf: Apply Wa_18013179988

OA reports in the OA buffer contain an OA timestamp field that helps
user calculate delta between 2 OA reports. The calculation relies on the
CS timestamp frequency to convert the timestamp value to nanoseconds.
The CS timestamp frequency is a function of the CTC_SHIFT value in
RPM_CONFIG0.

In DG2, OA unit assumes that the CTC_SHIFT is 3, instead of using the
actual value from RPM_CONFIG0. At the user level, this results in an
error in calculating delta between 2 OA reports since the OA timestamp
is not shifted in the same manner as CS timestamp. Also the periodicity
of the reports is different from what the user configured because of
mismatch in the CS and OA frequencies.

The issue also affects MI_REPORT_PERF_COUNT command.

To resolve this, return actual OA timestamp frequency to the user in
i915_getparam_ioctl, so that user can calculate the right OA exponent as
well as interpret the reports correctly.

MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18893

v2:
- Use REG_FIELD_GET (Ashutosh)
- Update commit msg

Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221026222102.5526-13-umesh.nerlige.ramappa@intel.com

drm/i915/perf: Add Wa_1508761755:dg2

Disable Clock gating in EU when gathering the events so that EU events
are not lost.

v2: Fix checkpatch issues
v3: User MCR helpers to write to MC reg
v4: Indent correctly (checkpatch)

Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221026222102.5526-12-umesh.nerlige.ramappa@intel.com

drm/i915/perf: Store a pointer to oa_format in oa_buffer

DG2 introduces OA reports with 64 bit report header fields. Perf OA
would need more information about the OA format in order to process such
reports. Store all OA format info in oa_buffer instead of just the size
and format-id.

v2: Drop format_size variable (Ashutosh)

Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221026222102.5526-11-umesh.nerlige.ramappa@intel.com

drm/i915/perf: Use gt-specific ggtt for OA and noa-wait buffers

User passes uabi engine class and instance to the perf OA interface. Use
gt corresponding to the engine to pin the buffers to the right ggtt.

Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221026222102.5526-10-umesh.nerlige.ramappa@intel.com

drm/i915/perf: Replace gt->perf.lock with stream->lock for file ops

With multi-gt, user can access multiple OA buffers concurrently. Use
stream->lock instead of gt->perf.lock to serialize file operations.

Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221026222102.5526-9-umesh.nerlige.ramappa@intel.com

drm/i915/perf: Move gt-specific data from i915->perf to gt->perf

Make perf part of gt as the OAG buffer is specific to a gt. The refactor
eventually simplifies programming the right OA buffer and the right HW
registers when supporting multiple gts.

Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221026222102.5526-8-umesh.nerlige.ramappa@intel.com

drm/i915/perf: Simply use stream->ctx

Earlier code used exclusive_stream to check for user passed context.
Simplify this by accessing stream->ctx.

Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221026222102.5526-7-umesh.nerlige.ramappa@intel.com

drm/i915/perf: Enable bytes per clock reporting in OA

XEHPSDV and DG2 provide a way to configure bytes per clock vs commands
per clock reporting. Enable bytes per clock setting on enabling OA.

Bspec: 51762
Bspec: 52201

v2:
- Fix commit msg (Ashutosh)
- Fix checkpatch issues

v3:
- s/commands/bytes/ in code comment and commmit msg

Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221026222102.5526-6-umesh.nerlige.ramappa@intel.com

drm/i915/perf: Determine gen12 oa ctx offset at runtime

Some SKUs of same gen12 platform may have different oactxctrl
offsets. For gen12, determine oactxctrl offsets at runtime.

v2: (Lionel)
- Move MI definitions to intel_gpu_commands.h
- Ensure __find_reg_in_lri does read past context image size

v3: (Ashutosh)
- Drop unnecessary use of double underscores
- fix find_reg_in_lri
- Return error if oa context offset is U32_MAX
- Error out if oa_ctx_ctrl_offset does not find offset

v4: (Ashutosh)
- Warn on odd MI LRI_LEN
- Remove unnecessary check for valid_oactxctrl_offset
- Drop valid_oactxctrl_offset macro

v5: Drop unrelated comment

Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221026222102.5526-5-umesh.nerlige.ramappa@intel.com

drm/i915/perf: Fix noa wait predication for DG2

Predication for batch buffer commands changed in XEHPSDV.
MI_BATCH_BUFFER_START predicates based on MI_SET_PREDICATE_RESULT
register. The MI_SET_PREDICATE_RESULT register can only be modified
with MI_SET_PREDICATE command. When configured, the MI_SET_PREDICATE
command sets MI_SET_PREDICATE_RESULT based on bit 0 of
MI_PREDICATE_RESULT_2. Use this to configure predication in noa_wait.

Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221026222102.5526-4-umesh.nerlige.ramappa@intel.com

drm/i915/perf: Add 32-bit OAG and OAR formats for DG2

Add new OA formats for DG2.

MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18893

v2:
- Update commit title (Ashutosh)
- Coding style fixes (Lionel)
- 64 bit OA formats need UMD changes in GPUvis, drop for now and send in a
separate series with UMD changes

v3:
- Update commit message to drop 64 bit related description

Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> #1
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221026222102.5526-3-umesh.nerlige.ramappa@intel.com