Lucas De Marchi [Tue, 24 Sep 2019 21:00:38 +0000 (14:00 -0700)]
drm/i915/tgl: re-indent code to prepare for DKL changes
The final save operation into pll_state of the calculations done will
be different for DKL PHY. Prepare for that by reindenting code so it's
easier to check for correctness. This one has no change in behavior.
Lucas De Marchi [Tue, 24 Sep 2019 21:00:35 +0000 (14:00 -0700)]
drm/i915/tgl: Add initial dkl pll support
The disable function can be the same as for MG phy since the same
registers are used. The others are different as registers changed,
also adding a empty dkl_pll_write() to be implemented later.
v2:
Setting the right HIP_INDEX_REG bits (José)
v3:
Masking non-computed registers of mg_pll_tdc_coldst_bias
when getting hardware state
Sharing mg_pll_enable() with TGL
Chris Wilson [Wed, 25 Sep 2019 13:08:45 +0000 (14:08 +0100)]
drm/i915/execlists: Simplify gen12_csb_parse
Having decided that we only care about the promotion predicate, we can
simplify gen12_csb_parse to simply check whether we need to jump to a
new queue.
Rename linked_plane to planar_linked_plane and slave to planar_slave,
this will make it easier to keep apart bigjoiner linking and planar plane
linking.
We had this as an optimization to not do a plane update, but we killed
it off because there are so many reasons we may have to do a plane
update or fastset that it's best to just assume everything changed.
Readout the FEC state in encoder->get_config(), this will allow
us to ensure that we can correctly inherit the state from boot,
and that we set FEC during modeset.
There was a integer wraparound when mode_clock became too high,
and we didn't correct for the FEC overhead factor when dividing,
with the calculations breaking at HBR3.
As a result our calculated bpp was way too high, and the link width
limitation never came into effect.
Print out the resulting bpp calcululations as a sanity check, just
in case we ever have to debug it later on again.
We also used the wrong factor for FEC. While bspec mentions 2.4%,
all the calculations use 1/0.972261, and the same ratio should be
applied to data M/N as well, so use it there when FEC is enabled.
This fixes the FIFO underrun we are seeing with FEC enabled.
Changes since v2:
- Handle fec_enable in intel_link_compute_m_n, so only data M/N is adjusted. (Ville)
- Fix initial hardware readout for FEC. (Ville)
Changes since v3:
- Remove bogus fec_to_mode_clock. (Ville)
Changes since v4:
- Use the correct register for icl. (Ville)
- Split hw readout to a separate patch.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Fixes: d9218c8f6cf4 ("drm/i915/dp: Add helpers for Compressed BPP and Slice Count for DSC") Cc: <stable@vger.kernel.org> # v5.0+ Cc: Manasi Navare <manasi.d.navare@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190925082110.17439-1-maarten.lankhorst@linux.intel.com Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Chris Wilson [Tue, 24 Sep 2019 17:35:01 +0000 (18:35 +0100)]
drm/i915/tgl: Swap engines for no rps (gpu reclocking)
If we disable rps, it appears the Tigerlake is stable enough to run
multiple engines simultaneously in CI. As disabling rps should only
cause the execution to be slow, whereas many features depend on the
different engines, we would prefer to have the engines enabled while the
machine hangs are being debugged.
drm/i915: Add Pipe D cursor ctrl register for Gen12
Currently the offset for PIPE D cursor control register is missing in
i915_reg.h due to which the cursor plane cannot be enabled for Pipe D.
This also causes kernel Warning, when a user requests to enable cursor
plane for PIPE D for Gen 12 platforms.
This patch adds the CURSOR_CTL_D register in the i915_reg.h.
Chris Wilson [Tue, 24 Sep 2019 14:59:50 +0000 (15:59 +0100)]
drm/i915/selftests: Verify the LRC register layout between init and HW
Before we submit the first context to HW, we need to construct a valid
image of the register state. This layout is defined by the HW and should
match the layout generated by HW when it saves the context image.
Asserting that this should be equivalent should help avoid any undefined
behaviour and verify that we haven't missed anything important!
Of course, having insisted that the initial register state within the
LRC should match that returned by HW, we need to ensure that it does.
v2: Drop the RELATIVE_MMIO flag from gen11, we ignore it for
constructing the lrc image.
Added bandwidth calculation algorithm and checks,
similar way as it was done for ICL, some constants
were corrected according to BSpec 53998.
v2: Start using same icl_get_bw_info function to avoid
code duplication. Moved mpagesize to memory info
related structure as it is now dependent on memory type.
Fixed qi.t_bl field assignment.
v3: Removed mpagesize as unused. Duplicate code and redundant blankline
fixed.
v4: Changed ordering of IS_GEN checks as agreed. Minor commit
message fixes.
Chris Wilson [Fri, 20 Sep 2019 12:18:21 +0000 (13:18 +0100)]
drm/i915: Mark contents as dirty on a write fault
Since dropping the set-to-gtt-domain in commit a679f58d0510 ("drm/i915:
Flush pages on acquisition"), we no longer mark the contents as dirty on
a write fault. This has the issue of us then not marking the pages as
dirty on releasing the buffer, which means the contents are not written
out to the swap device (should we ever pick that buffer as a victim).
Notably, this is visible in the dumb buffer interface used for cursors.
Having updated the cursor contents via mmap, and swapped away, if the
shrinker should evict the old cursor, upon next reuse, the cursor would
be invisible.
E.g. echo 80 > /proc/sys/kernel/sysrq ; echo f > /proc/sysrq-trigger
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111541 Fixes: a679f58d0510 ("drm/i915: Flush pages on acquisition") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.william.auld@gmail.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: <stable@vger.kernel.org> # v5.2+ Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190920121821.7223-1-chris@chris-wilson.co.uk
Chris Wilson [Mon, 23 Sep 2019 15:28:44 +0000 (16:28 +0100)]
drm/i915: Prevent bonded requests from overtaking each other on preemption
Force bonded requests to run on distinct engines so that they cannot be
shuffled onto the same engine where timeslicing will reverse the order.
A bonded request will often wait on a semaphore signaled by its master,
creating an implicit dependency -- if we ignore that implicit dependency
and allow the bonded request to run on the same engine and before its
master, we will cause a GPU hang. [Whether it will hang the GPU is
debatable, we should keep on timeslicing and each timeslice should be
"accidentally" counted as forward progress, in which case it should run
but at one-half to one-third speed.]
We can prevent this inversion by restricting which engines we allow
ourselves to jump to upon preemption, i.e. baking in the arrangement
established at first execution. (We should also consider capturing the
implicit dependency using i915_sched_add_dependency(), but first we need
to think about the constraints that requires on the execution/retirement
ordering.)
Chris Wilson [Mon, 23 Sep 2019 15:28:43 +0000 (16:28 +0100)]
drm/i915: Fixup preempt-to-busy vs reset of a virtual request
Due to the nature of preempt-to-busy the execlists active tracking and
the schedule queue may become temporarily desync'ed (between resubmission
to HW and its ack from HW). This means that we may have unwound a
request and passed it back to the virtual engine, but it is still
inflight on the HW and may even result in a GPU hang. If we detect that
GPU hang and try to reset, the hanging request->engine will no longer
match the current engine, which means that the request is not on the
execlists active list and we should not try to find an older incomplete
request. Given that we have deduced this must be a request on a virtual
engine, it is the single active request in the context and so must be
guilty (as the context is still inflight, it is prevented from being
executed on another engine as we process the reset).
Chris Wilson [Mon, 23 Sep 2019 15:28:42 +0000 (16:28 +0100)]
drm/i915: Fixup preempt-to-busy vs resubmission of a virtual request
As preempt-to-busy leaves the request on the HW as the resubmission is
processed, that request may complete in the background and even cause a
second virtual request to enter queue. This second virtual request
breaks our "single request in the virtual pipeline" assumptions.
Furthermore, as the virtual request may be completed and retired, we
lose the reference the virtual engine assumes is held. Normally, just
removing the request from the scheduler queue removes it from the
engine, but the virtual engine keeps track of its singleton request via
its ve->request. This pointer needs protecting with a reference.
Clinton A Taylor [Fri, 20 Sep 2019 20:58:07 +0000 (13:58 -0700)]
drm/i915/tgl/pll: Set update_active_dpll
Commit 24a7bfe0c2d7 ("drm/i915: Keep the TypeC port mode fixed when the
port is active") added this new hook while in parallel TGL upstream was
happening and this was missed.
Without this driver will crash when TC DDI is added and driver is
preparing to do a full modeset.
drm/i915/tgl: Finish modular FIA support on registers
If platform supports and has modular FIA is enabled, the registers
bits also change, example: reading TC3 registers with modular FIA
enabled, driver should read from FIA2 but with TC1 bits offsets.
It is described in BSpec 50231 for DFLEXDPSP, other registers don't
have the BSpec description but testing in real hardware have proven
that it had moved for all other registers too.
v2:
- Caching index in tc_phy_fia_idx, instead of calculate it each time
v3:
- Setting tc_phy_fia and tc_phy_fia_idx in the same function
Chris Wilson [Mon, 23 Sep 2019 11:00:56 +0000 (12:00 +0100)]
drm/i915/execlists: Refactor -EIO markup of hung requests
Pull setting -EIO on the hung requests into its own utility function.
Having allowed ourselves to short-circuit submission of completed
requests, we can now do the mark_eio() prior to submission and avoid
some redundant operations.
Chris Wilson [Mon, 23 Sep 2019 11:00:55 +0000 (12:00 +0100)]
drm/i915: Only enqueue already completed requests
If we are asked to submit a completed request, just move it onto the
active-list without modifying it's payload. If we try to emit the
modified payload of a completed request, we risk racing with the
ring->head update during retirement which may advance the head past our
breadcrumb and so we generate a warning for the emission being behind
the RING_HEAD.
v2: Commentary for the sneaky, shared responsibility between functions.
v3: Spelling mistakes and bonus assertion
Chris Wilson [Mon, 23 Sep 2019 11:00:54 +0000 (12:00 +0100)]
drm/i915/execlists: Drop redundant list_del_init(&rq->sched.link)
Since amalgamating the queued and active lists in commit 422d7df4f090
("drm/i915: Replace engine->timeline with a plain list"), performing a
i915_request_submit() will remove the request from the execlists
priority queue.
Chris Wilson [Mon, 23 Sep 2019 11:00:53 +0000 (12:00 +0100)]
drm/i915/execlists: Relax assertion for a pinned context image on reset
A gpu hang can occur at any time, given a sufficiently angry gpu. An
example is when it forgets to perform a context-switch at the end of a
request, leaving us with a hanging GPU on a completed request. Here, we
may retire the request, only leaving its context alive via the active
barrier. When we reset the GPU on a completed request, we do not modify
its context image (just updating the ring state) and can safely defer
the assertion that we have the image pinned and ready to modify.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111639 Fixes: dffa8feb3084 ("drm/i915/perf: Assert locking for i915_init_oa_perf_state()") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190923110056.15176-1-chris@chris-wilson.co.uk
Jani Nikula [Fri, 20 Sep 2019 18:54:19 +0000 (21:54 +0300)]
drm/i915: abstract intel_panel_sanitize_ssc() from intel_modeset_init()
The code is too specific and detailed to have open in a high level
function. Abstract away. As a drive-by improvement switch to using
enableddisabled() in logging and git rid of a redundant !!. No
functional changes.
Jani Nikula [Fri, 20 Sep 2019 18:54:18 +0000 (21:54 +0300)]
drm/i915: pass i915 to intel_modeset_driver_remove()
In general, prefer struct drm_i915_private * over struct drm_device *
when either will do. Rename the local variable to i915. Also propagate
to intel_hpd_poll_fini(). No functional changes.
Kai Vehmanen [Fri, 20 Sep 2019 08:39:18 +0000 (11:39 +0300)]
drm/i915: save AUD_FREQ_CNTRL state at audio domain suspend
When audio power domain is suspended, the display driver must
save state of AUD_FREQ_CNTRL on Tiger Lake and Ice Lake
systems. The initial value of the register is set by BIOS and
is read by driver during the audio component init sequence.
Animesh Manna [Fri, 20 Sep 2019 11:59:28 +0000 (17:29 +0530)]
drm/i915/dsb: Enable gamma lut programming using DSB.
Gamma lut programming can be programmed using DSB
where bulk register programming can be done using indexed
register write which takes number of data and the mmio offset
to be written.
Currently enabled for 12-bit gamma LUT which is enabled by
default and later 8-bit/10-bit will be enabled in future
based on need.
v1: Initial version.
v2: Directly call dsb-api at callsites. (Jani)
v3:
- modified the code as per single dsb instance per crtc. (Shashank)
- Added dsb get/put call in platform specific load_lut hook. (Jani)
- removed dsb pointer from dev_priv. (Jani)
v4: simplified code by dropping ref-count implementation. (Shashank)
Animesh Manna [Fri, 20 Sep 2019 11:59:27 +0000 (17:29 +0530)]
drm/i915/dsb: function to trigger workload execution of DSB.
Batch buffer will be created through dsb-reg-write function which can have
single/multiple request based on usecase and once the buffer is ready
commit function will trigger the execution of the batch buffer. All
the registers will be updated simultaneously.
v1: Initial version.
v2: Optimized code few places. (Chris)
v3: USed DRM_ERROR for dsb head/tail programming failure. (Shashank)
v4: reset ins_start_offset after commit. (Jani)
Cc: Imre Deak <imre.deak@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Shashank Sharma <shashank.sharma@intel.com> Reviewed-by: Shashank Sharma <shashank.sharma@intel.com> Signed-off-by: Animesh Manna <animesh.manna@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190920115930.27829-8-animesh.manna@intel.com
Animesh Manna [Fri, 20 Sep 2019 11:59:26 +0000 (17:29 +0530)]
drm/i915/dsb: functions to enable/disable DSB engine.
DSB will be used for performance improvement for some special scenario.
DSB engine will be enabled based on need and after completion of its work
will be disabled. Api added for enable/disable operation by using DSB_CTRL
register.
v1: Initial version.
v2: POSTING_READ added after writing control register. (Shashank)
v3: cosmetic changes done. (Shashank)
Cc: Michel Thierry <michel.thierry@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Shashank Sharma <shashank.sharma@intel.com> Reviewed-by: Shashank Sharma <shashank.sharma@intel.com> Signed-off-by: Animesh Manna <animesh.manna@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190920115930.27829-7-animesh.manna@intel.com
Animesh Manna [Fri, 20 Sep 2019 11:59:24 +0000 (17:29 +0530)]
drm/i915/dsb: Indexed register write function for DSB.
DSB can program large set of data through indexed register write
(opcode 0x9) in one shot. DSB feature can be used for bulk register
programming e.g. gamma lut programming, HDR meta data programming.
v1: initial version.
v2: simplified code by using ALIGN(). (Chris)
v3: ascii table added as code comment. (Shashank)
v4: cosmetic changes done. (Shashank)
v5: reset ins_start_offset. (Jani)
v6: update ins_start_offset in inel_dsb_reg_write.
Cc: Shashank Sharma <shashank.sharma@intel.com> Cc: Imre Deak <imre.deak@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Shashank Sharma <shashank.sharma@intel.com> Signed-off-by: Animesh Manna <animesh.manna@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190920115930.27829-5-animesh.manna@intel.com
Jani Nikula [Mon, 23 Sep 2019 07:09:23 +0000 (10:09 +0300)]
drm/i915/dsb: single register write function for DSB.
DSB support single register write through opcode 0x1. Generic
api created which accumulate all single register write in a batch
buffer and once DSB is triggered, it will program all the registers
at the same time.
v1: Initial version.
v2: Unused macro removed and cosmetic changes done. (Shashank)
v3: set free_pos to zero in dsb-put() instead dsb-get() and
a cosmetic change. (Shashank)
v4: macro of indexed-write is moved. (Shashank)
Cc: Jani Nikula <jani.nikula@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Shashank Sharma <shashank.sharma@intel.com> Reviewed-by: Shashank Sharma <shashank.sharma@intel.com> Signed-off-by: Animesh Manna <animesh.manna@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190920115930.27829-4-animesh.manna@intel.com
Animesh Manna [Fri, 20 Sep 2019 11:59:22 +0000 (17:29 +0530)]
drm/i915/dsb: DSB context creation.
This patch adds a function, which will internally get the gem buffer
for DSB engine. The GEM buffer is from global GTT, and is mapped into
CPU domain, contains the data + opcode to be feed to DSB engine.
v1: Initial version.
v2:
- removed some unwanted code. (Chris)
- Used i915_gem_object_create_internal instead of _shmem. (Chris)
- cmd_buf_tail removed and can be derived through vma object. (Chris)
v3: vma realeased if i915_gem_object_pin_map() failed. (Shashank)
v4: for simplification and based on current usage added single dsb
object in intel_crtc. (Shashank)
v5: seting NULL to cmd_buf moved outside of mutex in dsb-put(). (Shashank)
v6:
- refcount machanism added.
- Used atomic_add_return and atomic_dec_and_test instead of
atomic_inc and atomic_dec. (Jani)
Cc: Imre Deak <imre.deak@intel.com> Cc: Michel Thierry <michel.thierry@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Shashank Sharma <shashank.sharma@intel.com> Signed-off-by: Animesh Manna <animesh.manna@intel.com>
[Jani: added #include <linux/types.h> while pushing] Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190920115930.27829-3-animesh.manna@intel.com
For icl+, have hw read out to create hw blob of gamma
lut values. icl+ platforms supports multi segmented gamma
mode by default, add hw lut creation for this mode.
This will be used to validate gamma programming using dsb
(display state buffer) which is a tgl specific feature.
Major change done-removal of readouts of coarse and fine segments
because PAL_PREC_DATA register isn't giving propoer values.
State checker limited only to "fine segment"
v2: -readout code for multisegmented gamma has to come
up with some intermediate entries that aren't preserved
in hardware (Jani N)
-linear interpolation (Ville)
-moved common code to check gamma_enable to specific funcs,
since icl doesn't support that
v3: -use u16 instead of __u16 [Jani N]
-used single lut [Jani N]
-improved and more readable for loops [Jani N]
-read values directly to actual locations and then fill gaps [Jani N]
-moved cleaning to patch 1 [Jani N]
-renamed icl_read_lut_multi_seg() to icl_read_lut_multi_segment to
make it similar to icl_load_luts()
-renamed icl_compute_interpolated_gamma_blob() to
icl_compute_interpolated_gamma_lut_values() more sensible, I guess
v4: -removed interpolated func for creating gamma lut values
-removed readouts of fine and coarse segments, failure to read PAL_PREC_DATA
correctly
Gen12 has dual-subslices (DSS), which compared to gen11 subslices have
some duplicated resources/paths. Although DSS behave similarly to 2
subslices, instead of splitting this and presenting userspace with bits
not directly representative of hardware resources, present userspace
with a subslice_mask made up of DSS bits instead.
v2: GEM_BUG_ON on mask size (Lionel)
Bspec: 29547
Bspec: 12247 Cc: Kelvin Gardiner <kelvin.gardiner@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> CC: Radhakrishna Sripada <radhakrishna.sripada@intel.com> Cc: Michel Thierry <michel.thierry@intel.com> #v1 Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: James Ausmus <james.ausmus@intel.com> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com> Signed-off-by: Stuart Summers <stuart.summers@intel.com> Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190913075137.18476-2-chris@chris-wilson.co.uk Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Ville Syrjälä [Thu, 18 Jul 2019 14:50:53 +0000 (17:50 +0300)]
drm/i915: Add PIPECONF YCbCr 4:4:4 programming for ILK-IVB
On ILK-IVB the pipe colorspace is configured via PIPECONF
(as opposed to PIPEMISC in BDW+). Let's configure+readout
that stuff correctly.
Enabling YCbCr 4:4:4 output will now be a simple matter of
setting crtc_state->output_format appropriately in the encoder
.compute_config(). However, when we do that we must be
aware of the fact that YCbCr DP output doesn't seem to work
on ILK (resulting image is totally garbled), but on SNB+
it works fine. However HDMI YCbCr output does work correctly
even on ILK.
Ville Syrjälä [Thu, 18 Jul 2019 14:50:52 +0000 (17:50 +0300)]
drm/i915: Set up ILK/SNB csc unit properly for YCbCr output
Prepare the pipe csc for YCbCr output on ilk/snb. The main difference
to IVB+ is the lack of explicit post offsets, and instead we must
configure the CSC info RGB->YUV mode (which takes care of offsetting
Cb/Cr properly) and enable the "black screen offset" bit to add the
required offset to Y.
And while at it throw some comments around the bit defines to
document which platforms have which bits.
Ville Syrjälä [Thu, 18 Jul 2019 14:50:49 +0000 (17:50 +0300)]
drm/i915: Simplify intel_get_crtc_ycbcr_config()
Make intel_get_crtc_ycbcr_config() simpler and rename it
to bdw_get_pipemisc_output_format() to better reflect what
it does.
Also toss in some comments to document that the 4:2:0 PIPECONF
bits are glk+ only. They are mbz on earlier platforms so reading
them unconditionally is safe however.
Ville Syrjälä [Thu, 18 Jul 2019 14:50:48 +0000 (17:50 +0300)]
drm/i915: Don't look at unrelated PIPECONF bits for interlaced readout
Since HSW the PIPECONF progressive vs. interlaced selection is done
with just two bits instead of the earlier three. Let's not look at the
extra bit on HSW+. Also gen2 doesn't support interlaced displays at all.
This is actually fine as is currently because the extra bit is mbz (as
are all three bits on gen2). But just to avoid mishaps in the future
if the bits get reused let's only look at what's properly defined.
Ville Syrjälä [Thu, 18 Jul 2019 16:45:23 +0000 (19:45 +0300)]
drm/i915: Never set limited_color_range=true for YCbCr output
crtc_state->limited_color_range only applies to RGB output but
we're currently setting it even for YCbCr output. That will
lead to conflicting MSA and PIPECONF settings which can mess
up the image. Let's make sure limited_color_range stays unset
with YCbCr output.
Also WARN if we end up with such a bogus combination when
programming the MSA MISC bits as it's impossible to even
indicate quantization rangle for YCbCr via MSA MISC. YCbCr
output is simply assumed to be limited range always. Note
that VSC SDP does provide a mechanism for full range YCbCr,
so in the future we may want to rethink how we compute/store
this state.
And for good measure we add the same WARN to the HDMI path.
Ville Syrjälä [Thu, 18 Jul 2019 14:50:44 +0000 (17:50 +0300)]
drm/i915: Fix AVI infoframe quantization range for YCbCr output
We're configuring the AVI infoframe quantization range bits as if
we're always transmitting RGB pixels. Let's fix this so that we
correctly indicate limited range YCC quantization range when
transmitting YCbCr instead.
Looks like we're currently setting the MSA to xvYCC BT.709 instead
of the YCbCr BT.601 claimed by the comment. But even that comment
is wrong since we configure the CSC matrix to BT.709.
Let's remove the bogus statement from the comment and fix the
MSA to indicate YCbCr BT.709 so that it matches the actual
pixel data we're transmitting.
v2: Remove the separator parameter altogether from
__MAKE_UC_FW_PATH.(Daniele)
- Squash all firmware update patches (Daniele)
v3: s/huc/HuC
- Correct the order of platforms
- Change REVID of cml to 5(Michal)
- Code space changes in huc_def (Daniele)
Chris Wilson [Fri, 20 Sep 2019 08:12:54 +0000 (09:12 +0100)]
Revert "drm/i915/tgl: Implement Wa_1406941453"
Our sanitychecks indicate that while this register is context
saved/restore, the HW does not preserve this bit within the register --
it likely doesn't exist, or one of those mythical bits that the
architects insist does something despite all appearances to the
contrary.
For reference, SAMPLER_MODE is already in i915_reg.h as
GEN10_SAMPLER_MODE and is being setup in icl_ctx_workarounds_init() as
opposed to the chosen location here of rcs_engine_wa_init).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111754 Fixes: 7f0cc34b5349 ("drm/i915/tgl: Implement Wa_1406941453")
Testcase: igt/i915_selftest/live_workarounds Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Stuart Summers <stuart.summers@intel.com> Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Acked-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190920081254.18389-1-chris@chris-wilson.co.uk
Chris Wilson [Thu, 19 Sep 2019 11:19:12 +0000 (12:19 +0100)]
drm/i915: Protect timeline->hwsp dereferencing
As not only is the signal->timeline volatile, so will be acquiring the
timeline's HWSP. We must first carefully acquire the timeline from the
signaling request and then lock the timeline. With the removal of the
struct_mutex serialisation of request construction, we can have multiple
timelines active at once, and so we must avoid using the nested mutex
lock as it is quite possible for both timelines to be establishing
semaphores on the other and so deadlock.
Chris Wilson [Thu, 19 Sep 2019 11:19:11 +0000 (12:19 +0100)]
drm/i915: Lock signaler timeline while navigating
As we need to take a walk back along the signaler timeline to find the
fence before upon which we want to wait, we need to lock that timeline
to prevent it being modified as we walk. Similarly, we also need to
acquire a reference to the earlier fence while it still exists!
Though we lack the correct locking today, we are saved by the
overarching struct_mutex -- but that protection is being removed.
v2: Tvrtko made me realise I was being lax and using annotations to
ignore the AB-BA deadlock from the timeline overlap. As it would be
possible to construct a second request that was using a semaphore from the
same timeline as ourselves, we could quite easily end up in a situation
where we deadlocked in our mutex waits. Avoid that by using a trylock
and falling back to a normal dma-fence await if contended.
v3: Eek, the signal->timeline is volatile and must be carefully
dereferenced to ensure it is valid.
Chris Wilson [Thu, 19 Sep 2019 11:19:10 +0000 (12:19 +0100)]
drm/i915: Mark i915_request.timeline as a volatile, rcu pointer
The request->timeline is only valid until the request is retired (i.e.
before it is completed). Upon retiring the request, the context may be
unpinned and freed, and along with it the timeline may be freed. We
therefore need to be very careful when chasing rq->timeline that the
pointer does not disappear beneath us. The vast majority of users are in
a protected context, either during request construction or retirement,
where the timeline->mutex is held and the timeline cannot disappear. It
is those few off the beaten path (where we access a second timeline) that
need extra scrutiny -- to be added in the next patch after first adding
the warnings about dangerous access.
One complication, where we cannot use the timeline->mutex itself, is
during request submission onto hardware (under spinlocks). Here, we want
to check on the timeline to finalize the breadcrumb, and so we need to
impose a second rule to ensure that the request->timeline is indeed
valid. As we are submitting the request, it's context and timeline must
be pinned, as it will be used by the hardware. Since it is pinned, we
know the request->timeline must still be valid, and we cannot submit the
idle barrier until after we release the engine->active.lock, ergo while
submitting and holding that spinlock, a second thread cannot release the
timeline.
v2: Don't be lazy inside selftests; hold the timeline->mutex for as long
as we need it, and tidy up acquiring the timeline with a bit of
refactoring (i915_active_add_request)
Chris Wilson [Thu, 19 Sep 2019 15:18:11 +0000 (16:18 +0100)]
drm/i915/tgl: Suspend pre-parser across GTT invalidations
Before we execute a batch, we must first issue any and all TLB
invalidations so that batch picks up the new page table entries.
Tigerlake's preparser is weakening our post-sync CS_STALL inside the
invalidate pipe-control and allowing the loading of the batch buffer
before we have setup its page table (and so it loads the wrong page and
executes indefinitely).
The igt_cs_tlb indicates that this issue can only be observed on rcs,
even though the preparser is common to all engines. Alternatively, we
could do TLB shootdown via mmio on updating the GTT.
By inserting the pre-parser disable inside EMIT_INVALIDATE, we will also
accidentally fixup execution that writes into subsequent batches, such
as gem_exec_whisper and even relocations performed on the GPU. We should
be careful not to allow this disable to become baked into the uABI! The
issue is that if userspace relies on our disabling of the HW
optimisation, when we are ready to enable that optimisation, userspace
will then be broken...
Testcase: igt/i915_selftests/live_gtt/igt_cs_tlb
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111753 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190919151811.9526-1-chris@chris-wilson.co.uk
Ville Syrjälä [Wed, 18 Sep 2019 15:07:07 +0000 (18:07 +0300)]
drm/i915: Don't advertise modes that exceed the max plane size
Modern platforms allow the transcoders hdisplay/vdisplay to exceed the
planes' max resolution. This has the nasty implication that modes on the
connectors' mode list may not be usable when the user asks for a
fullscreen plane. Seeing as that is the most common use case it seems
prudent to filter out modes that don't allow for fullscreen planes to
be enabled.
Let's do that in the connetor .mode_valid() hook so that normally
such modes are kept hidden but the user is still able to forcibly
specify such a mode if they know they don't need fullscreen planes.
This is in line with ealier policies regarding certain clock limits.
The idea is to prevent the casual user from encountering a mode that
would fail under typical conditions, but allow the expert user to
force things if they so wish.
Maybe in the future we should consider automagically using two
planes when one can't cover the entire screen? Wouldn't be a
great match for the current uapi with explicit planes though,
but I guess no worse than using two pipes (which we apparently
have to in the future anyway). Either that or we'd have to
teach userspace to do it for us.
v2: Fix icl+ max plane heigth (Manasi)
Cc: Manasi Navare <manasi.d.navare@intel.com> Cc: Leho Kraav <leho@kraav.com> Cc: Sean Paul <sean@poorly.run> Cc: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Manasi Navare <manasi.d.navare@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190918150707.32420-1-ville.syrjala@linux.intel.com
Ville Syrjälä [Thu, 5 Sep 2019 13:50:43 +0000 (16:50 +0300)]
drm/i915: Bump skl+ max plane width to 5k for linear/x-tiled
The officially validated plane width limit is 4k on skl+, however
we already had people using 5k displays before we started to enforce
the limit. Also it seems Windows allows 5k resolutions as well
(though not sure if they do it with one plane or two).
According to hw folks 5k should work with the possible
exception of the following features:
- Ytile (already limited to 4k)
- FP16 (already limited to 4k)
- render compression (already limited to 4k)
- KVMR sprite and cursor (don't care)
- horizontal panning (need to verify this)
- pipe and plane scaling (need to verify this)
So apart from last two items on that list we are already
fine. We should really verify what happens with those last
two items but I don't have a 5k display on hand atm so it'll
have to wait.
In the meantime let's just bump the limit back up to 5k since
several users have already been using it without apparent issues.
At least we'll be no worse off than we were prior to lowering
the limits.
Cc: stable@vger.kernel.org Cc: Sean Paul <sean@poorly.run> Cc: José Roberto de Souza <jose.souza@intel.com> Tested-by: Leho Kraav <leho@kraav.com> Fixes: 372b9ffb5799 ("drm/i915: Fix skl+ max plane width")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111501 Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190905135044.2001-1-ville.syrjala@linux.intel.com Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Sean Paul <sean@poorly.run>
Matt Roper [Wed, 18 Sep 2019 23:56:26 +0000 (16:56 -0700)]
drm/i915: Unify ICP and MCC hotplug pin tables
The MCC hpd table is just a subset of the ICP table; we can eliminate it
and use the ICP table everywhere. The extra pins in the table won't be
a problem for MCC since we still supply an appropriate hotplug trigger
mask anywhere the pin table is used.
Matt Roper [Wed, 18 Sep 2019 23:56:25 +0000 (16:56 -0700)]
drm/i915: Future-proof DDC pin mapping
We generally assume future platforms will inherit the behavior of the
most recent platforms, so update our DDC pin mapping defaults to match
how ICP/TGP behave (i.e., pins starting from GMBUS_PIN_1_BXT for combo
PHY's and pins starting from GMBUS_PIN_9_TC1_ICP for TC PHY's). MCC's
non-standard handling of combo PHY C seems like a platform-specific
quirk that is unlikely to be duplicated on future platforms, so continue
handling it as a special case.
Without this change, future platforms would default to gen4-style pin
mapping which is almost certainly not what we'll want.
Chris Wilson [Wed, 18 Sep 2019 14:54:50 +0000 (15:54 +0100)]
drm/i915: Verify the engine after acquiring the active.lock
When using virtual engines, the rq->engine is not stable until we hold
the engine->active.lock (as the virtual engine may be exchanged with the
sibling). Since commit 22b7a426bbe1 ("drm/i915/execlists: Preempt-to-busy")
we may retire a request concurrently with resubmitting it to HW, we need
to be extra careful to verify we are holding the correct lock for the
request's active list. This is similar to the issue we saw with
rescheduling the virtual requests, see sched_lock_engine().
Our assumption that the we can ask the HW to lock the SFC even if not
currently in use does not match the HW commitment. The expectation from
the HW is that SW will not try to lock the SFC if the engine is not
using it and if we do that the behavior is undefined; on ICL the HW
ends up to returning the ack and ignoring our lock request, but this is
not guaranteed and we shouldn't expect it going forward.
Also, failing to get the ack while the SFC is in use means that we can't
cleanly reset it, so fail the engine reset in that scenario.
v2: drop rmw change, keep the log as debug and handle failure (Chris),
improve comments (Tvrtko).
Reported-by: Owen Zhang <owen.zhang@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190919015330.15435-1-daniele.ceraolospurio@intel.com
Chris Wilson [Tue, 17 Sep 2019 19:47:46 +0000 (20:47 +0100)]
drm/i915: Extend Haswell GT1 PSMI workaround to all
A few times in CI, we have detected a GPU hang on our Haswell GT2
systems with the characteristic IPEHR of 0x780c0000. When the PSMI w/a
was first introducted, it was applied to all Haswell, but later on we
found an erratum that supposedly restricted the issue to GT1 and so
constrained it only be applied on GT1. That may have been a mistake...
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111692 Fixes: 167bc759e823 ("drm/i915: Restrict PSMI context load w/a to Haswell GT1")
References: 2c550183476d ("drm/i915: Disable PSMI sleep messages on all rings around context switches") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190917194746.26710-1-chris@chris-wilson.co.uk
Matt Roper [Mon, 16 Sep 2019 23:32:51 +0000 (16:32 -0700)]
drm/i915/cml: Add second PCH ID for CMP
The CMP PCH ID we have in the driver is correct for the CML-U machines we have
in our CI system, but the CML-S and CML-H CI machines appear to use a
different PCH ID, leading our driver to detect no PCH for them.
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Anusha Srivatsa <anusha.srivatsa@intel.com>
References: 729ae330a0f2e2 ("drm/i915/cml: Introduce Comet Lake PCH")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111461 Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190916233251.387-1-matthew.d.roper@intel.com
Chris Wilson [Tue, 17 Sep 2019 12:30:55 +0000 (13:30 +0100)]
drm/i915/tgl: Extend MI_SEMAPHORE_WAIT
On Tigerlake, MI_SEMAPHORE_WAIT grew an extra dword, so be sure to
update the length field and emit that extra parameter and any padding
noop as required.
v2: Define the token shift while we are adding the updated MI_SEMAPHORE_WAIT
v3: Use int instead of bool in the addition so that readers are not left
wondering about the intricacies of the C spec. Now they just have to
worry what the integer value of a boolean operation is...
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Michal Winiarski <michal.winiarski@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190917123055.28965-1-chris@chris-wilson.co.uk
Chris Wilson [Tue, 17 Sep 2019 08:00:29 +0000 (09:00 +0100)]
drm/i915: Only apply a rmw mmio update if the value changes
If we try to clear, or even set, a bit in the register that doesn't
change the register state; skip the write. There's a slight danger in
that the register acts as a latch-on-write, but I do not think we use a
rmw cycle with any such latch registers.
Suggested-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190917080029.27632-1-chris@chris-wilson.co.uk
Jani Nikula [Mon, 16 Sep 2019 09:29:01 +0000 (12:29 +0300)]
drm/i915: stop conflating HAS_DISPLAY() and disabled display
Stop setting ->pipe_mask to zero when display is disabled, allowing us
to have different code paths for not actually having display hardware,
and having display hardware disabled. This lets us develop those two
avenues independently.
There are no functional changes for when there is no display. However,
all uses of for_each_pipe() and for_each_pipe_masked() will start
running for the disabled display case. Put one of the more significant
ones behind checks for INTEL_DISPLAY_ENABLED(), otherwise the cases
should not be hit with disabled display, or they seem benign. Fingers
crossed.
All in all, this might not be the ideal solution. In fact we may have
had something along the lines of this in the past, but we ended up
conflating the two cases. Possibly even by recommendation by yours
truly; I did not dare dig up that part of the history. But the perfect
is the enemy of the good, this is a straightforward change, and lets us
get actual work done in both fronts without interfering with each other.
Ville Syrjälä [Fri, 13 Sep 2019 19:31:55 +0000 (22:31 +0300)]
drm/i915: Allow downscale factor of <3.0 on glk+ for all formats
Bspec says that glk+ max downscale factor is <3.0 for all pixel formats.
Older platforms had a max of <2.0 for NV12. Update the code to deal with
this.
Jani Nikula [Fri, 13 Sep 2019 10:04:07 +0000 (13:04 +0300)]
drm/i915: introduce INTEL_DISPLAY_ENABLED()
Prepare for making a distinction between not having display and having
disabled display. Add INTEL_DISPLAY_ENABLED() and use it where
HAS_DISPLAY() is used after intel_device_info_runtime_init(). This is
initially duplication, as disabling display still leads to ->pipe_mask =
0 and HAS_DISPLAY() being false.
Note that ever since i915.display_disable was introduced, it has not
affected PCH detection even if it uses HAS_DISPLAY(), as display disable
happens after that.
Since INTEL_DISPLAY_ENABLED() will not make sense unless HAS_DISPLAY()
is true, include a warning for catching misuses making decisions on
INTEL_DISPLAY_ENABLED() when HAS_DISPLAY() is false.
v2: Remove INTEL_DISPLAY_ENABLED() check from intel_detect_pch() (Chris)
Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: José Roberto de Souza <jose.souza@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190913100407.30991-1-jani.nikula@intel.com
Chris Wilson [Thu, 12 Sep 2019 16:08:34 +0000 (17:08 +0100)]
drm/i915: Don't mix srcu tag and negative error codes
While srcu may use an integer tag, it does not exclude potential error
codes and so may overlap with our own use of -EINTR. Use a separate
outparam to store the tag, and report the error code separately.
Fixes: 2caffbf11762 ("drm/i915: Revoke mmaps and prevent access to fence registers across reset") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190912160834.30601-1-chris@chris-wilson.co.uk
On ICL+, the max supported plane height is 4320, so bump it up
To support 4320, we need to increase the number of bits used to
read plane_height to 13 as opposed to older 12 bits.
v4:
* Adjust the width mask also since extra bits are mbz (Ville)
v3:
* Use 0xffff for mask as extra bits are mbz (Ville)
v2:
* ICL plane height supported is 4320 (Ville)
* Add a new line between max width and max height (Jose)
Chris Wilson [Fri, 13 Sep 2019 06:42:00 +0000 (07:42 +0100)]
drm/i915/gtt: Make sure the gen6 ppgtt is bound before first use
As we remove the struct_mutex protection from around the vma pinning,
counters need to be atomic and aware that there may be multiple threads
simultaneously active.
Chris Wilson [Thu, 12 Sep 2019 13:23:13 +0000 (14:23 +0100)]
drm/i915/tgl: Disable preemption while being debugged
We see failures where the context continues executing past a
preemption event, eventually leading to situations where a request has
executed before we have event submitted it to HW! It seems like tgl is
ignoring our RING_TAIL updates, but more likely is that there is a
missing update required for our semaphore waits around preemption.
Chris Wilson [Thu, 12 Sep 2019 12:48:13 +0000 (13:48 +0100)]
drm/i915/pmu: Use GT parked for estimating RC6 while asleep
As we track when we put the GT device to sleep upon idling, we can use
that callback to sample the current rc6 counters and record the
timestamp for estimating samples after that point while asleep.
v2: Stick to using ktime_t
v3: Track user_wakerefs that interfere with the new
intel_gt_pm_wait_for_idle
v4: No need for parked/unparked estimation if !CONFIG_PM
v5: Keep timer park/unpark logic as was
v6: Refactor duplicated estimate/update rc6 logic
v7: Pull intel_get_pm_get_if_awake() out from the pmu->lock.
Jani Nikula [Wed, 11 Sep 2019 20:29:08 +0000 (23:29 +0300)]
drm/i915: convert device info num_pipes to pipe_mask
Replace device info number of pipes with a bit mask of available
pipes. This will prove handy in the future. There's still a bunch of
future work to do to actually allow a non-consecutive mask of pipes, but
it's a start. No functional changes.
Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: José Roberto de Souza <jose.souza@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190911202908.19631-1-jani.nikula@intel.com
drm/i915/pmu: Skip busyness sampling when and where not needed
Since d0aa694b9239 ("drm/i915/pmu: Always sample an active ringbuffer")
the cost of sampling the engine state on execlists platforms became a
little bit higher when both engine busyness and one of the wait states are
being monitored. (Previously the busyness sampling on legacy platforms was
done via seqno comparison so there was no cost of mmio read.)
We can avoid that by skipping busyness sampling when engine supports
software busy stats and so avoid the cost of potential mmio read and
sample accumulation.
Chris Wilson [Thu, 12 Sep 2019 09:29:33 +0000 (10:29 +0100)]
drm/i915/execlists: Ensure the context is reloaded after a GPU reset
After we manipulate the context to allow replay after a GPU reset, force
that context to be reloaded. This should be a layer of paranoia, for if
the GPU was reset, the context will no longer be resident!