Mika Kuoppala [Fri, 24 Apr 2020 21:48:40 +0000 (00:48 +0300)]
drm/i915: Add live selftests for indirect ctx batchbuffers
Indirect ctx batchbuffers are a hw feature of which
batch can be run, by hardware, during context restoration stage.
Driver can setup a batchbuffer and also an offset into the
context image. When context image is marshalled from
memory to registers, and when the offset from the start of
context register state is equal of what driver pre-determined,
batch will run. So one can manipulate context restoration
process at cacheline granularity, given some limitations,
as you need to have rudimentaries in place before you can
run a batch.
Add selftest which will write the ring start register
to a canary spot. This will test that hardware will run a
batchbuffer for the context in question.
v2: request wait fix, naming (Chris)
v3: test order (Chris)
v4: rebase
Mika Kuoppala [Fri, 24 Apr 2020 23:05:46 +0000 (02:05 +0300)]
drm/i915: Add per ctx batchbuffer wa for timestamp
Restoration of a previous timestamp can collide
with updating the timestamp, causing a value corruption.
Combat this issue by using indirect ctx bb to
modify the context image during restoring process.
We can preload value into scratch register. From which
we then do the actual write with LRR. LRR is faster and
thus less error prone as probability of race drops.
v2: tidying (Chris)
v3: lrr for all engines
v4: grp
v5: reg bit
v6: wa_bb_offset, virtual engines (Chris)
Chris Wilson [Fri, 24 Apr 2020 19:14:10 +0000 (20:14 +0100)]
drm/i915: Drop rq->ring->vma peeking from error capture
We only hold the active spinlock while dumping the error state, and this
does not prevent another thread from retiring the request -- as it is
quite possible that despite us capturing the current state, the GPU has
completed the request. As such, it is dangerous to dereference state
below the request as it may already be freed, and the simplest way to
avoid the danger is not include it in the error state.
Chris Wilson [Fri, 24 Apr 2020 16:28:05 +0000 (17:28 +0100)]
drm/i915/gt: Use the RPM config register to determine clk frequencies
For many configuration details within RC6 and RPS we are programming
intervals for the internal clocks. From gen11, these clocks are
configuration via the RPM_CONFIG and so for convenience, we would like
to convert to/from more natural units (ns).
Chris Wilson [Fri, 24 Apr 2020 16:28:04 +0000 (17:28 +0100)]
drm/i915/gt: Trace RPS events
Add tracek to the RPS events (interrupts, worker, enabling, threshold
selection, frequency setting), so that if we have to debug reticent HW
we have some traces to start from.
Chris Wilson [Wed, 22 Apr 2020 00:17:01 +0000 (01:17 +0100)]
drm/i915/gt: Prefer soft-rc6 over RPS DOWN_TIMEOUT
The RPS DOWN_TIMEOUT interrupt is signaled after a period of rc6, and
upon receipt of that interrupt we reprogram the GPU clocks down to the
next idle notch [to help convserve power during rc6]. However, on
execlists, we benefit from soft-rc6 immediately parking the GPU and
setting idle frequencies upon idling [within a jiffie], and here the
interrupt prevents us from restarting from our last frequency.
In the process, we can simply opt for a static pm_events mask and rely
on the enable/disable interrupts to flush the worker on parking.
This will reduce the amount of oscillation observed during steady
workloads with microsleeps, as each time the rc6 timeout occurs we
immediately follow with a waitboost for a dropped frame.
Ville Syrjälä [Wed, 22 Apr 2020 16:19:17 +0000 (19:19 +0300)]
drm/i915: Have pfit calculations return an error code
Change intel_{gmch,pch}_panel_fitting() to return a normal
error vs. success int. We'll need this later to validate that
the margin properties aren't misconfigured.
Ville Syrjälä [Wed, 22 Apr 2020 16:19:16 +0000 (19:19 +0300)]
drm/i915: Pass connector state to pfit calculations
Pass the entire connector state to intel_{gmch,pch}_panel_fitting().
For now we just need to get at .scaling_mode but in the future we'll
want access to the margin properties as well.
Ville Syrjälä [Wed, 22 Apr 2020 16:19:14 +0000 (19:19 +0300)]
drm/i915: Use drm_rect to store the pfit window pos/size
Make things a bit more abstract by replacing the pch_pfit.pos/size
raw register values with a drm_rect. Makes it slighly more convenient
to eg. compute the scaling factors.
Ville Syrjälä [Wed, 22 Apr 2020 16:19:12 +0000 (19:19 +0300)]
drm/i915: Fix skl+ non-scaled pfit modes
Fix skl_update_scaler_crtc() to deal with different scaling
modes correctly. The current implementation assumes
DRM_MODE_SCALE_FULLSCREEN. Fortunately we don't expose any
border properties currently so the code does actually end
up doing the right thing (assigning a scaler for pfit).
The code does need to be fixed before any borders are
exposed.
Also we have redundant calls to skl_update_scaler_crtc() in
dp/hdmi .compute_config() which can be nuked. They were anyway
called before we had even computed the pfit state so were
basically nonsense. The real call we need to keep is in
intel_crtc_atomic_check().
v2: Deal witrh skl_update_scaler_crtc() in intel_dp_ycbcr420_config()
Chris Wilson [Wed, 22 Apr 2020 19:05:58 +0000 (20:05 +0100)]
drm/i915: Only close vma we open
The history of i915_vma_close() is confusing, as is its use. As the
lifetime of the i915_vma is currently bounded by the object it is
attached to, we needed a means of identify when a vma was no longer in
use by userspace (via the user's fd). This is further complicated by
that only ppgtt vma should be closed at the user's behest, as the ggtt
were always shared.
Now that we attach the vma to a lut on the user's context, the open
count does indicate how many unique and open context/vm are referencing
this vma from the user. As such, we can and should just use the
open_count to track when the vma is still in use by userspace.
It's a poor man's replacement for reference counting.
Chris Wilson [Wed, 22 Apr 2020 07:42:03 +0000 (08:42 +0100)]
drm/i915/selftests: Add request throughput measurement to perf
Under ideal circumstances, the driver should be able to keep the GPU
fully saturated with work. Measure how close to ideal we get under the
harshest of conditions with no user payload.
v2: Also measure throughput using only one thread.
Chris Wilson [Thu, 23 Apr 2020 08:59:39 +0000 (09:59 +0100)]
drm/i915/gt: Check carefully for an idle engine in wait-for-idle
intel_gt_wait_for_idle() tries to wait until all the outstanding requests
are retired and the GPU is idle. As a side effect of retiring requests,
we may submit more work to flush any pm barriers, and so the
wait-for-idle tries to flush the background pm work and catch the new
requests. However, if the work completed in the background before we
were able to flush, it would queue the extra barrier request without us
noticing -- and so we would return from wait-for-idle with one request
remaining. (This breaks e.g. record_default_state where we need to wait
until that barrier is retired, and it may slow suspend down by causing
us to wait on the background retirement worker as opposed to immediately
retiring the barrier.)
However, since we track if there has been a submission since the engine
pm barrier, we can very quickly detect if the idle barrier is still
outstanding.
Chris Wilson [Thu, 23 Apr 2020 11:53:15 +0000 (12:53 +0100)]
drm/i915/gt: Carefully order virtual_submission_tasklet
During the virtual engine's submission tasklet, we take the request and
insert into the submission queue on each of our siblings. This seems
quite simply, and so no problems with ordering. However, the sibling
execlists' submission tasklets may run concurrently with the virtual
engine's tasklet, submitting the request to HW before the virtual
finishes its task of telling all the siblings. If this happens, the
sibling tasklet may *reorder* the ve->sibling[] array that the virtual
engine tasklet is processing. This can *only* reorder within the
elements already processed by the virtual engine, nevertheless the
race is detected by KCSAN:
[ 185.580014] BUG: KCSAN: data-race in execlists_dequeue [i915] / virtual_submission_tasklet [i915]
[ 185.580054]
[ 185.580076] write to 0xffff8881f1919860 of 8 bytes by interrupt on cpu 2:
[ 185.580553] execlists_dequeue+0x6ad/0x1600 [i915]
[ 185.581044] __execlists_submission_tasklet+0x48/0x60 [i915]
[ 185.581517] execlists_submission_tasklet+0xd3/0x170 [i915]
[ 185.581554] tasklet_action_common.isra.0+0x42/0x90
[ 185.581585] __do_softirq+0xc8/0x206
[ 185.581613] run_ksoftirqd+0x15/0x20
[ 185.581641] smpboot_thread_fn+0x15a/0x270
[ 185.581669] kthread+0x19a/0x1e0
[ 185.581695] ret_from_fork+0x1f/0x30
[ 185.581717]
[ 185.581736] read to 0xffff8881f1919860 of 8 bytes by interrupt on cpu 0:
[ 185.582231] virtual_submission_tasklet+0x10e/0x5c0 [i915]
[ 185.582265] tasklet_action_common.isra.0+0x42/0x90
[ 185.582291] __do_softirq+0xc8/0x206
[ 185.582315] run_ksoftirqd+0x15/0x20
[ 185.582340] smpboot_thread_fn+0x15a/0x270
[ 185.582368] kthread+0x19a/0x1e0
[ 185.582395] ret_from_fork+0x1f/0x30
[ 185.582417]
We can prevent this race by checking for the ve->request after looking
up the sibling array.
Imre Deak [Wed, 22 Apr 2020 12:34:40 +0000 (15:34 +0300)]
drm/i915/icl: Fix timeout handling during TypeC AUX power well enabling
Fix the check for when an AUX power well enabling timeout is expected on
a legacy TypeC port.
Fixes: 89e01caac641 ("drm/i915: Use single set of AUX powerwell ops for gen11+") Cc: Matt Roper <matthew.d.roper@intel.com> Cc: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200422123440.19522-1-imre.deak@intel.com
Chris Wilson [Wed, 22 Apr 2020 14:17:49 +0000 (15:17 +0100)]
drm/i915/execlists: Drop request-before-CS assertion
When we migrated to execlists, one of the conditions we wanted to test
for was whether the breadcrumb seqno was being written before the
breadcumb interrupt was delivered. This was following on from issues
observed on previous generations which were not so strongly ordered. With
the removal of the missed interrupt detection, we have not reliable
means of detecting the out-of-order seqno/interrupt but instead tried to
assert that the relationship between the CS event interrupt and the
breadwrite should be strongly ordered. However, Icelake proves it is
possible for the HW implementation to forget about minor little details
such as write ordering and so the order between *processing* the CS
event and the breadcrumb is unreliable.
Remove the unreliable assertion, but leave a debug telltale in case we
have reason to suspect.
Chris Wilson [Wed, 22 Apr 2020 07:28:05 +0000 (08:28 +0100)]
drm/i915/gem: Hold obj->vma.lock over for_each_ggtt_vma()
While the ggtt vma are protected by their object lifetime, the list
continues until it hits a non-ggtt vma, and that vma is not protected
and may be freed as we inspect it. Hence, we require the obj->vma.lock
to protect the list as we iterate.
An example of forgetting to hold the obj->vma.lock is
Now to take the spinlock during the list iteration, we need to break it
down into two phases. In the first phase under the lock, we cannot sleep
and so must defer the actual work to a second list, protected by the
ggtt->mutex.
We also need to hold the spinlock during creation of a new vma to
serialise with updates of the tiling on the object.
Reported-by: Dave Airlie <airlied@redhat.com> Fixes: 2850748ef876 ("drm/i915: Pull i915_vma_pin under the vm->mutex") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Dave Airlie <airlied@redhat.com> Cc: <stable@vger.kernel.org> # v5.5+ Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200422072805.17340-1-chris@chris-wilson.co.uk
Chris Wilson [Wed, 22 Apr 2020 10:09:03 +0000 (11:09 +0100)]
drm/i915/selftests: Try to detect rollback during batchbuffer preemption
Since batch buffers dominant execution time, most preemption requests
should naturally occur during execution of a batch buffer. We wish to
verify that should a preemption occur within a batch buffer, when we
come to restart that batch buffer, it occurs at the interrupted
instruction and most importantly does not rollback to an earlier point.
v2: Do not clear the GPR at the start of the batch, but rely on them
being clear for new contexts.
Chris Wilson [Wed, 22 Apr 2020 08:38:55 +0000 (09:38 +0100)]
drm/i915/selftests: Disable heartbeat around RPS interrupt testing
For verifying reciving the EI interrupts, we need to hold the GPU in
very precise conditions (in terms of C0 cycles during the EI). If we
preempt the busy load to handle the heartbeat, this may perturb the busy
load causing us to miss the interrupt.
The other tests, while not as time sensitive, may also be slightly
perturbed, so apply the heartbeat protection across all the
measurements.
Chris Wilson [Tue, 21 Apr 2020 17:13:51 +0000 (18:13 +0100)]
drm/i915/selftests: Unroll the CS frequency loop
Having noticed that MI_BB_START is incurring a memory stall (see the
correlation with uncore frequency), we have to unroll the loop in order
to diminish the impact of the MI_BB_START on the instruction throughput.
Chris Wilson [Tue, 21 Apr 2020 09:25:04 +0000 (10:25 +0100)]
drm/i915/gt: Poison residual state [HWSP] across resume.
Since we may lose the content of any buffer when we relinquish control
of the system (e.g. suspend/resume), we have to be careful not to rely
on regaining control. A good method to detect when we might be using
garbage is by always injecting that garbage prior to first use on
load/resume/etc.
v2: Drop sanitize callback on cleanup
v3: Move seqno reset to timeline enter, so we reset all timelines.
However, this is done on every activation during runtime and not reset.
The similar level of paranoia we apply to correcting context state after
a period of inactivity.
Chris Wilson [Tue, 21 Apr 2020 14:22:36 +0000 (15:22 +0100)]
drm/i915/selftests: Disable C-states when measuring RPS frequency response
Let's isolate the impact of cpu frequency selection on determing the GPU
throughput in response to selection of RPS frequencies.
For real systems, we do have to be concerned with the impact of
integrating c-states, p-states and rp-states, but for the sake of
proving whether or not RPS works, one baby step at a time.
For the record, as one would hope, it does not seem to impact on the
measured performance, but we do it anyway to reduce the number of
variables. Later, we can extend the testing to encourage the the
cpu/pkg to try and sleep while the GPU is busy.
Chris Wilson [Tue, 21 Apr 2020 12:46:36 +0000 (13:46 +0100)]
drm/i915/selftests: Show the full scaling curve on failure
If we detect that the RPS end points do not scale perfectly, take the
time to measure all the in between values as well. We are aborting the
test, so we might as well spend the available time gathering critical
debug information instead.
Jani Nikula [Mon, 20 Apr 2020 13:16:32 +0000 (16:16 +0300)]
drm/i915/audio: fix compressed_bpp check
The early check for compressed_bpp being zero is too early, as it is hit
also when DSC is not enabled. Move the checks down to where the values
are actually needed. This is a paranoid check for a situation that
should not happen, so we don't really care about handling it gracefully
apart from not oopsing.
Fixes: 48b8b04c791d ("drm/i915/display: Enable DP Display Audio WA") Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1750 Cc: Anshuman Gupta <anshuman.gupta@intel.com> Cc: Uma Shankar <uma.shankar@intel.com> Reviewed-by: Uma Shankar <uma.shankar@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200420131632.23283-1-jani.nikula@intel.com
Jani Nikula [Mon, 20 Apr 2020 14:04:38 +0000 (17:04 +0300)]
drm/i915: drop a bunch of superfluous inlines
Remove a number of inlines from .c files, and let the compiler decide
what's best. There's more to do, but need to start somewhere, and need
to start setting the example.
Matt Roper [Wed, 15 Apr 2020 23:34:35 +0000 (16:34 -0700)]
drm/i915: Use single set of AUX powerwell ops for gen11+
AUX power wells sometimes need additional handling besides just
programming the specific power well registers:
* Type-C PHY's also require additional Type-C register programming
* ICL combo PHY's require additional workarounds
* TGL & EHL combo PHY's can be treated like any other power well
Today we have dedicated aux ops for the ICL combo PHY and Type-C cases.
This works fine, but means that when a new platform shows up with
identical general power well handling, but different types of PHYs on
its outputs, we have to define an entire new power well table for that
platform and can't just re-use the table from the earlier platform -- as
an example, see ehl_power_wells[], which is a subset of
icl_power_wells[], *except* that we need to specify different AUX ops
for the third display.
If we instead create a single set of top-level aux ops that will check
the PHY type and then dispatch to the appropriate handlers, we can get
more reuse out of our power well definitions. This allows us to
immediately eliminate ehl_power_wells[] and simply reuse the ICL table;
if future platforms follow the same general power well assignments as
either ICL or TGL, we'll be able to re-use those tables in the same way.
Note that I've only changed ICL+ platforms over to using the new icl_aux
ops; at this point it's unlikely that we'll have any new platforms that
re-use gen9 or earlier power well configurations.
v2:
- ICL_AUX_PW_TO_PHY() won't return the proper PHY for TBT AUX power
wells. But we know those wells will only used on Type-C outputs
anyway, so we can just check is is_tc_tbt flag in the condition.
(Jose).
igt_ppgtt_pin_update() invokes i915_gem_context_get_vm_rcu(), which
returns a reference of the i915_address_space object to "vm" with
increased refcount.
When igt_ppgtt_pin_update() returns, "vm" becomes invalid, so the
refcount should be decreased to keep refcount balanced.
The reference counting issue happens in two exception handling paths of
igt_ppgtt_pin_update(). When i915_gem_object_create_internal() returns
IS_ERR, the refcnt increased by i915_gem_context_get_vm_rcu() is not
decreased, causing a refcnt leak.
Fix this issue by jumping to "out_vm" label when
i915_gem_object_create_internal() returns IS_ERR.
Chris Wilson [Mon, 20 Apr 2020 17:27:39 +0000 (18:27 +0100)]
drm/i915/selftests: Exercise dynamic reclocking with RPS
After having testing all the RPS controls individually, we need to take
a step back and check how our RPS worker integrates them to perform
dynamic GPU reclocking. So do that by submitting a spinner and wait and
see what happens.
Chris Wilson [Mon, 20 Apr 2020 17:27:35 +0000 (18:27 +0100)]
drm/i915/selftests: Skip energy consumption tests if not controlling freq
If we can not manipulate the frequency with RPS, then comparing min/max
power consumption is pointless / misleading. We will leave the warning
about not being able to control the frequency selection via RPS to other
tests so as not to confuse this more specialised check.
Chris Wilson [Mon, 20 Apr 2020 17:27:34 +0000 (18:27 +0100)]
drm/i915/selftests: Verify frequency scaling with RPS
One of the core tenents of reclocking the GPU is that its throughput
scales with the clock frequency. We can observe this by incrementing a
loop counter on the GPU, and compare the different execution rates at
the notional RPS frequencies.
Ville Syrjälä [Fri, 17 Apr 2020 15:27:34 +0000 (18:27 +0300)]
drm/i915: Push MST link retraining to the hotplug work
We shouldn't try to do link retraining from the short hpd handler.
We can't take any modeset locks there so this is racy as hell.
Push the whole thing into the hotplug work like we do with SST.
We'll just have to adjust the SST retraining code to deal with
the MST encoders and multiple pipes.
TODO: I have a feeling we should just rip this all out and
do a full modeset instead. Stuff like port sync and the tgl+
MST master transcoder stuff maybe doesn't work well if we
try to retrain without following the proper modeset sequence.
So far haven't done any actual tests to confirm that though.
Ville Syrjälä [Fri, 17 Apr 2020 15:27:33 +0000 (18:27 +0300)]
drm/i915: Flatten intel_dp_check_mst_status() a bit
Make intel_dp_check_mst_status() somewhat legible by humans.
Note that the return value of drm_dp_mst_hpd_irq() is always
either 0 or -ENOMEM, and we never did anything with the latter
so we can just ignore the whole thing.
We can also get rid of the direct drm_dp_mst_topology_mgr_set_mst(false)
call since returning -EINVAL causes the caller to do the very same call
for us.
Ville Syrjälä [Fri, 17 Apr 2020 13:47:19 +0000 (16:47 +0300)]
drm/i915: Push TRANS_DDI_FUNC_CTL into the encoder->enable() hook
Push the TRANS_DDI_FUNC_CTL into the encoder enable hook. The disable
is already there, and as a followup will enable us to pass the encoder
all the way down.
Ville Syrjälä [Fri, 17 Apr 2020 13:47:18 +0000 (16:47 +0300)]
drm/i915: Move the TRANS_DDI_FUNC_CTL enable to a later point
No reason that I can see why we should enable TRANS_DDI_FUNC_CTL
before we set up the watermarks of configure the mbus stuff.
In fact reordering these seems to match the bspec sequence better,
and crucially will allow us to push the TRANS_DDI_FUNC_CTL enable
into the encoder enable hook as a followup.
Ville Syrjälä [Fri, 17 Apr 2020 13:47:17 +0000 (16:47 +0300)]
drm/i915: Pass encoder to intel_ddi_enable_pipe_clock()
Since intel_ddi_enable_pipe_clock() was pushed down into the
encoder hooks we can pass on the encoder instead of having
to use intel_ddi_get_crtc_encoder().
Peter Jones [Fri, 6 Jul 2018 19:04:24 +0000 (15:04 -0400)]
Make the "Reducing compressed framebufer size" message be DRM_INFO_ONCE()
This was sort of annoying me:
random:~$ dmesg | tail -1
[523884.039227] [drm] Reducing the compressed framebuffer size. This may lead to less power savings than a non-reduced-size. Try to increase stolen memory size if available in BIOS.
random:~$ dmesg | grep -c "Reducing the compressed"
47
This patch makes it DRM_INFO_ONCE() just like the similar message
farther down in that function is pr_info_once().
Cc: stable@vger.kernel.org Signed-off-by: Peter Jones <pjones@redhat.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1745 Link: https://patchwork.freedesktop.org/patch/msgid/20180706190424.29194-1-pjones@redhat.com
[vsyrjala: Rebase due to per-device logging] Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Chris Wilson [Mon, 20 Apr 2020 12:53:55 +0000 (13:53 +0100)]
drm/i915/gem: Remove object_is_locked assertion from unpin_from_display_plane
Since moving the obj->vma.list to a spin_lock, and the vm->bound_list to
its vm->mutex, along with tracking shrinkable status under its own
spinlock, we no long require the object to be locked by the caller.
This is fortunate as it appears we can be called with the lock along an
error path in flipping:
Jani Nikula [Fri, 17 Apr 2020 06:51:32 +0000 (09:51 +0300)]
drm/i915/audio: error log non-zero audio power refcount after unbind
We have some module unload/reload tests hitting an issue with i915
unbinding the component interface before the audio driver has properly
put the power. Log an error about it for ease of debugging. (Normally
this leads to a wakeref debug splat on the power well.)
Fix the warning caused by enabling the autosectionlabel extension in the
kernel Sphinx build:
Documentation/gpu/i915.rst:610: WARNING: duplicate label
gpu/i915:layout, other instance in Documentation/gpu/i915.rst
The autosectionlabel extension adds labels to each section title for
cross-referencing, but forbids identical section titles in a
document. With kernel-doc, this includes sections titles in the included
kernel-doc comments.
In the warning message, Sphinx is unable to reference the labels in
their true locations in the kernel-doc comments in source. In this case,
there's "Layout" sections in both gt/intel_workarounds.c and
i915_reg.h. Rename the section in the latter to "File Layout".
Michael J. Ruhl [Fri, 17 Apr 2020 19:51:07 +0000 (15:51 -0400)]
drm/i915: Refactor setting dma info to a common helper
DMA_MASK bit values are different for different generations.
This will become more difficult to manage over time with the open
coded usage of different versions of the device.
Fix by:
disallow setting of dma mask in AGP path (< GEN(5) for i915,
add dma_mask_size to the device info configuration,
updating open code call sequence to the latest interface,
refactoring into a common function for setting the dma segment
and mask info
Colin Ian King [Fri, 17 Apr 2020 16:08:29 +0000 (17:08 +0100)]
drm/i915: remove redundant assignment to variable test_result
The variable test_result is being initialized with a value that is
never read and it is being updated later with a new value. The
initialization is redundant and can be removed.
drm/i915/display: Load DP_TP_CTL/STATUS offset before use it
Right now dp.regs.dp_tp_ctl/status are only set during the encoder
pre_enable() hook, what is causing all reads and writes to those
registers to go to offset 0x0 before pre_enable() is executed.
So if i915 takes the BIOS state and don't do a modeset any following
link retraing will fail.
In the case that i915 needs to do a modeset, the DDI disable sequence
will write to a wrong register not disabling DP 'Transport Enable' in
DP_TP_CTL, making a HDMI modeset in the same port/transcoder to
not light up the monitor.
So here for GENs older than 12, that have those registers fixed at
port offset range it is loading at encoder/port init while for GEN12
it will keep setting it at encoder pre_enable() and during HW state
readout.
Fixes: 4444df6e205b ("drm/i915/tgl: move DP_TP_* to transcoder") Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200414230442.262092-1-jose.souza@intel.com
drm/i915/tc: Catch TC users accessing FIA registers without enable aux
As described in "drm/i915/tc/icl: Implement TC cold sequences" users
of TC functions should held aux power well during access to avoid
read garbage due HW in TC cold state.
v3:
- renamed is_tc_cold_blocked() to assert_tc_cold_blocked()
- restored the removed 0xffffffff checks
TC ports can enter in TCCOLD to save power and is required to request
to PCODE to exit this state before use or read to TC registers.
For TGL there is a new MBOX command to do that with a parameter to ask
PCODE to exit and block TCCOLD entry or unblock TCCOLD entry.
So adding a new power domain to reuse the refcount and only allow
TC cold when all TC ports are not in use.
v2:
- fixed missing case in intel_display_power_domain_str()
- moved tgl_tc_cold_request to intel_display_power.c
- renamed TGL_TC_COLD_OFF to TGL_TC_COLD_OFF_POWER_DOMAINS
- added all TC and TBT aux power domains to
TGL_TC_COLD_OFF_POWER_DOMAINS
v3:
- added one msec sleep when PCODE returns -EAGAIN
- added timeout of 5msec to not loop forever if
sandybridge_pcode_write_timeout() keeps returning -EAGAIN
v4:
- Made failure to block or unblock TC cold a error
- removed 5msec timeout, instead giving PCODE 1msec by up 3 times to
recover from the internal error
drm/i915/tc: Skip ref held check for TC legacy aux power wells
As part of ICL TC cold exit sequences we need to request aux power
well before lock the access to TC ports, so skiping the
intel_tc_port_ref_held() check for TC legacy ports.
This is required for legacy/static TC ports as IOM is not aware of
the connection and will not trigger the TC cold exit.
Just request PCODE to exit TCCOLD is not enough as it could enter
again before driver makes use of the port, to prevent it BSpec states
that aux powerwell should be held.
So here embedding the TC cold exit sequence into ICL aux enable,
it will enable aux and then request TC cold to exit.
The TC cold block(exit and aux hold) and unblock was added to some
exported TC functions for the others and to access PHY registers,
callers should enable and keep aux powerwell enabled during access.
Also adding TC cold check and warnig in tc_port_load_fia_params() as
at this point of the driver initialization we can't request power
wells, if we get this warning we will need to figure out how to handle
it.
v2:
- moved ICL TC cold exit function to intel_display_power
- using dig_port->tc_legacy_port to only execute sequences for legacy
ports, hopefully VBTs will have this right
- fixed check to call _hsw_power_well_continue_enable()
- calling _hsw_power_well_continue_enable() unconditionally in
icl_tc_phy_aux_power_well_enable(), if needed we will surpress timeout
warnings of TC legacy ports
- only blocking TC cold around fia access
v3:
- added timeout of 5msec to not loop forever if
sandybridge_pcode_write_timeout() keeps returning -EAGAIN
returning -EAGAIN in in icl_tc_cold_exit()
- removed leftover tc_cold_wakeref
- added one msec sleep when PCODE returns -EAGAIN
v4:
- removed 5msec timeout, instead giving 1msec to whoever is using
PCODE to finish it up to 3 times
- added a comment about turn TC cold exit failure as a error in future
BSpec: 21750 Closes: https://gitlab.freedesktop.org/drm/intel/issues/1296 Cc: Imre Deak <imre.deak@intel.com> Cc: Cooper Chiou <cooper.chiou@intel.com> Cc: Kai-Heng Feng <kai.heng.feng@canonical.com> Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200414194956.164323-4-jose.souza@intel.com
This is a similar function to intel_aux_power_domain() but it do not
care about TBT ports, this will be needed by ICL TC sequences.
v2:
- renamed to intel_legacy_aux_to_power_domain()
Cc: Imre Deak <imre.deak@intel.com> Cc: Cooper Chiou <cooper.chiou@intel.com> Cc: Kai-Heng Feng <kai.heng.feng@canonical.com> Reviewed-by: Imre Deak <imre.deak@intel.com> Tested-by: You-Sheng Yang <vicamo.yang@canonical.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200414194956.164323-2-jose.souza@intel.com
drm/i915/display: Move out code to return the digital_port of the aux ch
Moving the code to return the digital port of the aux channel also
removing the intel_phy_is_tc() to make it generic.
digital_port will be needed in icl_tc_phy_aux_power_well_enable()
so adding it as a parameter to icl_tc_port_assert_ref_held().
While at at removing the duplicated call to icl_tc_phy_aux_ch() in
icl_tc_port_assert_ref_held().
v2:
- fixed build when DRM_I915_DEBUG_RUNTIME_PM is not set
- moved to before hsw_wait_for_power_well_enable() as it will be
needed by hsw_wait_for_power_well_enable() in a future patch
v4:
- fixed action of if (!dig_port), continue instead of return
drm/i915: Add missing deinitialization cases of load failure
The intel_display_power_put_async() used in TC cold sequences made
easy to hit the missing deinitialization of driver in case of load
failure as seen in the stack trace bellow.
intel_modeset_driver_remove_noirq() had to be removed from
i915_driver_modeset_remove_noirq() as those are different
initialialition steps with IRQ and GEM initialization in between then.
v2:
- fixed handling in case of failure in drm_vblank_init()
- moved i915_gem_driver_remove() call to before
i915_driver_modeset_remove_noirq() this match initialization order too
v3:
- reverting call swap between i915_reset_error_state() and i915_gem_driver_remove()
call order
- improved label naming in i915_driver_modeset_probe_noirq()
Closes: https://gitlab.freedesktop.org/drm/intel/issues/1647 Cc: Imre Deak <imre.deak@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200416185841.125686-1-jose.souza@intel.com
Some workarounds are not sticking across suspend resume cycles. The
forcewake ranges table has been updated and would reflect the hardware
appropriately.
Closes: https://gitlab.freedesktop.org/drm/intel/issues/1222
v2: Simplify the table and use 0 for some unused ranges(Matt)
Chris Wilson [Fri, 17 Apr 2020 15:20:18 +0000 (16:20 +0100)]
drm/i915/selftests: Check power consumption at min/max frequencies
A basic premise of RPS is that at lower frequencies, not only do we run
slower, but we save power compared to higher frequencies. For example,
when idle, we set the minimum frequency just in case there is some
residual current. Since the power curve should be a physical
relationship, if we find no power saving it's likely that we've broken
our frequency handling, so test!
Lets have a unified way to handle SAGV changes,
espoecially considering the upcoming Gen12 changes.
Current "standard" way of doing this in commit_tail
is pre/post plane updates, when everything which
has to be forbidden and not supported in new config
has to be restricted before update and relaxed after
plane update.
drm/i915: Prepare to extract gen specific functions from intel_can_enable_sagv
Addressing one of the comments, recommending to extract platform
specific code from intel_can_enable_sagv as a preparation, before
we are going to add support for tgl+.
v2: - Removed whitespace
v3: - Removed premature debug and new cycle introduction(Ville)
- Added missing no active pipes check(Ville)
v4: - Fixed stupid mistake with plane_state caused by stupid macro change
Add correspondent helpers to be able to get old/new bandwidth
global state object.
v2: - Fixed typo in function call
v3: - Changed new functions naming to use convention proposed
by Jani Nikula, i.e intel_bw_* in intel_bw.c file.
v4: - Change function naming back to intel_atomic* pattern,
was decided to rename in a separate patch series.
v5: - Fix function naming to match existing practices(Ville)
v6: - Removed spurious whitespace
v7: - Removed bw_state NULL checks(Ville)
Chris Wilson [Fri, 17 Apr 2020 09:39:28 +0000 (10:39 +0100)]
drm/i915/selftests: Take the engine wakeref around __rps_up_interrupt
Since we are touching the device to read the registers, we are required
to ensure the device is awake at the time. Currently, we believe
ourselves to be inside the active request [thus an active engine
wakeref], but since that may be retired in the background, we can
spontaneously lose the wakeref and the ability to probe the HW.
Chris Wilson [Fri, 17 Apr 2020 09:39:27 +0000 (10:39 +0100)]
drm/i915/selftests: Delay spinner before waiting for an interrupt
It seems that although (perhaps because of the memory stall?) the
spinner has signaled that it has started, it still takes some time to
spin up to 100% utilisation of the HW. Since the test depends on the
full utilisation of the HW to trigger the RPS interrupt, wait a little
bit and flush the interrupt status to be sure that the event we see if
from the spinner.
Chris Wilson [Thu, 16 Apr 2020 11:41:17 +0000 (12:41 +0100)]
drm/i915/gt: Scrub execlists state on resume
Before we resume, we reset the HW so we restart from a known good state.
However, as a part of the reset process, we drain our pending CS event
queue -- and if we are resuming that does not correspond to internal
state. On setup, we are scrubbing the CS pointers, but alas only on
setup.
Apply the sanitization not just to setup, but to all resumes.
Uma Shankar [Thu, 16 Apr 2020 10:54:19 +0000 (16:24 +0530)]
drm/i915/display: Enable DP Display Audio WA
For certain DP VDSC bpp settings, hblank asserts before hblank_early,
leading to a bad audio state. Driver need to program "hblank early
enable" and "samples per line" parameters in AUDIO_CONFIG_BE
register.
This is Display Audio WA #1406928334 for 4k+VDSC usecase
applicable on DP encoders. Implemented the same.
v2: Fixed build failures on 32bit machine.
v3: Dropped u64, added helpers for sample room calculation,
other general comments as per Jani Nikula's feedback.
Also fixed connector type check (spotted by Anshuman)
v4: Addressed Jani Nikula and Kai's review comments.
v5: Addressed Anshuman's review comment and used crtc_* variable
to get timings.
It requires a separate debugfs attribute to expose lpsp
status to user space, as there may be display less configuration
without any valid connected output, those configuration will not be
able to test lpsp status, if lpsp status exposed from a connector
based debugfs attribute.
New i915_pm_lpsp igt solution approach relies on connector specific
debugfs attribute i915_lpsp_capability, it exposes whether an output is
capable of driving lpsp.
v2:
- CI fixup.
v3:
- register i915_lpsp_info only for supported connector. [Jani]
- use intel_display_power_well_is_enabled() instead of looking
inside power_well count. [Jani]
- fixes the lpsp capable conditional logic. [Jani]
- combined the lpsp capable and enable info. [Jani]
v4:
- Separate out connector based debugfs i915_lpsp_capability
lpsp enable status would be exposes by different entry. [Animesh]
v5:
- Add Platform Gen condition to add i915_lpsp_capability
and some cosmetic nitpick changes. [Animesh]
Gen11 onwards PG3 is contains functions for pipe B,
external displays, and VGA. It make sense to add
a power well id with name ICL_DISP_PW_3 rather then
TGL_DISP_PW_3, Also PG3 power well id requires to
know if lpsp is enabled.
Merge tag 'topic/phy-compliance-2020-04-08' of git://anongit.freedesktop.org/drm/drm-misc into drm-intel-next-queued
Topic pull request for topic/phy-compliance:
- Standardize DP_PHY_TEST_PATTERN name.
- Add support for setting/getting test pattern from sink.
- Implement DP PHY compliance to i915.
drm/i915: Add YUV444 packed format support for skl+
PLANE_CTL_FORMAT_AYUV is already supported, according to hardware
specification.
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Signed-off-by: Bob Paauwe <bob.j.paauwe@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200407215546.5445-2-bob.j.paauwe@intel.com
Chris Wilson [Wed, 15 Apr 2020 17:03:18 +0000 (18:03 +0100)]
drm/i915/gt: Update PMINTRMSK holding fw
If we use a non-forcewaked write to PMINTRMSK, it does not take effect
until much later, if at all, causing a loss of RPS interrupts and no GPU
reclocking, leaving the GPU running at the wrong frequency for long
periods of time.
Reported-by: Francisco Jerez <currojerez@riseup.net> Suggested-by: Francisco Jerez <currojerez@riseup.net> Fixes: 35cc7f32c298 ("drm/i915/gt: Use non-forcewake writes for RPS") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Francisco Jerez <currojerez@riseup.net> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Andi Shyti <andi.shyti@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Cc: <stable@vger.kernel.org> # v5.6+ Link: https://patchwork.freedesktop.org/patch/msgid/20200415170318.16771-2-chris@chris-wilson.co.uk
Since we depend upon RPS generating interrupts after evaluation
intervals to determine when to up/down clock the GPU, it is imperative
that we successfully enable interrupt generation! Verify that we do see
an interrupt if we keep the GPU busy for an entire EI.