drm/vblank: Data type fixes for 64-bit vblank sequences.
drm_vblank_count() has an u32 type returning what is a 64-bit vblank count.
The effect of this is when drm_wait_vblank_ioctl() tries to widen the user
space requested vblank sequence using this clipped 32-bit count(when the
value is >= 2^32) as reference, the requested sequence remains a 32-bit
value and gets queued like that. However, the code that checks if the
requested sequence has passed compares this against the 64-bit vblank
count.
With drm_vblank_count() returning all bits of the vblank count, update
drm_crtc_accurate_vblank_count() so that drm_crtc_arm_vblank_event() queues
the correct sequence. Otherwise, this leads to prolonged waits for a vblank
sequence when the current count is >=2^32.
drm/i915/selftests: fix inconsistent IS_ERR and PTR_ERR
Fix inconsistent IS_ERR and PTR_ERR in shrink_boom.
The proper pointer to use is _explode_ instead of _purge_.
This issue was detected with the help of Coccinelle.
Fixes: fe215c8bc426 ("drm/i915/selftests: add missing gtt shrinker test") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180214211234.GA22341@embeddedgus Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Chris Wilson [Thu, 15 Feb 2018 11:07:59 +0000 (11:07 +0000)]
drm/i915/gtt: Convert WARN_ON to GEM debugging
As we presume that we have sufficient coverage of CI for new machines
and new code paths, we do not need to have user impacting WARN_ON for
programming errors inside i915_gem_gtt.c, so convert those over to
GEM_BUG_ON. This leaves the memory debugging WARN_ON in place as they
are not so easy to exercise with CI.
Chris Wilson [Wed, 14 Feb 2018 16:07:20 +0000 (16:07 +0000)]
drm/i915: Clean up ancient doc comments for i915_ioc32.c
As befitting a file dedicated to the mistakes of the past,
drivers/gpu/drm/i915/i915_ioc32.c:2: warning: Cannot understand * \file i915_ioc32.c
on line 2 - I thought it was a doc line
drivers/gpu/drm/i915/i915_ioc32.c:82: warning: Function parameter or member 'filp' not described in 'i915_compat_ioctl'
drivers/gpu/drm/i915/i915_ioc32.c:82: warning: Function parameter or member 'cmd' not described in 'i915_compat_ioctl'
drivers/gpu/drm/i915/i915_ioc32.c:82: warning: Function parameter or member 'arg' not described in 'i915_compat_ioctl'
Rodrigo Vivi [Wed, 14 Feb 2018 20:42:05 +0000 (12:42 -0800)]
drm/i915/cnl: Remove alpha_support protection
We now have a stable cnl on our CI and it seems mostly
green without big risks of blank screen or anything
blowing up on linux installations in the future.
As a reminder i915.alpha_support was created to protect
future linux installation's iso images that might contain a
kernel from the enabling time of the new platform. Without this
protection most of linux installation was recommending
nomodeset option during installation that was getting stick
there after installation.
Specifically, alpha support says nothing about the development
state of the hardware, and everything about the state of the
driver in a kernel release.
This is semantically no different from the old
preliminary_hw_support flag, but the old one was all too often
interpreted as (preliminary hw) support instead of the intended
(preliminary) hw support, and it was misleading for everyone.
Hence the rename.
v2: Fix the typos and include more history about the parameter
rename on commit message. (Jani)
Reference: https://intel-gfx-ci.01.org/tree/drm-tip/fi-cnl-y3.html Cc: James Ausmus <james.ausmus@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Jani Saarinen <jani.saarinen@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180214204205.4446-1-rodrigo.vivi@intel.com
Rodrigo Vivi [Thu, 8 Feb 2018 07:32:19 +0000 (23:32 -0800)]
drm/i915/cnl: Sync PCI ID with Spec.
Add one missing PCI ID and sort them in a way
that gets easier to review and compare against spec's
table.
When trying to sync libdrm and mesa id list with kernel
and spec I noticed something was wrong and we were missing
a pci id. So to make our lives easier when checking against
spec let's simplify and sort like spec does.
drm/i915: Fix rsvd2 mask when out-fence is returned
GENMASK_ULL wants the high bit of the mask first. The current value
cancels the in-fence when an out-fence is returned.
Fixes: fec0445caa273 ("drm/i915: Support explicit fencing for execbuf")
Testcase: igt/gem_exec_fence/keep-in-fence* Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20180214191827.8465-1-daniele.ceraolospurio@intel.com Cc: <stable@vger.kernel.org> # v4.12+
Chris Wilson [Wed, 14 Feb 2018 14:03:03 +0000 (14:03 +0000)]
drm/i915: Fixup kerneldoc for intel_pm.c
drivers/gpu/drm/i915/intel_pm.c:750: warning: Function parameter or member 'fifo_size' not described in 'intel_calculate_wm'
drivers/gpu/drm/i915/intel_pm.c:5900: warning: Function parameter or member 'crtc' not described in 'intel_update_watermarks'
Chris Wilson [Wed, 14 Feb 2018 13:49:22 +0000 (13:49 +0000)]
drm/i915: Fixup kerneldoc
drivers/gpu/drm/i915/intel_display.c:569: warning: Function parameter or member 'dev_priv' not described in 'intel_PLL_is_valid'
drivers/gpu/drm/i915/intel_display.c:569: warning: Function parameter or member 'limit' not described in 'intel_PLL_is_valid'
drivers/gpu/drm/i915/intel_display.c:569: warning: Function parameter or member 'clock' not described in 'intel_PLL_is_valid'
drivers/gpu/drm/i915/intel_display.c:4769: warning: Function parameter or member 'crtc_state' not described in 'skl_update_scaler_plane'
drivers/gpu/drm/i915/intel_display.c:4769: warning: Excess function parameter 'state' description in 'skl_update_scaler_plane'
drivers/gpu/drm/i915/intel_display.c:4967: warning: Function parameter or member 'new_crtc_state' not described in 'intel_post_enable_primary'
drivers/gpu/drm/i915/intel_display.c:12650: warning: Function parameter or member 'new_state' not described in 'intel_prepare_plane_fb'
drivers/gpu/drm/i915/intel_display.c:12650: warning: Excess function parameter 'fb' description in 'intel_prepare_plane_fb'
drivers/gpu/drm/i915/intel_display.c:12763: warning: Function parameter or member 'old_state' not described in 'intel_cleanup_plane_fb'
drivers/gpu/drm/i915/intel_display.c:12763: warning: Excess function parameter 'fb' description in 'intel_cleanup_plane_fb'
Chris Wilson [Wed, 14 Feb 2018 13:49:21 +0000 (13:49 +0000)]
drm/i915/atomic: Fixup kerneldoc
drivers/gpu/drm/i915/intel_atomic.c:198: warning: Function parameter or member 'state' not described in 'intel_crtc_destroy_state'
drivers/gpu/drm/i915/intel_atomic.c:222: warning: Function parameter or member 'intel_crtc' not described in 'intel_atomic_setup_scalers'
drivers/gpu/drm/i915/intel_atomic.c:222: warning: Excess function parameter 'crtc' description in 'intel_atomic_setup_scalers'
Chris Wilson [Wed, 14 Feb 2018 10:53:32 +0000 (10:53 +0000)]
drm/i915: Fixup kerneldoc for intel_uc_fw_upload()
Just a parameter name change that was lost to kerneldoc.
drivers/gpu/drm/i915/intel_uc_fw.c:209: warning: Function parameter or member 'xfer' not described in 'intel_uc_fw_upload'
drivers/gpu/drm/i915/intel_uc_fw.c:209: warning: Excess function parameter 'loader' description in 'intel_uc_fw_upload'
Chris Wilson [Wed, 14 Feb 2018 10:40:40 +0000 (10:40 +0000)]
drm/i915: Add missing kerneldoc parameters for huc_ucode_xfer
During the recent upheaval to uc, the parameters to huc_ucode_xfer
were changed, but the kerneldoc left behind.
drivers/gpu/drm/i915/intel_huc.c:128: warning: Function parameter or member 'huc_fw' not described in 'huc_ucode_xfer'
drivers/gpu/drm/i915/intel_huc.c:128: warning: Function parameter or member 'vma' not described in 'huc_ucode_xfer'
drivers/gpu/drm/i915/intel_huc.c:128: warning: Excess function parameter 'dev_priv' description in 'huc_ucode_xfer'
Chris Wilson [Wed, 14 Feb 2018 09:29:09 +0000 (09:29 +0000)]
drm/i915/lvds: Fixup commentary
Remove the kerneldoc markup applied to non-kerneldoc comments and
convert the multiline comments to the canonical style.
drivers/gpu/drm/i915/intel_lvds.c:313: warning: Function parameter or member 'encoder' not described in 'intel_enable_lvds'
drivers/gpu/drm/i915/intel_lvds.c:313: warning: Function parameter or member 'pipe_config' not described in 'intel_enable_lvds'
drivers/gpu/drm/i915/intel_lvds.c:313: warning: Function parameter or member 'conn_state' not described in 'intel_enable_lvds'
drivers/gpu/drm/i915/intel_lvds.c:453: warning: Function parameter or member 'connector' not described in 'intel_lvds_detect'
drivers/gpu/drm/i915/intel_lvds.c:453: warning: Function parameter or member 'force' not described in 'intel_lvds_detect'
drivers/gpu/drm/i915/intel_lvds.c:471: warning: Function parameter or member 'connector' not described in 'intel_lvds_get_modes'
drivers/gpu/drm/i915/intel_lvds.c:932: warning: Function parameter or member 'dev_priv' not described in 'intel_lvds_init'
drivers/gpu/drm/i915/intel_lvds.c:932: warning: Excess function parameter 'dev' description in 'intel_lvds_init'
Chris Wilson [Wed, 14 Feb 2018 09:29:08 +0000 (09:29 +0000)]
drm/i915/dvo: Fixup commentary
Remove the kerneldoc markup applied to non-kerneldoc comments and
convert the multiline comments to the canonical style.
drivers/gpu/drm/i915/intel_dvo.c:303: warning: Function parameter or member 'connector' not described in 'intel_dvo_detect'
drivers/gpu/drm/i915/intel_dvo.c:303: warning: Function parameter or member 'force' not described in 'intel_dvo_detect'
drivers/gpu/drm/i915/intel_dvo.c:382: warning: Function parameter or member 'encoder' not described in 'intel_dvo_get_current_mode'
Chris Wilson [Wed, 14 Feb 2018 09:29:07 +0000 (09:29 +0000)]
drm/i915/dvo: Remove incorrect kerneldoc markups
Regular comments where being marked up for kerneldoc, but were not
formatted properly. Remove the markup to remove the warnings.
drivers/gpu/drm/i915/dvo_ivch.c:192: warning: Function parameter or member 'dvo' not described in 'ivch_read'
drivers/gpu/drm/i915/dvo_ivch.c:192: warning: Function parameter or member 'addr' not described in 'ivch_read'
drivers/gpu/drm/i915/dvo_ivch.c:192: warning: Function parameter or member 'data' not described in 'ivch_read'
Chris Wilson [Wed, 14 Feb 2018 09:17:47 +0000 (09:17 +0000)]
drm/i915/panel: Split range scaling calculation for readiblity
Split the 64b multiplication from the division so that it doesn't sprawl
across a couple of lines and use mul_u32_u32() instead of open-coding
the 64b conversion.
Chris Wilson [Wed, 14 Feb 2018 09:17:46 +0000 (09:17 +0000)]
drm/i915/panel: Add missing parameters to kerneldoc
drivers/gpu/drm/i915/intel_panel.c:409: warning: Function parameter or member 'source_min' not described in 'scale'
drivers/gpu/drm/i915/intel_panel.c:409: warning: Function parameter or member 'source_max' not described in 'scale'
drivers/gpu/drm/i915/intel_panel.c:409: warning: Function parameter or member 'target_min' not described in 'scale'
drivers/gpu/drm/i915/intel_panel.c:409: warning: Function parameter or member 'target_max' not described in 'scale'
Chris Wilson [Wed, 14 Feb 2018 09:09:05 +0000 (09:09 +0000)]
drm/i915/sdvo: Tidy up commentary
Drop the kerneldoc markup from the non-kerneldoc comments and convert
the multi-line comments to the canonical format.
drivers/gpu/drm/i915/intel_sdvo.c:223: warning: Function parameter or member 'intel_sdvo' not described in 'intel_sdvo_write_sdvox'
drivers/gpu/drm/i915/intel_sdvo.c:223: warning: Function parameter or member 'val' not described in 'intel_sdvo_write_sdvox'
drivers/gpu/drm/i915/intel_sdvo.c:653: warning: Function parameter or member 'intel_sdvo' not described in 'intel_sdvo_get_trained_inputs'
drivers/gpu/drm/i915/intel_sdvo.c:653: warning: Function parameter or member 'input_1' not described in 'intel_sdvo_get_trained_inputs'
drivers/gpu/drm/i915/intel_sdvo.c:653: warning: Function parameter or member 'input_2' not described in 'intel_sdvo_get_trained_inputs'
drivers/gpu/drm/i915/intel_sdvo.c:2311: warning: Function parameter or member 'dev_priv' not described in 'intel_sdvo_select_ddc_bus'
drivers/gpu/drm/i915/intel_sdvo.c:2311: warning: Function parameter or member 'sdvo' not described in 'intel_sdvo_select_ddc_bus'
Chris Wilson [Wed, 14 Feb 2018 08:58:14 +0000 (08:58 +0000)]
drm/i915/tv: Cleanup up obsolete comments
The ages old kerneldoc-esque comments still refer to the original stubs
and not the more complete functions. As they were only describing the
external entry points (or at least thought themselves to be, they had
drifted!), they don't provide any commentary for the code flow.
drivers/gpu/drm/i915/intel_tv.c:379: warning: cannot understand function prototype: 'const struct tv_mode tv_modes[] = '
drivers/gpu/drm/i915/intel_tv.c:1133: warning: bad line:
drivers/gpu/drm/i915/intel_tv.c:1140: warning: Function parameter or member 'intel_tv' not described in 'intel_tv_detect_type'
drivers/gpu/drm/i915/intel_tv.c:1140: warning: Function parameter or member 'connector' not described in 'intel_tv_detect_type'
drivers/gpu/drm/i915/intel_tv.c:1272: warning: Function parameter or member 'connector' not described in 'intel_tv_detect'
drivers/gpu/drm/i915/intel_tv.c:1272: warning: Function parameter or member 'ctx' not described in 'intel_tv_detect'
drivers/gpu/drm/i915/intel_tv.c:1272: warning: Function parameter or member 'force' not described in 'intel_tv_detect'
drivers/gpu/drm/i915/intel_tv.c:1351: warning: Function parameter or member 'connector' not described in 'intel_tv_get_modes'
Hans de Goede [Wed, 14 Feb 2018 08:21:51 +0000 (09:21 +0100)]
drm/i915: Fix DSI panels with v1 MIPI sequences without a DEASSERT sequence v3
So far models of the Dell Venue 8 Pro, with a panel with MIPI panel
index = 3, one of which has been kindly provided to me by Jan Brummer,
where not working with the i915 driver, giving a black screen on the
first modeset.
The problem with at least these Dells is that their VBT defines a MIPI
ASSERT sequence, but not a DEASSERT sequence. Instead they DEASSERT the
reset in their INIT_OTP sequence, but the deassert must be done before
calling intel_dsi_device_ready(), so that is too late.
Simply doing the INIT_OTP sequence earlier is not enough to fix this,
because the INIT_OTP sequence also sends various MIPI packets to the
panel, which can only happen after calling intel_dsi_device_ready().
This commit fixes this by splitting the INIT_OTP sequence into everything
before the first DSI packet and everything else, including the first DSI
packet. The first part (everything before the first DSI packet) is then
used as deassert sequence.
Changed in v2:
-Split the init OTP sequence into a deassert reset and the actual init
OTP sequence, instead of calling it earlier and then having the first
mipi_exec_send_packet() call call intel_dsi_device_ready().
Changes in v3:
-Move the whole shebang to intel_bios.c
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82880
References: https://bugs.freedesktop.org/show_bug.cgi?id=101205 Cc: Jan-Michael Brummer <jan.brummer@tabos.org> Reported-by: Jan-Michael Brummer <jan.brummer@tabos.org> Tested-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Acked-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180214082151.25015-3-hdegoede@redhat.com
Hans de Goede [Wed, 14 Feb 2018 08:21:49 +0000 (09:21 +0100)]
drm/i915: Add intel_bios_cleanup() function
Add an intel_bios_cleanup() function to act as counterpart of
intel_bios_init() and move the cleanup of vbt related resources there,
putting it in the same file as the allocation.
Changed in v2:
-While touching the code anyways, remove the unnecessary:
if (dev_priv->vbt.child_dev) done before kfree(dev_priv->vbt.child_dev)
Tvrtko Ursulin [Thu, 8 Feb 2018 16:00:36 +0000 (16:00 +0000)]
drm/i915: Handle RC6 counter wrap
We can implement limited RC6 counter wrap-around protection under the
assumption that clients will be reading this value more frequently than
the wrap period on a given platform.
With the typical wrap-around period being ~90 minutes, even with the
exception of Baytrail which wraps every 13 seconds, this sounds like a
reasonable assumption.
Implementation works by storing a 64-bit software copy of a hardware RC6
counter, along with the previous HW counter snapshot. This enables it to
detect wrap is polled frequently enough and keep the software copy
monotonically incrementing.
v2:
* Missed GEN6_GT_GFX_RC6_LOCKED when considering slot sizing and
indexing.
* Fixed off-by-one in wrap-around handling. (Chris Wilson)
v3:
* Simplify index checking by using unsigned int. (Chris Wilson)
* Expand the comment to explain why indexing works.
v4:
* Use __int128 if supported.
v5:
* Use mul_u64_u32_div. (Chris Wilson)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94852 Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> # v3 Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20180208160036.29919-1-tvrtko.ursulin@linux.intel.com Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Jani Nikula [Mon, 5 Feb 2018 17:31:38 +0000 (19:31 +0200)]
drm/i915: have virtual PCH detection return a PCH id
Simplify intel_virt_detect_pch() by making it return a PCH id rather
than returning the PCH type and setting PCH id for some PCHs. Map the
PCH id to PCH type using the shared routine. This gives us sanity check
on the supported combinations also in the virtualized setting.
Chris Wilson [Tue, 13 Feb 2018 12:09:40 +0000 (12:09 +0000)]
drm/i915/selftests: Report setup errors for igt_partial_tiling
igt_partial_tiling managed to fail with an -EBUSY. This usually means a
pin leak, but that should be impossible given the test setup. Add a
couple of error messages to help identify the path that failed.
Mahesh Kumar [Mon, 5 Feb 2018 17:21:31 +0000 (15:21 -0200)]
drm/i915/icl: program mbus during pipe enable
This patch program default values of MBus credit during pipe enable.
Changes Since V1:
- Add WARN_ON (Paulo)
- Remove TODO comment
- Program 0 during pipe disable
- Rebase
Changes since V2:
- We don't need to do anything when disabling the pipe
Changes since V3 (from Paulo):
- Remove WARN() that we'll never be able to trigger (Ville).
Mahesh Kumar [Mon, 5 Feb 2018 15:40:44 +0000 (13:40 -0200)]
drm/i915/icl: Enable both DBuf slices during init
ICL has 2 slices of DBuf, enable both the slices during display init.
Ideally we should only enable the second slice when needed in order to
save power, but while we're not there yet, adopt the simpler solution
to keep us bug-free.
v2 (from Paulo):
- Add the TODO comment.
- Reorganize where things are defined.
- Fix indentation.
- Remove unnecessary POSTING_READ() calls.
- Improve the commit message.
Paulo Zanoni [Mon, 5 Feb 2018 15:40:43 +0000 (13:40 -0200)]
drm/i915/icl: implement the display init/uninit sequences
This code is similar enough to the CNL code that I considered just
adding ICL support to the CNL function, but I think it's still
different enough, and having a function specific to ICL allows us to
more easily adapt code in case the spec changes more later.
We're still missing the power wells and the mbus code, so leave those
pieces with a FIXME comment while they're not here yet.
v2: Don't use _PICK, don't WARN_ON(1), don't forget the chicken bits.
v3: Use _MMIO_PORT() (Ville).
Paulo Zanoni [Tue, 6 Feb 2018 19:33:46 +0000 (17:33 -0200)]
drm/i915/icl: add the main CDCLK functions
This commit adds the basic CDCLK functions, but it's still missing
pieces of the display initialization sequence.
v2:
- Implement the voltage levels.
- Rebase.
v3:
- Adjust to the new "bypass" clock (Imre).
- Call intel_dump_cdclk_state() too.
- Rename a variable to avoid confusion.
- Simplify the DVFS part.
v4:
- Remove wrong bit definition (James).
- Also drive-by fix the coding style for the register definition we
touched.
v5:
- Comment style (checkpatch).
Paulo Zanoni [Mon, 5 Feb 2018 15:40:41 +0000 (13:40 -0200)]
drm/i915/icl: add ICL support to cnl_set_procmon_ref_values
On ICL we have two sets of registers: one for port A and another for
port B. The set of port A registers is the same as the CNL registers.
Since the procmon table on ICL is the same we want to reuse the CNL
function. To do that we add a port argument and make CNL always call
the function passing port A. This way, we'll be able to easily reuse
the function on ICL when we add icl_display_core_init().
v2: Don't use _PICK() when you can use a ternary operator.
v3: Don't use a ternary operation when you can use _MMIO_PORT (Ville).
Add an extra comment about why we're passing PORT_A (James).
David Weinehall [Fri, 9 Feb 2018 13:07:55 +0000 (15:07 +0200)]
drm/i915: Fix incorrect comment
While the comment singles out Port A or B, the code says Port A or *D*.
Looking at the history it seems that the comment was added after the code,
so it seems likely that the code is correct, not the comment.
Chris Wilson [Mon, 12 Feb 2018 13:31:18 +0000 (13:31 +0000)]
drm/i915: Replace open-coded memset_p()
When initialising the page directories, we set the GTT entries and the
tree to the scratch page. We have already replaced the DMA fill with
memset64(), but we can similarly use memset_p() to set the pointer array.
References: 4dd504f7d98a ("drm/i915: Use memset64() to prefill the GTT page") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180212133118.16443-1-chris@chris-wilson.co.uk
Imre Deak [Thu, 8 Feb 2018 17:41:02 +0000 (19:41 +0200)]
drm/i915: Don't query PCODE RC6VIDS on platforms not supporting it
On BXT/GLK GEN6_PCODE_READ_RC6VIDS fails with
MAILBOX_P24C_CC_ILLEGAL_CMD, so don't try to do the query on these
platforms. Do it only on SNB, IVB and HSW, where we use this command
anyway for RC6 enabling.
Based on my tests the command also succeeds on all LLC platforms, but
it's not clear if it's really supported on those (it returns 0 aka 245mv
for all RC6 states everywhere except on SNB). BSpec lists the command as
supported on SKL+ (see P24C_PCODE_MAILBOX_INTERFACE) but that's clearly
incorrect, since on SKL/KBL the same command ID is used for
SKL_PCODE_LOAD_HDCP_KEYS. Since the command fails on BXT/GLK, the BSpec
command list is also incorrect for those platforms (see
P_CR_P24C_PCODE_MAILBOX_INTERFACE_0_2_0_GTTMMADR).
I filed a request to update that info in Bspec, but for now let's
assume a minimal set of platforms where the command is supported.
Chris Wilson [Mon, 12 Feb 2018 10:24:15 +0000 (10:24 +0000)]
drm/i915: Hold rpm wakeref for printing the engine's register state
When dumping the engine, we print out the current register values. This
requires the rpm wakeref. If the device is alseep, we can assume the
engine is asleep (and the register state is uninteresting) so skip and
only acquire the rpm wakeref if the device is already awake.
Chris Wilson [Mon, 12 Feb 2018 09:39:28 +0000 (09:39 +0000)]
drm/i915: Don't wake the device up to check if the engine is asleep
If the entire device is powered off, we can safely assume that the
engine is also asleep (and idle).
Reported-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Fixes: a091d4ee931b ("drm/i915: Hold a wakeref for probing the ring registers") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180212093928.6005-1-chris@chris-wilson.co.uk
Chris Wilson [Fri, 9 Feb 2018 11:40:56 +0000 (11:40 +0000)]
drm/i915: Move the final intel_gpu_reset() to after declaring wedged
If we fail to reset the GPU (i915_reset()), we do one final
intel_gpu_reset() attempt as we mark the device wedged. The idea here is
even though the GPU has proven unreliable (and so we want to stop using
it for the time being), we don't want it spinning away in the background
whilst the driver idles so we try to reset it one more time. However, we
want to dump the i915_gem_set_wedged() debugging info before we do, so
that we can see the accurate state of the GPU when it failed.
Ville Syrjälä [Wed, 7 Feb 2018 16:48:41 +0000 (18:48 +0200)]
drm/i915: Give all ioctl functions an _ioctl suffix
Most of our ioctl functions have an _ioctl suffix in the name. I like
that idea since it makes it easy to figure out how the function is
going to get called. Rename the handful of exceptions to follow the
same pattern.
Imre Deak [Thu, 8 Feb 2018 11:23:31 +0000 (13:23 +0200)]
drm/i915/snb+: Remove incorrect forcewake check in debugfs/i915_drpc_info
FORCEWAKE_ACK is depricated by BSpec at least starting from BDW,
referring to the multi-threaded version of it instead. Accessing
FORCEWAKE_ACK triggers an unclaimed register access error - at
least on GLK - see the Reference: below.
The correct registers to use would be FORCEWAKE_MT_ACK on IVB+ and
FORCEWAKE_ACK_RENDER_GEN9 on SKL+ like it's done elsewhere in the
driver.
The forcewake check itself is inconsistent and redundant, since there
could be other forcewake requesters besides the kernel (being the
multithreaded version of the register) and the kernel's per-domain
forcewake counters are shown anyway at the end of the file. So let's
just remove the check.
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reference: https://bugs.freedesktop.org/show_bug.cgi?id=103337 Signed-off-by: Imre Deak <imre.deak@intel.com> Tested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20180208112331.12986-1-imre.deak@intel.com
Chris Wilson [Thu, 8 Feb 2018 15:12:24 +0000 (15:12 +0000)]
drm/i915: Remove redundant check on execlists interrupt
Since commit 4a118ecbe99c ("drm/i915: Filter out spurious execlists
context-switch interrupts") we probe execlists->active, and no longer
have to peek at the execlist interrupt to determine if the tasklet still
needs to be run to drain the ELSP.
References: 4a118ecbe99c ("drm/i915: Filter out spurious execlists context-switch interrupts") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michal Winiarski <michal.winiarski@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180208151224.16285-1-chris@chris-wilson.co.uk
Chris Wilson [Thu, 8 Feb 2018 16:39:39 +0000 (16:39 +0000)]
drm/i915/crt: Silence compiler warning for uninitialised status
clang is confused by our if-else-chain that abruptly exits before a
final else:
drivers/gpu/drm/i915/intel_crt.c:821:11: warning: variable 'status' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized]
else if (ret < 0)
^~~~~~~
drivers/gpu/drm/i915/intel_crt.c:826:9: note: uninitialized use occurs here
return status;
^~~~~~
drivers/gpu/drm/i915/intel_crt.c:821:7: note: remove the 'if' if its condition is always true
else if (ret < 0)
^~~~~~~~~~~~
drivers/gpu/drm/i915/intel_crt.c:761:12: note: initialize the variable 'status' to silence this warning
int status, ret;
In this case, we can reduce the final else-if clause to an unconditional
else.
Chris Wilson [Thu, 8 Feb 2018 16:16:39 +0000 (16:16 +0000)]
drm/i915: Disable unused-but-set compiler warning
The unused-but-set warning enabled by W=1 catches out a lot of the
atomic helper iterator macros and drown us in their noise (or trip over
Werror and die). Path of least resistance is to ignore the warning.
Chris Wilson [Thu, 8 Feb 2018 11:42:24 +0000 (11:42 +0000)]
drm/i915: Fix kerneldoc warnings for i915_gem_internal
drivers/gpu/drm/i915/i915_gem_internal.c:183: warning: No description found for parameter 'i915'
drivers/gpu/drm/i915/i915_gem_internal.c:183: warning: No description found for parameter 'size'
Chris Wilson [Thu, 8 Feb 2018 11:39:17 +0000 (11:39 +0000)]
drm/i915: Fix kerneldoc warnings in i915_gem_execbuffer
drivers/gpu/drm/i915/i915_gem_execbuffer.c:1983: warning: No description found for parameter 'dev_priv'
drivers/gpu/drm/i915/i915_gem_execbuffer.c:1983: warning: No description found for parameter 'file'
Chris Wilson [Thu, 8 Feb 2018 10:54:49 +0000 (10:54 +0000)]
drm/i915: Fix kerneldoc parameter markup
drivers/gpu/drm/i915/i915_syncmap.c:92: warning: No description found for parameter 'root'
drivers/gpu/drm/i915/i915_syncmap.c:155: warning: No description found for parameter 'root'
drivers/gpu/drm/i915/i915_syncmap.c:155: warning: No description found for parameter 'id'
drivers/gpu/drm/i915/i915_syncmap.c:155: warning: No description found for parameter 'seqno'
drivers/gpu/drm/i915/i915_syncmap.c:354: warning: No description found for parameter 'root'
drivers/gpu/drm/i915/i915_syncmap.c:354: warning: No description found for parameter 'id'
drivers/gpu/drm/i915/i915_syncmap.c:354: warning: No description found for parameter 'seqno'
drivers/gpu/drm/i915/i915_syncmap.c:396: warning: No description found for parameter 'root'
Chris Wilson [Thu, 8 Feb 2018 11:15:59 +0000 (11:15 +0000)]
drm/i915: Remove lost comment from i915_gem_context
The comment is very old and quite misleading now.
drivers/gpu/drm/i915/i915_gem_context.c:349: warning: No description found for parameter 'dev_priv'
drivers/gpu/drm/i915/i915_gem_context.c:349: warning: No description found for parameter 'file_priv'
Chris Wilson [Thu, 8 Feb 2018 11:13:28 +0000 (11:13 +0000)]
drm/i915: Fix kerneldoc warnings for i915_gem_userptr
drivers/gpu/drm/i915/i915_gem_userptr.c:761: warning: No description found for parameter 'dev'
drivers/gpu/drm/i915/i915_gem_userptr.c:761: warning: No description found for parameter 'data'
drivers/gpu/drm/i915/i915_gem_userptr.c:761: warning: No description found for parameter 'file'
Chris Wilson [Thu, 8 Feb 2018 11:12:20 +0000 (11:12 +0000)]
drm/i915: Fix kerneldoc warnings for intel_ringbuffer
drivers/gpu/drm/i915/intel_ringbuffer.c:179: warning: No description found for parameter 'req'
drivers/gpu/drm/i915/intel_ringbuffer.c:741: warning: No description found for parameter 'req'
drivers/gpu/drm/i915/intel_ringbuffer.c:741: warning: No description found for parameter 'cs'
Chris Wilson [Thu, 8 Feb 2018 11:11:05 +0000 (11:11 +0000)]
drm/i915: Fix kerneldoc warnings for i915_gpu_error
drivers/gpu/drm/i915/i915_gpu_error.c:1815: warning: No description found for parameter 'dev_priv'
drivers/gpu/drm/i915/i915_gpu_error.c:1815: warning: No description found for parameter 'engine_mask'
drivers/gpu/drm/i915/i915_gpu_error.c:1815: warning: No description found for parameter 'error_msg'
drivers/gpu/drm/i915/i915_gpu_error.c:1815: warning: Excess function parameter 'dev' description in 'i915_capture_error_state'
Chris Wilson [Wed, 7 Feb 2018 22:28:24 +0000 (22:28 +0000)]
drm/i915: Wait for gen3 reset status to be asserted
After we assert the reset request (and wait for 20us), when the device
has been fully reset it asserts the reset-status bit. Before we stop
requesting the reset and allow the device to return to normal, we should
wait for the reset to be completed. (Similar to how we wait for the
device to return to normal after deasserting the reset request.)
v2: Rename i915_reset_completed() probe to not cause as much confusion.
Chris Wilson [Thu, 8 Feb 2018 07:28:00 +0000 (07:28 +0000)]
drm/i915: Be paranoid and post the writes to stop the rings
Although the mmio are uncached and so should be flushed on every write,
be paranoid and do a mmio read after setting the ring head/tail to be
sure they have taken effect before moving on.
Chris Wilson [Wed, 7 Feb 2018 15:13:50 +0000 (15:13 +0000)]
drm/i915: Mark the device as wedged from the beginning of set-wedged
Reduce the window of opportunity for set-wedged being called
concurrently with reset (after i915_reset() has performed the
i915_gem_unset_wedged()) by moving the set_bit(I915_WEDGED) to before we
complete the inflight requests. When i915_reset() is being blocked on a
request, such completion may allow it to start and beginning resetting
the GPU before i915_gem_set_wedged() has finished (and so before
set-wedge will have marked the device as wedged). As such,
i915_gem_init_hw() may see a wedged device even from inside
i915_reset().
References: 36703e79a982 ("drm/i915: Break modeset deadlocks on reset") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180207151350.20883-1-chris@chris-wilson.co.uk
Chris Wilson [Thu, 8 Feb 2018 08:51:51 +0000 (08:51 +0000)]
drm/i915: Avoid truncation before clamping userspace's priority value
Userspace provides a 64b value for the priority, we need to be careful
to preserve the full range before validation to prevent truncation (and
letting an illegal value pass).
Reported-by: Antonio Argenziano <antonio.argenziano@intel.com> Fixes: ac14fbd460d0 ("drm/i915/scheduler: Support user-defined priorities") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Antonio Argenziano <antonio.argenziano@intel.com> Cc: Michal Winiarski <michal.winiarski@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180208085151.11480-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Chris Wilson [Tue, 6 Feb 2018 14:31:37 +0000 (14:31 +0000)]
drm/i915: Remove superfluous worker wakeups when RPS is already boosted
We only need to wake up the RPS worker once when initially enabling the
client boost, it remains in effect then until the last client no longer
requires the boost.
Chris Wilson [Thu, 8 Feb 2018 10:24:03 +0000 (10:24 +0000)]
drm/i915/perf: Fix compiler warning for string truncation
drivers/gpu/drm/i915/i915_oa_cnl.c: In function ‘i915_perf_load_test_config_cnl’:
drivers/gpu/drm/i915/i915_oa_cnl.c:99:2: error: ‘strncpy’ output truncated before terminating nul copying 36 bytes from a string of the same length [-Werror=stringop-truncation]
v2: strlcpy
Fixes: 95690a02fb5d ("drm/i915/perf: enable perf support on CNL") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180208102403.5587-2-chris@chris-wilson.co.uk
Chris Wilson [Thu, 8 Feb 2018 10:24:02 +0000 (10:24 +0000)]
drm/i915/perf: Fix compiler warning for string truncation
drivers/gpu/drm/i915/i915_oa_cflgt3.c: In function ‘i915_perf_load_test_config_cflgt3’:
drivers/gpu/drm/i915/i915_oa_cflgt3.c:87:2: error: ‘strncpy’ output truncated before terminating nul copying 36 bytes from a string of the same length [-Werror=stringop-truncation]
v2: strlcpy
Fixes: 4407eaa9b0cc ("drm/i915/perf: add support for Coffeelake GT3") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180208102403.5587-1-chris@chris-wilson.co.uk
drm/i915: do not stop engines on sanitize if i915.reset=0
Since commit 5896a5c8c9c0 (drm/i915: Always stop the rings before a
missing GPU reset) we attempt to stop the engines during gem_sanitize
even if reset=0 and nothing bad happened on the gpu.
The specs says that the STOP_RINGS bit needs to be cleared to resume
normal operation, but for some reason the value of the bit seems to be
changing without us writing to it (maybe rc6 entry/exit?), so normal
operation resumes correctly. However, it still feels incorrect to stop
the engines if there hasn't been any issue so skip the whole reset
call in gem_sanitize if i915.reset=0
Chris Wilson [Wed, 7 Feb 2018 21:05:44 +0000 (21:05 +0000)]
drm/i915: Only allocate preempt context when required
If we remove some hardcoded assumptions about the preempt context having
a fixed id, reserved from use by normal user contexts, we may only
allocate the i915_gem_context when required. Then the subsequent
decisions on using preemption reduce to having the preempt context
available.
v2: Include an assert that we don't allocate the preempt context twice.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michal Winiarski <michal.winiarski@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Acked-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180207210544.26351-3-chris@chris-wilson.co.uk Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Chris Wilson [Wed, 7 Feb 2018 21:05:43 +0000 (21:05 +0000)]
drm/i915: Move the scheduler feature bits into the purview of the engines
Rather than having the high level ioctl interface guess the underlying
implementation details, having the implementation declare what
capabilities it exports. We define an intel_driver_caps, similar to the
intel_device_info, which instead of trying to describe the HW gives
details on what the driver itself supports. This is then populated by
the engine backend for the new scheduler capability field for use
elsewhere.
v2: Use caps.scheduler for validating CONTEXT_PARAM_SET_PRIORITY (Mika)
One less assumption of engine[RCS] \o/
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tomasz Lis <tomasz.lis@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Tomasz Lis <tomasz.lis@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180207210544.26351-2-chris@chris-wilson.co.uk Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Chris Wilson [Wed, 7 Feb 2018 21:05:42 +0000 (21:05 +0000)]
drm/i915/guc: Allow preempt-client to be NULL
In the next patch, we may only conditionally allocate the preempt-client
if there is a global preempt context and so we need to be prepared in
case the preempt-client itself is NULL.
v2: Grep for more preempt_client.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michal Winiarski <michal.winiarski@intel.com> Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Michel Thierry <michel.thierry@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180207210544.26351-1-chris@chris-wilson.co.uk Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Tvrtko Ursulin [Tue, 6 Feb 2018 18:33:11 +0000 (18:33 +0000)]
drm/i915/pmu: Fix sleep under atomic in RC6 readout
We are not allowed to call intel_runtime_pm_get from the PMU counter read
callback since the former can sleep, and the latter is running under IRQ
context.
To workaround this, we record the last known RC6 and while runtime
suspended estimate its increase by querying the runtime PM core
timestamps.
Downside of this approach is that we can temporarily lose a chunk of RC6
time, from the last PMU read-out to runtime suspend entry, but that will
eventually catch up, once device comes back online and in the presence of
PMU queries.
Also, we have to be careful not to overshoot the RC6 estimate, so once
resumed after a period of approximation, we only update the counter once
it catches up. With the observation that RC6 is increasing while the
device is suspended, this should not pose a problem and can only cause
slight inaccuracies due clock base differences.
v2: Simplify by estimating on top of PM core counters. (Imre)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104943 Fixes: 6060b6aec03c ("drm/i915/pmu: Add RC6 residency metrics")
Testcase: igt/perf_pmu/rc6-runtime-pm Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Imre Deak <imre.deak@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: David Airlie <airlied@linux.ie> Cc: intel-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20180206183311.17924-1-tvrtko.ursulin@linux.intel.com
Chris Wilson [Wed, 7 Feb 2018 11:15:45 +0000 (11:15 +0000)]
drm/i915: Tidy up some error messages around reset failure
On blb and pnv, we are seeing sporadic
i915 0000:00:02.0: Resetting chip after gpu hang
[drm:intel_gpu_reset [i915]] rcs0: timed out on STOP_RING
[drm:i915_reset [i915]] *ERROR* Failed hw init on reset -5
which notably lack the actual root cause of the error. Ostensibly it
should be the init_ring_common() that failed, but it's error paths are
covered by DRM_ERROR.
Chris Wilson [Wed, 7 Feb 2018 08:43:49 +0000 (08:43 +0000)]
drm/i915: Trim the retired request queue after submitting
If we submit a request and see that the previous request on this
timeline was already signaled, we first do not need to add the
dependency tracker for that completed request and secondly we know that
we there is then a large backlog in retiring requests affecting this
timeline. Given that we just submitted more work to the HW, now would be
a good time to catch up on those retirements.
v2: Try to sum up the compromises involved in flushing the retirement
queue after submission.
drm/i915: Ignore minimum lines for level 0 in skl_compute_plane_wm, v2.
According to bspec, result_lines > 31 is only a maximum for latency
level 1 through 7.
For level 0 the number of lines is ignored, so always write 0 there
to prevent overflowing the 5 bits value.
This is required to make NV12 work.
Changes since v1:
- Rebase on top of GEN11 wm changes. It seems to use res_lines for
level 0 limit calculations, but still doesn't appear to program it.
When a request is preempted, it is unsubmitted from the HW queue and
removed from the active list of breadcrumbs. In the process, this
however triggers the signaler and it may see the clear rbtree with the
old, and still valid, seqno, or it may match the cleared seqno with the
now zero rq->global_seqno. This confuses the signaler into action and
signaling the fence.
Fixes: d6a2289d9d6b ("drm/i915: Remove the preempted request from the execution queue") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: <stable@vger.kernel.org> # v4.12+ Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180206094633.30181-1-chris@chris-wilson.co.uk
Tvrtko Ursulin [Mon, 5 Feb 2018 09:34:48 +0000 (09:34 +0000)]
drm/i915/pmu: Fix PMU enable vs execlists tasklet race
Commit 99e48bf98dd0 ("drm/i915: Lock out execlist tasklet while peeking
inside for busy-stats") added a tasklet_disable call in busy stats
enabling, but we failed to understand that the PMU enable callback runs
as an hard IRQ (IPI).
Consequence of this is that the PMU enable callback can interrupt the
execlists tasklet, and will then deadlock when it calls
intel_engine_stats_enable->tasklet_disable.
To fix this, I realized it is possible to move the engine stats enablement
and disablement to PMU event init and destroy hooks. This allows for much
simpler implementation since those hooks run in normal context (can
sleep).
v2: Extract engine_event_destroy. (Chris Wilson)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Fixes: 99e48bf98dd0 ("drm/i915: Lock out execlist tasklet while peeking inside for busy-stats")
Testcase: igt/perf_pmu/enable-race-* Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: intel-gfx@lists.freedesktop.org Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20180205093448.13877-1-tvrtko.ursulin@linux.intel.com
This workaround should prevent a bug that can be hit on a context
restore. To avoid the issue, we must emit a PIPE_CONTROL with CS stall
(0x7a000004 0x00100000 0x00000000 0x00000000) followed by 12DW's of
NOOP(0x0) in the indirect context batch buffer, to ensure the engine is
idle prior to programming 3DSTATE_SAMPLE_PATTERN.
It's also not clear whether we should add those extra dwords because of
the workaround itself, or if that's just padding for the WA BB (and next
commands could come right after the PIPE_CONTROL). We keep them for now.
Michal Srb [Mon, 5 Feb 2018 16:04:38 +0000 (16:04 +0000)]
drm/i915/cmdparser: Do not check past the cmd length.
The command MEDIA_VFE_STATE checks bits at offset +2 dwords. However, it is
possible to have MEDIA_VFE_STATE command with length = 0 + LENGTH_BIAS = 2.
In that case check_cmd will read bits from the following command, or even past
the end of the buffer.
If the offset ends up outside of the command length, reject the command.
Michal Srb [Mon, 5 Feb 2018 16:04:37 +0000 (16:04 +0000)]
drm/i915/cmdparser: Check reg_table_count before derefencing.
The find_reg function was assuming that there is always at least one table in
reg_tables. It is not always true.
In case of VCS or VECS, the reg_tables is NULL and reg_table_count is 0,
implying that no register-accessing commands are allowed. However, the command
tables include commands such as MI_STORE_REGISTER_MEM. When trying to check
such command, the find_reg would dereference NULL pointer.
Now it will just return NULL meaning that the register was not found and the
command will be rejected.
Ville Syrjälä [Fri, 2 Feb 2018 20:42:31 +0000 (22:42 +0200)]
drm/i915: Deprecate I915_SET_COLORKEY_NONE
Deprecate the silly I915_SET_COLORKEY_NONE flag. The obvious
way to disable colorkey is to just set flags to 0, which is
exactly what the intel ddx has been doing all along.
Currently when userspace sets the flags to 0, we end up in a
funny state where colorkey is disabled, but various colorkey
vs. scaling checks still consider colorkey to be enabled, and
thus we don't allow plane scaling to kick in.
In case there is some other userspace out there that actually
uses this flag (unlikely as this is an i915 specific uapi)
we'll keep on accepting it.
Chris Wilson [Mon, 5 Feb 2018 15:24:31 +0000 (15:24 +0000)]
drm/i915: Skip post-reset request emission if the engine is not idle
Since commit 7b6da818d86f ("drm/i915: Restore the kernel context after a
GPU reset on an idle engine") we submit a request following the engine
reset. The intent is that we don't submit a request if the engine is
busy (as it will restart active by itself) but we only checked to see if
there were remaining requests in flight on the hardware and skipped
checking to see if there were any ready requests that would be
immediately submitted on restart (the same time as our new request would
be). Having convinced the engine to appear idle in the previous patch,
we can use intel_engine_is_idle() as a better test to only submit a new
request if there are no pending requests.
As it happens, this is tripping up igt/drv_selftest/live_hangcheck in CI
as we overfill the kernel_context ringbuffer trigger an infinite
recursion from within the reset.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104786
References: 7b6da818d86f ("drm/i915: Restore the kernel context after a GPU reset on an idle engine") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180205152431.12163-4-chris@chris-wilson.co.uk
Chris Wilson [Mon, 5 Feb 2018 15:24:30 +0000 (15:24 +0000)]
drm/i915/execlists: Move the reset bits to a more natural home
In preparation for the next patch, we want the engine to appear idle
after a reset (if there are no requests in flight). For execlists, this
entails clearing the active status on reset, it will be regenerated on
restarting the engine after the reset. In the process, note that a
couple of other status flags and checks could be moved into the
describing function.
Chris Wilson [Mon, 5 Feb 2018 15:24:29 +0000 (15:24 +0000)]
drm/i915/selftests: Use a sacrificial context for hang testing
Avoid injecting hangs in to the i915->kernel_context in case the GPU
reset leaves corruption in the context image in its wake (leading to
continual failures and system hangs after the selftests are ostensibly
complete). Use a sacrificial kernel_context instead.
v2: Closing a context is tricky; export a function (for selftests) from
i915_gem_context.c to get it right.
Chris Wilson [Mon, 5 Feb 2018 15:24:28 +0000 (15:24 +0000)]
drm/i915/selftests: Flush old resets between engines
When injecting rapid resets, we must be careful to at least wait for the
previous reset to have taken effect and the engine restarted. If we
perform a second reset before that has happened, we will notice that the
engine hasn't recovered and declare it lost, wedging the device and
failing. In practice, since we wait for each hanging batch to start
before injecting the reset, this too-fast-reset condition can only be
triggered when moving onto the next engine in the test, so we need only
wait for the existing reset to complete before switching engines.
v2: Wrap up the wait inside a safety net to bail out in case of angry hw.
Chris Wilson [Sat, 3 Feb 2018 10:19:14 +0000 (10:19 +0000)]
drm/i915/breadcrumbs: Drop request reference for the signaler thread
If we remember to cancel the signaler on a request when retiring it
(after we know that the request has been signaled), we do not need to
carry an additional request in the signaler itself. This prevents an
issue whereby the signaler threads may be delayed and hold on to
thousands of request references, causing severe memory fragmentation and
premature oom (most noticeable on 32b snb due to the limited GFP_KERNEL
and frequent use of inter-engine fences).
v2: Rename first_signal(), document reads outside of locks.
Chris Wilson [Mon, 5 Feb 2018 09:22:01 +0000 (09:22 +0000)]
drm/i915: Remove unbannable context spam from reset
During testing, we trigger a lot of resets on an unbannable context
leading to massive amounts of irrelevant debug spam. Remove the
ban_score accounting and message for the unbannable context so that we
improve the signal:noise in the log messages for when the unexpected
occurs.
Chris Wilson [Mon, 5 Feb 2018 09:22:00 +0000 (09:22 +0000)]
drm/i915/execlists: Remove the startup spam
Execlists is now enabled by default and included in the list of
capabilities printed out to dmesg and beyond. We do not need to mention
it again, every time we restart the engine, so kill the spam.
Chris Wilson [Mon, 5 Feb 2018 09:41:39 +0000 (09:41 +0000)]
drm/i915: Report if an unbannable context is involved in a GPU hang
Since unbannable contexts are special and supposed not to be causing GPU
hangs in the first place, make it clear when they are implicated in said
hang. In practice, most unbannable contexts are those created by igt
for the express purpose of throwing untold thousands of hangs at the GPU
and wish to keep doing so to finish the test. Normally they are cleaned
up, but it's when they or the other unbannable kernel contexts stay
stuck in an erroneous state that we need to worry and so need
highlighting.