Chris Wilson [Thu, 7 Dec 2017 21:14:07 +0000 (21:14 +0000)]
drm/i915: Refactor common list iteration over GGTT vma
In quite a few places, we have a list iteration over the vma on an
object that only want to inspect GGTT vma. By construction, these are
placed at the start of the list, so we have copied that knowledge into
many callsites. Pull that knowledge back to i915_vma.h and provide a
for_each_ggtt_vma() to tidy up the code.
v2: Add a backreference from vma_create() to remind ourselves why we put
ggtt vma at the head of the obj->vma_list (and ppgtt vma at the tail).
v3: Fixup s/vma/V/
Chris Wilson [Wed, 6 Dec 2017 12:49:14 +0000 (12:49 +0000)]
drm/i915: Track GGTT writes on the vma
As writes through the GTT and GGTT PTE updates do not share the same
path, they are not strictly ordered and so we must explicitly flush the
indirect writes prior to modifying the PTE. We do track outstanding GGTT
writes on the object itself, but since the object may have multiple GGTT
vma, that is overly coarse as we can track and flush individual vma as
required.
Whilst here, update the GGTT flushing behaviour for Cannonlake.
v2: Hard-code ring offset to allow use during unload (after RCS may have
been freed, or never existed!)
Chris Wilson [Wed, 6 Dec 2017 12:49:13 +0000 (12:49 +0000)]
drm/i915: Remove vma from object on destroy, not close
Originally we translated from the object to the vma by walking
obj->vma_list to find the matching vm (for user lookups). Now we process
user lookups using the rbtree, and we only use obj->vma_list itself for
maintaining state (e.g. ensuring that all vma are flushed or rebound).
As such maintenance needs to go on beyond the user's awareness of the
vma, defer removal of the vma from the obj->vma_list from i915_vma_close()
to i915_vma_destroy()
Fixes: 5888fc9eac3c ("drm/i915: Flush pending GTT writes before unbinding")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104155 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171206124914.19960-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Our new "enable_guc" modparam allows to control whenever HuC
should be loaded. However existing code will try load and
authenticate HuC always when we use the GuC. This patch is
trying to enforce modparam selection.
v2: no need to cast PTR_ERR (Chris)
fetch/fini only if required (Michal)
fix wrong break (Sagar)
v3: add new goto label (Sagar)
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com> Reviewed-by: Sagar Arun Kamble <sagar.a.kamble@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171206135316.32556-7-michal.wajdeczko@intel.com
We currently have two module parameters that control GuC:
"enable_guc_loading" and "enable_guc_submission". Whenever
we need submission=1, we also need loading=1. We also need
loading=1 when we want to want to load and verify the HuC.
Lets combine above module parameters into one "enable_guc"
modparam. New supported bit values are:
0=disable GuC (no GuC submission, no HuC)
1=enable GuC submission
2=enable HuC load
Special value "-1" can be used to let driver decide what
option should be enabled for given platform based on
hardware/firmware availability or preference.
Explicit enabling any of the GuC features makes GuC load
a required step, fallback to non-GuC mode will not be
supported.
v2: Don't use -EIO
v3: define modparam bits (Chris)
v4: rely on implicit cast (Chris)
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com> Cc: Sujaritha Sundaresan <sujaritha.sundaresan@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171206135316.32556-6-michal.wajdeczko@intel.com
drm/i915/uc: Don't use -EIO to report missing firmware
-EIO has special meaning and is used when we want to allow
engine initialization to fail and mark GPU as wedged.
However here at this function we should return error code
that corresponds to upload status only, as any decision how
to handle missing firmware should be done higher level function
(silent fallback to non-GuC mode, fail into wedged mode, or
abort driver load with fatal error).
v2: commit message update (Michal)
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171206135316.32556-5-michal.wajdeczko@intel.com
drm/i915/uc: Don't fetch GuC firmware if no plan to use GuC
If we don't plan to use GuC then we should not try to fetch GuC and
HuC firmwares. We can save memory and avoid possible dmesg noise.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171206135316.32556-4-michal.wajdeczko@intel.com
In the upcoming patch we will change the way how to recognize
when GuC is in use. Using helper macros will minimize scope
of that changes. While here, update dev_info message.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171206135316.32556-3-michal.wajdeczko@intel.com
drm/i915/guc: Move firmware selection to init_early
Doing GuC firmware path selection from sanitize_options function
is not perfect, while there is no problem with doing so during
early init stage as we already have all needed data.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171206135316.32556-2-michal.wajdeczko@intel.com
drm/i915/huc: Move firmware selection to init_early
Doing HuC firmware path selection from sanitize_options function
is not perfect, while there is no problem with doing so during
early init stage as we already have all needed data.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171206135316.32556-1-michal.wajdeczko@intel.com
Chris Wilson [Tue, 5 Dec 2017 17:27:57 +0000 (17:27 +0000)]
drm/i915: Taint (TAINT_WARN) the kernel if the GPU reset fails
History tells us that if we cannot reset the GPU now, we never will. This
then impacts everything that is run subsequently. On failing the reset,
we mark the driver as wedged, trying to prevent further execution on the
GPU, forcing userspace to fallback to using the CPU to update its
framebuffers and let the user know what happened.
We also want to go one step further and add a taint to the kernel so that
any subsequent faults can be traced back to this failure. This is
useful for CI, where if the GPU/driver fails we want to reboot and
restart testing rather than continue on into oblivion. For everyone
else, the warning taint is a testament to the system unreliability.
TAINT_WARN is used anytime a WARN() is emitted, which is suitable for
our purposes here as well; the driver/system may behave unexpectedly
after the failure.
v2: Also taint if the recovery fails (again history shows us that is
typically fatal).
v3: Use TAINT_WARN
References: https://bugs.freedesktop.org/show_bug.cgi?id=103514 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Acked-by: Tomi Sarvela <tomi.p.sarvela@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171205172757.32609-1-chris@chris-wilson.co.uk
Chris Wilson [Mon, 4 Dec 2017 13:25:13 +0000 (13:25 +0000)]
drm/i915: Flush pending GTT writes before unbinding
From the shrinker paths, we want to relinquish the GPU and GGTT access to
the object, releasing the backing storage back to the system for
swapout. As a part of that process we would unpin the pages, marking
them for access by the CPU (for the swapout/swapin). However, if that
process was interrupted after unbind the vma, we missed a flush of the
inflight GGTT writes before we made that GTT space available again for
reuse, with the prospect that we would redirect them to another page.
The bug dates back to the introduction of multiple GGTT vma, but the
code itself dates to commit 02bef8f98d26 ("drm/i915: Unbind closed vma
for i915_gem_object_unbind()").
Fixes: 02bef8f98d26 ("drm/i915: Unbind closed vma for i915_gem_object_unbind()") Fixes: c5ad54cf7dd8 ("drm/i915: Use partial view in mmap fault handler") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: stable@vger.kernel.org Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171204132513.7303-1-chris@chris-wilson.co.uk
drm/i915/cnl: only divide up base frequency with crystal source
We apply this logic to Gen9 as well. We didn't notice this issue as
most part we've encountered so far only use the crystal as source for
their timestamp registers.
Zhenyu Wang [Mon, 4 Dec 2017 02:42:58 +0000 (10:42 +0800)]
drm/i915/gvt: set max priority for gvt context
This is to workaround guest driver hang regression after
preemption enable that gvt hasn't enabled handling of that
for guest workload. So in effect this disables preemption
for gvt context now.
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Changbin Du [Wed, 29 Nov 2017 07:40:07 +0000 (15:40 +0800)]
drm/i915/gvt: Kick scheduler when new workload queued
The current schedule policy rely on a 1ms timer to execute workload. This
can introduce maximum 1ms unnecessary latency. This is especially bad for
small media workloads.
And I don't think we need this timer for QoS, but the change is not simply
remove the code. So I made a new API intel_gvt_kick_schedule() for future
change.
Signed-off-by: Changbin Du <changbin.du@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Tina Zhang [Wed, 29 Nov 2017 06:57:37 +0000 (14:57 +0800)]
drm/i915/gvt: Free dmabuf_obj list in intel_vgpu_dmabuf_cleanup
The per vGPU dmabuf_obj list should be released in intel_vgpu_dmabuf_
cleanup, which is invoked either in the process of closing a VM or in
the process of removing a vGPU.
Fixes: e3a0d7976c53 ("drm/i915/gvt: Handle orphan dmabuf_objs") Signed-off-by: Tina Zhang <tina.zhang@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Tina Zhang [Thu, 23 Nov 2017 08:26:37 +0000 (16:26 +0800)]
drm/i915/gvt: Handle orphan dmabuf_objs
dmabuf_obj's destruction relys on GEM release operation, which is managed
in i915 driver. And there is a time window between vgpu's destruction and
its dmabuf_objs' destruction. This patch is to free the orphan dmabuf_objs
correctly after the vgpu passes away.
Signed-off-by: Tina Zhang <tina.zhang@intel.com> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Tina Zhang [Thu, 23 Nov 2017 08:26:36 +0000 (16:26 +0800)]
drm/i915/gvt: Dmabuf support for GVT-g
This patch introduces a guest's framebuffer sharing mechanism based on
dma-buf subsystem. With this sharing mechanism, guest's framebuffer can
be shared between guest VM and host.
v16:
- add x_hot and y_hot. (Gerd)
- add flag validation for VFIO_DEVICE_GET_GFX_DMABUF. (Alex)
- rebase 4.14.0-rc6.
v15:
- add VFIO_DEVICE_GET_GFX_DMABUF ABI. (Gerd)
- add intel_vgpu_dmabuf_cleanup() to clean up the vGPU's dmabuf. (Gerd)
v14:
- add PROBE, DMABUF and REGION flags. (Alex)
v12:
- refine the lifecycle of dmabuf.
v9:
- remove dma-buf management. (Alex)
- track the dma-buf create and release in kernel mode. (Gerd) (Daniel)
v8:
- refine the dma-buf ioctl definition.(Alex)
- add a lock to protect the dmabuf list. (Alex)
v7:
- release dma-buf related allocations in dma-buf's associated release
function. (Alex)
- refine ioctl interface for querying plane info or create dma-buf.
(Alex)
v6:
- align the dma-buf life cycle with the vfio device. (Alex)
- add the dma-buf related operations in a separate patch. (Gerd)
- i915 related changes. (Chris)
v5:
- fix bug while checking whether the gem obj is gvt's dma-buf when user
change caching mode or domains. Add a helper function to do it.
(Xiaoguang)
- add definition for the query plane and create dma-buf. (Xiaoguang)
v4:
- fix bug while checking whether the gem obj is gvt's dma-buf when set
caching mode or doamins. (Xiaoguang)
v3:
- declare a new flag I915_GEM_OBJECT_IS_GVT_DMABUF in drm_i915_gem_object
to represent the gem obj for gvt's dma-buf. The tiling mode, caching
mode and domains can not be changed for this kind of gem object. (Alex)
- change dma-buf related information to be more generic. So other vendor
can use the same interface. (Alex)
v2:
- create a management fd for dma-buf operations. (Alex)
- alloc gem object's backing storage in gem obj's get_pages() callback.
(Chris)
Signed-off-by: Tina Zhang <tina.zhang@intel.com> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Tina Zhang [Thu, 23 Nov 2017 08:26:35 +0000 (16:26 +0800)]
vfio: ABI for mdev display dma-buf operation
Add VFIO_DEVICE_QUERY_GFX_PLANE ioctl command to let user query and get
a plane and its information. So far, two types of buffers are supported:
buffers based on dma-buf and buffers based on region.
This ioctl can be invoked with:
1) Either DMABUF or REGION flag. Vendor driver returns a plane_info
successfully only when the specific kind of buffer is supported.
2) Flag PROBE. And at the same time either DMABUF or REGION must be set,
so that vendor driver returns success only when the specific kind of
buffer is supported.
Add VFIO_DEVICE_GET_GFX_DMABUF ioctl command to let user get a specific
dma-buf fd of an exposed MDEV buffer provided by dmabuf_id which was
returned in VFIO_DEVICE_QUERY_GFX_PLANE ioctl command.
The life cycle of an exposed MDEV buffer is handled by userspace and
tracked by kernel space. The returned dmabuf_id in struct vfio_device_
query_gfx_plane can be a new id of a new exposed buffer or an old id of
a re-exported buffer. Host user can check the value of dmabuf_id to see
if it needs to create new resources according to the new exposed buffer
or just re-use the existing resource related to the old buffer.
v18:
- update comments for VFIO_DEVICE_GET_GFX_DMABUF. (Alex)
v11:
- rename plane_type to drm_plane_type. (Gerd)
- move fields of vfio_device_query_gfx_plane to vfio_device_gfx_plane_info.
(Gerd)
- remove drm_format_mod, start fields. (Daniel)
- remove plane_id.
v10:
- refine the ABI API VFIO_DEVICE_QUERY_GFX_PLANE. (Alex) (Gerd)
v3:
- add a field gvt_plane_info in the drm_i915_gem_obj structure to save
the decoded plane information to avoid look up while need the plane
info. (Gerd)
Signed-off-by: Tina Zhang <tina.zhang@intel.com> Reviewed-by: Gerd Hoffmann <kraxel@redhat.com> Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com> Acked-by: Alex Williamson <alex.williamson@redhat.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Tina Zhang [Mon, 20 Nov 2017 07:31:16 +0000 (15:31 +0800)]
drm/i915/gvt: Add opregion support
Windows guest driver needs vbt in opregion, to configure the setting
for display. Without opregion support, the display registers won't
be set and this blocks display model to get the correct information
of the guest display plane.
This patch is to provide a virtual opregion for guest. The original
author of this patch is Xiaoguang Chen.
This patch is split from the "Dma-buf support for GVT-g" patch set,
with being rebased to the latest gvt-staging branch.
v3:
- add checking region index during intel_vgpu_rw. (Xiong)
Xiong Zhang [Mon, 20 Nov 2017 07:31:15 +0000 (15:31 +0800)]
drm/i915/gvt: Alloc and Init guest opregion at vgpu creation
Currently guest opregion is allocated and initialised when guest
write opregion base register. This is too late for kvmgt, so
move it to vgpu_create time.
Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com> Tested-by: Tina Zhang <tina.zhang@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Chris Wilson [Mon, 4 Dec 2017 03:23:44 +0000 (11:23 +0800)]
drm/i915/gvt: Fix out-of-bounds buffer write into opregion->signature[]
sparse spots
drivers/gpu/drm/i915/gvt/opregion.c:234 alloc_and_init_virt_opregion() error: memcpy() 'header->signature' too small (16 vs 17)
as gvt is indeed trying to memcpy a string longer than the signature[].
Fixes: b2d6ef70614e ("drm/i915/gvt: Let each vgpu has separate opregion memory") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Xiong Zhang <xiong.y.zhang@intel.com> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Cc: Zhi Wang <zhi.a.wang@intel.com> Cc: intel-gvt-dev@lists.freedesktop.org Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
James Ausmus [Fri, 1 Dec 2017 02:17:00 +0000 (18:17 -0800)]
drm/i915/cnl: Mask previous DDI - PLL mapping
Without masking out the old value, we can end up pointing the DDI to a
disabled PLL, which makes the system fall over. Mask out the previous
value before setting the PLL to DDI mapping.
This can be observed by running igt/testdisplay with both an eDP and
HDMI/DP output active.
Chris Wilson [Fri, 1 Dec 2017 11:30:30 +0000 (11:30 +0000)]
drm/i915: Remove unsafe i915.enable_rc6
It has been many years since the last confirmed sighting (and fix) of an
RC6 related bug (usually a system hang). Remove the parameter to stop
users from setting dangerous values, as they often set it during triage
and end up disabling the entire runtime pm instead (the option is not a
fine scalpel!).
Furthermore, it allows users to set known dangerous values which were
intended for testing and not for production use. For testing, we can
always patch in the required setting without having to expose ourselves
to random abuse.
v2: Fixup NEEDS_WaRsDisableCoarsePowerGating fumble, and document the
lack of ilk support better.
v3: Clear intel_info->rc6p if we don't support rc6 itself.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Imre Deak <imre.deak@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171201113030.18360-1-chris@chris-wilson.co.uk
Chris Wilson [Fri, 1 Dec 2017 12:20:11 +0000 (12:20 +0000)]
drm/i915: Sleep and retry a GPU reset if at first we don't succeed
As we declare the GPU wedged if the reset fails, such a failure is quite
terminal. Before taking that drastic action, let's sleep first and try
active, in the hope that the hardware has quietened down and is then
able to reset. After a few such attempts, it is fair to say that the HW
is truly wedged.
v2: Always print the failure message now, we precheck whether resets are
disabled.
References: https://bugs.freedesktop.org/show_bug.cgi?id=104007 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171201122011.16841-1-chris@chris-wilson.co.uk Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Ville Syrjälä [Wed, 29 Nov 2017 18:08:47 +0000 (20:08 +0200)]
drm/i915: Interlaced DP output doesn't work on VLV/CHV
Reject interlaced modes on VLV/CHV DP outputs. This simply does
not work correctly in the hardware. We do get some output, but
it's quite corrupted.
The available documentation fails to mention this fact. I
contacted some hardware people who eventually managed to locate
the relevant HSD for VLV, which was resolved by declaring
interlaced DP output as not supported. The HSD was never cloned
for CHV even though it inherited most of the hardware and
thus has the same problems with interlaced DP output.
Cc: Dennis Vshivkov <awesome.walrus+bugzilla@gmail.com> Reported-by: Dennis Vshivkov <awesome.walrus+bugzilla@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103922 Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171129180847.30613-1-ville.syrjala@linux.intel.com Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Ville Syrjälä [Wed, 29 Nov 2017 15:37:31 +0000 (17:37 +0200)]
drm/i915: Wait for pipe to start on i830 as well
We should make sure the pipe has fully started when we enable it from
the i830 "power well". Otherwise theoretically i830 could also hit
problems with vblank timestamps jumping around (since we skip the
wait during modeset on i830). Additionally moving planes between the
pipes etc. might not work correctly until both pipes are actually up and
running.
v2: Less pointless duplication in the code (Chris)
Ville Syrjälä [Wed, 29 Nov 2017 15:37:30 +0000 (17:37 +0200)]
drm/i915: Fix vblank timestamp/frame counter jumps on gen2
Previously I was under the impression that the scanline counter
reads 0 when the pipe is off. Turns out that's not correct, and
instead the scanline counter simply stops when the pipe stops, and
it retains it's last value until the pipe starts up again, at which
point the scanline counter jumps to vblank start.
These jumps can cause the timestamp to jump backwards by one frame.
Since we use the timestamps to guesstimage also the frame counter
value on gen2, that would cause the frame counter to also jump
backwards, which leads to a massice difference from the previous value.
The end result is that flips/vblank events don't appear to complete as
they're stuck waiting for the frame counter to catch up to that massive
difference.
Fix the problem properly by actually making sure the scanline counter
has started to move before we assume that it's safe to enable vblank
processing.
v2: Less pointless duplication in the code (Chris)
Cc: stable@vger.kernel.org Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Fixes: b7792d8b54cc ("drm/i915: Wait for pipe to start before sampling vblank timestamps on gen2") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171129153732.3612-1-ville.syrjala@linux.intel.com
Ville Syrjälä [Wed, 29 Nov 2017 12:54:11 +0000 (14:54 +0200)]
drm/i915: Fix deadlock in i830_disable_pipe()
i830_disable_pipe() gets called from the power well code, and thus
we're already holding the power domain mutex. That means we can't
call plane->get_hw_state() as it will also try to grab the
same mutex and will thus deadlock.
Replace the assert_plane() calls (which calls ->get_hw_state()) with
just raw register reads in i830_disable_pipe(). As a bonus we can
now get a warning if plane C is enabled even though we don't even
expose it as a drm plane.
v2: Do a separate WARN_ON() for each plane (Chris)
Ville Syrjälä [Wed, 29 Nov 2017 16:43:03 +0000 (18:43 +0200)]
drm/i915: Fix has_audio readout for DDI A
Transcoder EDP does not support audio. Let's not try to
read the state of the audio enable bit HSW_AUD_PIN_ELD_CP_VLD
based on the pipe when using transcoder EDP.
While at it make the function static and flatten it.
Ville Syrjälä [Wed, 29 Nov 2017 16:43:01 +0000 (18:43 +0200)]
drm/i915: Disable DP audio for g4x
Apparently g4x doesn't support audio over DP. Bspec lists the
bit as "Reserved for Audio Output Enable", and empirical evidence
tells us that the bit won't stick. So stop trying to enable DP
audio on g4x.
Chris Wilson [Thu, 30 Nov 2017 09:42:31 +0000 (09:42 +0000)]
drm/i915/selftests: Wake the device before executing requests on the GPU
To execute a requests requires us to have first woken the device, using
the rpm wakeref (as the request needs to write to hardware to setup the
context/ppGTT and execute on the GPU). So call intel_runtime_pm_get()
around queuing the request; the request itself will then carry a wakeref
until completion.
Chris Wilson [Fri, 1 Dec 2017 00:15:36 +0000 (00:15 +0000)]
drm/i915: Set fake_vma.size as well as fake_vma.node.size for capture
When capturing the bo, we allocate an error object with an array of
min(vma->size, vma->node.size) pages, plus a bit for compression overhead.
However, when creating the fake vma to describe the bo, only one of the
sizes was filled in, resulting in a too small array. Through my and CI
testing, this was sufficient for the mostly empty NULL context as
it compressed well (or the out-of-bounds access simply didn't cause an
issue). However, in real workloads on Cannonlake, we were overflowing
that array and causing havoc with the random memory corruption.
Reported-by: Rafael Antognolli <rafael.antognolli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103964 Fixes: 4e90a6e22272 ("drm/i915: Record default HW state in the GPU error state") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Tested-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171201001536.13941-1-chris@chris-wilson.co.uk Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
drm/i915: Enable IPS with only sprite plane visible too, v4.
This comment predates atomic, and I think with the way we currently
track IPS, it's safe to enable this for the case we switch too.
Changes since v1:
- Keep IPS enabled when switching planes.
Changes since v2:
- Enable IPS when at least one plane is enabled. (Ville)
Changes since v3:
- Actually do what was advertised in v3, sigh! (Ville, CI)
drm/i915: Make ips_enabled a property depending on whether IPS is enabled, v3.
ips_enabled was used as a variable of whether IPS can be enabled or not,
but should be used to test whether IPS is actually enabled.
Changes since v1:
- Call needs_modeset on new crtc state. (Ville)
- IPS can be enabled with sprite plane enabled too. (Ville)
- Fix CDCLK vs IPS workaround. (Ville)
Changes since v2:
- Only re-enable fastset when inheriting mode. (Ville)
- Put the conditions for enabling and disabling IPS in a helper.
Changes since v3:
- Keep the max_cdclk workaround working. (Ville)
- Also check logical cdclk out of paranoia.
- Remove planes check from IPS disable function for initial disable.
- Remove assert_plane_enabled/disabled checks and use
crtc_state->active_planes for hsw_enable_ips only, always allow
calling hsw_disable_ips to disable it initially in hw.
Imre Deak [Wed, 29 Nov 2017 17:51:37 +0000 (19:51 +0200)]
drm/i915: Avoid PPS HW/SW state mismatch due to rounding
We store a SW state of the t11_t12 timing in 100usec units but have to
program it in 100msec as required by HW. The rounding used during
programming means there will be a mismatch between the SW and HW states
of this value triggering a "PPS state mismatch" error. Avoid this by
storing the already rounded-up value in the SW state.
Note that we still calculate panel_power_cycle_delay with the finer
100usec granularity to avoid any needless waits using that version of
the delay.
Chris Wilson [Thu, 30 Nov 2017 10:29:51 +0000 (10:29 +0000)]
drm/i915: Skip switch-to-kernel-context on suspend when wedged
If the HW is already wedged, attempting to submit a request will
generate an -EIO. If we tried this during suspend, we would abort
whereas all we want to do is to go sleep and throw away the corrupt
state.
Fixes: 5ab57c702069 ("drm/i915: Flush logical context image out to memory upon suspend")
Testcase: igt/gem_eio/suspend Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171130102951.14965-1-chris@chris-wilson.co.uk
Valtteri Rantala [Tue, 28 Nov 2017 14:45:05 +0000 (16:45 +0200)]
drm/i915/glk: Apply WaProgramL3SqcReg1DefaultForPerf for GLK too
Testing the texture read performance shows that the same tuning for
the SQ credits is needed on GLK as on BXT/APL. This has been also
confirmed by Altug from the HW team.
drm/i915/guc: Change default GuC FW for KBL to v9.39
This patch makes v9.39 firmware as default firmware for KBL.
Note: GuC logging control is changed with this firmware. GuC is
expecting i915 to set control bit to enable "default logging"
while using GuC action UK_LOG_ENABLE_LOGGING.
However i915 is currently not doing this because it is version
specific change and can be handled entirely in GuC. It will need
to be fixed in future firmwares.
This update includes (since v9.14):
- DCC spec changes for BXT + DCT enabling
- Bug Fix for power conservation feature SLPC_DCC
- Scheduler 1-element submission during DCC cycles.
- SB based Pre-ETM/ETM flow enabling for debug signed GuC/HuC
- Moving GuC non_critical r/w data to lower SRAM 64KB
- Media engine Reset fix. Correctly marking context for resubmission in
Media Reset case.
- ABT Disable bug fix. Disabled Evaluation mode on context change.
- Async FW in Engine Schedule feature (not enabled from KMD)
- GuC clean up to align developer build in line to production build.
- Disable ARAT interrupt before programming ARAT delta.
- Memory range check in Parse to avoid failure due to overflow.
- GuC Msg Channel Hang WA - Stall GUC for mmio access when IDI is low
during CPD flow.
- Fix for submit queue over flow issue
- Enabling IBC on KBL GT3 15W, GT4 45W
- Disabling wrong device ID WA in production signed kernel
- Enabling WA for MSGCH hang issue upto required KBL stepping
- Clear forcewake in CSB when SQ is empty.
- 3Tries of GuC2CSME wake request
- During reset one parameter was not getting accounted
- Disable DCC 1-elem mode submission
- Move UkGuckmdInterface.h file from 2016 folders to common 2016 folder.
- This is file location change.No functional change done as part of this
check in.
- Enabling Guc Log changes for ultra low logging for OCA
- Enabling Dynamic Render Power Well Hysteresis Programming for Compute
Worklaods
- Enabling build failure check to catch critical section overflow.
- Disable build.bat redundant prints.
- Move few least used functions to non-critical section.
- Rearrange GuC documentation folder structure.
- Synchronize SLPC internal debug interface with other branches.
- Fixing Issue with Default Guc Log changes for OCA using special Control
Bit
- Aggressive DCC implementation for supported platforms.
v2: Rebase. Updated commit message.
Signed-off-by: Jeff McGee <jeff.mcgee@intel.com> Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com> Cc: Spotswood John A <john.a.spotswood@intel.com> Cc: Anusha Srivatsa <anusha.srivatsa@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Anusha Srivatsa<anusha.srivatsa@intel.com > Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/1511972351-574-4-git-send-email-sagar.a.kamble@intel.com
drm/i915/guc: Change default GuC FW for BXT to v9.29
This patch makes v9.29 firmware as default firmware for BXT.
Note: GuC logging control is changed with this firmware. GuC is
expecting i915 to set control bit to enable "default logging"
while using GuC action UK_LOG_ENABLE_LOGGING.
However i915 is currently not doing this because it is version
specific change and can be handled entirely in GuC. It will need
to be fixed in future firmwares.
This update includes (since v8.7):
- Added support to log media reset count for host to read it
- BXT WA for fixing MTP hangs. WaDisableDOPRenderClkGatingAtSubmit
- Sub-feature level control for power management features.
- Minor clean-up for power management interface.
- Unified power management interface and scheduler interface into
1 file using same version.
- Bug Fix for multi context scheduler flag.
- DCC spec changes for BXT + DCT enabling
- Springboard based Pre-ETM/ETM flow enabling for debug signed GuC/HuC
- Moving GuC non_critical r/w data to lower SRAM 64KB
- Enabled IBC for BXT
- Media engine Reset fix. Correctly marking context for resubmission in
Media Reset case.
- SLPC Dynamic RPe fix to resolve issues where incorrect frequency was set.
- ABT Disable bug fix. Disabled Evaluation mode on context change.
- GuC clean up to align developer build in line to production build.
- Disable ARAT interrupt before programming ARAT delta.
- Memory range check in Parse to avoid failure due to overflow.
- Clear forcewake in CSB when SQ is empty.
- SLPC IBC 1.6 for APL to ensure multiplier does not cap IA below Pe.
- Move UkGuckmdInterface.h file from 2016 folders to common 2016 folder.
- This is file location change. No functional change done as part of this
check in.
- 3 tries of wake request needed from GuC2CSME for ME to wake up. Request
has come from ME spec
- During reset one parameter was not getting accounted
- Enabling Guc Log changes for ultra low logging for OCA
- Disable build.bat redundant prints.
- Move few least used functions to non-critical section.
- Rearrange GuC documentation folder structure.
- Fixing Issue with Default Guc Log changes for OCA using special Control
Bit
v2: Rebase. Updated commit message.
Signed-off-by: Jeff McGee <jeff.mcgee@intel.com> Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com> Cc: Spotswood John A <john.a.spotswood@intel.com> Cc: Anusha Srivatsa <anusha.srivatsa@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Anusha Srivatsa <anusha.srivatsa@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/1511972351-574-3-git-send-email-sagar.a.kamble@intel.com
drm/i915/guc: Change default GuC FW for SKL to v9.33
This patch makes v9.33 firmware as default firmware for SKL.
Note: GuC logging control is changed with this firmware. GuC is
expecting i915 to set control bit to enable "default logging"
while using GuC action UK_LOG_ENABLE_LOGGING.
However i915 is currently not doing this because it is version
specific change and can be handled entirely in GuC. It will need
to be fixed in future firmwares.
This update includes (since v6.1):
- HuC RSA Keys updated.
- Adding per engine preemption support in GuC scheduler
- Minor bug fixes.
- Added support to log media reset count for host to read it
- Sub-feature level control for power management features.
- Minor clean-up for power management interface.
- Unified power management interface and scheduler interface into
1 file using same version.
- Bug Fix for multi context scheduler flag.
- DCC spec changes for BXT + DCT enabling
- SB based Pre-ETM/ETM flow enabling for debug signed GuC/HuC
- Moving GuC non_critical r/w data to lower SRAM 64KB
- Media engine Reset fix. Correctly marking context for resubmission in
Media Reset case.
- ABT Disable bug fix. Disabled Evaluation mode on context change.
- Async FW in Engine Schedule feature (not enabled from KMD)
- GuC clean up to align developer build in line to production build.
- DCC consistency fix for SKL
- Disable ARAT interrupt before programming ARAT delta.
- Memory range check in Parse to avoid failure due to overflow.
- Enabled WA for MSGCH hang issue
- Clear forcewake in CSB when SQ is empty.
- Move UkGuckmdInterface.h file from 2016 folders to common 2016 folder.
- This is file location change.No functional change done as part of this
check in.
- Enable decoupled freq for SKL GT4
- 3 tries of wake request needed from GuC2CSME for ME to wake up. Request
has come from ME spec
- During reset one parameter was not getting accounted
- Enabling Guc Log changes for ultra low logging for OCA
- Enabling build failure check to catch critical section overflow.
- Disable build.bat redundant prints.
- Move few least used functions to non-critical section.
- Rearrange GuC documentation folder structure.
- Synchronize SLPC internal debug interface with other branches.
- Fixing Issue with Default Guc Log changes for OCA using special Control
Bit
v2: Rebase. Updated commit message.
Signed-off-by: Jeff McGee <jeff.mcgee@intel.com> Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com> Cc: Spotswood John A <john.a.spotswood@intel.com> Cc: Anusha Srivatsa <anusha.srivatsa@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Anusha Srivatsa <anusha.srivatsa@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/1511972351-574-2-git-send-email-sagar.a.kamble@intel.com
Tvrtko Ursulin [Wed, 29 Nov 2017 10:28:05 +0000 (10:28 +0000)]
drm/i915: Consolidate checks for engine stats availability
Sagar noticed the check can be consolidated between the engine stats
implementation and the PMU.
My first choice was a static inline helper but that got into include
ordering mess quickly fast so I went with a macro instead. At some point
we should perhaps looking into taking out the non-ringubffer bits from
intel_ringbuffer.h into a new intel_engine.h or something.
v2: Use engine->flags. (Chris Wilson)
v3: Rebase and mark GuC as not yet supported. (Chris Wilson)
v4: Move flag setting to intel_engines_reset_default_submission.
(Chris Wilson)
v5: Move flag setting to logical_ring_setup.
v6: intel_engines_reset_default_submission is the wrong place to set the
flag - it needs to be in execlists_set_default_submission. (Sagar)
v7: Flag setting in logical_ring_setup is not required. (Chris)
Joonas Lahtinen [Mon, 27 Nov 2017 09:12:33 +0000 (11:12 +0200)]
drm/i915: Disable THP until we have a GPU read BW W/A
We seem to be missing some W/A for 2M pages and are getting
a hit on raw GPU read bandwidths (even 30%) even though the
GPU write bandwidths improve (even 10%).
For now, disable THP, which is our only practical source of
2M pages until we have a W/A for the issue.
v2:
- Be explicit that we talk about GPU bandwidths (Eero)
- s/deny/never/ because that's why (Chris)
Reported-by: Valtteri Rantala <valtteri.rantala@intel.com> Fixes: b901bb89324a ("drm/i915/gemfs: enable THP") Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Valtteri Rantala <valtteri.rantala@intel.com> Cc: Eero Tamminen <eero.t.tamminen@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Tested-by: Valtteri Rantala <valtteri.rantala@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171127091233.7001-1-joonas.lahtinen@linux.intel.com
Tvrtko Ursulin [Tue, 28 Nov 2017 10:55:15 +0000 (10:55 +0000)]
drm/i915/pmu: Return -EINVAL when selecting the inactive CPU
In commit 0426c0465461 ("drm/i915/pmu: Only allow running on a single
CPU") I attempted to clarify the CPU hotplug logic in our PMU
implementation, but missed that a more logical error to return, when
attempting to initialize an event on a currently inactive CPU, is -EINVAL
rather than -ENODEV.
This is because i915 PMU explicitly disallows running counters on more
than one CPU at a time, and is not reporting that the requested CPU does
not exist, or is off-line.
Chris Wilson [Tue, 28 Nov 2017 11:01:47 +0000 (11:01 +0000)]
drm/i915: Enable hotplug polling after registering the outputs
Previously we would enable hotplug polling on the outputs immediately
upon construction. This would allow a very early hotplug event to
trigger before we had finishing setting up the driver to handle it.
Instead, move the output polling to the last step of registration, after
we have set up all handlers, including the fbdev configuration.
v2: Symmetrically turnoff the hotplug helper in unregister after the
fbdev is first synchronised then finalized. This stops a late hotplug
event being processed after the interrupts are disabled.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Ville Syrjala <ville.syrjala@linux.intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> #v1 Link: https://patchwork.freedesktop.org/patch/msgid/20171128110147.28654-1-chris@chris-wilson.co.uk
Chris Wilson [Mon, 27 Nov 2017 12:30:54 +0000 (12:30 +0000)]
drm/i915: Rename i915_gem_timelines_mark_idle
The kerneldoc markup for i915_gem_timelines_mark_idle() was incorrect,
so take the opportunity to also convert it from the "mark_idle" to "park"
naming scheme.
drivers/gpu/drm/i915/i915_gem_timeline.c:120: warning: No description found for parameter 'i915'
Chris Wilson [Thu, 23 Nov 2017 11:53:37 +0000 (11:53 +0000)]
drm/i915: Rename shrinker init/cleanup to match driver initialisation phase
Since the shrinker is registered and unregistered during
i915_driver_register and i915_driver_unregister, respectively, rename
the init/cleanup functions to match.
Chris Wilson [Sun, 26 Nov 2017 22:09:01 +0000 (22:09 +0000)]
drm/i915: Record default HW state in the GPU error state
It may be of interest to both compare the active HW context against the
default (aka NULL) context, to see what has been changed and if either are
corrupt.
Chris Wilson [Sun, 26 Nov 2017 21:48:56 +0000 (21:48 +0000)]
drm/i915: Flush everything on switching to the kernel_context
Even though all rendering should have been flushed at the end of the
previous requests, add an extra flush after switching to the
kernel_context. As the switch to the kernel_context is used when idling
the gpu (e.g. suspend), having an extra layer of paranoia to ensure
everything is flushed to memory seems sensible.
The alternative intel_backlight_device_register() definition apparently
never got used, but I have now run into a case of i915 being compiled
without CONFIG_BACKLIGHT_CLASS_DEVICE, resulting in a number of
identical warnings:
drivers/gpu/drm/i915/intel_drv.h:1739:12: error: 'intel_backlight_device_register' defined but not used [-Werror=unused-function]
This marks the function as 'inline', which was surely the original
intention here.
Chris Wilson [Sat, 25 Nov 2017 19:41:55 +0000 (19:41 +0000)]
drm/i915/fbdev: Serialise early hotplug events with async fbdev config
As both the hotplug event and fbdev configuration run asynchronously, it
is possible for them to run concurrently. If configuration fails, we were
freeing the fbdev causing a use-after-free in the hotplug event.
In order to keep the dev_priv->ifbdev alive after failure, we have to
avoid the free and leave it empty until we unload the module (which is
less than ideal, but a necessary evil for simplicity). Then we can use
intel_fbdev_sync() to serialise the hotplug event with the configuration.
The serialisation between the two was removed in commit 934458c2c95d
("Revert "drm/i915: Fix races on fbdev""), but the use after free is much
older, commit 366e39b4d2c5 ("drm/i915: Tear down fbdev if initialization
fails")
Fixes: 366e39b4d2c5 ("drm/i915: Tear down fbdev if initialization fails") Fixes: 934458c2c95d ("Revert "drm/i915: Fix races on fbdev"") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Lukas Wunner <lukas@wunner.de> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: stable@vger.kernel.org Reviewed-by: Lukas Wunner <lukas@wunner.de> Link: https://patchwork.freedesktop.org/patch/msgid/20171125194155.355-1-chris@chris-wilson.co.uk
Michal Wajdeczko [Fri, 24 Nov 2017 17:02:39 +0000 (17:02 +0000)]
drm/i915/guc: Use consistent name for scratch register count
We should be consistent on naming of similar definitions.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171124170239.29360-1-michal.wajdeczko@intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Tvrtko Ursulin [Fri, 24 Nov 2017 17:13:31 +0000 (17:13 +0000)]
drm/i915/pmu: Aggregate all RC6 states into one counter
Chris has discovered that RC6, RC6p and RC6pp counters are mutually
exclusive, and even that on some SNB SKUs you get RC6p increasing, and on
the others RC6.
Furthermore RC6p and RC6pp were only present starting from GEN6 until,
GEN7, not including Haswell.
All this combined makes it questionable whether we need to reserve new ABI
for these counters. One idea was to just combine them all under the RC6
counter to simplify things for userspace. So that is what this patch does.
Chris Wilson [Fri, 24 Nov 2017 13:00:30 +0000 (13:00 +0000)]
drm/i915: Use exponential backoff for wait_for()
Instead of sleeping for a fixed 1ms (roughly, depending on timer slack),
start with a small sleep and exponentially increase the sleep on each
cycle.
A good example of a beneficiary is the guc mmio communication channel.
Typically we expect (and so spin) for 10us for a quick response, but this
doesn't cover everything and so sometimes we fallback to the millisecond+
sleep. This incurs a significant delay in time-critical operations like
preemption (igt/gem_exec_latency), which can be improved significantly by
using a small sleep after the spin fails.
We've made this suggestion many times, but had little experimental data
to support adding the complexity.
v2: Bump the minimum usleep to 10us on advice of
Documentation/timers/timers-howto.txt (Tvrko)
v3: Specify min, max range for usleep intervals -- some code may
crucially depend upon and so want to specify the sleep pattern.
References: 1758b90e38f5 ("drm/i915: Use a hybrid scheme for fast register waits") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: John Harrison <John.C.Harrison@intel.com> Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: Ville Syrjala <ville.syrjala@linux.intel.com> Reviewed-by: Michał Winiarski <michal.winiarski@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171124130031.20761-2-chris@chris-wilson.co.uk
Chris Wilson [Fri, 24 Nov 2017 13:00:29 +0000 (13:00 +0000)]
drm/i915/guc: Tidy ELSP port assignment
Since we know that the port is empty, we do not need to extract the
count from the old request it and copy it over to the new request, or
attempt to unref the NULL old request pointer.
Chris Wilson [Fri, 24 Nov 2017 13:37:44 +0000 (13:37 +0000)]
drm/i915/guc: Advance over port[0] if set and not preempting
Our execlist emulation is intended to only use a maximum of 2 ports per
engine, so as to not overflow the wq. (By knowing the limits, we can
avoid having to handle the wq exhaustion.) However, upon adding
preemption, we lost the skip over the first port if set for the
non-preemption path. Restore it.
Reported-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Fixes: c41937fd994a ("drm/i915/guc: Preemption! With GuC") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171124133745.5173-1-chris@chris-wilson.co.uk Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Ville Syrjälä [Thu, 23 Nov 2017 19:41:57 +0000 (21:41 +0200)]
drm/i915: Prevent zero length "index" write
The hardware always writes one or two bytes in the index portion of
an indexed transfer. Make sure the message we send as the index
doesn't have a zero length.
Cc: stable@vger.kernel.org Cc: Daniel Kurtz <djkurtz@chromium.org> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Sean Paul <seanpaul@chromium.org> Fixes: 56f9eac05489 ("drm/i915/intel_i2c: use INDEX cycles for i2c read transactions") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171123194157.25367-3-ville.syrjala@linux.intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Ville Syrjälä [Thu, 23 Nov 2017 19:41:56 +0000 (21:41 +0200)]
drm/i915: Don't try indexed reads to alternate slave addresses
We can only specify the one slave address to indexed reads/writes.
Make sure the messages we check are destined to the same slave
address before deciding to do an indexed transfer.
Cc: stable@vger.kernel.org Cc: Daniel Kurtz <djkurtz@chromium.org> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Sean Paul <seanpaul@chromium.org> Fixes: 56f9eac05489 ("drm/i915/intel_i2c: use INDEX cycles for i2c read transactions") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171123194157.25367-2-ville.syrjala@linux.intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Tvrtko Ursulin [Fri, 24 Nov 2017 09:49:59 +0000 (09:49 +0000)]
drm/i915/pmu: Stop averaging with the previous sample
Averaging with the previous sample brings a small statistical improvement
to sampling counters, but can leek a little bit of state from a current
client to the next which mulls the border between past and present for
observing clients.
This is because on event enable clients record the current counter value
and use it as reference, but with rapid off-on event cycles, and due the
delayed nature of sampling timer self-disarm, previous sample value does
not get cleared under these circumstances.
Solution is to stop averaging with the previous sample. This has a small
downside of losing some precision with short and spiky signals, but the
alternatives look too complicated for the benefit.
Michal Wajdeczko [Fri, 24 Nov 2017 09:53:40 +0000 (09:53 +0000)]
drm/i915/guc: Rename i915_guc_reg.h to intel_guc_reg.h
We are using intel_ prefix for all file names with hardware
related definitions. GuC registers also fall into this category.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com> Reviewed-by: Sagar Arun Kamble <sagar.a.kamble@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171124095340.1500-1-michal.wajdeczko@intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Tvrtko Ursulin [Thu, 23 Nov 2017 12:34:32 +0000 (12:34 +0000)]
drm/i915/pmu: Only allow running on a single CPU
We do two things, both of which are purely to simplify and clarify the
implementation:
1.
Simplify the CPU online callback so it is more obvious that the purpose
there is to set a single CPU mask bit for the first CPU which comes
online. Using cpumask_weight for this reads more obvious than the trick
with cpumask_and_any.
2.
Modify the event init so that events can be created only on a single CPU.
This removes looking at the requested CPU thread siblings, and only allows
creating on the current active CPU.
Even for static CPU configurations, the hotplug CPU framework is still
used to determine the CPU topology, and is still being used by the perf
event register to check for valid CPUs.
Fixes: b46a33e271ed ("drm/i915/pmu: Expose a PMU interface for perf queries") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171123123432.25035-1-tvrtko.ursulin@linux.intel.com
Chris Wilson [Thu, 23 Nov 2017 23:37:12 +0000 (23:37 +0000)]
drm/i915/selftests: Hold rpm wakeref for request + ggtt usage
Since the removal of the delayed rc6 enabling, we now setup and drop the
early rpm wakeref during modules initialisation before we start the live
selftests. As such, we are now detecting errors in the tests where we
were not holding the required wakeref for various actions. As rpm is not
the primary goal of the tests involved, take a coarse and convenient rpm
wakeref around the tests.
Chris Wilson [Thu, 23 Nov 2017 21:17:51 +0000 (21:17 +0000)]
drm/i915/pmu: Hide the (unsigned long)ptr cast
We pretend the PMU config id is a pointer value when encoding it into
the device parameters for presentation via sysfs. This requires casting
of an unsigned long into and out of the pointer member, which annoys
smatch:
drivers/gpu/drm/i915/i915_pmu.c:684 i915_pmu_event_show() warn: argument 3 to %lx specifier is cast from pointer
Instead of abusing a generic dev_ext_attribute, define our own typesafe
attributes.
Chris Wilson [Thu, 23 Nov 2017 15:26:31 +0000 (15:26 +0000)]
drm/i915: Move mi_set_context() into the legacy ringbuffer submission
The legacy i915_switch_context() is only applicable to the legacy
ringbuffer submission method, so move it from the general
i915_gem_context.c to intel_ringbuffer.c (rename pending!).
The legacy context switch for ringbuffer submission is multistaged,
where each of those stages may fail. However, we were updating global
state after some stages, and so we had to force the incomplete request
to be submitted because we could not unwind. Save the global state
before performing the switches, and so enable us to unwind back to the
previous global state should any phase fail. We then must cancel the
request instead of submitting it should the construction fail.
Matthew Auld [Thu, 23 Nov 2017 13:54:21 +0000 (13:54 +0000)]
drm/i915/selftests: test descending addresses
For igt_write_huge make sure the higher gtt offsets don't feel left out,
which is especially true when dealing with the 48b PPGTT, where we
timeout long before we are able exhaust the address space.
Daniel Vetter [Mon, 13 Nov 2017 16:01:40 +0000 (17:01 +0100)]
drm/i915: sync dp link status checks against atomic commmits
Two bits:
- check actual atomic state, the legacy stuff can only be looked at
from within the atomic_commit_tail function, since it's only
protected by ordering and not by any locks.
- Make sure we don't wreak the work an ongoing nonblocking commit is
doing.
v2: We need the crtc lock too, because a plane update might change it
without having to acquire the connection_mutex (Maarten). Use
Maarten's changes for this locking, while keeping the logic that uses
the connection->commit->hw_done signal for syncing with nonblocking
commits.
v3: The initial state objects from the hw state readout do not have a
commit object. Check for that (spotted by CI).
v4: Fix deadlock from jumping to put_power with locks still held.
(mlankhorst)
Cc: Manasi Navare <manasi.d.navare@intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
References: https://bugs.freedesktop.org/show_bug.cgi?id=103336
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99272 Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20171113160140.22679-1-maarten.lankhorst@linux.intel.com
Tvrtko Ursulin [Thu, 23 Nov 2017 10:26:54 +0000 (10:26 +0000)]
drm/i915/pmu: Clear the previous sample value when parking
When turning off the engines, and the pmu sampling, clear the previous
value as the current measurement should be 0.
v2: Use a for-loop
v3:
* Move clearing to timer self-dis-arm to avoid race with parking.
* Clear frequency samples as well.
v4:
* Init frequency to idle_freq. (Chris Wilson)
Chris Wilson [Wed, 22 Nov 2017 22:25:10 +0000 (22:25 +0000)]
drm/i915: Save/restore irq state for vlv_residency_raw()
Since commit 6060b6aec03c ("drm/i915/pmu: Add RC6 residency metrics"),
vlv_residency_raw() may be called from an irq-disabled context (via perf
event sampling on remote cpu). As such, we can no longer assume that we
are called from process context and must save/restore the irq state for
the spinlock.
Chris Wilson [Wed, 22 Nov 2017 17:26:21 +0000 (17:26 +0000)]
drm/i915: Call i915_gem_init_userptr() before taking struct_mutex
We don't need struct_mutex to initialise userptr (it just allocates a
workqueue for itself etc), but we do need struct_mutex later on in
i915_gem_init() in order to feed requests onto the HW.
This should break the chain
[ 385.697902] ======================================================
[ 385.697907] WARNING: possible circular locking dependency detected
[ 385.697913] 4.14.0-CI-Patchwork_7234+ #1 Tainted: G U
[ 385.697917] ------------------------------------------------------
[ 385.697922] perf_pmu/2631 is trying to acquire lock:
[ 385.697927] (&mm->mmap_sem){++++}, at: [<ffffffff811bfe1e>] __might_fault+0x3e/0x90
[ 385.697941]
but task is already holding lock:
[ 385.697946] (&cpuctx_mutex){+.+.}, at: [<ffffffff8116fe8c>] perf_event_ctx_lock_nested+0xbc/0x1d0
[ 385.697957]
which lock already depends on the new lock.
Chris Wilson [Wed, 22 Nov 2017 12:06:00 +0000 (12:06 +0000)]
drm/i915/selftests: Use NOWARN for large allocations
We may try to do a large kmalloc for the permutation array, falling back
to a smaller array/test if the first allocation fails. Since we are
intentionally trying a large allocation which may fail, pass __GFP_NOWARN.
Tvrtko Ursulin [Tue, 21 Nov 2017 18:18:49 +0000 (18:18 +0000)]
drm/i915/pmu: Wire up engine busy stats to PMU
We can use engine busy stats instead of the sampling timer for
better accuracy.
By doing this we replace the stohastic sampling with busyness
metric derived directly from engine activity. This is context
switch interrupt driven, so as accurate as we can get from
software tracking.
As a secondary benefit, we can also not run the sampling timer
in cases only busyness metric is enabled.
v2: Rebase.
v3:
* Rebase, comments.
* Leave engine busyness controls out of workers.
v4: Checkpatch cleanup.
v5: Added comment to pmu_needs_timer change.
v6:
* Rebase.
* Fix style of some comments. (Chris Wilson)
v7: Rebase and commit message update. (Chris Wilson)
v8: Add delayed stats disabling to improve accuracy in face of
CPU hotplug events.
v9: Rebase.
v10: Rebase - i915_modparams.enable_execlists removal.
Tvrtko Ursulin [Tue, 21 Nov 2017 18:18:48 +0000 (18:18 +0000)]
drm/i915: Engine busy time tracking
Track total time requests have been executing on the hardware.
We add new kernel API to allow software tracking of time GPU
engines are spending executing requests.
Both per-engine and global API is added with the latter also
being exported for use by external users.
v2:
* Squashed with the internal API.
* Dropped static key.
* Made per-engine.
* Store time in monotonic ktime.
v3: Moved stats clearing to disable.
v4:
* Comments.
* Don't export the API just yet.
v5: Whitespace cleanup.
v6:
* Rename ref to active.
* Drop engine aggregate stats for now.
* Account initial busy period after enabling stats.
v7:
* Rebase.
v8:
* Move context in notification after the notifier. (Chris Wilson)
v9:
In cases where stats tracking is getting disabled while there is
an active context on an engine, add up the current value to the
total. This also implies we don't clear the total when tracking
is disabled any longer. There is no real need to do so because
we define the stats as relative while enabled, meaning
comparison between two samples while tracking is enabled is the
valid usage. However, when busy stats will later be plugged into
the perf PMU API, it is beneficial to not reset the total, since
the PMU core likes to do some counter disable/enable cycles on
startup, and while doing so during a single long context
executing on an engine we would lose some accuracy and so make
unit testing more difficult than needs to be.
v10:
* Fix accounting for preemption.
v11:
* Rebase for i915_modparams.enable_execlists removal.