Chris Wilson [Mon, 20 Nov 2017 10:20:01 +0000 (10:20 +0000)]
drm/i915: Pull the unconditional GPU cache invalidation into request construction
As the request will, in the following patch, implicitly invoke a
context-switch on construction, we should precede that with a GPU TLB
invalidation. Also, even before using GGTT, we always want to invalidate
the TLBs for any updates (as well as the ppgtt invalidates that are
unconditionally applied by execbuf). Since we almost always require the
TLB invalidate, do it unconditionally on request allocation and so we can
remove it from all other paths.
Chris Wilson [Mon, 20 Nov 2017 12:34:57 +0000 (12:34 +0000)]
drm/i915/execlists: Assert that we don't get mixed IDLE_ACTIVE | COMPLETE events
If IDLE_ACTIVE is set, then all other bits are invalid. For us, we can
assert that if we see a COMPLETE | PREEMPTED event, then it should be
impossible for it to also contain an IDLE_ACTIVE flag.
Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Michal Winiarski <michal.winiarski@intel.com> Cc: Michel Thierry <michel.thierry@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171120123458.23242-3-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Chris Wilson [Mon, 20 Nov 2017 12:34:56 +0000 (12:34 +0000)]
drm/i915/execlists: Reduce completed event mask to COMPLETE | PREEMPTED
Since we get a COMPLETE event when the context switch occurs on
RING_HEAD == RING_TAIL and a PREEMPTED event when a switch occurs
before that point, COMPLETE | PREEMPTED should cover all possible context
switch completion events. We can move the ELEMENT_SWITCH info message
from the COMPLETED_MASK into an assertion for when we are performing a
switch to port[1].
Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Michal Winiarski <michal.winiarski@intel.com> Cc: Michel Thierry <michel.thierry@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171120123458.23242-2-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
execlists has listened to (ACTIVE_IDLE | ELEMENT_SWITCH) for detecting
when one context completed and it either continued onto the next (in port
1) or idled. We would always see COMPLETE | ACTIVE_IDLE on the final
context-switch event, but on recent gen it appears that we now get
separate ACTIVE_IDLE and COMPLETE events. In particular, the ACTIVE_IDLE
events may not be coupled to a context (since it is a general state rather
than a specific context completion event).
v2: Update the history, execlists did originally start out by listening
to the COMPLETE event not ACTIVE_IDLE.
v3: Update preempt completion test to also use COMPLETE not ACTIVE_IDLE.
References: bspec/12255
References: https://bugs.freedesktop.org/show_bug.cgi?id=103800 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Oscar Mateo <oscar.mateo@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Michal Winiarski <michal.winiarski@intel.com> Cc: Michel Thierry <michel.thierry@intel.com> Acked-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171120123458.23242-1-chris@chris-wilson.co.uk
Ville Syrjälä [Thu, 16 Nov 2017 16:02:15 +0000 (18:02 +0200)]
drm/i915: Fix init_clock_gating for resume
Moving the init_clock_gating() call from intel_modeset_init_hw() to
intel_modeset_gem_init() had an unintended effect of not applying
some workarounds on resume. This, for example, cause some kind of
corruption to appear at the top of my IVB Thinkpad X1 Carbon LVDS
screen after hibernation. Fix the problem by explicitly calling
init_clock_gating() from the resume path.
I really hope this doesn't break something else again. At least
the problems reported at https://bugs.freedesktop.org/show_bug.cgi?id=103549
didn't make a comeback, even after a hibernate cycle.
v2: Reorder the init_clock_gating vs. modeset_init_hw to match
the display reset path (Rodrigo)
Cc: stable@vger.kernel.org Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Fixes: 6ac43272768c ("drm/i915: Move init_clock_gating() back to where it was") Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171116160215.25715-1-ville.syrjala@linux.intel.com Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Chris Wilson [Fri, 17 Nov 2017 10:26:35 +0000 (10:26 +0000)]
drm/i915: Add a policy note for removing workarounds
Rodrigo gave a persuasive argument for keeping workarounds: that they
serve as a good guide for the bring up of the next generation. Not only
do workarounds persist into the early revisions, they show where the
workarounds were previously added to the code flow and sometimes the old
workarounds have an explanation that give insight into their wider
implications.
Based on his suggestion, document the policy that we want to keep the
workarounds from the current generation to guide the next. Older
preproduction workarounds we still want to remove to keep the code
clean.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20171117102635.8689-1-chris@chris-wilson.co.uk Acked-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Chris Wilson [Fri, 17 Nov 2017 16:29:45 +0000 (16:29 +0000)]
drm/i915/selftests: Report ENOMEM clearly for an allocation failure
If we can not run the drunk_hole test because we couldn't allocate the
memory for the permutation array (even after we tried trimming the
size), report a clear ENOMEM. Similary, if we are asked to operate on a
hole too small for ourselves, make it skip quietly.
v2: Avoid malloc(0) since that returns ZERO_SIZE_PTR not NULL.
v3: Fixup similar construction for lowlevel_hole
v4: Use u64 >> 1 to avoid 64b div.
HSD says "WA withdrawn. It was causing corruption with some images.
WA is not strictly necessary since this bug just causes loss of FBC
compression with some sizes and images, but doesn't break anything."
The watermarks it should calculate against are the old optimal watermarks.
The currently active crtc watermarks are pure fiction, and are invalid in
case of a nonblocking modeset, page flip enabling/disabling planes or any
other reason.
When the crtc is disabled or during a modeset the intermediate watermarks
don't need to be programmed separately, and could be directly assigned
to the optimal watermarks.
CXSR must always be disabled in the intermediate case for modesets,
else we get a WARN for vblank wait timeout.
Also rename crtc_state to new_crtc_state, to distinguish it from the old
state.
The watermarks it should calculate against are the old optimal watermarks.
The currently active crtc watermarks are pure fiction, and are invalid in
case of a nonblocking modeset, page flip enabling/disabling planes or any
other reason.
When the crtc is disabled or during a modeset the intermediate watermarks
don't need to be programmed separately, and could be directly assigned
to the optimal watermarks.
CXSR must always be disabled in the intermediate case for modesets, else
we get a WARN for vblank wait timeout.
Also rename crtc_state to new_crtc_state, to distinguish it from the old state.
Changes since v1:
- Use intel_atomic_get_old_crtc_state. (ville)
Changes since v2:
- Always unset cxsr during modeset.
drm/i915: Enable FIFO underrun reporting after initial fastset, v4.
The firmware may have set up the pipe correctly, but the FIFO
underrun and CRC interrupts are likely not enabled.
This resulted in debugfs_test.read_all_entries failing on haswell,
because of a timeout when reading the crc debugfs entry.
Solve this by enabling FIFO underrun reporting after the initial
fastset, which lets interrupts be generated as expected.
Changes since v1:
- Always enable CPU FIFO underrun reporting for >GEN2,
and handle GEN2 correctly.
Changes since v2:
- Remove unneeded HAS_DDI, simplify GEN2 case.
Changes since v3:
- Use intel_crtc_pch_transcoder to determine pch transcoder for underruns. (Ville)
- Remove crtc->config dereference in intel_crtc_pch_transcoder. (Ville)
Chris Wilson [Tue, 14 Nov 2017 17:35:20 +0000 (17:35 +0000)]
drm/i915: Mark the userptr invalidate workqueue as WQ_MEM_RECLAIM
Commit 21cc6431e0c2 ("drm/i915: Mark the userptr invalidate workqueue
as WQ_MEM_RECLAIM") tried to fixup the check_flush_dependency warning
for hitting i915_gem_userptr_mn_invalidate_range_start from within the
shrinker, but I failed to notice userptr has 2 similarly named
workqueues. I marked up i915-userptr-acquire as WQ_MEM_RECLAIM whereas
we only wait upon i915-userptr-release from inside the reclaim paths.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103739 Fixes: 21cc6431e0c2 ("drm/i915: Mark the userptr invalidate workqueue as WQ_MEM_RECLAIM") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michał Winiarski <michal.winiarski@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171114173520.8829-1-chris@chris-wilson.co.uk Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Michel Thierry [Thu, 16 Nov 2017 22:06:31 +0000 (14:06 -0800)]
drm/i915/selftests: Add a GuC doorbells selftest
The first test aims to check guc_init_doorbell_hw, changing the existing
guc clients and doorbells state before calling it.
The second test tries to create as many clients as it is currently possible
(currently limited to max number of doorbells) and exercise the doorbell
alloc/dealloc code.
Since our usage mode require very few clients/doorbells, this code has
been exercised very lightly and it's good to have a simple test for it.
As reference, this test already helped identify the bug fixed by
commit 7f1ea2ac3017 ("drm/i915/guc: Fix doorbell id selection").
v2: Extend number of clients; check for client allocation failure when
number of doorbells is exceeded; validate client properties; reuse
guc_init_doorbell_hw (Chris).
v3: guc_init_doorbell_hw test added per Chris suggestion.
v4: Try to explain why guc_init_doorbell_hw exist and comment some
details in the subtest.
v5: Remove redundant pr_info at the beginning of each subtest (Chris);
rebase (s/i915_guc_client/intel_guc_client/).
Signed-off-by: Michel Thierry <michel.thierry@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171116220632.1909-1-michel.thierry@intel.com Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Rodrigo Vivi [Thu, 16 Nov 2017 20:12:28 +0000 (12:12 -0800)]
Merge tag 'gvt-next-2017-11-16' of https://github.com/intel/gvt-linux into drm-intel-next-queued
gvt-next-2017-11-16
- CSB HWSP update support (Weinan)
- GVT debug helpers, dyndbg and debugfs (Chuanxiao, Shuo)
- full virtualized opregion (Xiaolin)
- VM health check for sane fallback (Fred)
- workload submission code refactor for future enabling (Zhi)
- Updated repo URL in MAINTAINERS (Zhenyu)
- other many misc fixes
Rodrigo Vivi [Wed, 15 Nov 2017 18:42:57 +0000 (10:42 -0800)]
drm/i915/cnl: Simplify dco_fraction calculation.
I confess I never fully understood that previous calculation,
so this is not a "fix". But let's simplify this math
so poor brains like mine can read and make some sense of
it in the future.
v2: Don't follow the spec since that gives invalid
values and it is also confusing. This Ville's
version is much simpler.
v3: Use u64 cast instead of declaring a u64 dco. (Ville).
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Mika Kahola <mika.kahola@intel.com> Cc: Manasi Navare <manasi.d.navare@intel.com> Cc: James Ausmus <james.ausmus@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171115184257.8633-1-rodrigo.vivi@intel.com
Rodrigo Vivi [Tue, 14 Nov 2017 19:47:57 +0000 (11:47 -0800)]
drm/i915/cnl: Don't blindly replace qdiv.
Accordingly to spec "If Kdiv != 2, then Qdiv must be 1."
but we already handle qdiv values properly and this case here
should be spurious. But instead of blindly replacing let's
warn loudly instead. Because it means something was really
wrong on initial setup.
Rodrigo Vivi [Tue, 14 Nov 2017 23:42:23 +0000 (15:42 -0800)]
drm/i915/cnl: Fix wrpll math for higher freqs.
Spec describe all values in MHz. We handle our
clocks in KHz. This includes the best_dco_centrality that was
forgot in the same unity as spec. Consequently we couldn't
get a good divider for high frequenies. Hence HDMI 2.0 wasn't
working.
Spec tells 999999 for initial best_dco_centrality meaning the
max value in MHz.
Since we convert dco from MHz to KHz we also need to convert
this initial best_doc_centrality to 999999000 or 999999999
or even better, to the max that its variable allow.
This patch also replaces the use of "* KHz(1)" with the values
directly on KHz to avoid future confusion.
v2: Use U32_MAX instead of random 99999 as spec tells. (Ville).
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Shashank Sharma <shashank.sharma@intel.com> Cc: Mika Kahola <mika.kahola@intel.com> Cc: Manasi Navare <manasi.d.navare@intel.com> Cc: James Ausmus <james.ausmus@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171114234223.10600-1-rodrigo.vivi@intel.com
Rodrigo Vivi [Tue, 14 Nov 2017 19:47:55 +0000 (11:47 -0800)]
drm/i915/cnl: Fix, simplify and unify wrpll variable sizes.
- 64 bits is not needed for afe_clock now we don't convert
that to Hz.
- 16 bits is not enough for all dco stuff.
- unsigned is not relevant/needed for all divisors values.
Chris Wilson [Wed, 15 Nov 2017 15:25:58 +0000 (15:25 +0000)]
drm/i915/selftests: exercise_ggtt may have nothing to do
When operating on the live_ggtt we have to find a usuable hole for our
test. It is possible for there to be no hole we can use, so initialise
the err to 0 for the early exit.
Ville Syrjälä [Wed, 15 Nov 2017 20:04:42 +0000 (22:04 +0200)]
drm/i915: Don't sanitize frame start delay if the pipe is off
Avoid touching PIPECONF in intel_sanitize_crtc() unless the pipe is
actually on. Should cure some unclaimed register accesses during reset,
as we are rather cavalier in our approach to powerdomain management.
We don't have to sanitize this if the pipe is off since we will
overwrite the frame start delay anyway when turning the pipe on.
v2: Amended commit message to implicate the reset path (Chris)
drm/i915/guc: Rename i915_guc_submission.c|h to intel_guc_submission.c|h
With all component structures and functions named appropriately, change
the names of GuC submission source files. There were bunch of style issues
in guc_submission.c that are highlighted now by checkpatch. Fix those.
Update name in Documentation/gpu. (Joonas)
v2: Rebase.
v3: Rebase.
Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Michal Winiarski <michal.winiarski@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Acked-by: Oscar Mateo <oscar.mateo@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/1510839162-25197-6-git-send-email-sagar.a.kamble@intel.com Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
drm/i915/guc: Rename i915_guc_client struct to intel_guc_client
GuC submission clients are currently being used in kernel only hence
update the structure name to intel_guc_client.
Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Michal Winiarski <michal.winiarski@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Acked-by: Oscar Mateo <oscar.mateo@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/1510839162-25197-5-git-send-email-sagar.a.kamble@intel.com Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
drm/i915/guc: Update name and prototype of GuC submission interface functions
i915 GuC submission is hardware interface and GuC APIs that are not user
facing should be named intel_guc* hence we change GuC submission related
functions name prefix to intel_guc. Also changed the parameter to these
functions to intel_guc struct.
v2: Using local guc variable in intel_uc_fini_hw. (Michal Wajdeczko)
Rebase.
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Michal Winiarski <michal.winiarski@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Acked-by: Oscar Mateo <oscar.mateo@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/1510839162-25197-4-git-send-email-sagar.a.kamble@intel.com Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
drm/i915/guc: Update names of submission related static functions
i915_guc_submit, i915_guc_dequeue, i915_guc_submission_park and
i915_guc_submission_upark are functions internal to GuC submission
hence remove "i915_" prefix.
Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Michal Winiarski <michal.winiarski@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/1510839162-25197-3-git-send-email-sagar.a.kamble@intel.com Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
intel_lrc_irq_handler and i915_guc_irq_handler are HW submission related
tasklet functions. Name them with "submission_tasklet" suffix and
remove intel/i915 prefix as they are static. Also rename irq_tasklet
as just tasklet for clarity.
v2: s/_bh/_tasklet (Chris)
Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Michal Winiarski <michal.winiarski@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/1510839162-25197-2-git-send-email-sagar.a.kamble@intel.com Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Chris Wilson [Thu, 16 Nov 2017 10:50:59 +0000 (10:50 +0000)]
drm/i915: Prevent overflow of execbuf.buffer_count and num_cliprects
We check whether the multiplies will overflow prior to calling
kmalloc_array so that we can respond with -EINVAL for the invalid user
arguments rather than treating it as an -ENOMEM that would otherwise
occur. However, as Dan Carpenter pointed out, we did an addition on the
unsigned int prior to passing to kmalloc_array where it would be
promoted to size_t for the calculation, thereby allowing it to overflow
and underallocate.
v2: buffer_count is currently limited to INT_MAX because we treat it as
signaled variable for LUT_HANDLE in eb_lookup_vma
v3: Move common checks for eb1/eb2 into the same function
v4: Put the check back for nfence*sizeof(user_fence) overflow
v5: access_ok uses ULONG_MAX but kvmalloc_array uses SIZE_MAX
v6: size_t and unsigned long are not type-equivalent on 32b
Chris Wilson [Wed, 15 Nov 2017 12:14:58 +0000 (12:14 +0000)]
drm/i915: Clear breadcrumb node when cancelling signaling
When we call intel_engine_cancel_signaling() to stop reporting when
a request is completed via an asynchronous signal, we remove that request
from the breadcrumb wait queue. However, we may be concurrently
processing that request in the signaler itself, the actual operations on
the request's node itself are serialised but we do not actually clear the
waiter after removing it from the tree allowing both parties to attempt
to do so and corrupting the rbtree. (Previously removing from the
breadcrumb wait queue could only be done on behalf of i915_wait_request,
so this race could not happen).
Reported-by: "He, Bo" <bo.he@intel.com> Fixes: 9eb143bbec7d ("drm/i915: Allow a request to be cancelled") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: "He, Bo" <bo.he@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171115121458.24655-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
fred gao [Tue, 14 Nov 2017 09:09:35 +0000 (17:09 +0800)]
drm/i915/gvt: Move request alloc to dispatch_workload path only
Previously the performance is improved through the workload auditing
and shadowing ahead of vGPU scheduling, however, there is the case that
more requests are allocated in submit_context before the previous request
is added, the timeline will hold its seqno which is later.
This patch is to move the request alloc to dispatch_workload function,
where is the same place as request is added.
It will fix the issue of kernel BUG for (timeline->seqno != request->fence.seqno)
check when add_request.
Fixes: 89ea20b930cb ("drm/i915/gvt: Factor out scan and shadow from workload dispatch") Signed-off-by: Chuanxiao Dong <chuanxiao.dong@intel.com> Signed-off-by: fred gao <fred.gao@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Xiong Zhang [Tue, 7 Nov 2017 16:45:21 +0000 (00:45 +0800)]
drm/i915/gvt: Let each vgpu has separate opregion memory
Currently every vgpu share a common gvt opregion memory, but
it is freed at vgpu destroy, then the later vgpu doesn't have
opregion memory once the first vgpu is destroyed. This cause
guest function failure like reboot, second or later boot.
This patch allocate and init virt opregion memory for each
vgpu, so this memory could be freed at vgpu destroy.
Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Xiong Zhang [Mon, 6 Nov 2017 21:23:02 +0000 (05:23 +0800)]
drm/i915/gvt: Limit read hw reg to active vgpu
mmio_read_from_hw() let vgpu could read hw reg, if vgpu's workload
is running on hw, things is good. Otherwise vgpu will get other
vgpu's reg val, it is unsafe.
This patch limit such hw access to active vgpu. If vgpu isn't
running on hw, the reg read of this vgpu will get the last active
val which saved at schedule_out.
v2: ring timestamp is walking continuously even if the ring is idle.
so read hw directly. (Zhenyu)
Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Changbin Du [Thu, 2 Nov 2017 05:33:42 +0000 (13:33 +0800)]
drm/i915/gvt: Emulate PCI expansion ROM base address register
Our vGPU doesn't have a device ROM, we need follow the PCI spec to
report this info to drivers. Otherwise, we would see below errors.
Inspecting possible rom at 0xfe049000 (vd=8086:1912 bdf=00:10.0)
qemu-system-x86_64: vfio-pci: Cannot read device rom at 00000000-0000-0000-0000-000000000001
Device option ROM contents are probably invalid (check dmesg).
Skip option ROM probe with rombar=0, or load from file with romfile=No option rom signature (got 4860)
I will also send a improvement patch to PCI subsystem related to PCI ROM.
But no idea to omit below error, since no pattern to detect vbios shadow
without touch its content.
0000:00:10.0: Invalid PCI ROM header signature: expecting 0xaa55, got 0x0000
Signed-off-by: Changbin Du <changbin.du@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Changbin Du [Mon, 23 Oct 2017 03:46:44 +0000 (11:46 +0800)]
drm/i915/gvt: Add new debugfs tool mmio_diff
This new debugfs entry is used to figure out which registers of vGPU
is different to host. It is a useful tool for new platform enabling
and debugging. When read this entry, all the diff mmio are recognized
and sorted by mmio offset. Besides, the bit positions of different
value are listed in 'Diff' column. Here is a show:
This patch add a function intel_gvt_for_each_tracked_mmio() to
iterate each tracked mmio. The caller don't be aware of how the
tracked mmios are presented internally.
v2: remove snapshot_hw_mmio_registers().
Signed-off-by: Changbin Du <changbin.du@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Xiaolin Zhang [Fri, 20 Oct 2017 13:52:29 +0000 (21:52 +0800)]
drm/i915/gvt: opregion virtualization for win guest
this is an enhanced opregion emulation for win guest support
by initializing more data members including opregion header
size, version and child device propertity for display port.
for simplicity, redefined child_device_config structure.
Signed-off-by: Xiaolin Zhang <xiaolin.zhang@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Weinan Li [Fri, 20 Oct 2017 07:16:46 +0000 (15:16 +0800)]
drm/i915/gvt: update CSB and CSB write pointer in virtual HWSP
The engine provides a mirror of the CSB and CSB write pointer in the HWSP.
Read these status from virtual HWSP in VM can reduce CPU utilization while
applications have much more short GPU workloads. Here we update the
corresponding data in virtual HWSP as it in virtual MMIO.
Before read these status from HWSP in GVT-g VM, please ensure the host
support it by checking the BIT(3) of caps in PVINFO.
Virtual HWSP only support GEN8+ platform, since the HWSP MMIO may change
follow the platform update, please add the corresponding MMIO emulation
when enable new platforms in GVT-g.
v3 : Add address audit in HWSP address update.
v4 :
Separate this patch with enalbe virtual HWSP in VM.
Use intel_gvt_render_mmio_to_ring_id() to determine ring_id by offset.
v5 : Remove unnessary check about Gen8, GVT-g only support Gen8+.
Signed-off-by: Weinan Li <weinan.z.li@intel.com> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Zhi Wang [Mon, 16 Oct 2017 18:00:27 +0000 (02:00 +0800)]
drm/i915/gvt: Refine broken PPGTT scratch
Refine previously broken PPGTT scratch. Scratch PTE was no correctly
handled and also the handling of scratch entries in page table walk was
not well organized, which brings gaps of introducing lazy shadow.
Zhi Wang [Tue, 10 Oct 2017 06:34:11 +0000 (14:34 +0800)]
drm/i915/gvt: Let the caller choose if a shadow page should be put into hash table
As we want to re-use intel_vgpu_shadow_page in buidling scrach page table
and we don't want to put scrach page table page into hash table, a new
param is introduced to give the caller a choice to decide if a shadow page
should be put into hash table.
Zhi Wang [Thu, 28 Sep 2017 18:47:55 +0000 (02:47 +0800)]
drm/i915/gvt: Factor intel_vgpu_page_track
As the data structure of "intel_vgpu_guest_page" will become much heavier
in future, it's better to factor out the guest memory page track mechnisim
as early as possible.
Changbin Du [Tue, 26 Sep 2017 08:19:13 +0000 (16:19 +0800)]
drm/i915/gvt: Add basic debugfs infrastructure
We need debugfs entry to expose some debug information of gvt and vGPUs.
The first tool will be added is mmio-diff, which help to find the
difference values of host and vGPU mmio. It's useful for platform
enabling.
This patch just add a basic debugfs infrastructure, each vGPU has its own
sub-folder. Two simple attributes are created as a template.
.
├── num_tracked_mmio
├── vgpu1
| └── active
└── vgpu2
└── active
Signed-off-by: Changbin Du <changbin.du@intel.com> Reviewed-by: Zhi Wang <zhi.a.wang@intel.com> Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
fred gao [Thu, 28 Sep 2017 03:03:02 +0000 (11:03 +0800)]
drm/i915/gvt: Move vGPU type related code into gvt file
In this patch, all the vGPU type related code will be merged into
same gvt file and the common interface will be exposed to both
XenGT and KvmGT.
v2:
- remove the useless mdev_* gvt_ops.
add get_gvt_attr ops for MPT module.
intel_gvt_{init,cleanup}_vgpu_type_groups are initialized in
gvt part. (Wang, Zhi)
- set gvt_vgpu_type_groups[i] to NULL. (Zhang,Xiong)
Signed-off-by: fred gao <fred.gao@intel.com> Reviewed-by: Zhi Wang <zhi.a.wang@intel.com> Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
Shuo Liu [Thu, 21 Sep 2017 00:02:31 +0000 (08:02 +0800)]
drm/i915/gvt: Use dyndbg for gvt debug info
It's better enable/disable and classify gvt debug info dynamically.
This patch change it to dyndbg so can be dynamically enable/disable
each item. All gvt log can be enabled by,
$ echo 'file *gvt* +p' > /sys/kernel/debug/dynamic_debug/control
Signed-off-by: Shuo Liu <shuo.a.liu@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Colin Ian King [Tue, 19 Sep 2017 15:55:34 +0000 (16:55 +0100)]
drm/i915/gvt: ensure -ve return value is handled correctly
An earlier fix changed the return type from find_bb_size however the
integer return is being assigned to a unsigned int so the -ve error
check will never be detected. Make bb_size an int to fix this.
Detected by CoverityScan CID#1456886 ("Unsigned compared against 0")
Fixes: 1e3197d6ad73 ("drm/i915/gvt: Refine error handling for perform_bb_shadow") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
fred gao [Tue, 19 Sep 2017 07:11:28 +0000 (15:11 +0800)]
drm/i915/gvt: Add VM healthy check for workload_thread
When a scan error occurs in dispatch_workload, this patch is to
check the healthy state and free all the queued workloads before
the failsafe mode is entered.
Signed-off-by: fred gao <fred.gao@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
fred gao [Tue, 19 Sep 2017 21:36:47 +0000 (05:36 +0800)]
drm/i915/gvt: Change the return type during command scan
Generally, there are 3 types of errors during command scan: a) some
commands might be unknown with EBADRQC; b) some cmd access invalid
address with EFAULT; c) some unexpected force nonpriv cmd with EPERM.
later the healthy state can be judged through the return error.
v2:
- remove some internal i915 errors rating. (Zhenyu)
v3:
- the healthy state is judged through the internal defined return
error. (Zhenyu)
- force non priv cmd error can be ignored. (Kevin)
v4:
- reuse standard defined errno instead of recreate, e.g EBADRQC for
unknown cmd, EFAULT for invalid address, EPERM for nonpriv. (Zhenyu)
v5:
- remove some irrelevant code for the patch.
- fix typo of vgpu_is_vm_unhealthy. (Zhenyu)
v6:
- move the healthy check and failsafe code into another patch. (Zhenyu)
v7:
- polish title and commit message. (Zhenyu)
Signed-off-by: fred gao <fred.gao@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Zhi Wang [Sun, 10 Sep 2017 14:01:10 +0000 (22:01 +0800)]
drm/i915/gvt: Do not allocate initial ring scan buffer
Theoretically, the largest bulk of commands in the ring buffer of an
engine might be the first submission, which usually contains a lot
of commands to initialize the HW. After removing the initial allocation
of the ring scan buffer and let krealloc() do everything we need, we
still have a big chance to get the buffer of suitable size in the first
submission.
To move workload related functions into scheduler.c, an expected way is
to collect all the init/clean functions related to vGPU workload
submission into fewer functions.
Rename intel_vgpu_{init, clean}_gvt_context() for above usage in future.
Zhi Wang [Sun, 10 Sep 2017 08:40:04 +0000 (16:40 +0800)]
drm/i915/gvt: Make elsp_dwords in the right order
The context descriptors in elsp_dwords are stored in a reversed order and
the definition of context descriptor is also reversed. The revesred stuff
is hard to be used and might cause misunderstanding. Make them in the right
oder for following code re-factoring.
drm/i915/gvt: Add support for opregion virtualization
opregion emulated with a copy from host which leads to some display
bugs such as guest resolution adjustment failure due to host opregion
fail to claim port D support. with a fake opregion table provided
to fully emulate opregion to meet guest port requirement.
v1 - initial patch
v2 - reforamt opregion arrary with 0x02x output
v3 - opregion array removed with opregion generation on host initizaiton
v4 - rebased v3 patch from stable branch to staging branch which also has
different struct child_device_config and addressed v3 review comments.
Signed-off-by: Xiaolin Zhang <xiaolin.zhang@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Chris Wilson [Tue, 14 Nov 2017 13:51:16 +0000 (13:51 +0000)]
drm/i915: Remove pre-production pooled-EU w/a for Broxton
WaEnablePooledEuFor2x6 only applies to preproduction models, unsupported
since commit 0102ba1fd8af ("drm/i915: Add early BXT sdv to the list of
preproduction machines").
Chris Wilson [Wed, 15 Nov 2017 15:12:04 +0000 (15:12 +0000)]
drm/i915: Make request's wait-for-space explicit
At the start of building a request, we would wait for roughly enough
space to fit the average request (to reduce the likelihood of having to
wait and abort partway through request construction). To achieve we
would try to begin a 0-length command packet, this just adds extra
confusion so make the wait-for-space explicit, as in the next patch we
want to move it from the backend to the i915_gem_request_alloc() so it
can ensure that the wait-for-space is the first operation in building a
new request.
Chris Wilson [Wed, 15 Nov 2017 15:12:03 +0000 (15:12 +0000)]
drm/i915/selftests: Increase size for mock ringbuffer
We don't actually emit any commands into the ringbuffer, so we set it
very small. However, an upcoming change centralises the wait-for-space
into i915_gem_request_alloc() and that imposes a minimum size upon all
ringbuffers (mock or real) of MIN_SPACE_FOR_ADD_REQUEST. Grow the
mock ringbuffer such that we allocate a single page for the struct+buffer,
satisfying the new condition without wasting too much space.
Chris Wilson [Wed, 15 Nov 2017 13:17:05 +0000 (13:17 +0000)]
drm/i915: Initialise entry in intel_ppat_get() for older compilers
gcc-4.7.3 is confused by the guards inside intel_ppat_get() and reports:
drivers/gpu/drm/i915/i915_gem_gtt.c: In function ‘intel_ppat_get’:
drivers/gpu/drm/i915/i915_gem_gtt.c:3044:27: warning: ‘entry’ may be used uninitialized in this function [-Wmaybe-uninitialized]
Forgive the compiler this once, and rearrange the code so that entry is
always initialised.
v2: Flavour with a bit of NULL (instead of ERR_PTR(-ENOSPC))
Ville Syrjälä [Thu, 2 Nov 2017 15:17:37 +0000 (17:17 +0200)]
drm/i915: Use ELK stolen memory reserved detection for ILK
While I have no solid proof that ILK follows the ELK path when it
comes to the stolen memory reserved area, there are some hints that
it might be the case. Unfortunately my ILK doesn't have this enabled,
and no way to enable it via the BIOS it seems.
So let's have ILK use the ELK code path, and let's toss in a WARN
into the code to see if we catch anyone with an ILK that has this
enabled to further analyze the situation.
Ville Syrjälä [Thu, 2 Nov 2017 15:17:36 +0000 (17:17 +0200)]
drm/i915: Make the report about a bogus stolen reserved area an error
Now that we should be properly filtering out the cases when the stolen
reserved area is disabled, let's convert the debug message about a
misplaced reserved area into an error.
Ville Syrjälä [Thu, 2 Nov 2017 15:17:35 +0000 (17:17 +0200)]
drm/i915: Check if the stolen memory "reserved" area is enabled or not
Apparently there are some machines that put semi-sensible looking values
into the stolen "reserved" base and size, except those values are actually
outside the stolen memory. There is a bit in the register which
supposedly could tell us whether the reserved area is even enabled or
not. Let's check for that before we go trusting the base and size.
Hans de Goede [Tue, 14 Nov 2017 13:55:18 +0000 (14:55 +0100)]
drm/i915: Call uncore_suspend before platform suspend handlers
Quoting Ville: "the forcewake timer might still be active until the uncore
suspend, and having active forcewakes while we've already told the GT wake
stuff to stop acting normally doesn't seem quite right to me."
Hans de Goede [Tue, 14 Nov 2017 13:55:17 +0000 (14:55 +0100)]
drm/i915: Re-register PMIC bus access notifier on runtime resume
intel_uncore_suspend() unregisters the uncore code's PMIC bus access
notifier and gets called on both normal and runtime suspend.
intel_uncore_resume_early() re-registers the notifier, but only on
normal resume. Add a new intel_uncore_runtime_resume() function which
only re-registers the notifier and call that on runtime resume.
Hans de Goede [Fri, 10 Nov 2017 15:03:01 +0000 (16:03 +0100)]
drm/i915: Fix false-positive assert_rpm_wakelock_held in i915_pmic_bus_access_notifier v2
assert_rpm_wakelock_held is triggered from i915_pmic_bus_access_notifier
even though it gets unregistered on (runtime) suspend, this is caused
by a race happening under the following circumstances:
And pm_runtime_put_autosuspend calls intel_runtime_suspend from
a workqueue, so there is ample of time between the atomic_dec() and
intel_runtime_suspend() unregistering the notifier. If the notifier
gets called in this windowd assert_rpm_wakelock_held falsely triggers
(at this point we're not runtime-suspended yet).
This commit adds disable_rpm_wakeref_asserts and
enable_rpm_wakeref_asserts calls around the
intel_uncore_forcewake_get(FORCEWAKE_ALL) call in
i915_pmic_bus_access_notifier fixing the false-positive WARN_ON.
Changes in v2:
-Reword comment explaining why disabling the wakeref asserts is
ok and necessary
Chris Wilson [Tue, 14 Nov 2017 22:33:46 +0000 (22:33 +0000)]
drm/i915/selftests: Always initialise err
smatch does not track initialised values as well as gcc, and this
triggers many warnings by smatch not presented by gcc. Silence smatch by
initialising the error values to -ENODEV, which we use to denote
internal errors. (If we see a selftest fail with a silent -ENODEV, we
know smatch was right!)
v2: smatch was right about igt_create_vma(), it may unlikely fail on the
first object allocation which we want to be loud about.
Chris Wilson [Tue, 14 Nov 2017 13:03:00 +0000 (13:03 +0000)]
drm/i915: Resume GuC before using GEM
Resuming GEM presumes it can talk to hw, in particular to ensure the
kernel context is loaded upon resume for powersaving. If the GuC is
still asleep at this point, we upset the HW. Rearrange the resume such
that we restore the original order of init-hw, resume-guc, use-gem.
Fixes: 37cd33006d02 ("drm/i915: Remove redundant intel_autoenable_gt_powersave()")
References: a1c419941453 ("drm/i915/guc: Add host2guc notification for suspend and resume") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Alex Dai <yu.dai@intel.com> Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171114130300.25677-2-chris@chris-wilson.co.uk Reviewed-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>