Chris Wilson [Thu, 15 Aug 2019 04:20:30 +0000 (05:20 +0100)]
drm/i915: Move tasklet kicking to __i915_request_queue caller
Since __i915_request_queue() may be called from hardirq (timer) context,
we cannot use local_bh_disable/enable at the lower level. As we do want
to kick the tasklet to speed up initial submission or preemption for
normal client submission, lift it to the normal process context
callpath.
Mika Kuoppala [Thu, 15 Aug 2019 09:49:29 +0000 (12:49 +0300)]
drm/i915/icl: Add gen11 specific render breadcrumbs
Flush according to what gen11 expects when writing
breadcrumbs. As only the seqnowrite + flush differs
between engine and gens, enclose the footer to
helper.
v2: avoid problem of sane local naming by not using them
Mika Kuoppala [Thu, 15 Aug 2019 08:30:53 +0000 (11:30 +0300)]
drm/i915/icl: Implement gen11 flush including tile cache
Add tile cache flushing for gen11. To relieve us from the
burden of previous obsolete workarounds, make a dedicated
flush/invalidate callback for gen11.
To fortify an independent single flush, do post
sync op as there are indications that without it
we don't flush everything. This should also make this
callback more readily usable in tgl (see l3 fabric flush).
Dan reported the following static checker warning:
drivers/gpu/drm/i915/selftests/i915_buddy.c:670 igt_buddy_alloc_range()
error: we previously assumed 'block' could be null (see line 665)
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190815103210.11802-1-matthew.auld@intel.com
Chris Wilson [Thu, 15 Aug 2019 09:36:04 +0000 (10:36 +0100)]
drm/i915: Convert a few more bland dmesg info to be device specific
Looking around the GT initialisation, we have a few log messages we
think are interesting enough to present to the user (such as the amount of L4
cache) and a few to inform them of the result of actions or conflicting
HW restrictions (i.e. quirks). These are device specific messages, so
use the dev family of printk.
Chris Wilson [Tue, 13 Aug 2019 20:09:05 +0000 (21:09 +0100)]
drm/i915: Serialise read/write of the barrier's engine
We use the request pointer inside the i915_active_node as the indicator
of the barrier's status; we mark it as used during
i915_request_add_active_barriers(), and search for an available barrier
in reuse_idle_barrier(). That check must be carefully serialised to
ensure we do use an engine for the barrier and not just a random
pointer. (Along the other reuse path, we are fully serialised by the
timeline->mutex.) The acquisition of the barrier itself is ordered through
the strong memory barrier in llist_del_all().
Chris Wilson [Tue, 13 Aug 2019 18:21:12 +0000 (19:21 +0100)]
drm/i915: Disregard drm_mode_config.fb_base
The fb_base is only used for communicating the GTT BAR from one piece of
the display code (kms setup) to another (fbdev). What is required in the
fbdev is just the aperture address which should be derived from the
bo we allocate for the framebuffer directly.
The same appears true for drm/; it is not used by the core or the uAPI,
it is merely for conveniently passing a device address from one bit of
display management code to another.
v2: Note that since we only expose enough of a system map to cover our
single framebuffer, the screen_base/size and the smem are one and the
same.
The engine->guc_id is GuC FW defined and it is not guaranteed to be
below I915_NUM_ENGINES, so we shouldn't use it with the i915-defined
client->submissions, as we might overflow.
Instead of fixing it, just get rid of client->submissions, because the
information we get from it is not interesting anymore now that we only
have 1 client.
A new macro that is going to be added in a further patch will need to
adjust the offset returned by _MMIO_TRANS2(), so here adding
_TRANS2() and moving most of the implementation of _MMIO_TRANS2() to
it and while at it taking the opportunity to rename pipe to trans.
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Dhinakaran Pandiyan <dhinakaran.pandiya@intel.com>
Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiya@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190730224753.14907-2-jose.souza@intel.com
Chris Wilson [Tue, 13 Aug 2019 19:07:05 +0000 (20:07 +0100)]
drm/i915: Push the wakeref->count deferral to the backend
If the backend wishes to defer the wakeref parking, make it responsible
for unlocking the wakeref (i.e. bumping the counter). This allows it to
time the unlock much more carefully in case it happens to need the
wakeref to be active during its deferral.
For instance, during engine parking we may choose to emit an idle
barrier (a request). To do so, we borrow the engine->kernel_context
timeline and to ensure exclusive access we keep the
engine->wakeref.count as 0. However, to submit that request to HW may
require an intel_engine_pm_get() (e.g. to keep the submission tasklet
alive) and before we allow that we have to rewake our wakeref to avoid a
recursive deadlock.
drm/i915/tgl: Fix missing parentheses on TGL_TRANS_DDI_FUNC_CTL_VAL_TO_PORT
In this case we want to apply the mask and then shift, so the
parentheses are needed.
SPANK! SPANK! SPANK! Naughty programmer!
Fixes: 9749a5b6c09f ("drm/i915/tgl: Fix the read of the DDI that transcoder is attached to")
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190812175405.14479-1-jose.souza@intel.com
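A minimal illustration of why the parentheses matter in the TGL_TRANS_DDI_FUNC_CTL_VAL_TO_PORT fix. The field position, mask and all EX_/ex_ names below are made up for the example and are not the real register layout. In C, `>>` binds tighter than `&`, so without the parentheses the shift is applied to the mask rather than to the masked value.

```c
#include <stdint.h>

/* Hypothetical layout, for illustration only: a 4-bit field at bits 27..30. */
#define EX_FIELD_SHIFT 27
#define EX_FIELD_MASK  (0xfu << EX_FIELD_SHIFT)

/*
 * Buggy form: '>>' has higher precedence than '&', so this evaluates
 * val & (EX_FIELD_MASK >> EX_FIELD_SHIFT), i.e. val & 0xf.
 */
#define EX_VAL_TO_FIELD_BUGGY(val) ((val) & EX_FIELD_MASK >> EX_FIELD_SHIFT)

/* Fixed form: apply the mask first, then shift the field down. */
#define EX_VAL_TO_FIELD(val) (((val) & EX_FIELD_MASK) >> EX_FIELD_SHIFT)

static inline uint32_t ex_decode_buggy(uint32_t val)
{
	return EX_VAL_TO_FIELD_BUGGY(val);
}

static inline uint32_t ex_decode(uint32_t val)
{
	return EX_VAL_TO_FIELD(val);
}
```

With a register value whose field holds 5 and whose low bits hold 3, ex_decode() returns the field while ex_decode_buggy() returns the unrelated low bits.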
Gao, Fred [Thu, 18 Jul 2019 01:39:01 +0000 (09:39 +0800)]
drm/i915/gvt: Utility for valid command length check
Add utility for valid command length check.
v2: Add F_VAL_CONST flag to identify the value is const
although LEN may be variable. (Zhenyu)
v3: unused code removal, flag rename/conflict. (Zhenyu)
v4: redefine F_IP_ADVANCE_CUSTOM and move the check function to
next patch. (Zhenyu)
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Gao, Fred <fred.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Zhi Wang [Mon, 22 Jul 2019 11:07:07 +0000 (14:07 +0300)]
drm/i915/gvt: factor out tlb and mocs register offset table
Factor out tlb and mocs register offset table to fix the issues reported
by klocwork, #512 and #550. Klocwork reports these problems mostly
because a platform that has more rings than the ring offset table could
take dirty data from the stack as the register offset. In this scenario
that results in a write to a random HW register offset when doing a
context switch between vGPUs.
After the factoring, the ring offset table of TLB and MOCS should be per
platform.
v2:
- Enable TLB register switch for GEN8. (Zhenyu)
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
drm/i915/gvt: no need to check return value of debugfs_create functions
When calling debugfs functions, there is no need to ever check the
return value. The function can work or not, but the code logic should
never do something different based on this.
Because there is no need to check these functions, a number of local
functions can be made to return void to simplify things as nothing can
fail.
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: Zhi Wang <zhi.a.wang@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: intel-gvt-dev@lists.freedesktop.org
Cc: intel-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Michal Wajdeczko [Tue, 13 Aug 2019 08:15:59 +0000 (08:15 +0000)]
drm/i915/uc: Log fw status changes only under debug config
We don't care about internal firmware status changes unless
we are doing some real debugging. Note that our CI is not
using DRM_I915_DEBUG_GUC config by default so use it.
Chris Wilson [Mon, 12 Aug 2019 20:36:26 +0000 (21:36 +0100)]
drm/i915/guc: Use a local cancel_port_requests
Since execlists and the guc have diverged in their port tracking, we
cannot simply reuse the execlists cancellation code as it leads to
unbalanced reference counting. Use a local, simpler routine for the guc.
We rely on the tasklet to update the GT PM refcount, so we can't disable
it even if we've processed all the requests for the engine because we
might have detected the request completion before the interrupt arrived.
Since on all platforms on which we plan to support guc submission we
don't allow disabling the breadcrumb interrupts, we can further simplify
the park/unpark flow by removing the interrupt pin/unpin. A BUG_ON has
been added to catch changes to this flow that would require us to
restore some kind of pinning.
v2: split removal of engine_pin/unpin_breadcrumbs_irq to its own
patch (chris)
Chris Wilson [Mon, 12 Aug 2019 17:48:04 +0000 (18:48 +0100)]
drm/i915/overlay: Switch to using i915_active tracking
Remove the raw i915_active_request tracking in favour of the higher
level i915_active tracking for the sole purpose of making the lockless
transition easier in later patches.
Chris Wilson [Mon, 12 Aug 2019 17:48:03 +0000 (18:48 +0100)]
drm/i915: Forgo last_fence active request tracking
We were using the last_fence to track the last request that used this
vma that might be interpreted by a fence register and forced ourselves
to wait for this request before modifying any fence register that
overlapped our vma. Due to the requirement that we need to track any XY_BLT
command, linear or tiled, this in effect meant that we have to track the
vma for its active lifespan anyway, so we can forgo the explicit
last_fence tracking and just use the whole vma->active.
Another solution would be to pipeline the register updates, and would
help resolve some long running stalls for gen3 (but only gen 2 and 3!)
Andi Shyti [Sun, 11 Aug 2019 21:06:33 +0000 (22:06 +0100)]
drm/i915: Extract general GT interrupt handlers
i915_irq.c is large. It serves as the central dispatch and handler for
all of our device interrupts. Let's break it up by pulling out the GT
interrupt handlers.
i915_irq.c is large. It serves as the central dispatch and handler for
all of our device interrupts. Pull out the GT pm interrupt handling
(leaving the central dispatch) so that we can encapsulate the logic a
little better.
Chris Wilson [Mon, 12 Aug 2019 09:10:38 +0000 (10:10 +0100)]
drm/i915/execlists: Avoid sync calls during park
Since we allow ourselves to use non-process context during parking, we
cannot allow ourselves to sleep and in particular cannot call
del_timer_sync() -- but we can use a plain del_timer().
Anshuman Gupta [Sun, 11 Aug 2019 10:02:32 +0000 (15:32 +0530)]
drm/i915/tgl: Fixing up list of PG3 power domains.
The DDI-IO power wells (PWR_WELL_CTL_DDI) are backing
the IO/PHY functionality, which doesn't need the PG3
power well. Accordingly fixing up the list of
PG3 power domains.
Anshuman Gupta [Sun, 11 Aug 2019 08:19:08 +0000 (13:49 +0530)]
drm/i915/icl: Remove DDI IO power domain from PG3 power domains
The DDI-IO power wells (PWR_WELL_CTL_DDI) are backing
the IO/PHY functionality, which doesn't need the PG3
power well. Accordingly fixing up the list of
PG3 power domains.
v2: Removed "DDI E/F IO" power domain as well [Imre]
Michal Wajdeczko [Sun, 11 Aug 2019 19:51:32 +0000 (19:51 +0000)]
drm/i915/uc: Use -EIO code for GuC initialization failures
Since commit 6ca9a2beb54a ("drm/i915: Unwind i915_gem_init() failure")
we believed that we correctly handle all errors encountered during
GuC initialization, including the special one that indicates a request to
run driver with disabled GPU submission (-EIO).
Unfortunately since commit 121981fafe69 ("drm/i915/guc: Combine
enable_guc_loading|submission modparams") we stopped using that
error code to avoid unwanted fallback to execlist submission mode.
As a result, any GuC initialization failure was treated as a non-recoverable
error leading to driver load abort, so we could not even read the related
GuC error log to investigate the cause of the problem.
For now always return -EIO on any uC hardware related failure.
Michal Wajdeczko [Mon, 12 Aug 2019 07:39:49 +0000 (07:39 +0000)]
drm/i915/uc: Include HuC firmware version in summary
After successful uC initialization we are reporting GuC
firmware version and status of GuC submission and HuC.
Add HuC fw version to this report to make it complete,
but also skip all HuC info if HuC is not supported.
Chris Wilson [Sat, 10 Aug 2019 09:03:28 +0000 (10:03 +0100)]
drm/i915: Remove unused debugfs/i915_emon_status
Before we start upon our great GT interrupt refactor, throw out the
cruft! In this case, it is an unloved debugfs showing the current ips
status, a fairly meaningless bunch of numbers that we are not checking.
Matthew Auld [Fri, 9 Aug 2019 20:29:24 +0000 (21:29 +0100)]
drm/i915: buddy allocator
Simple buddy allocator. We want to allocate properly aligned
power-of-two blocks to promote usage of huge-pages for the GTT, so 64K,
2M and possibly even 1G. While we do support allocating stuff at a
specific offset, it is more intended for preallocating portions of the
address space, say for an initial framebuffer; for other uses drm_mm is
probably a much better fit. Anyway, hopefully this can all be thrown
away if we eventually move to having the core MM manage device memory.
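The two calculations a buddy allocator of this kind typically relies on can be sketched as follows. This is an illustrative userspace sketch, not the i915_buddy code: the ex_ names are hypothetical, and it assumes naturally aligned power-of-two blocks and request sizes well below 2^63.

```c
#include <stdint.h>

/*
 * Smallest order such that (min_size << order) >= size.
 * min_size is the smallest block the allocator hands out (e.g. 4K)
 * and is assumed to be a power of two.
 */
static unsigned int ex_size_to_order(uint64_t size, uint64_t min_size)
{
	unsigned int order = 0;

	while ((min_size << order) < size)
		order++;

	return order;
}

/*
 * Offset of the buddy of the block starting at 'offset'.
 * Because blocks are naturally aligned, a block and its buddy differ
 * only in the address bit equal to the block size.
 */
static uint64_t ex_buddy_offset(uint64_t offset, uint64_t block_size)
{
	return offset ^ block_size;
}
```

With a 4K minimum block, a 64K request maps to order 4 and a 2M request to order 9, which is what makes the resulting blocks suitable for huge GTT pages.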
Matthew Auld [Sat, 10 Aug 2019 17:43:38 +0000 (18:43 +0100)]
drm/i915/blt: support copying objects
We can already clear an object with the blt, so try to do the same to
support copying from one object backing store to another. Really this is
just object -> object, which is not that useful yet, what we really want
is two backing stores, but that will require some vma rework first,
otherwise we are stuck with "tmp" objects.
Matthew Auld [Fri, 9 Aug 2019 19:34:56 +0000 (20:34 +0100)]
drm/i915/gtt: disable 2M pages for pre-gen11
We currently disable THP(Transparent-Huge-Pages) for our shmem objects
due to a performance regression with read BW in some internal
benchmarks. Given that this is our main source of 2M pages, there really
isn't much point in enabling 2M GTT pages, especially as that comes at
the cost of disabling the GTT cache. However from gen11 it looks like we
should hopefully see the HW issue resolved. Given this opt for only
enabling 2M GTT pages from gen11 onwards.
Matthew Auld [Fri, 9 Aug 2019 19:34:55 +0000 (20:34 +0100)]
drm/i915/gtt: enable GTT cache by default
For some platforms the GTT cache is by default not enabled, and
currently where we explicitly enable it, we make it conditional on 2M GTT
page support, since the BSpec states that we must disable it if we
enable 2M/1G pages. To make this more consistent opt for blanket
enabling the GTT cache for all relevant gens in a single place, while
still keeping the same behaviour of checking for 2M support.
BSpec: 9314
BSpec: 423
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190809193456.3836-1-matthew.auld@intel.com
Matthew Auld [Sat, 10 Aug 2019 10:50:08 +0000 (11:50 +0100)]
drm/i915/selftests: move gpu-write-dw into utils
Using the gpu to write to some dword over a number of pages is rather
useful, and we already have two copies of such a thing, and we don't
want a third so move it to utils. There is probably some other stuff
also...
Matthew Auld [Sat, 10 Aug 2019 09:29:45 +0000 (10:29 +0100)]
drm/i915/blt: bump the size restriction
As pointed out by Chris, with our current approach we are actually
limited to S16_MAX * PAGE_SIZE for our size when using the blt to clear
pages. Keeping things simple try to fix this by reducing the copy to a
sequence of S16_MAX * PAGE_SIZE blocks.
Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
[ickle: hide the details of the engine pool inside emit_vma]
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190810092945.2762-1-chris@chris-wilson.co.uk
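The splitting described in "drm/i915/blt: bump the size restriction" can be sketched as a loop that walks the object in blocks of at most S16_MAX * PAGE_SIZE. This is only an illustration of the arithmetic: the 4K page size, the callback and all EX_/ex_ names are assumptions, and the real code emits blitter commands rather than calling a function pointer.

```c
#include <stdint.h>

#define EX_PAGE_SIZE 4096ull    /* assumed page size */
#define EX_S16_MAX   32767ull   /* largest value of a signed 16-bit field */
#define EX_BLOCK_MAX (EX_S16_MAX * EX_PAGE_SIZE)

/*
 * Walk [0, size) in blocks no larger than EX_BLOCK_MAX, calling emit()
 * (if non-NULL) once per block. Returns the number of blocks visited.
 */
static uint64_t ex_for_each_block(uint64_t size,
				  void (*emit)(uint64_t offset, uint64_t len))
{
	uint64_t offset = 0;
	uint64_t count = 0;

	while (offset < size) {
		uint64_t len = size - offset;

		if (len > EX_BLOCK_MAX)
			len = EX_BLOCK_MAX;

		if (emit)
			emit(offset, len);

		offset += len;
		count++;
	}

	return count;
}
```

An object one byte larger than EX_BLOCK_MAX is handled as two blocks, the second covering only the remainder.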
Matthew Auld [Sat, 10 Aug 2019 09:17:47 +0000 (10:17 +0100)]
drm/i915/blt: don't assume pinned intel_context
Currently we just pass in bcs0->engine_context so it matters not, but in
the future we may want to pass in something that is not a
kernel_context, so try to be a bit more generic.
Multiple uncore structures will share the debug infrastructure, so
move it to a common place and add extra locking around it.
Also, since we now have a separate object, it is cleaner to have
dedicated functions working on the object to stop and restart the
mmio debug. Apart from the cosmetic changes, this patch introduces
2 functional updates:
- All calls to check_for_unclaimed_mmio will now return false when
the debug is suspended, not just the ones that are active only when
i915_modparams.mmio_debug is set. If we don't trust the result of the
check while a user is doing mmio access then we shouldn't attempt the
check anywhere.
- i915_modparams.mmio_debug is not save/restored anymore around user
access. The value is now never touched by the kernel while debug is
disabled so no need for save/restore.
The filesystem reconfigure API is undergoing a transition, breaking our
current code. As we only set the default options, we can simply remove
the call to s_op->remount_fs(). In the future, when HW permits, we can
try re-enabling huge page support, albeit as suggested with new per-file
controls.
Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
Suggested-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190808172226.18306-1-chris@chris-wilson.co.uk
Chris Wilson [Fri, 9 Aug 2019 18:25:18 +0000 (19:25 +0100)]
drm/i915: Lift timeline into intel_context
Move the timeline from being inside the intel_ring to intel_context
itself. This saves much pointer dancing and makes the relations of the
context to its timeline much clearer.
Chris Wilson [Fri, 9 Aug 2019 18:25:16 +0000 (19:25 +0100)]
drm/i915/gt: Make deferred context allocation explicit
Refactor the backends to handle the deferred context allocation in a
consistent manner, and allow calling it as an explicit first step in
pinning a context for the first time. This should make it easier for
backends to keep track of partially constructed contexts from
initialisation.
Chris Wilson [Fri, 9 Aug 2019 18:25:15 +0000 (19:25 +0100)]
drm/i915: Remove i915_gem_context_create_gvt()
As we are phasing out using the GEM context for internal clients that
need to manipulate logical context state directly, remove the
constructor for the GVT context. We are not using it for anything other
than default setup and allocation of an i915_ppgtt.
Chris Wilson [Thu, 8 Aug 2019 07:41:52 +0000 (08:41 +0100)]
drm/i915: Drop the fudge warning on ring restart for ctg/elk
Since we have already stopped the ring, cleared the ring, disabled the
ring (and verified the ring is clear), a later debug message that the
ring is no longer clear serves no function. It appears it restarts
anyway, and we verify that the ring started correctly afterwards.
Chris Wilson [Fri, 9 Aug 2019 09:10:09 +0000 (10:10 +0100)]
drm/i915: Replace global bsd_dispatch_index with random seed
We keep a global seed for the legacy BSD round-robin selector, but in
our testing of multiple simultaneous client workloads, a random seed
spreads the load more evenly. (As even as an initial round-robin selector
can be!) Removing the global is one less variable we have to find a home
for!
We can simulate multi-client (both same and mixed workloads) using
igt/gem_wsim to work out optimal strategies and then compare our
simulation with the actual transcoder on multi-engine machines. This
fixed round-robin turns out to be one of the worst methods.
No user is advised to use this method; the current suggestion is to use
a virtual engine for agnostic batches, randomised submission or using
the busyness tracking to select the most idle engine at the time of
dispatch. At the present time, intel-media is explicit, but libva still
seems to use it, with the exception of batches that must execute on vcs0.
Oh well.
Chris Wilson [Fri, 9 Aug 2019 12:31:53 +0000 (13:31 +0100)]
drm/i915: Check for a second VCS engine more carefully
To use the legacy BSD selector, you must have a second VCS engine, or
else the ABI simply maps the request for another engine onto VCS0.
However, we only checked a single VCS1 location, overlooking the
possibility of a sparse VCS set being mapped to the dense ABI.
v2: num_vcs_engines() turns out to be reusable and futureproof it so we
never have to worry about this silly bit of ABI again!
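To illustrate the sparse-set problem: counting the VCS bits in an engine mask gives the right answer even when the second engine is not in the VCS1 slot. The bit layout and all EX_/ex_ names below are hypothetical and chosen only for the example. This is a sketch of the idea, not the driver's num_vcs_engines().

```c
#include <stdint.h>

/* Hypothetical layout: up to four VCS instances at bits 2..5 of the mask. */
#define EX_VCS0_BIT 2
#define EX_MAX_VCS  4
#define EX_VCS_MASK (((1u << EX_MAX_VCS) - 1) << EX_VCS0_BIT)

/* Count how many VCS engines are present, wherever they sit in the mask. */
static unsigned int ex_num_vcs_engines(uint32_t engine_mask)
{
	uint32_t vcs = engine_mask & EX_VCS_MASK;
	unsigned int n = 0;

	while (vcs) {
		vcs &= vcs - 1;	/* clear the lowest set bit */
		n++;
	}

	return n;
}

/* The fragile check: looks only at the VCS1 slot. */
static int ex_has_vcs1(uint32_t engine_mask)
{
	return (engine_mask >> (EX_VCS0_BIT + 1)) & 1;
}
```

For a part exposing only VCS0 and VCS2, ex_has_vcs1() reports no second engine while ex_num_vcs_engines() correctly returns 2.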
Chris Wilson [Fri, 9 Aug 2019 07:37:23 +0000 (08:37 +0100)]
drm/i915/execlists: Backtrack along timeline
After a preempt-to-busy, we may find an active request that is caught
between execution states. Walk back along the timeline instead of the
execution list to be safe.
[ 106.417541] i915 0000:00:02.0: Resetting rcs0 for preemption time out
[ 106.417659] ==================================================================
[ 106.418041] BUG: KASAN: slab-out-of-bounds in __execlists_reset+0x2f2/0x440 [i915]
[ 106.418123] Read of size 8 at addr ffff888703506b30 by task swapper/1/0
[ 106.418194]
[ 106.418267] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G U 5.3.0-rc3+ #5
[ 106.418344] Hardware name: Intel Corporation NUC7i5BNK/NUC7i5BNB, BIOS BNKBL357.86A.0052.2017.0918.1346 09/18/2017
[ 106.418434] Call Trace:
[ 106.418508] <IRQ>
[ 106.418585] dump_stack+0x5b/0x90
[ 106.418941] ? __execlists_reset+0x2f2/0x440 [i915]
[ 106.419022] print_address_description+0x67/0x32d
[ 106.419376] ? __execlists_reset+0x2f2/0x440 [i915]
[ 106.419731] ? __execlists_reset+0x2f2/0x440 [i915]
[ 106.419810] __kasan_report.cold.6+0x1a/0x3c
[ 106.419888] ? __trace_bprintk+0xc0/0xd0
[ 106.420239] ? __execlists_reset+0x2f2/0x440 [i915]
[ 106.420318] check_memory_region+0x144/0x1c0
[ 106.420671] __execlists_reset+0x2f2/0x440 [i915]
[ 106.421029] execlists_reset+0x3d/0x50 [i915]
[ 106.421387] intel_engine_reset+0x203/0x3a0 [i915]
[ 106.421744] ? igt_reset_nop+0x2b0/0x2b0 [i915]
[ 106.421825] ? _raw_spin_trylock_bh+0xe0/0xe0
[ 106.421901] ? rcu_core+0x1b9/0x6a0
[ 106.422251] preempt_reset+0x9a/0xf0 [i915]
[ 106.422333] tasklet_action_common.isra.15+0xc0/0x1e0
[ 106.422685] ? execlists_submit_request+0x200/0x200 [i915]
[ 106.422764] __do_softirq+0x106/0x3cf
[ 106.422840] irq_exit+0xdc/0xf0
[ 106.422914] smp_apic_timer_interrupt+0x81/0x1c0
[ 106.422988] apic_timer_interrupt+0xf/0x20
[ 106.423059] </IRQ>
[ 106.423144] RIP: 0010:cpuidle_enter_state+0xc3/0x620
[ 106.423222] Code: 24 0f 1f 44 00 00 31 ff e8 da 87 9c ff 80 7c 24 10 00 74 12 9c 58 f6 c4 02 0f 85 33 05 00 00 31 ff e8 c1 77 a3 ff fb 45 85 e4 <0f> 89 bf 02 00 00 48 8d 7d 10 e8 4e 45 b9 ff c7 45 10 00 00 00 00
[ 106.423311] RSP: 0018:ffff88881c30fda8 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff13
[ 106.423390] RAX: 0000000000000000 RBX: ffffffff825b4c80 RCX: ffffffff810c8a00
[ 106.423465] RDX: dffffc0000000000 RSI: 0000000039f89620 RDI: ffff88881f6b00a8
[ 106.423540] RBP: ffff88881f6b5bf8 R08: 0000000000000002 R09: 000000000002ed80
[ 106.423616] R10: 0000003fdd956146 R11: ffff88881c2d1e47 R12: 0000000000000008
[ 106.423691] R13: 0000000000000008 R14: ffffffff825b4f80 R15: ffffffff825b4fc0
[ 106.423772] ? sched_idle_set_state+0x20/0x30
[ 106.423851] ? cpuidle_enter_state+0xa6/0x620
[ 106.423874] ? tick_nohz_idle_stop_tick+0x1d1/0x3f0
[ 106.423896] cpuidle_enter+0x37/0x60
[ 106.423919] do_idle+0x246/0x280
[ 106.423941] ? arch_cpu_idle_exit+0x30/0x30
[ 106.423964] ? __wake_up_common+0x46/0x240
[ 106.423986] cpu_startup_entry+0x14/0x20
[ 106.424009] start_secondary+0x1b0/0x200
[ 106.424031] ? set_cpu_sibling_map+0x990/0x990
[ 106.424054] secondary_startup_64+0xa4/0xb0
[ 106.424075]
[ 106.424096] Allocated by task 626:
[ 106.424119] save_stack+0x19/0x80
[ 106.424143] __kasan_kmalloc.constprop.7+0xc1/0xd0
[ 106.424165] kmem_cache_alloc+0xb2/0x1d0
[ 106.424277] i915_sched_lookup_priolist+0x1ab/0x320 [i915]
[ 106.424385] execlists_submit_request+0x73/0x200 [i915]
[ 106.424498] submit_notify+0x59/0x60 [i915]
[ 106.424600] __i915_sw_fence_complete+0x9b/0x330 [i915]
[ 106.424713] __i915_request_commit+0x4bf/0x570 [i915]
[ 106.424818] intel_engine_pulse+0x213/0x310 [i915]
[ 106.424925] context_close+0x22f/0x470 [i915]
[ 106.425033] i915_gem_context_destroy_ioctl+0x7b/0xa0 [i915]
[ 106.425058] drm_ioctl_kernel+0x131/0x170
[ 106.425081] drm_ioctl+0x2d9/0x4f1
[ 106.425104] do_vfs_ioctl+0x115/0x890
[ 106.425126] ksys_ioctl+0x35/0x70
[ 106.425147] __x64_sys_ioctl+0x38/0x40
[ 106.425169] do_syscall_64+0x66/0x220
[ 106.425191] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 106.425213]
[ 106.425234] Freed by task 0:
[ 106.425255] (stack is not available)
[ 106.425276]
[ 106.425297] The buggy address belongs to the object at ffff888703506a40
[ 106.425297] which belongs to the cache i915_priolist of size 104
[ 106.425321] The buggy address is located 136 bytes to the right of
[ 106.425321] 104-byte region [ffff888703506a40, ffff888703506aa8)
[ 106.425345] The buggy address belongs to the page:
[ 106.425367] page:ffffea001c0d4180 refcount:1 mapcount:0 mapping:ffff88873e1cf740 index:0xffff888703506e40 compound_mapcount: 0
[ 106.425391] flags: 0x8000000000010200(slab|head)
[ 106.425415] raw: 8000000000010200 ffffea0020192b88 ffff8888174b5450 ffff88873e1cf740
[ 106.425439] raw: ffff888703506e40 000000000010000e 00000001ffffffff 0000000000000000
[ 106.425464] page dumped because: kasan: bad access detected
[ 106.425486]
[ 106.425506] Memory state around the buggy address:
[ 106.425528] ffff888703506a00: fc fc fc fc fc fc fc fc 00 00 00 00 00 00 00 00
[ 106.425551] ffff888703506a80: 00 00 00 00 00 fc fc fc fc fc fc fc fc fc fc fc
[ 106.425573] >ffff888703506b00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 106.425597] ^
[ 106.425619] ffff888703506b80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 106.425642] ffff888703506c00: fc fc fc fc fc fc fc fc 00 00 00 00 00 00 00 00
[ 106.425664] ==================================================================
Chris Wilson [Fri, 9 Aug 2019 11:07:52 +0000 (12:07 +0100)]
drm/i915: Free the imported shmemfs file for phys objects
Matthew spotted that we lost the fput() for phys objects now that we are
not relying on the core to cleanup the GEM object. (For the record, phys
objects import the shmemfs from their original set of pages and keep it
to provide swap space, but we never transform back into a shmem object.)
Reported-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Fixes: 0c159ffef628 ("drm/i915/gem: Defer obj->base.resv fini until RCU callback")
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190809110752.19763-1-chris@chris-wilson.co.uk
Jani Nikula [Thu, 8 Aug 2019 13:42:49 +0000 (16:42 +0300)]
drm/i915: extract i915_gem_shrinker.h from i915_drv.h
It used to be handy that we only had a couple of headers, but over time
i915_drv.h has become unwieldy. Extract declarations to a separate
header file corresponding to the implementation module, clarifying the
modularity of the driver.
Ensure the new header is self-contained, and do so with minimal further
includes, using forward declarations as needed. Include the new header
from i915_drv.h to avoid sprinkling includes all over the place; this
can be changed as a follow-up if necessary.
Jani Nikula [Thu, 8 Aug 2019 13:42:48 +0000 (16:42 +0300)]
drm/i915: extract gem/i915_gem_stolen.h from i915_drv.h
It used to be handy that we only had a couple of headers, but over time
i915_drv.h has become unwieldy. Extract declarations to a separate
header file corresponding to the implementation module, clarifying the
modularity of the driver.
Ensure the new header is self-contained, and do so with minimal further
includes, using forward declarations as needed. Include the new header
from i915_drv.h to avoid sprinkling includes all over the place; this
can be changed as a follow-up if necessary.
Jani Nikula [Thu, 8 Aug 2019 13:42:47 +0000 (16:42 +0300)]
drm/i915: extract i915_memcpy.h from i915_drv.h
It used to be handy that we only had a couple of headers, but over time
i915_drv.h has become unwieldy. Extract declarations to a separate
header file corresponding to the implementation module, clarifying the
modularity of the driver.
Ensure the new header is self-contained, and do so with minimal further
includes, using forward declarations as needed. Include the new header
only where needed, and sort the modified include directives while at it
and as needed.
Jani Nikula [Thu, 8 Aug 2019 13:42:46 +0000 (16:42 +0300)]
drm/i915: extract i915_suspend.h from i915_drv.h
It used to be handy that we only had a couple of headers, but over time
i915_drv.h has become unwieldy. Extract declarations to a separate
header file corresponding to the implementation module, clarifying the
modularity of the driver.
Ensure the new header is self-contained, and do so with minimal further
includes, using forward declarations as needed. Include the new header
only where needed, and sort the modified include directives while at it
and as needed.
Jani Nikula [Thu, 8 Aug 2019 13:42:45 +0000 (16:42 +0300)]
drm/i915: extract i915_sysfs.h from i915_drv.h
It used to be handy that we only had a couple of headers, but over time
i915_drv.h has become unwieldy. Extract declarations to a separate
header file corresponding to the implementation module, clarifying the
modularity of the driver.
Ensure the new header is self-contained, and do so with minimal further
includes, using forward declarations as needed. Include the new header
only where needed, and sort the modified include directives while at it
and as needed.
Jani Nikula [Thu, 8 Aug 2019 13:42:44 +0000 (16:42 +0300)]
drm/i915: extract i915_perf.h from i915_drv.h
It used to be handy that we only had a couple of headers, but over time
i915_drv.h has become unwieldy. Extract declarations to a separate
header file corresponding to the implementation module, clarifying the
modularity of the driver.
Ensure the new header is self-contained, and do so with minimal further
includes, using forward declarations as needed. Include the new header
only where needed, and sort the modified include directives while at it
and as needed.
The last user of dev_priv->no_aux_handshake was removed in commit 3cf2efb1a7c6 ("Revert "drm/i915/dp: use VBT provided eDP params if
available""). Finally remove the leftovers.
Make use of the struct_size() helper instead of an open-coded version
in order to avoid any potential type mistakes, in particular in the
context in which this code is being used.
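For illustration, the following is a minimal userspace sketch of what a struct_size()-style helper protects against. The kernel's struct_size() (in <linux/overflow.h>) saturates at SIZE_MAX instead of wrapping when the trailing-array size overflows. The structure `struct blob` and the helpers `blob_size()` and `blob_alloc()` are hypothetical names invented for this sketch; they are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical structure with a trailing flexible array member. */
struct blob {
	unsigned int count;
	uint32_t items[];
};

/*
 * Userspace stand-in for struct_size(): computes
 * sizeof(struct blob) + n * sizeof(items[0]) and saturates at SIZE_MAX
 * instead of silently wrapping around on overflow.
 */
static size_t blob_size(size_t n)
{
	size_t elem = sizeof(((struct blob *)0)->items[0]);
	size_t base = sizeof(struct blob);

	if (n > (SIZE_MAX - base) / elem)
		return SIZE_MAX;
	return base + n * elem;
}

/* A saturated size makes the allocation fail cleanly. */
static struct blob *blob_alloc(size_t n)
{
	size_t sz = blob_size(n);
	struct blob *b;

	if (sz == SIZE_MAX)
		return NULL;
	assert(sz >= sizeof(*b));
	b = calloc(1, sz);
	if (b)
		b->count = (unsigned int)n;
	return b;
}
```

An open-coded `sizeof(*b) + n * sizeof(uint32_t)` would wrap for very large `n` and yield a short allocation; the saturating form turns that case into an allocation failure.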
Chris Wilson [Thu, 8 Aug 2019 16:24:07 +0000 (17:24 +0100)]
drm/i915: Make debugfs/per_file_stats scale better
Currently we walk the entire list of obj->vma for each obj within a file
to find the matching vma of this context. Since we know we are searching
for a particular vma bound to a user context, we can use the rbtree to
search for it rather than repeatedly walk everything.
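The difference between the two lookups can be sketched in userspace as follows. The kernel provides red-black trees through <linux/rbtree.h>; to stay short, this sketch substitutes a plain unbalanced binary search tree, so it only illustrates the keyed descent, not the balancing. `struct vma` here, its `vm` key, and the helpers `vma_insert()` and `vma_lookup()` are hypothetical stand-ins, not the driver's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical vma, keyed by the address space (vm) it is bound into. */
struct vma {
	const void *vm;            /* lookup key */
	struct vma *left, *right;  /* tree links */
	unsigned long size;
};

/* Order keys by pointer value. */
static int key_cmp(const void *a, const void *b)
{
	uintptr_t x = (uintptr_t)a, y = (uintptr_t)b;

	return (x > y) - (x < y);
}

/* Insert a node; a node with a duplicate key is ignored. */
static void vma_insert(struct vma **root, struct vma *node)
{
	assert(node != NULL);
	while (*root) {
		int c = key_cmp(node->vm, (*root)->vm);

		if (c == 0)
			return;
		root = c < 0 ? &(*root)->left : &(*root)->right;
	}
	node->left = node->right = NULL;
	*root = node;
}

/* Descend by key instead of walking every vma of the object. */
static struct vma *vma_lookup(struct vma *root, const void *vm)
{
	while (root) {
		int c = key_cmp(vm, root->vm);

		if (c == 0)
			return root;
		root = c < 0 ? root->left : root->right;
	}
	return NULL;
}
```

With a balanced tree, each lookup costs O(log n) comparisons per object rather than a walk over all n vma.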
Chris Wilson [Thu, 8 Aug 2019 14:45:11 +0000 (15:45 +0100)]
drm/i915: Only include active engines in the capture state
Skip printing out idle engines that did not contribute to the GPU hang.
As the number of engines gets ever larger, we have increasing noise in
the error state where typically there is only one guilty request on one
engine that we need to inspect.
Chris Wilson [Thu, 8 Aug 2019 20:27:58 +0000 (21:27 +0100)]
drm/i915: Defer final intel_wakeref_put to process context
As we need to acquire a mutex to serialise the final
intel_wakeref_put, we need to ensure that we are in process context at
that time. However, we want to allow operation on the intel_wakeref from
inside timer and other hardirq context, which means that we need to defer
that final put to a workqueue.
Inside the final wakeref puts, we are safe to operate in any context, as
we are simply marking up the HW and state tracking for the potential
sleep. It's only the serialisation with the potential sleeping getting
that requires careful wait avoidance. This allows us to retain the
immediate processing as before (we only need to sleep over the same
races as the current mutex_lock).
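The deferral pattern described above can be sketched in userspace with C11 atomics. This is an illustration under stated assumptions, not the driver's code: `struct wakeref`, its fields, and every function below are hypothetical; an `atomic_flag` stands in for the mutex (test-and-set acting as a trylock), and a `work_pending` flag stands in for queueing a work item.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical wakeref: a counter, a lock, and a deferred-work marker. */
struct wakeref {
	atomic_int count;
	atomic_flag lock;     /* stand-in for the mutex */
	bool work_pending;    /* stand-in for a queued work item */
	bool hw_awake;
};

static void wakeref_init(struct wakeref *wf)
{
	atomic_init(&wf->count, 0);
	atomic_flag_clear(&wf->lock);
	wf->work_pending = false;
	wf->hw_awake = false;
}

static void wakeref_get(struct wakeref *wf)
{
	if (atomic_fetch_add(&wf->count, 1) == 0)
		wf->hw_awake = true; /* first user wakes the hardware */
}

/* Drop one reference unless it is the last; true if dropped. */
static bool dec_unless_last(atomic_int *v)
{
	int old = atomic_load(v);

	assert(old >= 1);
	while (old > 1) {
		if (atomic_compare_exchange_weak(v, &old, old - 1))
			return true;
	}
	return false;
}

/* Final put; the caller holds the lock. */
static void final_put_locked(struct wakeref *wf)
{
	if (atomic_fetch_sub(&wf->count, 1) == 1)
		wf->hw_awake = false;
}

/* Safe from any context: never blocks on the lock. */
static void wakeref_put(struct wakeref *wf)
{
	if (dec_unless_last(&wf->count))
		return;
	if (!atomic_flag_test_and_set(&wf->lock)) {
		final_put_locked(wf);       /* uncontended: handle immediately */
		atomic_flag_clear(&wf->lock);
	} else {
		wf->work_pending = true;    /* contended: defer to a worker */
	}
}

/* Worker body: runs in a context that may wait for the lock. */
static void wakeref_run_work(struct wakeref *wf)
{
	if (!wf->work_pending)
		return;
	wf->work_pending = false;
	while (atomic_flag_test_and_set(&wf->lock))
		; /* a real worker would sleep on the mutex here */
	final_put_locked(wf);
	atomic_flag_clear(&wf->lock);
}
```

The non-final puts never touch the lock at all, and the final put is handled immediately when the lock is free, so the worker is only needed when the put races with a lock holder.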
v2: Add a selftest to ensure we exercise the code while lockdep watches.
v3: That test was extremely loud and complained about many things!
v4: Not a whale!
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111295
References: https://bugs.freedesktop.org/show_bug.cgi?id=111245
References: https://bugs.freedesktop.org/show_bug.cgi?id=111256
Fixes: 18398904ca9e ("drm/i915: Only recover active engines")
Fixes: 51fbd8de87dc ("drm/i915/pmu: Atomically acquire the gt_pm wakeref")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190808202758.10453-1-chris@chris-wilson.co.uk
Chris Wilson [Thu, 8 Aug 2019 19:45:25 +0000 (20:45 +0100)]
drm/i915/selftests: Fixup a missing legacy_idx
Grr, missed one*. For using the legacy engine map, we should use
engine->legacy_idx. Ideally, we should know the intel_context in the
selftest and avoid all the fiddling around with unwanted GEM contexts.
* In my defence, the conflict was added in another patch after it was
tested by CI.
drm/i915: Get transcoder power domain before reading its register
When getting the pipes attached to an encoder that is not an eDP encoder,
the code iterates over all pipes and reads a transcoder register for each.
But it should not read a transcoder register before getting its power
domain.
This was not an issue on gens older than 12: if only port A was
connected, it would be attached to EDP and all the transcoder readout
would be skipped; if more than one port was connected, pipe B would
cause PG3 to be on, and PG3 contains all the other transcoders.
But on gen 12 there is no EDP transcoder, so the code always iterates
over all pipes, and if only one sink is connected, PG3 is kept off
and reading the other transcoders' registers causes an
unclaimed read warning.
So get the power domain of the transcoder here, and only if it is
enabled; otherwise the transcoder is not connected to the DDI.
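The readout order described above can be modelled with a small userspace sketch. Everything in it is hypothetical and only illustrates the "take the power reference if the domain is already on, then read" pattern: `struct display`, `TRANS_ENABLE`, and all helper names are invented for this example, and an access to an unpowered register is modelled as a counter rather than a hardware warning.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_PIPES 4
#define TRANS_ENABLE (1u << 31) /* hypothetical "transcoder enabled" bit */

/* Hypothetical model: one power domain and one register per transcoder. */
struct display {
	bool domain_on[NUM_PIPES];
	int domain_refs[NUM_PIPES];
	uint32_t trans_reg[NUM_PIPES];
	unsigned int unclaimed_reads;
};

/* Take a reference only if the domain is already powered. */
static bool power_get_if_enabled(struct display *d, int pipe)
{
	if (!d->domain_on[pipe])
		return false;
	d->domain_refs[pipe]++;
	return true;
}

static void power_put(struct display *d, int pipe)
{
	assert(d->domain_refs[pipe] > 0);
	d->domain_refs[pipe]--;
}

/* Reading a register in an unpowered domain counts as an unclaimed read. */
static uint32_t read_transcoder(struct display *d, int pipe)
{
	if (!d->domain_on[pipe]) {
		d->unclaimed_reads++;
		return 0;
	}
	return d->trans_reg[pipe];
}

/* Collect the pipes whose transcoder is powered and enabled. */
static unsigned int enabled_pipe_mask(struct display *d)
{
	unsigned int mask = 0;

	for (int pipe = 0; pipe < NUM_PIPES; pipe++) {
		if (!power_get_if_enabled(d, pipe))
			continue; /* unpowered: cannot be driving the DDI */
		if (read_transcoder(d, pipe) & TRANS_ENABLE)
			mask |= 1u << pipe;
		power_put(d, pipe);
	}
	return mask;
}
```

Skipping unpowered transcoders both avoids the unclaimed read and avoids powering up a well just to learn that nothing is attached.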
Chris Wilson [Thu, 8 Aug 2019 11:06:12 +0000 (12:06 +0100)]
drm/i915: Fix up the inverse mapping for default ctx->engines[]
The order in which we store the engines inside default_engines() for the
legacy ctx->engines[] has to match the legacy I915_EXEC_RING selector
mapping in execbuf::user_map. If we present VCS2 as being the second
instance of the video engine, legacy userspace calls that I915_EXEC_BSD2
and so we need to insert it into the second video slot.
v2: Record the legacy mapping (hopefully we can remove this need in the
future)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111328
Fixes: 2edda80db3d0 ("drm/i915: Rename engines to match their user interface")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> #v1
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190808110612.23539-2-chris@chris-wilson.co.uk
Chris Wilson [Thu, 8 Aug 2019 11:06:11 +0000 (12:06 +0100)]
drm/i915: Allocate kernel_contexts directly
Ignore the central i915->kernel_context for allocating an engine, as
that GEM context is being phased out. For internal clients, we just need
the per-engine logical state, so allocate it at the point of use.
drm/i915/tgl/dsi: Enable blanking packets during BLLP for video mode
Blanking packet bit will control whether the transcoder allows the link
to enter the LP state during BLLP regions (assuming there is enough time),
or whether it will keep the link in the HS state with a Blanking Packet
Jani Nikula [Wed, 7 Aug 2019 12:04:15 +0000 (15:04 +0300)]
drm/i915: split out intel_pch.[ch] from i915_drv.[ch]
Abstract the rather self-contained piece of code from i915_drv.[ch]. No
functional changes.
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190807120415.17917-1-jani.nikula@intel.com
Insert a few more failure points into the firmware fetch procedure to
check use of the wrong blob name or of mismatched firmware versions.
Also update some messages (remove ptr, duplicated info) and stop
treating all fetch errors as the missing firmware case.
v2: update log levels (Chris)
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
[ickle: fixup compiler warning for non-debug builds]
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190807183759.8588-1-michal.wajdeczko@intel.com
drm/i915/uc: WOPCM programming errors are not always real
A WOPCM programming error might be due to an earlier inserted probe
failure that could affect HuC firmware loading and thus change the
result of WOPCM partitioning, which would now be incompatible with
the previously programmed values.
drm/i915: Don't try to partition WOPCM without GuC firmware
For meaningful WOPCM partitioning we need the GuC (and optionally HuC)
firmware size(s), and we shouldn't just rely on the GuC support flag,
as we might fail to fetch the GuC firmware, in which case its size will
be 0 and all calculations will be just wrong/useless.
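The guard described above can be sketched as follows. This is a simplified illustration only: `struct wopcm`, `wopcm_partition()`, and the layout (HuC image at the bottom, GuC region taking the rest) are assumptions made for the example and do not reproduce the hardware's alignment or reserved-space rules.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical partition result: where the GuC region starts and its size. */
struct wopcm {
	uint32_t size;      /* total WOPCM size */
	uint32_t guc_base;
	uint32_t guc_size;
};

/*
 * Partition only when a GuC firmware size is actually known. A size of
 * zero means the fetch failed, so any computed layout would be meaningless.
 */
static bool wopcm_partition(struct wopcm *w, uint32_t guc_fw_size,
			    uint32_t huc_fw_size)
{
	if (!guc_fw_size)
		return false; /* no firmware: do not partition at all */
	if (huc_fw_size > w->size)
		return false;
	if (guc_fw_size > w->size - huc_fw_size)
		return false; /* GuC image does not fit above the HuC image */

	w->guc_base = huc_fw_size;
	w->guc_size = w->size - huc_fw_size;
	assert(w->guc_base + w->guc_size <= w->size);
	return true;
}
```

Checking the fetched size rather than a "GuC supported" flag covers the case where support is advertised but the blob could not be loaded.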
drm/i915/perf: Refactor oa object to better manage resources
The oa object manages the oa buffer and must be allocated when the user
intends to read performance counter snapshots. This can be achieved by
making the oa object part of the stream object which is allocated when a
stream is opened by the user.
Attributes in the oa object that are gen-specific are moved to the perf
object so that they can be initialized on driver load.
The split provides a better separation of the objects used in the perf
implementation of the i915 driver, so that resources are allocated and
initialized only when needed.
v2: Fix checkpatch warnings
v3: Addressed Lionel's review comment
v4: Rebase
v5: Fix rebase/merge issue with ratelimit_state_init