Dave Airlie [Wed, 24 Apr 2019 01:54:26 +0000 (11:54 +1000)]
Merge tag 'drm-msm-next-2019-04-21' of https://gitlab.freedesktop.org/drm/msm into drm-next
This time around it is a bunch of cleanup and fixes, expanding gpu
"zap" shader support (so we can take the GPU out of secure mode on
boot) to a6xx, and small UABI extension to support robustness (see
mesa MR 673).
Dave Airlie [Wed, 24 Apr 2019 01:24:22 +0000 (11:24 +1000)]
Merge branch 'drm-next-5.2' of git://people.freedesktop.org/~agd5f/linux into drm-next
- Add the amdgpu specific bits for timeline support
- Add internal interfaces for xgmi pstate support
- DC Z ordering fixes for planes
- Add support for NV12 planes in DC
- Add colorspace properties for planes in DC
- eDP optimizations if the GOP driver already initialized eDP
- DC bandwidth validation tracing support
Dave Airlie [Wed, 24 Apr 2019 00:30:41 +0000 (10:30 +1000)]
Merge tag 'drm/tegra/for-5.2-rc1' of git://anongit.freedesktop.org/tegra/linux into drm-next
drm/tegra: Changes for v5.2-rc1
This contains a fix for the usage of shared resets that previously
generated a WARN on boot. In addition, there's a fix for CPU cache
maintenance of GEM buffers allocated using get_pages().
(airlied: contains a merge from a shared tegra tree)
Dave Airlie [Wed, 24 Apr 2019 00:12:34 +0000 (10:12 +1000)]
Merge tag 'drm-misc-next-2019-04-18' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-next for v5.2:
UAPI Changes:
- Document which feature flags belong to which command in virtio_gpu.h
- Make the FB_DAMAGE_CLIPS available for atomic userspace only, it's useless for legacy.
Cross-subsystem Changes:
- Add device tree bindings for lg,acx467akm-7 panel and ST-Ericsson Multi Channel Display Engine MCDE
- Add parameters to the device tree bindings for tfp410
- iommu/io-pgtable: Add ARM Mali midgard MMU page table format
- dma-buf: Only do a 64-bits seqno compare when driver explicitly asks for it, else wraparound.
- Use the 64-bits compare for dma-fence-chains
Core Changes:
- Make the fb conversion functions use __iomem dst.
- Rename drm_client_add to drm_client_register
- Move intel_fb_initial_config to core.
- Add a drm_gem_objects_lookup helper
- Add drm_gem_fence_array helpers, and use it in lima.
- Add drm_format_helper.c to kerneldoc.
Driver Changes:
- Add panfrost driver for mali midgard/bitfrost.
- Converts bochs to use the simple display type.
- Small fixes to sun4i, tinydrm, ti-fp410.
- Fid aspeed's Kconfig options.
- Make some symbols/functions static in lima, sun4i and meson.
- Add a driver for the lg,acx467akm-7 panel.
Dave Airlie [Wed, 24 Apr 2019 00:02:20 +0000 (10:02 +1000)]
Merge tag 'drm-intel-next-2019-04-17' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
UAPI Changes:
- uAPI "Fixes:" patch for the upcoming kernel 5.1, included here too
We have an Ack from the media folks (only current user) for this
late tweak
Cross-subsystem Changes:
- ALSA: hda: Fix racy display power access (Takashi, Chris)
Driver Changes:
- DDI and MIPI-DSI clocks fixes for Icelake (Vandita)
- Fix Icelake frequency change/locking (RPS) (Mika)
- Temporarily disable ppGTT read-only bit on Icelake (Mika)
- Add missing Icelake W/As (Mika)
- Enable 12 deep CSB status FIFO on Icelake (Mika)
- Inherit more Icelake code for Elkhartlake (Bob, Jani)
- Handle catastrophic error on engine reset (Mika)
- Shortcut readiness to reset check (Mika)
- Regression fix for GEM_BUSY causing us to report a mixed uabi-class request as not busy (Chris)
- Revert back to max link rate and lane count on eDP (Jani)
- Fix pipe BPP readout for BXT/GLK DSI (Ville)
- Set DP min_bpp to 8*3 for non-RGB output formats (Ville)
- Enable coarse preemption boundaries for Gen8 (Chris)
- Do not enable FEC without DSC (Ville)
- Restore correct BXT DDI latency optim setting calculation (Ville)
- Always reset context's RING registers to avoid running workload twice during reset (Chris)
- Set GPU wedged on driver unload (Janusz)
- Consolidate two similar barries from timeline into one (Chris)
- Only reset the pinned kernel contexts on resume (Chris)
- Wakeref tracking improvements (Chris, Imre)
- Lockdep fixes for shrinker interactions (Chris)
- Bump ready tasks ahead of busywaits in prep of semaphore use (Chris)
- Huge step in splitting display code into fine grained files (Jani)
- Refactor the IRQ init/reset macros for code saving (Paulo)
- Convert IRQ initialization code to uncore MMIO access (Paulo)
- Convert workarounds code to use uncore MMIO access (Chris)
- Nuke drm_crtc_state and use intel_atomic_state instead (Manasi)
- Update SKL clock-gating WA (Radhakrishna, Ville)
- Isolate GuC reset code flow (Chris)
- Expose force_dsc_enable through debugfs (Manasi)
- Header standalone compile testing framework (Jani)
- Code cleanups to reduce driver footprint (Chris)
- PSR code fixes and cleanups (Jose)
- Sparse and kerneldoc updates (Chris)
- Suppress spurious combo PHY B warning (Vile)
Jordan Crouse [Fri, 19 Apr 2019 19:46:15 +0000 (13:46 -0600)]
drm/msm/a6xx: Add zap shader load
The a6xx GPU powers on in secure mode which restricts what memory it can
write to. To get out of secure mode the GPU driver can write to
REG_A6XX_RBBM_SECVID_TRUST_CNTL but on targets that are "secure" that
register region is blocked and writes will cause the system to go down.
For those targets we need to execute a special sequence that involves
loadinga special shader that clears the GPU registers and use a PM4
sequence to pull the GPU out of secure. Add support for loading the zap
shader and executing the secure sequence. For targets that do not support
SCM or the specific SCM sequence this should fail and we would fall back
to writing the register.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by: Rob Clark <robdclark@chromium.org>
Jordan Crouse [Fri, 19 Apr 2019 19:46:14 +0000 (13:46 -0600)]
drm/msm/gpu: Move zap shader loading to adreno
a5xx and a6xx both share (mostly) the same code to load the zap shader and
bring the GPU out of secure mode. Move the formerly 5xx specific code to
adreno to make it available for a6xx too.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by: Rob Clark <robdclark@chromium.org>
Jordan Crouse [Fri, 19 Apr 2019 19:56:27 +0000 (13:56 -0600)]
dt-bindings: drm/msm/a6xx: Document interconnect properties for GPU
Add documentation for the interconnect and interconnect-names bindings
for the GPU node as detailed by bindings/interconnect/interconnect.txt.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org> Reviewed-by: Douglas Anderson <dianders@chromium.org> Reviewed-by: Rob Herring <robh@kernel.org> Acked-by: Georgi Djakov <georgi.djakov@linaro.org> Signed-off-by: Rob Clark <robdclark@chromium.org>
drm/msm: Split submit_lookup_objects() into two loops
First loop does copy_from_user() without the table lock held and
just stores the handle. Second loop looks up buffer objects with the
table_lock held without potentially blocking or faulting. This lets us
clean up a bunch of custom, non-faulting copy_from_user() code.
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org> Signed-off-by: Rob Clark <robdclark@chromium.org>
We use a llist and a worker to delay the object cleanup. This avoids
taking mmap_sem and struct_mutex in the wrong order when calling
drm_gem_object_put_unlocked() from drm_gem_mmap().
Fixes lockdep problem with copy_from_user() in msm_ioctl_gem_submit().
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org> Signed-off-by: Rob Clark <robdclark@chromium.org>
Jordan Crouse [Mon, 4 Feb 2019 16:15:43 +0000 (09:15 -0700)]
msm/drm/a6xx: Turn off the GMU if resume fails
Currently if the GMU resume function fails all we try to do is clear the
BOOT_SLUMBER oob which usually times out and ends up in a cycle of death.
If the resume function fails at any point remove any RPMh votes that might
have been added and try to shut down the GMU hardware cleanly.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by: Rob Clark <robdclark@chromium.org>
Jordan Crouse [Mon, 4 Feb 2019 16:15:42 +0000 (09:15 -0700)]
drm/msm/a6xx: Make GMU reset useful
Now that the GX domain is sorted we can wire up a working GMU reset.
IF a GMU hang was detected then try to forcefully shut down the GMU
in the power down sequence which should ensure that it can recover
normally on the next power up.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by: Rob Clark <robdclark@chromium.org>
Jordan Crouse [Mon, 4 Feb 2019 16:15:41 +0000 (09:15 -0700)]
drm/msm/gpu: Attach to the GPU GX power domain
99.999% of the time during normal operation the GMU is responsible
for power and clock control on the GX domain and the CPU remains
blissfully unaware. However, there is one situation where the CPU
needs to get involved:
The power sequencing rules dictate that the GX needs to be turned
off before the CX so that the CX can be turned on before the GX
during power up. During normal operation when the CPU is taking
down the CX domain a stop command is sent to the GMU which turns
off the GX domain and then the CPU handles the CX domain.
But if the GMU happened to be unresponsive while the GX domain was
left then the CPU will need to step in and turn off the GX domain
before resetting the CX and rebooting the GMU. This unfortunately
means that the CPU needs to be marginally aware of the GX domain
even though it is expected to usually keep its hands off.
To support this we create a semi-disabled GX power domain that
does nothing to the hardware on power up but tries to shut it
down normally on power down. In this method the reference counting
is correct and we can step in with the pm_runtime_put() at the right
time during the failure path.
This patch sets up the connection to the GX power domain and does
the magic to "enable" and disable it at the right points.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by: Rob Clark <robdclark@chromium.org>
Jordan Crouse [Mon, 4 Feb 2019 16:15:40 +0000 (09:15 -0700)]
dt-bindings: drm/msm/a6xx: Add GX power-domain for GMU bindings
The GMU should have two power domains defined: "cx" and "gx". "cx" is the
actual power domain for the device and "gx" will be attached at runtime
to manage reference counting on the GPU device in case of a GMU crash.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by: Rob Clark <robdclark@chromium.org>
Jordan Crouse [Mon, 4 Feb 2019 16:15:39 +0000 (09:15 -0700)]
drm/msm/a6xx: Remove unwanted regulator code
The GMU code currently has some misguided code to try to work around
a hardware quirk that requires the power domains on the GPU be
collapsed in a certain order. Upcoming patches will do this the
right way so get rid of the unused and unwanted regulator
code.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by: Rob Clark <robdclark@chromium.org>
Jordan Crouse [Fri, 22 Mar 2019 20:21:22 +0000 (14:21 -0600)]
drm/msm/gpu: Add submit queue queries
Add the capability to query information from a submit queue.
The first available parameter is for querying the number of GPU faults
(hangs) that can be attributed to the queue.
This is useful for implementing context robustness. A user context can
regularly query the number of faults to see if it is responsible for any
and if so it can invalidate itself.
This is also helpful for testing by confirming to the user driver if a
particular command stream caused a fault (or not as the case may be).
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by: Rob Clark <robdclark@chromium.org>
Rob Clark [Tue, 16 Apr 2019 23:13:28 +0000 (16:13 -0700)]
drm/msm: add param to retrieve # of GPU faults (global)
For KHR_robustness, userspace wants to know two things, the count of GPU
faults globally, and the count of faults attributed to a given context.
This patch providees the former, and the next patch provides the latter.
Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
Wen Yang [Wed, 3 Apr 2019 16:04:11 +0000 (00:04 +0800)]
drm/msm: a5xx: fix possible object reference leak
The call to of_get_child_by_name returns a node pointer with refcount
incremented thus it must be explicitly decremented after the last
usage.
Detected by coccinelle with the following warnings:
drivers/gpu/drm/msm/adreno/a5xx_gpu.c:57:2-8: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 47, but without a corresponding object release within this function.
drivers/gpu/drm/msm/adreno/a5xx_gpu.c:66:2-8: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 47, but without a corresponding object release within this function.
drivers/gpu/drm/msm/adreno/a5xx_gpu.c:118:1-7: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 47, but without a corresponding object release within this function.
drivers/gpu/drm/msm/adreno/a5xx_gpu.c:57:2-8: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 51, but without a corresponding object release within this function.
drivers/gpu/drm/msm/adreno/a5xx_gpu.c:66:2-8: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 51, but without a corresponding object release within this function.
drivers/gpu/drm/msm/adreno/a5xx_gpu.c:118:1-7: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 51, but without a corresponding object release within this function.
Signed-off-by: Wen Yang <wen.yang99@zte.com.cn> Cc: Rob Clark <robdclark@gmail.com> Cc: Sean Paul <sean@poorly.run> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Jordan Crouse <jcrouse@codeaurora.org> Cc: Mamta Shukla <mamtashukla555@gmail.com> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Sharat Masetty <smasetty@codeaurora.org> Cc: linux-arm-msm@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Cc: freedreno@lists.freedesktop.org Cc: linux-kernel@vger.kernel.org (open list) Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Rob Clark <robdclark@chromium.org>
Douglas Anderson [Wed, 16 Jan 2019 18:46:22 +0000 (10:46 -0800)]
drm/msm: Cleanup A6XX opp-level reading
The patch ("OPP: Add support for parsing the 'opp-level' property")
adds an API enabling a cleaner way to read the opp-level. Let's use
the new API.
Signed-off-by: Douglas Anderson <dianders@chromium.org> Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Rob Clark <robdclark@chromium.org>
Iterate and assign HW intf block to physical encoders
in encoder modeset. Moving all the HW block assignments
to encoder modeset to allow easy switching to state
based resource management.
drm/msm/dpu: map mixer/ctl hw blocks in encoder modeset
After resource allocation, iterate and populate mixer/ctl
hw blocks in encoder modeset thereby centralizing all
the resource mapping to the CRTC. This change is made
for easy switching to state based allocation using
private objects later in this series.
release resources allocated in mode_set if any of
the hw check fails. Most of these checks are not
necessary and they will be removed in the follow up
patches with state based resource allocations.
Both video and command physical encoders will have
a hw interface assigned to it. So there is really no
need to track the hw block in specific encoder subclass.
Sean Paul [Wed, 30 Jan 2019 16:32:12 +0000 (11:32 -0500)]
drm/msm: dpu: Don't set frame_busy_mask for async updates
The frame_busy mask is used in frame_done event handling, which is not
invoked for async commits. So an async commit will leave the
frame_busy mask populated after it completes and future commits will start
with the busy mask incorrect.
This showed up on disable after cursor move. I was hitting the "this should
not happen" comment in the frame event worker since frame_busy was set,
we queued the event, but there were no frames pending (since async
also doesn't set that).
Sean Paul [Mon, 28 Jan 2019 20:42:51 +0000 (15:42 -0500)]
drm/msm: dpu: Don't queue the frame_done watchdog for cursor
In the case of an async/cursor update, we don't wait for the frame_done
event, which means handle_frame_done is never called, and the frame_done
watchdog isn't canceled. Currently, this results in a frame_done timeout
every time the cursor moves without a synchronous frame following it up
before the timeout expires. Since we don't wait for frame_done, and
don't handle it, we shouldn't modify the watchdog.
Sean Paul [Mon, 28 Jan 2019 20:42:50 +0000 (15:42 -0500)]
drm/msm: dpu: Untangle frame_done timeout units
There exists a bunch of confusion as to what the actual units of
frame_done is:
- The definition states it's in # of frames
- CRTC treats it like it's ms
- frame_done_timeout comment thinks it's Hz, but it stores ms
- frame_done timer is setup such that it _should_ be in frames, but the
timeout is super long
So this patch tries to interpret what the driver really wants. I've
de-centralized the #define since the consumers are expecting different
units.
For crtc, we just use 60ms since that's what it was doing before.
Perhaps we could get fancy and scale with vrefresh, but that's for
another time.
For encoder, fix the comments and rename frame_done_timeout so it's
obvious what the units are. In practice, frame_done_timeout is really
just checked against 0 || !0, which I guess is why the units being wrong
didn't matter. I've also dropped the timeout from the previous 60 frames
to 5. That seems like more than enough time to give up on a frame, and
my guess is that no one intended for the timeout to _actually_ be 60
frames.
Instead of setting the timeout and then immediately reading it back
(along with the hand-rolled msecs_to_jiffies calculation), just
calculate it once and set it in both places at the same time.
Jordan Crouse [Tue, 19 Feb 2019 18:48:20 +0000 (11:48 -0700)]
drm/msm: Remove pm_runtime calls from msm_iommu.c
Currently the IOMMU code calls pm_runtime_get/put on the GPU or display
device before doing a IOMMU operation. This was because usually the
IOMMU driver didn't do power control of its own and since the hardware
used the same clocks and power as the respective multimedia device it
was a easy way to make sure that the power was available.
Now two things have changed. First, the SMMU devices can do their own power
control and more important bringing up the a6xx GPU isn't as easy as
turning on some clocks. To bring the GPU up we need the GMU which itself
needs the IOMMU so we have a chicken and egg problem.
Luckily this is easily fixed by removing the pm_runtime calls from the
functions and letting the device link to the IOMMU device handle the magic.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org> Tested-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Rob Clark <robdclark@chromium.org>
Lucas Stach [Thu, 28 Feb 2019 06:23:29 +0000 (07:23 +0100)]
drm/msm: don't allocate pages from the MOVABLE zone
The pages backing the GEM objects are kept pinned in place as
long as they are alive, so they must not be allocated from the
MOVABLE zone. Blocking page migration for too long will cause
the VM subsystem headaches and will outright break CMA, as a
few pinned pages in CMA will lead to failure to find the
required large contiguous regions.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Signed-off-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Rob Clark <robdclark@chromium.org>
Dmitry Osipenko [Wed, 6 Mar 2019 22:55:19 +0000 (01:55 +0300)]
drm/tegra: gem: Fix CPU-cache maintenance for BO's allocated using get_pages()
The allocated pages need to be invalidated in CPU caches. On ARM32 the
DMA_BIDIRECTIONAL flag only ensures that data is written-back to DRAM and
the data stays in CPU cache lines. While the DMA_FROM_DEVICE flag ensures
that the corresponding CPU cache lines are getting invalidated and nothing
more, that's exactly what is needed for a newly allocated pages.
This fixes randomly failing rendercheck tests on Tegra30 using the
Opentegra driver for tests that use small-sized pixmaps (10x10 and less,
i.e. 1-2 memory pages) because apparently CPU reads out stale data from
caches and/or that data is getting evicted to DRAM at the time of HW job
execution.
Add binding for the LG ACX467AKM-7 4.95" 1080×1920 LCD panel that is
found on the LG Nexus 5 (hammerhead) phone. This appears to be a JDI
panel based on some Internet searches, however a specific model number
could not be found. I disassembled an old Nexus 5 with a broken
screen and the LG part number is the only model number present on the
back of the panel, so I think that is probably the best ID to use.
drm/drv: Fix incorrect resolution of merge conflict
Commit f06ddb53096b ("BackMerge v5.1-rc5 into drm-next") incorrectly
resolved a merge conflict related to a patch having been merged twice:
- commit 3f04e0a6cfeb ("drm: Fix drm_release() and device unplug")
introduced as a standalone fix via drm-fixes branch,
- commit 1ee57d4d75fb ("drm: Fix drm_release() and device unplug")
applied as patch 1/2 of a series on drm-next branch.
That incorrect resolution of the conflict effectively reverted a change
introduced to drivers/gpu/drm/drm_drv.c by patch 2/2 of that series -
commit ba3bf37e150a ("drm/drv: drm_dev_unplug(): Move out drm_dev_put()
call"). Fix it.
drivers/gpu/drm/meson/meson_viu.c:93:6: warning: symbol 'meson_viu_set_g12a_osd1_matrix' was not declared. Should it be static?
drivers/gpu/drm/meson/meson_viu.c:121:6: warning: symbol 'meson_viu_set_osd_matrix' was not declared. Should it be static?
drivers/gpu/drm/meson/meson_viu.c:190:6: warning: symbol 'meson_viu_set_osd_lut' was not declared. Should it be static?
drivers/gpu/drm/sun4i/sun8i_tcon_top.c:271:36: warning: symbol 'sun8i_r40_tcon_top_quirks' was not declared. Should it be static?
drivers/gpu/drm/sun4i/sun8i_tcon_top.c:276:36: warning: symbol 'sun50i_h6_tcon_top_quirks' was not declared. Should it be static?
drivers/gpu/drm/sun4i/sun4i_tcon.c:239:6: warning: symbol 'sun4i_tcon_set_mux' was not declared. Should it be static?
Jani Nikula [Tue, 16 Apr 2019 08:28:52 +0000 (11:28 +0300)]
drm/i915/ehl: inherit icl cdclk init/uninit
The cdclk init/uninit code was changed by commit 93a643f29bcb
("drm/i915/cdclk: have only one init/uninit function") between the
versions of commit 39564ae86d51 ("drm/i915/ehl: Inherit Ice Lake
conditional code"). What got merged fails to do cdclk init/uninit on
ehl.
Chris Wilson [Fri, 12 Apr 2019 07:14:16 +0000 (08:14 +0100)]
drm/i915: Introduce struct class_instance for engines across the uAPI
SSEU reprogramming of the context introduced the notion of engine class
and instance for a forwards compatible method of describing any engine
beyond the old execbuf interface. We wish to adopt this class:instance
description for more interfaces, so pull it out into a separate type for
userspace convenience.
Fixes: e46c2e99f600 ("drm/i915: Expose RPCS (SSEU) configuration to userspace (Gen11 only)") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com> Cc: Tony Ye <tony.ye@intel.com> Cc: Andi Shyti <andi@etezian.org> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Acked-by: Tony Ye <tony.ye@intel.com> Reviewed-by: Andi Shyti <andi@etezian.org> Link: https://patchwork.freedesktop.org/patch/msgid/20190412071416.30097-1-chris@chris-wilson.co.uk
Eric Anholt [Mon, 1 Apr 2019 22:26:33 +0000 (15:26 -0700)]
drm: Add helpers for setting up an array of dma_fence dependencies.
I needed to add implicit dependency support for v3d, and Rob Herring
has been working on it for panfrost, and I had recently looked at the
lima implementation so I think this will be a good intersection of
what we all want and simplify our scheduler implementations.
v2: Rebase on xa_limit_32b API change, and tiny checkpatch cleanups on
the way in (unsigned int vs unsigned, extra return before
EXPORT_SYMBOL_GPL)
Paulo Zanoni [Wed, 10 Apr 2019 23:53:44 +0000 (16:53 -0700)]
drm/i915: fully convert the IRQ initialization macros to intel_uncore
Make them take the uncore argument from the caller instead of passing
the implicit &dev_priv->uncore directly. This will allow us to finally
pass something that's not dev_priv->uncore in the future, and gets rid
of the implicit variables in register macros.
Paulo Zanoni [Wed, 10 Apr 2019 23:53:43 +0000 (16:53 -0700)]
drm/i915: convert the IRQ initialization functions to intel_uncore
The IRQ initialization helpers are simple and self-contained. Continue
the transition started in the recent uncore rework to get us rid of
I915_READ/WRITE and the implicit dev_priv variables.
While the implicit dev_priv is removed from the IRQ initialization
helpers, we didn't get rid of them in the macro callers. Doing that
should be very simple now.
Paulo Zanoni [Wed, 10 Apr 2019 23:53:42 +0000 (16:53 -0700)]
drm/i915: add GEN2_ prefix to the I{E, I, M, S}R registers
This discussion started because we use token pasting in the
GEN{2,3}_IRQ_INIT and GEN{2,3}_IRQ_RESET macros, so gen2-4 passes an
empty argument to those macros, making the code a little weird. The
original proposal was to just add a comment as the empty argument, but
Ville suggested we just add a prefix to the registers, and that indeed
sounds like a more elegant solution.
Now doing this is kinda against our rules for register naming since we
only add gens or platform names as register prefixes when the given
gen/platform changes a register that already existed before. On the
other hand, we have so many instances of IIR/IMR in comments that
adding a prefix would make the users of these register more easily
findable, in addition to make our token pasting macros actually
readable. So IMHO opening an exception here is worth it.
Paulo Zanoni [Wed, 10 Apr 2019 23:53:41 +0000 (16:53 -0700)]
drm/i915: don't specify the IRQ register in the gen2 macros
Like the gen3+ macros, the gen2 versions of the IRQ initialization
macros take the register name in the 'type' argument. But gen2 only
has one set of registers, so there's really no need to specify the
type. This commit removes the type argument and uses the registers
directly instead of passing them through variables.
Paulo Zanoni [Wed, 10 Apr 2019 23:53:40 +0000 (16:53 -0700)]
drm/i915: refactor the IRQ init/reset macros
The whole point of having macros here is for the token pasting
necessary to automatically have IMR, IIR and IER selected. We don't
really need or want all the inlining that happens as a consequence.
The good thing about the current code is that it works regardless of
the relative offsets between these registers (they change after gen4,
with the usual VLV/CHV exceptions).
One thing which we can do is to split the logic of what we do with
imr/ier/iir to functions separate from the macros that pick them.
That's what we do in this commit. This allows us to get rid of the
gen8 duplicates and also all the inlining:
v2:
- Make checkpatch happy with a temporary which_ (Checkpatch).
- Reorder the arguments for the INIT macros (Ville).
- Correctly explain when the register offsets change in the commit
message (Ville).
- Use more line breaks in the macro calls to make the arguments look
a little more organized/readable.
- Update the bloat-o-meter output (minor change only).
Joel Stanley [Fri, 5 Apr 2019 08:11:17 +0000 (18:41 +1030)]
drm: aspeed: Clean up Kconfig options
The GFX IP is inside of the ASPEED BMC SoC so there is little use
enabling it on a kernel that does not support ASPEED.
When building with COMPILE_TEST the architecture many not have CMA
support, so to avoid breaking the build we only select these options if
the architecture supports the contiguous allocator.
I suspect the DRM_PANEL came from a cut/paste error.
Fixes: 4f2a8f5898ec ("drm: Add ASPEED GFX driver") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20190405081117.27339-1-joel@jms.id.au
Christian König [Mon, 15 Apr 2019 12:46:34 +0000 (14:46 +0200)]
dma-buf: explicitely note that dma-fence-chains use 64bit seqno
Instead of checking the upper values of the sequence number use an explicit
field in the dma_fence_ops structure to note if a sequence should be 32bit
or 64bit.
Chris Wilson [Tue, 16 Apr 2019 08:52:18 +0000 (09:52 +0100)]
drm/i915: Drop bool return from breadcrumbs signaler
Since removal of the "missed interrupt detection" nobody used the result
of whether or not we signaled anybody during that invocation, so now
remove the return value.
drm/i915/gvt: addressed guest GPU hang with HWS index mode
with the introduce of "switch to use HWS indices rather than address",
guest GPU hang observed when running workloads which will update the
seqno to the real HW HWSP, not vitural GPU HWSP and then cause GPU hang.
this patch is to revoke index mode in PIPE_CTRL and MI_FLUSH_DW and
patch guest GPU HWSP address value to these commands.
Fixes: 54939ea0bd85 ("drm/i915: Switch to use HWS indices rather than addresses") Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Xiaolin Zhang <xiaolin.zhang@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Zhenyu Wang [Tue, 16 Apr 2019 08:50:34 +0000 (16:50 +0800)]
Merge tag 'drm-intel-next-2019-04-04' into gvt-next
Merge back drm-intel-next for engine name definition refinement
and 54939ea0bd85 ("drm/i915: Switch to use HWS indices rather than addresses")
that would need gvt fixes to depend on.
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
drm/i915: Nuke drm_crtc_state and use intel_atomic_state instead
This is one of the patches to start replacing drm pointers
and use the intel_atomic_state and intel_crtc to derive
the necessary intel state variables required for the intel
modeset functions.
v3:
* Remove the unwanted newline (Ville)
v2:
* Flip the function arguments (Ville)
* Remove some remaining instances of drm pointers (Ville)
* Use old_crtc_state and new_crtc_state (Ville)
Chris Wilson [Sat, 13 Apr 2019 12:58:20 +0000 (13:58 +0100)]
drm/i915/selftests: Skip live timeline/suspend tests if wedged
If the driver is wedged, we can not issue the requests to exercise the
timelines or the system across suspend, so skip the tests. live_hangcheck
is there to fail if we cannot recover.
drm/amd/display: Add profiling tools for bandwidth validation
[Why]
We used this change to investigate the performance of bandwidth validation,
it will be useful to have if we need to investigate further.
[How]
We use performance counter tick numbers to profile performance, they live
at dc->debug.bw_val_profile (set .enable in debugger to turn on measuring).
Signed-off-by: Joshua Aberback <joshua.aberback@amd.com> Reviewed-by: Tony Cheng <Tony.Cheng@amd.com> Acked-by: Bhawanpreet Lakha <Bhawanpreet Lakha@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/display: Relax requirements for CRTCs to be enabled
[Why]
As long as we have at least one non-cursor plane enabled on a CRTC then
the CRTC itself can remain enabled.
This will allow for commits where there's an overlay plane enabled but
no primary plane enabled.
[How]
Remove existing primary plane fb != NULL checks and replace them with
the new does_crtc_have_active_plane helper.
This will be called from atomic check when validating the CRTC.
Since the primary plane state can now potentially be NULL we'll need
to guard for that when accessing it in some of the cursor logic.
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Reviewed-by: David Francis <David.Francis@amd.com> Acked-by: Bhawanpreet Lakha <Bhawanpreet Lakha@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/display: Check scaling info when determing update type
[Why]
Surface scaling info updates can affect bandwidth and blocks. We need
to be checking these with global validation to avoid underflow or
corruption.
[How]
Drop the state->allow_modeset early exit in
dm_determine_update_type_for_commit. Most of those should be considered
fast now anyway.
Fill in scaling info and it to the surface update in atomic
check.
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Reviewed-by: David Francis <David.Francis@amd.com> Acked-by: Bhawanpreet Lakha <Bhawanpreet Lakha@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/display: Don't warn when DC update type > DM guess
[Why]
DM thinks that the update type should be full whenever a stream or
plane is added or removed (including recreations).
This won't match in the case where DC thinks what looks like a fast
update to DM is actually a medium or full - like scaling changes that
affect bandwidth and clocks.
[How]
Drop this warning. DC knows better than the DM does for determining
cases like this.
The other warning can be kept for now since it would warn on a pretty
serious DC or DM bug.
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Reviewed-by: David Francis <David.Francis@amd.com> Acked-by: Bhawanpreet Lakha <Bhawanpreet Lakha@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/display: Use surface directly when checking update type
[Why]
DC expects the surface memory address to identify the surface.
This doesn't work with what we're doing with the temporary surfaces,
it will always assume this is a full update because the surface
isn't in the current context.
[How]
Use the surface directly. This doesn't give us much improvement yet,
since we always create a new dc_plane_state when state->allow_modeset
is true.
The call into dc_check_update_surfaces_for_stream also needs to be
locked, for two reasons:
1. It checks the current DC state
2. It modifies the surface update flags
Both of which could be currently in the middle of commit work from
commit tail.
A TODO here is to pass the context explicitly into this function and
find a way to get the surface update flags out of it without modifying
the surface in place.
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Reviewed-by: David Francis <David.Francis@amd.com> Acked-by: Bhawanpreet Lakha <Bhawanpreet Lakha@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/display: Add basic downscale and upscale valdiation
[Why]
Planes have downscaling limits and upscaling limits per format and DM
is expected to validate these using DC caps. We should fail atomic
check validation if we aren't capable of doing the scaling.
[How]
We don't currently create store which DC plane maps to which DRM plane
so we can't easily check the caps directly. For now add basic
constraints that cover the absolute min and max downscale / upscale
limits for most RGB and YUV formats across ASICs.
Leave a TODO indicating that these should really be done with DC caps.
We'll probably need to subclass DRM planes again in order to correctly
identify which DC plane maps to it.
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Reviewed-by: Harry Wentland <Harry.Wentland@amd.com> Reviewed-by: Tony Cheng <Tony.Cheng@amd.com> Acked-by: Bhawanpreet Lakha <Bhawanpreet Lakha@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/display: Rework DC plane filling and surface updates
[Why]
We currently don't do DC validation for medium or full updates where
the plane state isn't created. There are some medium and full updates
that can cause bandwidth or clock changes to occur resulting in
underflow or corruption.
We need to be able to fill surface and plane info updates during
atomic commit for dm_determine_update_type for commit. Since we already
do this during atomic commit tail it would be good if we had the same
logic in both places for creating these structures.
[How]
Introduce fill_dc_scaling_info and fill_dc_plane_info_and_addr.
These two functions cover the following three update structures:
Cleanup and adapter the existing fill_plane_* helpers to work with
these functions.
Update call sites that used most of these sub helpers directly to work
with the new functions. The exception being prepare_fb - we just want
the new buffer attributes specifically in the case where we're
creating the plane. This is needed for dc_commit_state in the case
where the FB hasn't been previously been used.
This isn't quite a refactor, but functionally driver behavior should
be mostly the smae as before. The one exception is that we now check
the return code for fill_plane_buffer_attributes which means that
commits will be rejected that try to enable DCC with erroneous
parameters.
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Reviewed-by: David Francis <David.Francis@amd.com> Acked-by: Bhawanpreet Lakha <Bhawanpreet Lakha@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/display: Recalculate pitch when buffers change
[Why]
Pitch was only calculated based on format whenever the plane state
was recreated. This could result in surface corruption due to the
incorrect pitch being programmed when the surface pitch changed during
commits where state->allow_modeset = false.
[How]
Recalculate pitch at the same time we update the buffer address and
other buffer attributes. This function was previously called
fill_plane_tiling_attributes but I've also renamed it to
fill_plane_buffer_attributes to clarify the actual intent of the
function now that it's handling most buffer related attributes.
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Reviewed-by: David Francis <David.Francis@amd.com> Acked-by: Bhawanpreet Lakha <Bhawanpreet Lakha@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/display: Maintain z-ordering when creating planes
[Why]
The overlay will be incorrectly placed *below* the primary plane for
commits with state->allow_modeset = true because the primary plane
won't be removed and recreated in the same commit.
[How]
Add the should_reset_plane helper to determine if the plane should be
reset or not. If we need to add or force reset any plane in the context
then we'll need to do the same for every plane on the stream.
Unfortunately we need to do the remove / recreate routine for removing
planes as well since DC currently isn't well equipped to handle the
plane with the top pipe being removed with other planes still active.
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Reviewed-by: Sun peng Li <Sunpeng.Li@amd.com> Acked-by: Bhawanpreet Lakha <Bhawanpreet Lakha@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/display: Update plane scaling parameters for fast updates
[Why]
Plane scaling parameters are not correctly filled or updated when
performing fast updates.
They're filled when creating the dc plane state and during atomic check.
While the atomic check code path happens for the plane even during fast
updates, the issue is that they're done in place on the dc_plane_state
directly. This dc_plane_state may be the current state plane state
being used by the hardware, so these parameters won't be correctly
programmed.
The new scaling parameters should instead be passed as an update
to the plane.
[How]
Update fill_rects_from_plane_state to not modify dc_plane_state
directly. Update the call sites that use this to fill in the appropriate
values.
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Reviewed-by: Sun peng Li <Sunpeng.Li@amd.com> Acked-by: Bhawanpreet Lakha <Bhawanpreet Lakha@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
David Francis [Mon, 25 Mar 2019 13:44:04 +0000 (09:44 -0400)]
drm/amd/display: Handle get crtc position error
[Why]
dc_stream_get_crtc_position can return false.
This was unhandled in delay_cursor_until_vupdate
[How]
If dc_stream_get_crtc_position returns false, something
is weird. Don't delay.
Signed-off-by: David Francis <David.Francis@amd.com> Reviewed-by: Sun peng Li <Sunpeng.Li@amd.com> Acked-by: Bhawanpreet Lakha <Bhawanpreet Lakha@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Jun Lei [Tue, 26 Mar 2019 21:32:59 +0000 (17:32 -0400)]
drm/amd/display: expand plane caps to include fp16 and scaling capability
[why]
there are some scaling capabilities such as fp16 which are known to be unsupported
on a given ASIC. exposing these static capabilities allows much simpler implementation
for OS interfaces which require to report such static capabilities to reduce the
number of dynamic validation calls
[how]
refactor the existing plane caps to be more extensible, and add fp16 and scaling
capabilities
Signed-off-by: Jun Lei <Jun.Lei@amd.com> Reviewed-by: Tony Cheng <Tony.Cheng@amd.com> Acked-by: Bhawanpreet Lakha <Bhawanpreet Lakha@amd.com> Acked-by: Harry Wentland <Harry.Wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/display: Add DRM color properties for primary planes
[Why]
We need DC's color space to match the color encoding and color space
specified by userspace to correctly render YUV surfaces.
[How]
Add the DRM color properties when the DC plane supports NV12.
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Reviewed-by: Sun peng Li <Sunpeng.Li@amd.com> Acked-by: Bhawanpreet Lakha <Bhawanpreet Lakha@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/display: Expose support for NV12 on suitable planes
[Why]
Hardware can support video surfaces and DC tells us which planes are
suitable via DC plane caps.
[How]
The supported formats array will now vary based on what DC tells us,
so create an array and fill it dynamically based on plane types and
caps.
Ideally we'd query support for every format via DC plane caps, but for
the framework is in place to do so later with this.
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Reviewed-by: Sun peng Li <Sunpeng.Li@amd.com> Acked-by: Bhawanpreet Lakha <Bhawanpreet Lakha@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Anthony Koo [Thu, 28 Mar 2019 16:39:48 +0000 (12:39 -0400)]
drm/amd/display: Add switch for Fractional PWM on or off
[Why]
Some LED Driver might not like Fractional PWM especially at extreme
ranges near 0% or 100%.
For example, backlight flashing could be observed.
We want a way to switch fractional PWM on/off either for debug, or
possibly production.
[How]
Add DC code that can send new FW command to enable/disable
fractional PWM.
Signed-off-by: Anthony Koo <Anthony.Koo@amd.com> Reviewed-by: Aric Cyr <Aric.Cyr@amd.com> Acked-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Jann Horn reported that he can overflow the page ref count with
sufficient memory (and a filesystem that is intentionally extremely
slow).
Admittedly it's not exactly easy. To have more than four billion
references to a page requires a minimum of 32GB of kernel memory just
for the pointers to the pages, much less any metadata to keep track of
those pointers. Jann needed a total of 140GB of memory and a specially
crafted filesystem that leaves all reads pending (in order to not ever
free the page references and just keep adding more).
Still, we have a fairly straightforward way to limit the two obvious
user-controllable sources of page references: direct-IO like page
references gotten through get_user_pages(), and the splice pipe page
duplication. So let's just do that.
* branch page-refs:
fs: prevent page refcount overflow in pipe_buf_get
mm: prevent get_user_pages() from overflowing page refcount
mm: add 'try_get_page()' helper function
mm: make page ref count overflow check tighter and more explicit
Matthew Wilcox [Fri, 5 Apr 2019 21:02:10 +0000 (14:02 -0700)]
fs: prevent page refcount overflow in pipe_buf_get
Change pipe_buf_get() to return a bool indicating whether it succeeded
in raising the refcount of the page (if the thing in the pipe is a page).
This removes another mechanism for overflowing the page refcount. All
callers converted to handle a failure.
mm: prevent get_user_pages() from overflowing page refcount
If the page refcount wraps around past zero, it will be freed while
there are still four billion references to it. One of the possible
avenues for an attacker to try to make this happen is by doing direct IO
on a page multiple times. This patch makes get_user_pages() refuse to
take a new page reference if there are already more than two billion
references to the page.
This is the same as the traditional 'get_page()' function, but instead
of unconditionally incrementing the reference count of the page, it only
does so if the count was "safe". It returns whether the reference count
was incremented (and is marked __must_check, since the caller obviously
has to be aware of it).
Also like 'get_page()', you can't use this function unless you already
had a reference to the page. The intent is that you can use this
exactly like get_page(), but in situations where you want to limit the
maximum reference count.
The code currently does an unconditional WARN_ON_ONCE() if we ever hit
the reference count issues (either zero or negative), as a notification
that the conditional non-increment actually happened.
NOTE! The count access for the "safety" check is inherently racy, but
that doesn't matter since the buffer we use is basically half the range
of the reference count (ie we look at the sign of the count).