Karol Herbst [Fri, 24 Nov 2017 02:56:26 +0000 (03:56 +0100)]
drm/nouveau/pci: do a msi rearm on init
On my GP107 when I load nouveau after unloading it, for some reason the
GPU stopped sending or the CPU stopped receiving interrupts if MSI was
enabled.
Doing a rearm once before getting any interrupts fixes this.
Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Karol Herbst [Mon, 6 Nov 2017 15:20:33 +0000 (16:20 +0100)]
drm/nouveau/fbcon: fix NULL pointer access in nouveau_fbcon_destroy
When the fbcon object is initialized, but nouveau_fbcon_create is not
called, we run into a NULL pointer access within nouveau_fbcon_create when
unloading nouveau.
The call to drm_fb_helper_funcs.fb_probe is deferred until there is a
display for real since 4.14, that's why fbcon->helper.fb is still not set.
Signed-off-by: Karol Herbst <kherbst@redhat.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau: check kind validity against mmu object
This is already handled in the top-level gem_new() ioctl in another manner,
but this will be removed in a future commit.
Ideally we'd not need to check up-front at all, and let the VMM code handle
error checking, but there are paths in the current BO management code where
this isn't possible due to map() not always being called during BO creation,
and map() calls not being allowed to fail during buffer migration.
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau: remove explicit unmaps
If the VMA is being deleted, we don't need to explicity unmap first
anymore. The MMU code will automatically merge the operations into
a single page tree walk.
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/mmu: implement new vmm frontend
These are the new priviledged interfaces to the VMM backends, and expose
some functionality that wasn't previously available.
It's now possible to allocate a chunk of address-space (even all of it),
without causing page tables to be allocated up-front, and then map into
it at arbitrary locations. This is the basic primitive used to support
features such as sparse mapping, or to allow userspace control over its
own address-space, or HMM (where the GPU driver isn't in control of the
address-space layout).
Rather than being tied to a subtle combination of memory object and VMA
properties, arguments that control map flags (ro, kind, etc) are passed
explicitly at map time.
The compatibility hacks to implement the old frontend on top of the new
driver backends have been replaced with something similar to implement
the old frontend's interfaces on top of the new frontend.
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/mmu/gp100,gp10b: implement new vmm backend
Adds support for:
- 64KiB/2MiB big page sizes (128KiB not supported by HW with new PT layout).
- System-memory PTs.
- LPTE "invalid" state.
- (Tegra) Use of video memory aperture.
- Sparse PDEs/PTEs.
- Additional blocklinear kinds.
- 49-bit address-space.
GP100 supports an entirely new 5-level page table layout that provides
an expanded 49-bit address-space. It also supports the layout present
on previous generations, which we've been making do with until now.
This commit implements support for the new layout, and enables it by
default.
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/mmu/gk104,gk20a: implement new vmm backend
Adds support for:
- 64KiB big page size.
- System-memory PTs.
- LPTE "invalid" state.
- (Tegra) Use of video memory aperture.
Adds support for marking LPTEs invalid, resulting in the corresponding
SPTEs being ignored, which is supposed to speed up TLB invalidates.
On The Tegra side, this will switch to using the video memory aperture
for all mappings. The HW will still target non-coherent system memory,
but this aperture needs to be selected in order to support compression.
Tegra's instmem backend somewhat cheated to get this effect previously.
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/mmu: implement new vmm backend
This is the common code to support a rework of the VMM backends.
It adds support for more than 2 levels of page table nesting, which
is required to be able to support GP100's MMU layout.
Sparse mappings (that don't cause MMU faults when accessed) are now
supported, where the backend provides it.
Dual-PT handling had to become more sophisticated to support sparse,
but this also allows us to support an optimisation the MMU provides
on GK104 and newer.
Certain operations can now be combined into a single page tree walk
to avoid some overhead, but also enables optimsations like skipping
PTE unmap writes when the PT will be destroyed anyway.
The old backend has been hacked up to forward requests onto the new
backend, if present, so that it's possible to bisect between issues
in the backend changes vs the upcoming frontend changes.
Until the new frontend has been merged, new backends will leak BAR2
page tables on module unload. This is expected, and it's not worth
the effort of hacking around this as it doesn't effect runtime.
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/ltc/gm200: limit NV_MMU_PTE_COMPTAGLINE bits to 16 where required
If NV_PFB_MMU_CTRL_USE_FULL_COMP_TAG_LINE is TRUE, then the last bit of
NV_MMU_PTE_COMPTAGLINE is re-purposed to select the upper/lower half of
a compression tag when using 64KiB big pages.
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/mmu: implement base for new vm management
This is the first chunk of the new VMM code that provides the structures
needed to describe a GPU virtual address-space layout, as well as common
interfaces to handle VMM creation, and connecting instances to a VMM.
The constructor now allocates the PD itself, rather than having the user
handle that manually. This won't/can't be used until after all backends
have been ported to these interfaces, so a little bit of memory will be
wasted on Fermi and newer for a couple of commits in the series.
Compatibility has been hacked into the old code to allow each GPU backend
to be ported individually.
GP100 "big" (which is a funny name, when it supports "even bigger") page
tables are small enough that we want to be able to suballocate them from
a larger block of memory.
This builds on the previous page table cache interfaces so that the VMM
code doesn't need to know the difference.
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/mmu: implement page table cache
Builds up and maintains a small cache of each page table size in order
to reduce the frequency of expensive allocations, particularly in the
pathological case where an address range ping-pongs between allocated
and free.
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau: allocate vram with nvkm_ram_get()
This will cause a subtle behaviour change on GPUs that are in mixed-memory
configurations in that VRAM in the degraded section of VRAM will no longer
be used for TTM buffer objects.
That section of VRAM is not meant to be used for displayable/compressed
surfaces, and we have no reliable way with the current interfaces to be
able to make that decision properly.
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/fb/ram: add interface to allocate vram as an nvkm_memory object
Upcoming MMU changes use nvkm_memory as its basic representation of memory,
so we need to be able to allocate VRAM like this.
The code is basically identical to the current chipset-specific allocators,
minus support for compression tags (which will be handled elsewhere anyway).
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/core/memory: add mechanism to retrieve allocation granularity
Needed by VMM code to determine whether an allocation is compatible with
a given page size (ie. you can't map 4KiB system memory pages into 64KiB
GPU pages).
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/core/memory: change map interface to support upcoming mmu changes
Map flags (access, kind, etc) are currently defined in either the VMA,
or the memory object, which turns out to not be ideal for things like
suballocated buffers, etc.
These will become per-map flags instead, so we need to support passing
these arguments in nvkm_memory_map().
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/core/memory: comptag allocation
nvkm_memory is going to be used by the upcoming mmu rework for the basic
representation of a memory allocation, as such, this commit adds support
for comptag allocation to nvkm_memory.
This is very simple for now, in that it requires comptags for the entire
memory allocation even if only certain ranges are compressed.
Support for tracking ranges will be added at a later date.
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/fb: move comptags mm into nvkm_fb
We're moving towards having a central place to handle comptag allocation,
and as some GPUs don't have a ram submodule (ie. Tegra), we need to move
the mm somewhere else.