Ben Skeggs [Thu, 20 Aug 2015 04:54:06 +0000 (14:54 +1000)]
drm/nouveau/device: add direct pointer to struct device
A future commit will hide the platform/pci specifics from nvkm_device,
but it's still very useful in a lot of places to have access to the
Linux device struct.
Ben Skeggs [Thu, 27 Aug 2015 07:33:19 +0000 (17:33 +1000)]
drm/nouveau/pmu/gk104: implement a hackish workaround for a hw bug
Only a handful of machines have this enabled by default, where it's been
proven to work. The workaround can be explicitly enabled with a module
option also.
Still waiting on feedback from NVIDIA for a proper idea of exactly what
this fix is doing, and how to implement it properly.
Ben Skeggs [Fri, 21 Aug 2015 00:52:54 +0000 (10:52 +1000)]
drm/nouveau/bios/dcb: accept "maxwell" lane count values for dcb 4.0
We previously assumed that the values "2" and "4" were new in DCB 4.1,
however, there's at least one GM107 DCB 4.0 board (Quadro K620) that
uses the newer values.
Hans de Goede [Thu, 23 Jul 2015 15:20:12 +0000 (17:20 +0200)]
drm/nouveau/nv46: Change mc subdev oclass from nv44 to nv4c
MSI interrupts appear to not work for nv46 based cards. Change the mc
subdev oclass for these cards from nv44 to nv4c, the nv4c mc code is
identical to the nv44 mc code except that it does not use msi
(it does not define a msi_rearm callback).
Samuel Pitoiset [Sun, 26 Jul 2015 09:30:08 +0000 (11:30 +0200)]
drm/nouveau/pm/nv50: TPC[0x3] must be used for PGRAPH muxs on G80
I thought that using TPC[0x0] like for G84:GT215 was sufficient on G80,
but it's actually not the case. According to NVIDIA PerfKit on Windows,
we have to configure PGRAPH related muxs on TPC[0x3] for this chipset.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Add support for GM20B's graphics engine, based on GK20A. Note that this
code alone will not allow the engine to initialize on released devices
which require PMU-assisted secure boot.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
drm/nouveau/gr/gk20a: use same initialization sequence as nvgpu
GK20A's initialization was based on GK104, but differences exist in the
way the initial context is built and the initialization process itself.
This patch follows the same initialization sequence as nvgpu performs
to avoid bad surprises. Since the register bundles initialization also
differ considerably from GK104, the register packs are now loaded from
firmware files, again similarly to what is done with nvgpu.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
drm/nouveau/gr: use NVIDIA-provided external firmwares
NVIDIA will officially start providing GR firmwares through
linux-firmware for GPUs that require it. Change the GR firmware lookup
function to use these files.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Samuel Pitoiset [Fri, 19 Jun 2015 15:37:18 +0000 (17:37 +0200)]
drm/nouveau/pm/gk104: add compute signals/sources
These signals and sources have been reverse engineered from CUPTI
(Linux). Graphics signals exposed by PerfKit (Windows only) will be
added later. I need to reverse engineer them and it's a bit painful.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Wei Ni [Tue, 16 Jun 2015 09:35:12 +0000 (17:35 +0800)]
drm/nouveau/drm/nouveau/clk: fix tstate to pstate calculation
According to the tstate calculation in nvkm_clk_tstate(),
the range of tstate is from -(clk->state_nr - 1) to 0,
it mean the tstate is negative value. But in nvkm_pstate_work(),
it use (clk->state_nr - 1 - clk->tstate) to limit pstate,
it's not correct.
This patch fix it to use (clk->state_nr - 1 + clk->tstate) to
limit pstate.
Signed-off-by: Wei Ni <wni@nvidia.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Samuel Pitoiset [Sun, 14 Jun 2015 11:33:55 +0000 (13:33 +0200)]
drm/nouveau/pm/gf100: add compute signals/sources
These signals and sources have been reverse engineered from CUPTI
(Linux). Graphics signals exposed by PerfKit (Windows only) will be
added later. I need to reverse engineer them and it's a bit painful.
This commit also adds a new class for GF108 and GF117.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Samuel Pitoiset [Sun, 7 Jun 2015 20:40:28 +0000 (22:40 +0200)]
drm/nouveau/pm/nv50: add compute and graphics signals/sources
These signals and sources have been reverse engineered from NVIDIA
PerfKit (Windows) and CUPTI (Linux), they will be used to build complex
hardware events from the userspace.
This commit also adds a new class for GT200.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Samuel Pitoiset [Sun, 7 Jun 2015 20:40:26 +0000 (22:40 +0200)]
drm/nouveau/pm: allow to configure domains instead of simple counters
Configuring counters from the userspace require the kernel to handle some
logic related to performance counters. Basically, it has to find a free
slot to assign a counter, to handle extra counting modes like B4/B6 and it
must return and error when it can't configure a counter.
In my opinion, the kernel should not handle all of that logic but it
should only write the configuration sent by the userspace without
checking anything. In other words, it should overwrite the configuration
even if it's already counting and do not return any errors.
This patch allows the userspace to configure a domain instead of
separate counters. This has the advantage to move all of the logic to
the userspace.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Samuel Pitoiset [Sun, 7 Jun 2015 20:40:25 +0000 (22:40 +0200)]
drm/nouveau/pm: allow the userspace to schedule hardware counters
This adds a new method NVIF_PERFCTR_V0_INIT which starts a batch of
hardware counters for sampling. This will allow the userspace to start
a monitoring session using the INIT method and to stop it with SAMPLE,
for example before and after a frame is rendered.
This commit temporarily breaks nv_perfmon but this is going to be fixed
with the upcoming patch.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Samuel Pitoiset [Sun, 7 Jun 2015 20:40:22 +0000 (22:40 +0200)]
drm/nouveau/pm: add concept of sources
A source (or multiplexer) is a tuple addr+mask+shift which allows to
control a block of signals. The maximum number of sources that a signal
can define is arbitrary limited to 8 and this should be large enough.
This patch allows to define multi-level of sources for a signal.
Each different sources are stored to a global list and will be exposed
to the userspace through the nvif interface in order to avoid conflicts.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset at gmail.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Samuel Pitoiset [Sun, 7 Jun 2015 20:40:21 +0000 (22:40 +0200)]
drm/nouveau/pm: allow to monitor hardware signal index 0x00
This signal index must be always allowed even if it's not clearly
defined in a domain in order to monitor a counter like 0x03020100
because it's the default value of signals.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Samuel Pitoiset [Sun, 7 Jun 2015 20:40:15 +0000 (22:40 +0200)]
drm/nouveau/pm: reorganize the nvif interface
This commit introduces the NVIF_IOCTL_NEW_V0_PERFMON class which will be
used in order to query domains, signals and sources. This separates the
querying and the counting interface.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Samuel Pitoiset [Sun, 7 Jun 2015 20:40:14 +0000 (22:40 +0200)]
drm/nouveau/pm: remove unused nvkm_perfsig_wrap() function
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Martin Peres <martin.peres@free.fr> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Samuel Pitoiset [Sun, 7 Jun 2015 20:40:13 +0000 (22:40 +0200)]
drm/nouveau/pm: remove pmu signals
PDAEMON signals don't have to be exposed by the perfmon engine.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Martin Peres <martin.peres@free.fr> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>