Jiri Olsa [Wed, 7 Mar 2018 15:50:08 +0000 (16:50 +0100)]
perf tools: Add MEM_TOPOLOGY feature to perf data file
Adding MEM_TOPOLOGY feature to perf data file,
that will carry physical memory map and its
node assignments.
The format of data in MEM_TOPOLOGY is as follows:
0 - version | for future changes
8 - block_size_bytes | /sys/devices/system/memory/block_size_bytes
16 - count | number of nodes
For each node we store map of physical indexes for
each node:
32 - node id | node index
40 - size | size of bitmap
48 - bitmap | bitmap of memory indexes that belongs to node
| /sys/devices/system/node/node<NODE>/memory<INDEX>
The MEM_TOPOLOGY could be displayed with following
report command:
Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180307155020.32613-8-jolsa@kernel.org
[ Rename 'index' to 'idx', as this breaks the build in rhel5, 6 and other systems where this is used by glibc headers ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Wed, 7 Mar 2018 15:50:07 +0000 (16:50 +0100)]
perf c2c: Use mem_info refcnt logic
Switch to refcnt logic instead of duplicating mem_info objects. No
functional change, just saving some memory.
Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180307155020.32613-7-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Wed, 7 Mar 2018 15:50:06 +0000 (16:50 +0100)]
perf tools: Add refcnt into struct mem_info
It's passed along several hists entries in --hierarchy mode, so it's
better we keep track of it.
The current fail I see is that it gets removed in hierarchy --mem-mode
mode, where it's shared in the different hierarchies, but removed from
the template hist entry, so the report crashes.
Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180307155020.32613-6-jolsa@kernel.org
[ Rename mem_info__aloc() to mem_info__new(), to fix the typo and use the convention for constructors ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Wed, 7 Mar 2018 15:50:05 +0000 (16:50 +0100)]
perf record: Remove progname from struct record
It's no longer used.
Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180307155020.32613-5-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Wed, 7 Mar 2018 15:50:04 +0000 (16:50 +0100)]
perf record: Move machine variable down the function
It's used far more down to be declared on the top of the __cmd_record.
Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180307155020.32613-4-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Wed, 7 Mar 2018 15:50:03 +0000 (16:50 +0100)]
perf report: Display perf.data header info
Display more header info from perf.data file, following values:
$ perf report -i perf.data --header-only
...
# header version : 1
# data offset : 424
# data size : 3364280
# feat offset : 3364704
It's handy for debuging.
Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180307155020.32613-3-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Wed, 7 Mar 2018 15:50:02 +0000 (16:50 +0100)]
perf report: Fix the output for stdio events list
Changing the output header for reporting forced groups via --groups
option on non grouped events, like:
$ perf record -e 'cycles,instructions'
$ perf report --stdio --group
Before:
# Samples: 24 of event 'anon group { cycles:u, instructions:u }'
After:
# Samples: 24 of events 'cycles:u, instructions:u'
Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Fixes: ad52b8cb4886 ("perf report: Add support to display group output for non group events") Link: http://lkml.kernel.org/r/20180307155020.32613-2-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Thomas Richter [Wed, 7 Mar 2018 13:43:25 +0000 (14:43 +0100)]
perf annotate: Fix s390 target function disassembly
'perf annotate' displays function call assembler instructions with a
right arrow. Hitting enter on this line/instruction causes the browser
to disassemble this target function and show it on the screen. On s390
this results in an error message 'The called function was not found.'
The function call assembly line parsing does not handle the s390 bras
and brasl instructions. Function call__parse expects the target as first
operand:
callq e9140 <__fxstat>
S390 has a register number as first operand:
brasl %r14,41d60 <abort>
Therefore the target addresses on s390 are always zero which is an
invalid address.
Introduce a s390 specific call parsing function which skips the first
operand on s390.
Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Link: http://lkml.kernel.org/r/20180307134325.96106-1-tmricht@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adrian Hunter [Wed, 7 Mar 2018 14:02:28 +0000 (16:02 +0200)]
perf intel-pt: Remove a check for sampling mode
Intel PT code already has some preparation for AUX area sampling mode.
However the implementation has changed from the first proposal and one
of the side-effects is that it will not be impossible to support snapshot
mode and sampling mode at the same time.
Although there are no plans to support it, let validation (not yet
implemented) control whether it is allowed rather than low-level
functions.
Adrian Hunter [Wed, 7 Mar 2018 14:02:23 +0000 (16:02 +0200)]
perf intel-pt: Fix error recovery from missing TIP packet
When a TIP packet is expected but there is a different packet, it is an
error. However the unexpected packet might be something important like a
TSC packet, so after the error, it is necessary to continue from there,
rather than the next packet. That is achieved by setting pkt_step to
zero.
Adrian Hunter [Wed, 7 Mar 2018 14:02:22 +0000 (16:02 +0200)]
perf intel-pt: Fix sync_switch
sync_switch is a facility to synchronize decoding more closely with the
point in the kernel when the context actually switched.
The flag when sync_switch is enabled was global to the decoding, whereas
it is really specific to the CPU.
The trace data for different CPUs is put on different queues, so add
sync_switch to the intel_pt_queue structure and use that in preference
to the global setting in the intel_pt structure.
That fixes problems decoding one CPU's trace because sync_switch was
disabled on a different CPU's queue.
Adrian Hunter [Wed, 7 Mar 2018 14:02:21 +0000 (16:02 +0200)]
perf intel-pt: Fix overlap detection to identify consecutive buffers correctly
Overlap detection was not not updating the buffer's 'consecutive' flag.
Marking buffers consecutive has the advantage that decoding begins from
the start of the buffer instead of the first PSB. Fix overlap detection
to identify consecutive buffers correctly.
Kan Liang [Tue, 6 Mar 2018 15:36:07 +0000 (10:36 -0500)]
perf mmap: Simplify perf_mmap__read_init()
It isn't necessary to pass the 'start', 'end' and 'overwrite' arguments
to perf_mmap__read_init(). The data is stored in the struct perf_mmap.
Discard the parameters.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1520350567-80082-8-git-send-email-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Kan Liang [Tue, 6 Mar 2018 15:36:06 +0000 (10:36 -0500)]
perf mmap: Simplify perf_mmap__read_event()
It isn't necessary to pass the 'overwrite', 'start' and 'end' argument
to perf_mmap__read_event(). Discard them.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1520350567-80082-7-git-send-email-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Kan Liang [Tue, 6 Mar 2018 15:36:05 +0000 (10:36 -0500)]
perf mmap: Simplify perf_mmap__consume()
It isn't necessary to pass the 'overwrite' argument to
perf_mmap__consume(). Discard it.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1520350567-80082-6-git-send-email-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Kan Liang [Tue, 6 Mar 2018 15:36:04 +0000 (10:36 -0500)]
perf mmap: Use stored 'overwrite' in perf_mmap__consume()
The 'overwrite' is set at allocation. It will not be changed. Using it
to replace the parameter of perf_mmap__consume(). The parameters will
be discarded later.
No functional change.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1520350567-80082-5-git-send-email-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Kan Liang [Tue, 6 Mar 2018 15:36:03 +0000 (10:36 -0500)]
perf mmap: Use the stored data in perf_mmap__read_event()
Using the 'start', 'end' and 'overwrite' which are stored in
struct perf_mmap to replace the parameters of perf_mmap__read_event().
The parameters will be discarded later.
No functional change.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1520350567-80082-4-git-send-email-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Kan Liang [Tue, 6 Mar 2018 15:36:02 +0000 (10:36 -0500)]
perf mmap: Use the stored scope data in perf_mmap__push()
Using the 'start' and 'end' which are stored in struct perf_mmap to
replace the temporary 'start' and 'end'.
The temporary variables will be discarded later.
It doesn't need to pass 'overwrite' to perf_mmap__push(). It's stored in
struct perf_mmap.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1520350567-80082-3-git-send-email-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Kan Liang [Tue, 6 Mar 2018 15:36:01 +0000 (10:36 -0500)]
perf mmap: Store mmap scope in struct perf_mmap()
There is too much boilerplate in the perf_mmap__read*() interfaces.
The 'start' and 'end' variables should be stored in struct perf_mmap at
initialization. They will be used later.
The old 'startp' and 'endp' pointers are used by perf_mmap__read_event()
now. They cannot be removed. So the old 'startp/endp' and new
'md->start/md->end' will exist simultaneously now. The old one will be
removed later.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1520350567-80082-2-git-send-email-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Kan Liang [Tue, 6 Mar 2018 15:36:00 +0000 (10:36 -0500)]
perf evlist: Store 'overwrite' in struct perf_mmap
It has been determined that the map is for overwrite mode
(evlist->overwrite_mmap) or non-overwrite mode (evlist->mmap) when
calling perf_evlist__alloc_mmap().
Store the information in struct perf_mmap, which will be used later to
simplify the perf_mmap__read*() interfaces.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Suggested-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1520350567-80082-1-git-send-email-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf pmu: Auto-merge PMU events created by prefix or glob match
Auto-merge for these events was disabled when auto-merging of non-alias
events was disabled in commit 63ce844 (perf stat: Only auto-merge events
that are PMU aliases).
Non-merging of legacy events is preserved:
$ perf stat -ag -e cache-misses,cache-misses sleep 1
perf pmu: Display pmu name when printing unmerged events in stat
To simplify creation of events accross multiple instances of the same
type of PMU stat supports two methods for creating multiple events from
a single event specification:
1. A prefix or glob can be used in the PMU name.
2. Aliases, which are listed immediately after the Kernel PMU events
by perf list, are used.
When the --no-merge option is passed and these events are displayed
individually the PMU name is lost and it's not possible to see which
count corresponds to which pmu:
$ perf stat -a -e l3cache/read-miss/ --no-merge ls > /dev/null
perf pmu: Support wildcards on pmu name in dynamic pmu events
Starting on v4.12 event parsing code for dynamic pmu events already
supports prefix-based matching of multiple pmus when creating dynamic
events. E.g., in a system with the following dynamic pmus:
mypmu_0
mypmu_1
mypmu_2
mypmu_4
passing mypmu/<config>/ as an event spec will result in the creation of
the event in all of the pmus. This change expands this matching through
the use of fnmatch so glob-like expressions can be used to create events
in multiple pmus. E.g., in the system described above if a user only
wants to create the event in mypmu_0 and mypmu_1, mypmu_[01]/<config>/
can be passed.
Signed-off-by: Agustin Vega-Frias <agustinv@codeaurora.org> Acked-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Timur Tabi <timur@codeaurora.org>
Change-Id: Icb25653fc5d5239c20f3bffdfdf4ab4c9c9bb20b Link: http://lkml.kernel.org/r/1520454947-16977-1-git-send-email-agustinv@codeaurora.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Takashi Iwai [Wed, 7 Mar 2018 10:54:41 +0000 (11:54 +0100)]
perf tools: Correct title markers for asciidoctor
I've tested to process the perf man pages with asciidoctor that is
picker than asciidoc, and it revealed minor syntax errors in some
documents. Namely, the title markers aren't aligned with the previous
line, hence asciidoctor didn't recognize as titles.
This patch corrects these markers to be processed properly.
Adrian Hunter [Tue, 6 Mar 2018 09:13:16 +0000 (11:13 +0200)]
perf auxtrace: Make auxtrace_queues__add_buffer() return buffer_ptr
In preparation for supporting AUX area sampling buffers,
auxtrace_queues__add_buffer() needs to be more generic. To that end, make
it return buffer_ptr instead of the caller.
One can set a cgroup as a default cgroup to be used by all events or
set cgroups with the 'perf stat' and 'perf record' behaviour, i.e.
'-G A' will be the cgroup for events defined so far in the command line.
Here in my main machine, with a kvm instance running a rhel6 guinea pig
I have:
# ls -la /sys/fs/cgroup/perf_event/ | grep drw
drwxr-xr-x. 14 root root 360 Mar 6 12:04 ..
drwxr-xr-x. 3 root root 0 Mar 6 15:05 machine.slice
#
So I can go ahead and use that cgroup hierarchy, say lets see what
syscalls are being emitted by threads in that 'machine.slice' hierarchy
that are taking more than 100ms:
See the documentation about how to set more than one cgroup for
different events in the same command line.
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Richter <tmricht@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-t126jh4occqvu0xdqlcjygex@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The usual thing is for a constructor to allocate space for its members,
not to require that the caller pass a pre-allocated 'name' and then, at
its destructor, to free something not allocated by it.
Fix it by making cgroup__new() to receive a const char pointer, then
allocate cgroup->name that then can continue to be freed at
cgroup__delete(), balancing the alloc/free operations inside the cgroup
struct methods.
This eases calling evlist__findnew_cgroup() from the custom 'perf trace'
cgroup parser, that will only call parse_cgroups() when the '-G cgroup'
is passed on the command line after '-e event' entries, when it'll
behave just like 'perf stat' and 'perf record', i.e. the previous
parse_cgroup() users that mandate that -G only can come after a -e.
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-4leugnuyqi10t98990o3xi1t@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
So that tools like 'perf trace' can allow the user to set a cgroup
to be used for all the evsels still without a crgroup setup by
parse_cgroups(), such as the one to use for the syscalls, vfs_getname
and other events involved in strace like syscall tracing.
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-zf9jjsbj661r3lk6qb7g8j70@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Similar to machine__findnew_thread(), etc, i.e. try to find, get a
refcount if found and return it, otherwise return a new cgroup object.
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-im1omevlihhyneiic4nl3g24@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Changbin Du <changbin.du@intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1520307457-23668-3-git-send-email-changbin.du@intel.com
[ Optimally pack struct thread_runtime when adding the new bool member ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf cgroup: Introduce cgroup__new() out of open coded equivalent
To follow the namespacing convention in tools/perf.
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-jaalyl6bkvvji4r5u8wqw4n4@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-5yqshcf5hm837n7c86u7lhjf@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The refcount operation counterpart to cgroup__put(), use it when reusing
a cgroup.
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-14ynvrl7y2cz8gyuy5q5v41g@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf cgroup: Rename close_cgroup() to cgroup__put()
It is not really closing the cgroup, but instead dropping a reference
count and if it hits zero, then calling delete, which will, among other
cleanup shores, close the cgroup fd.
So it is really dropping a reference to that cgroup, and the method name
for that is "put", so rename close_cgroup() to cgroup__put() to follow
this naming convention.
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-sccxpnd7bgwc1llgokt6fcey@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Just to make this code look more like other places in tools/perf.
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-j3j72vvn2d5j7tenlghdy195@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf cgroup: Rename 'struct cgroup_sel' to 'struct cgroup'
That name isn't used, is shorter, lets switch to it.
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-e51yphwgvepd1y4f5fjptmjq@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ingo Molnar [Wed, 7 Mar 2018 08:18:32 +0000 (09:18 +0100)]
Merge tag 'perf-urgent-for-mingo-4.16-20180306' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
Pull perf/urgent fixes from Arnaldo Carvalho de Melo:
- Be more robust when drawing arrows in the annotation TUI, avoiding a
segfault when jump instructions have as a target addresses in functions
other that the one currently being annotated. The full fix will come in
the following days, when jumping to other functions will work as call
instructions (Arnaldo Carvalho de Melo)
- Prevent auxtrace_queues__process_index() from queuing AUX area data for
decoding when the --no-itrace option has been used (Adrian Hunter)
- Sync copy of kvm UAPI headers and x86's cpufeatures.h (Arnaldo Carvalho de Melo)
- Fix 'perf stat' CSV output format for non-supported counters (Ilya Pronin)
Adrian Hunter [Wed, 28 Feb 2018 08:39:04 +0000 (10:39 +0200)]
perf tools: Fix trigger class trigger_on()
trigger_on() means that the trigger is available but not ready, however
trigger_on() was making it ready. That can segfault if the signal comes
before trigger_ready(). e.g. (USR2 signal delivery not shown)
Ingo Molnar [Tue, 6 Mar 2018 06:34:04 +0000 (07:34 +0100)]
Merge tag 'perf-core-for-mingo-4.17-20180305' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
- Be more robust when drawing arrows in the annotation TUI, avoiding a
segfault when jump instructions have as a target addresses in functions
other that the one currently being annotated. The full fix will come in
the following days, when jumping to other functions will work as call
instructions (Arnaldo Carvalho de Melo)
- Allow asking for the maximum allowed sample rate in 'top' and
'record', i.e. 'perf record -F max' will read the
kernel.perf_event_max_sample_rate sysctl and use it (Arnaldo Carvalho de Melo)
- When the user specifies a freq above kernel.perf_event_max_sample_rate,
Throttle it down to that max freq, and warn the user about it, add as
well --strict-freq so that the previous behaviour of not starting the
session when the desired freq can't be used can be selected (Arnaldo Carvalho de Melo)
- Find 'call' instruction target symbol at parsing time, used so far in
the TUI, part of the infrastructure changes that will end up allowing
for jumps to navigate to other functions, just like 'call'
instructions. (Arnaldo Carvalho de Melo)
- Use xyarray dimensions to iterate fds in 'perf stat' (Andi Kleen)
- Ignore threads for which the current user hasn't permissions when
enabling system-wide --per-thread (Jin Yao)
- Fix some backtrace perf test cases to use 'perf record' + 'perf script'
instead, till 'perf trace' starts using ordered_events or equivalent
to avoid symbol resolving artifacts due to reordering of
PERF_RECORD_MMAP events (Jiri Olsa)
- Fix crash in 'perf record' pipe mode, it needs to allocate the ID
array even for a single event, unlike non-pipe mode (Jiri Olsa)
- Make annoying fallback message on older kernels with newer 'perf top'
binaries trying to use overwrite mode and that not being present
in the older kernels (Kan Liang)
- Switch last users of old APIs to the newer perf_mmap__read_event()
one, then discard those old mmap read forward APIs (Kan Liang)
- Fix the usage on the 'perf kallsyms' man page (Sangwon Hong)
- Simplify cgroup arguments when tracking multiple events (weiping zhang)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
The changes in dd84441a7971 ("x86/speculation: Use IBRS if available
before calling into firmware") don't need any kind of special treatment
in the current tools/perf/ codebase, so just update the copy to get rid
of the perf build warning:
BUILD: Doing 'make -j4' parallel build
Warning: Kernel ABI header at 'tools/arch/x86/include/asm/cpufeatures.h' differs from latest version at 'arch/x86/include/asm/cpufeatures.h'
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-mzmuxocrf96v922xkerey3ns@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In 801e459a6f3a ("KVM: x86: Add a framework for supporting MSR-based
features") a new ioctl was introduced, which with this sync of the kvm
UAPI headers, makes 'perf trace' know about it:
This also silences the perf build header copy drift verifier:
make: Entering directory '/home/acme/git/perf/tools/perf'
BUILD: Doing 'make -j4' parallel build
Warning: Kernel ABI header at 'tools/include/uapi/linux/kvm.h' differs from latest version at 'include/uapi/linux/kvm.h'
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-h31oz5g0mt1dh2s2ajq6o6no@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Fri, 2 Mar 2018 16:13:54 +0000 (17:13 +0100)]
perf record: Fix crash in pipe mode
Currently we can crash perf record when running in pipe mode, like:
$ perf record ls | perf report
# To display the perf.data header info, please use --header/--header-only options.
#
perf: Segmentation fault
Error:
The - file has no samples!
The callstack of the crash is:
0x0000000000515242 in perf_event__synthesize_event_update_name
3513 ev = event_update_event__new(len + 1, PERF_EVENT_UPDATE__NAME, evsel->id[0]);
(gdb) bt
#0 0x0000000000515242 in perf_event__synthesize_event_update_name
#1 0x00000000005158a4 in perf_event__synthesize_extra_attr
#2 0x0000000000443347 in record__synthesize
#3 0x00000000004438e3 in __cmd_record
#4 0x000000000044514e in cmd_record
#5 0x00000000004cbc95 in run_builtin
#6 0x00000000004cbf02 in handle_internal_command
#7 0x00000000004cc054 in run_argv
#8 0x00000000004cc422 in main
The reason of the crash is that the evsel does not have ids array
allocated and the pipe's synthesize code tries to access it.
We don't force evsel ids allocation when we have single event, because
it's not needed. However we need it when we are in pipe mode even for
single event as a key for evsel update event.
Fixing this by forcing evsel ids allocation event for single event, when
we are in pipe mode.
Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180302161354.30192-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
I.e. jumps to a label inside that function (_cpp_lex_token), and those
works, but also this kind:
│1159e8b: ↓ jne c469be <cpp_named_operator2name@@Base+0xa72>
I.e. jumps to another function, outside _cpp_lex_token, which are not
being correctly handled generating as a side effect references to
ab->offset[] entries that are set to NULL, so to make this code more
robust, check that here.
A proper fix for will be put in place, looking at the function name
right after the '<' token and probably treating this like a 'call'
instruction.
For now just don't draw the arrow.
Reported-by: Ingo Molnar <mingo@kernel.org> Tested-by: Ingo Molnar <mingo@kernel.org> Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Kan Liang <kan.liang@intel.com> Link: https://lkml.kernel.org/n/tip-5tzvb875ep2sel03aeefgmud@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Kan Liang [Mon, 26 Feb 2018 18:17:10 +0000 (10:17 -0800)]
perf top: Fix annoying fallback message on older kernels
On older (e.g. v4.4) kernels, an annoying fallback message can be
observed in 'perf top':
┌─Warning:──────────────────────┐
│fall back to non-overwrite mode│
│ │
│ │
│Press any key... │
└───────────────────────────────┘
The 'perf top' utility has been changed to overwrite mode since commit ebebbf082357 ("perf top: Switch default mode to overwrite mode").
For older kernels which don't have overwrite mode support, 'perf top'
will fall back to non-overwrite mode and print out the fallback message
using ui__warning(), which needs user's input to close.
The fallback message is not critical for end users. Turning it to debug
message which is printed when running with -vv.
Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Kan Liang <kan.liang@intel.com> Cc: Kan Liang <kan.liang@intel.com> Fixes: ebebbf082357 ("perf top: Switch default mode to overwrite mode") Link: http://lkml.kernel.org/r/1519669030-176549-1-git-send-email-kan.liang@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Sangwon Hong [Sun, 11 Feb 2018 19:37:44 +0000 (04:37 +0900)]
perf kallsyms: Fix the usage on the man page
First, all man pages highlight only perf and subcommands except 'perf
kallsyms', which includes the full usage. Fix it for commands to
monopolize underlines.
Second, options can be ommited when executing 'perf kallsyms', so add
square brackets between <option>.
Signed-off-by: Sangwon Hong <qpakzk@gmail.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Taeung Song <treeze.taeung@gmail.com> Link: http://lkml.kernel.org/r/1518377864-20353-1-git-send-email-qpakzk@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Kan Liang [Thu, 1 Mar 2018 23:09:11 +0000 (18:09 -0500)]
perf mmap: Discard legacy interfaces for mmap read forward
Discards legacy interfaces perf_evlist__mmap_read_forward(),
perf_evlist__mmap_read() and perf_evlist__mmap_consume().
No tools use them.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1519945751-37786-14-git-send-email-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Kan Liang [Thu, 1 Mar 2018 23:09:10 +0000 (18:09 -0500)]
perf test: Switch to new perf_mmap__read_event() interface for task-exit
The perf test 'task-exit' still use the legacy interface.
No functional change.
Committer notes:
Testing it:
# perf test exit
21: Number of exit events of a simple workload : Ok
#
Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1519945751-37786-13-git-send-email-kan.liang@linux.intel.com
[ Changed bool parameters from 0 to 'false', as per Jiri comment ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Kan Liang [Thu, 1 Mar 2018 23:09:09 +0000 (18:09 -0500)]
perf test: Switch to new perf_mmap__read_event() interface for switch-tracking
The perf test 'switch-tracking' still use the legacy interface.
No functional change.
Committer testing:
# perf test switch
32: Track with sched_switch : Ok
#
Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1519945751-37786-12-git-send-email-kan.liang@linux.intel.com
[ Changed bool parameters from 0 to 'false', as per Jiri comment ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Kan Liang [Thu, 1 Mar 2018 23:09:08 +0000 (18:09 -0500)]
perf test: Switch to new perf_mmap__read_event() interface for sw-clock
The perf test 'sw-clock' still use the legacy interface.
No functional change.
Committer testing:
# perf test clock
22: Software clock events period values : Ok
#
Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1519945751-37786-11-git-send-email-kan.liang@linux.intel.com
[ Changed bool parameters from 0 to 'false', as per Jiri comment ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Kan Liang [Thu, 1 Mar 2018 23:09:07 +0000 (18:09 -0500)]
perf test: Switch to new perf_mmap__read_event() interface for time-to-tsc
The perf test 'time-to-tsc' still use the legacy interface.
No functional change.
Commiter notes:
Testing it:
# perf test tsc
57: Convert perf time to TSC : Ok
#
Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1519945751-37786-10-git-send-email-kan.liang@linux.intel.com
[ Changed bool parameters from 0 to 'false', as per Jiri comment ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Kan Liang [Thu, 1 Mar 2018 23:09:06 +0000 (18:09 -0500)]
perf test: Switch to new perf_mmap__read_event() interface for perf-record
The perf test 'perf-record' still use the legacy interface.
No functional change.
Committer notes:
Testing it:
# perf test PERF_RECORD
8: PERF_RECORD_* events & perf_sample fields : Ok
#
Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1519945751-37786-9-git-send-email-kan.liang@linux.intel.com
[ Changed bool parameters from 0 to 'false', as per Jiri comment ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Kan Liang [Thu, 1 Mar 2018 23:09:05 +0000 (18:09 -0500)]
perf test: Switch to new perf_mmap__read_event() interface for tp fields
The perf test 'syscalls:sys_enter_openat event fields' still use the
legacy interface.
No functional change.
Committer notes:
Testing it:
# perf test sys_enter_openat
15: syscalls:sys_enter_openat event fields : Ok
#
Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1519945751-37786-8-git-send-email-kan.liang@linux.intel.com
[ Changed bool parameters from 0 to 'false', as per Jiri comment ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Kan Liang [Thu, 1 Mar 2018 23:09:04 +0000 (18:09 -0500)]
perf test: Switch to new perf_mmap__read_event() interface for mmap-basic
The perf test 'mmap-basic' still use the legacy interface.
No functional change.
Committer notes:
Testing it:
# perf test "mmap interface"
4: Read samples using the mmap interface : Ok
#
Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1519945751-37786-7-git-send-email-kan.liang@linux.intel.com
[ Changed bool parameters from 0 to 'false', as per Jiri comment ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Kan Liang [Thu, 1 Mar 2018 23:09:03 +0000 (18:09 -0500)]
perf test: Switch to new perf_mmap__read_event() interface for "keep tracking" test
The perf test 'keep tracking' still use the legacy interface.
No functional change.
Committer testing:
# perf test tracking
25: Use a dummy software event to keep tracking : Ok
#
Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1519945751-37786-6-git-send-email-kan.liang@linux.intel.com
[ Changed bool parameters from 0 to 'false', as per Jiri comment ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Kan Liang [Thu, 1 Mar 2018 23:09:02 +0000 (18:09 -0500)]
perf test: Switch to new perf_mmap__read_event() interface for 'code reading' test
The perf test 'object code reading' still use the legacy interface.
No functional change.
Committer notes:
Testing:
# perf test reading
23: Object code reading: Ok
#
Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1519945751-37786-5-git-send-email-kan.liang@linux.intel.com
[ Changed bool parameters from 0 to 'false', as per Jiri comment ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Kan Liang [Thu, 1 Mar 2018 23:09:01 +0000 (18:09 -0500)]
perf test: Switch to new perf_mmap__read_event() interface for bpf
The perf test 'bpf' still use the legacy interface.
No functional change.
Committer notes:
Tested with:
# perf test bpf
39: BPF filter :
39.1: Basic BPF filtering : Ok
39.2: BPF pinning : Ok
39.3: BPF prologue generation : Ok
39.4: BPF relocation checker : Ok
#
Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1519945751-37786-4-git-send-email-kan.liang@linux.intel.com
[ Changed bool parameters from 0 to 'false', as per Jiri comment ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1519945751-37786-3-git-send-email-kan.liang@linux.intel.com
[ Changed bool parameters from 0 to 'false', as per Jiri comment ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Kan Liang [Thu, 1 Mar 2018 23:08:59 +0000 (18:08 -0500)]
perf trace: Switch to new perf_mmap__read_event() interface
The 'perf trace' utility still use the legacy interface.
Switch to the new perf_mmap__read_event() interface.
No functional change.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1519945751-37786-2-git-send-email-kan.liang@linux.intel.com
[ Changed bool parameters from 0 to 'false', as per Jiri comment ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Kan Liang [Thu, 1 Mar 2018 23:08:58 +0000 (18:08 -0500)]
perf kvm: Switch to new perf_mmap__read_event() interface
The perf kvm still use the legacy interface.
Switch to the new perf_mmap__read_event() interface for perf kvm.
No functional change.
Committer notes:
Tested before and after running:
# perf kvm stat record
On a machine with a kvm guest, then used:
# perf kvm stat report
Before/after results match and look like:
# perf kvm stat record -a sleep 5
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 2.132 MB perf.data.guest (1828 samples) ]
# perf kvm stat report
Analyze events for all VMs, all VCPUs:
VM-EXIT Samples Samples% Time% Min Time Max Time Avg time
Total Samples:644, Total events handled time:4802790.72us.
#
Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1519945751-37786-1-git-send-email-kan.liang@linux.intel.com
[ Changed bool parameters from 0 to 'false', as per Jiri comment ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Fri, 2 Mar 2018 16:13:54 +0000 (17:13 +0100)]
perf record: Fix crash in pipe mode
Currently we can crash perf record when running in pipe mode, like:
$ perf record ls | perf report
# To display the perf.data header info, please use --header/--header-only options.
#
perf: Segmentation fault
Error:
The - file has no samples!
The callstack of the crash is:
0x0000000000515242 in perf_event__synthesize_event_update_name
3513 ev = event_update_event__new(len + 1, PERF_EVENT_UPDATE__NAME, evsel->id[0]);
(gdb) bt
#0 0x0000000000515242 in perf_event__synthesize_event_update_name
#1 0x00000000005158a4 in perf_event__synthesize_extra_attr
#2 0x0000000000443347 in record__synthesize
#3 0x00000000004438e3 in __cmd_record
#4 0x000000000044514e in cmd_record
#5 0x00000000004cbc95 in run_builtin
#6 0x00000000004cbf02 in handle_internal_command
#7 0x00000000004cc054 in run_argv
#8 0x00000000004cc422 in main
The reason of the crash is that the evsel does not have ids array
allocated and the pipe's synthesize code tries to access it.
We don't force evsel ids allocation when we have single event, because
it's not needed. However we need it when we are in pipe mode even for
single event as a key for evsel update event.
Fixing this by forcing evsel ids allocation event for single event, when
we are in pipe mode.
Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180302161354.30192-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf annotate: Find 'call' instruction target symbol at parsing time
So that we do it just once, not everytime we press enter or -> on a
'call' instruction line.
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-uysyojl1e6nm94amzzzs08tf@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf record: Throttle user defined frequencies to the maximum allowed
# perf record -F 200000 sleep 1
warning: Maximum frequency rate (15,000 Hz) exceeded, throttling from 200,000 Hz to 15,000 Hz.
The limit can be raised via /proc/sys/kernel/perf_event_max_sample_rate.
The kernel will lower it when perf's interrupts take too long.
Use --strict-freq to disable this throttling, refusing to record.
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.019 MB perf.data (15 samples) ]
# perf evlist -v
cycles:ppp: size: 112, { sample_period, sample_freq }: 15000, sample_type: IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
For those wanting that it fails if the desired frequency can't be used:
# perf record --strict-freq -F 200000 sleep 1
error: Maximum frequency rate (15,000 Hz) exceeded.
Please use -F freq option with a lower value or consider
tweaking /proc/sys/kernel/perf_event_max_sample_rate.
#
Suggested-by: Ingo Molnar <mingo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-oyebruc44nlja499nqkr1nzn@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf top: Allow asking for the maximum allowed sample rate
Add the handy '-F max' shortcut, just introduced to 'perf record', to
reading and using the kernel.perf_event_max_sample_rate value as the
user supplied sampling frequency:
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-hz04f296zccknnb5at06a6q0@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf top browser: Show sample_freq in browser title line
The '--stdio' 'perf top' UI shows it, so lets remove this UI difference
and show it too in '--tui', will be useful for 'perf top --tui -F max'.
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-n3wd8n395uo4y9irst29pjic@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf record: Allow asking for the maximum allowed sample rate
Add the handy '-F max' shortcut to reading and using the
kernel.perf_event_max_sample_rate value as the user supplied
sampling frequency:
# perf record -F max sleep 1
info: Using a maximum frequency rate of 15,000 Hz
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.019 MB perf.data (14 samples) ]
# sysctl kernel.perf_event_max_sample_rate
kernel.perf_event_max_sample_rate = 15000
# perf evlist -v
cycles:ppp: size: 112, { sample_period, sample_freq }: 15000, sample_type: IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
# perf record -F 10 sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.019 MB perf.data (4 samples) ]
# perf evlist -v
cycles:ppp: size: 112, { sample_period, sample_freq }: 10, sample_type: IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
#
Suggested-by: Ingo Molnar <mingo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-4y0tiuws62c64gp4cf0hme0m@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Thu, 1 Mar 2018 16:52:15 +0000 (17:52 +0100)]
perf tests: Rename trace+probe_libc_inet_pton to record+probe_libc_inet_pton
Because the test is no longer using perf trace but perf record instead.
Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180301165215.6780-2-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Thu, 1 Mar 2018 16:52:14 +0000 (17:52 +0100)]
perf tests: Switch trace+probe_libc_inet_pton to use record
There's a problem with relying on backtrace data from 'perf trace' the
way the trace+probe_libc_inet_pton does. This test inserts uprobe within
ping binary and checks that it gets its sample using 'perf trace'.
It also checks it gets proper backtrace from sample and that's where the
issue is.
The 'perf trace' does not sort events (by definition) so it can happen
that it processes the event sample before the ping binary memory map
event. This can (very rarely) happen as proved by this events dump
output (from custom added debug output):
In this case 'perf trace' fails to resolve the last callchain IP (within
the ping binary) because it does not know about the ping binary memory
map yet and the test fails like this:
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.037 ms
--- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.037/0.037/0.037/0.000 ms
0.000 probe_libc:inet_pton:(7f4e29c2ed60))
__GI___inet_pton (/usr/lib64/libc-2.17.so)
getaddrinfo (/usr/lib64/libc-2.17.so)
[0] ([unknown])
FAIL: expected backtrace entry 8 ".*\(.*/bin/ping.*\)$" got "[0] ([unknown])"
Switching the test to use 'perf record' and 'perf script' instead of
'perf trace'.
Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180301165215.6780-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
I.e. jumps to a label inside that function (_cpp_lex_token), and those
works, but also this kind:
│1159e8b: ↓ jne c469be <cpp_named_operator2name@@Base+0xa72>
I.e. jumps to another function, outside _cpp_lex_token, which are not
being correctly handled generating as a side effect references to
ab->offset[] entries that are set to NULL, so to make this code more
robust, check that here.
A proper fix for will be put in place, looking at the function name
right after the '<' token and probably treating this like a 'call'
instruction.
For now just don't draw the arrow.
Reported-by: Ingo Molnar <mingo@kernel.org> Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Kan Liang <kan.liang@intel.com> Link: https://lkml.kernel.org/n/tip-5tzvb875ep2sel03aeefgmud@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Linus Torvalds [Sun, 4 Mar 2018 20:12:48 +0000 (12:12 -0800)]
Merge branch 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
"A small set of fixes for x86:
- Add missing instruction suffixes to assembly code so it can be
compiled by newer GAS versions without warnings.
- Switch refcount WARN exceptions to UD2 as we did in general
- Make the reboot on Intel Edison platforms work
- A small documentation update so text and sample command match"
* 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
Documentation, x86, resctrl: Make text and sample command match
x86/platform/intel-mid: Handle Intel Edison reboot correctly
x86/asm: Add instruction suffixes to bitops
x86/entry/64: Add instruction suffix
x86/refcounts: Switch to UD2 for exceptions
Linus Torvalds [Sun, 4 Mar 2018 19:40:16 +0000 (11:40 -0800)]
Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/pti fixes from Thomas Gleixner:
"Three fixes related to melted spectrum:
- Sync the cpu_entry_area page table to initial_page_table on 32 bit.
Otherwise suspend/resume fails because resume uses
initial_page_table and triggers a triple fault when accessing the
cpu entry area.
- Zero the SPEC_CTL MRS on XEN before suspend to address a
shortcoming in the hypervisor.
- Fix another switch table detection issue in objtool"
* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/cpu_entry_area: Sync cpu_entry_area to initial_page_table
objtool: Fix another switch table detection issue
x86/xen: Zero MSR_IA32_SPEC_CTRL before suspend
Linus Torvalds [Sun, 4 Mar 2018 19:34:49 +0000 (11:34 -0800)]
Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fixes from Thomas Gleixner:
"A small set of fixes from the timer departement:
- Add a missing timer wheel clock forward when migrating timers off a
unplugged CPU to prevent operating on a stale clock base and
missing timer deadlines.
- Use the proper shift count to extract data from a register value to
prevent evaluating unrelated bits
- Make the error return check in the FSL timer driver work correctly.
Checking an unsigned variable for less than zero does not really
work well.
- Clarify the confusing comments in the ARC timer code"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
timers: Forward timer base before migrating timers
clocksource/drivers/arc_timer: Update some comments
clocksource/drivers/mips-gic-timer: Use correct shift count to extract data
clocksource/drivers/fsl_ftm_timer: Fix error return checking
Linus Torvalds [Sun, 4 Mar 2018 19:04:27 +0000 (11:04 -0800)]
Merge tag 'for-4.16-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
- when NR_CPUS is large, a SRCU structure can significantly inflate
size of the main filesystem structure that would not be possible to
allocate by kmalloc, so the kvalloc fallback is used
- improved error handling
- fix endiannes when printing some filesystem attributes via sysfs,
this is could happen when a filesystem is moved between different
endianity hosts
- send fixes: the NO_HOLE mode should not send a write operation for a
file hole
- fix log replay for for special files followed by file hardlinks
- fix log replay failure after unlink and link combination
- fix max chunk size calculation for DUP allocation
* tag 'for-4.16-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
Btrfs: fix log replay failure after unlink and link combination
Btrfs: fix log replay failure after linking special file and fsync
Btrfs: send, fix issuing write op when processing hole in no data mode
btrfs: use proper endianness accessors for super_copy
btrfs: alloc_chunk: fix DUP stripe size handling
btrfs: Handle btrfs_set_extent_delalloc failure in relocate_file_extent_cluster
btrfs: handle failure of add_pending_csums
btrfs: use kvzalloc to allocate btrfs_fs_info
Linus Torvalds [Sat, 3 Mar 2018 22:32:00 +0000 (14:32 -0800)]
Merge branch 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm fixes from Dan Williams:
"A 4.16 regression fix, three fixes for -stable, and a cleanup fix:
- During the merge window support for the new ACPI NVDIMM Platform
Capabilities structure disabled support for "deep flush", a
force-unit- access like mechanism for persistent memory. Restore
that mechanism.
- VFIO like RDMA is yet one more memory registration / pinning
interface that is incompatible with Filesystem-DAX. Disable long
term pins of Filesystem-DAX mappings via VFIO.
- The Filesystem-DAX detection to prevent long terms pins mistakenly
also disabled Device-DAX pins which are not subject to the same
block- map collision concerns.
- Similar to the setup path, softlockup warnings can trigger in the
shutdown path for large persistent memory namespaces. Teach
for_each_device_pfn() to perform cond_resched() in all cases.
- Boaz noticed that the might_sleep() in dax_direct_access() is stale
as of the v4.15 kernel.
These have received a build success notification from the 0day robot,
and the longterm pin fixes have appeared in -next. However, I recently
rebased the tree to remove some other fixes that need to be reworked
after review feedback.
* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
memremap: fix softlockup reports at teardown
libnvdimm: re-enable deep flush for pmem devices via fsync()
vfio: disable filesystem-dax page pinning
dax: fix vma_is_fsdax() helper
dax: ->direct_access does not sleep anymore
Linus Torvalds [Sat, 3 Mar 2018 18:37:01 +0000 (10:37 -0800)]
Merge tag 'kbuild-fixes-v4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes from Masahiro Yamada:
- suppress sparse warnings about unknown attributes
- fix typos and stale comments
- fix build error of arch/sh
- fix wrong use of ld-option vs cc-ldoption
- remove redundant GCC_PLUGINS_CFLAGS assignment
- fix another memory leak of Kconfig
- fix line number in error messages of Kconfig
- do not write confusing CONFIG_DEFCONFIG_LIST out to .config
- add xstrdup() to Kconfig to handle memory shortage errors
- show also a Debian package name if ncurses is missing
* tag 'kbuild-fixes-v4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
MAINTAINERS: take over Kconfig maintainership
kconfig: fix line number in recursive inclusion error message
Coccinelle: memdup: Fix typo in warning messages
kconfig: Update ncurses package names for menuconfig
kbuild/kallsyms: trivial typo fix
kbuild: test --build-id linker flag by ld-option instead of cc-ldoption
kbuild: drop superfluous GCC_PLUGINS_CFLAGS assignment
kconfig: Don't leak choice names during parsing
sh: fix build error for empty CONFIG_BUILTIN_DTB_SOURCE
kconfig: set SYMBOL_AUTO to the symbol marked with defconfig_list
kconfig: add xstrdup() helper
kbuild: disable sparse warnings about unknown attributes
Makefile: Fix lying comment re. silentoldconfig
Linus Torvalds [Sat, 3 Mar 2018 18:27:14 +0000 (10:27 -0800)]
Merge tag 'media/v4.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull media fixes from Mauro Carvalho Chehab:
- some build fixes with randconfigs
- an m88ds3103 fix to prevent an OOPS if the chip doesn't provide the
right version during probe (with can happen if the hardware hangs)
- a potential out of array bounds reference in tvp5150
- some fixes and improvements in the DVB memory mapped API (added for
kernel 4.16)
* tag 'media/v4.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
media: vb2: Makefile: place vb2-trace together with vb2-core
media: Don't let tvp5150_get_vbi() go out of vbi_ram_default array
media: dvb: update buffer mmaped flags and frame counter
media: dvb: add continuity error indicators for memory mapped buffers
media: dmxdev: Fix the logic that enables DMA mmap support
media: dmxdev: fix error code for invalid ioctls
media: m88ds3103: don't call a non-initalized function
media: au0828: add VIDEO_V4L2 dependency
media: dvb: fix DVB_MMAP dependency
media: dvb: fix DVB_MMAP symbol name
media: videobuf2: fix build issues with vb2-trace
media: videobuf2: Add VIDEOBUF2_V4L2 Kconfig option for VB2 V4L2 part
Linus Torvalds [Sat, 3 Mar 2018 03:40:43 +0000 (19:40 -0800)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Radim Krčmář:
"x86:
- fix NULL dereference when using userspace lapic
- optimize spectre v1 mitigations by allowing guests to use LFENCE
- make microcode revision configurable to prevent guests from
unnecessarily blacklisting spectre v2 mitigation feature"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: fix vcpu initialization with userspace lapic
KVM: X86: Allow userspace to define the microcode version
KVM: X86: Introduce kvm_get_msr_feature()
KVM: SVM: Add MSR-based feature support for serializing LFENCE
KVM: x86: Add a framework for supporting MSR-based features
Dan Williams [Wed, 7 Feb 2018 03:34:11 +0000 (19:34 -0800)]
memremap: fix softlockup reports at teardown
The cond_resched() currently in the setup path needs to be duplicated in
the teardown path. Rather than require each instance of
for_each_device_pfn() to open code the same sequence, embed it in the
helper.
Link: https://github.com/intel/ixpdimm_sw/issues/11 Cc: "Jérôme Glisse" <jglisse@redhat.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Christoph Hellwig <hch@lst.de> Cc: <stable@vger.kernel.org> Fixes: 71389703839e ("mm, zone_device: Replace {get, put}_zone_device_page()...") Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Dave Jiang [Sat, 3 Mar 2018 03:31:40 +0000 (19:31 -0800)]
libnvdimm: re-enable deep flush for pmem devices via fsync()
Re-enable deep flush so that users always have a way to be sure that a
write makes it all the way out to media. Writes from the PMEM driver
always arrive at the NVDIMM since movnt is used to bypass the cache, and
the driver relies on the ADR (Asynchronous DRAM Refresh) mechanism to
flush write buffers on power failure. The Deep Flush mechanism is there
to explicitly write buffers to protect against (rare) ADR failure. This
change prevents a regression in deep flush behavior so that applications
can continue to depend on fsync() as a mechanism to trigger deep flush
in the filesystem-DAX case.
Fixes: 06e8ccdab15f4 ("acpi: nfit: Add support for detect platform CPU cache...") Reviewed-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Dan Williams [Sun, 4 Feb 2018 18:34:02 +0000 (10:34 -0800)]
vfio: disable filesystem-dax page pinning
Filesystem-DAX is incompatible with 'longterm' page pinning. Without
page cache indirection a DAX mapping maps filesystem blocks directly.
This means that the filesystem must not modify a file's block map while
any page in a mapping is pinned. In order to prevent the situation of
userspace holding of filesystem operations indefinitely, disallow
'longterm' Filesystem-DAX mappings.
RDMA has the same conflict and the plan there is to add a 'with lease'
mechanism to allow the kernel to notify userspace that the mapping is
being torn down for block-map maintenance. Perhaps something similar can
be put in place for vfio.
Note that xfs and ext4 still report:
"DAX enabled. Warning: EXPERIMENTAL, use at your own risk"
...at mount time, and resolving the dax-dma-vs-truncate problem is one
of the last hurdles to remove that designation.
Acked-by: Alex Williamson <alex.williamson@redhat.com> Cc: Michal Hocko <mhocko@suse.com> Cc: kvm@vger.kernel.org Cc: <stable@vger.kernel.org> Reported-by: Haozhong Zhang <haozhong.zhang@intel.com> Tested-by: Haozhong Zhang <haozhong.zhang@intel.com> Fixes: d475c6346a38 ("dax,ext2: replace XIP read and write with DAX I/O") Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
- Fix a crash when BIOS didn't assign a BAR and we try to enlarge it
(Christian König)
* tag 'pci-v4.16-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI: Allow release of resources that were never assigned
PCI: Update location of pci.ids file