perf trace: Don't relookup fields by name in each sample
Instead do the lookups just when creating the tracepoints, initially for
the most common, raw_syscalls:sys_{enter,exit}.
It works by having evsel->priv have a per tracepoint structure with
entries for the fields, for direct access, with the offset and a
function to get the value from the sample, doing the swap if needed.
Using a simple workload that does M millions write syscalls, we go from:
Ingo Molnar [Thu, 7 Nov 2013 07:46:13 +0000 (08:46 +0100)]
Merge branch 'uprobes/core' of git://git.kernel.org/pub/scm/linux/kernel/git/oleg/misc into perf/core
Pull uprobes updates from Oleg Nesterov:
" [...] this way the upcoming ARM port doesn't (almost) need
changes outside of arch/arm and thus it would be simpler to
route everything via the ARM trees. "
Oleg Nesterov [Tue, 5 Nov 2013 18:50:39 +0000 (19:50 +0100)]
uprobes: Export write_opcode() as uprobe_write_opcode()
set_swbp() and set_orig_insn() are __weak, but this is pointless
because write_opcode() is static.
Export write_opcode() as uprobe_write_opcode() for the upcoming
arm port, this way it can actually override set_swbp() and use
__opcode_to_mem_arm(bpinsn) instead if UPROBE_SWBP_INSN.
Oleg Nesterov [Mon, 4 Nov 2013 19:27:13 +0000 (20:27 +0100)]
uprobes: Introduce arch_uprobe->ixol
Currently xol_get_insn_slot() assumes that we should simply copy
arch_uprobe->insn[] which is (ignoring arch_uprobe_analyze_insn)
just the copy of the original insn.
This is not true for arm which needs to create another insn to
execute it out-of-line.
So this patch simply adds the new member, ->ixol into the union.
This doesn't make any difference for x86 and powerpc, but arm
can divorce insn/ixol and initialize the correct xol insn in
arch_uprobe_analyze_insn().
Oleg Nesterov [Thu, 31 Oct 2013 18:28:22 +0000 (19:28 +0100)]
uprobes: Kill module_init() and module_exit()
Turn module_init() into __initcall() and kill module_exit().
This code can't be compiled as a module so these module_*()
calls only add the confusion, especially if arch-dependant
code needs its own initialization hooks.
David A. Long [Tue, 15 Oct 2013 21:04:16 +0000 (17:04 -0400)]
uprobes: Move function declarations out of arch
Move the function declarations from the arch headers to the common
header, since only the function bodies are architecture-specific.
These changes are from Vincent Rabin's uprobes patch.
Yan, Zheng [Thu, 31 Oct 2013 05:36:55 +0000 (13:36 +0800)]
perf/x86/intel: Add Ivy Bridge-EP uncore IRP box support
Unlike other uncore boxes, IRP boxes live in PCI buses with no UBOX
device. For PCI bus without UBOX device, we find the next bus that
has UBOX device and use its 'bus to socket' mapping.
Besides the counter/control registers in IRP boxes are not properly
aligned.
Oleg Nesterov [Thu, 17 Oct 2013 18:24:17 +0000 (20:24 +0200)]
perf: Factor out strncpy() in perf_event_mmap_event()
While this is really minor, but strncpy() does the unnecessary
zero-padding till the end of tmp[16] and it is called every time
we are going to use the string literal.
Turn these strncpy()'s into the single strlcpy() under the new
label, saves 72 bytes.
Peter Zijlstra [Wed, 30 Oct 2013 20:16:22 +0000 (21:16 +0100)]
perf: Fix arch_perf_out_copy_user default
The arch_perf_output_copy_user() default of
__copy_from_user_inatomic() returns bytes not copied, while all other
argument functions given DEFINE_OUTPUT_COPY() return bytes copied.
Since copy_from_user_nmi() is the odd duck out by returning bytes
copied where all other *copy_{to,from}* functions return bytes not
copied, change it over and ammend DEFINE_OUTPUT_COPY() to expect bytes
not copied.
Oddly enough DEFINE_OUTPUT_COPY() already returned bytes not copied
while expecting its worker functions to return bytes copied.
Peter Zijlstra [Thu, 31 Oct 2013 16:29:29 +0000 (17:29 +0100)]
perf: Optimize perf_output_begin() -- lost_event case
Avoid touching the lost_event and sample_data cachelines twince. Its
not like we end up doing less work, but it might help to keep all
accesses to these cachelines in one place.
Due to code shuffle, this looses 4 bytes on x86_64-defconfig.
Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Michael Ellerman <michael@ellerman.id.au> Cc: Michael Neuling <mikey@neuling.org> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: james.hogan@imgtec.com Cc: Vince Weaver <vince@deater.net> Cc: Victor Kaplansky <VICTORK@il.ibm.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Anton Blanchard <anton@samba.org> Link: http://lkml.kernel.org/n/tip-zfxnc58qxj0eawdoj31hhupv@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Thu, 31 Oct 2013 16:25:38 +0000 (17:25 +0100)]
perf: Optimize perf_output_begin()
There's no point in re-doing the memory-barrier when we fail the
cmpxchg(). Also placing it after the space reservation loop makes it
clearer it only separates the userpage->tail read from the data
stores.
Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Michael Ellerman <michael@ellerman.id.au> Cc: Michael Neuling <mikey@neuling.org> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: james.hogan@imgtec.com Cc: Vince Weaver <vince@deater.net> Cc: Victor Kaplansky <VICTORK@il.ibm.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Anton Blanchard <anton@samba.org> Link: http://lkml.kernel.org/n/tip-c19u6egfldyx86tpyc3zgkw9@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Thu, 31 Oct 2013 16:20:25 +0000 (17:20 +0100)]
perf: Add unlikely() to the ring-buffer code
Add unlikely() annotations to 'slow' paths:
When having a sampling event but no output buffer; you have bigger
issues -- also the bail is still faster than actually doing the work.
When having a sampling event but a control page only buffer, you have
bigger issues -- again the bail is still faster than actually doing
work.
Optimize for the case where you're not loosing events -- again, not
doing the work is still faster but make sure that when you have to
actually do work its as fast as possible.
The typical watermark is 1/2 the buffer size, so most events will not
take this path.
Shrinks perf_output_begin() by 16 bytes on x86_64-defconfig.
Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Michael Ellerman <michael@ellerman.id.au> Cc: Michael Neuling <mikey@neuling.org> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: james.hogan@imgtec.com Cc: Vince Weaver <vince@deater.net> Cc: Victor Kaplansky <VICTORK@il.ibm.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Anton Blanchard <anton@samba.org> Link: http://lkml.kernel.org/n/tip-wlg3jew3qnutm8opd0hyeuwn@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
Jiri Olsa [Tue, 5 Nov 2013 14:14:47 +0000 (15:14 +0100)]
perf tools: Check maximum frequency rate for record/top
Adding the check for maximum allowed frequency rate defined in following
file:
/proc/sys/kernel/perf_event_max_sample_rate
When we cross the maximum value we fail and display detailed error
message with advise.
$ perf record -F 3000 ls
Maximum frequency rate (2000) reached.
Please use -F freq option with lower value or consider
tweaking /proc/sys/kernel/perf_event_max_sample_rate.
In case user does not specify the frequency and the default value cross
the maximum, we display warning and set the frequency value to the
current maximum.
$ perf record ls
Lowering default frequency rate to 2000.
Please consider tweaking /proc/sys/kernel/perf_event_max_sample_rate.
Same messages are used for 'perf top'.
Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1383660887-1734-4-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Steven Rostedt [Fri, 1 Nov 2013 21:54:00 +0000 (17:54 -0400)]
tools lib traceevent: Add pevent_print_func_field() helper function
Add the pevent_print_func_field() that will look up a field that is
expected to be a function pointer, and it will print the function name
and offset of the address given by the field.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/20131101215501.869542711@goodmis.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Steven Rostedt [Fri, 1 Nov 2013 21:53:59 +0000 (17:53 -0400)]
tools lib traceevent: Add flags NOHANDLE and PRINTRAW to individual events
Add the flags EVENT_FL_NOHANDLE and EVENT_FL_PRINTRAW to the event flags
to have the event either ignore the register handler or to ignore the
handler and also print the raw format respectively.
This allows a tool to force a raw format or non handle for an event.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/20131101215501.655258742@goodmis.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
tools lib traceevent: Check for spaces in character array
Currently when using the raw format for fields, when looking at a
character array, to determine if it is a string or not, we make sure all
characters are "isprint()". If not, then we consider it a numeric array,
and print the hex numbers of the characters instead.
But it seems that '\n' fails the isprint() check! Add isspace() to the
check as well, such that if all characters pass isprint() or isspace()
it will assume the character array is a string.
Reported-by: Xenia Ragiadakou <burzalodowa@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Xenia Ragiadakou <burzalodowa@gmail.com> Link: http://lkml.kernel.org/r/20131101215501.465091682@goodmis.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The kernel has a few events with a format similar to this excerpt:
field:unsigned int len; offset:12; size:4; signed:0;
field:__data_loc unsigned char[] data_array; offset:16; size:4; signed:0;
print fmt: "%s", __print_hex(__get_dynamic_array(data_array), REC->len)
trace-cmd could already parse that arg correctly, but print_str_arg()
was unable to handle the first parameter being a dynamic array. (It
just printed a "field not found" warning).
Teach print_str_arg's PRINT_HEX case to handle the nested
PRINT_DYNAMIC_ARRAY correctly. The output now matches the kernel's own
formatting for this case.
Signed-off-by: Howard Cochran <hcochran@lexmark.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1381503349-12271-1-git-send-email-hcochran@lexmark.com
[ Removed "polish compare", we don't do that here ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
tools lib traceevent: If %s is a pointer, check printk formats
If the format string of TP_printk() contains a %s, and the argument is
not a string, check if the argument is a pointer that might match the
printk_formats that were stored.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/20131101215500.698924777@goodmis.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
tools lib traceevent: Update printk formats when entered
Instead of cropping off the '"' and '\n"' from a printk format every
time it is referenced, do it when it's added. This makes it easier to
reference a printk_map and should speed things up a little.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/20131101215500.495619312@goodmis.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
tools lib traceevent: Add support for extracting trace_clock in report
If trace-cmd extracts trace_clock, trace-cmd reads trace_clock data from
the trace.dat and switches outputting format of timestamp for each
trace_clock.
Signed-off-by: Yoshihiro YUNOMAE <yoshihiro.yunomae.ez@hitachi.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/20130424231305.14877.86147.stgit@yunodevel Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Fri, 1 Nov 2013 07:33:15 +0000 (16:33 +0900)]
perf stat: Enhance option parse error message
Print related option help messages only when it failed to process
options. While at it, modify parse_options_usage() to skip usage part
so that it can be used for showing multiple option help messages
naturally like below:
$ perf stat -Bx, ls
-B option not supported with -x
usage: perf stat [<options>] [<command>]
-B, --big-num print large numbers with thousands' separators
-x, --field-separator <separator>
print counts with custom separator
Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Ingo Molnar <mingo@kernel.org> Enthusiastically-Supported-by: Ingo Molnar <mingo@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1383291195-24386-6-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Fri, 1 Nov 2013 07:33:14 +0000 (16:33 +0900)]
perf top: Use parse_options_usage() for -s option failure
The -s (--sort) option was processed after normal option parsing so that
it cannot call the parse_options_usage() automatically. Currently it
calls usage_with_options() which shows entire help messages for event
option. Fix it by showing just -s options.
$ perf top -s help
Error: Unknown --sort key: `help'
Namhyung Kim [Fri, 1 Nov 2013 07:33:13 +0000 (16:33 +0900)]
perf report: Use parse_options_usage() for -s option failure
The -s (--sort) option was processed after normal option parsing so that
it cannot call the parse_options_usage() automatically. Currently it
calls usage_with_options() which shows entire help messages for event
option. Fix it by showing just -s options.
$ perf report -s help
Error: Unknown --sort key: `help'
Namhyung Kim [Fri, 1 Nov 2013 07:33:12 +0000 (16:33 +0900)]
perf report: Postpone setting up browser after parsing options
If setup_browser() called earlier than option parsing, the actual error
message can be discarded during the terminal reset. So move it after
setup_sorting() checks whether the sort keys are valid.
Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Ingo Molnar <mingo@kernel.org> Enthusiastically-Supported-by: Ingo Molnar <mingo@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1383291195-24386-3-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Fri, 1 Nov 2013 07:33:11 +0000 (16:33 +0900)]
perf tools: Show single option when failed to parse
Current option parser outputs whole option help string when it failed to
parse an option. However this is not good for user if the command has
many option, she might feel hard which one is related easily.
Fix it by just showing the help message of the given option only.
Signed-off-by: Namhyung Kim <namhyung@kernel.org> Requested-by: Ingo Molnar <mingo@kernel.org> Acked-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Ingo Molnar <mingo@kernel.org> Enthusiastically-Supported-by: Ingo Molnar <mingo@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1383291195-24386-2-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adrian Hunter [Fri, 1 Nov 2013 13:51:37 +0000 (15:51 +0200)]
perf test: Update "sample parsing" test for PERF_SAMPLE_TRANSACTION
In fact the "sample parsing" test does not automatically check new
sample type bits - they must be added to the comparison logic.
Doing that shows that the test fails because the functions
perf_event__synthesize_sample() and perf_event__sample_event_size() have
not been updated with PERF_SAMPLE_TRANSACTION either.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1383313899-15987-10-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adrian Hunter [Fri, 1 Nov 2013 13:51:30 +0000 (15:51 +0200)]
perf script: Set up output options for in-stream attributes
Attributes (struct perf_event_attr) are recorded separately in the
perf.data file. perf script uses them to set up output options.
However attributes can also be in the event stream, for example when the
input is a pipe (i.e. live mode). This patch makes perf script process
in-stream attributes in the same way as on-file attributes.
Here is an example:
Before this patch:
$ perf record uname | perf script
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.015 MB (null) (~655 samples) ]
:4220 4220 [-01] 2933367.838906: cycles:
Now that comm strings are allocated only once and refcounted to be shared
among threads, these can now be safely compared by addresses. This
should remove most hists collapses on post processing.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Tested-by: Jiri Olsa <jolsa@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1381468543-25334-8-git-send-email-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
This new COMM infrastructure provides two features:
1) It keeps track of all comms lifecycle for a given thread. This way we
can associate a timeframe to any thread COMM, as long as
PERF_SAMPLE_TIME samples are joined to COMM and fork events.
As a result we should have more precise COMM sorted hists with seperated
entries for pre and post exec time after a fork.
2) It also makes sure that a given COMM string is not duplicated but
rather shared among the threads that refer to it. This way the threads
COMM can be compared against pointer values from the sort
infrastructure.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Tested-by: Jiri Olsa <jolsa@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-hwjf70b2wve9m2kosxiq8bb3@git.kernel.org
[ Rename some accessor functions ] Signed-off-by: Namhyung Kim <namhyung@kernel.org>
[ Use __ as separator for class__method for private comm_str methods ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This way we can later delimit a lifecycle for the COMM and map a hist to
a precise COMM:timeslice couple.
PERF_RECORD_COMM and PERF_RECORD_FORK events that don't have
PERF_SAMPLE_TIME samples can only send 0 value as a timestamp and thus
should overwrite any previous COMM on a given thread because there is no
sensible way to keep track of all the comms lifecycles in a thread
without time informations.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Tested-by: Jiri Olsa <jolsa@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-6tyow99vgmmtt9qwr2u2lqd7@git.kernel.org
[ Made it cope with PERF_RECORD_MMAP2 ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
As the thread comm is going to be implemented by way of a more
complicated data structure than just a pointer to a string from the
thread struct, convert the readers of comm to use an accessor instead of
accessing it directly.
The accessor will be later overriden to support an enhanced comm
implementation.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Tested-by: Jiri Olsa <jolsa@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-wr683zwy94hmj4ibogmnv9ce@git.kernel.org
[ Rename thread__comm_curr() to thread__comm_str() ] Signed-off-by: Namhyung Kim <namhyung@kernel.org>
[ Fixed up some minor const pointer issues ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Linus Torvalds [Sun, 3 Nov 2013 19:36:41 +0000 (11:36 -0800)]
Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus
Pull MIPS fixes from Ralf Baechle:
"Three fixes across arch/mips with the most complex one being the GIC
interrupt fix - at nine lines still not monster. I'm confident this
are the final MIPS patches even if there should go for an rc8"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: ralink: fix return value check in rt_timer_probe()
MIPS: malta: Fix GIC interrupt offsets
MIPS: Perf: Fix 74K cache map
Mathias Krause [Sun, 3 Nov 2013 11:36:28 +0000 (12:36 +0100)]
ipc, msg: forbid negative values for "msg{max,mnb,mni}"
Negative message lengths make no sense -- so don't do negative queue
lenghts or identifier counts. Prevent them from getting negative.
Also change the underlying data types to be unsigned to avoid hairy
surprises with sign extensions in cases where those variables get
evaluated in unsigned expressions with bigger data types, e.g size_t.
In case a user still wants to have "unlimited" sizes she could just use
INT_MAX instead.
Vineet Gupta [Sat, 2 Nov 2013 12:17:49 +0000 (17:47 +0530)]
ARC: Incorrect mm reference used in vmalloc fault handler
A vmalloc fault needs to sync up PGD/PTE entry from init_mm to current
task's "active_mm". ARC vmalloc fault handler however was using mm.
A vmalloc fault for non user task context (actually pre-userland, from
init thread's open for /dev/console) caused the handler to deref NULL mm
(for mm->pgd)
The reasons it worked so far is amazing:
1. By default (!SMP), vmalloc fault handler uses a cached value of PGD.
In SMP that MMU register is repurposed hence need for mm pointer deref.
2. In pre-3.12 SMP kernel, the problem triggering vmalloc didn't exist in
pre-userland code path - it was introduced with commit 20bafb3d23d108bc
"n_tty: Move buffers into n_tty_data"
Ming Lei [Fri, 1 Nov 2013 22:41:33 +0000 (09:11 +1030)]
scripts/kallsyms: filter symbols not in kernel address space
This patch uses CONFIG_PAGE_OFFSET to filter symbols which
are not in kernel address space because these symbols are
generally for generating code purpose and can't be run at
kernel mode, so we needn't keep them in /proc/kallsyms.
For example, on ARM there are some symbols which may be
linked in relocatable code section, then perf can't parse
symbols any more from /proc/kallsyms, this patch fixes the
problem (introduced b9b32bf70f2fb710b07c94e13afbc729afe221da)
Cc: Russell King <linux@arm.linux.org.uk> Cc: linux-arm-kernel@lists.infradead.org Cc: Michal Marek <mmarek@suse.cz> Signed-off-by: Ming Lei <tom.leiming@gmail.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: stable@vger.kernel.org
Linus Torvalds [Fri, 1 Nov 2013 19:23:56 +0000 (12:23 -0700)]
Merge tag 'usb-3.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here is a set of patches that revert all of the changes done to the
pl2303 USB serial driver in the 3.12-rc timeframe, as it turns out
they break some devices that work just fine on 3.11. As it's not a
good idea to break working systems, drop them all and they will be
reworked for future kernel versions such that there is no breakage.
I've also included a MAINTAINERS update for the USB serial subsystem
and a new device id for the ftdi_sio driver as well"
* tag 'usb-3.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
USB: serial: ftdi_sio: add id for Z3X Box device
USB: Maintainers change for usb serial drivers
Revert "USB: pl2303: restrict the divisor based baud rate encoding method to the "HX" chip type"
Revert "usb: pl2303: fix+improve the divsor based baud rate encoding method"
Revert "usb: pl2303: do not round to the next nearest standard baud rate for the divisor based baud rate encoding method"
Revert "usb: pl2303: remove 500000 baud from the list of standard baud rates"
Revert "usb: pl2303: move the two baud rate encoding methods to separate functions"
Revert "usb: pl2303: increase the allowed baud rate range for the divisor based encoding method"
Revert "usb: pl2303: also use the divisor based baud rate encoding method for baud rates < 115200 with HX chips"
Revert "usb: pl2303: add two comments concerning the supported baud rates with HX chips"
Revert "pl2303: simplify the else-if contruct for type_1 chips in pl2303_startup()"
Revert "pl2303: improve the chip type information output on startup"
Revert "pl2303: improve the chip type detection/distinction"
Revert "USB: pl2303: distinguish between original and cloned HX chips"
Linus Torvalds [Fri, 1 Nov 2013 19:23:22 +0000 (12:23 -0700)]
Merge tag 'sound-3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull more sound fixes from Takashi Iwai:
"The fixes for random bugs that have been reported lately in the game:
a few fixes in ASoC dpam and wm_hubs bugs spotted by Coverity, a
one-liner HD-audio fixup, and a fix for Oops with DPCM.
They are not so critically urgent bugs, but all small and safe"
* tag 'sound-3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: fix oops in snd_pcm_info() caused by ASoC DPCM
ASoC: wm_hubs: Add missing break in hp_supply_event()
ALSA: hda - Add a fixup for ASUS N76VZ
ASoC: dapm: Return -ENOMEM in snd_soc_dapm_new_dai_widgets()
ASoC: dapm: Fix source list debugfs outputs
Linus Torvalds [Fri, 1 Nov 2013 19:22:47 +0000 (12:22 -0700)]
Merge tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mturquette/linux
Pull clock subsystem fixes from Mike Turquette.
* tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mturquette/linux:
clk: fixup argument order when setting VCO parameters
clk: socfpga: Fix incorrect sdmmc clock name
clk: armada-370: fix tclk frequencies
clk: nomadik: set all timers to use 2.4 MHz TIMCLK
Greg Thelen [Fri, 1 Nov 2013 19:16:59 +0000 (12:16 -0700)]
memcg: remove incorrect underflow check
When a memcg is deleted mem_cgroup_reparent_charges() moves charged
memory to the parent memcg. As of v3.11-9444-g3ea67d0 "memcg: add per
cgroup writeback pages accounting" there's bad pointer read. The goal
was to check for counter underflow. The counter is a per cpu counter
and there are two problems with the code:
(1) per cpu access function isn't used, instead a naked pointer is used
which easily causes oops.
(2) the check doesn't sum all cpus
The fix is to remove the check. It's currently dangerous and isn't
worth fixing it to use something expensive, such as
percpu_counter_sum(), for each reparented page. __this_cpu_read() isn't
enough to fix this because there's no guarantees of the current cpus
count. The only guarantees is that the sum of all per-cpu counter is >=
nr_pages.
Greg KH [Wed, 30 Oct 2013 18:07:31 +0000 (11:07 -0700)]
USB: Maintainers change for usb serial drivers
Johan has been conned^Wgracious in accepting the maintainership of the
USB serial drivers, especially as he's been doing all of the real work
for the past few years.
At the same time, remove a bunch of old entries for USB serial drivers
that don't make sense anymore, given that the developers are no longer
around, and individual driver maintainerships for tiny things like this
is pretty pointless.
Acked-by: Johan Hovold <jhovold@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Revert all of the pl2303 changes that went into 3.12-rc1 and -rc2 as
they cause regressions on some versions of the chip. This will all be
revisited for later kernel versions when we can figure out how to handle
this in a way that does not break working devices.
Reported-by: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Frank Schäfer <fschaefer.oss@googlemail.com> Acked-by: Johan Hovold <jhovold@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Revert all of the pl2303 changes that went into 3.12-rc1 and -rc2 as
they cause regressions on some versions of the chip. This will all be
revisited for later kernel versions when we can figure out how to handle
this in a way that does not break working devices.
Reported-by: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Frank Schäfer <fschaefer.oss@googlemail.com> Acked-by: Johan Hovold <jhovold@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Revert all of the pl2303 changes that went into 3.12-rc1 and -rc2 as
they cause regressions on some versions of the chip. This will all be
revisited for later kernel versions when we can figure out how to handle
this in a way that does not break working devices.
Reported-by: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Frank Schäfer <fschaefer.oss@googlemail.com> Acked-by: Johan Hovold <jhovold@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Revert all of the pl2303 changes that went into 3.12-rc1 and -rc2 as
they cause regressions on some versions of the chip. This will all be
revisited for later kernel versions when we can figure out how to handle
this in a way that does not break working devices.
Reported-by: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Frank Schäfer <fschaefer.oss@googlemail.com> Acked-by: Johan Hovold <jhovold@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Revert all of the pl2303 changes that went into 3.12-rc1 and -rc2 as
they cause regressions on some versions of the chip. This will all be
revisited for later kernel versions when we can figure out how to handle
this in a way that does not break working devices.
Reported-by: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Frank Schäfer <fschaefer.oss@googlemail.com> Acked-by: Johan Hovold <jhovold@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Revert all of the pl2303 changes that went into 3.12-rc1 and -rc2 as
they cause regressions on some versions of the chip. This will all be
revisited for later kernel versions when we can figure out how to handle
this in a way that does not break working devices.
Reported-by: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Frank Schäfer <fschaefer.oss@googlemail.com> Acked-by: Johan Hovold <jhovold@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Revert all of the pl2303 changes that went into 3.12-rc1 and -rc2 as
they cause regressions on some versions of the chip. This will all be
revisited for later kernel versions when we can figure out how to handle
this in a way that does not break working devices.
Reported-by: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Frank Schäfer <fschaefer.oss@googlemail.com> Acked-by: Johan Hovold <jhovold@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Revert all of the pl2303 changes that went into 3.12-rc1 and -rc2 as
they cause regressions on some versions of the chip. This will all be
revisited for later kernel versions when we can figure out how to handle
this in a way that does not break working devices.
Reported-by: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Frank Schäfer <fschaefer.oss@googlemail.com> Acked-by: Johan Hovold <jhovold@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Revert all of the pl2303 changes that went into 3.12-rc1 and -rc2 as
they cause regressions on some versions of the chip. This will all be
revisited for later kernel versions when we can figure out how to handle
this in a way that does not break working devices.
Reported-by: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Frank Schäfer <fschaefer.oss@googlemail.com> Acked-by: Johan Hovold <jhovold@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Revert all of the pl2303 changes that went into 3.12-rc1 and -rc2 as
they cause regressions on some versions of the chip. This will all be
revisited for later kernel versions when we can figure out how to handle
this in a way that does not break working devices.
Reported-by: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Frank Schäfer <fschaefer.oss@googlemail.com> Acked-by: Johan Hovold <jhovold@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Revert all of the pl2303 changes that went into 3.12-rc1 and -rc2 as
they cause regressions on some versions of the chip. This will all be
revisited for later kernel versions when we can figure out how to handle
this in a way that does not break working devices.
Reported-by: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Frank Schäfer <fschaefer.oss@googlemail.com> Acked-by: Johan Hovold <jhovold@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Revert all of the pl2303 changes that went into 3.12-rc1 and -rc2 as
they cause regressions on some versions of the chip. This will all be
revisited for later kernel versions when we can figure out how to handle
this in a way that does not break working devices.
Reported-by: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Frank Schäfer <fschaefer.oss@googlemail.com> Acked-by: Johan Hovold <jhovold@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
perf test: Update command line callchain attribute tests
The "struct perf_event_attr setup" entry in 'perf test' is in fact a
series of tests that will exec the tools, passing different sets of
command line arguments to then intercept the sys_perf_event_open
syscall, in user space, to check that the perf_event_attr->sample_type
and other feature request bits are setup as expected.
We recently restored the callchain requesting command line argument, -g,
to not require a parameter ("dwarf" or "fp"), instead using a default
("fp" for now) and making the long option variant, --call-chain, be the
one to be used when a different callchain collection method is
preferred.
The "struct perf_event_attr setup" test failed because we forgot to
update the tests involving callchains, not switching from, '-g dwarf' to
'--call-chain dwarf', making 'perf test' detect it:
Fix all of them now to use --call-chain when explicitely specifying a
method.
There is still work to do, as '-g fp', for instance, passed without
problems.
In that case 'perf test' saw no problems as the intercepted syscall got
the bits as expected, i.e. the default is 'fp', but the fact that 'fp'
may be an existing program and the specified workload would then be
passed as a parameter to it is an usability problem that needs fixing.
Next merge window tho.
Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-jr3oq1k5iywnp7vvqlslzydm@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Wei Yang [Sun, 22 Sep 2013 08:49:24 +0000 (16:49 +0800)]
perf bench: Fix two warnings
There are two warnings in bench/numa, when building this on 32-bit
machine.
The warning output is attached:
bench/numa.c:1113:20: error: comparison between signed and unsigned integer expressions [-Werror=sign-compare]
bench/numa.c:1161:6: error: format ‘%lx’ expects argument of t'long unsigned int’, but argument 5 has type ‘u64’ [-Werror=format]
perf tools: Remove cast of non-variadic function to variadic
The 4fb71074a570 (perf ui/hist: Consolidate hpp helpers) cset introduced
a cast of percent_color_snprintf to a function pointer type with
varargs. Change percent_color_snprintf to be variadic and remove the
cast.
The symptom of this was all percentages being reported as 0.00% in perf
report --stdio output on the armhf arch.
Signed-off-by: Michael Hudson-Doyle <michael.hudson@linaro.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Jean Pihet <jean.pihet@linaro.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/87zjppvw7y.fsf@canonical.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Linus Torvalds [Thu, 31 Oct 2013 23:58:23 +0000 (16:58 -0700)]
Merge branch 'akpm' (fixes from Andrew Morton)
Merge four more fixes from Andrew Morton.
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
lib/scatterlist.c: don't flush_kernel_dcache_page on slab page
mm: memcg: fix test for child groups
mm: memcg: lockdep annotation for memcg OOM lock
mm: memcg: use proper memcg in limit bypass
Ming Lei [Thu, 31 Oct 2013 23:34:17 +0000 (16:34 -0700)]
lib/scatterlist.c: don't flush_kernel_dcache_page on slab page
Commit b1adaf65ba03 ("[SCSI] block: add sg buffer copy helper
functions") introduces two sg buffer copy helpers, and calls
flush_kernel_dcache_page() on pages in SG list after these pages are
written to.
Unfortunately, the commit may introduce a potential bug:
- Before sending some SCSI commands, kmalloc() buffer may be passed to
block layper, so flush_kernel_dcache_page() can see a slab page
finally
- According to cachetlb.txt, flush_kernel_dcache_page() is only called
on "a user page", which surely can't be a slab page.
- ARCH's implementation of flush_kernel_dcache_page() may use page
mapping information to do optimization so page_mapping() will see the
slab page, then VM_BUG_ON() is triggered.
Aaro Koskinen reported the bug on ARM/kirkwood when DEBUG_VM is enabled,
and this patch fixes the bug by adding test of '!PageSlab(miter->page)'
before calling flush_kernel_dcache_page().
Signed-off-by: Ming Lei <ming.lei@canonical.com> Reported-by: Aaro Koskinen <aaro.koskinen@iki.fi> Tested-by: Simon Baatz <gmbnomis@gmail.com> Cc: Russell King - ARM Linux <linux@arm.linux.org.uk> Cc: Will Deacon <will.deacon@arm.com> Cc: Aaro Koskinen <aaro.koskinen@iki.fi> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Tejun Heo <tj@kernel.org> Cc: "James E.J. Bottomley" <JBottomley@parallels.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: <stable@vger.kernel.org> [3.2+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Johannes Weiner [Thu, 31 Oct 2013 23:34:15 +0000 (16:34 -0700)]
mm: memcg: fix test for child groups
When memcg code needs to know whether any given memcg has children, it
uses the cgroup child iteration primitives and returns true/false
depending on whether the iteration loop is executed at least once or
not.
Because a cgroup's list of children is RCU protected, these primitives
require the RCU read-lock to be held, which is not the case for all
memcg callers. This results in the following splat when e.g. enabling
hierarchy mode:
In the memcg case, we only care about children when we are attempting to
modify inheritable attributes interactively. Racing with deletion could
mean a spurious -EBUSY, no problem. Racing with addition is handled
just fine as well through the memcg_create_mutex: if the child group is
not on the list after the mutex is acquired, it won't be initialized
from the parent's attributes until after the unlock.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Johannes Weiner [Thu, 31 Oct 2013 23:34:13 +0000 (16:34 -0700)]
mm: memcg: use proper memcg in limit bypass
Commit 84235de394d9 ("fs: buffer: move allocation failure loop into the
allocator") allowed __GFP_NOFAIL allocations to bypass the limit if they
fail to reclaim enough memory for the charge. But because the main test
case was on a 3.2-based system, the patch missed the fact that on newer
kernels the charge function needs to return root_mem_cgroup when
bypassing the limit, and not NULL. This will corrupt whatever memory is
at NULL + percpu pointer offset. Fix this quickly before problems are
reported.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Thu, 31 Oct 2013 22:43:02 +0000 (15:43 -0700)]
vfs: decrapify dput(), fix cache behavior under normal load
We do not want to dirty the dentry->d_flags cacheline in dput() just to
set the DCACHE_REFERENCED flag when it is already set in the common case
anyway. This way the first cacheline of the dentry (which contains the
RCU lookup information etc) can stay shared among multiple CPU's.
This finishes off some of the details of all the scalability patches
merged during the merge window.
Also don't mark dentry_kill() for inlining, since it's the uncommon path
and inlining it just makes the common path slower due to extra function
entry/exit overhead.