perf tools: Fix broken number of samples for perf report -n
author Stephane Eranian <eranian@google.com>
Mon, 3 Oct 2011 09:38:15 +0000 (11:38 +0200)
committer Arnaldo Carvalho de Melo <acme@redhat.com>
Fri, 7 Oct 2011 20:00:31 +0000 (17:00 -0300)
The perf report -n option was broken: depending on the sorting mode, it
did not report the correct number of samples. By default, samples are
sorted by comm,dso,sym, so samples for the same command (binary) get
collapsed.

hists__collapse_insert_entry() had a bug whereby it aggregated the
number of events observed (periods) but not the number of samples.
Consequently, the number of samples reported could be lower than the
actual count. The percentage remained correct because it is based on
the periods.
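
The merge step described above can be sketched in isolation as follows.
This is a minimal illustration, not the perf implementation: struct
entry, same_key() and collapse_merge() are hypothetical stand-ins, and
only the period and nr_events field names are taken from the patch.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for perf's hist entry. */
struct entry {
	const char *comm;   /* sort key used when collapsing (assumed) */
	uint64_t period;    /* aggregated event count, drives the percentage */
	uint32_t nr_events; /* number of samples, shown by -n */
};

/* Returns nonzero when two entries collapse into one (same command). */
int same_key(const struct entry *a, const struct entry *b)
{
	return strcmp(a->comm, b->comm) == 0;
}

/*
 * Fold 'he' into 'iter'. Both counters must be summed; summing only
 * the period keeps the percentage right but under-reports samples.
 */
void collapse_merge(struct entry *iter, const struct entry *he)
{
	iter->period += he->period;
	iter->nr_events += he->nr_events;
}
```

Merging two entries for the same command with 3 and 2 samples should
leave the surviving entry with 5 samples. Without the second addition
it would still report 3, while its period total would be unaffected.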

This patch fixes the problem by also aggregating the number of samples.
Here is an example:

$ perf report -n --stdio
    12.38%        842     pong  [kernel.kallsyms]     [k] __lock_acquire

Here pong (a ctxsw stress test) is the only program running, and thus
it is the only one responsible for the __lock_acquire samples.

If we change the sorting mode:

$ perf report -n --stdio --sort=sym
    12.38%       1732  [k] __lock_acquire

With this sort mode, the actual number of samples (1732) is shown.

With the fix:

$ perf report -n --stdio
    12.38%       1732     pong  [kernel.kallsyms]     [k] __lock_acquire

Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20111003093815.GA6393@quad
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
tools/perf/util/hist.c

index 87ef5c7..50c8fec 100644
@@ -281,6 +281,7 @@ static bool hists__collapse_insert_entry(struct hists *hists,
 
                if (!cmp) {
                        iter->period += he->period;
+                       iter->nr_events += he->nr_events;
                        if (symbol_conf.use_callchain) {
                                callchain_cursor_reset(&hists->callchain_cursor);
                                callchain_merge(&hists->callchain_cursor, iter->callchain,