Performance Counters for Linux, v8
From: Ingo Molnar <mingo@elte.hu>
To: linux-kernel@vger.kernel.org
Subject: [Announce] Performance Counters for Linux, v8
Date: Sat, 6 Jun 2009 22:35:03 +0200
Message-ID: <20090606203503.GA26208@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>, Peter Zijlstra <a.p.zijlstra@chello.nl>, linux-kernel@vger.kernel.org, Mike Galbraith <efault@gmx.de>, Thomas Gleixner <tglx@linutronix.de>, "H. Peter Anvin" <hpa@zytor.com>, Andrew Morton <akpm@linux-foundation.org>, Linus Torvalds <torvalds@linux-foundation.org>, Arnaldo Carvalho de Melo <acme@redhat.com>, Arjan van de Ven <arjan@infradead.org>, Frédéric Weisbecker <fweisbec@gmail.com>, Robert Richter <robert.richter@amd.com>, oprofile-list@lists.sf.net, "David S. Miller" <davem@davemloft.net>, Stephane Eranian <eranian@googlemail.com>, Steven Rostedt <rostedt@goodmis.org>, Corey Ashford <cjashfor@linux.vnet.ibm.com>

We are pleased to announce version 8 of the performance counters subsystem for Linux.

This new subsystem adds a new system call (sys_perf_counter_open()) and it provides the new 'perf' tool that makes use of these new kernel capabilities. This subsystem and this tool are new in that they try a new approach to integrating all things performance analysis under one roof.

There have been many changes since -v7 - see the shortlog below for details. There are a lot of new contributors to this code. Many thanks go to: Peter Zijlstra, Paul Mackerras, Robert Richter, Arnaldo Carvalho de Melo, Mike Galbraith, Thomas Gleixner, Wu Fengguang, Jaswinder Singh Rajput, Yong Wang, Frederic Weisbecker, Yinghai Lu, Luis Henriques, Eric Paris, Arjan van de Ven, Tim Blechmann, Steven Whitehouse, Jaswinder Singh, H. Peter Anvin, Hidetoshi Seto, Erdem Aktas and Andrew Morton.

The biggest change in -v8 is a re-focusing of our effort towards building tools to help various user-space development workflows.

The latest code and perfcounter-tools deal with all sorts of user-space profiling usage models. They are very fast, they are able to look up DSO symbols regardless of where they are loaded, and they try to be easy to use and easy to configure.

Per-application and system-wide profiling modes are supported. A number of intermediate modes are supported as well, via inherited counters that traverse into child-task hierarchies automatically and transparently.

With perfcounters there is no daemon needed: if a perfcounters kernel is booted on a supported CPU (all AMD models and Core2 / Corei7 / Atom Intel CPUs - both 64-bit and 32-bit user-space are supported) then profiling can be done straight away.

Profiling sessions are recorded into local files, which can then be analyzed. There are also high-level overview tools, 'perf stat' and 'perf top', which help one get a quick impression about what to profile and in what way.
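
To picture how the per-task mode and the inherited mode mentioned above differ, here is a toy model in Python. It is only a sketch: the Task class, the count() helper and the event numbers are made up for illustration and are not the kernel's data structures or API. It assumes that an inherited counter simply accumulates the events of the whole child-task hierarchy.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Task:
    """Hypothetical stand-in for a task; 'events' is what a counter
    attached to this task alone would have seen."""
    name: str
    events: int
    children: List["Task"] = field(default_factory=list)


def count(task: Task, inherit: bool) -> int:
    """Return the events of 'task' only, or of its whole child-task
    hierarchy when 'inherit' is set (assumed semantics, toy model)."""
    total = task.events
    if inherit:
        for child in task.children:
            total += count(child, inherit=True)
    return total


# Made-up hierarchy: a build that forks a compiler, an assembler and a linker.
make = Task("make", 10, [Task("cc1", 700, [Task("as", 40)]), Task("ld", 250)])

print("per-task: ", count(make, inherit=False))
print("inherited:", count(make, inherit=True))
```

A system-wide mode would correspond to counting every task on a CPU rather than one hierarchy; that case is left out of the sketch.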

New in -v8 is the 'perf' utility which has merged all the perfcounters utilities and which exposes all the functionality of the kernel subsystem in one uniform way:

 mercury:~/tip/tools/perf> perf

  usage: perf [--version] [--help] COMMAND [ARGS]

  The most commonly used perf commands are:
    annotate   Read perf.data (created by perf record) and display annotated code
    list       List all symbolic event types
    record     Run a command and record its profile into perf.data
    report     Read perf.data (created by perf record) and display the profile
    stat       Run a command and gather performance counter statistics
    top        Run a command and profile it

  See 'perf help COMMAND' for more information on a specific command.

There's also a new "record + report" separated profiling workflow supported: use "perf record ./my-app" to record its profile, then use "perf report" and all its --sort options to get various high-level and low-level details. Oprofile users will find this workflow familiar.

On the lowest level, 'perf annotate' will annotate the source code alongside profiling information and assembly code:

 $ perf annotate decode_tree_entry

------------------------------------------------
 Percent |      Source code & Disassembly of /home/mingo/git/git
------------------------------------------------
         :
         :      /home/mingo/git/git:     file format elf64-x86-64
         :
         :
         :      Disassembly of section .text:
         :
         :      00000000004a0da0 <decode_tree_entry>:
         :              *modep = mode;
         :              return str;
         :      }
         :
         :      static void decode_tree_entry(struct tree_desc *desc, const char *buf, unsigned long size)
         :      {
    3.82 :        4a0da0:   41 54            push   %r12
         :              const char *path;
         :              unsigned int mode, len;
         :
         :              if (size < 24 || buf[size - 21])
    0.17 :        4a0da2:   48 83 fa 17      cmp    $0x17,%rdx
         :              *modep = mode;
         :              return str;
         :      }
         :
         :      static void decode_tree_entry(struct tree_desc *desc, const char *buf, unsigned long size)
         :      {
    0.00 :        4a0da6:   49 89 fc         mov    %rdi,%r12
    0.00 :        4a0da9:   55               push   %rbp
    3.37 :        4a0daa:   53               push   %rbx
         :              const char *path;
         :              unsigned int mode, len;
         :
         :              if (size < 24 || buf[size - 21])
    0.08 :        4a0dab:   76 73            jbe    4a0e20 <decode_tree_entry+0x80>
    0.00 :        4a0dad:   80 7c 16 eb 00   cmpb   $0x0,-0x15(%rsi,%rdx,1)
    3.48 :        4a0db2:   75 6c            jne    4a0e20 <decode_tree_entry+0x80>
         :      static const char *get_mode(const char *str, unsigned int *modep)
         :      {
         :              unsigned char c;
         :              unsigned int mode = 0;
         :
         :              if (*str == ' ')
    1.94 :        4a0db4:   0f b6 06         movzbl (%rsi),%eax
    0.39 :        4a0db7:   3c 20            cmp    $0x20,%al
    0.00 :        4a0db9:   74 65            je     4a0e20 <decode_tree_entry+0x80>
         :                      return NULL;
         :
         :              while ((c = *str++) != ' ') {
    0.06 :        4a0dbb:   89 c2            mov    %eax,%edx
         :                      if (c < '0' || c > '7')
    1.99 :        4a0dbd:   31 ed            xor    %ebp,%ebp
         :              unsigned int mode = 0;
         :
         :              if (*str == ' ')
         :                      return NULL;
         :
         :              while ((c = *str++) != ' ') {
    1.74 :        4a0dbf:   48 8d 5e 01      lea    0x1(%rsi),%rbx
         :                      if (c < '0' || c > '7')
    0.00 :        4a0dc3:   8d 42 d0         lea    -0x30(%rdx),%eax
    0.17 :        4a0dc6:   3c 07            cmp    $0x7,%al
    0.00 :        4a0dc8:   76 0d            jbe    4a0dd7 <decode_tree_entry+0x37>
    0.00 :        4a0dca:   eb 54            jmp    4a0e20 <decode_tree_entry+0x80>
    0.00 :        4a0dcc:   0f 1f 40 00      nopl   0x0(%rax)
   16.57 :        4a0dd0:   8d 42 d0         lea    -0x30(%rdx),%eax
    0.14 :        4a0dd3:   3c 07            cmp    $0x7,%al
    0.00 :        4a0dd5:   77 49            ja     4a0e20 <decode_tree_entry+0x80>
         :                              return NULL;
         :                      mode = (mode << 3) + (c - '0');
    3.12 :        4a0dd7:   0f b6 c2         movzbl %dl,%eax
         :              unsigned int mode = 0;
         :
         :              if (*str == ' ')
         :                      return NULL;
         :
         :              while ((c = *str++) != ' ') {
    0.00 :        4a0dda:   0f b6 13         movzbl (%rbx),%edx
   16.74 :        4a0ddd:   48 83 c3 01      add    $0x1,%rbx
         :                      if (c < '0' || c > '7')
         :                              return NULL;
         :                      mode = (mode << 3) + (c - '0');

Those who already use Git will (hopefully) find 'perf' intuitive, as we've picked up a number of internal libraries from Git to build this tool so the look-and-feel will be familiar. It's very extensible: new subcommands can be added easily, while there's just a single new binary in the system.

'perf report' supports multi-key histograms and a rich set of views of the same performance data - per task or per dso, or a fine-grained per symbol view (and all permutations of these keys).
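
As a rough illustration of the multi-key histogram idea, a sketch in Python follows. The sample tuples, the key names and the histogram() helper are hypothetical and do not reflect how 'perf report' is implemented or what perf.data contains; the point is only that the same samples can be bucketed by any combination of keys.

```python
from collections import Counter

# Hypothetical samples, one tuple per recorded event: (comm, dso, symbol).
SAMPLES = [
    ("git", "/home/mingo/git/git", "decode_tree_entry"),
    ("git", "/home/mingo/git/git", "decode_tree_entry"),
    ("git", "/home/mingo/git/git", "decode_tree_entry"),
    ("git", "/home/mingo/git/git", "decode_tree_entry"),
    ("git", "libc.so", "memcpy"),
    ("git", "libc.so", "memcpy"),
    ("cc1", "libc.so", "memcpy"),
    ("cc1", "cc1", "yyparse"),
]

# Position of each (illustrative) sort key inside a sample tuple.
KEYS = {"comm": 0, "dso": 1, "symbol": 2}


def histogram(samples, sort_keys):
    """Bucket samples by the chosen keys and return (percent, bucket)
    pairs, largest bucket first."""
    idx = [KEYS[k] for k in sort_keys]
    hist = Counter(tuple(s[i] for i in idx) for s in samples)
    total = len(samples)
    return [(100.0 * n / total, bucket) for bucket, n in hist.most_common()]


# The same data viewed per task, per dso, and per (task, dso, symbol).
for keys in (["comm"], ["dso"], ["comm", "dso", "symbol"]):
    print("sort by", ",".join(keys))
    for percent, bucket in histogram(SAMPLES, keys):
        print("  %6.2f%%  %s" % (percent, "  ".join(bucket)))
```

Any permutation of the keys gives another view without touching the recorded samples, which is the property the paragraph above describes.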

Most of the user-visible action in -v8 was in the tooling, but the kernel side code has been revamped all around as well:

 - Sampling support for inherited counters

 - Performance optimizations to lazy-switch PMU contexts

 - Enhanced PowerPC and x86 support.

 - Generic tracepoints can be used via perfcounters too

 - Fixed-frequency, auto-sampling counters. (they can be used via the '-F' option in perf record and perf top.)

 - Generic "hardware cache" event enumeration method - for those who want more than just a handful of essential hardware counters.

 - Automatic "fool-proof" event-throttling code to protect against accidentally too short sampling periods.

 - The 'raw events' configuration space has been extended - every event type that oprofile is able to handle can be specified via raw perfcounter events as well.

 - ... and lots of other changes.

To try/test/check this code, the latest perfcounters tree can be pulled/cloned from:

   git pull \
      git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip.git \
      perfcounters/core

Or the following patch can be applied to the latest (v2.6.30-rc8-git3) upstream -git Linux kernel:

   http://redhat.com/~mingo/perfcounters/perfcounters-v8-v2....

The 'perf' utility can be built by pulling that tree and by doing:

   cd tools/perf/
   make
   make install

( The combo patch is too large to be posted to lkml - and all the v7->v8 patches have been posted to lkml already. )

As usual, test feedback, patches, comments and suggestions are welcome!

	Ingo

------------------>
Andrew Morton (1):
      mutex: add atomic_dec_and_mutex_lock(), fix

Arjan van de Ven (2):
      perf_counter tools: Warning fixes on 32-bit
      perf_counter tools: Initialize a stack variable before use

Arnaldo Carvalho de Melo (26):
      perf record: Allow specifying a pid to record
      perf_counter: First part of 'perf report' conversion to C + elfutils
      perf_counter: Implement dso__load using libelf
      perf_counter: Use rb_trees in perf report
      perf_counter: Add our private copy of list.h
      perf_counter: Use rb_tree for symhists and threads in report
      perf report: Fix kernel symbol resolution
      perf: Don't assume /proc/kallsyms is ordered
      perf report: Sort output by symbol usage
      perf report: Use hex2long instead of sscanf
      perf report: Only load text symbols from kallsyms
      perf report: Show the IP only in --verbose mode
      perf_counter tools: Move symbol resolution classes from report to libperf
      perf_counter tools: struct symbol priv area
      perf_counter tools: Consolidate dso methods to load kernel symbols
      perf_counter tools: Optionally pass a symbol filter to the dso load routines
      perf_counter tools: Convert builtin-top to use libperf symbol routines
      perf_counter tools: Shorten the DSO names using cwd
      perf_counter tools: Add locking to perf top
      perf_counter tools: Add string.[ch]
      perf_counter tools: Use hex2u64 in more places
      perf_counter tools: Add missing rb_erase in dso__delete_symbols
      perf_counter tools: Cover PLT symbols too
      perf_counter tools: Fix off-by-one bug in symbol__new
      perf report: Fix rbtree bug
      perf report: Add -vvv to print the list of threads and its mmaps

Erdem Aktas (1):
      perf_counter tools: fix buffer overwrite problem for perf top command

Eric Paris (2):
      mutex: add atomic_dec_and_mutex_lock()
      mutex: add atomic_dec_and_mutex_lock()

Frederic Weisbecker (3):
      perf_counter: Sleep before refresh using poll in perf top
      perf_counter tools: Fix warn_unused_result warnings
      perf top: Fix zero or negative refresh delay

H. Peter Anvin (1):
      mutex: drop "inline" from mutex_lock() inside kernel/mutex.c

Hidetoshi Seto (1):
      x86: smarten /proc/interrupts output for new counters

Ingo Molnar (183):
      performance counters: documentation
      performance counters: x86 support
      x86, perfcounters: read out MSR_CORE_PERF_GLOBAL_STATUS with counters disabled
      perfcounters: select ANON_INODES
      perfcounters, x86: simplify disable/enable of counters
      perfcounters, x86: clean up debug code
      perfcounters: consolidate global-disable codepaths
      perf counters: restructure the API
      perf counters: add support for group counters
      perf counters: group counter, fixes
      perf counters: hw driver API
      perf counters: implement PERF_COUNT_CPU_CLOCK
      perf counters: consolidate hw_perf save/restore APIs
      perf counters: implement PERF_COUNT_TASK_CLOCK
      perf counters: add prctl interface to disable/enable counters
      perf counters: clean up state transitions
      perf counters: update docs
      x86: implement atomic64_t on 32-bit
      perfcounters: restructure x86 counter math
      perfcounters: implement "counter inheritance"
      perfcounters: fix task clock counter
      perfcounters: add context switch counter
      perfcounters: add task migrations counter
      perfcounters: add nr-of-faults counter
      perfcounters: fix non-intel-perfmon CPUs
      perfcounters, x86: fix sw counters on non-PMC CPUs
      perfcounters: fix lapic initialization
      perfcounters: release CPU context when exiting task counters
      perfcounters: flush on setuid exec
      perfcounters: use hw_event.disable flag
      perfcounters: remove warnings
      perfcounters: tweak group scheduling
      x86, perfcounters: rename intel_arch_perfmon.h => perf_counter.h
      x86, perfcounters: prepare for fixed-mode PMCs
      perfcounters: add fixed-mode PMC enumeration
      x86, perfcounters: refactor code for fixed-function PMCs
      perfcounters: hw ops rename
      perfcounters: fix task clock counter
      perfcounters: pull inherited counters
      perfcounters: fix init context lock
      perfcounters: enable lowlevel pmc code to schedule counters
      x86, perfcounters: print out the ->used bitmask
      perfcounters: remove ->nr_inherited
      perfcounters: generalize the counter scheduler
      perfcounters: add PERF_COUNT_BUS_CYCLES
      x86, perfcounters: add support for fixed-function pmcs
      perfcounters: include asm/perf_counter.h only if CONFIG_PERF_COUNTERS=y
      perfcounters: fix "perf counters kills oprofile" bug, v2
      perfcounters: remove duplicate definition of LOCAL_PERF_VECTOR
      perfcounters: fix acpi_idle_do_entry() workaround
      perfcounters: fix reserved bits sizing
      perf_counter: fix crash on perfmon v1 systems
      perf_counter: create Documentation/perf_counter/ and move perfcounters.txt there
      perf_counter: add sample user-space to Documentation/perf_counter/
      perf_counter tools: tidy up in-kernel dependencies
      perf_counter tools: fix build warning in kerneltop.c
      perf_counter tools: increase cpu-cycles again
      x86, perfcounters: add atomic64_xchg()
      perf_counter: fix off task->comm by one
      perf_counter tools: include PID in perf-report output, tweak user/kernel printut
      perf_counter: copy in Git's top Makefile
      perf_counter tools: add in basic glue from Git
      perf_counter tools: clean up after introduction of the Git command framework
      perf_counter tools: separate kerneltop into 'perf top' and 'perf stat'
      perf_counter tools: add help texts
      perf_counter tools: add 'perf record' command
      perf_counter tools: fix --version
      perf_counter tools: add 'perf help'
      perf_counter tools: fix 'make install'
      perfcounters, sched: remove __task_delta_exec()
      perf_counter tools: move helper library to util/*
      perf_counter: add/update copyrights
      perf_counter tools: add perf-report to the Makefile
      perf_counter tools: perf stat: make -l default-on
      perf_counter tools: fix infinite loop in perf-report on zeroed event records
      perf_counter tools: fix x86 syscall numbers
      perf_counter: round-robin per-CPU counters too
      perf_counter: initialize the per-cpu context earlier
      perf_counter: convert perf_resource_mutex to a spinlock
      perf_counter: fix fixed-purpose counter support on v2 Intel-PERFMON
      perf_counter tools: remove debug code from builtin-stat.c
      perf_counter: x86: Fix throttling
      perf_counter: x86: Disallow interval of 1
      perf_counter: x86: Protect against infinite loops in intel_pmu_handle_irq()
      perf_counter: Remove ACPI quirk
      perf stat: handle Ctrl-C
      perf_counter: fix threaded task exit
      perf_counter, x86: fix zero irq_period counters
      perf_counter, x86: speed up the scheduling fast-path
      perf_counter: fix counter freeing logic
      perf_counter: fix counter inheritance race
      perf_counter: Fix context removal deadlock
      perf_counter: fix !PERF_COUNTERS build failure
      perf_counter tools: increase limits
      perf_counter: Increase mmap limit
      perf_counter tools: increase limits, fix
      perf_counter: Move child perfcounter init to after scheduler init
      perf stat: flip around ':k' and ':u' flags
      Revert "perf_counter, x86: speed up the scheduling fast-path"
      perf_counter: fix warning & lockup
      perf_counter, x86: Fix APIC NMI programming
      perf_counter, x86: Make NMI lockups more robust
      perf_counter: Initialize ->oncpu properly
      perf record: Straighten out argv types
      perf stat: Remove unused variable
      perf record: Convert to Git option parsing
      perf_counter tools: Librarize event string parsing
      perf stat: Convert to Git option parsing
      perf top: Convert to Git option parsing
      perf_counter tools: remove the standalone perf-report utility
      perf record: Convert to Git option parsing
      perf report: Add help/manpage
      perf report: add --dump-raw-trace option
      perf report: add counter for unknown events
      perf report: add more debugging
      perf report: Only load text symbols from kallsyms, fix
      perf_counter tools: Introduce stricter C code checking
      perf_counter tools: Rename output.perf to perf.data
      perf_counter tools: Add built-in pager support
      perf report: Remove <ctype.h> include
      pref_counter: tools: report: Add header printout & prettify
      pref_counter: tools: report: Robustify in case of weird events
      perf_counter: Fix perf_counter_init_task() on !CONFIG_PERF_COUNTERS
      perf_counter tools: report: Add help text for --sort
      perf_counter tools: Clean up builtin-stat.c's do_perfstat()
      perf_counter tools: Split display into reading and printing
      perf_counter tools: Also display time-normalized stat results
      perf_counter: Fix cpuctx->task_ctx races
      perf_counter: Robustify counter-free logic
      perf_counter tools: Print 'CPU utilization factor' in builtin-stat
      perf_counter tools: Fix 'make install'
      perf_counter tools: Generate per command manpages (and pdf/html, etc.)
      perf_counter tools: Fix unknown command help text
      perf_counter: Tidy up style details
      perf report: Clean up the default output
      perf report: Fix column width/alignment of dsos
      perf record: Add --append option
      perf record: Increase mmap buffering default
      perf report: Print more info instead of <unknown> entries
      perf_counter tools: Make source code headers more coherent
      perf record: Print out the number of events captured
      perf report: Print -D to stdout
      perf report: Improve sort key recognition
      perf report: Handle vDSO symbols properly
      perf_counter tools: Clean up old kerneltop references
      perf record: Refine capture printout
      perf report: Display 100% correctly
      perf stat: Print out all arguments
      perf report: Add front-entry cache for lookups
      perf help: Fix bug when there's no perf-* command around
      perf_counter tools: Optimize harder
      perf_counter tools: Work around warnings in older GCCs
      perf_counter: Fix throttling lock-up
      perf report: Clean up event processing
      perf report: Split out event processing helpers
      perf report: Handle all known event types
      perf top: Reduce default filter threshold
      perf record/report: Fix PID/COMM handling
      perf_counter tools: Build with native optimization
      perf_counter tools: Print out symbol parsing errors only if --verbose
      perf report: Print out the total number of events
      perf_counter tools: Add color terminal output support
      perf_counter tools: Dont output in color on !tty
      perf report: Bail out if there are unrecognized options/arguments
      perf stat: Update help text
      perf record: Split out counter creation into a helper function
      perf record, top: Implement --freq
      perf report: Display user/kernel differentiator
      perf_counter tools: Clarify events/samples naming
      perf_counter tools: Remove -march=native
      perf_counter tools: Sample and display frequency adjustment changes
      perf record: Set frequency correctly
      perf_counter: Separate out attr->type from attr->config
      perf_counter: Implement generalized cache event types
      perf_counter tools: Fix cache-event printout
      perf_counter tools: Uniform help printouts
      perf_counter tools: Tidy up manpage details
      perf_counter tools: Prepare for 'perf annotate'
      perf_counter tools: Add 'perf annotate' feature
      perf_counter tools: Move from Documentation/perf_counter/ to tools/perf/
      perf_counter tools: Fix error condition in parse_aliases()
      perf annotate: Automatically pick up vmlinux in the local directory
      perf annotate: Fix command line help text

Jaswinder Singh (1):
      x86: perf_counter.c intel_perfmon_event_map and max_intel_perfmon_events should be static

Jaswinder Singh Rajput (7):
      x86: perf_counter remove unwanted hw_perf_enable_all
      x86: irqinit_32.c fix compilation warning
      x86: prepare perf_counter to add more cpus
      x86: AMD Support for perf_counter
      x86: decent declarations in perf_counter.c
      x86: use pr_info in perf_counter.c
      x86: perf_counter cleanup

Luis Henriques (2):
      perf_counter: fix alignment in /proc/interrupts
      locking, rtmutex.c: Documentation cleanup

Mike Galbraith (23):
      perfcounters: throttle on too high IRQ rates
      perfcounters: ratelimit performance counter interrupts
      perfcounters fix section mismatch warning in perf_counter.c::perf_counters_lapic_init()
      perfcounters: fix refcounting bug
      perfcounters: fix "perf counters kill oprofile" bug
      perf_counters: account NMI interrupts
      perfcounters: fix use after free in perf_release()
      perf_counter tools: kerneltop: add real-time data acquisition thread
      perf_counter tools: kerneltop: display per function percentage along with event count
      perf_counter tools: fix build error
      perf_counter, x86: clean up throttling printk
      perf top: fix segfault
      perf top: Reduce display overhead
      perf top: Remove leftover NMI/IRQ bits
      perf top: fix typo in -d option
      perf record: Fix the profiling of existing pid or whole box
      perf_counter tools: Document '--' option parsing terminator
      perf_counter tools: Fix top symbol table dump typo
      perf_counter tools: Fix top symbol table max_ip typo
      perf_counter tools: Guard against record damaging existing files
      perf_counter tools: Make .gitignore reflect perf_counter tools files
      perf_counter tools: Cleanup Makefile
      perf_counter tools: Fix uninitialized variable in perf-report.c

Paul Mackerras (62):
      perf_counter: Fix return value from dummy hw_perf_counter_init
      perf_counter: Fix the cpu_clock software counter
      perf_counter: Add optional hw_perf_group_sched_in arch function
      perf_counter: Add dummy perf_counter_print_debug function
      powerpc/perf_counter: Add perf_counter system call on powerpc
      powerpc: Provide a way to defer perf counter work until interrupts are enabled
      powerpc/perf_counter: Add generic support for POWER-family PMU hardware
      powerpc/perf_counter: Add support for PPC970 family
      powerpc/perf_counter: Add support for POWER6
      perf_counter: Always schedule all software counters in
      powerpc/perf_counter: Make sure PMU gets enabled properly
      perf_counter: Add support for pinned and exclusive counter groups
      perf_counter: Add counter enable/disable ioctls
      perf_counters: make software counters work as per-cpu counters
      perf_counters: allow users to count user, kernel and/or hypervisor events
      perfcounters: fix refcounting bug, take 2
      perfcounters: make context switch and migration software counters work again
      perfcounters/powerpc: Make exclude_kernel bit work on Apple G5 processors
      perfcounters/powerpc: Add support for POWER5 processors
      perfcounters: fix a few minor cleanliness issues
      perfcounters: provide expansion room in the ABI
      perfcounters/powerpc: fix oops with multiple counters in a group
      perfcounters/powerpc: add support for POWER5+ processors
      perfcounters/powerpc: add support for POWER4 processors
      perf_counter: abstract wakeup flag setting in core to fix powerpc build
      perf_counter: powerpc: clean up perc_counter_interrupt
      perf_counter: fix type/event_id layout on big-endian systems
      perf_counter: add an mmap method to allow userspace to read hardware counters
      perf_counter tools: remove glib dependency and fix bugs in kerneltop.c
      perf_counter: update documentation
      perf_counter: record time running and time enabled for each counter
      perf_counter: powerpc: only reserve PMU hardware when we need it
      perf_counter: make it possible for hw_perf_counter_init to return error codes
      perf_counter tools: optionally scale counter values in perfstat mode
      perf_counter: fix powerpc build
      perf_counter: powerpc: set sample enable bit for marked instruction events
      perf_counter: add MAINTAINERS entry
      perf_counter: powerpc: add nmi_enter/nmi_exit calls
      perf_counter: powerpc: allow use of limited-function counters
      perf_counter: update copyright notice
      perf_counter: Put whole group on when enabling group leader
      perf_counter: don't count scheduler ticks as context switches
      perf_counter: call atomic64_set for counter->count
      perf_counter: call hw_perf_save_disable/restore around group_sched_in
      perf_counter: powerpc: use u64 for event codes internally
      perf_counter: allow arch to supply event misc flags and instruction pointer
      perf_counter: powerpc: supply more precise information on counter overflow events
      perf_counter: powerpc: initialize cpuhw pointer before use
      perf_counter: Dynamically allocate tasks' perf_counter_context struct
      perf_counter: Optimize context switch between identical inherited contexts
      perf_counter: powerpc: Implement interrupt throttling
      perf_counter: Fix race in attaching counters to tasks and exiting
      perf_counter: Don't swap contexts containing locked mutex
      perf_counter: Provide functions for locking and pinning the context for a task
      perf_counter: Allow software counters to count while task is not running
      perf_counter: Initialize per-cpu context earlier on cpu up
      perf_counter: Fix cpu migration counter
      perf_counter: Remove unused prev_state field
      perf_counter: powerpc: Fix event alternative code generation on POWER5/5+
      perf_counter: powerpc: Fix race causing "oops trying to read PMC0" errors
      perf_counter: powerpc: Use new identifier names in powerpc-specific code
      perf_counter: Fix lockup with interrupting counters

Peter Zijlstra (169):
      perfcounters: IRQ and NMI support on AMD CPUs
      perfcounters: IRQ and NMI support on AMD CPUs, fix
      x86: perf_counter cleanup
      perf_counter: x86: fix 32-bit irq_period assumption
      perf_counter: use list_move_tail()
      perf_counter: add comment to barrier
      perf_counter: x86: use ULL postfix for 64bit constants
      perf_counter: software counter event infrastructure
      perf_counter: provide pagefault software events
      perf_counter: provide major/minor page fault software events
      perf_counter: hrtimer based sampling for software time events
      perf_counter: add an event_list
      perf_counter: fix hrtimer sampling
      perf_counter: fix uninitialized usage of event_list
      perf_counter: generic context switch event
      perf_counter: fix up counter free paths
      perf_counter: hook up the tracepoint events
      perf_counter: revamp syscall input ABI
      perf_counter: unify irq output code
      perf_counter: remove the event config bitfields
      perf_counter: avoid recursion
      perf_counter: new output ABI - part 1
      perf_counter tools: update to new syscall ABI
      perf_counter tools: use mmap() output
      perf_counter tools: remove glib dependency and fix bugs in kerneltop.c, fix poll()
      perf_counter: fix perf_poll()
      perf_counter: more elaborate write API
      perf_counter: output objects
      perf_counter: sanity check on the output API
      perf_counter: optionally provide the pid/tid of the sampled task
      perf_counter: kerneltop: mmap_pages argument
      perf_counter: kerneltop: output event support
      perf_counter: allow and require one-page mmap on counting counters
      perf_counter: unify and fix delayed counter wakeup
      perf_counter: fix update_userpage()
      perf_counter: kerneltop: simplify data_head read
      perf_counter: executable mmap() information
      perf_counter: kerneltop: parse the mmap data stream
      perf_counter: x86: proper error propagation for the x86 hw_perf_counter_init()
      perf_counter: small cleanup of the output routines
      perf_counter: re-arrange the perf_event_type
      perf_counter tools: kerneltop: update event_types
      perf_counter: provide generic callchain bits
      perf_counter: x86: callchain support
      perf_counter: pmc arbitration
      perf_counter: move the event overflow output bits to record_type
      perf_counter: per event wakeups
      perf_counter: kerneltop: update to new ABI
      perf_counter: add more context information
      perf_counter: update mmap() counter read
      perf_counter: update mmap() counter read, take 2
      perf_counter: add more context information
      perf_counter: SIGIO support
      perf_counter: generalize pending infrastructure
      perf_counter: x86: self-IPI for pending work
      perf_counter: theres more to overflow than writing events
      perf_counter: fix the mlock accounting
      perf_counter: PERF_RECORD_TIME
      perf_counter: counter overflow limit
      perf_counter: comment the perf_event_type stuff
      perf_counter: change event definition
      perf_counter: rework context time
      perf_counter: rework the task clock software counter
      perf_counter: remove rq->lock usage
      perf_counter: minimize context time updates
      perf_counter: fix NMI race in task clock
      perf_counter: provide misc bits in the event header
      perf_counter: use misc field to widen type
      perf_counter: kerneltop: keep up with ABI changes
      perf_counter: add some comments
      perf_counter: track task-comm data
      perf_counter: some simple userspace profiling
      perf_counter: move PERF_RECORD_TIME
      perf_counter: allow for data addresses to be recorded
      perf_counter: optimize mmap/comm tracking
      perf_counter: sysctl for system wide perf counters
      perf_counter: log full path names
      perf_counter tools: fix Documentation/perf_counter build error
      perf_counter: fix race in perf_output_*
      perf_counter: fix nmi-watchdog interaction
      perf_counter: tool: handle 0-length data files
      perf_counter: documentation update
      perf_counter: x86: fixup nmi_watchdog vs perf_counter boo-boo
      perf_counter: uncouple data_head updates from wakeups
      perf_counter: add ioctl(PERF_COUNTER_IOC_RESET)
      perf_counter: provide an mlock threshold
      perf_counter: fix the output lock
      perf_counter: inheritable sample counters
      perf_counter: tools: update the tools to support process and inherited counters
      perf_counter: optimize perf_counter_task_tick()
      perf_counter: rework ioctl()s
      perf_counter: add PERF_RECORD_CONFIG
      perf_counter: add PERF_RECORD_CPU
      perf_counter: fix print debug irq disable
      perf_counter: x86: More accurate counter update
      perf_counter: x86: Allow unpriviliged use of NMIs
      perf_counter: Fix perf_output_copy() WARN to account for overflow
      perf_counter: x86: Fix up the amd NMI/INT throttle
      perf_counter: Rework the perf counter disable/enable
      perf_counter: x86: Robustify interrupt handling
      perf_counter: remove perf_disable/enable exports
      perf_counter: per user mlock gift
      perf_counter: frequency based adaptive irq_period
      perf top: update to use the new freq interface
      perf_counter: frequency based adaptive irq_period, 32-bit fix
      perf_counter: Fix inheritance cleanup code
      perf_counter: Fix counter inheritance
      perf_counter: Solve the rotate_ctx vs inherit race differently
      perf_counter: Log irq_period changes
      perf_counter: Optimize disable of time based sw counters
      perf_counter: Optimize sched in/out of counters
      perf_counter: Fix dynamic irq_period logging
      perf_counter: Sanitize counter->mutex
      perf_counter: Sanitize context locking
      perf_counter: Fix userspace build
      perf_counter: Simplify context cleanup
      perf_counter: Change pctrl() behaviour
      perf_counter: Remove perf_counter_context::nr_enabled
      perf_counter: Fix perf-$cmd invokation
      perf_counter: Remove unused ABI bits
      perf_counter: Make pctrl() affect inherited counters too
      perf_counter: Propagate inheritance failures down the fork() path
      perf_counter: Fix PERF_COUNTER_CONTEXT_SWITCHES for cpu counters
      perf_counter: x86: Expose INV and EDGE bits
      perf_counter: x86: Remove interrupt throttle
      perf_counter: Generic per counter interrupt throttle
      perf report: Fix segfault on unknown symbols
      perf report: Fix ELF symbol parsing
      perf report: More robust error handling
      perf_counter: tools: /usr/lib/debug%s.debug support
      perf_counter: tools: report: Add vmlinux support
      perf_counter: tools: report: Rework histogram code
      perf_counter: tools: report: Dynamic sort/print bits
      pref_counter: tools: report: Add --sort option
      perf_counter: tools: report: Add comm sorting
      pref_counter: tools: report: Add dso sorting
      perf_counter tools: report: Implement header output for --sort variants
      perf_counter: Fix COMM and MMAP events for cpu wide counters
      perf_counter: Clean up task_ctx vs interrupts
      perf_counter: Ammend cleanup in fork() fail
      perf_counter: Use PID namespaces properly
      perf_counter: tools: Expand the COMM,MMAP event synthesizer
      perf_counter: tools: Better handle existing data files
      perf_counter tools: Remove the last nmi bits
      x86: Fix atomic_long_xchg() on 64bit
      perf_counter: Add unique counter id
      perf_counter: Rename various fields
      perf_counter: Remove the last nmi/irq bits
      perf_counter: x86: Emulate longer sample periods
      perf_counter: Change data head from u32 to u64
      perf_counter: Add ioctl for changing the sample period/frequency
      perf_counter: Rename perf_counter_hw_event => perf_counter_attr
      perf_counter tools: Fix up the ABI shakeup
      perf report: Separate out idle threads
      perf_counter: Add a comm hook for pure fork()s
      perf record: Use long arg for counter period
      perf report: Fix comm sorting
      perf_counter: Fix race in counter initialization
      perf report: Simplify symbol output
      perf report: Add consistent spacing rules
      perf_counter: Add fork event
      perf_counter: Remove munmap stuff
      perf_counter tools: Use fork and remove munmap events
      x86: Set context.vdso before installing the mapping
      perf_counter: Generate mmap events for install_special_mapping()
      perf report: Deal with maps
      perf_counter: Change PERF_SAMPLE_CONFIG into PERF_SAMPLE_ID
      perf_counter: Add PERF_SAMPLE_PERIOD
      perf_counter: Fix frequency adjustment for < HZ

Robert Richter (30):
      perf_counter, x86: remove X86_FEATURE_ARCH_PERFMON flag for AMD cpus
      perf_counter, x86: declare perf_max_counters only for CONFIG_PERF_COUNTERS
      perf_counter, x86: add default path to cpu detection
      perf_counter, x86: rework pmc_amd_save_disable_all() and pmc_amd_restore_all()
      perf_counter, x86: protect per-cpu variables with compile barriers only
      perfcounters: rename struct hw_perf_counter_ops into struct pmu
      perf_counter, x86: rename struct pmc_x86_ops into struct x86_pmu
      perf_counter, x86: make interrupt handler model specific
      perf_counter, x86: remove get_status() from struct x86_pmu
      perf_counter, x86: remove ack_status() from struct x86_pmu
      perf_counter, x86: rename __hw_perf_counter_set_period into x86_perf_counter_set_period
      perf_counter, x86: rename intel only functions
      perf_counter, x86: modify initialization of struct x86_pmu
      perf_counter, x86: make x86_pmu data a static struct
      perf_counter, x86: move counter parameters to struct x86_pmu
      perf_counter, x86: make pmu version generic
      perf_counter, x86: make x86_pmu_read() static inline
      perf_counter, x86: rename cpuc->active_mask
      perf_counter, x86: generic use of cpuc->active
      perf_counter, x86: consistent use of type int for counter index
      perf_counter, x86: rework counter enable functions
      perf_counter, x86: rework counter disable functions
      perf_counter, x86: change and remove pmu initialization checks
      perf_counter, x86: implement the interrupt handler for AMD cpus
      perf_counter, x86: return raw count with x86_perf_counter_update()
      perf_counter, x86: introduce max_period variable
      perf_counter, x86: remove vendor check in fixed_mode_idx()
      perf_counter, x86: remove unused function argument in intel_pmu_get_status()
      perf_counter: update 'perf top' documentation
      perf_counter,
x86: rename bitmasks to ->used_mask and ->active_mask Steven Whitehouse (1): perfcounters: export perf_tpcounter_event Thomas Gleixner (14): performance counters: core code perf counters: protect them against CSTATE transitions perf counters: clean up 'raw' type API perf counters: expand use of counter->event signals: split do_tkill signals: implement sys_rt_tgsigqueueinfo x86: hookup sys_rt_tgsigqueueinfo perf_counter tools: remove build generated files perfcounter tools: move common defines ... to local header file perfcounter tools: make rdclock an inline function perfcounter tools: fix pointer mismatch perfcounter tools: get the syscall number from arch/*/include/asm/unistd.h perf_counter tools: Add 'perf list' to list available events perf_counter tools: Add help for perf list Tim Blechmann (1): perf_counter: include missing header Wu Fengguang (9): perf_counter tools: Merge common code into perfcounters.h perf_counter tools: Move perfstat supporting code into perfcounters.h perf_counter tools: support symbolic event names in kerneltop perf_counter tools: Reuse event_name() in kerneltop perf_counter tools: move remaining code into kerneltop.c perf_counter tools: fix comment for sym_weight() perf_counter tools: fix event_id type perf_counter tools: cut down default count for cpu-cycles perf_counter tools: when no command is feed to perfstat, display help and exit Yinghai Lu (2): perf_counter: more barrier in blank weak function x86: make irqinit_32.c more like irqinit_64.c, v2 Yong Wang (4): perf_counter/x86: Always use NMI for performance-monitoring interrupt perf_counter/x86: Remove the IRQ (non-NMI) handling bits perf_counter: Documentation update perf_counter tools: Fix incorrect printf formats MAINTAINERS | 10 + arch/powerpc/include/asm/hw_irq.h | 39 + arch/powerpc/include/asm/paca.h | 1 + arch/powerpc/include/asm/perf_counter.h | 95 + arch/powerpc/include/asm/reg.h | 2 + arch/powerpc/include/asm/systbl.h | 2 +- arch/powerpc/include/asm/unistd.h | 1 + 
 arch/powerpc/kernel/Makefile | 2 +
 arch/powerpc/kernel/asm-offsets.c | 1 +
 arch/powerpc/kernel/entry_64.S | 9 +
 arch/powerpc/kernel/irq.c | 5 +
 arch/powerpc/kernel/perf_counter.c | 1214 ++++++
 arch/powerpc/kernel/power4-pmu.c | 557 +++
 arch/powerpc/kernel/power5+-pmu.c | 630 ++++
 arch/powerpc/kernel/power5-pmu.c | 570 +++
 arch/powerpc/kernel/power6-pmu.c | 490 +++
 arch/powerpc/kernel/ppc970-pmu.c | 441 +++
 arch/powerpc/mm/fault.c | 10 +-
 arch/powerpc/platforms/Kconfig.cputype | 1 +
 arch/x86/Kconfig | 1 +
 arch/x86/ia32/ia32entry.S | 4 +-
 arch/x86/include/asm/atomic_32.h | 236 ++
 arch/x86/include/asm/entry_arch.h | 2 +-
 arch/x86/include/asm/hardirq.h | 2 +
 arch/x86/include/asm/hw_irq.h | 2 +
 arch/x86/include/asm/intel_arch_perfmon.h | 31 -
 arch/x86/include/asm/irq_vectors.h | 8 +-
 arch/x86/include/asm/perf_counter.h | 100 +
 arch/x86/include/asm/unistd_32.h | 2 +
 arch/x86/include/asm/unistd_64.h | 5 +-
 arch/x86/kernel/apic/apic.c | 3 +
 arch/x86/kernel/cpu/Makefile | 12 +-
 arch/x86/kernel/cpu/common.c | 2 +
 arch/x86/kernel/cpu/perf_counter.c | 1417 +++++++
 arch/x86/kernel/cpu/perfctr-watchdog.c | 4 +-
 arch/x86/kernel/entry_64.S | 5 +
 arch/x86/kernel/irq.c | 10 +
 arch/x86/kernel/irqinit_32.c | 59 +-
 arch/x86/kernel/irqinit_64.c | 12 +-
 arch/x86/kernel/signal.c | 1 -
 arch/x86/kernel/syscall_table_32.S | 2 +
 arch/x86/kernel/traps.c | 15 +-
 arch/x86/mm/fault.c | 12 +-
 arch/x86/oprofile/nmi_int.c | 7 +-
 arch/x86/oprofile/op_model_ppro.c | 10 +-
 arch/x86/vdso/vdso32-setup.c | 6 +-
 arch/x86/vdso/vma.c | 7 +-
 drivers/char/sysrq.c | 2 +
 fs/exec.c | 9 +
 include/asm-generic/atomic.h | 2 +-
 include/linux/compat.h | 2 +
 include/linux/init_task.h | 10 +
 include/linux/kernel_stat.h | 5 +
 include/linux/mutex.h | 1 +
 include/linux/perf_counter.h | 685 ++++
 include/linux/prctl.h | 3 +
 include/linux/sched.h | 21 +-
 include/linux/signal.h | 2 +
 include/linux/syscalls.h | 5 +
 init/Kconfig | 35 +
 kernel/Makefile | 1 +
 kernel/compat.c | 11 +
 kernel/exit.c | 16 +-
 kernel/fork.c | 12 +
 kernel/mutex.c | 27 +-
 kernel/perf_counter.c | 4160 +++++++++++++++++++++
 kernel/rtmutex.c | 8 +-
 kernel/sched.c | 57 +-
 kernel/signal.c | 56 +-
 kernel/sys.c | 7 +
 kernel/sys_ni.c | 3 +
 kernel/sysctl.c | 27 +
 kernel/timer.c | 3 +
 mm/mmap.c | 5 +
 tools/perf/.gitignore | 16 +
 tools/perf/Documentation/Makefile | 300 ++
 tools/perf/Documentation/asciidoc.conf | 91 +
 tools/perf/Documentation/manpage-1.72.xsl | 14 +
 tools/perf/Documentation/manpage-base.xsl | 35 +
 tools/perf/Documentation/manpage-bold-literal.xsl | 17 +
 tools/perf/Documentation/manpage-normal.xsl | 13 +
 tools/perf/Documentation/manpage-suppress-sp.xsl | 21 +
 tools/perf/Documentation/perf-annotate.txt | 29 +
 tools/perf/Documentation/perf-help.txt | 38 +
 tools/perf/Documentation/perf-list.txt | 25 +
 tools/perf/Documentation/perf-record.txt | 42 +
 tools/perf/Documentation/perf-report.txt | 26 +
 tools/perf/Documentation/perf-stat.txt | 66 +
 tools/perf/Documentation/perf-top.txt | 39 +
 tools/perf/Documentation/perf.txt | 24 +
 tools/perf/Makefile | 929 +++++
 tools/perf/builtin-annotate.c | 1355 +++++++
 tools/perf/builtin-help.c | 461 +++
 tools/perf/builtin-list.c | 20 +
 tools/perf/builtin-record.c | 544 +++
 tools/perf/builtin-report.c | 1291 +++++++
 tools/perf/builtin-stat.c | 339 ++
 tools/perf/builtin-top.c | 692 ++++
 tools/perf/builtin.h | 26 +
 tools/perf/command-list.txt | 10 +
 tools/perf/design.txt | 442 +++
 tools/perf/perf.c | 428 +++
 tools/perf/perf.h | 67 +
 tools/perf/util/PERF-VERSION-GEN | 42 +
 tools/perf/util/abspath.c | 117 +
 tools/perf/util/alias.c | 77 +
 tools/perf/util/cache.h | 119 +
 tools/perf/util/color.c | 241 ++
 tools/perf/util/color.h | 36 +
 tools/perf/util/config.c | 873 +++++
 tools/perf/util/ctype.c | 26 +
 tools/perf/util/environment.c | 9 +
 tools/perf/util/exec_cmd.c | 165 +
 tools/perf/util/exec_cmd.h | 13 +
 tools/perf/util/generate-cmdlist.sh | 24 +
 tools/perf/util/help.c | 367 ++
 tools/perf/util/help.h | 29 +
 tools/perf/util/levenshtein.c | 84 +
 tools/perf/util/levenshtein.h | 8 +
 tools/perf/util/list.h | 603 +++
 tools/perf/util/pager.c | 99 +
 tools/perf/util/parse-events.c | 316 ++
 tools/perf/util/parse-events.h | 17 +
 tools/perf/util/parse-options.c | 508 +++
 tools/perf/util/parse-options.h | 174 +
 tools/perf/util/path.c | 353 ++
 tools/perf/util/quote.c | 481 +++
 tools/perf/util/quote.h | 68 +
 tools/perf/util/rbtree.c | 383 ++
 tools/perf/util/rbtree.h | 171 +
 tools/perf/util/run-command.c | 395 ++
 tools/perf/util/run-command.h | 93 +
 tools/perf/util/sigchain.c | 52 +
 tools/perf/util/sigchain.h | 11 +
 tools/perf/util/strbuf.c | 359 ++
 tools/perf/util/strbuf.h | 137 +
 tools/perf/util/string.c | 34 +
 tools/perf/util/string.h | 8 +
 tools/perf/util/symbol.c | 576 +++
 tools/perf/util/symbol.h | 47 +
 tools/perf/util/usage.c | 80 +
 tools/perf/util/util.h | 410 ++
 tools/perf/util/wrapper.c | 206 +
 143 files changed, 26321 insertions(+), 122 deletions(-)
 create mode 100644 arch/powerpc/include/asm/perf_counter.h
 create mode 100644 arch/powerpc/kernel/perf_counter.c
 create mode 100644 arch/powerpc/kernel/power4-pmu.c
 create mode 100644 arch/powerpc/kernel/power5+-pmu.c
 create mode 100644 arch/powerpc/kernel/power5-pmu.c
 create mode 100644 arch/powerpc/kernel/power6-pmu.c
 create mode 100644 arch/powerpc/kernel/ppc970-pmu.c
 delete mode 100644 arch/x86/include/asm/intel_arch_perfmon.h
 create mode 100644 arch/x86/include/asm/perf_counter.h
 create mode 100644 arch/x86/kernel/cpu/perf_counter.c
 create mode 100644 include/linux/perf_counter.h
 create mode 100644 kernel/perf_counter.c
 create mode 100644 tools/perf/.gitignore
 create mode 100644 tools/perf/Documentation/Makefile
 create mode 100644 tools/perf/Documentation/asciidoc.conf
 create mode 100644 tools/perf/Documentation/manpage-1.72.xsl
 create mode 100644 tools/perf/Documentation/manpage-base.xsl
 create mode 100644 tools/perf/Documentation/manpage-bold-literal.xsl
 create mode 100644 tools/perf/Documentation/manpage-normal.xsl
 create mode 100644 tools/perf/Documentation/manpage-suppress-sp.xsl
 create mode 100644 tools/perf/Documentation/perf-annotate.txt
 create mode 100644 tools/perf/Documentation/perf-help.txt
 create mode 100644 tools/perf/Documentation/perf-list.txt
 create mode 100644 tools/perf/Documentation/perf-record.txt
 create mode 100644 tools/perf/Documentation/perf-report.txt
 create mode 100644 tools/perf/Documentation/perf-stat.txt
 create mode 100644 tools/perf/Documentation/perf-top.txt
 create mode 100644 tools/perf/Documentation/perf.txt
 create mode 100644 tools/perf/Makefile
 create mode 100644 tools/perf/builtin-annotate.c
 create mode 100644 tools/perf/builtin-help.c
 create mode 100644 tools/perf/builtin-list.c
 create mode 100644 tools/perf/builtin-record.c
 create mode 100644 tools/perf/builtin-report.c
 create mode 100644 tools/perf/builtin-stat.c
 create mode 100644 tools/perf/builtin-top.c
 create mode 100644 tools/perf/builtin.h
 create mode 100644 tools/perf/command-list.txt
 create mode 100644 tools/perf/design.txt
 create mode 100644 tools/perf/perf.c
 create mode 100644 tools/perf/perf.h
 create mode 100755 tools/perf/util/PERF-VERSION-GEN
 create mode 100644 tools/perf/util/abspath.c
 create mode 100644 tools/perf/util/alias.c
 create mode 100644 tools/perf/util/cache.h
 create mode 100644 tools/perf/util/color.c
 create mode 100644 tools/perf/util/color.h
 create mode 100644 tools/perf/util/config.c
 create mode 100644 tools/perf/util/ctype.c
 create mode 100644 tools/perf/util/environment.c
 create mode 100644 tools/perf/util/exec_cmd.c
 create mode 100644 tools/perf/util/exec_cmd.h
 create mode 100755 tools/perf/util/generate-cmdlist.sh
 create mode 100644 tools/perf/util/help.c
 create mode 100644 tools/perf/util/help.h
 create mode 100644 tools/perf/util/levenshtein.c
 create mode 100644 tools/perf/util/levenshtein.h
 create mode 100644 tools/perf/util/list.h
 create mode 100644 tools/perf/util/pager.c
 create mode 100644 tools/perf/util/parse-events.c
 create mode 100644 tools/perf/util/parse-events.h
 create mode 100644 tools/perf/util/parse-options.c
 create mode 100644 tools/perf/util/parse-options.h
 create mode 100644 tools/perf/util/path.c
 create mode 100644 tools/perf/util/quote.c
 create mode 100644 tools/perf/util/quote.h
 create mode 100644 tools/perf/util/rbtree.c
 create mode 100644 tools/perf/util/rbtree.h
 create mode 100644 tools/perf/util/run-command.c
 create mode 100644 tools/perf/util/run-command.h
 create mode 100644 tools/perf/util/sigchain.c
 create mode 100644 tools/perf/util/sigchain.h
 create mode 100644 tools/perf/util/strbuf.c
 create mode 100644 tools/perf/util/strbuf.h
 create mode 100644 tools/perf/util/string.c
 create mode 100644 tools/perf/util/string.h
 create mode 100644 tools/perf/util/symbol.c
 create mode 100644 tools/perf/util/symbol.h
 create mode 100644 tools/perf/util/usage.c
 create mode 100644 tools/perf/util/util.h
 create mode 100644 tools/perf/util/wrapper.c
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/