You can read the full tutorial on the perf wiki, which gives a good overview of this utility.
The harder part is understanding why we need this utility on Linux.
Intro
A simple run of the top command shows the basic information about your Linux system.
If you look closely, you will notice this line:
load average: 0.09, 0.05, 0.01
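The same three values can be read without top. This is a minimal sketch, assuming a Linux system with /proc mounted and nproc from coreutils available; show_load is just an illustrative helper name:

```shell
#!/bin/sh
# Format three load averages with the period each one covers.
# show_load is a hypothetical helper name, not part of top or perf.
show_load() {
    printf '1 min: %s, 5 min: %s, 15 min: %s\n' "$1" "$2" "$3"
}

# /proc/loadavg starts with the same three numbers that top prints.
if [ -r /proc/loadavg ]; then
    read -r one five fifteen _rest < /proc/loadavg
    show_load "$one" "$five" "$fifteen"
fi

# The core count is useful context when judging these values.
if command -v nproc >/dev/null 2>&1; then
    echo "online cores: $(nproc)"
fi
```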
The three numbers are load averages over progressively longer periods of time (one, five, and fifteen minutes). For us this means that lower numbers are better, and that higher numbers point to a problem or an overloaded machine.
For multicore and multiprocessor systems the rule is simple: the total number of cores is what matters, regardless of how many physical processors those cores are spread across. For example, a load of 4.00 is full utilization on a four-core machine but an overload on a single-core one.
Now let's use the perf command. First I will record some data about my CPU:
[mythcat@localhost ~]$ perf record -e cpu-clock -ag
Error:
You may not have permission to collect system-wide stats.
Consider tweaking /proc/sys/kernel/perf_event_paranoid,
which controls use of the performance events system by
unprivileged users (without CAP_SYS_ADMIN).
The current value is 2:
-1: Allow use of (almost) all events by all users
>= 0: Disallow raw tracepoint access by users without CAP_IOC_LOCK
>= 1: Disallow CPU event access by users without CAP_SYS_ADMIN
>= 2: Disallow kernel profiling by users without CAP_SYS_ADMIN
[mythcat@localhost ~]$ su
Password:
[root@localhost mythcat]# perf record -e cpu-clock -ag
^C[ perf record: Woken up 17 times to write data ]
[ perf record: Captured and wrote 5.409 MB perf.data (38518 samples) ]
[root@localhost mythcat]# ls -l perf.data
-rw-------. 1 mythcat mythcat 5683180 Feb 21 13:24 perf.data
You can see that the perf tool works under the root account and that the result is owned by the default user.
Let's show this data with the perf tool as the default user, mythcat:
[mythcat@localhost ~]$ perf report
The result of this command is a report of the recorded samples.
You can see the full list of events by using this command:
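The permission error above comes from the kernel.perf_event_paranoid setting. As a rough sketch, the current level can be inspected like this; the describe_paranoid helper is hypothetical, and its wording only paraphrases the thresholds printed in the error message:

```shell
#!/bin/sh
# Describe a perf_event_paranoid level, following the thresholds that
# perf prints in its error message. describe_paranoid is a hypothetical
# helper. It reports the highest threshold reached; the restrictions of
# the lower thresholds apply as well.
describe_paranoid() {
    level=$1
    if [ "$level" -ge 2 ]; then
        echo "kernel profiling needs CAP_SYS_ADMIN"
    elif [ "$level" -ge 1 ]; then
        echo "CPU event access needs CAP_SYS_ADMIN"
    elif [ "$level" -ge 0 ]; then
        echo "raw tracepoint access is restricted"
    else
        echo "almost all events allowed for all users"
    fi
}

f=/proc/sys/kernel/perf_event_paranoid
if [ -r "$f" ]; then
    level=$(cat "$f")
    echo "perf_event_paranoid=$level: $(describe_paranoid "$level")"
fi

# To relax the setting until the next reboot (as root), one option is:
#   sysctl -w kernel.perf_event_paranoid=0
```

Switching to root, as in the session above, sidesteps the restriction. Lowering the level is an alternative, at the cost of giving unprivileged users more access to performance events.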
[mythcat@localhost ~]$ perf list
List of pre-defined events (to be used in -e):
branch-instructions OR branches [Hardware event]
branch-misses [Hardware event]
bus-cycles [Hardware event]
cache-misses [Hardware event]
cache-references [Hardware event]
cpu-cycles OR cycles [Hardware event]
instructions [Hardware event]
ref-cycles [Hardware event]
alignment-faults [Software event]
bpf-output [Software event]
context-switches OR cs [Software event]
cpu-clock [Software event]
cpu-migrations OR migrations [Software event]
dummy [Software event]
emulation-faults [Software event]
major-faults [Software event]
minor-faults [Software event]
page-faults OR faults [Software event]
task-clock [Software event]
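On a real system the list is much longer than the excerpt above, so a per-type summary can help. The following is a small sketch, assuming each event line ends with its type in square brackets as in the output shown; count_event_types is an illustrative name, not a perf feature:

```shell
#!/bin/sh
# Count events per type from `perf list`-style text on standard input.
# The type is taken from the trailing [...] on each event line; lines
# without a trailing bracket (such as the header) are ignored.
count_event_types() {
    sed -n 's/.*\[\([^]]*\)\][[:space:]]*$/\1/p' |
        sort |
        uniq -c |
        awk '{ $1 = $1; print }'   # normalize the spacing of uniq -c
}

# Run it against the real list only when perf is installed.
if command -v perf >/dev/null 2>&1; then
    perf list 2>/dev/null | count_event_types
fi
```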
Let's look at one event from this list; it will tell us how Fedora is working:
[root@localhost mythcat]# perf top -e minor-faults -ns comm
I sort by comm (the available sort keys are: pid, comm, dso, symbol, parent, cpu, socket, srcline,
weight, local_weight). For the -n and -s arguments (-n shows the number of samples and -s sets the sort key), see the perf manual.
The result of this command is a live view, grouped by command name, of which processes trigger minor faults.
This is the simplest way to see how processes start and exit and how they interact with the operating system in real time.
Another way to use the perf command is to analyze most scheduler properties from within 'perf sched'
alone. perf sched currently has five sub-commands:
perf sched record # low-overhead recording of arbitrary workloads
perf sched latency # output per task latency metrics
perf sched map # show summary/map of context-switching
perf sched trace # output finegrained trace
perf sched replay # replay a captured workload using simulated threads
Try this example to capture a trace and then check the
latencies (this analyzes the trace in the perf.data record file).
perf sched record sleep 10 # record full system activity for 10 seconds
perf sched latency --sort max # report latencies sorted by max
You can also make a map of scheduling events. Record the data first with this command, then view it with perf sched map:
[root@localhost mythcat]# perf sched record
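Recording and viewing the map can be combined in one step. This is only a sketch: run_sched_map is a hypothetical wrapper name, and it assumes perf is installed and that you run it with enough privileges (root in this tutorial):

```shell
#!/bin/sh
# Record scheduler events for a few seconds, then print the map.
# run_sched_map is a hypothetical wrapper, not a perf sub-command.
run_sched_map() {
    secs=${1:-10}
    if ! command -v perf >/dev/null 2>&1; then
        echo "perf not found" >&2
        return 127
    fi
    # Writes perf.data in the current directory.
    perf sched record sleep "$secs" || return 1
    # Reads perf.data and prints the context-switch map.
    perf sched map
}

# Example (as root): run_sched_map 10
```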
This tutorial shows you only about 1% of the ways to use the perf command.