How to interpret perf -e cache-misses:pp?

Asked Mar 24 '16 at 12:55

Active Mar 24 '16 at 13:02

Viewed 1,093 times

I find the behavior of `perf top -e cache-misses:pp -p <my_pid>` quite confusing. I have an Intel i5-3230M running a 64-bit 4.4.5 kernel.

If I just run that command, I get basically no samples from my application (a numerical simulation with large datasets, so it must incur cache misses). Almost all samples come from kernel functions such as `intel_pmu_lbr_enable_all`, `native_write_msr_safe`, `native_read_msr_safe`, and `__intel_pmu_lbr_disable`. If I limit the hits to user space with `-K`, I get hits only in my application, but at a very low count. If I remove a "precise" (`p`) modifier, I get many more hits, but they are attributed to instructions that evidently don't perform memory loads or stores.

How am I supposed to interpret this behavior? What exactly counts as a "cache miss"?

edited Mar 24 '16 at 13:02

Lorenzo Pistone

1) Why don't you use `perf record` or `perf stat` directly? 2) Cache misses are architecture-specific and usually L1 misses. Your processor should support HW events. – Jakuje Mar 25 '16 at 11:23

0 Answers