
I find the behavior of `perf top -e cache-misses:pp -p <my_pid>` quite confusing. I have an Intel i5-3230M running a 64-bit 4.4.5 kernel.

If I just run that command, I get basically no samples from my application (a numerical simulation with large datasets, so it certainly should show cache misses). Almost all samples come from kernel functions such as `intel_pmu_lbr_enable_all`, `native_write_msr_safe`, `native_read_msr_safe`, and `__intel_pmu_lbr_disable`. If I limit the hits to user space with `-K`, I do get hits only in my application, but at a very low count. If I remove a "precise" (`p`) modifier, I get many more hits, but on opcodes that evidently don't perform memory loads or writes.
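For side-by-side comparison, the three variants described above can be written out as follows. This is only a sketch: `12345` is a placeholder for `<my_pid>`, the variable names are made up for illustration, and reading "remove a precise modifier" as going from `:pp` to `:p` is an assumption (it could also mean dropping the modifier entirely). The script only prints the commands; it does not run `perf`.

```shell
#!/bin/sh
# Placeholder PID (assumption); substitute the PID of the process to profile.
TARGET_PID=12345

# Variant 1: precise sampling (:pp), kernel and user symbols both shown.
CMD_PRECISE="perf top -e cache-misses:pp -p $TARGET_PID"

# Variant 2: same event, with -K to hide kernel symbols.
CMD_HIDE_KERNEL="perf top -K -e cache-misses:pp -p $TARGET_PID"

# Variant 3: one precise modifier removed (assumed to mean :pp -> :p).
CMD_LESS_PRECISE="perf top -e cache-misses:p -p $TARGET_PID"

# Print the commands so they can be compared or pasted into a terminal.
printf '%s\n' "$CMD_PRECISE" "$CMD_HIDE_KERNEL" "$CMD_LESS_PRECISE"
```

Note that `-K` is documented as hiding kernel symbols in the display; restricting the event itself to user space is normally done with the `u` event modifier (for example `cache-misses:upp`), which may or may not change the counts seen here.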

How am I supposed to interpret this behavior? What exactly counts as a "cache miss"?

Lorenzo Pistone
    1) Why don't you use `perf record` or `perf stat` directly? 2) Cache misses are architecture-specific and usually L1 misses. Your processor should support HW events. – Jakuje Mar 25 '16 at 11:23

0 Answers