perf-probe provides a way for you to instrument the Linux kernel at runtime with your very own dynamic tracepoints – you can create a tracepoint for any function or source line in the kernel. While the kernel does provide a whole bunch of tracepoints out of the box (I count 1126 on this Haswell machine), they don’t cover everything. Sometimes you need to roll your own.
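If you’re curious how many your own machine has, perf list labels each one, so a quick count is something like this (the exact label text may differ slightly between perf versions):

    # count the static tracepoints perf knows about
    perf list 2>/dev/null | grep -c 'Tracepoint event'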
Like recently, when I was looking at the following
output while comparing the performance of the same workload on bare
metal and in KVM:
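A breakdown like this typically comes from KVM’s per-exit-reason counters on the host, which live in debugfs and are what the kvm_stat tool (tools/kvm/kvm_stat in the kernel tree) watches live. A minimal way to snapshot them by hand, assuming the flat-file layout used by kernels of this era (run as root; newer kernels organise the stats differently):

    # snapshot KVM's exit counters on the host
    grep . /sys/kernel/debug/kvm/*exits*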
All of these events incur a VM exit, which is costly. Hundreds of cycles costly.
Worse, if you’re comparing against bare metal like I was, it’s overhead that simply doesn’t exist in the baseline because there is no need to transfer control to the VM monitor. The following diagram from the Intel Architecture Software Developer’s Manual (SDM) Vol. 3C illustrates the transition from guests to host.
Any chance to eliminate VM exits is potentially a major performance win.
With that in mind, I paused when I read the io_exits event in the list above, having no idea what an io_exit event was. Reading the kernel source showed that the event happens in response to the KVM guest executing an I/O instruction (outb, etc.), at which point I’m thinking, “Why on earth is KVM executing I/O instructions? That’s gotta be emulation for some legacy device.”
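If you want to chase that down yourself, the counter name greps straight to the handler in the KVM sources (the exact files vary between kernel versions):

    # in a kernel source tree: find where io_exits is declared and bumped
    grep -rn io_exits arch/x86/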
The problem was: I didn’t immediately know how to verify that.
The kvm:kvm_exit tracepoint gives the address of the I/O instruction that caused the VM exit, so it’s possible to find all such instruction addresses, sorted by number of occurrences, by doing:
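Roughly, on the host, something along these lines works; the awk extraction assumes the tracepoint’s perf script output contains a “rip 0x...” field pair, so adjust it to match your kernel’s format:

    # record every VM exit on the host for a while
    perf record -e kvm:kvm_exit -a -- sleep 30

    # keep the I/O-instruction exits and tally the guest instruction pointers
    perf script | grep IO_INSTRUCTION | \
        awk '{ for (i = 1; i < NF; i++) if ($i == "rip") print $(i+1) }' | \
        sort | uniq -c | sort -rn | head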
Looking up the function in the guest kernel with addr2line -e vmlinux -f 0xffffffff813267ea revealed that the iowrite16 function was the cause of 99.7% of the exits.
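With -f, addr2line prints the function name followed by its file:line; the source location shown below is only where iowrite16 usually lives, so treat it as illustrative of the output shape rather than exact:

    $ addr2line -e vmlinux -f 0xffffffff813267ea
    iowrite16
    lib/iomap.c:...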
And this is where perf-probe comes into its own. From within the guest, I can create my own dynamic tracepoint on iowrite16 and gather a cpu-cycles profile with call stacks to show which code paths lead to it.
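A minimal sketch of that, from inside the guest (the recording window and the choice to record cycles alongside the probe event are just one reasonable setup):

    # create a dynamic tracepoint on iowrite16 in the guest kernel
    perf probe --add iowrite16

    # record cpu-cycles and the new probe event, with call graphs,
    # system-wide for a short window
    perf record -g -e cycles -e probe:iowrite16 -a -- sleep 10
    perf report

    # remove the probe when finished
    perf probe --del iowrite16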