[PATCH v2 1/2] ARM: KVM: Yield CPU when vcpu executes a WFE

Christoffer Dall christoffer.dall at linaro.org
Tue Oct 15 21:14:17 EDT 2013


On Tue, Oct 08, 2013 at 06:38:13PM +0100, Marc Zyngier wrote:
> On an (even slightly) oversubscribed system, spinlocks are quickly
> becoming a bottleneck, as some vcpus are spinning, waiting for a
> lock to be released, while the vcpu holding the lock may not be
> running at all.
> 
> This creates contention, and the observed slowdown is 40x for
> hackbench. No, this isn't a typo.
> 
> The solution is to trap blocking WFEs and tell KVM that we're
> now spinning. This ensures that other vcpus will get a scheduling
> boost, allowing the lock to be released more quickly. Also, using
> CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT slightly improves the performance
> when the VM is severely overcommitted.
> 
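
As a rough illustration of the idea above (a userspace sketch, not the
patch itself): a trapped WFI/WFE is reported through the HSR, and bit 0
of the syndrome distinguishes WFE from WFI. The helper name
classify_wfx_trap() and the enum are hypothetical. The comments name
kvm_vcpu_on_spin() and kvm_vcpu_block() as the generic KVM calls each
case would presumably map to.

```c
#include <stdint.h>

/* Bit 0 of the HSR syndrome for a trapped WFI/WFE: set means WFE. */
#define HSR_WFI_IS_WFE (1U << 0)

/* Hypothetical result type, for illustration only. */
enum wfx_action {
	WFX_BLOCK,	/* guest is idle: would map to kvm_vcpu_block() */
	WFX_YIELD,	/* guest is spinning: would map to kvm_vcpu_on_spin() */
};

/* Hypothetical helper: decide how to handle a trapped WFI/WFE. */
static enum wfx_action classify_wfx_trap(uint32_t hsr)
{
	if (hsr & HSR_WFI_IS_WFE)
		return WFX_YIELD;
	return WFX_BLOCK;
}
```

Yielding on WFE rather than blocking reflects that a vcpu spinning on
a lock still has work to do once the lock holder gets to run.
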
> Quick test to estimate the performance: hackbench 1 process 1000
> 
> 2xA15 host (baseline):	1.843s
> 
> 2xA15 guest w/o patch:	2.083s
> 4xA15 guest w/o patch:	80.212s
> 8xA15 guest w/o patch:	Could not be bothered to find out
> 
> 2xA15 guest w/ patch:	2.102s
> 4xA15 guest w/ patch:	3.205s
> 8xA15 guest w/ patch:	6.887s
> 
> So we go from a 40x degradation to 1.5x in the 2x overcommit case,
> which is vaguely more acceptable.
> 
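
For reference, the ratios quoted above can be recomputed from the
hackbench timings. This sketch assumes the baseline for each ratio is
the 2xA15 guest in the same configuration (with or without the patch),
since that is what reproduces the 1.5x figure. The helper name
slowdown() is made up for illustration.

```c
/* Hypothetical helper: slowdown of a run relative to a baseline run. */
static double slowdown(double run_seconds, double baseline_seconds)
{
	return run_seconds / baseline_seconds;
}

/* Timings copied from the table above, in seconds. */
static const double guest2_nopatch = 2.083;
static const double guest4_nopatch = 80.212;
static const double guest2_patch   = 2.102;
static const double guest4_patch   = 3.205;
```

With these numbers, slowdown(guest4_nopatch, guest2_nopatch) is roughly
38.5 and slowdown(guest4_patch, guest2_patch) is roughly 1.5.
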
Patch looks good. If there are no objections, I can just apply it and
add the other one I just sent as a reply.

Sorry for the long turn-around on this one.

-Christoffer