[PATCH v2 1/2] ARM: KVM: Yield CPU when vcpu executes a WFE

Marc Zyngier marc.zyngier at arm.com
Wed Oct 16 03:08:43 EDT 2013


On 2013-10-16 02:14, Christoffer Dall wrote:
> On Tue, Oct 08, 2013 at 06:38:13PM +0100, Marc Zyngier wrote:
>> On an (even slightly) oversubscribed system, spinlocks are quickly
>> becoming a bottleneck, as some vcpus are spinning, waiting for a
>> lock to be released, while the vcpu holding the lock may not be
>> running at all.
>>
>> This creates contention, and the observed slowdown is 40x for
>> hackbench. No, this isn't a typo.
>>
>> The solution is to trap blocking WFEs and tell KVM that we're
>> now spinning. This ensures that other vcpus will get a scheduling
>> boost, allowing the lock to be released more quickly. Also, using
>> CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT slightly improves the
>> performance when the VM is severely overcommitted.
>>
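For readers without the diff at hand, the trap-and-yield idea described above can be modelled roughly as follows. This is a user-space sketch, not the patch itself: struct model_vcpu and the model_* functions are hypothetical stand-ins for the kernel's vcpu state, kvm_vcpu_on_spin() and kvm_vcpu_block(), and the HSR_WFI_IS_WFE value (ISS bit 0 set for WFE, clear for WFI) is assumed to match the kernel header.

```c
#include <stdint.h>

/* Assumed to mirror the kernel definition: ISS bit 0 set means WFE, clear means WFI. */
#define HSR_WFI_IS_WFE (1U << 0)

/* Hypothetical stand-in for vcpu state; NOT the kernel's struct kvm_vcpu. */
struct model_vcpu {
	uint32_t hsr;            /* syndrome value captured on the trap */
	unsigned int spin_hints; /* times we told the scheduler "this vcpu is spinning" */
	unsigned int blocks;     /* times the vcpu was put to sleep */
};

/* Stand-in for kvm_vcpu_on_spin(): hint that other vcpus deserve a boost. */
static void model_vcpu_on_spin(struct model_vcpu *vcpu)
{
	vcpu->spin_hints++;
}

/* Stand-in for kvm_vcpu_block(): sleep until an interrupt is pending. */
static void model_vcpu_block(struct model_vcpu *vcpu)
{
	vcpu->blocks++;
}

/*
 * Dispatch on the trapped instruction: a WFE is treated as "guest is
 * spinning on a lock" and yields, a WFI blocks as before.
 * Returning 1 follows the exit-handler convention of "resume the guest".
 */
static int model_handle_wfx(struct model_vcpu *vcpu)
{
	if (vcpu->hsr & HSR_WFI_IS_WFE)
		model_vcpu_on_spin(vcpu);
	else
		model_vcpu_block(vcpu);

	return 1;
}
```

The point of the split is that a WFE inside a spinlock loop is a strong hint that the lock holder is descheduled, so yielding is more useful than either spinning or sleeping.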
>> Quick test to estimate the performance: hackbench 1 process 1000
>>
>> 2xA15 host (baseline):	1.843s
>>
>> 2xA15 guest w/o patch:	2.083s
>> 4xA15 guest w/o patch:	80.212s
>> 8xA15 guest w/o patch:	Could not be bothered to find out
>>
>> 2xA15 guest w/ patch:	2.102s
>> 4xA15 guest w/ patch:	3.205s
>> 8xA15 guest w/ patch:	6.887s
>>
>> So we go from a 40x degradation to 1.5x in the 2x overcommit case,
>> which is vaguely more acceptable.
>>
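The 40x and 1.5x figures can be rechecked from the timings quoted above. The helper below is only an illustrative sketch; it assumes the ratios are taken against the 2xA15 guest in the same configuration (2.083s without the patch, 2.102s with it), which the message does not state explicitly.

```c
/*
 * Slowdown factor of a measured run relative to a baseline run,
 * both given in seconds. slowdown() is a hypothetical helper name.
 */
static double slowdown(double measured_s, double baseline_s)
{
	return measured_s / baseline_s;
}

/* 4xA15 guest vs 2xA15 guest, without the patch: roughly 38.5x. */
static double slowdown_without_patch(void)
{
	return slowdown(80.212, 2.083);
}

/* 4xA15 guest vs 2xA15 guest, with the patch: roughly 1.5x. */
static double slowdown_with_patch(void)
{
	return slowdown(3.205, 2.102);
}
```

Measured against the 1.843s host baseline instead, the same two runs come out at roughly 43.5x and 1.7x, so either reading is consistent with the rounded numbers in the commit message.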
> Patch looks good, I can just apply it and add the other one I just
> sent as a reply if there are no objections.

Yeah, I missed the updated comments on this one, thanks for taking care
of it.

> Sorry for the long turn-around on this one.

No worries. As long as it goes in, I'm happy. It makes such a
difference on my box, it is absolutely mind-boggling.

Thanks,

          M.
-- 
Fast, cheap, reliable. Pick two.


