RCU stall with high number of KVM vcpus

Suzuki K Poulose Suzuki.Poulose at arm.com
Tue Nov 14 03:34:56 PST 2017


On 14/11/17 08:49, Marc Zyngier wrote:
> On 14/11/17 07:52, Jan Glauber wrote:
>> On Mon, Nov 13, 2017 at 06:11:19PM +0000, Marc Zyngier wrote:
>>> On 13/11/17 17:35, Jan Glauber wrote:
>>
>> [...]
>>
>>>>>> numbers don't look good, see waittime-max:
>>>>>>
>>>>>> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>>>>>>                                class name    con-bounces    contentions   waittime-min   waittime-max waittime-total   waittime-avg    acq-bounces   acquisitions   holdtime-min   holdtime-max holdtime-total   holdtime-avg
>>>>>> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>>>>>>
>>>>>>                  &(&kvm->mmu_lock)->rlock:      99346764       99406604           0.14  1321260806.59 710654434972.0        7148.97      154228320      225122857           0.13   917688890.60  3705916481.39          16.46
>>>>>>                  ------------------------
>>>>>>                  &(&kvm->mmu_lock)->rlock       99365598          [<ffff0000080b43b8>] kvm_handle_guest_abort+0x4c0/0x950
>>>>>>                  &(&kvm->mmu_lock)->rlock          25164          [<ffff0000080a4e30>] kvm_mmu_notifier_invalidate_range_start+0x70/0xe8
>>>>>>                  &(&kvm->mmu_lock)->rlock          14934          [<ffff0000080a7eec>] kvm_mmu_notifier_invalidate_range_end+0x24/0x68
>>>>>>                  &(&kvm->mmu_lock)->rlock            908          [<ffff00000810a1f0>] __cond_resched_lock+0x68/0xb8
>>>>>>                  ------------------------
>>>>>>                  &(&kvm->mmu_lock)->rlock              3          [<ffff0000080b34c8>] stage2_flush_vm+0x60/0xd8
>>>>>>                  &(&kvm->mmu_lock)->rlock       99186296          [<ffff0000080b43b8>] kvm_handle_guest_abort+0x4c0/0x950
>>>>>>                  &(&kvm->mmu_lock)->rlock         179238          [<ffff0000080a4e30>] kvm_mmu_notifier_invalidate_range_start+0x70/0xe8
>>>>>>                  &(&kvm->mmu_lock)->rlock          19181          [<ffff0000080a7eec>] kvm_mmu_notifier_invalidate_range_end+0x24/0x68
>>>>>>
>>>>>> .............................................................................................................................................................................................................................
>>>>> [lots of stuff]
>>>>>
>>>>> Well, the mmu_lock is clearly contended. Is the box in a state where you
>>>>> are swapping? There seem to be as many faults as contentions, which is a
>>>>> bit surprising...
>>>>
>>>> I don't think it is swapping but need to double check.
>>>
>>> It is the number of aborts that is staggering. And each one of them
>>> leads to the mmu_lock being contended. So something seems to be taking
>>> its sweet time holding the damned lock.
>>
>> Can you elaborate on the aborts? I'm not familiar with KVM, but from a
>> first look I thought kvm_handle_guest_abort() was in the normal path
>> when a vcpu is stopped. Is that wrong?
> 
> kvm_handle_guest_abort() is the entry point for our page fault handling
> (hence the mmu_lock being taken). On its own, the number of faults is
> irrelevant. What worries me is that in almost all the cases where the
> lock was contended, we were handling a page fault.
> 
> What would be interesting is to find out *who* is holding the lock when
> we're being blocked in kvm_handle_guest_abort...
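
For anyone not familiar with that path, the locking looks roughly like the
sketch below. This is only a simplified paraphrase of user_mem_abort() (called
from kvm_handle_guest_abort() in virt/kvm/arm/mmu.c), not the exact code; huge
pages, permission handling and most error paths are elided.

/*
 * Rough paraphrase of the stage-2 fault path locking:
 * kvm_handle_guest_abort() -> user_mem_abort().
 */
static int user_mem_abort_sketch(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
				 bool write_fault)
{
	struct kvm *kvm = vcpu->kvm;
	unsigned long mmu_seq;
	kvm_pfn_t pfn;
	pte_t new_pte;
	int ret = 0;

	/* Sample the MMU notifier sequence count before pinning the page. */
	mmu_seq = kvm->mmu_notifier_seq;
	smp_rmb();

	/* The page is faulted in without holding mmu_lock (this may sleep). */
	pfn = gfn_to_pfn_prot(kvm, fault_ipa >> PAGE_SHIFT, write_fault, NULL);

	spin_lock(&kvm->mmu_lock);		/* <-- the contended lock */

	/* If an MMU notifier invalidation raced with us, let the vcpu re-fault. */
	if (mmu_notifier_retry(kvm, mmu_seq))
		goto out_unlock;

	/* new_pte is built from pfn and the memory attributes (elided here). */
	ret = stage2_set_pte(kvm, NULL, fault_ipa, &new_pte, 0);

out_unlock:
	spin_unlock(&kvm->mmu_lock);
	kvm_release_pfn_clean(pfn);
	return ret;
}

So each guest fault only holds mmu_lock for the stage-2 table update itself;
the interesting question is indeed who keeps it held long enough to back
everything else up.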

Just a thought: turning on the tracepoints for kvm_hva_* might help to get some
more data on what we are doing with the HVA ranges.
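
(Those are the kvm_unmap_hva_range / kvm_set_spte_hva / kvm_age_hva family of
events; they fire from the MMU notifier path, which is the other big taker of
kvm->mmu_lock in the lock_stat above. Roughly, and again only as a simplified
paraphrase of kvm_mmu_notifier_invalidate_range_start() plus the arm/arm64
kvm_unmap_hva_range(), that side looks like:)

/*
 * Rough paraphrase of the MMU notifier side of the contention;
 * kvm->mmu_lock is held across the whole stage-2 unmap of the HVA range.
 */
static void invalidate_range_start_sketch(struct kvm *kvm,
					  unsigned long start,
					  unsigned long end)
{
	spin_lock(&kvm->mmu_lock);
	kvm->mmu_notifier_count++;

	/* Arch hook; this is where trace_kvm_unmap_hva_range() fires. */
	kvm_unmap_hva_range(kvm, start, end);	/* -> unmap_stage2_range() */

	/* TLB flush if anything was actually unmapped (elided). */

	spin_unlock(&kvm->mmu_lock);
}

unmap_stage2_range() periodically drops and retakes the lock via
cond_resched_lock(&kvm->mmu_lock), which is presumably where the
__cond_resched_lock entries in the lock_stat come from. Correlating the hva
tracepoints with the stalls might tell us whether a long-running invalidation
of a large range is what keeps the lock away from the faulting vcpus.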

Cheers
Suzuki


