[PATCH 7/7] cpuidle/poll_state: replace cpu_relax with smp_cond_load_relaxed
Ankur Arora
ankur.a.arora at oracle.com
Thu Nov 30 22:59:45 PST 2023
Christoph Lameter (Ampere) <cl at linux.com> writes:
> On Wed, 22 Nov 2023, Mihai Carabas wrote:
>
>> On 22.11.2023 22:51, Christoph Lameter wrote:
>>> On Mon, 20 Nov 2023, Mihai Carabas wrote:
>>>
>>>> cpu_relax on ARM64 does a simple "yield". Thus we replace it with
>>>> smp_cond_load_relaxed which basically does a "wfe".
>>> Well it clears events first (which requires the first WFE) and then does a
>>> WFE waiting for any events if no events were pending.
>>> WFE does not cause a VMEXIT? Or does the inner loop of
>>> smp_cond_load_relaxed now do 2x VMEXITS?
>>> KVM ARM64 code seems to indicate that WFE causes a VMEXIT. See
>>> kvm_handle_wfx().
>>
>> In KVM ARM64 the WFE trapping is dynamic: it is enabled only if there are more
>> tasks waiting on the same core (e.g. on an oversubscribed system).
>>
>> In arch/arm64/kvm/arm.c:
>>
>> 	if (single_task_running())
>> 		vcpu_clear_wfx_traps(vcpu);
>> 	else
>> 		vcpu_set_wfx_traps(vcpu);
>
> Ahh. Cool did not know about that. But still: Lots of VMEXITs once the load has
> to be shared.
Yeah, anytime there's more than one runnable process. Another, more
critical place where we will vmexit is the qspinlock slowpath which
uses smp_cond_load.
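
To make the shape of that wait loop concrete, here is a minimal user-space
sketch of an smp_cond_load_relaxed-style loop using C11 atomics. It is not
the kernel macro: cond_load_relaxed_sketch(), struct sim_waker and sim_wait()
are hypothetical names, and the wait hint is passed in as a callback so that
a single thread can simulate "another CPU" writing the flag. In a guest, the
hint would be where a trapped WFE turns into a vmexit.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical wait hint: stands in for WFE (or cpu_relax) in this sketch. */
typedef void (*wait_hint_fn)(void *ctx);

/*
 * Spin until (*ptr & mask) is non-zero and return the value observed.
 * Each failed check calls the wait hint once before reloading.
 */
static unsigned int cond_load_relaxed_sketch(atomic_uint *ptr,
					     unsigned int mask,
					     wait_hint_fn wait, void *ctx)
{
	unsigned int val;

	assert(ptr != NULL && wait != NULL);
	for (;;) {
		val = atomic_load_explicit(ptr, memory_order_relaxed);
		if (val & mask)
			break;
		wait(ctx);
	}
	return val;
}

/* Simulated remote writer: sets the flag on the Nth wait. */
struct sim_waker {
	atomic_uint *flag;
	unsigned int calls_left;	/* waits remaining before the write */
	unsigned int waits;		/* total waits taken so far */
};

static void sim_wait(void *ctx)
{
	struct sim_waker *w = ctx;

	w->waits++;
	if (w->calls_left > 0 && --w->calls_left == 0)
		atomic_store_explicit(w->flag, 1u, memory_order_relaxed);
}
```

Every call to the hint in this sketch corresponds to one potential trap when
WFE trapping is enabled, which is why the number of waits matters once the
core is shared.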
>> This of course can be improved by having a knob where you can completely
>> disable wfx trapping as needed, but I left this as another subject to
>> tackle.
Probably needs to be adaptive since we use WFE in error paths as well
(for instance to park the CPU.)
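
For reference, a simplified stand-alone model of the existing policy quoted
above, which keys only on single_task_running(). The struct and function
names are hypothetical, only the WFE trap bit is modelled (the real
vcpu_clear_wfx_traps()/vcpu_set_wfx_traps() helpers also deal with WFI),
and the bit position is HCR_EL2.TWE as I understand the register layout.
An adaptive scheme would replace the boolean with a richer predicate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* HCR_EL2.TWE: trap WFE to EL2 (bit 14, assumed from the register layout). */
#define MODEL_HCR_TWE	(UINT64_C(1) << 14)

/* Hypothetical stand-in for the per-vCPU state that holds hcr_el2. */
struct vcpu_model {
	uint64_t hcr_el2;
};

/*
 * Mirror of the quoted decision: leave WFE untrapped while the vCPU has
 * the core to itself, trap it as soon as another task is runnable.
 */
static void model_update_wfe_trap(struct vcpu_model *v,
				  bool single_task_running)
{
	if (single_task_running)
		v->hcr_el2 &= ~MODEL_HCR_TWE;
	else
		v->hcr_el2 |= MODEL_HCR_TWE;
}

/* True if a guest WFE would exit to the hypervisor in this model. */
static bool model_wfe_traps(const struct vcpu_model *v)
{
	return (v->hcr_el2 & MODEL_HCR_TWE) != 0;
}
```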
Ankur