ARM WFET application scenario consultation

Mon Apr 12 14:48:47 BST 2021

On Mon, 12 Apr 2021 14:15:21 +0100,
Catalin Marinas <catalin.marinas at arm.com> wrote:
> 
> On Mon, Apr 12, 2021 at 02:09:10PM +0100, Marc Zyngier wrote:
> > On Mon, 12 Apr 2021 13:46:37 +0100,
> > Catalin Marinas <catalin.marinas at arm.com> wrote:
> > > 
> > > On Mon, Apr 12, 2021 at 08:08:23PM +0800, yangwendong wrote:
> > > > Recently, a new feature of WFE with timeouts has been added to ARMv8.
> > > > I have some doubts about the application scenarios of this feature.
> > > > 
> > > > 1) The Arm spec says that WFE or WFET can be used in a spinlock.
> > > > Since a thread holding a spinlock can't sleep, if we use the WFET
> > > > instruction we can do nothing but keep waiting when it times out,
> > > > so what's the difference between the two instructions in this scenario?
> > > 
> > > Not much point in using it in a classic spinlock, unless you have
> > > some specific implementation that's supposed to time out.
> > > 
> > > Note that we already enabled the event stream in Linux so that an event
> > > is generated at 100KHz waking up any WFE. One reason we had for this was
> > > some hardware errata where events between clusters were not generated.
> > > Another was some small delays required in certain user programs
> > > without going through a kernel syscall, though not sure anyone's
> > > actually using it.
> > > 
> > > > 2) Are there any other special scenarios where using WFET instructions
> > > > can be beneficial?
> > > 
> > > In the kernel we could replace our udelay loop with WFIT for example
> > > (not WFET because of the event stream). As for user, we can expose a
> > > HWCAP but it's up to user libraries to make use of it.
> > 
> > Note that since c219bc4e9205 ("arm64: Trap WFI executed in
> > userspace"), we actively prevent WFI from being used in userspace, and
> > I would expect WFIT to be given the same treatment. It otherwise is a
> > precise tool for userspace to synchronise against kernel events.
> 
> I agree. I only thought about using it in the kernel as a simpler
> udelay(). The user should not attempt WFI/WFIT.

That's my position as well. I'll post a patch dealing with that
shortly.

> Now, if KVM traps WFI/WFIT as well, maybe we should not bother with
> udelay() in the kernel either.

"It depends". As long as there is no direct injection of interrupts
(GICv4+), WFI is always trapped. This saves us IPI-ing the physical
CPU to force an IRQ state reload.

However, when the vcpu can be targeted by directly injected interrupts
*and* it is the only thread in the CPU's run queue, we stop
trapping WFI so that direct injection has a chance of doing its thing.
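
As a rough illustration of the rule described above (a sketch only, not
KVM's actual code; struct vcpu_state, its fields and should_trap_wfi()
are hypothetical names invented for this example), the decision could
be written as:

```c
#include <stdbool.h>

/* Hypothetical inputs; these are not KVM's real data structures. */
struct vcpu_state {
	bool direct_injection;	/* vcpu can be targeted by directly injected interrupts (GICv4+) */
	unsigned int runq_len;	/* runnable threads on the physical CPU, including this vcpu */
};

/* Return true if WFI executed by the guest should trap to the hypervisor. */
static bool should_trap_wfi(const struct vcpu_state *v)
{
	/* Without direct injection, WFI is always trapped. */
	if (!v->direct_injection)
		return true;

	/* With direct injection, stop trapping only when the vcpu is alone on the run queue. */
	return v->runq_len > 1;
}
```

With these assumptions, a vcpu that has direct injection and is alone
on its CPU is the only case where WFI is left untrapped.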

Of course, the number of systems implementing direct injection is so
far extremely close to zero, so I don't think it is worth basing
udelay() on that just yet.

KVM needs a bit of work to honour the timeout on trap as well, as
currently, a trapped WFIT without any interrupt being injected would
result in a guest that never makes forward progress.
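
To make the missing piece concrete, here is a minimal sketch of the
arithmetic involved, assuming the WFIT operand is an absolute counter
value that the wait expires against. The helper name
wfit_remaining_ns() and its parameters are hypothetical and not taken
from KVM; a real implementation would feed the result into whatever
timer it uses to wake the blocked vcpu.

```c
#include <stdint.h>

/*
 * Hypothetical helper: convert the distance between the current counter
 * value and the WFIT deadline into nanoseconds.
 * Assumes freq_hz is non-zero and at most a few GHz, and that the
 * deadline is not so far away that the seconds part overflows 64 bits.
 */
static uint64_t wfit_remaining_ns(uint64_t now, uint64_t deadline,
				  uint64_t freq_hz)
{
	uint64_t ticks;

	/* Deadline already reached: the wait should complete immediately. */
	if (deadline <= now)
		return 0;

	ticks = deadline - now;

	/* Split into whole seconds and a remainder to limit overflow. */
	return (ticks / freq_hz) * 1000000000ULL +
	       (ticks % freq_hz) * 1000000000ULL / freq_hz;
}
```

For example, with a 1MHz counter, a deadline 500000 ticks ahead gives
500000000ns, and a deadline in the past gives 0, meaning the guest
should resume straight away rather than block.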

I'll put that on my list of things to look at.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.