[PATCH v2 16/16] KVM: arm64: Handle deferred SErrors consumed on guest exit

Fri Aug 4 06:12:36 PDT 2017

On Thu, Aug 03, 2017 at 06:03:33PM +0100, James Morse wrote:
> Hi Christoffer,
> 
> On 01/08/17 14:18, Christoffer Dall wrote:
> > On Fri, Jul 28, 2017 at 03:10:19PM +0100, James Morse wrote:
> >> On systems with VHE, the RAS extensions and IESB support, KVM gets an
> >> implicit ESB whenever it enters/exits a guest, because the host sets
> >> SCTLR_EL1.IESB.
> >>
> >> To prevent errors being lost, add code to __guest_exit() to read DISR_EL1,
> >> and save it in the kvm_vcpu_fault_info. Add code to handle_exit() to
> >> process this deferred SError. This data is in addition to the reason the
> >> guest exited.
> > 
> > Two questions:
> > 
> > First, am I reading the spec incorrectly when it says "The implicit form
> > of Error Synchronization Barrier: [...] Has no effect on DISR_EL1 or
> > VDISR_EL2" and I understand this as we wouldn't actually read anything
> > from DISR_EL1 if we rely on the IESB?
> 
> (This is from section 2.4.5 Extension for barrier at exception entry and exit of
> DDI 0587A.)
> 
> Well spotted ... that's embarrassing!

Not at all, that spec is a little dense.

> 
> The DISR write is in the pseudocode's ESBOperation() which is not the same as
> ErrorSynchronizationBarrier(). Running an 'ESB' does both, but an IESB only does
> ErrorSynchronizationBarrier().
> 
> I think this distinction is because the CPU may know about RAS errors it hasn't
> yet made pending as SErrors. (They presumably have a severity for the ESR by
> this point.)
> 
> So IESB makes hidden RAS errors pending SErrors, it doesn't do what ESB does.
> 
> Yes, this means the DISR_EL1 check on kernel-entry and guest exit is useless.
> Given this the host kernel entry/exit can be simplified, probably getting rid of
> the SError over eret horror. I will need to re-think the KVM changes, (we may
> just need the ESR from the existing vaxorcism code).
> 
> 
> > Second, what if we have several SErrors, and one happens upon entering
> > the guest and another one happens when returning from the guest - do we
> > end up overwriting the DISR_EL1 by only looking at it during exit and
> > potentially miss errors?
> 
> There can only be one pending SError at a time, but if we have PSTATE.A set, a
> pending SError and a hidden RAS error, then ESB presumably has to pick one to
> defer, and IESB presumably has to discard one. I suspect the answer is
> 'implementation defined', but I will ask!
> 

As long as we're doing what we can, and we're not missing something that
the architecture gives us a way to retrieve, then that's probably the
best we can do.
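
To check my own understanding of the ESB vs. IESB distinction James
describes above, here is a toy model in plain C. It is only a sketch of
the explanation in this thread, not the pseudocode from the spec: the
struct, its fields and the function names are all made up for
illustration, and the only architectural detail assumed is that
DISR_EL1.A is bit 31. The "pending SError plus hidden error" case is
deliberately left alone, since that looks implementation defined.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: DISR_EL1.A (an SError was deferred) is bit 31. */
#define DISR_A (UINT64_C(1) << 31)

/* Hypothetical CPU state, just enough to show the difference. */
struct cpu_model {
	bool     hidden_error;     /* RAS error the CPU knows about, not yet an SError */
	uint64_t hidden_syndrome;  /* syndrome that error would report */
	bool     serror_pending;   /* a physical SError is pending */
	uint64_t pending_syndrome; /* syndrome of the pending SError */
	bool     serror_masked;    /* PSTATE.A is set */
	uint64_t disr;             /* models DISR_EL1 */
};

/* The step both barriers share: hidden errors become pending SErrors. */
static void synchronize_errors(struct cpu_model *cpu)
{
	/* If an SError is already pending, this model leaves the hidden one alone. */
	if (cpu->hidden_error && !cpu->serror_pending) {
		cpu->serror_pending = true;
		cpu->pending_syndrome = cpu->hidden_syndrome;
		cpu->hidden_error = false;
	}
}

/* Implicit barrier: synchronizes, but never writes DISR. */
static void model_iesb(struct cpu_model *cpu)
{
	synchronize_errors(cpu);
}

/* Explicit ESB: synchronizes, then defers a masked pending SError into DISR. */
static void model_esb(struct cpu_model *cpu)
{
	synchronize_errors(cpu);
	if (cpu->serror_pending && cpu->serror_masked) {
		cpu->disr = DISR_A | cpu->pending_syndrome;
		cpu->serror_pending = false;
	}
}
```

In this model, reading disr after model_iesb() always gives zero, which
is why a DISR_EL1 check that relies only on the implicit barrier would
never see anything.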

> 
> >> Future patches may add a firmware-first callout from
> >> kvm_handle_deferred_serror() to decode CPER records populated by firmware,
> >> or call some arm64 arch code to process the RAS 'ERR' registers for
> >> kernel-first handling. Without either of these, we just make a judgement
> >> on the severity: corrected and restartable errors are ignored, all others
> >> result it an SError being given to the guest.
> > 
> > *in an* ?
> 
> 
> > Why do we give the remaining types of SErrors to the guest?
> 
> Just because that is what KVM does today.
> 
> > What would the kernel normally do for any other workload than running a VM when
> > discovering this type of error?
> 
> I'm trying to make that clearer! Today we 'kill the running task', if it's the
> kernel, we would panic(). But because the CPU masks SError on exception entry,
> and we never touch PSTATE.A, it's always masked in the kernel, so we take the
> SError and kill the next user space task that gets run.
> 
> We should panic() like we do in the early boot code if an SError was pending
> from firmware.
> 
> 
> Should the host panic because of an SError taken while running a guest? Not
> necessarily. All the system registers are saved/restored by world-switch, and the
> host doesn't depend on anything in guest memory. The host should be immune to
> any corruption that occurs while a guest was running.
> Gengdongjiu's example of device pass-through is the exception to this reasoning,
> I think we need a way for the host to contain/reset pass-through devices that
> trigger an SError.
> 

I'm not an expert on what can generate the SError.  If it's because the
guest misprogrammed a system register, then it makes sense to just tell
the guest.

However, if this could be related to corrupted memory, or a CPU fault,
or really any resource that the guest is using which can be used by the
host later on (memory, CPU, GIC, passthrough devices, ...) then it feels
a little dangerous to just signal the guest and carry on.
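
For reference, the severity judgement from the commit message
("corrected and restartable errors are ignored, all others result in an
SError being given to the guest") could be sketched roughly as below.
This is not the code from the patch. The function name and macro names
are made up, and the field positions (IDS at bit 24, AET at bits
[12:10], DFSC at bits [5:0] with 0x11 meaning SError) are my reading of
the RAS-extension syndrome layout, so please check them against the
spec before relying on them.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed syndrome layout for an SError with the RAS extensions. */
#define ESR_FSC_MASK   UINT64_C(0x3f)        /* DFSC, bits [5:0] */
#define ESR_FSC_SERROR UINT64_C(0x11)        /* DFSC value for which AET is valid */
#define ESR_IDS        (UINT64_C(1) << 24)   /* implementation defined syndrome */
#define ESR_AET_SHIFT  10
#define ESR_AET_MASK   (UINT64_C(0x7) << ESR_AET_SHIFT)
#define ESR_AET_UC     (UINT64_C(0) << ESR_AET_SHIFT) /* uncontainable */
#define ESR_AET_UEU    (UINT64_C(1) << ESR_AET_SHIFT) /* unrecoverable */
#define ESR_AET_UEO    (UINT64_C(2) << ESR_AET_SHIFT) /* restartable */
#define ESR_AET_UER    (UINT64_C(3) << ESR_AET_SHIFT) /* recoverable */
#define ESR_AET_CE     (UINT64_C(6) << ESR_AET_SHIFT) /* corrected */

/*
 * Hypothetical helper: returns true when the deferred SError would be
 * passed on to the guest, false when it would be ignored.
 */
static bool serror_should_reach_guest(uint64_t esr)
{
	/* Implementation defined syndrome: severity unknown, do not ignore. */
	if (esr & ESR_IDS)
		return true;

	/* AET is only meaningful when DFSC reports an SError. */
	if ((esr & ESR_FSC_MASK) != ESR_FSC_SERROR)
		return true;

	switch (esr & ESR_AET_MASK) {
	case ESR_AET_CE:  /* corrected: nothing was lost */
	case ESR_AET_UEO: /* restartable: execution can continue */
		return false;
	default:          /* uncontainable, unrecoverable, recoverable, reserved */
		return true;
	}
}
```

Everything that is not clearly harmless falls through to the guest,
which is exactly the part I am uneasy about when the error could
involve a resource the host will use later.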

Thanks,
-Christoffer