[PATCH v2 16/16] KVM: arm64: Handle deferred SErrors consumed on guest exit

Thu Aug 3 10:03:33 PDT 2017

Hi Christoffer,

On 01/08/17 14:18, Christoffer Dall wrote:
> On Fri, Jul 28, 2017 at 03:10:19PM +0100, James Morse wrote:
>> On systems with VHE, the RAS extensions and IESB support, KVM gets an
>> implicit ESB whenever it enters/exits a guest, because the host sets
>> SCTLR_EL1.IESB.
>>
>> To prevent errors being lost, add code to __guest_exit() to read DISR_EL1,
>> and save it in the kvm_vcpu_fault_info. Add code to handle_exit() to
>> process this deferred SError. This data is in addition to the reason the
>> guest exited.
> 
> Two questions:
> 
> First, am I reading the spec incorrectly when it says "The implicit form
> of Error Synchronization Barrier: [...] Has no effect on DISR_EL1 or
> VDISR_EL2" and I understand this as we wouldn't actually read anything
> from DISR_EL1 if we rely on the IESB?

(This is from section 2.4.5, "Extension for barrier at exception entry and
exit", of DDI 0587A.)

Well spotted ... that's embarrassing!

The DISR write is in the pseudocode's ESBOperation() which is not the same as
ErrorSynchronizationBarrier(). Running an 'ESB' does both, but an IESB only does
ErrorSynchronizationBarrier().

I think this distinction exists because the CPU may know about RAS errors that
it hasn't yet made pending as SErrors. (They would need to have a severity for
the ESR by this point.)

So IESB turns hidden RAS errors into pending SErrors, but it doesn't do the
DISR write that ESB does.
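
To make the difference concrete, here is a toy C model of the two operations.
It is only a sketch of my reading of the pseudocode: 'struct cpu_model' and all
three function names are made up for illustration, the syndrome layout is
simplified, and the only architectural detail assumed is that DISR_EL1.A is
bit 31.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy state, invented for this sketch; not a real kernel structure. */
struct cpu_model {
	bool hidden_ras_error;		/* known to the CPU, not yet an SError */
	bool serror_pending;		/* physical SError is pending */
	bool pstate_a;			/* SError is masked */
	uint64_t serror_syndrome;	/* simplified syndrome for the SError */
	uint64_t disr_el1;		/* deferred interrupt status */
};

#define DISR_A	(UINT64_C(1) << 31)	/* "an SError was deferred" */

/* Common to ESB and IESB: hidden errors become pending SErrors. */
static void error_synchronization_barrier(struct cpu_model *cpu)
{
	if (cpu->hidden_ras_error) {
		cpu->hidden_ras_error = false;
		cpu->serror_pending = true;
	}
}

/* The ESB instruction: synchronize, then defer a masked SError to DISR. */
static void esb_operation(struct cpu_model *cpu)
{
	error_synchronization_barrier(cpu);

	if (cpu->serror_pending && cpu->pstate_a) {
		cpu->disr_el1 = DISR_A | cpu->serror_syndrome;
		cpu->serror_pending = false;
	}
}

/* The implicit barrier: synchronize only, DISR is left untouched. */
static void implicit_esb(struct cpu_model *cpu)
{
	error_synchronization_barrier(cpu);
}
```

Starting from a hidden error with PSTATE.A set, esb_operation() leaves the
syndrome in disr_el1 with nothing pending, whereas implicit_esb() leaves
disr_el1 at zero and the SError still pending.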

Yes, this means the DISR_EL1 check on kernel entry and guest exit is useless.
Given this, the host kernel entry/exit can be simplified, probably getting rid
of the SError-over-eret horror. I will need to re-think the KVM changes (we may
just need the ESR from the existing vaxorcism code).

> Second, what if we have several SErrors, and one happens upon entering
> the guest and another one happens when returning from the guest - do we
> end up overwriting the DISR_EL1 by only looking at it during exit and
> potentially miss errors?

There can only be one pending SError at a time, but if we have PSTATE.A set, a
pending SError and a hidden RAS error, then ESB would have to pick one to defer,
and IESB would have to discard one. I suspect the answer is 'implementation
defined', but I will ask!

>> Future patches may add a firmware-first callout from
>> kvm_handle_deferred_serror() to decode CPER records populated by firmware,
>> or call some arm64 arch code to process the RAS 'ERR' registers for
>> kernel-first handling. Without either of these, we just make a judgement
>> on the severity: corrected and restartable errors are ignored, all others
>> result it an SError being given to the guest.
> 
> *in an* ?

> Why do we give the remaining types of SErrors to the guest?

Just because that is what KVM does today.
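
For reference, the severity judgement from the commit message could look
roughly like the sketch below. This is not the code from the patch:
sketch_should_inject_serror() and the local macro names are hypothetical, and
the field positions (A at bit 31, IDS at bit 24, AET at bits [12:10], with
CE = 6 and UEO = 2) are my understanding of the architected DISR_EL1/ESR
layout. A real version would also need to check that DFSC says the AET field
is valid, which is skipped here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local names for this sketch; assumed field layout, see above. */
#define SK_DISR_A	(UINT64_C(1) << 31)	/* an SError was deferred */
#define SK_DISR_IDS	(UINT64_C(1) << 24)	/* imp-def syndrome */
#define SK_AET_SHIFT	10
#define SK_AET_MASK	(UINT64_C(0x7) << SK_AET_SHIFT)

enum sk_aet {
	SK_AET_UC  = 0,		/* uncontainable */
	SK_AET_UEU = 1,		/* unrecoverable */
	SK_AET_UEO = 2,		/* restartable */
	SK_AET_UER = 3,		/* recoverable */
	SK_AET_CE  = 6,		/* corrected */
};

/*
 * Hypothetical helper: returns true if the deferred SError should be
 * given to the guest, false if it can be ignored.
 */
static bool sketch_should_inject_serror(uint64_t disr)
{
	if (!(disr & SK_DISR_A))
		return false;		/* nothing was deferred */

	if (disr & SK_DISR_IDS)
		return true;		/* can't decode it, assume the worst */

	switch ((disr & SK_AET_MASK) >> SK_AET_SHIFT) {
	case SK_AET_CE:
	case SK_AET_UEO:
		return false;		/* corrected or restartable: ignore */
	default:
		return true;		/* everything else goes to the guest */
	}
}
```

The default case deliberately covers reserved AET values too, so anything the
sketch doesn't recognise is treated as serious.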

> What would the kernel normally do for any other workload than running a VM when
> discovering this type of error?

I'm trying to make that clearer! Today we 'kill the running task'; if it's the
kernel, we would panic(). But because the CPU masks SError on exception entry,
and we never touch PSTATE.A, it's always masked in the kernel, so we take the
SError and kill the next user space task that gets run.

We should panic() like we do in the early boot code if an SError was pending
from firmware.

Should the host panic because of an SError taken while running a guest? Not
necessarily. All the system registers are saved/restored by world-switch, and
the host doesn't depend on anything in guest memory. The host should be immune
to any corruption that occurs while a guest was running.
Gengdongjiu's example of device pass-through is the exception to this reasoning.
I think we need a way for the host to contain/reset pass-through devices that
trigger an SError.

Thanks!

James