[PATCH v3 00/20] SError rework + RAS&IESB for firmware first support

James Morse james.morse at arm.com
Thu Oct 5 12:17:52 PDT 2017


Hello,

The aim of this series is to enable IESB and add ESB-instructions to let us
kick any pending RAS errors into firmware to be handled by firmware-first.

Not all systems will have this firmware, so these RAS errors will become
pending SErrors. We should take these as quickly as possible and avoid
panic()ing for errors where we could have continued.

This first part of this series reworks the DAIF masking so that SError is
unmasked unless we are handling a debug exception.

The last part provides the same minimal handling for SError that interrupt
KVM. KVM is currently unable to handle SErrors during world-switch, unless
they occur during a magic single-instruction window, it hyp-panics. I suspect
this will be easier to fix once the VHE world-switch is further optimised.

KVMs kvm_inject_vabt() needs updating for v8.2 as now we can specify an ESR,
and all-zeros has a RAS meaning.

KVM's existing 'impdef SError to the guest' behaviour probably needs revisiting.
These are errors where we don't know what they mean, they may not be
synchronised by ESB. Today we blame the guest.
My half-baked suggestion would be to make a virtual SError pending, but then
exit to user-space to give Qemu the change to quit (for virtual machines that
don't generate SError), pend an SError with a new Qemu-specific ESR, or blindly
continue and take KVMs default all-zeros impdef ESR.

Known issues:
 * Synchronous external abort SET severity is not yet considered, all
   synchronous-external-aborts are still considered fatal.

 * KVM-Migration: VDISR_EL2 is exposed to userspace as DISR_EL1, but how should
   HCR_EL2.VSE or VSESR_EL2 be migrated when the guest has an SError pending but
   hasn't taken it yet...?

 * No HCR_EL2.{TEA/TERR} setting ... Dongjiu Geng had a patch that was almost
   finished, I haven't seen the new version.

 * KVM unmasks SError and IRQ before calling handle_exit, we may be rescheduled
   while holding an uncontained ESR... (this is currently an improvement on
   assuming its an impdef error we can blame on the guest)
    * We need to fix this for APEI's SEI or kernel-first RAS, the guest-exit
      SError handling will need to move to before kvm_arm_vhe_guest_exit().


Changes from v2 ... (where do I start?)
 * All the KVM patches rewritten.
 * VSESR_EL2 setting/save/restore is new, as is
 * save/restoring VDISR_EL2 and exposing it to user space as DISR_EL1.

 * The new ARM-ARM (DDI0487B.b) has an SCTLR_EL2.IESB even for !VHE, we turn
   that on.
 * 'survivable' SError are now described as 'blocking' because the CPU can't
   make progress, this makes all the commit messages clearer.
 * My IESB!=ESB confusion got fixed, so the crazy eret with SError unmasked
   is gone, never to return.
 * The cost of masking SError on return to user-space has been wrapped up with
   the ret-to-user loop. (This was only visible with microbenchmarks like
   getpid)
 * entry.S changes got cleaner, commit messages got better,


This series can be retrieved from:
git://linux-arm.org/linux-jm.git -b serror_rework/v3


Comments and contradictions welcome,

James Morse (18):
  arm64: explicitly mask all exceptions
  arm64: introduce an order for exceptions
  arm64: Move the async/fiq helpers to explicitly set process context
    flags
  arm64: Mask all exceptions during kernel_exit
  arm64: entry.S: Remove disable_dbg
  arm64: entry.S: convert el1_sync
  arm64: entry.S convert el0_sync
  arm64: entry.S: convert elX_irq
  KVM: arm/arm64: mask/unmask daif around VHE guests
  arm64: kernel: Survive corrected RAS errors notified by SError
  arm64: cpufeature: Enable IESB on exception entry/return for
    firmware-first
  arm64: kernel: Prepare for a DISR user
  KVM: arm64: Set an impdef ESR for Virtual-SError using VSESR_EL2.
  KVM: arm64: Save/Restore guest DISR_EL1
  KVM: arm64: Save ESR_EL2 on guest SError
  KVM: arm64: Handle RAS SErrors from EL1 on guest exit
  KVM: arm64: Handle RAS SErrors from EL2 on guest exit
  KVM: arm64: Take any host SError before entering the guest

Xie XiuQi (2):
  arm64: entry.S: move SError handling into a C function for future
    expansion
  arm64: cpufeature: Detect CPU RAS Extentions

 arch/arm64/Kconfig                   | 33 +++++++++++++-
 arch/arm64/include/asm/assembler.h   | 50 ++++++++++++++-------
 arch/arm64/include/asm/barrier.h     |  1 +
 arch/arm64/include/asm/cpucaps.h     |  4 +-
 arch/arm64/include/asm/daifflags.h   | 61 +++++++++++++++++++++++++
 arch/arm64/include/asm/esr.h         | 17 +++++++
 arch/arm64/include/asm/exception.h   | 14 ++++++
 arch/arm64/include/asm/irqflags.h    | 40 ++++++-----------
 arch/arm64/include/asm/kvm_emulate.h | 10 +++++
 arch/arm64/include/asm/kvm_host.h    | 16 +++++++
 arch/arm64/include/asm/processor.h   |  2 +
 arch/arm64/include/asm/sysreg.h      |  6 +++
 arch/arm64/include/asm/traps.h       | 36 +++++++++++++++
 arch/arm64/kernel/asm-offsets.c      |  1 +
 arch/arm64/kernel/cpufeature.c       | 43 ++++++++++++++++++
 arch/arm64/kernel/debug-monitors.c   |  5 ++-
 arch/arm64/kernel/entry.S            | 86 +++++++++++++++++++++---------------
 arch/arm64/kernel/hibernate.c        |  5 ++-
 arch/arm64/kernel/machine_kexec.c    |  4 +-
 arch/arm64/kernel/process.c          |  3 ++
 arch/arm64/kernel/setup.c            |  8 ++--
 arch/arm64/kernel/signal.c           |  8 +++-
 arch/arm64/kernel/smp.c              | 12 ++---
 arch/arm64/kernel/suspend.c          |  7 +--
 arch/arm64/kernel/traps.c            | 64 ++++++++++++++++++++++++++-
 arch/arm64/kvm/handle_exit.c         | 19 +++++++-
 arch/arm64/kvm/hyp-init.S            |  3 ++
 arch/arm64/kvm/hyp/entry.S           | 13 ++++++
 arch/arm64/kvm/hyp/switch.c          | 19 ++++++--
 arch/arm64/kvm/hyp/sysreg-sr.c       |  6 +++
 arch/arm64/kvm/inject_fault.c        | 13 +++++-
 arch/arm64/kvm/sys_regs.c            |  1 +
 arch/arm64/mm/proc.S                 | 14 +++---
 virt/kvm/arm/arm.c                   |  4 ++
 34 files changed, 513 insertions(+), 115 deletions(-)
 create mode 100644 arch/arm64/include/asm/daifflags.h

-- 
2.13.3




More information about the linux-arm-kernel mailing list