[PATCH v4 0/3] KVM/arm/arm64: enhance armv7/8 fp/simd lazy switch

Mario Smarduch m.smarduch at samsung.com
Sat Nov 14 14:12:07 PST 2015


This patch series combines the previous armv7 and armv8 versions.
For an FP and lmbench load it reduces fp/simd context switches from 30-50% down
to 2%. Results will vary with load but are no worse than with the current
approach.

In summary, the current lazy vfp/simd implementation switches hardware context
only on guest access and again on exit to host; otherwise the hardware context
switch is skipped. This patch set builds on that functionality and executes a
hardware context switch only when the vCPU is scheduled out or returns to user
space.
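
To illustrate the difference between the two policies, here is a minimal
user-space sketch. It is not code from the patches: the event names, the
fp_model struct and the replay() helper are all hypothetical, and a single
EV_VCPU_PUT event stands in for both "scheduled out" and "returns to user
space". It only counts how many hardware fp/simd switches each policy would
perform for a guest that touches fp/simd on every entry.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical event model, not kernel code. */
enum vcpu_event {
	EV_GUEST_ENTER,
	EV_GUEST_FP_ACCESS,	/* guest touches fp/simd, trap if not loaded */
	EV_GUEST_EXIT,		/* exit to host */
	EV_VCPU_PUT,		/* vCPU scheduled out or back to user space */
};

struct fp_model {
	bool guest_fp_loaded;		/* guest fp/simd state live in hardware */
	unsigned long hw_switches;	/* hardware context switches performed */
};

/* Current scheme: load on guest access, restore host state on every exit. */
static void current_scheme(struct fp_model *m, enum vcpu_event ev)
{
	if (ev == EV_GUEST_FP_ACCESS && !m->guest_fp_loaded) {
		m->guest_fp_loaded = true;
		m->hw_switches++;
	} else if (ev == EV_GUEST_EXIT && m->guest_fp_loaded) {
		m->guest_fp_loaded = false;
		m->hw_switches++;
	}
}

/* Enhanced scheme: load on guest access, keep across exits, restore on put. */
static void enhanced_scheme(struct fp_model *m, enum vcpu_event ev)
{
	if (ev == EV_GUEST_FP_ACCESS && !m->guest_fp_loaded) {
		m->guest_fp_loaded = true;
		m->hw_switches++;
	} else if (ev == EV_VCPU_PUT && m->guest_fp_loaded) {
		m->guest_fp_loaded = false;
		m->hw_switches++;
	}
}

/* Replay `exits` enter/access/exit cycles followed by one vcpu_put. */
static unsigned long replay(void (*scheme)(struct fp_model *, enum vcpu_event),
			    unsigned int exits)
{
	struct fp_model m = { false, 0 };
	unsigned int i;

	for (i = 0; i < exits; i++) {
		scheme(&m, EV_GUEST_ENTER);
		scheme(&m, EV_GUEST_FP_ACCESS);
		scheme(&m, EV_GUEST_EXIT);
	}
	scheme(&m, EV_VCPU_PUT);
	assert(!m.guest_fp_loaded);	/* host state must be back after put */
	return m.hw_switches;
}
```

In this model, replay(current_scheme, 100) counts 200 switches, while
replay(enhanced_scheme, 100) counts 2. Real workloads will differ, since
guests do not touch fp/simd on every entry.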

Patches were tested on FVP and Foundation Model sw platforms running floating
point applications, comparing the outcome against known results. A bad FP/SIMD
context switch should result in FP errors. Artificially skipping an fp/simd
context switch (1 in 1000) causes the applications to report errors.
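
The fault-injection idea can be sketched roughly as follows. This is not the
code used in the patches or in the test repository: skip_injector,
should_skip_switch() and count_mismatches() are hypothetical names, and the
sketch only shows how a 1-in-N skip decision and a compare against known
results might look.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical injector: requests a skip on every `period`-th call. */
struct skip_injector {
	unsigned long calls;
	unsigned long period;	/* e.g. 1000 for "1 in 1000" */
};

static bool should_skip_switch(struct skip_injector *inj)
{
	assert(inj->period > 0);
	inj->calls++;
	return inj->calls % inj->period == 0;
}

/*
 * Hypothetical checker: count results that differ from the known values.
 * Exact comparison is intended here, since the same deterministic
 * computation should give bit-identical results on every run.
 */
static unsigned long count_mismatches(const double *got,
				      const double *expected,
				      unsigned long n)
{
	unsigned long i, errors = 0;

	for (i = 0; i < n; i++)
		if (got[i] != expected[i])
			errors++;
	return errors;
}
```

A non-zero mismatch count would then correspond to the FP errors reported by
the applications when a context switch is skipped.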

The test can be found here: https://github.com/mjsmar/arm-arm64-fpsimd-test

Tests run:
armv7:
- On host executed 12 fp applications - evenly pinned to cpus
- Two guests - with 12 fp crunching processes - also pinned to vcpus
- half ran with 1ms sleep, the rest with no sleep

armv8:
- same as above, except with a mix of armv7 and armv8 guests

These patches are based on earlier arm64 fp/simd optimization work -
https://lists.cs.columbia.edu/pipermail/kvmarm/2015-July/015748.html

And subsequent fixes by Marc and Christoffer at the KVM Forum hackathon to
handle a 32-bit guest on a 64-bit host -
https://lists.cs.columbia.edu/pipermail/kvmarm/2015-August/016128.html

Changes since v3->v4:
- Follow-up on Christoffer's comments
  - Move fpexc handling to vcpu_load and vcpu_put
  - Enable and restore fpexc in EL2 mode when running a 32-bit guest on
    64-bit EL2
  - Rework hcptr handling

Changes since v2->v3:
- combined arm v7 and v8 into one short patch series
- moved access to fpexc32_el2 back to EL2
- Move host restore to EL1 from EL2 and call directly from host
- optimize trap enable code 
- renamed some variables to match usage

Changes since v1->v2:
- Fixed vfp/simd trap configuration to enable trace trapping
- Removed set_hcptr branch label
- Fixed handling of FPEXC to restore guest and host versions on vcpu_put
- Tested arm32/arm64
- rebased to 4.3-rc2
- changed a couple register accesses from 64 to 32 bit


Mario Smarduch (3):
  add hooks for armv7 fp/simd lazy switch support
  enable enhanced armv7 fp/simd lazy switch
  enable enhanced armv8 fp/simd lazy switch

 arch/arm/include/asm/kvm_host.h   | 42 ++++++++++++++++++++
 arch/arm/kernel/asm-offsets.c     |  2 +
 arch/arm/kvm/arm.c                | 24 +++++++++++
 arch/arm/kvm/interrupts.S         | 58 ++++++++++++++++-----------
 arch/arm/kvm/interrupts_head.S    | 26 ++++++++----
 arch/arm64/include/asm/kvm_asm.h  |  2 +
 arch/arm64/include/asm/kvm_host.h | 19 +++++++++
 arch/arm64/kernel/asm-offsets.c   |  1 +
 arch/arm64/kvm/hyp.S              | 83 +++++++++++++++++++++++++--------------
 9 files changed, 196 insertions(+), 61 deletions(-)

-- 
1.9.1



