[PATCH] arm64: KVM: Optimize arm64 guest exit VFP/SIMD register save/restore

Mario Smarduch m.smarduch at samsung.com
Mon Jun 15 20:04:10 PDT 2015


On 06/15/2015 11:20 AM, Marc Zyngier wrote:
> On 15/06/15 19:04, Mario Smarduch wrote:
>> On 06/15/2015 03:00 AM, Marc Zyngier wrote:
>>> Hi Mario,
>>>
[ ... ]
>>>
>>> On 13/06/15 23:20, Mario Smarduch wrote:
>>>> Currently VFP/SIMD registers are always saved and restored
>>>> on Guest entry and exit.
>>>>
>>>> This patch only saves and restores VFP/SIMD registers on
>>>> Guest access. To do this cptr_el2 VFP/SIMD trap is set
>>>> on Guest entry and later checked on exit. This follows
>>>> the ARMv7 VFPv3 implementation. An informal test shows
>>>> a high number of exits that don't access VFP/SIMD
>>>> registers.
>>>
>>> It would be good to add some numbers here. How often do we exit without
>>> having touched the FPSIMD regs? For which workload?
>>
>> Lmbench is what I typically use, with an ssh server, i.e., loads
>> that cause page faults and interrupts - usually the registers are
>> not touched. I'll run the tests again and quantify "usually".
>>
>> Any other loads you had in mind?
> 
> Not really (apart from running hackbench, of course...;-). I'd just like
> to see the numbers in the commit message, so that we can document the
> improvement (and maybe track regressions).

Hi Marc,
   some ballpark numbers.

   hackbench: the optimized path is taken about 30% of the time
(for the 10*40 test).

Lmbench3: upwards of 50% for context switching, memory bw,
pipe, proc creation, sys call. There are a lot more tests
but I limited it to these. In addition, other processes
are running in the background (NTP, SSH, ...) doing their own
thing.

I added a tmp counter to kvm_vcpu_arch to count vfpsimd
events.
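
For illustration only, here is a minimal user-space C model of how such
a counter can give the percentages above. It is not the patch code:
struct vcpu_model, its fields and all function names are hypothetical,
and the boolean stands in for the cptr_el2 VFP/SIMD trap bit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for temporary counters in kvm_vcpu_arch. */
struct vcpu_model {
	bool fp_trap_enabled;    /* models the cptr_el2 VFP/SIMD trap bit */
	uint64_t exits;          /* all guest exits */
	uint64_t vfpsimd_exits;  /* exits that had to switch VFP/SIMD state */
};

/* Guest entry: arm the trap instead of restoring guest VFP/SIMD state. */
void guest_enter(struct vcpu_model *v)
{
	v->fp_trap_enabled = true;
}

/*
 * First guest VFP/SIMD access traps: the real code would save the host
 * registers and restore the guest ones here, then clear the trap so
 * later accesses in the same run do not trap again.
 */
void guest_fp_access(struct vcpu_model *v)
{
	if (v->fp_trap_enabled)
		v->fp_trap_enabled = false;
}

/*
 * Guest exit: if the trap is still armed the guest never touched the
 * registers, so the save/restore is skipped (the optimized path).
 */
void guest_exit(struct vcpu_model *v)
{
	v->exits++;
	if (!v->fp_trap_enabled)
		v->vfpsimd_exits++;
}

/* Percentage of exits that took the optimized path. */
unsigned int optimized_pct(const struct vcpu_model *v)
{
	if (v->exits == 0)
		return 0;
	return (unsigned int)(100 * (v->exits - v->vfpsimd_exits) / v->exits);
}
```

In this model, ten enter/exit cycles of which three touch VFP/SIMD give
vfpsimd_exits == 3 and optimized_pct() == 70.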

- Mario
> 
> [...]
> 
>>>
>>>>  	skip_debug_state x3, 1f
>>>>  	// Clear the dirty flag for the next run, as all the state has
>>>>  	// already been saved. Note that we nuke the whole 64bit word.
>>>> @@ -1166,6 +1211,10 @@ el1_sync:					// Guest trapped into EL2
>>>>  	mrs	x1, esr_el2
>>>>  	lsr	x2, x1, #ESR_ELx_EC_SHIFT
>>>>
>>>> +	/* Guest accessed VFP/SIMD registers, save host, restore Guest */
>>>> +	cmp	x2, #ESR_ELx_EC_FP_ASIMD
>>>> +	b.eq	switch_to_guest_vfp
>>>> +
>>>
>>> I'd prefer you moved that hunk to el1_trap, where we handle all the
>>> traps coming from the guest.
>>
>> I'm wondering whether it would make sense to update the armv7
>> side as well. Reading both exit handlers, the flows mirror
>> each other.
> 
> The 32bit code is starting to show its age, and could probably do with a
> refactor. If you have some cycles to spare, that'd be quite interesting.
> 
> Thanks,
> 
> 	M.
> 



