[PATCH v3 05/22] arm64: KVM: Implement vgic-v3 save/restore
Mario Smarduch
m.smarduch at samsung.com
Mon Dec 7 18:14:36 PST 2015
On 12/7/2015 10:20 AM, Marc Zyngier wrote:
> On 07/12/15 18:05, Mario Smarduch wrote:
>>
>>
>> On 12/7/2015 9:37 AM, Marc Zyngier wrote:
[...]
>>>
>>
>> I was thinking something like 'current_lr[VGIC_V3_LR_INDEX(...)]'.
>
> That doesn't change anything, the compiler is perfectly able to
> optimize something like this:
>
> [...]
> ffffffc0007f31ac: 38624862 ldrb w2, [x3,w2,uxtw]
> ffffffc0007f31b0: 10000063 adr x3, ffffffc0007f31bc <__vgic_v3_save_state+0x64>
> ffffffc0007f31b4: 8b228862 add x2, x3, w2, sxtb #2
> ffffffc0007f31b8: d61f0040 br x2
> ffffffc0007f31bc: d53ccde2 mrs x2, s3_4_c12_c13_7
> ffffffc0007f31c0: f9001c02 str x2, [x0,#56]
> ffffffc0007f31c4: d53ccdc2 mrs x2, s3_4_c12_c13_6
> ffffffc0007f31c8: f9002002 str x2, [x0,#64]
> ffffffc0007f31cc: d53ccda2 mrs x2, s3_4_c12_c13_5
> ffffffc0007f31d0: f9002402 str x2, [x0,#72]
> ffffffc0007f31d4: d53ccd82 mrs x2, s3_4_c12_c13_4
> ffffffc0007f31d8: f9002802 str x2, [x0,#80]
> ffffffc0007f31dc: d53ccd62 mrs x2, s3_4_c12_c13_3
> ffffffc0007f31e0: f9002c02 str x2, [x0,#88]
> ffffffc0007f31e4: d53ccd42 mrs x2, s3_4_c12_c13_2
> ffffffc0007f31e8: f9003002 str x2, [x0,#96]
> ffffffc0007f31ec: d53ccd22 mrs x2, s3_4_c12_c13_1
> ffffffc0007f31f0: f9003402 str x2, [x0,#104]
> ffffffc0007f31f4: d53ccd02 mrs x2, s3_4_c12_c13_0
> ffffffc0007f31f8: f9003802 str x2, [x0,#112]
> ffffffc0007f31fc: d53ccce2 mrs x2, s3_4_c12_c12_7
> ffffffc0007f3200: f9003c02 str x2, [x0,#120]
> ffffffc0007f3204: d53cccc2 mrs x2, s3_4_c12_c12_6
> ffffffc0007f3208: f9004002 str x2, [x0,#128]
> ffffffc0007f320c: d53ccca2 mrs x2, s3_4_c12_c12_5
> ffffffc0007f3210: f9004402 str x2, [x0,#136]
> ffffffc0007f3214: d53ccc82 mrs x2, s3_4_c12_c12_4
> ffffffc0007f3218: f9004802 str x2, [x0,#144]
> ffffffc0007f321c: d53ccc62 mrs x2, s3_4_c12_c12_3
> ffffffc0007f3220: f9004c02 str x2, [x0,#152]
> ffffffc0007f3224: d53ccc42 mrs x2, s3_4_c12_c12_2
> ffffffc0007f3228: f9005002 str x2, [x0,#160]
> ffffffc0007f322c: d53ccc22 mrs x2, s3_4_c12_c12_1
> ffffffc0007f3230: f9005402 str x2, [x0,#168]
> ffffffc0007f3234: d53ccc02 mrs x2, s3_4_c12_c12_0
> ffffffc0007f3238: 7100183f cmp w1, #0x6
> ffffffc0007f323c: f9005802 str x2, [x0,#176]
>
> As you can see, this is as optimal as it gets, short of being able
> to find a nice way to use more than one register...
Interesting, thanks for the dump. I'm no expert on pipeline optimizations, but
I'm wondering whether these system register accesses can be executed out of
order, provided there are none of what I think are write-after-read dependencies?
It's only 4 registers here, but there are some longer stretches in subsequent
patches.
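For anyone following along, the shape of the dump above (a computed branch
into a run of mrs/str pairs) is what a fall-through switch typically compiles
to. Below is a minimal user-space sketch of that pattern, not the patch code:
MAX_LRS, LR_INDEX(), read_lr(), fake_lr_hw and save_lrs() are hypothetical
stand-ins, the array replaces the real mrs reads, and it covers 4 registers
rather than the 16 in the dump. The reversed index is an assumption taken from
the store offsets in the dump, where the highest-numbered register lands at
the lowest offset.

```c
#include <stdint.h>

#define MAX_LRS 4                          /* hypothetical; the dump covers 16 */
#define LR_INDEX(lr) (MAX_LRS - 1 - (lr))  /* highest LR lands in slot 0 */

/* Stand-in for the hardware list registers; real code would use mrs. */
static uint64_t fake_lr_hw[MAX_LRS] = { 0x100, 0x101, 0x102, 0x103 };

static uint64_t read_lr(int n)
{
	return fake_lr_hw[n];
}

/*
 * Save list registers 0..max_idx into saved[].
 * The switch selects the entry point; every case then falls through
 * to the next one, so a single branch is followed by straight-line code.
 */
static void save_lrs(uint64_t *saved, int max_idx)
{
	switch (max_idx) {
	case 3: saved[LR_INDEX(3)] = read_lr(3); /* fall through */
	case 2: saved[LR_INDEX(2)] = read_lr(2); /* fall through */
	case 1: saved[LR_INDEX(1)] = read_lr(1); /* fall through */
	case 0: saved[LR_INDEX(0)] = read_lr(0);
	}
}
```

Calling save_lrs(saved, 2) on a zeroed array fills slots 1 to 3 from registers
2, 1 and 0 and leaves slot 0 untouched. Whether the index is spelled out as a
constant or hidden behind a macro, the compiler sees the same compile-time
offsets.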
A minor note: there is some stray white space in this patch.
>
> Thanks,
>
> M.
>