[PATCH] Skip unnecessary pte makeup when clearing it.

bill4carson bill4carson at gmail.com
Fri Feb 3 02:43:58 EST 2012



On 2012-02-03 14:54, Uwe Kleine-König wrote:
> Hello,
>
> On Mon, Jan 30, 2012 at 04:36:07PM +0800, bill4carson at gmail.com wrote:
>> From: Bill Carson<bill4carson at gmail.com>
>>
>> If we are only about to clear a hardware pte entry, then pte makeup code is
>> unnecessary for cpu_v7_set_pte_ext and armv6_set_pte_ext, so just skip it.
>>
>> Signed-off-by: Bill Carson<bill4carson at gmail.com>
> I haven't tested and I don't know if the patch brings any advantages like
> increased speed. But AFAICT it doesn't change the behaviour of
> armv6_set_pte_ext and cpu_v7_set_pte_ext.
>
Hi, Uwe

I'm sorry I didn't state the purpose of this patch clearly.
As a matter of fact, it does change the behavior of set_pte_ext :)

Without this patch, the code path is:
set a pte: lines 147->173, 176->181
clear a pte: lines 147->174, 176->181

The point is that lines 147->173 spend a lot of CPU cycles working out the
right value for r3, which is only needed when setting a pte. When clearing a
pte, r3 is *ALWAYS* zero, so that work can be skipped in that case.

145 ENTRY(cpu_v7_set_pte_ext)
146 #ifdef CONFIG_MMU
147 str r1, [r0] @ linux version
148
149 bic r3, r1, #0x000003f0
150 bic r3, r3, #PTE_TYPE_MASK
151 orr r3, r3, r2
152 orr r3, r3, #PTE_EXT_AP0 | 2
153
154 tst r1, #1 << 4
155 orrne r3, r3, #PTE_EXT_TEX(1)
156
157 eor r1, r1, #L_PTE_DIRTY
158 tst r1, #L_PTE_RDONLY | L_PTE_DIRTY
159 orrne r3, r3, #PTE_EXT_APX
160
161 tst r1, #L_PTE_USER
162 orrne r3, r3, #PTE_EXT_AP1
163 #ifdef CONFIG_CPU_USE_DOMAINS
164 @ allow kernel read/write access to read-only user pages
165 tstne r3, #PTE_EXT_APX
166 bicne r3, r3, #PTE_EXT_APX | PTE_EXT_AP0
167 #endif
168
169 tst r1, #L_PTE_XN
170 orrne r3, r3, #PTE_EXT_XN
171
172 tst r1, #L_PTE_YOUNG
173 tstne r1, #L_PTE_PRESENT
174 moveq r3, #0
175
176 ARM( str r3, [r0, #2048]! )
177 THUMB( add r0, r0, #2048 )
178 THUMB( str r3, [r0] )
179 mcr p15, 0, r0, c7, c10, 1 @ flush_pte
180 #endif
181 mov pc, lr
182 ENDPROC(cpu_v7_set_pte_ext)



With this patch, the code path is:
set a pte: lines 147->150, 153->182
clear a pte: lines 147->152, 176->182

The code path for setting a pte barely changes, but the code path for
clearing a pte is significantly shorter than before, which should make
clearing cheaper.

145 ENTRY(cpu_v7_set_pte_ext)
146 #ifdef CONFIG_MMU
147 str r1, [r0] @ linux version
148
149 tst r1, #L_PTE_YOUNG
150 tstne r1, #L_PTE_PRESENT
151 moveq r3, #0
152 beq 1f
153 bic r3, r1, #0x000003f0
154 bic r3, r3, #PTE_TYPE_MASK
155 orr r3, r3, r2
156 orr r3, r3, #PTE_EXT_AP0 | 2
157
158 tst r1, #1 << 4
159 orrne r3, r3, #PTE_EXT_TEX(1)
160
161 eor r1, r1, #L_PTE_DIRTY
162 tst r1, #L_PTE_RDONLY | L_PTE_DIRTY
163 orrne r3, r3, #PTE_EXT_APX
164
165 tst r1, #L_PTE_USER
166 orrne r3, r3, #PTE_EXT_AP1
167 #ifdef CONFIG_CPU_USE_DOMAINS
168 @ allow kernel read/write access to read-only user pages
169 tstne r3, #PTE_EXT_APX
170 bicne r3, r3, #PTE_EXT_APX | PTE_EXT_AP0
171 #endif
172
173 tst r1, #L_PTE_XN
174 orrne r3, r3, #PTE_EXT_XN
175
176 1:
177 ARM( str r3, [r0, #2048]! )
178 THUMB( add r0, r0, #2048 )
179 THUMB( str r3, [r0] )
180 mcr p15, 0, r0, c7, c10, 1 @ flush_pte
181 #endif
182 mov pc, lr
183 ENDPROC(cpu_v7_set_pte_ext)
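
As a cross-check of the two listings, here is a rough C model of the value
that ends up in r3. It is only a sketch: the bit positions in the #defines are
my assumption about the 2-level page table headers of this kernel generation
and should be checked against your tree, CONFIG_CPU_USE_DOMAINS is assumed to
be off, and the names make_hw_pte, hw_pte_old, hw_pte_new, models_agree and
makeup_calls are made up for illustration. It models neither the stores nor the
cache flush.

```c
#include <stdint.h>

/* ASSUMED bit layouts, not taken from this mail; verify against your tree. */
#define L_PTE_PRESENT  (1u << 0)
#define L_PTE_YOUNG    (1u << 1)
#define L_PTE_DIRTY    (1u << 6)
#define L_PTE_RDONLY   (1u << 7)
#define L_PTE_USER     (1u << 8)
#define L_PTE_XN       (1u << 9)

#define PTE_TYPE_MASK  (3u << 0)
#define PTE_EXT_XN     (1u << 0)
#define PTE_EXT_AP0    (1u << 4)
#define PTE_EXT_AP1    (2u << 4)
#define PTE_EXT_TEX(x) ((uint32_t)(x) << 6)
#define PTE_EXT_APX    (1u << 9)

/* Counts how often the "makeup" work runs (hypothetical instrumentation). */
unsigned makeup_calls;

/* The bic/orr/tst sequence that builds r3 (CONFIG_CPU_USE_DOMAINS off). */
static uint32_t make_hw_pte(uint32_t pte, uint32_t ext)
{
    uint32_t hw;

    makeup_calls++;
    hw = pte & ~0x000003f0u;
    hw &= ~PTE_TYPE_MASK;
    hw |= ext;
    hw |= PTE_EXT_AP0 | 2u;
    if (pte & (1u << 4))
        hw |= PTE_EXT_TEX(1);
    /* eor + tst: true when the page is read-only or not dirty */
    if ((pte ^ L_PTE_DIRTY) & (L_PTE_RDONLY | L_PTE_DIRTY))
        hw |= PTE_EXT_APX;
    if (pte & L_PTE_USER)
        hw |= PTE_EXT_AP1;
    if (pte & L_PTE_XN)
        hw |= PTE_EXT_XN;
    return hw;
}

/* tst YOUNG; tstne PRESENT: "ne" only when both bits are set. */
static int pte_is_live(uint32_t pte)
{
    return (pte & L_PTE_YOUNG) && (pte & L_PTE_PRESENT);
}

/* Without the patch: always build r3, then throw it away if not live. */
uint32_t hw_pte_old(uint32_t pte, uint32_t ext)
{
    uint32_t hw = make_hw_pte(pte, ext);
    return pte_is_live(pte) ? hw : 0;
}

/* With the patch: test first and branch past the makeup (beq 1f). */
uint32_t hw_pte_new(uint32_t pte, uint32_t ext)
{
    if (!pte_is_live(pte))
        return 0;
    return make_hw_pte(pte, ext);
}

/* Returns 1 if both orderings give the same r3 for every low-12-bit
 * pattern, with an arbitrary page address and two arbitrary ext values. */
int models_agree(void)
{
    const uint32_t exts[2] = { 0u, 1u << 11 };
    for (unsigned e = 0; e < 2; e++)
        for (uint32_t low = 0; low < 0x1000u; low++) {
            uint32_t pte = 0x12345000u | low;
            if (hw_pte_old(pte, exts[e]) != hw_pte_new(pte, exts[e]))
                return 0;
        }
    return 1;
}
```

Under those assumptions both functions return the same value for every input,
so what gets written at offset 2048 is unchanged. The only difference is that
hw_pte_new never calls make_hw_pte for a pte that is not both young and
present, which is the shortcut the patch is after.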


I hope the above explanation justifies this patch.



Regards
Bill



> Best regards
> Uwe
>

-- 
I am a slow learner
but I will keep trying to fight for my dreams!

--bill




More information about the linux-arm-kernel mailing list