[PATCH] [ARM] Introduce patching of phys_to_virt and vice versa

Eric Miao eric.miao at canonical.com
Tue Sep 28 09:35:11 EDT 2010


On Tue, Aug 10, 2010 at 12:55 AM, Nicolas Pitre <nico at fluxnic.net> wrote:
> On Sun, 8 Aug 2010, Russell King - ARM Linux wrote:
>
>> On Fri, Aug 06, 2010 at 01:07:46PM -0400, Nicolas Pitre wrote:
>> > No idea.  That depends how gcc is considering inline asm.  Given there
>> > is one input operand, I suppose gcc would account for the delay slot
>> > before that operand is actually available.
>>
>> For something which could very well become by default enabled across
>> the board - due to the push to have a single kernel binary for lots
>> of significantly different platforms - it seems little research has
>> been carried out on this point.
>>
>> While I agree that other obvious solutions would be more expensive, I
>> think it makes sense to have the commit which is introducing this
>> method include a proper evaluation, not least so that people who need
>> to make the decision to enable this aren't repeating the same research
>> that someone else has done (or, worse, just decide to enable it 'just
>> because' without any understanding of what the effect may be.)
>
> Sure, I agree.  I'll contact gcc people for a definitive answer.
>

Nico, any feedback?

>> I think we also need to consider whether the volatile is really required.
>> This asm() doesn't have any side effects other than its output operand,
>> so I suspect that the volatile may get in the way of some optimisations
>> (such as deleting the operation if the output is not actually used.)
>
> Indeed, the volatile should go.  This is even getting in the way of
> better scheduling.
>

Yep. I don't see the necessity here for the volatile. Actually after removing
it, it resulted a better instruction scheduling.

I took example of arch/arm/mach-pxa/mioa701.c

static int mioa701_sys_suspend(struct sys_device *sysdev, pm_message_t state)
{
	int i = 0, is_bt_on;
	u32 *mem_resume_vector	= phys_to_virt(RESUME_VECTOR_ADDR);
	u32 *mem_resume_enabler = phys_to_virt(RESUME_ENABLE_ADDR);
	u32 *mem_resume_bt	= phys_to_virt(RESUME_BT_ADDR);
	u32 *mem_resume_unknown	= phys_to_virt(RESUME_UNKNOWN_ADDR);

with __volatile__ gcc gave me something like:

c004de44 <mioa701_sys_suspend>:
c004de44:	e1a0c00d 	mov	ip, sp
c004de48:	e92dd9f8 	push	{r3, r4, r5, r6, r7, r8, fp, ip, lr, pc}
c004de4c:	e24cb004 	sub	fp, ip, #4
c004de50:	e59f80d8 	ldr	r8, [pc, #216]	; c004df30 <mioa701_sys_suspend+0xec>
c004de54:	e2888000 	add	r8, r8, #0
c004de58:	e59f50d4 	ldr	r5, [pc, #212]	; c004df34 <mioa701_sys_suspend+0xf0>
c004de5c:	e2855000 	add	r5, r5, #0
c004de60:	e59f40d0 	ldr	r4, [pc, #208]	; c004df38 <mioa701_sys_suspend+0xf4>
c004de64:	e2844000 	add	r4, r4, #0
c004de68:	e59f70cc 	ldr	r7, [pc, #204]	; c004df3c <mioa701_sys_suspend+0xf8>
c004de6c:	e2877000 	add	r7, r7, #0
c004de70:	e59f30c8 	ldr	r3, [pc, #200]	; c004df40 <mioa701_sys_suspend+0xfc>
c004de74:	e3a00053 	mov	r0, #83	; 0x53


below was what generated after removing __volatile__:

c004de44 <mioa701_sys_suspend>:
c004de44:	e1a0c00d 	mov	ip, sp
c004de48:	e92dd9f8 	push	{r3, r4, r5, r6, r7, r8, fp, ip, lr, pc}
c004de4c:	e24cb004 	sub	fp, ip, #4
c004de50:	e59f30d8 	ldr	r3, [pc, #216]	; c004df30 <mioa701_sys_suspend+0xec>
c004de54:	e3a00053 	mov	r0, #83	; 0x53
c004de58:	e59f80d4 	ldr	r8, [pc, #212]	; c004df34 <mioa701_sys_suspend+0xf0>
c004de5c:	e5936008 	ldr	r6, [r3, #8]
c004de60:	e59f50d0 	ldr	r5, [pc, #208]	; c004df38 <mioa701_sys_suspend+0xf4>
c004de64:	e59f40d0 	ldr	r4, [pc, #208]	; c004df3c <mioa701_sys_suspend+0xf8>
c004de68:	e1a069a6 	lsr	r6, r6, #19
c004de6c:	e2166001 	ands	r6, r6, #1
c004de70:	13a01802 	movne	r1, #131072	; 0x20000
c004de74:	03a01801 	moveq	r1, #65536	; 0x10000
c004de78:	ebffee0a 	bl	c00496a8 <pxa2xx_mfp_set_lpm>
c004de7c:	e2888000 	add	r8, r8, #0
c004de80:	e2855000 	add	r5, r5, #0
c004de84:	e2844000 	add	r4, r4, #0
c004de88:	e59f30b0 	ldr	r3, [pc, #176]	; c004df40 <mioa701_sys_suspend+0xfc>
c004de8c:	e59f70b0 	ldr	r7, [pc, #176]	; c004df44 <mioa701_sys_suspend+0x100>
c004de90:	e2877000 	add	r7, r7, #0

Note those 'add rN, rN, #0' instructions are those stub instructions
for patching.

Does this also show something w.r.t an optimal instruction scheduling?

>
> Nicolas
>



More information about the linux-arm-kernel mailing list