[PATCH 1/1] irqchip: irq-gic: forward SGI to itself for cortex-a7 single core

Peter Chen hzpeterchen at gmail.com
Tue Aug 9 02:39:42 PDT 2016


On Tue, Aug 09, 2016 at 09:54:34AM +0100, Marc Zyngier wrote:
> On 09/08/16 08:18, Peter Chen wrote:
> > On Tue, Aug 09, 2016 at 07:59:30AM +0100, Marc Zyngier wrote:
> >> On Tue, 9 Aug 2016 13:57:01 +0800
> >> Peter Chen <hzpeterchen at gmail.com> wrote:
> >>
> >>> On Tue, Aug 09, 2016 at 06:34:01AM +0100, Marc Zyngier wrote:
> >>>> On Tue, 9 Aug 2016 11:46:13 +0800
> >>>> Peter Chen <hzpeterchen at gmail.com> wrote:
> >>>>   
> >>>>> On Mon, Aug 08, 2016 at 02:59:16PM +0100, Marc Zyngier wrote:  
> >>>>>> On Mon, 8 Aug 2016 14:48:42 +0100
> >>>>>> Mark Rutland <mark.rutland at arm.com> wrote:
> >>>>>>     
> >>>>>>> On Mon, Aug 08, 2016 at 09:28:47PM +0800, Peter Chen wrote:    
> >>>>>>>> On Mon, Aug 08, 2016 at 02:07:54PM +0100, Mark Rutland wrote:      
> >>>>>>>>> I see that for arm64 we have:
> >>>>>>>>>
> >>>>>>>>> static inline bool arch_irq_work_has_interrupt(void)
> >>>>>>>>> {
> >>>>>>>>> 	return !!__smp_cross_call;
> >>>>>>>>> }
> >>>>>>>>>
> >>>>>>>>> Could we do similarly for ARM, and ony register gic_raise_softirq if
> >>>>>>>>> we have non-zero SGI targets?
> >>>>>>>>>
> >>>>>>>>> If I've understood correctly, that would make things behave as they do
> >>>>>>>>> for UP on you system.      
> >>>>>>>
> >>>>>>> [...]
> >>>>>>>     
> >>>>>>>>> If self-IPI is necessary, then this would be up to the GIC code to
> >>>>>>>>> solve.
> >>>>>>>>>
> >>>>>>>>> For that case, it would be nicer if we could detect whether this was
> >>>>>>>>> necessary based on the GIC registers alone. That way we handle the
> >>>>>>>>> various ways this can be integrated, aren't totally relient on the DT,
> >>>>>>>>> work in VMs, etc.      
> >>>>>>>>
> >>>>>>>> How we can detect IPI capabilities based on GIC register?      
> >>>>>>>
> >>>>>>> Check the mask associated with SGIs, as we do for gic_get_cpumask(). If
> >>>>>>> this is zero, we have a non-multiprocessor GIC (or one that's otherwise
> >>>>>>> broken), and can't do SGI in the usual way.
> >>>>>>>
> >>>>>>> However, it only makes sense to do this if self-IPI is truly a
> >>>>>>> necessity. Given there are other interrupt controllers that can't do
> >>>>>>> self-IPI, avoiding self-IPI in general would be a better strategy,
> >>>>>>> avoiding churn in each and every driver...    
> >>>>>>
> >>>>>> Indeed. And I won't take such a patch until all other avenues have been
> >>>>>> explored, including fixing core code if required...
> >>>>>>     
> >>>>>
> >>>>> Ok, it seems both you and Mark agree with disable IPI for GIC who has only
> >>>>> self-IPI capability (GICD_ITARGETSR0 to GICD_ITARGETSR7 are all
> >>>>> zero), right?  
> >>>>
> >>>> Not necessarily. This can be seen a latency improvement, compared to
> >>>> the timer method which should be the fallback.
> >>>>   
> >>>
> >>> Why? Your below patch (I tried too) just fixes NULL pointer issue for
> >>
> >> And that's the first issue to solve.
> >>
> >>> without define smp_cross_call function. But imx6ul is a SMP platform
> >>
> >> It is *not* an SMP platform. It may have a SMP-capable core, but that's
> >> about it.
> >>
> > 
> > Well. That's what I thought at the beginning, but the kernel
> > takes it is. At __fixup_smp (arch/arm/kernel/head.S), it only checks
> > MPIDR, for MPcore, it is 0x80000000, it means it is Multiprocessing
> > Extensions and Processor is part of a multiprocessor system.
> 
> If it is a multi-processor system, please show me the second core.
> If you can't, this is a UP system, end of story. It may have the MP
> extensions (there is no such thing as a non-MP A7), but that doesn't
> make it an SMP system. Please don't confuse the two things.
> 

> 
> Again, you're confusing MP-capable with SMP. Yes, the kernel tends to 
> confuse the two as well because it is not always easy to tell them 
> apart (as you just found). That doesn't mean we can't do a better job 
> separating the two concepts when we have the right level of information
> (i.e. we know the topology of the system).
> 

Thanks, you are right, from hardware level, it is UP system.

> > 
> >>> Besides, if the hardware has IPI capability, but we just disable it
> >>> to align with UP platforms, is it reasonable?
> >>
> >> Again: having a self-IPI on UP is an optimization. Nothing more.
> >>
> >> Now, coming back to your original idea, I'm aiming towards something
> >> like this:
> >>
> > 
> > Your below patch can work (tested), but why not registering an self-IPI
> > smp_cross_call function for single core, it can avoid judging in code
> > for each IPI calls.
> 
> Because it is an unnecessary complication. If you can demonstrate that 
> this single test is an overhead, then I'll consider making this a 
> separate function. Also, we can  move the test out of the lock that protects
> the CPU map as, by definition, there is nothing to protect, making it even
> more lightweight than your own approach:

Your patch can work for my case. Below is objdump for gic_raise_softirq,
the code with your patch seems have more instructions.

- With your patch:
00000c44 <gic_raise_softirq>:
     c44:	e1a0c00d 	mov	ip, sp
     c48:	e92ddbf0 	push	{r4, r5, r6, r7, r8, r9, fp, ip, lr, pc}
     c4c:	e24cb004 	sub	fp, ip, #4
     c50:	e59f908c 	ldr	r9, [pc, #140]	; ce4 <gic_raise_softirq+0xa0>
     c54:	e1a06000 	mov	r6, r0
     c58:	e1a07001 	mov	r7, r1
     c5c:	e5993000 	ldr	r3, [r9]
     c60:	e3530001 	cmp	r3, #1
     c64:	0a000019 	beq	cd0 <gic_raise_softirq+0x8c>
     c68:	e59f0078 	ldr	r0, [pc, #120]	; ce8 <gic_raise_softirq+0xa4>
     c6c:	ebfffffe 	bl	0 <_raw_spin_lock_irqsave>
     c70:	e3a04000 	mov	r4, #0
     c74:	e59f5070 	ldr	r5, [pc, #112]	; cec <gic_raise_softirq+0xa8>
     c78:	e1a08000 	mov	r8, r0
     c7c:	e3e00000 	mvn	r0, #0
     c80:	ea000001 	b	c8c <gic_raise_softirq+0x48>
     c84:	e5d236ac 	ldrb	r3, [r2, #1708]	; 0x6ac
     c88:	e1844003 	orr	r4, r4, r3
     c8c:	e2802001 	add	r2, r0, #1
     c90:	e3a01004 	mov	r1, #4
     c94:	e1a00006 	mov	r0, r6
     c98:	ebfffffe 	bl	0 <_find_next_bit_le>
     c9c:	e5993000 	ldr	r3, [r9]
     ca0:	e1530000 	cmp	r3, r0
     ca4:	e0852000 	add	r2, r5, r0
     ca8:	cafffff5 	bgt	c84 <gic_raise_softirq+0x40>
     cac:	e3a03000 	mov	r3, #0
     cb0:	ee073fba 	mcr	15, 0, r3, cr7, cr10, {5}
     cb4:	e1874804 	orr	r4, r7, r4, lsl #16
     cb8:	e5953088 	ldr	r3, [r5, #136]	; 0x88
     cbc:	e5834f00 	str	r4, [r3, #3840]	; 0xf00
     cc0:	e59f0020 	ldr	r0, [pc, #32]	; ce8 <gic_raise_softirq+0xa4>
     cc4:	e1a01008 	mov	r1, r8
     cc8:	ebfffffe 	bl	0 <_raw_spin_unlock_irqrestore>
     ccc:	e89dabf0 	ldm	sp, {r4, r5, r6, r7, r8, r9, fp, sp, pc}
     cd0:	e59f3014 	ldr	r3, [pc, #20]	; cec <gic_raise_softirq+0xa8>
     cd4:	e3817402 	orr	r7, r1, #33554432	; 0x2000000
     cd8:	e5933088 	ldr	r3, [r3, #136]	; 0x88
     cdc:	e5837f00 	str	r7, [r3, #3840]	; 0xf00
     ce0:	e89dabf0 	ldm	sp, {r4, r5, r6, r7, r8, r9, fp, sp, pc}
     ce4:	00000000 	.word	0x00000000
     ce8:	00000004 	.word	0x00000004
     cec:	00000000 	.word	0x00000000

- The current code
00000468 <gic_raise_softirq>:
     468:	e1a0c00d 	mov	ip, sp
     46c:	e92ddbf0 	push	{r4, r5, r6, r7, r8, r9, fp, ip, lr, pc}
     470:	e24cb004 	sub	fp, ip, #4
     474:	e1a06000 	mov	r6, r0
     478:	e59f0068 	ldr	r0, [pc, #104]	; 4e8 <gic_raise_softirq+0x80>
     47c:	e1a07001 	mov	r7, r1
     480:	ebfffffe 	bl	0 <_raw_spin_lock_irqsave>
     484:	e3a04000 	mov	r4, #0
     488:	e59f505c 	ldr	r5, [pc, #92]	; 4ec <gic_raise_softirq+0x84>
     48c:	e59f905c 	ldr	r9, [pc, #92]	; 4f0 <gic_raise_softirq+0x88>
     490:	e1a08000 	mov	r8, r0
     494:	e3e00000 	mvn	r0, #0
     498:	ea000001 	b	4a4 <gic_raise_softirq+0x3c>
     49c:	e5d236ac 	ldrb	r3, [r2, #1708]	; 0x6ac
     4a0:	e1844003 	orr	r4, r4, r3
     4a4:	e2802001 	add	r2, r0, #1
     4a8:	e3a01004 	mov	r1, #4
     4ac:	e1a00006 	mov	r0, r6
     4b0:	ebfffffe 	bl	0 <_find_next_bit_le>
     4b4:	e5993000 	ldr	r3, [r9]
     4b8:	e1500003 	cmp	r0, r3
     4bc:	e0852000 	add	r2, r5, r0
     4c0:	bafffff5 	blt	49c <gic_raise_softirq+0x34>
     4c4:	e3a03000 	mov	r3, #0
     4c8:	ee073fba 	mcr	15, 0, r3, cr7, cr10, {5}
     4cc:	e1874804 	orr	r4, r7, r4, lsl #16
     4d0:	e5953088 	ldr	r3, [r5, #136]	; 0x88
     4d4:	e5834f00 	str	r4, [r3, #3840]	; 0xf00
     4d8:	e59f0008 	ldr	r0, [pc, #8]	; 4e8 <gic_raise_softirq+0x80>
     4dc:	e1a01008 	mov	r1, r8
     4e0:	ebfffffe 	bl	0 <_raw_spin_unlock_irqrestore>
     4e4:	e89dabf0 	ldm	sp, {r4, r5, r6, r7, r8, r9, fp, sp, pc}
     4e8:	00000004 	.word	0x00000004

> 
> From 23ba8b645d219e333a10224d74fb5d8d75d67de2 Mon Sep 17 00:00:00 2001
> From: Marc Zyngier <marc.zyngier at arm.com>
> Date: Tue, 9 Aug 2016 07:50:44 +0100
> Subject: [PATCH] irqchip/gic: Allow self-SGIs for SMP on UP configurations
> 
> On systems where a single CPU is present, the GIC may not support
> having SGIs delivered to a target list. In that case, we use the
> self-SGI mechanism to allow the interrupt to be delivered locally.
> 
> Signed-off-by: Marc Zyngier <marc.zyngier at arm.com>
> ---
>  drivers/irqchip/irq-gic.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
> index c2cab57..390fac5 100644
> --- a/drivers/irqchip/irq-gic.c
> +++ b/drivers/irqchip/irq-gic.c
> @@ -769,6 +769,13 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
>  	int cpu;
>  	unsigned long flags, map = 0;
>  
> +	if (unlikely(nr_cpu_ids == 1)) {
> +		/* Only one CPU? let's do a self-IPI... */
> +		writel_relaxed(2 << 24 | irq,
> +			       gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
> +		return;
> +	}
> +
>  	raw_spin_lock_irqsave(&irq_controller_lock, flags);
>  
>  	/* Convert our logical CPU mask into a physical one. */
> 
> Also, your patch seems to break the arm64 ACPI support by moving the SMP
> setup to a DT-specific function (I don't see why this should be DT only
> anyway).
> 

Sorry, I don't know the code well.

-- 

Best Regards,
Peter Chen



More information about the linux-arm-kernel mailing list