[PATCH RFC v3 14/21] irqchip/gic-v3: Don't return errors from gic_acpi_match_gicc()

Jonathan Cameron Jonathan.Cameron at Huawei.com
Tue Jan 23 02:08:21 PST 2024


On Tue, 9 Jan 2024 19:27:20 +0000
"Russell King (Oracle)" <linux at armlinux.org.uk> wrote:

> On Fri, Dec 15, 2023 at 04:33:01PM +0000, Jonathan Cameron wrote:
> > On Wed, 13 Dec 2023 12:50:23 +0000
> > Russell King (Oracle) <rmk+kernel at armlinux.org.uk> wrote:
> >   
> > > From: James Morse <james.morse at arm.com>
> > > 
> > > gic_acpi_match_gicc() is only called via gic_acpi_count_gicr_regions().
> > > It should only count the number of enabled redistributors, but it
> > > also tries to sanity check the GICC entry, currently returning an
> > > error if the Enabled bit is set, but the gicr_base_address is zero.
> > > 
> > > Adding support for the online-capable bit to the sanity check
> > > complicates it, for no benefit. The existing check implicitly
> > > > depends on gic_acpi_count_gicr_regions() previously failing to find
> > > any GICR regions (as it is valid to have gicr_base_address of zero if
> > > the redistributors are described via a GICR entry).
> > > 
> > > Instead of complicating the check, remove it. Failures that happen
> > > at this point cause the irqchip not to register, meaning no irqs
> > > can be requested. The kernel grinds to a panic() pretty quickly.
> > > 
> > > Without the check, MADT tables that exhibit this problem are still
> > > caught by gic_populate_rdist(), which helpfully also prints what
> > > went wrong:
> > > | CPU4: mpidr 100 has no re-distributor!
> > > 
> > > Signed-off-by: James Morse <james.morse at arm.com>
> > > Reviewed-by: Gavin Shan <gshan at redhat.com>
> > > Tested-by: Miguel Luis <miguel.luis at oracle.com>
> > > Tested-by: Vishnu Pajjuri <vishnu at os.amperecomputing.com>
> > > Tested-by: Jianyong Wu <jianyong.wu at arm.com>
> > > Signed-off-by: Russell King (Oracle) <rmk+kernel at armlinux.org.uk>
> > > ---
> > >  drivers/irqchip/irq-gic-v3.c | 18 ++++++------------
> > >  1 file changed, 6 insertions(+), 12 deletions(-)
> > > 
> > > diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
> > > index 98b0329b7154..ebecd4546830 100644
> > > --- a/drivers/irqchip/irq-gic-v3.c
> > > +++ b/drivers/irqchip/irq-gic-v3.c
> > > @@ -2420,21 +2420,15 @@ static int __init gic_acpi_match_gicc(union acpi_subtable_headers *header,
> > >  
> > >  	/*
> > >  	 * If GICC is enabled and has valid gicr base address, then it means
> > > -	 * GICR base is presented via GICC
> > > +	 * GICR base is presented via GICC. The redistributor is only known to
> > > +	 * be accessible if the GICC is marked as enabled. If this bit is not
> > > +	 * set, we'd need to add the redistributor at runtime, which isn't
> > > +	 * supported.
> > >  	 */
> > > -	if (acpi_gicc_is_usable(gicc) && gicc->gicr_base_address) {
> > > +	if (gicc->flags & ACPI_MADT_ENABLED && gicc->gicr_base_address)  
> > 
> > I was very vague in the previous review.  I think the reason you are switching
> > from acpi_gicc_is_usable(gicc) to gicc->flags & ACPI_MADT_ENABLED
> > needs calling out, as I'm fairly sure that at this point in the series at least
> > acpi_gicc_is_usable() is the same as current upstream:
> > 
> > static inline bool acpi_gicc_is_usable(struct acpi_madt_generic_interrupt *gicc)
> > {
> > 	return gicc->flags & ACPI_MADT_ENABLED;
> > }  
> 
> In a previous patch adding acpi_gicc_is_usable() c54e52f84d7a ("arm64,
> irqchip/gic-v3, ACPI: Move MADT GICC enabled check into a helper") this
> was:
> 
> -       if ((gicc->flags & ACPI_MADT_ENABLED) && gicc->gicr_base_address) {
> +       if (acpi_gicc_is_usable(gicc) && gicc->gicr_base_address) {
> 
> so effectively this is undoing that particular change, which raises in
> my mind why the change was made in the first place if it's just going
> to be reverted in a later patch (because in a following patch,
> acpi_gicc_is_usable() has an additional condition added to it that
> isn't applicable here.) which effectively makes acpi_gicc_is_usable()
> return true if either ACPI_MADT_ENABLED _or_
> ACPI_MADT_GICC_ONLINE_CAPABLE (as it is now known) are set.

Ok. So maybe just call out that we are about to change the meaning
of acpi_gicc_is_usable(), so we need to partly revert the earlier patch
that made use of it everywhere.

Or perhaps introduce acpi_gicc_is_enabled(), which is called by
acpi_gicc_is_usable() along with the new conditions when they are
added. Though, as you say later, what does usable mean?
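
As a rough illustration of that split, something like the sketch below.
It is a standalone userspace mock, not kernel code: struct mock_gicc
stands in for struct acpi_madt_generic_interrupt, the MOCK_ flag values
are assumed to mirror bits 0 and 3 of the GICC flags in ACPI v6.5, and
acpi_gicc_is_enabled() is a hypothetical name that is not in the tree.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mock flag values; assumed to mirror GICC flag bits 0 and 3. */
#define MOCK_MADT_ENABLED             (1u << 0)
#define MOCK_MADT_GICC_ONLINE_CAPABLE (1u << 3)

/* Stand-in for struct acpi_madt_generic_interrupt (only the fields used). */
struct mock_gicc {
	uint32_t flags;
	uint64_t gicr_base_address;
};

/* Hypothetical helper: the CPU interface is enabled right now. */
static inline bool acpi_gicc_is_enabled(const struct mock_gicc *gicc)
{
	return gicc->flags & MOCK_MADT_ENABLED;
}

/* Enabled now, or marked as something that may be enabled later. */
static inline bool acpi_gicc_is_usable(const struct mock_gicc *gicc)
{
	return acpi_gicc_is_enabled(gicc) ||
	       (gicc->flags & MOCK_MADT_GICC_ONLINE_CAPABLE);
}
```

With that shape, the redistributor counting could keep calling the
narrower helper while everything else uses the wider one.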

> 
> However, if ACPI_MADT_GICC_ONLINE_CAPABLE is set, does that actually
> mean that the GICC is usable? I'm not sure it does. ACPI v6.5 says that
> this bit indicates that the system supports enabling this processor
> later. Is the GICC of a currently disabled processor "usable"...

I agree, this is confusing.

acpi_gicc_may_be_usable()?

Or invert it in all places to give a cleaner meaning
!acpi_gicc_never_usable()

Bit of a pain to change this throughout again, but maybe necessary
to avoid confusion in future.
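
To show how the inverted form might read at a call site, here is a
minimal userspace sketch. All of it is assumption: struct mock_gicc and
the MOCK_ flag values are stand-ins (bits 0 and 3 of the GICC flags),
and count_possible_cpus() is a made-up caller that exists only for
illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Mock flag values; assumed to mirror GICC flag bits 0 and 3. */
#define MOCK_MADT_ENABLED             (1u << 0)
#define MOCK_MADT_GICC_ONLINE_CAPABLE (1u << 3)

/* Stand-in for struct acpi_madt_generic_interrupt (flags only). */
struct mock_gicc {
	uint32_t flags;
};

/* True only if the entry is neither enabled nor online capable. */
static inline bool acpi_gicc_never_usable(const struct mock_gicc *gicc)
{
	return !(gicc->flags &
		 (MOCK_MADT_ENABLED | MOCK_MADT_GICC_ONLINE_CAPABLE));
}

/* Hypothetical caller: count entries that could ever become a CPU. */
static inline size_t count_possible_cpus(const struct mock_gicc *entries,
					 size_t n)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++) {
		if (acpi_gicc_never_usable(&entries[i]))
			continue;	/* can never come online, skip it */
		count++;
	}
	return count;
}
```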

> 
> Clearly, the intention of this change is not to count this GICC entry
> if it is marked ACPI_MADT_GICC_ONLINE_CAPABLE, but I feel that isn't
> described in the commit message.

Agreed, though that only happens in the next patch, so it is easier to
describe there. Or do it via a patch adding multiple, initially identical
helper functions that then diverge in the following patch?

Whilst a helper for this one location seems silly, it would let us put
the two helpers next to each other where the distinction is obvious.

> 
> Moreover, I am getting the feeling that there are _two_ changes going
> on here - there's the change that's talked about in the commit message
> (the complex validation that seems unnecessary) and then there's the
> preparation for the change to acpi_gicc_is_usable() - which maybe
> should be in the following patch where it would be less confusing.

Agreed.

> 
> Would you agree?
> 
Yes, the move would help as then it's obvious why this needs to change
and that is separate from the naming question.

So in conclusion, I agree with everything you've called out on this one;
it's up to you to pick which solution cleans this up. I think the options are:
1) Just move the change to the next patch where it's easier to describe.
   Leaves the odd 'usable' behind.
2) Rename acpi_gicc_is_usable() to something else, maybe inverting the
   logic as !never is easier than now_or_maybe_later.
3) Possibly add another helper for this new case which starts out matching
   the existing one, but diverges in a later patch. (It should still not be
   in this patch which, as you observe, is doing something else and I think
   is actually a bug fix anyway, albeit one that has never mattered for
   any shipping firmware.)
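
Whichever option is picked, the behaviour the commit message describes
(count only, never return an error) can be sketched in userspace as
below. This is a mock under stated assumptions, not the driver code:
struct mock_gicc, struct mock_count and mock_match_gicc() are invented
for the example, and the flag values are assumed to mirror bits 0 and 3
of the GICC flags.

```c
#include <stdint.h>

/* Mock flag values; assumed to mirror GICC flag bits 0 and 3. */
#define MOCK_MADT_ENABLED             (1u << 0)
#define MOCK_MADT_GICC_ONLINE_CAPABLE (1u << 3)

/* Stand-in for struct acpi_madt_generic_interrupt (only the fields used). */
struct mock_gicc {
	uint32_t flags;
	uint64_t gicr_base_address;
};

/* Stand-in for wherever the driver keeps its redistributor count. */
struct mock_count {
	int enabled_rdists;
};

/*
 * Count-only matcher: an entry is counted when it is enabled and carries
 * a redistributor base address. Anything else is skipped silently, and
 * the function never reports an error.
 */
static inline int mock_match_gicc(const struct mock_gicc *gicc,
				  struct mock_count *data)
{
	if ((gicc->flags & MOCK_MADT_ENABLED) && gicc->gicr_base_address)
		data->enabled_rdists++;

	return 0;
}
```

An enabled entry with a zero gicr_base_address is simply not counted
here; per the commit message, a MADT that is really broken in that way
is still caught later by gic_populate_rdist().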

Jonathan




