[PATCH] GICv3: Add restart handler to detach CPU from GICv3
Joakim Tjernlund
Joakim.Tjernlund at infinera.com
Wed Jan 4 11:52:33 PST 2023
On Wed, 2023-01-04 at 18:48 +0000, Marc Zyngier wrote:
> On Wed, 04 Jan 2023 17:23:42 +0000,
> Joakim Tjernlund <Joakim.Tjernlund at infinera.com> wrote:
> >
> > On Wed, 2023-01-04 at 16:50 +0000, Marc Zyngier wrote:
> > > On Wed, 04 Jan 2023 16:04:14 +0000,
> > > Mark Rutland <mark.rutland at arm.com> wrote:
> > > >
> > > > On Tue, Jan 03, 2023 at 04:27:07PM +0000, Joakim Tjernlund wrote:
> > > > > On Fri, 2022-12-16 at 17:21 +0100, Joakim Tjernlund wrote:
> > > > >
> > > > > Ping?
> > > >
> > > > To whom?
> > > >
> > > > You don't appeared to have Cc'd any relevant maintainer, and people are still
> > > > on holiday, so it's extremely likely this will be missed.
> > >
> > > That, plus nobody reads the list looking for this sort of things.
> > >
> > > >
> > > > For the maintainer, please use scripts/get_maintainer.pl, e.g.
> > > >
> > > > > [mark at lakrids:~/src/linux]% ./scripts/get_maintainer.pl -f drivers/irqchip/irq-gic-v3.c
> > > > > Thomas Gleixner <tglx at linutronix.de> (maintainer:IRQCHIP DRIVERS)
> > > > > Marc Zyngier <maz at kernel.org> (maintainer:IRQCHIP DRIVERS)
> > > > > linux-kernel at vger.kernel.org (open list:IRQCHIP DRIVERS)
> > > >
> > > > Note: I've Cc'd Marc, who wrote the GICv3 driver.
> > >
> > > Cheers Mark, much appreciated.
> >
> > Sorry for missing that extra maintainer CC:
> >
> > >
> > > >
> > > > > > Needed for reboot without resetting the whole GIC
> > > >
> > > > This doesn't really explain what you're trying to do nor why.
> > > >
> > > > Why do you need to "reboot without resetting the whole GIC" ?
> > > >
> > > > Do you encounter a problem if we try to reset the whole GIC?
> > > >
> > > > Is this for kexec?
> > > >
> > > > Is this for some use-case enabled by out-of-tree code?
> > >
> > > All valid questions. This smells of a terrible hack...
> >
> > Yes, all god Q's.
>
> And no answer?
No kexec, out of tree kernel with minor mods
>
> >
> > >
> > > The interesting aspect is that this is only done when DS=1, probably
> > > meaning that they are doing this in a VM. it also rely on some
> >
> > Nope, on our custom target.
>
> And you run with DS=1? On bare metal? Humpf...
Yes, we control all SW running on this target.
>
> >
> > > (unbounded) UNPRED behaviour as ProcessorSleep is entered without
> > > any consideration for Group0... Good luck with that.
No Group0 IRQ used(or so I think but I could be mistaken)
> >
> > hmm, I am doing the same as PM does which also needs DS=1 so I figured
> > this was uncontroversial.
>
> I seriously doubt anyone is actually using that code in anger. I
> always found it dodgy, and I'm pretty sure it is totally broken.
I see
>
> >
> > >
> > > Anyway, I don't think we want any of this stuff. Certainly not without
> > > a cast-iron justification.
> >
> > We use several cores but only one run Linux so a Linux reboot will
> > only reset its own core.
>
> And what happens with the distributor? One of the key assumption of
> the GIC architecture is that there is only one operating system in
> control of it, the whole of it. The only way to share it is by
> virtualising it. Shades of Jailhouse...
Distributor is not touched until Linux is going up again
>
> >
> > Without this patch I either need to reset the GIC as part of the
> > reboot or I get RWP timeout when linux starts again. Resetting the
> > GIC kills IRQ on the other cores for a long time which is unwanted.
>
> But that's what happens anyway when Linux boots (we reinitialise the
> whole distributor). So what is this about?
At that point IRQ will only be lost shortly until the other cores are reloaded by Linux user space
>
> >
> > The RWP timeout comes from lost HW handshake between core and GIC.
> > Is there another way to regain the HW handshake ?
>
> That's the firmware's job. But overall, getting these timeouts means
> your GIC has locked up. It would be more interesting to understand
FW here is u-boot starting from EL3
> *why* you get in that situation, which I presume is due to the way the
> driver initialises itself.
>
> Assuming you use kexec to reboot your Linux instance, does the
> following help (totally untested)?
no kexec and no it did not help.
I get the first RWP here:
static void __init gic_dist_init(void)
{
unsigned int i;
u64 affinity;
void __iomem *base = gic_data.dist_base;
u32 val;
/* Disable the distributor */
writel_relaxed(0, base + GICD_CTLR);
gic_dist_wait_for_rwp(); <------ RWP timeout
I guess it is impossible to recover GIC HW handshake once it is lost ?
>
> M.
>
> diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
> index 997104d4338e..0db35a07ffdb 100644
> --- a/drivers/irqchip/irq-gic-v3.c
> +++ b/drivers/irqchip/irq-gic-v3.c
> @@ -1186,12 +1186,25 @@ static int gic_dist_supports_lpis(void)
> static void gic_cpu_init(void)
> {
> void __iomem *rbase;
> + bool g0;
> int i;
>
> /* Register ourselves with the rest of the world */
> if (gic_populate_rdist())
> return;
>
> + g0 = gic_has_group0();
> +
> + if (read_gicreg(ICC_IGRPEN1_EL1) ||
> + (g0 && read_gicreg(ICC_IGRPEN0_EL1))) {
> + if (g0)
> + write_gicreg(0, ICC_IGRPEN0_EL1);
> +
> + write_gicreg(0, ICC_IGRPEN1_EL1);
> + isb();
> + gic_enable_redist(false);
> + }
> +
> gic_enable_redist(true);
>
> WARN((gic_data.ppi_nr > 16 || GIC_ESPI_NR != 0) &&
>
More information about the linux-arm-kernel
mailing list