GICv3: RWP timeout, gone fishing
Joakim Tjernlund
Joakim.Tjernlund at infinera.com
Wed May 4 15:27:35 PDT 2022
On Fri, 2022-04-22 at 12:19 +0200, Joakim Tjernlund wrote:
> After resetting one cluster (I have 2 clusters with 2 A53 cores in each; only one core runs Linux) I see a bunch of
> GICv3: RWP timeout, gone fishing
> messages while Linux is booting up, and IRQs aren't working; the console hangs when starting user space.
>
> The cluster reset is implemented in the Linux reboot code, so everything should shut down properly
> before the cluster reset.
>
> kernel 5.15.26
>
> What could be the cause? Where do I even begin to look?
>
> Jocke
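For readers unfamiliar with the message: the GICv3 driver prints it when the RWP (Register Write Pending) bit in the control register is still set after its polling loop gives up. The following is a user-space sketch of that polling pattern, not the driver code itself. The struct fake_gicd, fake_read_ctlr() and wait_for_rwp() names are hypothetical stand-ins for the memory-mapped register and the driver's wait helper; only the GICD_CTLR_RWP bit position is taken from arm-gic-v3.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define GICD_CTLR_RWP (1U << 31) /* Register Write Pending bit in GICD_CTLR */

/* Hypothetical stand-in for the memory-mapped distributor register. */
struct fake_gicd {
	uint32_t ctlr;
	uint32_t polls_until_clear; /* 0 means RWP never clears (stuck hardware) */
};

/* Simulated register read: RWP drops after the configured number of reads. */
static uint32_t fake_read_ctlr(struct fake_gicd *d)
{
	if (d->polls_until_clear && --d->polls_until_clear == 0)
		d->ctlr &= ~GICD_CTLR_RWP;
	return d->ctlr;
}

/* Poll until RWP clears; returns false and logs if the poll budget runs out. */
static bool wait_for_rwp(struct fake_gicd *d, uint32_t max_polls)
{
	uint32_t count = max_polls;

	assert(max_polls > 0);
	while (fake_read_ctlr(d) & GICD_CTLR_RWP) {
		if (--count == 0) {
			fprintf(stderr, "RWP timeout, gone fishing\n");
			return false;
		}
	}
	return true;
}
```

With a fake register that clears RWP on the third read, wait_for_rwp() returns true; with one that never clears it, the function logs the message and returns false. In the reported case the bit apparently never clears, which would point at the distributor or redistributor being left in a state where earlier writes cannot complete.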
I tried to fix these RWP timeouts by implementing a restart handler that mimics what the CPU PM notifier does:
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -23,6 +23,7 @@
#include <linux/irqchip/arm-gic-common.h>
#include <linux/irqchip/arm-gic-v3.h>
#include <linux/irqchip/irq-partition-percpu.h>
+#include <linux/reboot.h>
#include <asm/cputype.h>
#include <asm/exception.h>
@@ -811,6 +812,7 @@ static void __init gic_dist_init(void)
gic_dist_config(base, GIC_LINE_NR, gic_dist_wait_for_rwp);
val = GICD_CTLR_ARE_NS | GICD_CTLR_ENABLE_G1A | GICD_CTLR_ENABLE_G1;
if (gic_data.rdists.gicd_typer2 & GICD_TYPER2_nASSGIcap) {
pr_info("Enabling SGIs without active state\n");
val |= GICD_CTLR_nASSGIreq;
@@ -1359,6 +1361,22 @@ static void gic_cpu_pm_init(void)
static inline void gic_cpu_pm_init(void) { }
#endif /* CONFIG_CPU_PM */
+static int gicv3_restart_notify(struct notifier_block *nb,
+				unsigned long mode, void *cmd)
+{
+	if (1 || gic_dist_security_disabled()) {
+		gic_write_grpen1(0);
+		gic_enable_redist(false);
+	}
+
+	return NOTIFY_DONE;
+}
+
+static struct notifier_block gicv3_restart_nb = {
+	.notifier_call = gicv3_restart_notify,
+	.priority = 0, /* Call late/last */
+};
+
+
static struct irq_chip gic_chip = {
.name = "GICv3",
.irq_mask = gic_mask_irq,
@@ -1843,6 +1861,8 @@ static int __init gic_init_bases(void __iomem *dist_base,
gic_cpu_init();
gic_smp_init();
gic_cpu_pm_init();
+ //register_reboot_notifier(&gicv3_restart_nb);
+ register_restart_handler(&gicv3_restart_nb);
However this does not work until I move local_irq_disable() in machine_restart
so that the restart handler runs before IRQs are turned off:
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -125,8 +125,10 @@ void machine_power_off(void)
*/
void machine_restart(char *cmd)
{
+
+
/* Disable interrupts first */
- local_irq_disable();
+ //local_irq_disable();
smp_send_stop();
/*
@@ -138,6 +140,7 @@ void machine_restart(char *cmd)
/* Now call the architecture specific reboot code. */
do_kernel_restart(cmd);
+ local_irq_disable();
/*
* Whoops - the architecture was unable to reboot.
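To make the ordering issue concrete, here is a small user-space model of the restart path. It is only a sketch under these assumptions: reboot notifiers are called from kernel_restart_prepare() before machine_restart(), restart handlers are called from do_kernel_restart() after local_irq_disable(), and each chain runs highest priority first. The struct notifier, run_chain() and simulated_restart() names are hypothetical and do not exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_LOG 8

static const char *call_log[MAX_LOG]; /* names in the order they were called */
static size_t call_count;
static bool irqs_enabled = true;      /* models the local IRQ state */

/* Hypothetical, simplified notifier: just a name and a priority. */
struct notifier {
	const char *name;
	int priority;
	bool irqs_were_enabled; /* filled in when the notifier is "called" */
};

/* Call every entry once, highest priority first (first entry wins ties). */
static void run_chain(struct notifier *chain, size_t n)
{
	bool done[MAX_LOG] = { false };

	assert(n <= MAX_LOG);
	for (size_t k = 0; k < n; k++) {
		size_t best = n;

		for (size_t i = 0; i < n; i++)
			if (!done[i] && (best == n ||
					 chain[i].priority > chain[best].priority))
				best = i;
		done[best] = true;
		chain[best].irqs_were_enabled = irqs_enabled;
		assert(call_count < MAX_LOG);
		call_log[call_count++] = chain[best].name;
	}
}

/* Models the unpatched ordering of the restart path. */
static void simulated_restart(struct notifier *reboot_nbs, size_t n_reboot,
			      struct notifier *restart_nbs, size_t n_restart)
{
	irqs_enabled = true;
	call_count = 0;
	run_chain(reboot_nbs, n_reboot);   /* kernel_restart_prepare() */
	irqs_enabled = false;              /* machine_restart(): local_irq_disable() */
	run_chain(restart_nbs, n_restart); /* do_kernel_restart() */
}
```

In this model a reboot notifier sees IRQs enabled, while every restart handler sees them disabled, and a priority 0 restart handler is called after all higher-priority ones. If a higher-priority handler actually resets the machine, the priority 0 handler never gets to run, which is worth keeping in mind for a handler marked "Call late/last".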
I am probably barking up the wrong tree here too, any clues?
Another related question: gic_dist_security_disabled() returns false on my system, and I am not
sure whether that is the best or most common setup. I can change U-Boot to disable security here, since we control all
SW running in this system.
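For context, gic_dist_security_disabled() boils down to testing the DS (Disable Security) bit in GICD_CTLR. A minimal user-space sketch of that test follows; dist_security_disabled() is a hypothetical helper that takes an already-read register value, and only the bit position comes from arm-gic-v3.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define GICD_CTLR_DS (1U << 6) /* Disable Security bit in GICD_CTLR */

static_assert(GICD_CTLR_DS == 0x40, "DS is bit 6 of GICD_CTLR");

/* Hypothetical helper: true when the given GICD_CTLR value has DS set. */
static bool dist_security_disabled(uint32_t gicd_ctlr)
{
	return (gicd_ctlr & GICD_CTLR_DS) != 0;
}
```

Dumping GICD_CTLR early in boot and checking this bit is one way to confirm what the firmware left behind before deciding whether to change it.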
Jocke
PS.
Using a reboot notifier instead works even worse, but perhaps this is the way forward anyway?