[PATCH 11/15] arm64: kdump: implement machine_crash_shutdown()
Will Deacon
will.deacon at arm.com
Tue Nov 10 01:54:11 PST 2015
On Tue, Nov 10, 2015 at 10:23:56AM +0900, AKASHI Takahiro wrote:
> On 11/07/2015 04:14 AM, Geoff Levand wrote:
> >From: AKASHI Takahiro <takahiro.akashi at linaro.org>
> >
> >kdump calls machine_crash_shutdown() to shut down non-boot cpus and
> >save registers' status in per-cpu ELF notes before starting the crash
> >dump kernel. See kernel_kexec().
> >
> >ipi_cpu_stop() is a bit modified and used to support this behavior.
>
> I have some concerns about using ipi_cpu_stop().
>
> >Signed-off-by: AKASHI Takahiro <takahiro.akashi at linaro.org>
> >---
> > arch/arm64/include/asm/kexec.h | 34 +++++++++++++++++++++++++++++++++-
> > arch/arm64/kernel/machine_kexec.c | 31 +++++++++++++++++++++++++++++--
> > arch/arm64/kernel/smp.c | 16 ++++++++++++++--
> > 3 files changed, 76 insertions(+), 5 deletions(-)
[...]
> >diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> >index dbdaacd..88aec66 100644
> >--- a/arch/arm64/kernel/smp.c
> >+++ b/arch/arm64/kernel/smp.c
> >@@ -37,6 +37,7 @@
> > #include <linux/completion.h>
> > #include <linux/of.h>
> > #include <linux/irq_work.h>
> >+#include <linux/kexec.h>
> >
> > #include <asm/alternative.h>
> > #include <asm/atomic.h>
> >@@ -54,6 +55,8 @@
> > #include <asm/ptrace.h>
> > #include <asm/virt.h>
> >
> >+#include "cpu-reset.h"
> >+
> > #define CREATE_TRACE_POINTS
> > #include <trace/events/ipi.h>
> >
> >@@ -679,8 +682,12 @@ static DEFINE_RAW_SPINLOCK(stop_lock);
> > /*
> > * ipi_cpu_stop - handle IPI from smp_send_stop()
> > */
> >-static void ipi_cpu_stop(unsigned int cpu)
> >+static void ipi_cpu_stop(unsigned int cpu, struct pt_regs *regs)
> > {
> >+#ifdef CONFIG_KEXEC
> >+ /* printing messages may slow down the shutdown. */
> >+ if (!in_crash_kexec)
> >+#endif
> > if (system_state == SYSTEM_BOOTING ||
> > system_state == SYSTEM_RUNNING) {
> > raw_spin_lock(&stop_lock);
> >@@ -693,6 +700,11 @@ static void ipi_cpu_stop(unsigned int cpu)
> >
> > local_irq_disable();
> >
> >+#ifdef CONFIG_KEXEC
> >+ if (in_crash_kexec)
> >+ crash_save_cpu(regs, cpu);
> >+#endif /* CONFIG_KEXEC */
> >+
> > while (1)
> > cpu_relax();
> > }
>
> cpu_relax() is defined as asm("yield"), which puts every cpu except the boot cpu into
> an infinite loop of nops (whether yield acts as a nop or not depends on the hw implementation).
> As a result, all the secondary cpus keep running a busy loop even after the crash dump
> kernel has started up, and the chip can overheat.
> I ran into this situation when I tested the code on Hikey: the thermal driver
> forced the system to shut down.
>
> So I'd like to modify the code a bit, like this:
>     if (in_crash_kexec) {
>         crash_save_cpu(regs, cpu);
>         while (1)
>             asm("wfi"); /* irqs are disabled here. */
>     }
>
> Does this make sense?

It would be even better if we could hotplug them off.

Will
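
For readers following the thread, here is a minimal userspace sketch of the
behavior proposed above: save the CPU's registers once, then park the CPU in a
low-power wait instead of spinning on cpu_relax(). It is not the kernel code.
The names in_crash_kexec and crash_save_cpu come from the quoted patch, but
their bodies here are stand-ins, and NR_CPUS, the reduced struct pt_regs,
wait_for_interrupt(), park_cpu() and max_wakeups are all hypothetical. The
wait stays inside a loop because a wfi can return when an interrupt becomes
pending even while interrupts are masked.

```c
#include <stdbool.h>

/* Hypothetical stand-ins so the sketch builds outside the kernel. */
#define NR_CPUS 8

struct pt_regs {
	unsigned long pc;	/* reduced to one field for illustration */
};

static bool in_crash_kexec;	/* flag name taken from the quoted patch */
static struct pt_regs saved_regs[NR_CPUS];
static bool regs_saved[NR_CPUS];
static unsigned long wfi_count;

/* Stand-in for crash_save_cpu(): record this CPU's registers for the dump. */
static void crash_save_cpu(const struct pt_regs *regs, unsigned int cpu)
{
	saved_regs[cpu] = *regs;
	regs_saved[cpu] = true;
}

/* Stand-in for one low-power wait; on arm64 this would be asm("wfi"). */
static void wait_for_interrupt(void)
{
	wfi_count++;
}

/*
 * Park a CPU after a stop request. The real loop would be while (1);
 * max_wakeups bounds it here only so the sketch can be exercised.
 */
static void park_cpu(unsigned int cpu, const struct pt_regs *regs,
		     unsigned long max_wakeups)
{
	if (in_crash_kexec)
		crash_save_cpu(regs, cpu);

	/* Re-enter the wait after every wakeup rather than busy-spinning. */
	for (unsigned long i = 0; i < max_wakeups; i++)
		wait_for_interrupt();
}
```

Calling park_cpu(2, &regs, 3) with in_crash_kexec set records the registers
for CPU 2 and performs three waits. Hotplugging the CPUs off, as suggested
above, would remove the need for the loop altogether, but that path is not
modeled here.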