[PATCH] Fix kexec abort due to IPI from panic().

Seiji Aguchi seiji.aguchi at hds.com
Thu Sep 16 16:16:14 EDT 2010


Hi,

I'm Seiji Aguchi.
I work for Hitachi Data Systems.
This is my first time sending a patch to LKML.
Nice to meet you.
 
I found an issue in kexec.
Please give me your comments and suggestions.
 
Kexec aborts when two CPUs panic at the same time.
An example scenario:
1. Two CPUs panic at the same time.
2. One CPU, cpu0, gets kexec_mutex in crash_kexec().
3. The other CPU, cpu1, can't get kexec_mutex and returns from crash_kexec().
4. Cpu0 runs kmsg_dump(KMSG_DUMP_KEXEC).
5. Cpu1 can't get dump_list_lock and returns from kmsg_dump(KMSG_DUMP_PANIC).
6. Cpu1 runs smp_send_stop() in panic() and sends an IPI to the other CPUs.
7. Cpu0 may receive the IPI from cpu1 while running kmsg_dump(KMSG_DUMP_KEXEC),
   crash_setup_regs(), or crash_save_vmcore(). In that case cpu0 is stopped
   before it reaches machine_kexec(), so the kexec is aborted.
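
To make the ordering easier to follow, here is a rough user-space model of
steps 2, 3, 6 and 7. It is only a sketch, not kernel code: struct model_cpu,
model_trylock(), model_send_stop() and model_cpu0_gets_stopped() are
hypothetical names made up for this illustration, and the whole thing runs
sequentially in one thread instead of on two real CPUs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model of one CPU; not a kernel structure. */
struct model_cpu {
	bool irqs_enabled;	/* would a stop IPI be handled right now? */
	bool stopped;		/* set once the stop IPI has been handled */
};

/* Stand-in for kexec_mutex: a plain flag is enough for a sequential model. */
static bool model_mutex_held;

/* Like mutex_trylock(): returns true on success and never blocks. */
static bool model_trylock(void)
{
	if (model_mutex_held)
		return false;
	model_mutex_held = true;
	return true;
}

/* Stand-in for smp_send_stop(): only a CPU with interrupts enabled stops. */
static void model_send_stop(struct model_cpu *target)
{
	if (target->irqs_enabled)
		target->stopped = true;
}

/*
 * Replays steps 2, 3, 6 and 7 in order.
 * Returns true if cpu0 is stopped while it still holds the mutex.
 */
static bool model_cpu0_gets_stopped(bool cpu0_irqs_enabled)
{
	struct model_cpu cpu0 = {
		.irqs_enabled = cpu0_irqs_enabled,
		.stopped = false,
	};
	bool cpu0_owns, cpu1_owns;

	model_mutex_held = false;
	cpu0_owns = model_trylock();	/* step 2: cpu0 gets the mutex */
	assert(cpu0_owns);
	cpu1_owns = model_trylock();	/* step 3: cpu1 fails and returns */
	if (!cpu1_owns)
		model_send_stop(&cpu0);	/* step 6: cpu1 continues in panic() */

	return cpu0.stopped;		/* step 7 */
}
```

Calling model_cpu0_gets_stopped(true) corresponds to the scenario above,
where cpu0 still accepts interrupts while it owns the mutex. Passing false
models a cpu0 that cannot be interrupted at that point.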
 
We can solve this issue by disabling external interrupts on the local CPU
while kexec_mutex is held in crash_kexec().
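
The save/restore idea can be tried out in user space with POSIX signals.
This is only an analogy under my own assumptions, not how the kernel works:
SIGUSR1 plays the role of the stop IPI, sigprocmask() plays the role of
local_irq_save()/local_irq_restore(), and demo_masked_section() and on_stop()
are hypothetical names used just for this sketch.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>

/* Set by the handler; stands in for "this CPU handled the stop request". */
static volatile sig_atomic_t stop_seen;

static void on_stop(int signo)
{
	(void)signo;
	stop_seen = 1;
}

/*
 * Blocks SIGUSR1 around a critical section, raises it inside the section,
 * and returns true if the handler did not run until the mask was restored.
 */
static bool demo_masked_section(void)
{
	struct sigaction sa;
	sigset_t block, saved;
	bool undisturbed;
	int rc;

	sa.sa_handler = on_stop;
	sa.sa_flags = 0;
	sigemptyset(&sa.sa_mask);
	rc = sigaction(SIGUSR1, &sa, NULL);
	assert(rc == 0);

	sigemptyset(&block);
	sigaddset(&block, SIGUSR1);

	/* Start from a known state: SIGUSR1 unblocked, nothing seen yet. */
	rc = sigprocmask(SIG_UNBLOCK, &block, NULL);
	assert(rc == 0);
	stop_seen = 0;

	/* Analogue of local_irq_save(flags). */
	rc = sigprocmask(SIG_BLOCK, &block, &saved);
	assert(rc == 0);

	raise(SIGUSR1);			/* the "other CPU" sends its stop */
	undisturbed = (stop_seen == 0);	/* critical section was not interrupted */

	/* Analogue of local_irq_restore(flags): the pending signal runs now. */
	rc = sigprocmask(SIG_SETMASK, &saved, NULL);
	assert(rc == 0);
	(void)rc;

	return undisturbed && stop_seen == 1;
}
```

In a single-threaded program, demo_masked_section() should return true: the
signal stays pending while it is blocked and is delivered only when the old
mask is restored. In crash_kexec() the restore is normally never reached,
because machine_kexec() does not return when a crash image is loaded.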


Signed-off-by: Seiji Aguchi <seiji.aguchi at hds.com>

---
 kernel/kexec.c |    7 +++++++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/kernel/kexec.c b/kernel/kexec.c
index c0613f7..9e9f159 100644
--- a/kernel/kexec.c
+++ b/kernel/kexec.c
@@ -1075,6 +1075,10 @@ void crash_kexec(struct pt_regs *regs)
 	 * sufficient.  But since I reuse the memory...
 	 */
 	if (mutex_trylock(&kexec_mutex)) {
+		unsigned long flags;
+
+		local_irq_save(flags);
+
 		if (kexec_crash_image) {
 			struct pt_regs fixed_regs;
 
@@ -1085,6 +1089,9 @@ void crash_kexec(struct pt_regs *regs)
 			machine_crash_shutdown(&fixed_regs);
 			machine_kexec(kexec_crash_image);
 		}
+
+		local_irq_restore(flags);
+
 		mutex_unlock(&kexec_mutex);
 	}
 }
-- 
1.7.2.2


Regards,

Seiji


