[PATCH] [WATCHDOG] Fix kdump when using hpwdt

Bernhard Walle bwalle at suse.de
Sun Oct 26 10:59:37 EDT 2008

When the "hpwdt" module is loaded (even if the /dev/watchdog device is not
opened), kdump does not work. The panic kernel either does not start at
all or crashes in various places.

The problem is that hpwdt_pretimeout is registered with register_die_notifier()
with the highest possible priority. Because it returns NOTIFY_STOP, the
crash_nmi_callback which is also registered with register_die_notifier()
is never executed. This causes the shutdown of other CPUs to fail.

Reversing the order is not an option: the crash_nmi_callback executes HLT
and so never returns normally. Because of that, it must run as the last
notifier, which is already the case.

So this patch returns NOTIFY_OK if allow_kdump is set as a module parameter
of the hpwdt module. It also changes the default of allow_kdump to 1. Kdump is
quite common and should work by default.

Signed-off-by: Bernhard Walle <bwalle at suse.de>
Cc: Wim Van Sebroeck <wim at iguana.be>
Cc: Thomas Mingarelli <thomas.mingarelli at hp.com>
Cc: Vivek Goyal <vgoyal at redhat.com>
---
 drivers/watchdog/hpwdt.c |   10 +++++++---
 1 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/watchdog/hpwdt.c b/drivers/watchdog/hpwdt.c
index a3765e0..65e7102 100644
--- a/drivers/watchdog/hpwdt.c
+++ b/drivers/watchdog/hpwdt.c
@@ -116,7 +116,7 @@ static unsigned int reload;			/* the computed soft_margin */
 static int nowayout = WATCHDOG_NOWAYOUT;
 static char expect_release;
 static unsigned long hpwdt_is_open;
-static unsigned int allow_kdump;
+static unsigned int allow_kdump = 1;
 static void __iomem *pci_mem_addr;		/* the PCI-memory address */
 static unsigned long __iomem *hpwdt_timer_reg;
@@ -482,7 +482,11 @@ static int hpwdt_pretimeout(struct notifier_block *nb, unsigned long ulReason,
 			"Management Log for details.\n");
-	return NOTIFY_STOP;
+	/*
+	 * for kdump, we must return NOTIFY_OK here to execute the
+	 * crash_nmi_callback afterwards, see arch/x86/kernel/crash.c
+	 */
+	return allow_kdump ? NOTIFY_OK : NOTIFY_STOP;
 module_param(soft_margin, int, 0);
 MODULE_PARM_DESC(soft_margin, "Watchdog timeout in seconds");
-module_param(allow_kdump, int, 0);
+module_param(allow_kdump, int, 1);
 MODULE_PARM_DESC(allow_kdump, "Start a kernel dump after NMI occurs");
 module_param(nowayout, int, 0);
