[PATCH v3] watchdog: Add hook for kicking in kdump path
linux at roeck-us.net
Thu Apr 18 09:49:04 EDT 2013
On Thu, Apr 18, 2013 at 09:00:09AM -0400, Don Zickus wrote:
> On Wed, Apr 17, 2013 at 02:49:59PM -0700, Eric W. Biederman wrote:
> > Don Zickus <dzickus at redhat.com> writes:
> > > A common problem with kdump is that during the boot up of the
> > > second kernel, the hardware watchdog times out and reboots the
> > > machine before a vmcore can be captured.
> > >
> > > Instead of telling customers to disable their hardware watchdog
> > > timers, I hacked up a hook to put in the kdump path that provides
> > > one last kick before jumping into the second kernel.
> > >
> > > The assumption is the watchdog timeout is at least 10-30 seconds
> > > long, enough to get the second kernel to userspace to kick the watchdog
> > > again, if needed.
> > Why not double the watchdog timeout? and/or pet the watchdog a little
> > more frequently.
> I am not sure if the watchdog timeouts can be doubled. I think Guenter
> was saying some have a max of a couple seconds?? Petting a little more
> frequently might be an option. Guenter can that be done with a softdog
Most watchdog drivers permit at least a minute. Some are more limited.
Worst I have seen is the BookE watchdog timer (non-Freescale version)
which has a maximum of three seconds. But that is broken anyway.
Most hardware watchdog drivers implement a softdog on top of the hardware watchdog
if the hardware needs to be pinged faster than every 60 seconds.
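
To illustrate the layering, here is a minimal sketch in plain C, not taken
from any real driver. All names (layered_wdt, layered_wdt_timer, and so on)
are hypothetical. It assumes an internal timer feeds the short hardware
watchdog only while the longer software deadline, which userspace re-arms,
has not yet expired.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a softdog layered over a short hardware watchdog. */
struct layered_wdt {
    unsigned int hw_max_s;       /* longest timeout the hardware supports */
    unsigned int soft_timeout_s; /* timeout presented to userspace */
    unsigned int soft_left_s;    /* seconds left until the soft deadline */
};

static void layered_wdt_init(struct layered_wdt *w, unsigned int hw_max_s,
                             unsigned int soft_timeout_s)
{
    w->hw_max_s = hw_max_s;
    w->soft_timeout_s = soft_timeout_s;
    w->soft_left_s = soft_timeout_s;
}

/* Userspace ping: re-arm the software deadline. */
static void layered_wdt_ping(struct layered_wdt *w)
{
    w->soft_left_s = w->soft_timeout_s;
}

/*
 * Internal timer callback, run every elapsed_s seconds. The period must be
 * shorter than the hardware maximum, or the hardware would fire first.
 * Returns true if the hardware should be fed, false once the soft deadline
 * has passed and the hardware is left to expire and reset the machine.
 */
static bool layered_wdt_timer(struct layered_wdt *w, unsigned int elapsed_s)
{
    assert(elapsed_s < w->hw_max_s);
    if (w->soft_left_s <= elapsed_s) {
        w->soft_left_s = 0;
        return false;
    }
    w->soft_left_s -= elapsed_s;
    return true;
}
```

With a 3 second hardware limit and a 30 second soft timeout, the timer keeps
feeding the hardware once per second until userspace has been silent for
30 seconds.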
So, yes, for the most common case you should actually be able to live with a,
say, 30-60 second timeout which is pinged at least every 5-10 seconds. I thought
that somehow did not work in your case. Maybe a misunderstanding?
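
For reference, a rough userspace sketch of that setup, using the standard
watchdog character device ioctls from <linux/watchdog.h>. The helper names
(wd_open, wd_ping, wd_close) are made up for illustration, and whether a given
driver accepts the requested timeout or honors the magic close character
depends on the driver and on its nowayout setting.

```c
#include <fcntl.h>
#include <linux/watchdog.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Write the magic close character, then close. Drivers that support
 * magic close (and are not nowayout) stop the watchdog on this. */
static void wd_close(int fd)
{
    if (write(fd, "V", 1) != 1) {
        /* nothing useful to do here; the watchdog may keep running */
    }
    close(fd);
}

/* Open the device (this starts the watchdog) and request a timeout.
 * Returns the file descriptor, or -1 on failure. */
static int wd_open(const char *path, int timeout_s)
{
    int fd = open(path, O_WRONLY);

    if (fd < 0)
        return -1;
    /* The driver may adjust the value and writes back what it set. */
    if (ioctl(fd, WDIOC_SETTIMEOUT, &timeout_s) < 0) {
        wd_close(fd);
        return -1;
    }
    return fd;
}

/* Ping the watchdog; returns 0 on success, -1 on error. */
static int wd_ping(int fd)
{
    return ioctl(fd, WDIOC_KEEPALIVE, 0);
}
```

A daemon would call wd_open("/dev/watchdog", 30) once and then wd_ping()
from a loop that sleeps 5-10 seconds between pings.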
If you have a customer with a specific problem on a specific watchdog which has
a too-short maximum interval, maybe another solution would be to look into that
specific watchdog driver and see if it can be fixed.