[PATCH 1/3] panic: Disable crash_kexec_post_notifiers if kdump is not available

Vivek Goyal vgoyal at redhat.com
Tue Jul 14 09:16:12 PDT 2015


On Tue, Jul 14, 2015 at 03:48:33PM +0000, dwalker at fifo99.com wrote:
> On Tue, Jul 14, 2015 at 11:40:40AM -0400, Vivek Goyal wrote:
> > On Tue, Jul 14, 2015 at 03:34:30PM +0000, dwalker at fifo99.com wrote:
> > > On Tue, Jul 14, 2015 at 11:02:08AM -0400, Vivek Goyal wrote:
> > > > On Tue, Jul 14, 2015 at 01:59:19PM +0000, dwalker at fifo99.com wrote:
> > > > > On Mon, Jul 13, 2015 at 08:19:45PM -0500, Eric W. Biederman wrote:
> > > > > > dwalker at fifo99.com writes:
> > > > > > 
> > > > > > > On Fri, Jul 10, 2015 at 08:41:28AM -0500, Eric W. Biederman wrote:
> > > > > > >> Hidehiro Kawai <hidehiro.kawai.ez at hitachi.com> writes:
> > > > > > >> 
> > > > > > >> > You can call panic notifiers and kmsg dumpers before kdump by
> > > > > > >> > specifying "crash_kexec_post_notifiers" as a boot parameter.
> > > > > > >> > However, it doesn't make sense if kdump is not available.  In that
> > > > > > >> > case, disable "crash_kexec_post_notifiers" boot parameter so that
> > > > > > >> > you can't change the value of the parameter.
> > > > > > >> 
> > > > > > >> Nacked-by: "Eric W. Biederman" <ebiederm at xmission.com>
> > > > > > >
> > > > > > > I think it would make sense if he just replaced "kdump" with "kexec".
> > > > > > 
> > > > > > It would be less insane, however it still makes no sense as without
> > > > > > kexec on panic support crash_kexec is a noop.  So the value of the
> > > > > > seeting makes no difference.
> > > > > 
> > > > > Can you explain more, I don't really understand what you mean. Are you suggesting
> > > > > the whole "crash_kexec_post_notifiers" feature has no value ?
> > > > 
> > > > Daniel,
> > > > 
> > > > BTW, why are you using crash_kexec_post_notifiers commandline? Why not
> > > > without it?
> > > 
> > > It was explained in the prior thread but to rehash, the notifiers are used to do a switch
> > > over from the crashed machine to another redundant machine.
> > 
> > So why not detect failure using polling or issue notifications from second
> > kernel.
> > 
> > IOW, expecting that a crashed machine will be able to deliver notification
> > reliably is falwed to begin with, IMHO.
> 
> It's flawed to think you can kexec, but you still do it right ? I've not gotten into
> the deep details of this switching process, but that's how this interface is used.

Sure. But the deal here is that users of interface know that sometimes it
can be unreliable. And in the absence of more reliable mechanism, somewhat
less reliable mechanism is fine. 

>  
> > If a machine is failing, there are high chance it can't deliver you the
> > notification. Detecting that failure suing some kind of polling mechanism
> > might be more reliable. And it will make even kdump mechanism more
> > reliable so that it does not have to run panic notifiers after the crash.
> 
> I think what your suggesting is that my company should change how it's hardware works
> and that's not really an option for me. This isn't a simple thing like checking over the
> network if the machine is down or not, this is way more complex hardware design.

That means you are ready to live with an unreliable design. There might be
cases where notifier does not get run properly and you will not do switch
despite the fact that OS has failed. I was just trying to nudge you in
a direction which could be more reliable mechanism.

Thanks
Vivek



More information about the kexec mailing list