[PATCH 0/2] add new notifier function, take3

Thu Apr 17 01:31:38 EDT 2008

Andrew Morton wrote:
> On Mon, 14 Apr 2008 12:01:46 -0400
> Neil Horman <nhorman at redhat.com> wrote:
> 
>> On Mon, Apr 14, 2008 at 10:53:23AM -0400, Vivek Goyal wrote:
>>> On Mon, Apr 14, 2008 at 10:42:28AM -0400, Neil Horman wrote:
>>>> On Mon, Apr 14, 2008 at 09:46:22AM -0400, Vivek Goyal wrote:
>>>>> On Fri, Apr 11, 2008 at 09:07:51PM -0700, Andrew Morton wrote:
>>>>>
>>>>> [..]
>>>>>>> Kernel panic - not syncing: Panic by panic_module.
>>>>>>> __tunable_atomic_notifier_call_chain enter
>>>>>>> msg_handler:panic_event was called.
>>>>>>> ipmi_wdog:wdog_panic_handler was called.
>>>>>>> notifier_test: notifier_test_panic() is called.
>>>>>>> notifier_test: notifier_test_panic2() is called.
>>>>>> OK.  But I don't see anywhere in here the most important piece of
>>>>>> information: why do we need this feature in Linux?
>>>>>>
>>>>>> What are the use-cases?  What is the value?  etc.
>>>>>>
>>>>>> Often I can guess (but I like the originator to remove the guesswork).  In
>>>>>> this case I'm stumped - I can't see any reason why anyone would want this.
>>>>>>
>>>>> Hi Andrew,
>>>>>
>>>>> To begin with, he wants kdb, kgdb etc to co-exist with kdump. He wants
>>>>> to put all the RAS tools (which are interested in the panic event) on a
>>>>> list, export it to user space, and let the user decide in what order the
>>>>> tools get executed at panic time (based on priority).
>>>>>
>>>>> This raises some reliability concerns for kdump, because notifier
>>>>> code is run after the panic.
>>>>>
>>>>> I think people want to use this infrastructure beyond RAS tools. I
>>>>> remember somebody wanting to send a message to remote node after a
>>>>> panic (before kdump kicks in)  so that remote node can initiate failover
>>>>> etc.
>>>>>
>>>> I know it doesn't particularly relate to this patch, but FWIW, for cases like
>>>> failover, I've inserted infrastructure in the userspace part of kdump for
>>>> Fedora/RHEL to support this sort of thing.  We can run arbitrary scripts right
>>>> before and after a capture so that notifications can be sent to remote nodes in
>>>> a much safer fashion than using the notifier chain after a panic.
>>>> Neil
>>>>
>>> That's great. I did not know about these. So user can write custom
>>> scripts/binaries which can be packed into kdump initrd and executed either
>>> before or after dump capture? Any idea, if somebody has started using it
>>> already?
>>>
>> That's exactly right.  I'm not sure if there is any serious use as of yet, but
>> I've had some inquiries about it.  Specific cases that I recall include:
>>
>> 1) A set of users in Japan that are using the pre-dump script to block execution
>> until a scsi controller detects all its drives (it apparently takes up to three
>> minutes to scan its bus)
>>
>> 2) I think some people using clustering services were using the pre-script to
>> notify cluster peers of the failure to avoid power fencing while a node
>> completed the crash dump
>>
>> 3) A national lab had an interest in using the pre script to send an email to an
>> administrative address to log the failure in a cluster 
>>
> 
> OK, thanks.
> 
> I think I'll duck the patch for now as it seems that a little more thought
> and coordination is needed.
> 
> Plus it appears that the only users of this infrastructure are provided via
> presently-out-of-tree patches, so people who are already patching and
> building their own kernels can easily add this other patch as well, for now.
> 
> 

Hi,

One of the reasons I want this functionality is to manage RAS tool
behavior for postmortem actions, starting with kdb invocation.
(I found kdb very useful for debugging and crash analysis in the lkcd
days, but today it is a "want", not a "must" ;-))

Another postmortem action is disabling the hardware watchdog.
The watchdog handler stops the keepalive heartbeat when the system
panics, so we must disable the hardware watchdog as soon as possible:
the 2nd kernel takes some time to start up (10 or 100? secs), leaving
a window in which the watchdog can fire. But currently we have no
chance to do anything before crash_kexec().

And think about clustering software. If the system panics, it must
notify the standby node. But... :-(

I am interested in the pre-dump scripts Neil mentioned. I think they
can meet some of our requirements. I will try them.

As for invoking kdump quickly, I partially agree with the idea that
"kdump should be invoked as soon as the system panics, since we cannot
trust a broken kernel", but we would like to have some choice about
what to do on panic (and if the notifier is controllable via my patch,
you can still call kdump first).
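
To illustrate the ordering point, here is a minimal userspace sketch
of a priority-ordered notifier chain. It is not the code from the
patch. All names (struct notifier, chain_register, chain_set_priority,
record, trace) are hypothetical and exist only for this example. It
assumes the same convention as kernel notifier chains, where a larger
priority value runs earlier, and it models a run-time priority change
as "remove and re-insert".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model, not the patch's API. */
struct notifier {
    int (*call)(struct notifier *self, const char *event);
    int priority;               /* larger value runs earlier */
    struct notifier *next;
};

/* Insert n so the list stays sorted by descending priority.
 * Entries with equal priority keep their registration order. */
static void chain_register(struct notifier **head, struct notifier *n)
{
    assert(n != NULL && n->call != NULL);
    while (*head && (*head)->priority >= n->priority)
        head = &(*head)->next;
    n->next = *head;
    *head = n;
}

/* Unlink n from the list if it is present. */
static void chain_unregister(struct notifier **head, struct notifier *n)
{
    while (*head && *head != n)
        head = &(*head)->next;
    if (*head)
        *head = n->next;
}

/* A run-time priority change is modelled as remove + re-insert. */
static void chain_set_priority(struct notifier **head,
                               struct notifier *n, int priority)
{
    chain_unregister(head, n);
    n->priority = priority;
    chain_register(head, n);
}

/* Call every handler in list order; return how many were called. */
static int chain_call(struct notifier *head, const char *event)
{
    int called = 0;

    for (; head; head = head->next) {
        head->call(head, event);
        called++;
    }
    return called;
}

/* Example handler: records which notifier ran, in order. */
static struct notifier *trace[8];
static int trace_len;

static int record(struct notifier *self, const char *event)
{
    (void)event;
    if (trace_len < 8)
        trace[trace_len++] = self;
    return 0;
}
```

In this model, if kdb is registered at priority 10, a watchdog handler
at 5 and kdump at 0, the chain runs kdb, then the watchdog handler,
then kdump. Raising kdump to priority 100 moves it to the front, which
is the "you can still call kdump first" case.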

Anyway, a completely broken kernel cannot call kdump or any other
mechanism ;-P and I feel it is somewhat a matter of degree.

Thanks,
    Takenori