[V3 PATCH 3/4] kexec: Fix race between panic() and crash_kexec() called directly

河合英宏 / KAWAI,HIDEHIRO hidehiro.kawai.ez at hitachi.com
Tue Aug 25 20:11:28 PDT 2015


Hi,

> From: Peter Zijlstra [mailto:peterz at infradead.org]
> 
> On Sat, Aug 22, 2015 at 02:35:24AM +0000, 河合英宏 / KAWAI,HIDEHIRO wrote:
> > > From: Peter Zijlstra [mailto:peterz at infradead.org]
> > >
> > > On Thu, Aug 06, 2015 at 02:45:43PM +0900, Hidehiro Kawai wrote:
> > > >  void crash_kexec(struct pt_regs *regs)
> > > >  {
> > > > +	int old_cpu, this_cpu;
> > > > +
> > > > +	/*
> > > > +	 * `old_cpu == -1' means we are the first comer and crash_kexec()
> > > > +	 * was called without entering panic().
> > > > +	 * `old_cpu == this_cpu' means crash_kexec() was called from panic().
> > > > +	 */
> > > > +	this_cpu = raw_smp_processor_id();
> > > > +	old_cpu = atomic_cmpxchg(&panic_cpu, -1, this_cpu);
> > > > +	if (old_cpu != -1 && old_cpu != this_cpu)
> > > > +		return;
> > >
> > > This allows recursive calling of crash_kexec(), the Changelog did not
> > > mention that. Is this really required?
> >
> > What part are you arguing?  Recursive call of crash_kexec() doesn't
> > happen.  In the first place, one of the purpose of this patch is
> > to prevent a recursive call of crash_kexec() in the following case
> > as I stated in the description:
> >
> > CPU 0:
> >   oops_end()
> >     crash_kexec()
> >       mutex_trylock() // acquired
> >         <NMI>
> >         io_check_error()
> >           panic()
> >             crash_kexec()
> >               mutex_trylock() // failed to acquire
> >             infinite loop
> >
> 
> Yes, but what to we want to do there? It seems to me that is wrong, we
> do not want to let a recursive crash_kexec() proceed.
> 
> Whereas the condition you created explicitly allows this recursion by
> virtue of the 'old_cpu != this_cpu' check.

I understand your question.  I don't intend to permit the recursive
call of crash_kexec() as for 'old_cpu != this_cpu' check.  That is
needed for the case of panic() --> crash_kexec().  Since panic_cpu has
already been set to this_cpu in panic() (please see PATCH 1/4), no one
can run crash_kexec() without 'old_cpu != this_cpu' check.

If you don't like this check, I would also be able to handle this case
like below:

crash_kexec()
{
	old_cpu = atomic_cmpxchg(&panic_cpu, -1, this_cpu);
	if (old_cpu != -1)
		return;

	__crash_kexec();
}

panic()
{
	atomic_cmpxchg(&panic_cpu, -1, this_cpu);
	__crash_kexec();
...


Regards,

Hidehiro Kawai
Hitachi, Ltd. Research & Development Group



More information about the kexec mailing list