[PATCH v2 0/3] Fix dump-capture kernel hangs with notsc

Wei, Jiangang weijg.fnst at cn.fujitsu.com
Tue Aug 2 00:45:08 PDT 2016


Hi Eric,

First of all, thanks for your reply.

On Mon, 2016-08-01 at 12:09 -0500, Eric W. Biederman wrote:
> "Wei, Jiangang" <weijg.fnst at cn.fujitsu.com> writes:
> 
> > Ping ...
> > May I ask for some community attention to this series?
> > I purpose is fixing  the dump-capture kernel hangs in
> > calibrate_delay_converge() while specifying notsc.
> 
> Did you not see my reply to patch 3/3?

Yes, I read your email and replied to it
(https://lkml.org/lkml/2016/7/26/112).  I raised several questions in
that reply, but have not received any feedback.

> 
> The short version of my feedback is that you seem to be fixing a case
> that should not exist.  So the good fix is to skip completely past
> virtual wire mode and into full apic mode as soon as possible.

I am afraid we disagree on several points.

1)  The case where the dump-capture kernel boots with the local APIC
disabled is very real, and the bug is 100% reproducible.  I want to
emphasize that nothing guarantees the APIC interrupt mode or the state
of the local APIC, especially for the dump-capture kernel, which does
not go through the BIOS phase.  That is why I add more checks in
init_bsp_APIC() rather than relying only on the MP tables, which were
generated before the first kernel booted.

One point to note here: the BIOS must disable interrupts to all
processors and set the APICs to the system initial state before giving
control to the operating system.  That means the APICs are not reset to
their initial state when the BIOS phase is skipped.
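
To make the kind of check in 1) concrete, here is a minimal userspace
sketch.  It is not the code from the patch and not the kernel's
init_bsp_APIC(); the struct and helper names are hypothetical, and it
only decodes register values passed in as plain integers.  The bit
positions (global enable in IA32_APIC_BASE, software enable in the
spurious vector register, mask bit and delivery mode in the LVT LINT0
entry) are the architectural ones from the Intel SDM.

```c
#include <stdbool.h>
#include <stdint.h>

#define APIC_BASE_GLOBAL_ENABLE (1ULL << 11) /* IA32_APIC_BASE MSR, bit 11 */
#define SVR_SOFTWARE_ENABLE     (1U << 8)    /* spurious vector register, bit 8 */
#define LVT_MASKED              (1U << 16)   /* LVT entry mask bit */
#define LVT_DELIVERY_MODE(v)    (((v) >> 8) & 0x7U)
#define DELIVERY_MODE_EXTINT    0x7U

/* Hypothetical snapshot of the registers a boot path would have to read. */
struct lapic_snapshot {
	uint64_t apic_base_msr; /* value of IA32_APIC_BASE */
	uint32_t svr;           /* spurious interrupt vector register */
	uint32_t lvt_lint0;     /* LVT LINT0 entry */
};

/* The local APIC is usable only if it is enabled both globally and in software. */
static inline bool lapic_enabled(const struct lapic_snapshot *s)
{
	return (s->apic_base_msr & APIC_BASE_GLOBAL_ENABLE) &&
	       (s->svr & SVR_SOFTWARE_ENABLE);
}

/*
 * Virtual wire mode, as far as this sketch checks it: the APIC is enabled
 * and LINT0 passes 8259 interrupts through as ExtINT without being masked.
 * LINT1 (normally NMI) is deliberately not examined here.
 */
static inline bool lapic_lint0_virtual_wire(const struct lapic_snapshot *s)
{
	return lapic_enabled(s) &&
	       !(s->lvt_lint0 & LVT_MASKED) &&
	       LVT_DELIVERY_MODE(s->lvt_lint0) == DELIVERY_MODE_EXTINT;
}
```

The point of the sketch is only that the decision is made from the
hardware state found at boot, not from tables written before the first
kernel started.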

2)  Your proposal (switch into full APIC mode as soon as possible) seems
to contradict the Intel spec: "An MP operating system is booted under
either one of the two PC/AT-compatible modes. Later the operating system
switches to Symmetric I/O Mode **as it enters multiprocessor mode**."
In other words, the BSP should be in PIC mode or virtual wire mode
during the startup stage.

3)  The APIC initialization code may need an overhaul, but that is
outside the scope of this patch.  I am focusing on fixing the kdump
failure with notsc.  The APIC initialization code has not been modified
for a long time and can be regarded as stable, so overhauling it
increases the chance of introducing a bug.

If there is anything wrong with my understanding, please point it out.

Thanks,
wei
> 
> For a subset of cases the code already supports that.
> 
> Eric
> 
> 




