[Help Test] kdump, x86, acpi: Reproduce CPU0 SMI corruption issue after unsetting BSP flag

Petr Tesarik ptesarik at suse.cz
Mon Aug 19 09:46:26 EDT 2013


On Sun, 18 Aug 2013 19:59:53 -0700
"Eric W. Biederman" <ebiederm at xmission.com> wrote:

> >Sorry Eric, I'm not clear to what you mean by ``short one core''...
> >Which are you suggesting? Disabling BSP if crash happens on AP is
> >reasonable?
> >Or restricting cpus to a single one only just as the current kdump
> >configuration is reasonable?
> 
> I am suggesting we start every cpu except the BSP from the AP we started on.
> 
> N-1 cpus seems like a good tradeoff between performance and reliability for those who need it.

FWIW a large customer of ours is fine with such a limitation. And I
have already tested this approach manually (starting the kdump kernel
with maxcpus=1 and hot-plugging the remaining APs from user-space).
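For illustration, the manual test can be sketched roughly as follows. This
is only a sketch, not the exact procedure that was used: it assumes the
standard sysfs CPU hotplug files (/sys/devices/system/cpu/cpuN/online), and
the function name and the "skip" parameter are hypothetical. Which logical
CPU number corresponds to the BSP in the kdump kernel has to be determined
separately and passed in by the caller.

```python
import glob
import os
import re


def online_secondary_cpus(sysfs_root="/sys/devices/system/cpu", skip=()):
    """Bring offline CPUs online via sysfs, except those listed in `skip`.

    `skip` is a hypothetical parameter: the caller passes the logical CPU
    number(s) that must stay offline (for example the original BSP).
    Returns the sorted list of CPU numbers that were switched on.
    """
    candidates = []
    for path in glob.glob(os.path.join(sysfs_root, "cpu[0-9]*")):
        match = re.fullmatch(r"cpu(\d+)", os.path.basename(path))
        if match:
            candidates.append((int(match.group(1)), path))

    started = []
    for cpu, path in sorted(candidates):
        if cpu in skip:
            continue
        control = os.path.join(path, "online")
        # The boot CPU often has no "online" file; leave it alone.
        if not os.path.exists(control):
            continue
        with open(control) as f:
            if f.read().strip() == "1":
                continue  # already running
        with open(control, "w") as f:
            f.write("1")
        started.append(cpu)
    return started
```

Run as root in a kdump kernel booted with maxcpus=1, a call such as
online_secondary_cpus(skip={n}), with n being the logical number of the
BSP, would start the remaining APs and leave the BSP offline.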

Now that this approach is in line with upstream efforts, I'm going to
test it on some more machines and see if there are any troubles.

@Hatayama-san:
> BTW, I have question that does normal kdump work well if crash happens
> on some AP? I wonder the same issue could happen on the 2nd kernel.

I'm not sure what you mean. Normal kdump starts with "maxcpus=1", and
yes, that works even if the secondary kernel is booted from an AP. OTOH
I suspect that not having any BSP in the system may be the cause of some
mysterious random reboots and/or hangs experienced by some customers.

I'll try setting the BSP flag on the boot CPU unconditionally and see
if it makes any difference.
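
As a reminder of what the flag is: on x86 the BSP flag is bit 8 of the
IA32_APIC_BASE MSR (0x1B). The sketch below only shows the bit arithmetic
on a raw MSR value. It does not touch any hardware, and the helper names
are made up for this example; it is not the kernel change itself.

```python
IA32_APIC_BASE_MSR = 0x1B   # MSR address of IA32_APIC_BASE
APIC_BASE_BSP = 1 << 8      # BSP flag: set on the bootstrap processor


def is_bsp(msr_value):
    """Return True if the BSP flag is set in a raw IA32_APIC_BASE value."""
    return bool(msr_value & APIC_BASE_BSP)


def set_bsp(msr_value):
    """Return the MSR value with the BSP flag set, other bits unchanged."""
    return msr_value | APIC_BASE_BSP


def clear_bsp(msr_value):
    """Return the MSR value with the BSP flag cleared, other bits unchanged."""
    return msr_value & ~APIC_BASE_BSP
```

With the default APIC base of 0xFEE00000 and the APIC enabled (bit 11), a
BSP would typically read 0xFEE00900 and an AP 0xFEE00800.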

Petr Tesarik


