[RFC][PATCH] Add a sysctl option controlling kexec when MCE occurred

Eric W. Biederman ebiederm at xmission.com
Sat Dec 25 12:19:54 EST 2010


Seiji Aguchi <seiji.aguchi at hds.com> writes:

> Hi,
>
> Thank you for giving your comments.
>
>>So what is the problem you are trying to avoid, and why can't we do
>>something in the kernels initialization path to avoid initializing
>>when there is a problem?
>
> Kdump gets a dump disk identifier based on information from memory.
>
> So, kdump may receive a wrong identifier when it starts after an MCE
> has occurred, because MCEs are reported for memory, cache, and TLB errors.
>
> In the worst case, kdump will overwrite user data if it recognizes a
> disk holding user data as a dump disk.

Absurdly unlikely.  There is a sha256 checksum verified over the
kdump kernel before it starts booting.  If you have very broken
memory it is possible, but it is absurdly unlikely that the machine
will even boot if you are having enough uncorrectable memory errors
per hour to get past the sha256 checksum and then be corrupt.
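
To make the checksum argument concrete, here is a rough sketch of the
idea in Python.  It is not the actual kexec code (as far as I know the
real check is done in C, in purgatory, before jumping to the new
kernel).  The function names and the byte strings standing in for the
loaded segments are hypothetical, chosen only for illustration.

```python
import hashlib


def digest_segments(segments):
    """Return the SHA-256 hex digest over all segments, in order."""
    h = hashlib.sha256()
    for seg in segments:
        h.update(seg)
    return h.hexdigest()


def verify_before_boot(segments, expected_digest):
    """Return True only if the segments still match the recorded digest."""
    return digest_segments(segments) == expected_digest


# Hypothetical stand-ins for the loaded kdump kernel and initrd.
segments = [b"kernel image bytes", b"initrd bytes"]

# Digest recorded when the crash kernel is loaded (assumed step).
recorded = digest_segments(segments)

# Simulate a memory error: flip one bit in the first segment.
first = segments[0]
corrupted = [bytes([first[0] ^ 0x01]) + first[1:], segments[1]]

print(verify_before_boot(segments, recorded))   # True: intact image
print(verify_before_boot(corrupted, recorded))  # False: refuse to boot
```

A single flipped bit changes the digest, so a corrupted image is
rejected before any of its code runs.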

> Kdump shouldn't write any data to disk when information from
> hardware is not credible because saving user data is always first
> priority.

Which is what is already implemented.

It looks to me like you are jumping at shadows, and adding
complexity to the kernel with no gain, and significant cost.


Eric


