[Patch 1/4][kernel][slimdump] Add new elf-note of type NT_NOCOREDUMP to capture slimdump

Vivek Goyal vgoyal at redhat.com
Wed Oct 5 11:19:36 EDT 2011


On Wed, Oct 05, 2011 at 12:37:28PM +0530, K.Prasad wrote:

[..]
> > Well, there are MCE types for which we need to panic but we don't
> > necessarily corrupt memory. Your approach is to unconditionally avoid
> > dumping core whenever we panic while you should look at the MCE
> > signature and decide then whether to capture crashed kernel memory or
> > not.
> > 
> > For example, if the MCE signature says UC DRAM error, then you can
> > be pretty sure that there is a landmine somewhere in the DRAM region
> > mapping the crashed kernel. If it is, say, a UC when doing data fills
> > from L2 to L1, that doesn't necessarily mean that DRAM is corrupted. But
> > even in the first case, you can evaluate the MCi_ADDR reported with the
> > UC DRAM error and simply skip that particular cacheline when dumping the
> > core instead of not capturing anything at all.
> > 
> 
> True. As I stated earlier, there are two possible outcomes from
> capturing a memory dump in such cases: it is either dangerous or it
> doesn't make sense. It is best to avoid a normal kdump in both cases,
> although the elf-note doesn't distinguish between the two.

So what are your objectives here? If the panic happened due to an MCE,
don't capture a dump? If we try to capture the dump and, let's say, we
run into issues, we will reboot anyway and not capture the dump.

So the only thing you want to achieve with this patch is to give an
explicit message that the panic happened due to an MCE and hence we
did not capture the dump?
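
For comparison, the per-signature decision suggested earlier in the
thread (look at the MCE signature, and for a UC DRAM error skip only the
cacheline reported in MCi_ADDR) might look roughly like the sketch below.
This is only an illustration and not code from the patch: the names
mce_info, dump_policy and mce_dump_policy() are hypothetical, the
dram_error flag is assumed to be decoded from the error code elsewhere,
and a 64-byte cacheline is assumed. Only the UC and ADDRV bit positions
in MCi_STATUS are architectural.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural MCi_STATUS bits. */
#define MCI_STATUS_UC    (1ULL << 61)   /* uncorrected error */
#define MCI_STATUS_ADDRV (1ULL << 58)   /* MCi_ADDR holds a valid address */

#define CACHELINE_SIZE   64ULL          /* assumption: 64-byte cachelines */

/* Hypothetical outcomes for the dump path; not part of the patch. */
enum dump_policy {
	DUMP_FULL,       /* memory is not implicated, capture everything */
	DUMP_SKIP_LINE,  /* capture everything except one cacheline */
	DUMP_NONE,       /* bad location unknown, do not read old memory */
};

/* Hypothetical summary of one logged MCE. */
struct mce_info {
	uint64_t status;     /* MCi_STATUS */
	uint64_t addr;       /* MCi_ADDR */
	bool     dram_error; /* assumption: signature decoded elsewhere */
};

/*
 * Decide how much of the crashed kernel's memory to capture.
 * On DUMP_SKIP_LINE, *skip_start receives the cacheline-aligned
 * physical address that the dump should leave out.
 */
static enum dump_policy mce_dump_policy(const struct mce_info *m,
					uint64_t *skip_start)
{
	/* Not a UC DRAM error (e.g. UC on an L2 to L1 data fill). */
	if (!(m->status & MCI_STATUS_UC) || !m->dram_error)
		return DUMP_FULL;

	/* UC DRAM error but no usable address: cannot avoid the bad line. */
	if (!(m->status & MCI_STATUS_ADDRV))
		return DUMP_NONE;

	*skip_start = m->addr & ~(CACHELINE_SIZE - 1);
	return DUMP_SKIP_LINE;
}
```

With something along these lines, the "no dump at all" case would be
limited to UC DRAM errors where the address is not valid, instead of
every panic caused by an MCE.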

Thanks
Vivek


