[Patch 1/4][kernel][slimdump] Add new elf-note of type NT_NOCOREDUMP to capture slimdump

Luck, Tony tony.luck at intel.com
Tue Oct 11 14:55:05 EDT 2011


> Frankly, I don't think that it is undefined - you basically should be
> able to read DRAM albeit with the corrupted data in it. However, you
> probably best disable the whole DRAM error detection first by clearing
> a couple of bits in MC4_CTL_MASK (at least on AMD that should work, I
> dunno how Intel does that).

Intel is the same - disable machine check in CR4 (the CR4.MCE bit), and you
can read corrupted memory (multi-bit ECC error) without getting a machine
check (or any other indication that you just read garbage).
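
For anyone not familiar with the register layout: the machine-check enable
flag is bit 6 of CR4. A minimal sketch of just the bit manipulation follows.
The helper names are made up for illustration, and actually reading or
writing CR4 needs ring-0 code that is not shown here.

```c
#include <stdbool.h>
#include <stdint.h>

/* CR4.MCE (Machine-Check Enable) is bit 6 of CR4 on x86. */
#define CR4_MCE_BIT (UINT64_C(1) << 6)

/* Hypothetical helper: the given CR4 value with the MCE bit cleared. */
static inline uint64_t cr4_mce_disabled(uint64_t cr4)
{
    return cr4 & ~CR4_MCE_BIT;
}

/* Hypothetical helper: true if the MCE bit is set in the given CR4 value. */
static inline bool cr4_mce_is_enabled(uint64_t cr4)
{
    return (cr4 & CR4_MCE_BIT) != 0;
}
```

These only compute values; whoever disables machine checks for the dump
would still have to load the result into CR4 from kernel context.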

Pages that were marked as poisoned can then be handled with appropriate
suspicion by your crash dump analysis tools.

Of course if there are any other memory errors that haven't been seen
yet, those pages won't be marked as poisoned, so the crash dump tool will
have no idea that it is looking at invalid data. This could be a problem
if whatever caused the memory problem affected more than a single location.

So if you do disable machine check in order to get a crash dump, you should
be conservative and mark the whole file as "possibly garbage".
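
A rough sketch of how an analysis tool might apply that rule. All of the
names here (page_trust, classify_page, the poisoned-PFN array, the
mce_was_disabled flag) are hypothetical and do not come from the patch
series; it only illustrates the reasoning: known-poisoned pages are bad, and
if machine checks were off while dumping, nothing else can be fully trusted
either.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical trust levels a dump analysis tool could attach to a page. */
enum page_trust {
    PAGE_TRUSTED,   /* dumped with machine checks on, not poisoned */
    PAGE_SUSPECT,   /* not known bad, but errors could not be detected */
    PAGE_POISONED   /* explicitly marked as poisoned in the dump */
};

/* Linear search of the (assumed small) list of poisoned page frame numbers. */
static bool pfn_is_poisoned(uint64_t pfn, const uint64_t *poisoned, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (poisoned[i] == pfn)
            return true;
    return false;
}

/*
 * If machine checks were disabled while the dump was taken, every page that
 * is not already marked poisoned is only "suspect", never "trusted".
 */
enum page_trust classify_page(uint64_t pfn, const uint64_t *poisoned,
                              size_t n, bool mce_was_disabled)
{
    if (pfn_is_poisoned(pfn, poisoned, n))
        return PAGE_POISONED;
    return mce_was_disabled ? PAGE_SUSPECT : PAGE_TRUSTED;
}
```

How the "machine checks were disabled" fact gets recorded in the dump (a
note, a header flag, something else) is left open here.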

-Tony

