[PATCH net-next v2 0/2] kernel: add support to collect hardware logs in crash recovery kernel

Dave Anderson anderson at redhat.com
Tue Mar 27 09:29:31 PDT 2018



----- Original Message -----
> 
> On Tuesday, March 03/27/18, 2018 at 18:47:34 +0530, Eric W. Biederman wrote:
> > Rahul Lakkireddy <rahul.lakkireddy at chelsio.com> writes:
> > 
> > > On Saturday, March 03/24/18, 2018 at 20:50:52 +0530, Eric W. Biederman wrote:
> > >> 
> > >> Rahul Lakkireddy <rahul.lakkireddy at chelsio.com> writes:
> > >> 
> > >> > On production servers running variety of workloads over time, kernel
> > >> > panic can happen sporadically after days or even months. It is
> > >> > important to collect as much debug logs as possible to root cause
> > >> > and fix the problem, that may not be easy to reproduce. Snapshot of
> > >> > underlying hardware/firmware state (like register dump, firmware
> > >> > logs, adapter memory, etc.), at the time of kernel panic will be very
> > >> > helpful while debugging the culprit device driver.
> > >> >
> > >> > This series of patches add new generic framework that enable device
> > >> > drivers to collect device specific snapshot of the hardware/firmware
> > >> > state of the underlying device in the crash recovery kernel. In crash
> > >> > recovery kernel, the collected logs are exposed via /sys/kernel/crashdd/
> > >> > directory, which is copied by user space scripts for post-analysis.
> > >> >
> > >> > A kernel module crashdd is newly added. In crash recovery kernel,
> > >> > crashdd exposes /sys/kernel/crashdd/ directory containing device
> > >> > specific hardware/firmware logs.
> > >> 
> > >> Have you looked at instead of adding a sysfs file adding the dumps
> > >> as additional elf notes in /proc/vmcore?
> > >> 
> > >
> > > I see the crash recovery kernel's memory is not present in any of the
> > > the PT_LOAD headers.  So, makedumpfile is not collecting the dumps
> > > that are in crash recovery kernel's memory.
> > >
> > > Also, are you suggesting exporting the dumps themselves as PT_NOTE
> > > instead?  I'll look into doing it this way.
> > 
> > Yes.  I was suggesting exporting the dumps themselves as PT_NOTE
> > in /proc/vmcore.  I think that will allow makedumpfile to collect
> > your new information without modification.
> > 
> 
> If I export the dumps themselves as PT_NOTE in /proc/vmcore, can the
> crash tool work without modification; i.e can crash tool extract these
> notes?
> 
> Thanks,
> Rahul

The crash utility will continue to work without modification.  If the
dumpfile is still in its ELF format, crash will show the PT_NOTE header
and do a raw dump of the contents of the note (i.e., just a stream of
64-bit words, so if it's ASCII data, it won't be too useful).  For a 
compressed kdump, I believe that makedumpfile copies all PT_NOTEs 
to the compressed dumpfile header, but the dumpfile header does not
currently contain a direct pointer to each note.  Here is what's there
now in version 6 of the kdump_sub_header:

 * struct kdump_sub_header {
 * [0]     unsigned long   phys_base;
 * [4]     int             dump_level;         /  header_version 1 and later  /
 * [8]     int             split;              /  header_version 2 and later  /
 * [12]    unsigned long   start_pfn;          /  header_version 2 and later  /
 * [16]    unsigned long   end_pfn;            /  header_version 2 and later  /
 * [20]    off_t           offset_vmcoreinfo;  /  header_version 3 and later  /
 * [28]    unsigned long   size_vmcoreinfo;    /  header_version 3 and later  /
 * [32]    off_t           offset_note;        /  header_version 4 and later  /
 * [40]    unsigned long   size_note;          /  header_version 4 and later  /
 * [44]    off_t           offset_eraseinfo;   /  header_version 5 and later  /
 * [52]    unsigned long   size_eraseinfo;     /  header_version 5 and later  /
 * [56]    unsigned long long   start_pfn_64;  /  header_version 6 and later  /
 * [64]    unsigned long long   end_pfn_64;    /  header_version 6 and later  /
 * [72]    unsigned long long   max_mapnr_64;  /  header_version 6 and later  /

Note that explicit pointers only exist for the vmcoreinfo and eraseinfo notes,
but there are other notes (e.g., the NT_PRSTATUS and QEMU notes) that the crash
utility digs out of the full "size_note" segment of dumpfile memory, which
contains a copy of all notes from the original ELF /proc/vmcore file.  So as
things stand, crash would not extract or display a new note type from a
compressed kdump.
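
As a rough sketch of what digging one note out of that segment involves
(this is not the crash utility's actual code), the following assumes the
segment is a back-to-back sequence of standard ELF note records in host
byte order.  The names find_note and struct note_hdr are illustrative only,
and any note name or type for the new hardware dumps would be whatever the
kernel patch ends up defining.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same layout as Elf32_Nhdr/Elf64_Nhdr in <elf.h>: three 32-bit words. */
struct note_hdr {
	uint32_t n_namesz;	/* length of the name, including its NUL */
	uint32_t n_descsz;	/* length of the descriptor (the payload) */
	uint32_t n_type;	/* note type, e.g. NT_PRSTATUS is 1 */
};
static_assert(sizeof(struct note_hdr) == 12, "unexpected note header size");

/* Name and descriptor are each padded to a 4-byte boundary. */
static size_t note_align(size_t n)
{
	return (n + 3) & ~(size_t)3;
}

/*
 * Hypothetical helper: search buf[0..len) for a note with the given
 * name and type.  Returns a pointer to the descriptor and stores its
 * size in *descsz, or returns NULL if no such note is found or a
 * record runs past the end of the buffer.
 */
static const void *find_note(const void *buf, size_t len, const char *name,
			     uint32_t type, size_t *descsz)
{
	const unsigned char *p = buf;
	size_t want = strlen(name) + 1;
	size_t off = 0;

	while (len - off >= sizeof(struct note_hdr)) {
		struct note_hdr h;
		size_t name_off, desc_off, next;

		memcpy(&h, p + off, sizeof(h));	/* avoid unaligned access */
		name_off = off + sizeof(h);
		if (h.n_namesz > len - name_off)
			return NULL;
		desc_off = name_off + note_align(h.n_namesz);
		if (desc_off > len || h.n_descsz > len - desc_off)
			return NULL;

		if (h.n_type == type && h.n_namesz == want &&
		    memcmp(p + name_off, name, want) == 0) {
			*descsz = h.n_descsz;
			return p + desc_off;
		}

		next = desc_off + note_align(h.n_descsz);
		if (next > len)		/* last note lacks trailing padding */
			break;
		off = next;
	}
	return NULL;
}
```

A caller would read size_note bytes at offset_note from the dumpfile into a
buffer and pass that buffer and length to find_note, together with whatever
name and type the new note uses.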

As for extraction in a format of your liking, you can always post a patch
to the crash utility mailing list that extracts and/or displays the note
in whatever format you desire.

Dave