[PATCH net-next v2 1/2] fs/crashdd: add API to collect hardware dump in second kernel

Mon Apr 2 05:30:45 PDT 2018

On Monday, April 04/02/18, 2018 at 14:41:43 +0530, Jiri Pirko wrote:
> Fri, Mar 30, 2018 at 08:42:00PM CEST, ebiederm at xmission.com wrote:
> >Rahul Lakkireddy <rahul.lakkireddy at chelsio.com> writes:
> >
> >> On Friday, March 03/30/18, 2018 at 16:09:07 +0530, Jiri Pirko wrote:
> >>> Sat, Mar 24, 2018 at 11:56:33AM CET, rahul.lakkireddy at chelsio.com wrote:
> >>> >Add a new module crashdd that exports the /sys/kernel/crashdd/
> >>> >directory in second kernel, containing collected hardware/firmware
> >>> >dumps.
> >>> >
> >>> >The sequence of actions done by device drivers to append their device
> >>> >specific hardware/firmware logs to the /sys/kernel/crashdd/ directory is
> >>> >as follows:
> >>> >
> >>> >1. During probe (before hardware is initialized), device drivers
> >>> >register with the crashdd module (via crashdd_add_dump()), passing a
> >>> >callback function, along with the buffer size and log name needed for
> >>> >firmware/hardware log collection.
> >>> >
> >>> >2. Crashdd creates a driver's directory under
> >>> >/sys/kernel/crashdd/<driver>. Then, it allocates the buffer with
> >>> 
> >>> This smells. I need to identify the exact ASIC instance that produced
> >>> the dump. To identify by driver name does not help me if I have multiple
> >>> instances of the same driver. This looks wrong to me. This looks like
> >>> a job for devlink where you have 1 devlink instance per 1 ASIC instance.
> >>> 
> >>> Please see:
> >>> http://patchwork.ozlabs.org/project/netdev/list/?series=36524
> >>> 
> >>> I believe that the solution in the patchset could be used for
> >>> your use case too.
> >>> 
> >>> 
> >>
> >> The sysfs approach proposed here has been dropped in favour of exporting
> >> the dumps as ELF notes in /proc/vmcore.
> >>
> >> Will be posting the new patches soon.
> >
> >The concern was actually how you identify which device the dump came from.
> >Where you read the identifier changes, but whether it is sysfs or
> >/proc/vmcore, the concern remains valid.
> 
> Yeah. I still don't see how you link the dump and the device.

In our case, each dump is tied to its device by the driver’s name
followed by the device’s PCI bus ID.  I’ve posted an example in my
v3 series:

https://www.spinics.net/lists/netdev/msg493781.html

Here’s an extract from the link above:

# readelf -n /proc/vmcore

Displaying notes found at file offset 0x00001000 with length 0x04003288:
Owner                 Data size     Description
VMCOREDD_cxgb4_0000:02:00.4 0x02000fd8      Unknown note type:(0x00000700)
VMCOREDD_cxgb4_0000:04:00.4 0x02000fd8      Unknown note type:(0x00000700)
CORE                 0x00000150     NT_PRSTATUS (prstatus structure)
CORE                 0x00000150     NT_PRSTATUS (prstatus structure)
CORE                 0x00000150     NT_PRSTATUS (prstatus structure)
CORE                 0x00000150     NT_PRSTATUS (prstatus structure)
CORE                 0x00000150     NT_PRSTATUS (prstatus structure)
CORE                 0x00000150     NT_PRSTATUS (prstatus structure)
CORE                 0x00000150     NT_PRSTATUS (prstatus structure)
CORE                 0x00000150     NT_PRSTATUS (prstatus structure)
VMCOREINFO           0x0000074f     Unknown note type: (0x00000000)

Here, for my two devices, the dump names are
VMCOREDD_cxgb4_0000:02:00.4 and VMCOREDD_cxgb4_0000:04:00.4.

It’s really up to the callers to choose a unique name for each dump.
That name is appended to the “VMCOREDD_” prefix.
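
As a rough illustration of the naming scheme (a userspace sketch, not the
code from the patch), the note name could be composed as below.  The helper
vmcoredd_note_name() and its signature are made up for this example; only
the “VMCOREDD_” prefix and the <driver>_<pci bus id> layout come from the
readelf output above.

```c
#include <assert.h>
#include <stdio.h>

/* Prefix as seen in the note owner names in the readelf output. */
#define VMCOREDD_PREFIX "VMCOREDD_"

/*
 * Hypothetical helper (assumption, not part of the patch): write
 * "VMCOREDD_<driver>_<bus_id>" into buf.
 * Returns 0 on success, -1 if the name did not fit in size bytes.
 */
static int vmcoredd_note_name(char *buf, size_t size,
                              const char *driver, const char *bus_id)
{
    int n;

    assert(buf != NULL && size > 0);

    /* The caller-chosen unique part is "<driver>_<bus_id>". */
    n = snprintf(buf, size, "%s%s_%s", VMCOREDD_PREFIX, driver, bus_id);

    /* snprintf() returns the untruncated length; reject truncation. */
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

Calling it with driver "cxgb4" and bus id "0000:02:00.4" fills the buffer
with VMCOREDD_cxgb4_0000:02:00.4, which matches the first note shown above.
Two instances of the same driver get distinct names as long as their bus
ids differ.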

> Rahul, did you look at the patchset I pointed out?

For devlink, I think the dump name would be identified by
bus_type/device_name; i.e. “pci/0000:02:00.4” for my example.
Is my understanding correct?

Thanks,
Rahul