[PATCH 1/1] nvme: Add verbose error logging

Mon Dec 13 16:31:55 PST 2021

The main purpose of the patch is to get a bit more info into the logs on customer production
systems.   

What if we drop the string and just display the status, DNR and MORE bits:

[   56.058003] nvme0n1: Read @ LBA 2048, 1 blocks, Unrecovered Read Error (sct 0x2 / sc 0x81) DNR MORE
[   56.058075] critical medium error, dev nvme0n1, sector 2048 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0

This would be extremely useful when agoing through customer logs.

Alan

> On Dec 12, 2021, at 1:52 AM, Sagi Grimberg <sagi at grimberg.me> wrote:
> 
> 
> 
> On 12/10/21 6:56 PM, Keith Busch wrote:
>> On Fri, Dec 10, 2021 at 10:54:45AM -0500, Martin K. Petersen wrote:
>>> 
>>> Hi Chaitanya!
>>> 
>>>> Although I always prefer descriptive error messages, I've not seen even
>>>> half of these errors in the field. This will :-
>>>> 
>>>> 1. Increase the size of the driver code.
>>>> 2. Require constant patching of the driver code when spec is updated,
>>>>    I'm afraid this will turn into churn.
>>> 
>>> Updating the corresponding list in SCSI hasn't been a big problem.
>>> 
>>>> Also, we can easily track down the error in the NVMe spec from the
>>>> error code that driver is printing currently, is that not sufficient ?
>>> 
>>> Only the block layer status is captured.
>>> 
>>> In addition, customers are not really armed with the NVMe error decoder
>>> ring and are left with no explanation as to why an I/O failed. We have
>>> had good results in the field with the verbose SCSI error logging and
>>> that's why this patch was modeled to work the same way.
>> If they are not familiar with the decoder, does seeing something like
>> "Conflicting Attributes" really provide the user a better sense than
>> "0x280"? Either way, it sounds like they'd just open a bug with someone
>> who is familiar with the spec, in which case numeric codes should be
>> sufficient.
>> Are the existing nvme trace events not providing enough information? The
>> existing ones appear to have even greater detail than what this patch
>> provides. Or is there some other reason why the kernel log is preferred
>> over the event tracer for capturing these types of errors?
> 
> Maybe the string info be added to the trace events? If so, we may
> need a verbosity level for the trace to record "only errors".
> 
> Although the one advantage for the log would be the first time error
> status (or for generally rare errors).
> 
>> On the patch itself:
>> I'm not sure having it be a new Kconfig option is a good idea since it
>> requires a recompile to enable/disable. Could this be toggled with a
>> static_key instead so we may disable/enable verbose logging at runtime?
>> The new "struct nvme_string_table" is a bit inefficient. Unknown errors
>> will scan the entire table, which will slow down our hot completion path
>> on any error. The codes are small and mostly sequential, so I think you
>> could create a sparse "static const char * const[]", then the "code"
>> becomes the index that can be looked up in constant time. The total
>> space of the array shouldn't be much different than the proposal.
> 
> 100% agree.
> 
>> A significant amount of the new strings are specific to admin commands,
>> yet errors for admin commands skip the logging. Why define unreachable
>> strings? Or perhaps a better question: why skip logging admin errors?
>> IO errors can happen in large batches, so perhaps pr_err() should be
>> ratelimted.
> 
> Agree.