[PATCH 0/7] nvme: export additional diagnostic counters via sysfs

Nilay Shroff nilay at linux.ibm.com
Tue Feb 3 01:07:21 PST 2026



On 2/3/26 4:26 AM, Hannes Reinecke wrote:
> On 1/30/26 19:20, Nilay Shroff wrote:
>> Hi,
>>
>> The NVMe driver encounters various events and conditions during normal
>> operation that are either not tracked today or not exposed to userspace
>> via sysfs. Lack of visibility into these events can make it difficult to
>> diagnose subtle issues related to controller behavior, multipath
>> stability, and I/O reliability.
>>
>> This patchset adds several diagnostic counters that provide improved
>> observability into NVMe behavior. These counters are intended to help
>> users understand events such as transient path unavailability,
>> controller retries/reconnect/reset, failovers, and I/O failures. They
>> can also be consumed by monitoring tools such as nvme-top.
>>
>> Specifically, this series proposes to export the following counters via
>> sysfs:
>>    - Command retry count
>>    - Multipath failover count
>>    - Command error count
>>    - I/O requeue count
>>    - I/O failure count
>>    - Controller reset event counts
>>    - Controller reconnect counts
>>
>> The patchset consists of seven patches:
>>    Patch 1: Export command retry count
>>    Patch 2: Export multipath failover count
>>    Patch 3: Export command error count
>>    Patch 4: Export I/O requeue count
>>    Patch 5: Export I/O failure count
>>    Patch 6: Export controller reset event counts
>>    Patch 7: Export controller reconnect event count
>>
>> Please note that this patchset doesn't make any functional changes; it
>> only exports the relevant counters to user space via sysfs.
>>
>> As usual, feedback/comments/suggestions are welcome!
>>
> 
> While I do agree with the general idea, I do wonder whether debugfs
> would not be a better suited place for all of this. Having all of
> this information in sysfs will clutter it quite a bit, plus we
> do have the usual issues with ABI stability if we ever see the need
> to change (or, heaven forbid, remove) any of these counters.
> 
> (And when doing so it might be an idea to add a 'version' entry
> to debugfs such that we can manage userspace expectation).
> 
I understand the concern regarding ABI stability and potential sysfs clutter.
However, one of the challenges with relying on debugfs is that it is not always
guaranteed to be available or enabled in production environments. As a result,
exposing these statistics exclusively via debugfs could limit their usefulness
for real-world deployments.

These counters are intended to be consumed by user-space tools such as nvme-cli/
nvme-top, which are often used on production systems for monitoring and diagnostics.
Relying on debugfs in those environments would therefore not be dependable.
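To illustrate the consumer side, here is a minimal sketch of how a polling tool
in the nvme-top style might read such per-controller counters from sysfs. The
attribute names below (retry_count, failover_count, and so on) are placeholders
I made up for illustration; the real names are whatever the individual patches
define, and the sketch deliberately tolerates attributes that are absent on
older kernels.

```python
#!/usr/bin/env python3
"""Sketch: polling hypothetical per-controller NVMe counters from sysfs.

Assumptions (not from the patchset itself): the attribute file names in
COUNTERS, and a controller directory such as /sys/class/nvme/nvme0.
"""
from pathlib import Path

# Hypothetical attribute names mirroring the counters proposed in this series.
COUNTERS = (
    "retry_count",
    "failover_count",
    "error_count",
    "requeue_count",
    "failure_count",
    "reset_count",
    "reconnect_count",
)

def read_counters(ctrl_dir):
    """Read each counter file under a controller's sysfs directory.

    Returns None for a counter whose attribute is missing or unparsable,
    so the tool keeps working across kernel versions.
    """
    stats = {}
    for name in COUNTERS:
        path = Path(ctrl_dir) / name
        try:
            stats[name] = int(path.read_text().strip())
        except (FileNotFoundError, ValueError):
            stats[name] = None
    return stats

def delta(prev, curr):
    """Per-interval increments, as a tool would display between two polls."""
    return {
        k: (curr[k] - prev[k])
        if prev.get(k) is not None and curr.get(k) is not None
        else None
        for k in curr
    }
```

Because sysfs attributes are plain text files containing a single value, this
kind of consumer needs no special library support, which is part of the appeal
over a debugfs-only interface.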

In fact, there has been prior discussion expressing reservations about relying on
debugfs for NVMe-related statistics. For example, Daniel previously raised concerns
in the context of nvme-cli:
https://lore.kernel.org/all/803f429d-60f3-4af7-9535-37a2038e53c1@flourine.local/

Given this, IMO, exposing a carefully scoped and well-documented set of diagnostic
counters via sysfs seems more appropriate for long-term usability, provided we
remain mindful of ABI stability considerations.

Thanks,
--Nilay
