[PATCHv2 1/7] nvme: export command retry count via sysfs

Sagi Grimberg sagi at grimberg.me
Sat Feb 7 05:28:30 PST 2026



On 05/02/2026 14:48, Nilay Shroff wrote:
> When Advanced Command Retry Enable (ACRE) is configured, a controller
> may interrupt command execution and return a completion status
> indicating command interrupted with the DNR bit cleared. In this case,
> the driver retries the command based on the Command Retry Delay (CRD)
> value provided in the completion status.
>
> Currently, these command retries are handled entirely within the NVMe
> driver and are not visible to userspace. As a result, there is no
> observability into retry behavior, which can be a useful diagnostic
> signal.
>
> Expose the command retry count through sysfs to provide visibility
> into retry activity. This information can help identify controller-side
> congestion under load and enables comparison across paths in multipath
> setups (for example, detecting cases where one path experiences
> significantly more retries than another under identical workloads).
>
> This exported metric is intended for diagnostics and monitoring tools
> such as nvme-top, and does not change command retry behavior.
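
For context, the retry decision described in the quoted text can be sketched roughly as below. This is a minimal userspace illustration, not the patch's code: it assumes the completion status has already had the phase bit shifted out, uses the status-field bit positions as I understand them from the NVMe spec, and the `ns_stats` struct and `should_retry()` helper are hypothetical names. A real driver also applies retry limits and other checks that are left out here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Status field layout with the phase bit already shifted out. */
#define CRD_MASK           0x1800u  /* Command Retry Delay, bits 12:11 */
#define CRD_SHIFT          11
#define DNR_BIT            0x4000u  /* Do Not Retry, bit 14 */
#define SC_CMD_INTERRUPTED 0x21u    /* generic status: Command Interrupted */

/* Hypothetical per-namespace counter; the actual patch may name it differently. */
struct ns_stats {
	uint64_t retries;
};

/*
 * Decide whether a failed command may be retried and, if so, how long to wait.
 * crdt[] holds CRDT1..CRDT3 from Identify Controller, in units of 100 ms.
 * Returns true and fills *delay_ms when a retry is allowed.
 */
static bool should_retry(uint16_t status, const uint16_t crdt[3],
			 struct ns_stats *stats, unsigned int *delay_ms)
{
	unsigned int crd = (status & CRD_MASK) >> CRD_SHIFT;

	if (status & DNR_BIT)
		return false;		/* controller asked us not to retry */

	/* CRD == 0 means retry immediately; 1..3 select CRDT1..CRDT3. */
	*delay_ms = crd ? crdt[crd - 1] * 100u : 0;
	stats->retries++;		/* the value the patch wants to expose */
	return true;
}
```

With CRDT values of {1, 5, 20}, a Command Interrupted status carrying CRD=2 and DNR cleared would be retried after 500 ms and bump the counter by one; the same status with DNR set would not be retried or counted.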

So this is designed to show an accumulated count of how many retries were
done on a namespace since boot? I'm wondering whether this belongs in
debugfs instead.


