[PATCH 0/7] nvme-cli: add nvme top command for real-time monitoring

Nilay Shroff nilay at linux.ibm.com
Thu Apr 30 03:52:21 PDT 2026


Hi,

Monitoring NVMe devices and paths in production is currently limited to
static snapshots via nvme-cli. While this is sufficient for basic
inspection, it is not ideal for NVMe-oF (fabrics) deployments where path
conditions can change dynamically due to varying network latency,
congestion, or link failures.

In multipath environments, administrators often need continuous
visibility into path state, ANA status, queue depth, link speed, and
error counters. Today, this typically requires repeatedly invoking
commands or relying on ad-hoc tooling, making it harder to quickly
identify issues.

This patch series introduces "nvme top", a tool for real-time monitoring
of NVMe devices and fabrics paths, similar in spirit to tools such as
top or iotop. The goal is to provide a continuously updating view of
device and path health, enabling faster detection of link degradation,
multipath imbalances, and transient failures.

The series first adds the necessary building blocks for supporting a
top-like dashboard. The initial patches extend the table APIs (including
support for additional data types such as unsigned, long, float, and
double) and introduce a generic dashboard framework. The final patch
adds the nvme top command built on top of this framework.
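
As a rough illustration of the table changes (not code from the series),
the width of a typed value can be computed by asking snprintf() for the
formatted length. The names val_kind, val, val_width() and center_pad()
are hypothetical and do not reflect the actual util/table.h API; the
two-decimal float format is likewise an assumption.

```c
#include <stdio.h>

/* Hypothetical value kinds; the real table API in util/table.h may differ. */
enum val_kind { VAL_INT, VAL_UNSIGNED, VAL_LONG, VAL_DOUBLE };

struct val {
	enum val_kind kind;
	union {
		int i;
		unsigned int u;
		long l;
		double d;
	} v;
};

/*
 * Return the number of characters needed to print the value.
 * snprintf() with a NULL buffer and size 0 writes nothing and only
 * reports the length the output would have.
 */
static int val_width(const struct val *val)
{
	switch (val->kind) {
	case VAL_INT:
		return snprintf(NULL, 0, "%d", val->v.i);
	case VAL_UNSIGNED:
		return snprintf(NULL, 0, "%u", val->v.u);
	case VAL_LONG:
		return snprintf(NULL, 0, "%ld", val->v.l);
	case VAL_DOUBLE:
		/* Two decimals is an arbitrary choice for this sketch. */
		return snprintf(NULL, 0, "%.2f", val->v.d);
	}
	return 0;
}

/* Left padding needed to center a value of width vw in a column. */
static int center_pad(int col_width, int vw)
{
	return vw >= col_width ? 0 : (col_width - vw) / 2;
}
```

Deriving the centering pad from the same width helper keeps alignment
consistent across all supported types.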

Future work:
- Export NVMe statistics to external monitoring systems (e.g. Grafana).
- Improve topology change detection in multipath configurations. The
  current implementation relies on kobject uevents to detect topology
  changes, but the kernel does not export namespace path add/delete
  events because they are associated with hidden gendisk kobjects.
  Fixing this may require the NVMe driver to generate explicit uevents
  for namespace path changes.
- Wire nvme top into an MCP pipeline and feed its output to an LLM.
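
For readers unfamiliar with kobject uevents, here is a minimal sketch
of how a payload could be inspected once it has been received (for
example from an AF_NETLINK socket using NETLINK_KOBJECT_UEVENT). This
is not code from the series: uevent_get() and
uevent_is_nvme_topology_change() are hypothetical helpers, and
filtering on SUBSYSTEM=nvme with add/remove actions is an assumption
about which events matter.

```c
#include <stddef.h>
#include <string.h>

/*
 * Look up a key in a uevent payload, which is a sequence of
 * NUL-terminated strings: an "action@devpath" header followed by
 * "KEY=value" entries. Returns a pointer to the value inside buf,
 * or NULL if the key is absent or the payload is malformed.
 */
static const char *uevent_get(const char *buf, size_t len, const char *key)
{
	size_t klen = strlen(key);
	size_t off = 0;

	while (off < len) {
		const char *s = buf + off;
		const char *end = memchr(s, '\0', len - off);
		size_t slen;

		if (!end)
			break;	/* unterminated trailing entry: ignore it */
		slen = (size_t)(end - s);
		if (slen > klen && s[klen] == '=' && !strncmp(s, key, klen))
			return s + klen + 1;
		off += slen + 1;
	}
	return NULL;
}

/* Hypothetical filter: should this event trigger a topology rescan? */
static int uevent_is_nvme_topology_change(const char *buf, size_t len)
{
	const char *subsys = uevent_get(buf, len, "SUBSYSTEM");
	const char *action = uevent_get(buf, len, "ACTION");

	if (!subsys || !action)
		return 0;
	if (strcmp(subsys, "nvme"))	/* assumed subsystem of interest */
		return 0;
	return !strcmp(action, "add") || !strcmp(action, "remove");
}
```

A filter like this never sees namespace path add/delete, since no
uevent is emitted for the hidden gendisk kobjects.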

As usual, feedback, comments, and suggestions are welcome!

Nilay Shroff (7):
  nvme: add support for unsigned and long types in
    table_get_value_width()
  nvme: use table_get_value_width() in table_print_centered()
  nvme: add support for float and double types in table_print_XXX()
  nvme: allow table output to be directed to a FILE stream
  nvme: add sigaction for SIGWINCH
  nvme: add generic top-like dashboard framework
  nvme: add nvme top command

 meson.build         |    1 +
 nvme-builtin.h      |    1 +
 nvme-print-stdout.c | 1205 +++++++++++++++++++++++++++++++++++++++++++
 nvme-print.c        |    5 +
 nvme-print.h        |    5 +-
 nvme-top.c          |  345 +++++++++++++
 nvme-top.h          |   26 +
 nvme.c              |   28 +
 util/dashboard.c    |  851 ++++++++++++++++++++++++++++++
 util/dashboard.h    |   53 ++
 util/meson.build    |    3 +-
 util/sighdl.c       |   14 +-
 util/sighdl.h       |    1 +
 util/table.c        |  122 +++--
 util/table.h        |   27 +
 15 files changed, 2630 insertions(+), 57 deletions(-)
 create mode 100644 nvme-top.c
 create mode 100644 nvme-top.h
 create mode 100644 util/dashboard.c
 create mode 100644 util/dashboard.h

-- 
2.53.0



