nvme-rdma: possible issue around error injection debugfs entries

Wed Jan 8 06:30:45 PST 2025

Hi Experts,

While testing one of the corner cases of our NVMe RDMA use case, we discovered
that after many failed connection attempts, the inode_cache and dentry slabs can
grow to a huge size and are not reclaimable. Consequently, if we wait long
enough, we reach an OOM (Out of Memory) condition.

Example from crash:
ffff88dd2abfca80      592   29007830  29007936  537184    32k  inode_cache
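
For a sense of scale, the line above can be turned into a memory footprint. The
following is only a rough sketch: it assumes the columns follow the usual crash
"kmem -s" layout (CACHE, OBJSIZE, ALLOCATED, TOTAL, SLABS, SSIZE, NAME) and that
the "k" suffix means KiB. The function name slab_footprint is illustrative.

```python
def slab_footprint(line):
    """Parse one slab line.

    Assumed column order: CACHE OBJSIZE ALLOCATED TOTAL SLABS SSIZE NAME.
    """
    cache, objsize, allocated, total, slabs, ssize, name = line.split()
    # SSIZE is printed like "32k"; the suffix is treated as KiB (assumption).
    slab_bytes = int(ssize.rstrip("k")) * 1024
    return {
        "name": name,
        # Bytes held by allocated objects alone.
        "object_bytes": int(allocated) * int(objsize),
        # Bytes held by the slabs backing this cache.
        "slab_bytes": int(slabs) * slab_bytes,
        "objects_per_slab": int(total) // int(slabs),
    }


line = "ffff88dd2abfca80      592   29007830  29007936  537184    32k  inode_cache"
info = slab_footprint(line)
GIB = 1024 ** 3
print(f"{info['name']}: {info['object_bytes'] / GIB:.1f} GiB in objects, "
      f"{info['slab_bytes'] / GIB:.1f} GiB in slabs")
```

Under those assumptions, inode_cache alone accounts for roughly 16 GiB.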

During debugging, we noticed that for RDMA connections, the debugfs entries
created in nvme_fault_inject_init are created before the RDMA connection has
succeeded. So, with many failures, these entries are created and removed
repeatedly. This repeated create/remove cycle is likely what triggers the slab
growth.
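
One way to check the correlation with failed connects is to sample
/proc/slabinfo before and after a batch of connection attempts. The sketch below
is a minimal helper, not the method we used: the function names are
illustrative, and the two sample rows are made-up placeholders so that it runs
without root access.

```python
def parse_slabinfo(text, names=("inode_cache", "dentry")):
    """Return {cache_name: (active_objs, num_objs, objsize)} from slabinfo text."""
    stats = {}
    for row in text.splitlines():
        fields = row.split()
        # Data rows start with: name active_objs num_objs objsize ...
        # Header rows ("slabinfo - version", "# name ...") never match a name.
        if len(fields) >= 4 and fields[0] in names:
            stats[fields[0]] = (int(fields[1]), int(fields[2]), int(fields[3]))
    return stats


def growth(before, after):
    """Per-cache change in active object count between two samples."""
    return {n: after[n][0] - before[n][0] for n in after if n in before}


# Placeholder rows (not real measurements); on a live system, read the text
# from /proc/slabinfo instead, which usually requires root.
sample_before = "inode_cache 1000 1200 592 54 8 : tunables 0 0 0 : slabdata 23 23 0\n"
sample_after = "inode_cache 5000 5400 592 54 8 : tunables 0 0 0 : slabdata 100 100 0\n"

delta = growth(parse_slabinfo(sample_before), parse_slabinfo(sample_after))
print(delta)
```

A count that keeps rising across batches of failed connects, and does not drop
after reclaim is forced, would match what we observe.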

So far, we have worked around this issue by creating these entries only after
a successful connection, and this has fixed the issue for us.

We are wondering whether a patch with the same or a similar approach could be
applied upstream, or whether another approach (for example, raising the issue
with the debugfs maintainers) should be chosen here.

Thanks,
Marcin