[LSF/MM/BPF TOPIC] nvme-of connect retries
Hannes Reinecke
hare at suse.de
Wed May 8 02:52:38 PDT 2024
Hi all,
I'd like to request another session for LSF/MM:
NVMe-oF connect retries
There have been several discussions on the mailing list about how to handle
failures and retries that occur during 'connect'.
Issues to discuss:
- Should the initial connect return with a status only after _all_
queues are connected? That would introduce a severe lag
for large installations, with the risk of systemd timing out
the command.
- Should we try to combine workflows? TCP has three different
'connect' code paths, one for the initial connect, one for
reset, and one for reconnect.
- Where should a possible retry be handled? Should user space
be responsible for a retry, or should it be left to the driver?
- If user space should be driving the retry, how can we return
a meaningful error to user space?
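To make the user-space option concrete, here is a minimal sketch of what a retry loop driven from user space might look like. Everything in it is an assumption made for illustration: connect_with_retry, the connect callable, and the set of errno values treated as transient are hypothetical, and nothing here reflects what the kernel or nvme-cli actually returns today. It mainly shows why a meaningful error matters: the caller can only decide whether to retry if it can tell a transient failure from a permanent one.

```python
import errno
import time

# Assumption: errno values considered worth retrying. The real set would
# depend on what the driver reports to user space.
TRANSIENT = {errno.ECONNREFUSED, errno.ETIMEDOUT, errno.EAGAIN, errno.ECONNRESET}


def connect_with_retry(connect, max_retries=3, delay=1.0, sleep=time.sleep):
    """Call connect() until it succeeds, fails permanently, or retries run out.

    connect is a hypothetical callable that raises OSError on failure.
    Returns the number of attempts that were made.
    """
    attempt = 0
    while True:
        attempt += 1
        try:
            connect()
            return attempt
        except OSError as err:
            # Give up on permanent errors or once the retry budget is spent.
            if err.errno not in TRANSIENT or attempt > max_retries:
                raise
            sleep(delay)
            delay *= 2  # exponential backoff between attempts
```

If the retry were left to the driver instead, the same classification would have to live in the kernel, and user space would only see the final outcome.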
It would be good if we could come to a consensus here such that
we can start consolidating the various transports.
Cheers,
Hannes