[PATCH 3/3] nvme-tcp: fix I/O stalls on congested sockets

Hannes Reinecke hare at suse.de
Tue May 6 11:05:15 PDT 2025


Hi Kamaljit,

On 5/6/25 19:40, Kamaljit Singh wrote:
> Hi Sagi,
> 
>  >On 26/04/2025 2:47, Kamaljit Singh wrote:
>  >> Sagi,
>  >>
>  >>>> Right after the sk_stream_is_writeable() check, new room could open
>  >>>> up in the socket send buffer, state_change() could fire and queue
>  >>>> io_work, all before we get to the pending check. This is the
>  >>>> "lost event" problem.
>  >>> Actually, this is incorrect. write_space() -> queue_work(io_work) will
>  >>> re-queue io_work if it is currently running. Hence I'm not sure that
>  >>> this flag is needed.
>  >>>
>  >>> Kamaljit, can you run with the below and report if it fixes your
>  >>> problem?
>  >> With the last set of fixes that included list_del_init() you
>  >> suggested, all tests have been stable. I'm not seeing any more issues.
>  >> Are you asking to test this patch regardless? If so, does this need to
>  >> be in addition to previous patches?
>  >
>  >Instead of my patch that introduces WAKE_SENDER - apply just the change
>  >below that looks at sk_stream_is_writeable() only. I no longer think
>  >that the other changes in that patch solve anything.
> Sorry, I was away on PTO. Working now to get the below patch tested.
> 
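
For anyone following the thread, the re-queue behaviour Sagi describes
above can be modelled in user space. This is only a minimal sketch, not
kernel code: every model_* name is hypothetical, and it assumes the
pending flag is cleared before the handler runs, so an event arriving
while the handler is running queues one more run.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; NOT the kernel's struct work_struct. */
struct model_work {
	bool pending;         /* models the work item's pending bit */
	bool fire_during_run; /* inject one event while the handler runs */
	int runs;             /* number of handler executions */
	int events_ready;     /* events produced by the "socket" */
	int events_seen;      /* events the handler has observed */
};

/* Models queue_work(): refuses only if the item is already pending. */
static bool model_queue_work(struct model_work *w)
{
	if (w->pending)
		return false;
	w->pending = true;
	return true;
}

/* Models write_space(): new send-buffer room, then queue the work. */
static void model_write_space(struct model_work *w)
{
	w->events_ready++;
	model_queue_work(w);
}

/* Models io_work: the check happens first, the event lands afterwards. */
static void model_io_work(struct model_work *w)
{
	int snapshot = w->events_ready; /* the "is it writeable?" check */

	w->runs++;
	if (w->fire_during_run) {
		w->fire_during_run = false;
		model_write_space(w); /* arrives after the check above */
	}
	w->events_seen = snapshot;
}

/* Models the worker: pending is cleared BEFORE the handler is called. */
static void model_worker(struct model_work *w)
{
	while (w->pending) {
		w->pending = false;
		model_io_work(w);
	}
}

/* Runs the scenario and returns how often the handler executed. */
static int model_demo(void)
{
	struct model_work w = { .fire_during_run = true };

	model_queue_work(&w);
	model_worker(&w);
	assert(w.events_seen == w.events_ready); /* no event was lost */
	return w.runs;
}
```

Calling model_demo() from a main() runs the handler twice: the first run
misses the event that lands after its check, and the re-queued second run
picks it up.
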
Can you please retest with the patchset '[PATCHv5 0/2] nvme-tcp: fixup
I/O stall on congested sockets' _only_?
(On top of the latest nvme-6.16, of course.)
I think I _should_ have included all the suggestions floating around
here, but we need confirmation.

Thanks a bunch.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare at suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich


