[PATCH net v3 1/8] net/handshake: Use spin_lock_bh for hn_lock

Hannes Reinecke hare at suse.de
Wed May 27 01:59:04 PDT 2026


On 5/25/26 18:51, Chuck Lever wrote:
> From: Chuck Lever <chuck.lever at oracle.com>
> 
> nvmet_tcp_state_change(), a socket callback that runs in BH context,
> can reach handshake_req_cancel() via nvmet_tcp_schedule_release_queue()
> and tls_handshake_cancel().  handshake_req_cancel() acquires
> hn->hn_lock with plain spin_lock().  If a process-context thread on
> the same CPU holds hn->hn_lock when a softirq invokes the cancel path,
> the lock attempt deadlocks.  This is the only caller that invokes
> tls_handshake_cancel() from BH context; every other consumer calls it
> from process context.
> 
> Deferring the cancel to process context in the NVMe target is not
> straightforward: nvmet_tcp_schedule_release_queue() must call
> tls_handshake_cancel() atomically with its state transition to
> DISCONNECTING.  If the cancel were deferred, the handshake completion
> callback could fire in the window before the cancel runs, observe the
> unexpected state, and return without dropping its kref on the queue.
> Reworking that interlock is considerably more invasive than hardening
> the handshake lock.  Convert all hn->hn_lock acquisitions from
> spin_lock/spin_unlock to spin_lock_bh/spin_unlock_bh so the lock is
> never taken with softirqs enabled.
> 
> Fixes: 675b453e0241 ("nvmet-tcp: enable TLS handshake upcall")
> Signed-off-by: Chuck Lever <chuck.lever at oracle.com>
> ---
>   net/handshake/netlink.c |  4 ++--
>   net/handshake/request.c | 14 +++++++-------
>   net/handshake/tlshd.c   |  2 ++
>   3 files changed, 11 insertions(+), 9 deletions(-)
> 
... and there is always the question whether we should be calling
tls_handshake_cancel() in the first place.
We already call 'tls_handshake_cancel()' from the handshake timeout
handler, and this instance of 'tls_handshake_cancel()' is called
from a socket callback (i.e. when the socket is closed or something).
I would have expected that the handshake code cleans up outstanding
requests on socket close already; if so, we can delete the cancel
here.
Hmm?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare at suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich


