[PATCH 1/3] nvme: fixup kato deadlock
Christoph Hellwig
hch at lst.de
Wed Mar 3 12:35:49 GMT 2021
On Wed, Mar 03, 2021 at 01:01:50PM +0100, Hannes Reinecke wrote:
> > Adding BLK_MQ_REQ_NOWAIT should be a separate prep patch, together with
> > reducing the number of reserved tags.
> >
> Okay.
> But why would we need to reduce the number of tags? It's not that we're
> changing anything with the allocation, we're just guaranteed to fail if
> for some reason the stack messes up.
Because we could otherwise still have two keep-alive requests, or one
keep-alive and one connect request allocated at the same time.
>
> > Also why do we still need the extra test_and_set_bit with the
> > NOWAIT allocation?
> >
> This is essentially a safeguard; there is nothing in the current code
> telling us if a KATO command is in flight or not.
> And I really do hate doing pointless work (like allocating commands) if
> we don't need to.
>
> Plus it'll differentiate between legit callers of
> nvme_start_keep_alive() (which can be called at any time, and hence
> might be adding the ka_work element just after the previous one had
> submitted the KATO command), and real failures like failing to allocate
> the KATO command itself, which points to a real issue in the stack as
> normally there should be enough reserved commands.
You can distinguish them by the error code from blk_mq_alloc_request.