[PATCH 4/4] nvme: add support for mq_ops->queue_rqs()

Wed Dec 15 09:29:47 PST 2021

On Wed, Dec 15, 2021 at 09:24:21AM -0700, Jens Axboe wrote:
> +static bool nvme_prep_rq_batch(struct nvme_queue *nvmeq, struct request *req)
> +{
> +	/*
> +	 * We should not need to do this, but we're still using this to
> +	 * ensure we can drain requests on a dying queue.
> +	 */
> +	if (unlikely(!test_bit(NVMEQ_ENABLED, &nvmeq->flags)))
> +		return false;

The patch looks good:

Reviewed-by: Keith Busch <kbusch at kernel.org>

Now a side comment on the above snippet:

I was going to mention in v2 that you shouldn't need to do this for each
request since the queue enabling/disabling only happens while quiesced,
so the state doesn't change once you start a batch. But I realized
multiple hctx's can be in a single batch, so we have to check each of
them instead of just once. :(

I tried to remove this check entirely ("We should not need to do this",
after all), but that's not looking readily possible without just
creating an equivalent check in blk-mq: we can't end a particular
request in failure without draining whatever list it may be linked
within, and we don't know what list it's in when iterating allocated
hctx tags.

Do you happen to have any thoughts on how we could remove this check?
The API I was thinking of is something like "blk_mq_hctx_dead()" in
order to fail pending requests on that hctx without sending them to the
low-level driver so that it wouldn't need these kinds of per-IO checks.