[Report] requests are submitted to hardware in reverse order from nvme/virtio-blk queue_rqs()

Jens Axboe axboe at kernel.dk
Thu Jan 25 07:33:03 PST 2024


On 1/24/24 4:59 AM, Ming Lei wrote:
> Hello,
> 
> Requests are added to plug list in reverse order, and both virtio-blk
> and nvme retrieves request from plug list in order, so finally requests
> are submitted to hardware in reverse order via nvme_queue_rqs() or
> virtio_queue_rqs, see:
> 
> 	io_uring       submit_bio  vdb      6302096     4096
> 	io_uring       submit_bio  vdb     12235072     4096
> 	io_uring       submit_bio  vdb      7682280     4096
> 	io_uring       submit_bio  vdb     11912464     4096
> 	io_uring virtio_queue_rqs  vdb     11912464     4096
> 	io_uring virtio_queue_rqs  vdb      7682280     4096
> 	io_uring virtio_queue_rqs  vdb     12235072     4096
> 	io_uring virtio_queue_rqs  vdb      6302096     4096
> 
> 
> May this reorder be one problem for virtio-blk and nvme-pci?

This is known and has been the case for a while, that requests inside
the plug list are added to the front and we dispatch in list order
(hence getting them reversed). Not aware of any performance issues
related to that, though I have had someone being surprised by it.

It'd be trivial to just reverse the list before queue_rqs or direct
dispatch, and probably not at a huge cost as the lists will be pretty
short. See below. Or we could change the plug list to be doubly linked,
which would (I'm guessing...) likely be a higher cost. But unless this
is an actual issue, I propose we just leave it alone.


diff --git a/block/blk-mq.c b/block/blk-mq.c
index aa87fcfda1ec..ecfba42157ee 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2697,6 +2697,21 @@ static blk_status_t blk_mq_request_issue_directly(struct request *rq, bool last)
 	return __blk_mq_issue_directly(hctx, rq, last);
 }
 
+static struct request *blk_plug_reverse_order(struct blk_plug *plug)
+{
+	struct request *rq = plug->mq_list, *new_head = NULL;
+
+	while (rq) {
+		struct request *tmp = rq;
+
+		rq = rq->rq_next;
+		tmp->rq_next = new_head;
+		new_head = tmp;
+	}
+
+	return new_head;
+}
+
 static void blk_mq_plug_issue_direct(struct blk_plug *plug)
 {
 	struct blk_mq_hw_ctx *hctx = NULL;
@@ -2704,6 +2719,7 @@ static void blk_mq_plug_issue_direct(struct blk_plug *plug)
 	int queued = 0;
 	blk_status_t ret = BLK_STS_OK;
 
+	plug->mq_list = blk_plug_reverse_order(plug);
 	while ((rq = rq_list_pop(&plug->mq_list))) {
 		bool last = rq_list_empty(plug->mq_list);
 
@@ -2741,6 +2757,7 @@ static void __blk_mq_flush_plug_list(struct request_queue *q,
 {
 	if (blk_queue_quiesced(q))
 		return;
+	plug->mq_list = blk_plug_reverse_order(plug);
 	q->mq_ops->queue_rqs(&plug->mq_list);
 }
 

-- 
Jens Axboe




More information about the Linux-nvme mailing list