nvme: split bios issued in reverse order

Jonathan Nicklin jnicklin at blockbridge.com
Mon May 23 09:16:20 PDT 2022


There seems to be an inconsistency in the order of writes that are
issued after splitting a bio. Ordering depends on how the application
write is submitted and the number of I/O queues configured.

In our testing with nvme/tcp, a 128K write issued with fio/pvsync is
split into four 32K I/Os (the target's maximum data transfer size is
set to 32K, so max_sectors_kb is 32). As expected, the four write I/Os
are issued to the target in sequential order. However, if the same
128K write is issued using fio/libaio, the four 32K writes are issued
in reverse order:

fio-8098 [001] ..... 254009.711080: nvme_setup_cmd: nvme1:
disk=nvme1c1n1, qid=2, cmdid=16468, nsid=1, flags=0x0, meta=0x0,
cmd=(nvme_cmd_write slba=192, len=63, ctrl=0x0, dsmgmt=0, reftag=0)

fio-8098 [001] ..... 254009.711083: nvme_setup_cmd: nvme1:
disk=nvme1c1n1, qid=2, cmdid=16467, nsid=1, flags=0x0, meta=0x0,
cmd=(nvme_cmd_write slba=128, len=63, ctrl=0x0, dsmgmt=0, reftag=0)

fio-8098 [001] ..... 254009.711084: nvme_setup_cmd: nvme1:
disk=nvme1c1n1, qid=2, cmdid=16466, nsid=1, flags=0x0, meta=0x0,
cmd=(nvme_cmd_write slba=64, len=63, ctrl=0x0, dsmgmt=0, reftag=0)

fio-8098 [001] ..... 254009.711085: nvme_setup_cmd: nvme1:
disk=nvme1c1n1, qid=2, cmdid=16465, nsid=1, flags=0x0, meta=0x0,
cmd=(nvme_cmd_write slba=0, len=63, ctrl=0x0, dsmgmt=0, reftag=0)
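
For anyone who wants to check their own traces, here is a rough sketch of
how the expected split can be computed and compared against the issue
order above. It assumes a 512-byte logical block size (inferred from
len=63 covering 32K, since the NVMe length field is zero-based) and a
write starting at offset 0. The helper names expected_splits and
issued_order are made up for illustration; they are not part of any
kernel or fio interface.

```python
import re

BLOCK_SIZE = 512  # assumption: 32K / 64 blocks in the trace implies 512-byte LBAs


def expected_splits(offset_bytes, length_bytes, max_bytes, block=BLOCK_SIZE):
    """Return (slba, len) pairs in ascending LBA order.

    len is zero-based, matching the NVMe write command as shown in the
    nvme_setup_cmd trace.
    """
    splits = []
    pos = offset_bytes
    end = offset_bytes + length_bytes
    while pos < end:
        chunk = min(max_bytes, end - pos)
        splits.append((pos // block, chunk // block - 1))
        pos += chunk
    return splits


def issued_order(trace_text):
    """Extract (slba, len) pairs from nvme_setup_cmd lines, in trace order."""
    pairs = re.findall(r"slba=(\d+), len=(\d+)", trace_text)
    return [(int(slba), int(length)) for slba, length in pairs]


# The cmd= portions of the four trace events from the libaio run above.
TRACE = """
cmd=(nvme_cmd_write slba=192, len=63, ctrl=0x0, dsmgmt=0, reftag=0)
cmd=(nvme_cmd_write slba=128, len=63, ctrl=0x0, dsmgmt=0, reftag=0)
cmd=(nvme_cmd_write slba=64, len=63, ctrl=0x0, dsmgmt=0, reftag=0)
cmd=(nvme_cmd_write slba=0, len=63, ctrl=0x0, dsmgmt=0, reftag=0)
"""

expected = expected_splits(0, 128 * 1024, 32 * 1024)
issued = issued_order(TRACE)

if issued == expected:
    print("splits issued in sequential order")
elif issued == expected[::-1]:
    print("splits issued in reverse order")
else:
    print("splits issued in some other order")
```

The same check can be pointed at a pvsync trace by replacing TRACE with
the captured output.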

Further investigation found that if the number of I/O queues is
limited to 1 at connect time, the issue order is sequential for both
pvsync and libaio.

I've spent some time tracing through the bio/blk-mq code and can't
find what is causing the difference in behavior. Can anyone confirm
whether this is expected or desired behavior?

Thanks,
-Jonathan


