[PATCHv10 0/9] write hints with nvme fdp, scsi streams
Bart Van Assche
bvanassche at acm.org
Mon Dec 9 14:13:40 PST 2024
On 12/5/24 12:03 AM, Nitesh Shetty wrote:
> But where do we store the read sector info before sending write.
> I see 2 approaches here,
> 1. Should it be part of a payload along with write ?
> We did something similar in previous series which was not liked
> by Christoph and Bart.
> 2. Or driver should store it as part of an internal list inside
> namespace/ctrl data structure ?
> As Bart pointed out, here we might need to send one more fail
> request later if copy_write fails to land in same driver.
Hi Nitesh,
Consider the following example: dm-linear is used to concatenate two
block devices, an NVMe device (LBA 0..999) and a SCSI device (LBA
1000..1999). Suppose that a copy operation is submitted to the dm-linear
device to copy LBAs 1..998 to LBAs 1001..1998. If the copy operation is
submitted as two separate operations (REQ_OP_COPY_SRC and
REQ_OP_COPY_DST) then the NVMe device will receive the REQ_OP_COPY_SRC
operation and the SCSI device will receive the REQ_OP_COPY_DST
operation. The NVMe and SCSI device drivers should fail the copy
operations after a timeout because they only received half of the copy
operation. After the timeout the block layer core can switch from
offloading to emulating a copy operation. Waiting for a timeout is
necessary because requests may be reordered.
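To make the split concrete, here is a minimal sketch of how a
concatenating mapping routes the two halves of such a copy. It is a toy
model in Python, not kernel code: the TARGETS table and devices_for()
are hypothetical names, and the destination range 1001..1998 is chosen
here so that it lies entirely on the SCSI device.

```python
# Hypothetical model of a dm-linear style concatenation; not kernel code.
# Each entry is (first LBA, last LBA, device name), as in the example above.
TARGETS = [
    (0, 999, "nvme"),
    (1000, 1999, "scsi"),
]


def devices_for(first_lba, last_lba):
    """Return the names of all underlying devices that an LBA range touches."""
    return [name for lo, hi, name in TARGETS
            if first_lba <= hi and last_lba >= lo]


# The source half of the copy is routed to one driver ...
src_devices = devices_for(1, 998)
# ... and the destination half (assumed range) to another.
dst_devices = devices_for(1001, 1998)

# Neither driver sees both halves, so neither can complete the offload.
half_delivered = set(src_devices).isdisjoint(dst_devices)
```

With the two halves submitted separately, each driver only learns about
its own range and has to wait before concluding that the other half
will never arrive.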
I think this is a strong argument in favor of representing copy
operations as a single operation. This will allow stacking drivers
such as dm-linear to deal in an elegant way with copy offload requests
where source and destination LBA ranges map onto different block
devices and potentially different block drivers.
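As a rough illustration of why a single operation helps, the sketch
below shows a stacking driver that sees source and destination together
and can decide immediately, without a timeout, whether to offload or to
fall back to emulation. This is a toy model in Python under the same
assumed device layout; device_of() and plan_copy() are hypothetical
names, not existing block layer or device mapper functions.

```python
# Hypothetical model; names and layout are illustrative, not kernel APIs.
TARGETS = [(0, 999, "nvme"), (1000, 1999, "scsi")]


def device_of(first_lba, last_lba):
    """Return the one device holding the whole range, or None if none does."""
    for lo, hi, name in TARGETS:
        if lo <= first_lba and last_lba <= hi:
            return name
    return None


def plan_copy(src_first, dst_first, nr_blocks):
    """Decide how to handle one combined copy request.

    Offload only if source and destination lie entirely on the same
    underlying device; otherwise report that emulation is needed.
    """
    src_dev = device_of(src_first, src_first + nr_blocks - 1)
    dst_dev = device_of(dst_first, dst_first + nr_blocks - 1)
    if src_dev is not None and src_dev == dst_dev:
        return ("offload", src_dev)
    return ("emulate", None)


# Source on the NVMe device, destination on the SCSI device.
decision = plan_copy(1, 1001, 998)
```

Because both ranges arrive in one request, the mismatch is detected at
mapping time and no driver has to hold half of an operation.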
Thanks,
Bart.