[PATCHv10 0/9] write hints with nvme fdp, scsi streams

Mon Dec 9 14:13:40 PST 2024

On 12/5/24 12:03 AM, Nitesh Shetty wrote:
> But where do we store the read sector info before sending write.
> I see 2 approaches here,
> 1. Should it be part of a payload along with write ?
>      We did something similar in previous series which was not liked
>      by Christoph and Bart.
> 2. Or driver should store it as part of an internal list inside
> namespace/ctrl data structure ?
>      As Bart pointed out, here we might need to send one more fail
>      request later if copy_write fails to land in same driver.

Hi Nitesh,

Consider the following example, in which dm-linear concatenates two
block devices: an NVMe device (LBA 0..999) and a SCSI device (LBA
1000..1999). Suppose that a copy operation is submitted to the dm-linear
device to copy LBAs 1..998 to LBAs 1001..1998. If the copy operation is
submitted as two separate operations (REQ_OP_COPY_SRC and
REQ_OP_COPY_DST) then the NVMe device will receive the REQ_OP_COPY_SRC
operation and the SCSI device will receive the REQ_OP_COPY_DST
operation. The NVMe and SCSI device drivers should fail the copy 
operations after a timeout because they only received half of the copy
operation. After the timeout the block layer core can switch from
offloading to emulating a copy operation. Waiting for a timeout is
necessary because requests may be reordered.
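
To make the split concrete, here is a toy Python model of the dm-linear
mapping in this example. It is only an illustration, not kernel code:
the TARGETS table and the map_range() helper are hypothetical names, and
the destination is assumed to start at LBA 1001 so that it has the same
length as the source and lies entirely on the SCSI device.

```python
# Toy model of a dm-linear concatenation; all names are illustrative only.
# Each entry: (first LBA, last LBA, underlying device).
TARGETS = [
    (0, 999, "nvme"),
    (1000, 1999, "scsi"),
]


def map_range(first, last):
    """Split the inclusive LBA range [first, last] by underlying device.

    Returns (device, first LBA, last LBA) tuples, with the LBAs expressed
    relative to the start of each underlying device.
    """
    pieces = []
    for start, end, dev in TARGETS:
        lo, hi = max(first, start), min(last, end)
        if lo <= hi:
            pieces.append((dev, lo - start, hi - start))
    return pieces


src = map_range(1, 998)        # source half of the copy
dst = map_range(1001, 1998)    # destination half of the copy
print("REQ_OP_COPY_SRC ->", src)
print("REQ_OP_COPY_DST ->", dst)
```

In this model the source maps only to the NVMe device and the
destination only to the SCSI device, so neither driver ever sees both
halves of the copy.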

I think this is a strong argument in favor of representing copy
operations as a single operation. This will allow stacking drivers
such as dm-linear to deal in an elegant way with copy offload requests
where source and destination LBA ranges map onto different block
devices and potentially different block drivers.
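
As a rough sketch of what a single operation makes possible: if one
request carries both the source and the destination, a stacking driver
can see at submission time whether the two ranges share a device, and
no timeout is needed. The CopyOp class, the DEVICES table and the
dispatch() policy below are assumptions made up for this illustration;
they do not correspond to any existing kernel interface.

```python
from dataclasses import dataclass

# Hypothetical linear table: (first LBA, last LBA, underlying device).
DEVICES = [(0, 999, "nvme"), (1000, 1999, "scsi")]


@dataclass
class CopyOp:
    """A copy described by one request: source, destination and length."""
    src: int   # first source LBA
    dst: int   # first destination LBA
    nr: int    # number of LBAs to copy


def devices_for(first, nr):
    """Return the set of underlying devices touched by nr LBAs at first."""
    last = first + nr - 1
    return {dev for start, end, dev in DEVICES
            if first <= end and last >= start}


def dispatch(op):
    """Decide immediately whether the copy can be offloaded."""
    devs = devices_for(op.src, op.nr) | devices_for(op.dst, op.nr)
    if len(devs) == 1:
        return "offload to " + next(iter(devs))
    return "emulate with read + write"


print(dispatch(CopyOp(src=1, dst=1001, nr=998)))
print(dispatch(CopyOp(src=1, dst=500, nr=100)))
```

The first call spans both devices and falls back to emulation right
away; the second stays on the NVMe device and could be offloaded.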

Thanks,

Bart.