[PATCH v20 02/12] Add infrastructure for copy offload in block and request layer.

Bart Van Assche bvanassche at acm.org
Mon Jun 3 10:12:48 PDT 2024


On 5/31/24 22:59, Christoph Hellwig wrote:
> On Thu, May 30, 2024 at 10:11:15AM -0700, Bart Van Assche wrote:
>> This new approach has the following two disadvantages:
>> * Without plug, REQ_OP_COPY_SRC and REQ_OP_COPY_DST are not combined. These two
>>    operation types are the only operation types for which not using a plug causes
>>    an I/O failure.
> 
> So?  We can clearly document that and even fail submission with a helpful
> message trivially to enforce that.

Consider the following use case:
* Task A calls blk_start_plug()
* Task B calls blk_start_plug()
* Task A submits a REQ_OP_COPY_DST bio and a REQ_OP_COPY_SRC bio.
* Task B submits a REQ_OP_COPY_DST bio and a REQ_OP_COPY_SRC bio.
* The stacking driver to which all REQ_OP_COPY_* operations have been
   submitted processes bios asynchronously.
* Task A calls blk_finish_plug()
* Task B calls blk_finish_plug()
* The REQ_OP_COPY_DST bio from task A and the REQ_OP_COPY_SRC bio from
   task B are combined into a single request.
* The REQ_OP_COPY_DST bio from task B and the REQ_OP_COPY_SRC bio from
   task A are combined into a single request.

This results in silent and hard-to-debug data corruption. Do you agree
that we should not restrict copy offloading to stacking drivers that
process bios synchronously and also that this kind of data corruption
should be prevented?
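
To make the interleaving above concrete, here is a minimal userspace
sketch. It is a toy model, not kernel code: struct copy_bio, enum copy_op
and count_mismatched_pairs() are hypothetical names, and the pairing rule
(each COPY_DST is combined with the next COPY_SRC that arrives, regardless
of which task submitted it) is an assumption about how a lower layer might
combine bios once the stacking driver has reordered them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for REQ_OP_COPY_DST / REQ_OP_COPY_SRC. */
enum copy_op { COPY_DST, COPY_SRC };

struct copy_bio {
	int task;		/* 0 = task A, 1 = task B */
	enum copy_op op;
};

/*
 * Walk the bios in the order in which the lower layer sees them and pair
 * each COPY_DST with the next COPY_SRC, ignoring the submitting task.
 * Returns the number of pairs whose two halves come from different tasks.
 */
int count_mismatched_pairs(const struct copy_bio *q, size_t n)
{
	const struct copy_bio *pending_dst = NULL;
	int mismatched = 0;
	size_t i;

	assert(q != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (q[i].op == COPY_DST) {
			pending_dst = &q[i];
		} else if (pending_dst) {
			/* A COPY_SRC completes the pending pair. */
			if (pending_dst->task != q[i].task)
				mismatched++;
			pending_dst = NULL;
		}
	}
	return mismatched;
}
```

With the arrival order A.DST, A.SRC, B.DST, B.SRC this returns 0. With the
order A.DST, B.SRC, B.DST, A.SRC, which is one possible result of
asynchronous processing in the stacking driver, it returns 2: both
requests mix bios from different tasks, and nothing in the pairing step
reports an error.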

Thanks,

Bart.



More information about the Linux-nvme mailing list