[PATCH v20 12/12] null_blk: add support for copy offload
Nitesh Shetty
nj.shetty at samsung.com
Tue May 21 07:46:29 PDT 2024
On 20/05/24 04:42PM, Bart Van Assche wrote:
>On 5/20/24 03:20, Nitesh Shetty wrote:
>>+ if (blk_rq_nr_phys_segments(req) != BLK_COPY_MAX_SEGMENTS)
>>+ return status;
>
>Why is this check necessary?
>
>>+ /*
>>+ * First bio contains information about destination and last bio
>>+ * contains information about source.
>>+ */
>
>Please check this at runtime (WARN_ON_ONCE()?).
>
>>+ __rq_for_each_bio(bio, req) {
>>+ if (seg == blk_rq_nr_phys_segments(req)) {
>>+ sector_in = bio->bi_iter.bi_sector;
>>+ if (rem != bio->bi_iter.bi_size)
>>+ return status;
>>+ } else {
>>+ sector_out = bio->bi_iter.bi_sector;
>>+ rem = bio->bi_iter.bi_size;
>>+ }
>>+ seg++;
>>+ }
>
>__rq_for_each_bio() iterates over the bios in a request. Does a copy
>offload request always have two bios - one copy destination bio and
>one copy source bio? If so, is 'seg' a bio counter? Why is that bio
>counter compared with the number of physical segments in the request?
>
Yes, your observation is right. We treat the first bio as the
destination and the second as the source. Without that comparison, we
would need to store a bio index in a temporary variable and parse each
bio based on the index value.
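To make that explicit, the parsing could be restructured with a plain
bio index plus the runtime check you suggested. An untested sketch
against this patch, assuming a copy request always carries exactly two
bios, destination first:

	struct bio *bio;
	sector_t sector_in = 0, sector_out = 0;
	size_t rem = 0;
	int idx = 0;

	if (WARN_ON_ONCE(blk_rq_nr_phys_segments(req) !=
			 BLK_COPY_MAX_SEGMENTS))
		return status;

	__rq_for_each_bio(bio, req) {
		if (idx == 0) {
			/* first bio: destination */
			sector_out = bio->bi_iter.bi_sector;
			rem = bio->bi_iter.bi_size;
		} else {
			/* last bio: source; lengths must match */
			sector_in = bio->bi_iter.bi_sector;
			if (rem != bio->bi_iter.bi_size)
				return status;
		}
		idx++;
	}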
>>+ trace_nullb_copy_op(req, sector_out << SECTOR_SHIFT,
>>+ sector_in << SECTOR_SHIFT, rem);
>>+
>>+ spin_lock_irq(&nullb->lock);
>>+ while (rem > 0) {
>>+ chunk = min_t(size_t, nullb->dev->blocksize, rem);
>>+ offset_in = (sector_in & SECTOR_MASK) << SECTOR_SHIFT;
>>+ offset_out = (sector_out & SECTOR_MASK) << SECTOR_SHIFT;
>>+
>>+ if (null_cache_active(nullb) && !is_fua)
>>+ null_make_cache_space(nullb, PAGE_SIZE);
>>+
>>+ t_page_in = null_lookup_page(nullb, sector_in, false,
>>+ !null_cache_active(nullb));
>>+ if (!t_page_in)
>>+ goto err;
>>+ t_page_out = null_insert_page(nullb, sector_out,
>>+ !null_cache_active(nullb) ||
>>+ is_fua);
>>+ if (!t_page_out)
>>+ goto err;
>>+
>>+ in = kmap_local_page(t_page_in->page);
>>+ out = kmap_local_page(t_page_out->page);
>>+
>>+ memcpy(out + offset_out, in + offset_in, chunk);
>>+ kunmap_local(out);
>>+ kunmap_local(in);
>>+ __set_bit(sector_out & SECTOR_MASK, t_page_out->bitmap);
>>+
>>+ if (is_fua)
>>+ null_free_sector(nullb, sector_out, true);
>>+
>>+ rem -= chunk;
>>+ sector_in += chunk >> SECTOR_SHIFT;
>>+ sector_out += chunk >> SECTOR_SHIFT;
>>+ }
>>+
>>+ status = 0;
>>+err:
>>+ spin_unlock_irq(&nullb->lock);
>
>In the worst case, how long does this loop disable interrupts?
>
We haven't measured this, but it should be similar to reads and writes
in the existing infrastructure, since we follow the same approach there.
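If the irqs-off window ever becomes a concern, one option (an untested
idea, not something this patch does) would be to bound it to a single
chunk by dropping and reacquiring the lock inside the loop, at the cost
of redoing the page lookups each iteration:

	while (rem > 0) {
		chunk = min_t(size_t, nullb->dev->blocksize, rem);

		spin_lock_irq(&nullb->lock);
		/* per-chunk page lookup, kmap_local + memcpy, as above */
		spin_unlock_irq(&nullb->lock);

		cond_resched();
		rem -= chunk;
		sector_in += chunk >> SECTOR_SHIFT;
		sector_out += chunk >> SECTOR_SHIFT;
	}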
>>+TRACE_EVENT(nullb_copy_op,
>>+ TP_PROTO(struct request *req,
>>+ sector_t dst, sector_t src, size_t len),
>>+ TP_ARGS(req, dst, src, len),
>>+ TP_STRUCT__entry(
>>+ __array(char, disk, DISK_NAME_LEN)
>>+ __field(enum req_op, op)
>>+ __field(sector_t, dst)
>>+ __field(sector_t, src)
>>+ __field(size_t, len)
>>+ ),
>
>Isn't __string() preferred over __array() since the former occupies less space
>in the trace buffer?
>
Again, we followed the existing implementation here, to keep the series
simpler to review.
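For reference, a __string()-based variant could look roughly like this
(untested sketch; the exact __assign_str() signature depends on the
kernel version, and fetching the disk name via req->q->disk is an
assumption on my side):

	TRACE_EVENT(nullb_copy_op,
		TP_PROTO(struct request *req,
			 sector_t dst, sector_t src, size_t len),
		TP_ARGS(req, dst, src, len),
		TP_STRUCT__entry(
			__string(disk, req->q->disk->disk_name)
			__field(enum req_op, op)
			__field(sector_t, dst)
			__field(sector_t, src)
			__field(size_t, len)
		),
		TP_fast_assign(
			__assign_str(disk, req->q->disk->disk_name);
			__entry->op = req_op(req);
			__entry->dst = dst;
			__entry->src = src;
			__entry->len = len;
		),
		TP_printk("%s op=%d dst=%llu src=%llu len=%zu",
			  __get_str(disk), __entry->op,
			  (unsigned long long)__entry->dst,
			  (unsigned long long)__entry->src,
			  __entry->len)
	);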
Thank you,
Nitesh Shetty