[PATCH v20 12/12] null_blk: add support for copy offload

Nitesh Shetty nj.shetty at samsung.com
Tue May 21 07:46:29 PDT 2024


On 20/05/24 04:42PM, Bart Van Assche wrote:
>On 5/20/24 03:20, Nitesh Shetty wrote:
>>+	if (blk_rq_nr_phys_segments(req) != BLK_COPY_MAX_SEGMENTS)
>>+		return status;
>
>Why is this check necessary?
>
>>+	/*
>>+	 * First bio contains information about destination and last bio
>>+	 * contains information about source.
>>+	 */
>
>Please check this at runtime (WARN_ON_ONCE()?).
>
>>+	__rq_for_each_bio(bio, req) {
>>+		if (seg == blk_rq_nr_phys_segments(req)) {
>>+			sector_in = bio->bi_iter.bi_sector;
>>+			if (rem != bio->bi_iter.bi_size)
>>+				return status;
>>+		} else {
>>+			sector_out = bio->bi_iter.bi_sector;
>>+			rem = bio->bi_iter.bi_size;
>>+		}
>>+		seg++;
>>+	}
>
>__rq_for_each_bio() iterates over the bios in a request. Does a copy
>offload request always have two bios - one copy destination bio and
>one copy source bio? If so, is 'seg' a bio counter? Why is that bio
>counter compared with the number of physical segments in the request?
>
Yes, your observation is right. We treat the first bio as dst and the
second as src. Without that comparison, we would need to store the bio
index in a temporary variable and parse based on the index value.
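
For illustration only, the index-based alternative mentioned above could
look roughly like the userspace sketch below. It is not the patch code:
struct fake_bio, struct copy_range and parse_copy_bios() are hypothetical
stand-ins for the kernel structures, and the sketch assumes a copy request
always carries exactly two bios (destination first, source second).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the kernel's struct bio. */
struct fake_bio {
	uint64_t sector;        /* models bio->bi_iter.bi_sector */
	uint32_t size;          /* models bio->bi_iter.bi_size */
	struct fake_bio *next;  /* models bio->bi_next */
};

/* Hypothetical result of parsing a copy request. */
struct copy_range {
	uint64_t sector_in;     /* source start sector */
	uint64_t sector_out;    /* destination start sector */
	uint32_t len;           /* bytes to copy */
};

/*
 * Walk the bio chain with an explicit bio index instead of comparing
 * against the segment count: bio 0 is the destination, bio 1 the source.
 * Returns 0 on success and -1 if the chain is malformed (wrong number of
 * bios, or source and destination lengths differ).
 */
static int parse_copy_bios(const struct fake_bio *head, struct copy_range *out)
{
	const struct fake_bio *bio;
	unsigned int idx = 0;

	assert(out != NULL);

	for (bio = head; bio; bio = bio->next, idx++) {
		switch (idx) {
		case 0:
			out->sector_out = bio->sector;
			out->len = bio->size;
			break;
		case 1:
			if (bio->size != out->len)
				return -1;
			out->sector_in = bio->sector;
			break;
		default:
			return -1;	/* more than two bios */
		}
	}

	return idx == 2 ? 0 : -1;
}
```

Passing a destination bio that links to a source bio of the same size
fills in the range and returns 0. A length mismatch or a chain with any
other number of bios returns -1, which is also where a runtime check such
as the suggested WARN_ON_ONCE() would naturally sit in kernel code.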

>>+	trace_nullb_copy_op(req, sector_out << SECTOR_SHIFT,
>>+			    sector_in << SECTOR_SHIFT, rem);
>>+
>>+	spin_lock_irq(&nullb->lock);
>>+	while (rem > 0) {
>>+		chunk = min_t(size_t, nullb->dev->blocksize, rem);
>>+		offset_in = (sector_in & SECTOR_MASK) << SECTOR_SHIFT;
>>+		offset_out = (sector_out & SECTOR_MASK) << SECTOR_SHIFT;
>>+
>>+		if (null_cache_active(nullb) && !is_fua)
>>+			null_make_cache_space(nullb, PAGE_SIZE);
>>+
>>+		t_page_in = null_lookup_page(nullb, sector_in, false,
>>+					     !null_cache_active(nullb));
>>+		if (!t_page_in)
>>+			goto err;
>>+		t_page_out = null_insert_page(nullb, sector_out,
>>+					      !null_cache_active(nullb) ||
>>+					      is_fua);
>>+		if (!t_page_out)
>>+			goto err;
>>+
>>+		in = kmap_local_page(t_page_in->page);
>>+		out = kmap_local_page(t_page_out->page);
>>+
>>+		memcpy(out + offset_out, in + offset_in, chunk);
>>+		kunmap_local(out);
>>+		kunmap_local(in);
>>+		__set_bit(sector_out & SECTOR_MASK, t_page_out->bitmap);
>>+
>>+		if (is_fua)
>>+			null_free_sector(nullb, sector_out, true);
>>+
>>+		rem -= chunk;
>>+		sector_in += chunk >> SECTOR_SHIFT;
>>+		sector_out += chunk >> SECTOR_SHIFT;
>>+	}
>>+
>>+	status = 0;
>>+err:
>>+	spin_unlock_irq(&nullb->lock);
>
>In the worst case, how long does this loop disable interrupts?
>
We haven't measured this, but it should be similar to the existing read
and write paths, since we followed the same approach.
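
As a rough way to reason about the worst case without measurements, the
loop structure quoted above implies one iteration per blocksize-sized
chunk, all under the lock. The userspace sketch below only counts those
iterations. copy_loop_iterations() is a hypothetical helper, not part of
the patch, and it says nothing about the actual time each iteration takes.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Count how many times the copy loop body would run for a copy of
 * 'rem' bytes on a device with the given block size. This mirrors the
 * chunking in the quoted loop: chunk = min(blocksize, rem), repeated
 * until rem reaches zero.
 */
static size_t copy_loop_iterations(size_t rem, size_t blocksize)
{
	size_t iterations = 0;

	assert(blocksize > 0);

	while (rem > 0) {
		size_t chunk = blocksize < rem ? blocksize : rem;

		rem -= chunk;
		iterations++;
	}

	return iterations;
}
```

For example, a 1 MiB copy with a 512-byte block size gives 2048
iterations with interrupts disabled, while a 4096-byte block size gives
256.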

>>+TRACE_EVENT(nullb_copy_op,
>>+		TP_PROTO(struct request *req,
>>+			 sector_t dst, sector_t src, size_t len),
>>+		TP_ARGS(req, dst, src, len),
>>+		TP_STRUCT__entry(
>>+				 __array(char, disk, DISK_NAME_LEN)
>>+				 __field(enum req_op, op)
>>+				 __field(sector_t, dst)
>>+				 __field(sector_t, src)
>>+				 __field(size_t, len)
>>+		),
>
>Isn't __string() preferred over __array() since the former occupies less space
>in the trace buffer?
>
Again, we followed the existing implementation to keep the series
simpler to review.

Thank you,
Nitesh Shetty

