[PATCH for-next v7 4/5] block: add helper to map bvec iterator for passthrough
Kanchan Joshi
joshi.k at samsung.com
Thu Sep 22 08:23:31 PDT 2022
On Tue, Sep 20, 2022 at 02:08:02PM +0200, Christoph Hellwig wrote:
>> -static int bio_map_user_iov(struct request *rq, struct iov_iter *iter,
>> +static struct bio *bio_map_get(struct request *rq, unsigned int nr_vecs,
>> gfp_t gfp_mask)
>
>bio_map_get is a very confusing name.
So I chose that name because the functionality is the opposite of what
the existing bio_map_put helper does; that way the pair is symmetric.
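For reference, a minimal sketch of the pairing (simplified; the actual
patch also handles the polled/bio_set allocation path):

static struct bio *bio_map_get(struct request *rq, unsigned int nr_vecs,
			       gfp_t gfp_mask)
{
	struct bio *bio;

	/* plain slab allocation; caller tears it down with bio_map_put() */
	bio = bio_kmalloc(nr_vecs, gfp_mask);
	if (!bio)
		return NULL;
	bio_init(bio, NULL, bio->bi_inline_vecs, nr_vecs, req_op(rq));
	return bio;
}

static void bio_map_put(struct bio *bio)
{
	bio_uninit(bio);
	kfree(bio);
}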
>And I also still think this is
>the wrong way to go. If plain slab allocations don't use proper
>per-cpu caches we have a MM problem and need to talk to the slab
>maintainers and not use the overkill bio_set here.
This series is not about using (or not using) a bio_set. The attempt
here has been to reuse the pre-mapped buffers (and the bvec array) that
we get from io_uring.
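To make the intent concrete, the caller side looks roughly like this
(a sketch only; 'bvec', 'nr_segs' and 'len' stand in for the
registered-buffer state that io_uring already holds):

	struct iov_iter iter;

	/* pages were pinned once, at buffer-registration time */
	iov_iter_bvec(&iter, WRITE, bvec, nr_segs, len);

	/* no per-IO pin or copy; the block layer reuses the bvecs */
	ret = blk_rq_map_user_bvec(rq, &iter);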
>> +/* Prepare bio for passthrough IO given an existing bvec iter */
>> +int blk_rq_map_user_bvec(struct request *rq, struct iov_iter *iter)
>
>I'm a little confused about the interface we're trying to present from
>the block layer to the driver here.
>
>blk_rq_map_user_iov really should be able to detect that it is called
>on a bvec iter and just do the right thing rather than needing different
>helpers.
I too explored that possibility, but found that it does not. It maps
the user pages into the bio either directly or by doing a copy (under
certain odd conditions), but it does not know how to deal with an
existing bvec. The reason, I guess, is that no one felt the need to try
passthrough for bvecs before; it makes sense only in the context of
io_uring passthrough.
And it really felt cleaner to me to write a new function rather than
overloading blk_rq_map_user_iov with multiple if/else branches. I tried
that again after your comment, but it did not seem to produce any
good-looking code.
The other factor is that this way felt safer: I am more confident that
I will not break the existing users of blk_rq_map_user_iov.
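For what it is worth, the dispatch you describe would look roughly like
this (a sketch only; the const-ness of the iter and the unused map_data
would need reconciling with blk_rq_map_user_bvec's prototype):

int blk_rq_map_user_iov(struct request_queue *q, struct request *rq,
			struct rq_map_data *map_data,
			const struct iov_iter *iter, gfp_t gfp_mask)
{
	if (iov_iter_is_bvec(iter))
		return blk_rq_map_user_bvec(rq, iter);
	/* ... existing direct-map and copy paths, unchanged ... */
}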
>> + /*
>> + * If the queue doesn't support SG gaps and adding this
>> + * offset would create a gap, disallow it.
>> + */
>> + if (bvprvp && bvec_gap_to_prev(lim, bvprvp, bv->bv_offset))
>> + goto out_err;
>
>So now you limit the input that is accepted? That's not really how
>iov_iters are used. We can either try to reshuffle the bvecs, or
>just fall back to the copy data version as blk_rq_map_user_iov does
>for 'weird' iters.
Since I was writing a 'new' helper for passthrough only, I thought it
would not be too bad to just bail out (rather than trying to handle it
with a copy) if we hit this queue_virt_boundary-related situation.
To handle it the 'copy data' way we would need something like this:
585 else if (queue_virt_boundary(q))
586 copy = queue_virt_boundary(q) & iov_iter_gap_alignment(iter);
587
But iov_iter_gap_alignment does not work on bvec iters; see line #1274
below:
1264 unsigned long iov_iter_gap_alignment(const struct iov_iter *i)
1265 {
1266 unsigned long res = 0;
1267 unsigned long v = 0;
1268 size_t size = i->count;
1269 unsigned k;
1270
1271 if (iter_is_ubuf(i))
1272 return 0;
1273
1274 if (WARN_ON(!iter_is_iovec(i)))
1275 return ~0U;
Do you see a way to overcome this? Or maybe this can be revisited
later, as we are not missing a lot?
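If we did want to go that route, I imagine a bvec-aware variant along
these lines (an untested sketch mirroring the iovec loop above, using
the in-page offsets instead of virtual addresses):

static unsigned long bvec_iter_gap_alignment(const struct iov_iter *i)
{
	const struct bio_vec *bvec = i->bvec;
	unsigned long res = 0, v = 0;
	size_t size = i->count;
	unsigned k;

	for (k = 0; k < i->nr_segs; k++) {
		unsigned long base = bvec[k].bv_offset;

		if (!bvec[k].bv_len)
			continue;
		if (v)	/* bits of previous end and this start */
			res |= base | v;
		v = base + bvec[k].bv_len;
		if (size <= bvec[k].bv_len)
			break;
		size -= bvec[k].bv_len;
	}
	return res;
}

Then the copy decision could test
queue_virt_boundary(q) & bvec_iter_gap_alignment(iter) the same way
the iovec path does.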
>> +
>> + /* check full condition */
>> + if (nsegs >= nr_segs || bytes > UINT_MAX - bv->bv_len)
>> + goto out_err;
>> +
>> + if (bytes + bv->bv_len <= nr_iter &&
>> + bv->bv_offset + bv->bv_len <= PAGE_SIZE) {
>> + nsegs++;
>> + bytes += bv->bv_len;
>> + } else
>> + goto out_err;
>
>Nit: This would read much better as:
>
> if (bytes + bv->bv_len > nr_iter)
> goto out_err;
> if (bv->bv_offset + bv->bv_len > PAGE_SIZE)
> goto out_err;
>
> bytes += bv->bv_len;
> nsegs++;
Indeed, cleaner. Thanks.
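For the archive, the loop body with your restructuring folded in would
read roughly like this (sketch against the v7 code quoted above):

	if (bvprvp && bvec_gap_to_prev(lim, bvprvp, bv->bv_offset))
		goto out_err;

	/* check full condition */
	if (nsegs >= nr_segs || bytes > UINT_MAX - bv->bv_len)
		goto out_err;
	if (bytes + bv->bv_len > nr_iter)
		goto out_err;
	if (bv->bv_offset + bv->bv_len > PAGE_SIZE)
		goto out_err;

	bytes += bv->bv_len;
	nsegs++;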