[PATCH 08/17] nvme: enable passthrough with fixed-buffer
Kanchan Joshi
joshiiitr at gmail.com
Mon Mar 14 06:06:17 PDT 2022
On Fri, Mar 11, 2022 at 12:13 PM Christoph Hellwig <hch at lst.de> wrote:
>
> > +int blk_rq_map_user_fixedb(struct request_queue *q, struct request *rq,
> > + u64 ubuf, unsigned long len, gfp_t gfp_mask,
> > + struct io_uring_cmd *ioucmd)
>
> Looking at this a bit more, I don't think this is a good interface or
> works at all for that matter.
>
> > +{
> > + struct iov_iter iter;
> > + size_t iter_count, nr_segs;
> > + struct bio *bio;
> > + int ret;
> > +
> > + /*
> > + * Talk to io_uring to obtain BVEC iterator for the buffer.
> > + * And use that iterator to form bio/request.
> > + */
> > + ret = io_uring_cmd_import_fixed(ubuf, len, rq_data_dir(rq), &iter,
> > + ioucmd);
>
> Instead of pulling the io-uring dependency into blk-map.c we could just
> pass the iter to a helper function and have that as the block layer
> abstraction if we really want one. But:
>
> > + if (unlikely(ret < 0))
> > + return ret;
> > + iter_count = iov_iter_count(&iter);
> > + nr_segs = iter.nr_segs;
> > +
> > + if (!iter_count || (iter_count >> 9) > queue_max_hw_sectors(q))
> > + return -EINVAL;
> > + if (nr_segs > queue_max_segments(q))
> > + return -EINVAL;
> > + /* no iovecs to alloc, as we already have a BVEC iterator */
> > + bio = bio_alloc(gfp_mask, 0);
> > + if (!bio)
> > + return -ENOMEM;
> > +
> > + ret = bio_iov_iter_get_pages(bio, &iter);
>
> I can't see how this works at all. block drivers have a lot more
> requirements than just total size and number of segments. Very typical
> is a limit on the size of each sector, and for nvme we also have the
> weird virtual boundary for the PRPs. None of that is being checked here.
> You really need to use bio_add_pc_page or open code the equivalent checks
> for passthrough I/O.
Indeed, I'm missing those checks. Will fix up.
> > + if (likely(nvme_is_fixedb_passthru(ioucmd)))
> > + ret = blk_rq_map_user_fixedb(q, req, ubuffer, bufflen,
> > + GFP_KERNEL, ioucmd);
>
> And I'm also really worried about only supporting fixed buffers. Fixed
> buffers are a really nice benchmarketing feature, but without supporting
> arbitrary buffers this is rather useless in real life.
Sorry, I did not get your point on arbitrary buffers.
The goal has been to match/surpass io_uring's block-io peak perf, so
pre-mapped buffers had to be added.
--
Kanchan
More information about the Linux-nvme
mailing list