[PATCH] iov_iter: don't require contiguous pages in iov_iter_extract_bvec_pages
Ming Lei
ming.lei at redhat.com
Thu Oct 31 04:17:44 PDT 2024
On Thu, Oct 31, 2024 at 09:42:32AM +0100, Klara Modin wrote:
> On 2024-10-31 01:22, Ming Lei wrote:
> > On Thu, Oct 31, 2024 at 08:14:49AM +0800, Ming Lei wrote:
> > > On Wed, Oct 30, 2024 at 06:56:48PM +0100, Klara Modin wrote:
> > > > Hi,
> > > >
> > > > On 2024-10-24 07:00, Christoph Hellwig wrote:
> > > > > From: Ming Lei <ming.lei at redhat.com>
> > > > >
> > > > > The iov_iter_extract_pages interface allows returning physically
> > > > > discontiguous pages, as long as all but the first and last page
> > > > > in the array are page aligned and page sized. Rewrite
> > > > > iov_iter_extract_bvec_pages to take advantage of that instead of
> > > > > only returning ranges of physically contiguous pages.
> > > > >
> > > > > Signed-off-by: Ming Lei <ming.lei at redhat.com>
> > > > > [hch: minor cleanups, new commit log]
> > > > > Signed-off-by: Christoph Hellwig <hch at lst.de>
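
For anyone following along, the contract described in that commit log works
out to roughly the following. A minimal sketch, not kernel source:
walk_extracted() and its parameters npages/total/offset0 are made-up names
standing in for the result of iov_iter_extract_pages(); pages[] may be
physically discontiguous, but only the first element may start mid-page and
only the last may cover less than PAGE_SIZE:

#include <linux/minmax.h>
#include <linux/mm.h>

static void walk_extracted(struct page **pages, int npages,
			   size_t total, size_t offset0)
{
	size_t consumed = 0;
	int k;

	for (k = 0; k < npages && consumed < total; k++) {
		/* only the first page may have a non-zero offset */
		size_t off = k ? 0 : offset0;
		/* only the last page may be shorter than PAGE_SIZE */
		size_t len = min_t(size_t, total - consumed, PAGE_SIZE - off);

		/* hand pages[k] + (off, len) to the bio / driver here */
		consumed += len;
	}
}
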
> > > >
> > > > With this patch (e4e535bff2bc82bb49a633775f9834beeaa527db in next-20241030),
> > > > I'm unable to connect via nvme-tcp with this in the log:
> > > >
> > > > nvme nvme1: failed to send request -5
> > > > nvme nvme1: Connect command failed: host path error
> > > > nvme nvme1: failed to connect queue: 0 ret=880
> > > >
> > > > With the patch reverted it works as expected:
> > > >
> > > > nvme nvme1: creating 24 I/O queues.
> > > > nvme nvme1: mapped 24/0/0 default/read/poll queues.
> > > > nvme nvme1: new ctrl: NQN
> > > > "nqn.2018-06.eu.kasm.int:freenas:backup:parmesan.int.kasm.eu", addr
> > > > [2001:0678:0a5c:1204:6245:cbff:fe9c:4f59]:4420, hostnqn:
> > > > nqn.2018-06.eu.kasm.int:parmesan
> > >
> > > I can't reproduce it by running blktests ('nvme_trtype=tcp ./check nvme/')
> > > on either the next tree or for-6.13/block.
> > >
> > > Can you collect the following bpftrace log by running the script before
> > > connecting to nvme-tcp?
>
> I didn't seem to get any output from the bpftrace script (I confirmed that
> I had the config as you requested, but I'm not very familiar with bpftrace
> so I could have done something wrong).

It works for me on Fedora (37, 40).

> I could, however, reproduce the issue in qemu and added breakpoints on
> nvmf_connect_io_queue and iov_iter_extract_pages. The breakpoint on
> iov_iter_extract_pages got hit once when running nvme connect:
>
> (gdb) break nvmf_connect_io_queue
> Breakpoint 1 at 0xffffffff81a5d960: file
> /home/klara/git/linux/drivers/nvme/host/fabrics.c, line 525.
> (gdb) break iov_iter_extract_pages
> Breakpoint 2 at 0xffffffff817633b0: file
> /home/klara/git/linux/lib/iov_iter.c, line 1900.
> (gdb) c
> Continuing.
> [Switching to Thread 1.1]
Wow, debugging the kernel with gdb, cool!
>
> Thread 1 hit Breakpoint 2, iov_iter_extract_pages (i=i@entry=0xffffc900001ebd68,
>     pages=pages@entry=0xffffc900001ebb08, maxsize=maxsize@entry=72, maxpages=8,
>     extraction_flags=extraction_flags@entry=0,
>     offset0=offset0@entry=0xffffc900001ebb10)
>     at /home/klara/git/linux/lib/iov_iter.c:1900
> 1900	{
> (gdb) print i->count
> $5 = 72
> (gdb) print i->iov_offset
> $6 = 0
> (gdb) print i->bvec->bv_offset
> $7 = 3952
> (gdb) print i->bvec->bv_len
> $8 = 72
> (gdb) c
> Continuing.
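
(For reference: bv_offset 3952 + bv_len 72 = 4024, so with 4k pages the
whole 72-byte payload sits inside a single page, and a correct extraction
here should return exactly one page with *offset0 = 3952 and a length of
72 bytes.)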
>
> I didn't hit the breakpoint in nvmf_connect_io_queue, but I did hit it if I
> added it to nvmf_connect_admin_queue instead. I added this function to the
> bpftrace script, but that didn't produce any output either.
Your kernel config shows that all BTF-related options are enabled; maybe it's
a bpftrace userspace issue?
>
> >
> > And please try the following patch:
> >
> > diff --git a/lib/iov_iter.c b/lib/iov_iter.c
> > index 9fc06f5fb748..c761f6db3cb4 100644
> > --- a/lib/iov_iter.c
> > +++ b/lib/iov_iter.c
> > @@ -1699,6 +1699,7 @@ static ssize_t iov_iter_extract_bvec_pages(struct iov_iter *i,
> >  		i->bvec++;
> >  		skip = 0;
> >  	}
> > +	bi.bi_idx = 0;
> >  	bi.bi_size = maxsize + skip;
> >  	bi.bi_bvec_done = skip;
> >
> >
>
> Applying this seems to fix the problem.
Thanks for the test; the patch has been sent out.
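
For the archives, my understanding of the root cause: the rewritten
iov_iter_extract_bvec_pages() walks the segments with an on-stack struct
bvec_iter, and bvec_iter_bvec() indexes the segment array with bi.bi_idx,
so an uninitialized bi_idx makes the walk start at whatever garbage the
stack happens to hold, the wrong pages get extracted, and the nvme-tcp
send fails with -5 (-EIO). An illustrative sketch, not the exact
lib/iov_iter.c code (struct bvec_iter, bvec_iter_bvec() and
bvec_iter_advance_single() are the real helpers from include/linux/bvec.h;
i, maxsize and skip are the function's own locals as in the diff above):

	struct bvec_iter bi;	/* on-stack: every field starts as stack garbage */

	bi.bi_idx = 0;			/* the one-line fix */
	bi.bi_size = maxsize + skip;
	bi.bi_bvec_done = skip;

	while (bi.bi_size) {
		/* reads i->bvec[bi.bi_idx], adjusted by bi_bvec_done; with a
		 * garbage bi_idx this picks a random segment */
		struct bio_vec bv = bvec_iter_bvec(i->bvec, bi);

		/* ... record bv.bv_page / bv.bv_offset / bv.bv_len ... */
		bvec_iter_advance_single(i->bvec, &bi, bv.bv_len);
	}

That would also explain why it doesn't reproduce everywhere: whenever that
stack slot happens to be zero, the walk starts at the right segment.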
thanks,
Ming