[PATCHv6 11/11] iomap: add support for dma aligned direct-io
Eric Farman
farman at linux.ibm.com
Thu Jun 23 14:34:29 PDT 2022
On Thu, 2022-06-23 at 16:32 -0400, Eric Farman wrote:
> On Thu, 2022-06-23 at 13:11 -0600, Keith Busch wrote:
> > On Thu, Jun 23, 2022 at 12:51:08PM -0600, Keith Busch wrote:
> > > On Thu, Jun 23, 2022 at 02:29:13PM -0400, Eric Farman wrote:
> > > > On Fri, 2022-06-10 at 12:58 -0700, Keith Busch wrote:
> > > > > From: Keith Busch <kbusch at kernel.org>
> > > > >
> > > > > Use the address alignment requirements from the block_device for
> > > > > direct io instead of requiring addresses be aligned to the block
> > > > > size.
> > > >
> > > > Hi Keith,
> > > >
> > > > Our s390 PV guests recently started failing to boot from a -next
> > > > host, and git blame brought me here.
> > > >
> > > > As near as I have been able to tell, we start tripping up on this
> > > > code from patch 9 [1] that gets invoked with this patch:
> > > >
> > > > > for (k = 0; k < i->nr_segs; k++, skip = 0) {
> > > > > 	size_t len = i->iov[k].iov_len - skip;
> > > > >
> > > > > 	if (len > size)
> > > > > 		len = size;
> > > > > 	if (len & len_mask)
> > > > > 		return false;
> > > >
> > > > The iovec we're failing on has two segments, one with a len of x200
> > > > (and base of x...000) and another with a len of xe00 (and a base of
> > > > x...200), while len_mask is of course xfff.
> > > >
> > > > So before I go any further on what we might have broken, do you
> > > > happen to have any suggestions what might be going on here, or
> > > > something I should try?
> > >
> > > Thanks for the notice, sorry for the trouble. This check wasn't
> > > intended to have any difference from the previous code with respect
> > > to the vector lengths.
> > >
> > > Could you tell me if you're accessing this through the block device
> > > direct-io, or through iomap filesystem?
>
> Reasonably certain the failure's on iomap. I'd reverted the subject
> patch from next-20220622 and got things in working order.
>
> > If using iomap, the previous check was this:
> >
> > 	unsigned int blkbits =
> > 		blksize_bits(bdev_logical_block_size(iomap->bdev));
> > 	unsigned int align = iov_iter_alignment(dio->submit.iter);
> > 	...
> > 	if ((pos | length | align) & ((1 << blkbits) - 1))
> > 		return -EINVAL;
> >
> >
> ...
> > The result of "iov_iter_alignment()" would include "0xe00 | 0x200" in
> > your example, and checked against 0xfff should have been failing prior
> > to this patch. Unless I'm missing something...
>
> Nope, you're not. I didn't look back at what the old check was doing,
> just saw "0xe00 and 0x200" and thought "oh there's one page" instead of
> noting the code was or'ing them. My bad.
>
> That was the last entry in my trace before the guest gave up, as
> everything else through this code up to that point seemed okay. I'll
> pick up the working case and see if I can get a clearer picture between
> the two.

Looking over the trace again, I realize I did dump iov_iter_alignment()
as a comparator, and I see one pass through that had a non-zero
response but bdev_iter_is_aligned() returned true...

	count = x1000
	iov_offset = x0
	nr_segs = 1
	iov_len = x1000 (len_mask = xfff)
	iov_base = x...200 (addr_mask = x1ff)

That particular pass through is in the middle of the stuff it tried to
do, so I don't know if that's the cause or not but it strikes me as
unusual. Will look into that tomorrow and report back.