[PATCHv6 11/11] iomap: add support for dma aligned direct-io
Keith Busch
kbusch at kernel.org
Thu Jun 23 12:11:57 PDT 2022
On Thu, Jun 23, 2022 at 12:51:08PM -0600, Keith Busch wrote:
> On Thu, Jun 23, 2022 at 02:29:13PM -0400, Eric Farman wrote:
> > On Fri, 2022-06-10 at 12:58 -0700, Keith Busch wrote:
> > > From: Keith Busch <kbusch at kernel.org>
> > >
> > > Use the address alignment requirements from the block_device for direct
> > > io instead of requiring addresses be aligned to the block size.
> >
> > Hi Keith,
> >
> > Our s390 PV guests recently started failing to boot from a -next host,
> > and git blame brought me here.
> >
> > As near as I have been able to tell, we start tripping up on this code
> > from patch 9 [1] that gets invoked with this patch:
> >
> > > 	for (k = 0; k < i->nr_segs; k++, skip = 0) {
> > > 		size_t len = i->iov[k].iov_len - skip;
> > >
> > > 		if (len > size)
> > > 			len = size;
> > > 		if (len & len_mask)
> > > 			return false;
> >
> > The iovec we're failing on has two segments, one with a len of x200
> > (and base of x...000) and another with a len of xe00 (and a base of
> > x...200), while len_mask is of course xfff.
> >
> > So before I go any further on what we might have broken, do you happen
> > to have any suggestions what might be going on here, or something I
> > should try?
>
> Thanks for the notice, sorry for the trouble. This check wasn't intended to
> have any difference from the previous code with respect to the vector lengths.
>
> Could you tell me if you're accessing this through the block device direct-io,
> or through iomap filesystem?

If using iomap, the previous check was this:

	unsigned int blkbits = blksize_bits(bdev_logical_block_size(iomap->bdev));
	unsigned int align = iov_iter_alignment(dio->submit.iter);
	...
	if ((pos | length | align) & ((1 << blkbits) - 1))
		return -EINVAL;

And if raw block device, it was this:

	if ((pos | iov_iter_alignment(iter)) &
	    (bdev_logical_block_size(bdev) - 1))
		return -EINVAL;

The result of "iov_iter_alignment()" would include "0xe00 | 0x200" in your
example, and checked against 0xfff should have been failing prior to this
patch. Unless I'm missing something...
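
To make the arithmetic concrete, here is a minimal userspace sketch of the
old check applied to the two-segment vector from the report. It is only a
simplified model, not the kernel code: "struct seg", "model_alignment()",
"old_check_passes()" and "example_vec" are hypothetical names, and only the
low 12 bits of each base are filled in because the upper bits were elided in
the report.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one iovec segment (not the kernel's struct iovec). */
struct seg {
	uintptr_t base;
	size_t len;
};

/*
 * Simplified model of what iov_iter_alignment() computes for an iovec:
 * the bitwise OR of every segment's base address and length.
 */
static unsigned long model_alignment(const struct seg *v, size_t n)
{
	unsigned long res = 0;

	for (size_t k = 0; k < n; k++)
		res |= (unsigned long)v[k].base | (unsigned long)v[k].len;
	return res;
}

/* Old-style check: passes only if no bit of pos, base or len falls under mask. */
static bool old_check_passes(unsigned long pos, const struct seg *v, size_t n,
			     unsigned long mask)
{
	return ((pos | model_alignment(v, n)) & mask) == 0;
}

/*
 * The vector from the report. Only the low 12 bits of each base are known
 * (x...000 and x...200), so the upper bits are left as zero here.
 */
static const struct seg example_vec[2] = {
	{ 0x000, 0x200 },
	{ 0x200, 0xe00 },
};
```

Under this model, old_check_passes(0, example_vec, 2, 0xfff) is false, which
matches the expectation that the old code would also have returned -EINVAL
for a 0xfff mask. The same call with a mask of 0x1ff (a 512-byte logical
block size, which is an assumption and not something stated in the report)
would pass, so the mask actually in effect on the failing system is worth
confirming.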