[PATCHv6 11/11] iomap: add support for dma aligned direct-io

Halil Pasic pasic at linux.ibm.com
Tue Jun 28 02:00:24 PDT 2022


On Mon, 27 Jun 2022 09:36:56 -0600
Keith Busch <kbusch at kernel.org> wrote:

> On Mon, Jun 27, 2022 at 11:21:20AM -0400, Eric Farman wrote:
> > 
> > Apologies, it took me an extra day to get back to this, but it is
> > indeed this pass through that's causing our boot failures. I note that
> > the old code (in iomap_dio_bio_iter), did:
> > 
> >         if ((pos | length | align) & ((1 << blkbits) - 1))
> >                 return -EINVAL;
> > 
> > With blkbits equal to 12, the resulting mask was 0x0fff, and testing it
> > against an align value (from iov_iter_alignment) of 0x200 kicks us out.
> > 
> > The new code (in iov_iter_aligned_iovec), meanwhile, compares this:
> > 
> >                 if ((unsigned long)(i->iov[k].iov_base + skip) & addr_mask)
> >                         return false;
> > 
> > iov_base (and the output of the old iov_iter_alignment() routine)
> > is 0x200, but since addr_mask is 0x1ff this check provides a different
> > response than it used to.
> > 
> > To check this, I changed the comparator to len_mask (almost certainly
> > not the right answer since addr_mask is then unused, but it was good
> > for a quick test), and our PV guests are able to boot again with -next
> > running in the host.  
> 
> This raises more questions for me. It sounds like your process used to get an
> EINVAL error, and it wants to continue getting an EINVAL error instead of
> letting the direct-io request proceed. Is that correct? 

That is my understanding as well. But I'm not familiar enough with the code
to tell where and how that -EINVAL gets handled.

BTW, let me point out that the bounce buffering via swiotlb needed for PV
may well change the alignment of the buffers. But I'm not sure if that is
relevant here.
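
For reference, the mask arithmetic from Eric's report quoted above can be
sketched as a small user-space C fragment. This is only an illustration, not
the kernel code: the helper names old_check(), new_addr_aligned() and
check_reported_values() are made up, pos = 0 and length = 0x1000 are assumed
values, and treating len_mask as 0xfff in the quick test is an assumption too.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Old-style check: pos, length and buffer alignment must all be
 * multiples of the logical block size (1 << blkbits). */
static int old_check(unsigned long pos, unsigned long length,
                     unsigned long align, unsigned int blkbits)
{
        if ((pos | length | align) & ((1UL << blkbits) - 1))
                return -EINVAL;
        return 0;
}

/* New-style check: the buffer address is tested against its own mask,
 * independent of the block size. */
static bool new_addr_aligned(unsigned long addr, unsigned long addr_mask)
{
        return (addr & addr_mask) == 0;
}

/* Values from the report: align/iov_base 0x200, blkbits 12, addr_mask 0x1ff. */
static void check_reported_values(void)
{
        /* 0x200 & 0xfff != 0, so the old code rejects the request. */
        assert(old_check(0, 0x1000, 0x200, 12) == -EINVAL);
        /* 0x200 & 0x1ff == 0, so the new code lets it through. */
        assert(new_addr_aligned(0x200, 0x1ff));
        /* With a 0xfff mask (assumed len_mask) it is rejected again. */
        assert(!new_addr_aligned(0x200, 0xfff));
}
```

Calling check_reported_values() from a test program passes silently, which
matches the behaviour change described: the same 0x200 buffer address is
rejected against a 4K block mask but accepted against a 512-byte address mask.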

Regards,
Halil

> If so, could you
> provide more details on what issue occurs with dispatching this request?
> 
> If you really need to restrict the address alignment to the storage's logical
> block size, I think your storage driver needs to set the dma_alignment queue
> limit to that value.



