[PATCHv5 00/11] direct-io dma alignment
Keith Busch
kbusch at kernel.org
Wed Jun 1 07:28:31 PDT 2022
On Wed, Jun 01, 2022 at 12:11:40AM -0700, Eric Biggers wrote:
> On Tue, May 31, 2022 at 12:11:26PM -0700, Keith Busch wrote:
> > From: Keith Busch <kbusch at kernel.org>
> >
> > The most significant change from v4 is the alignment is now checked
> > prior to building the bio. This gets the expected EINVAL error for
> > misaligned userspace iovecs in all cases now (Eric Biggers).
> >
> > I've removed the legacy fs change, so only iomap filesystems get to use
> > this alignement capability (Christoph Hellwig).
> >
> > The block fops check for alignment returns a bool now (Damien).
> >
> > Adjusted some comments, docs, and other minor style issues.
> >
> > Reviews added for unchanged or trivially changed patches, removed
> > reviews for ones that changed more significantly.
> >
> > As before, I tested using 'fio' with forced misaligned user buffers on
> > raw block, xfs, and ext4 (example raw block profile below).
> >
>
> I still don't think you've taken care of all the assumptions that bv_len is a
> multiple of logical block size, or at least SECTOR_SIZE. Try this:
>
> git grep -E 'bv_len (>>|/)'
There are only 8 drivers that set the request_queue's dma alignment, which are
the only ones that could be affected from this patch series. The drivers
returned from the above don't set dma alignment, so they're fine to assume
those lengths.
I don't think the above query captures enough since it misses things like
nfhd_submit_bio() that shifts 9 on the following line. Not that that example
matters either for the same reason.
> Also:
>
> git grep '<.*bv_len;'
>
> Also take a look at bio_for_each_segment(), specifically how iter->bi_sector is
> updated.
I'm not finding any driver user of this macro that's set the dma alignment
where this would break. They either never set it, or they're stacking drivers
that always get the safe default.
Outside drivers, blk-integrity doesn't operate on sector lengths, so that's
fine, and blk-crypto would prevent such unalignment much earlier. And btrfs
bounces this type of direct IO, so that's also fine.
Even if we assume all the existing users are safe, I suppose we could say this
type of assumption is potentially fragile. For example, I think drivers like
pmem or null_blk could readily reduce their queue's dma alignment limit, but
that may break their current usage.
More information about the Linux-nvme
mailing list