[PATCH 03/15] iomap: don't try to poll multi-bio I/Os in __iomap_dio_rw

Darrick J. Wong djwong at kernel.org
Wed May 12 14:32:30 PDT 2021


On Wed, May 12, 2021 at 03:15:33PM +0200, Christoph Hellwig wrote:
> If an iocb is split into multiple bios we can't poll for both.  So don't
> bother to even try to poll in that case.
> 
> Signed-off-by: Christoph Hellwig <hch at lst.de>

Ahh the fun of reviewing things like iopoll that I'm not all /that/
familiar with...

> ---
>  fs/iomap/direct-io.c | 17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
> 
> diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
> index 9398b8c31323..d5637f467109 100644
> --- a/fs/iomap/direct-io.c
> +++ b/fs/iomap/direct-io.c
> @@ -282,6 +282,13 @@ iomap_dio_bio_actor(struct inode *inode, loff_t pos, loff_t length,
>  	if (!iov_iter_count(dio->submit.iter))
>  		goto out;
>  
> +	/*
> +	 * We can only poll for single bio I/Os.
> +	 */
> +	if (need_zeroout ||
> +	    ((dio->flags & IOMAP_DIO_WRITE) && pos >= i_size_read(inode)))

Hm, is this logic here to catch the second iomap_dio_zero below the
zero_tail: label?  What happens if we have an unaligned direct write
that starts below EOF but (for whatever reason) ends the loop with pos
above EOF and not on a block boundary?  At entry need_zeroout could be
false and pos < i_size, so this check wouldn't clear IOCB_HIPRI, yet
zero_tail would still submit an extra bio.
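
To make that concrete, here's a made-up walk-through (hypothetical
numbers, and I'm going from the current zero_tail logic in
iomap_dio_bio_actor, so correct me if I've misread it):

	/*
	 * Say fs_block_size = 4096, i_size = 8192, and the write lands
	 * on a plain written extent, so need_zeroout == false.
	 * Direct write: pos = 4096, length = 6000.
	 *
	 * At entry: need_zeroout is false and pos (4096) < i_size
	 * (8192), so the new check leaves IOCB_HIPRI set.
	 *
	 * After the copy loop: pos == 10096 >= i_size, and
	 * pad == pos & (fs_block_size - 1) == 1904, so zero_tail calls
	 * iomap_dio_zero() and submits a second bio for this iocb --
	 * one the caller still thinks it can poll for.
	 */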

> +		dio->iocb->ki_flags &= ~IOCB_HIPRI;
> +
>  	if (need_zeroout) {
>  		/* zero out from the start of the block to the write offset */
>  		pad = pos & (fs_block_size - 1);
> @@ -339,6 +346,11 @@ iomap_dio_bio_actor(struct inode *inode, loff_t pos, loff_t length,
>  
>  		nr_pages = bio_iov_vecs_to_alloc(dio->submit.iter,
>  						 BIO_MAX_VECS);
> +		/*
> +		 * We can only poll for single bio I/Os.
> +		 */
> +		if (nr_pages)
> +			dio->iocb->ki_flags &= ~IOCB_HIPRI;
>  		iomap_dio_submit_bio(dio, iomap, bio, pos);
>  		pos += n;
>  	} while (nr_pages);
> @@ -579,6 +591,11 @@ __iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
>  			iov_iter_revert(iter, pos - dio->i_size);
>  			break;
>  		}
> +
> +		/*
> +		 * We can only poll for single bio I/Os.
> +		 */
> +		iocb->ki_flags &= ~IOCB_HIPRI;

Hmm, why is this here?  Won't this clear IOCB_HIPRI even if the first
iomap_apply call successfully creates a polled bio for the entire iovec
such that we exit the loop one line later because count becomes zero?
How does the upper layer (io_uring, I surmise?) know that there's
a bio that it should poll for?
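
(For reference, with this hunk applied the loop body looks roughly
like this -- heavily trimmed, so treat it as a sketch:)

	do {
		/* may submit one or more bios, polled if IOCB_HIPRI */
		ret = iomap_apply(inode, pos, count, iomap_flags, ops,
				  dio, iomap_dio_actor);
		if (ret <= 0)
			break;
		pos += ret;
		...
		/* runs even when this was the only pass through: */
		iocb->ki_flags &= ~IOCB_HIPRI;
	} while ((count = iov_iter_count(iter)) > 0);

i.e. the flag gets cleared after the first successful iomap_apply
whether or not the loop is about to terminate.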

<shrug> Unless the only intended effect is that subsequent calls to
iomap_apply don't have the polling mode set?  I see enough places in
io_uring.c that check (iocb->ki_flags & IOCB_HIPRI) to make me wonder
whether the lifetime of that flag ends as soon as we get to
->{read,write}_iter.
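
(If it helps frame the question: going from memory of
iomap_dio_submit_bio -- so double-check me -- the polled state is
latched onto each bio at submission time:

	if (dio->iocb->ki_flags & IOCB_HIPRI)
		bio_set_polled(bio, dio->iocb);

so clearing the flag on the iocb afterwards would only affect bios we
haven't submitted yet, plus whoever inspects ki_flags later on.)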

--D

>  	} while ((count = iov_iter_count(iter)) > 0);
>  	blk_finish_plug(&plug);
>  
> -- 
> 2.30.2
> 


