SGL support of driver

Christoph Hellwig hch at infradead.org
Wed Jul 29 00:55:57 PDT 2015


On Tue, Jul 28, 2015 at 10:35:00AM -0400, Matthew Wilcox wrote:
> On Mon, Jul 27, 2015 at 09:57:56AM -0700, Christoph Hellwig wrote:
> > It also requires pagecache I/O to be split into multiple commands
> > where other block devices can handle it a lot more efficiently.
> 
> Wait, what?  pagecache I/O is page aligned and page sized.  That can
> always be represented by a PRP list.

Pagecache I/O is not necessarily page aligned.

Think of this case that I remember clearly because it exposed a bug
a few years ago:

64k page size, 4k file system block size, raid 0 with a stripe size of
8k and two legs.

The typical SGL fed to the hardware driver for streaming I/O will be:

page A, offset 0, len 8k
page A, offset 16k, len 8k
page A, offset 32k, len 8k
page A, offset 48k, len 8k
page B, offset 0, len 8k

This is clearly something a PRP list will not handle well.  Note that I'm
not necessarily saying this is something to optimize for; the more
interesting case for NVMe really is the vectored direct I/O case.
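
To make the example concrete, here is a rough Python sketch (a model, not
driver code) that rebuilds the segment list above from the stated geometry
and checks it against the PRP layout rule: every segment except the first
must start on a memory page boundary, and every segment except the last
must end on one.  The function names, the page addresses, and above all the
choice of a 64k controller memory page (matching the host page size) are
assumptions of this sketch, not something stated in the message.

```python
K = 1024


def leg_segments(page_addrs, page_size, stripe, legs, leg, count):
    """Return (address, length) segments that one RAID 0 leg receives.

    Assumes the stripe size divides the page size, so a stripe chunk
    never straddles two pages.
    """
    segs = []
    for n in range(count):
        # Chunks are dealt round-robin to the legs; take every legs-th one.
        file_off = (n * legs + leg) * stripe
        page, off = divmod(file_off, page_size)
        segs.append((page_addrs[page] + off, stripe))
    return segs


def prp_compatible(segs, mem_page):
    """True if one PRP list could describe all segments in one command."""
    last = len(segs) - 1
    for i, (addr, length) in enumerate(segs):
        if i != 0 and addr % mem_page != 0:
            return False  # only the first segment may start mid-page
        if i != last and (addr + length) % mem_page != 0:
            return False  # only the last segment may end mid-page
    return True


def split_for_prp(segs, mem_page):
    """Greedily cut the list into runs that each satisfy the PRP rule."""
    commands, current = [], []
    for addr, length in segs:
        if current:
            prev_addr, prev_len = current[-1]
            if (prev_addr + prev_len) % mem_page != 0 or addr % mem_page != 0:
                commands.append(current)
                current = []
        current.append((addr, length))
    if current:
        commands.append(current)
    return commands


# Hypothetical, 64k-aligned physical addresses for page A and page B.
PAGE_A, PAGE_B = 0x10000000, 0x20000000

segs = leg_segments([PAGE_A, PAGE_B], page_size=64 * K, stripe=8 * K,
                    legs=2, leg=0, count=5)

print(prp_compatible(segs, mem_page=64 * K))      # False
print(len(split_for_prp(segs, mem_page=64 * K)))  # 5
```

Under the 64k assumption each 8k segment ends in the middle of a memory
page, so the five segments become five separate commands.  The outcome
depends on that parameter: with mem_page=4 * K every segment in this
particular list starts and ends on a 4k boundary and the same check passes.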


