[PATCHv10 0/9] write hints with nvme fdp, scsi streams

Matthew Wilcox willy at infradead.org
Fri Nov 8 08:54:34 PST 2024


On Fri, Nov 08, 2024 at 08:51:31AM -0700, Keith Busch wrote:
> On Fri, Nov 08, 2024 at 03:18:52PM +0100, Christoph Hellwig wrote:
> > We're not really duplicating much.  Writing sequential is pretty easy,
> > and tracking reclaim units separately means you need another tracking
> > data structure, and either that or the LBA one is always going to be
> > badly fragmented if they aren't the same.
> 
> You're getting fragmentation anyway, which is why you had to implement
> gc. You're just shifting who gets to deal with it from the controller to
> the host. The host is further from the media, so you're starting from a
> disadvantage. The host gc implementation would have to be quite a bit
> better to justify the link and memory usage necessary for the copies
> (...cue a copy-offload discussion? oom?).

But the filesystem knows which blocks are actually in use.  Sending
TRIM/DISCARD information to the drive at block-level granularity hasn't
worked out so well in the past.  So the drive is the one at a disadvantage
because it has to copy blocks which aren't actually in use.
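
A toy model of that asymmetry, as a rough sketch only: the block numbers,
the set-based bookkeeping and the blocks_to_copy() helper are all
hypothetical and do not come from the patch series or from any real FTL or
filesystem.  It assumes the device treats every written block as valid
unless it received a discard for it, while the filesystem knows the true
live set.

```python
def blocks_to_copy(written, discarded, live):
    """Return (device_copies, host_copies) for reclaiming one unit.

    written   -- blocks that were written into the reclaim unit
    discarded -- blocks the host actually told the device about
    live      -- blocks the filesystem still references
    """
    # The device only learns a block is free if a discard reached it.
    device_view = written - discarded
    # The filesystem knows exactly which written blocks are still in use.
    host_view = written & live
    return len(device_view), len(host_view)


# Hypothetical reclaim unit of 8 blocks, 3 of which are still in use,
# with only one of the 5 freed blocks ever reported via discard.
written = set(range(8))
live = {0, 3, 5}
discarded = {1}

device_copies, host_copies = blocks_to_copy(written, discarded, live)
print(device_copies, host_copies)
```

In this made-up case the device-side gc moves 7 blocks where host-side gc
would move 3; the gap closes only as discard coverage approaches the
filesystem's real free list.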

I like the idea of using copy-offload though.


