[PATCHv10 0/9] write hints with nvme fdp, scsi streams
Matthew Wilcox
willy at infradead.org
Fri Nov 8 08:54:34 PST 2024
On Fri, Nov 08, 2024 at 08:51:31AM -0700, Keith Busch wrote:
> On Fri, Nov 08, 2024 at 03:18:52PM +0100, Christoph Hellwig wrote:
> > We're not really duplicating much. Writing sequential is pretty easy,
> > and tracking reclaim units separately means you need another tracking
> > data structure, and either that or the LBA one is always going to be
> > badly fragmented if they aren't the same.
>
> You're getting fragmentation anyway, which is why you had to implement
> gc. You're just shifting who gets to deal with it from the controller to
> the host. The host is further from the media, so you're starting from a
> disadvantage. The host gc implementation would have to be quite a bit
> better to justify the link and memory usage necessary for the copies
> (...queue a copy-offload discussion? oom?).
But the filesystem knows which blocks are actually in use. Sending
TRIM/DISCARD information to the drive at block-level granularity hasn't
worked out so well in the past. So the drive is the one at a disadvantage
because it has to copy blocks which aren't actually in use.
I like the idea of using copy-offload though.
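
To put rough numbers on the point about the drive copying blocks that
aren't in use, here is a minimal sketch. It is a toy model, not anything
from the patch series: gc_copies(), the block numbers, and the single
"reclaim unit" are all hypothetical, and it assumes the collector must
relocate every block it cannot prove is dead before erasing the unit.

```python
def gc_copies(written, known_dead):
    """Count blocks a collector must relocate before erasing one unit.

    Toy model (assumption): any written block not known to be dead
    has to be copied elsewhere.
    """
    return len(set(written) - set(known_dead))


written = range(8)          # hypothetical LBAs written into one reclaim unit
fs_free = {1, 2, 5, 6, 7}   # blocks the filesystem has since freed
discarded = {5}             # the subset the drive learned about via discard

# Host-side gc can consult the filesystem's own allocation state.
host_copies = gc_copies(written, fs_free)

# Drive-side gc only knows about blocks that were explicitly discarded.
drive_copies = gc_copies(written, discarded)

print(host_copies, drive_copies)
```

In this made-up case the host relocates 3 blocks and the drive relocates
7. The gap is exactly the set of blocks that are free but were never
discarded, so it shrinks as discard coverage improves.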