[PATCHv10 0/9] write hints with nvme fdp, scsi streams
Christoph Hellwig
hch at lst.de
Sun Nov 10 22:48:41 PST 2024
On Fri, Nov 08, 2024 at 08:51:31AM -0700, Keith Busch wrote:
> You're getting fragmentation anyway, which is why you had to implement
> gc.
A general purpose file system always has fragmentation of some kind,
even if it manages to avoid it for certain workloads with cooperative
applications.
If there were magic pixie dust to ensure free space never fragments,
file system development would be a solved problem :)
> You're just shifting who gets to deal with it from the controller to
> the host. The host is further from the media, so you're starting from a
> disadvantage.
And the controller is further from the application and misses a lot of
information like say the file structure, so it inherently is at a
disadvantage.
> The host gc implementation would have to be quite a bit
> better to justify the link and memory usage necessary for the copies
That assumes you still have to do device GC.  If you align writes and
frees to the zone/erase (super)block/reclaim unit boundaries you don't.
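A rough sketch of what I mean (the names and the region bookkeeping are
made up for illustration, this is not the actual xfs code): the
allocator hands out space sequentially inside one reclaim-unit-sized
region at a time and never lets an allocation straddle a boundary, so
space is written and later freed at exactly the granularity the device
erases it, and there is nothing left for the device to garbage collect:

#include <stdbool.h>
#include <stdint.h>

struct region {
	uint64_t start;		/* byte offset of the region on the device */
	uint64_t write_ptr;	/* next free byte inside the region */
	bool	 full;		/* region completely written, freed as a whole */
};

/*
 * Allocate len bytes inside a region, or fail so the caller opens the
 * next region.  Writes always fill a region sequentially and nothing
 * ever crosses a reclaim unit / zone boundary.
 */
static uint64_t alloc_in_region(struct region *r, uint64_t len,
				uint64_t ru_bytes)
{
	uint64_t off;

	if (r->write_ptr + len > r->start + ru_bytes)
		return UINT64_MAX;

	off = r->write_ptr;
	r->write_ptr += len;
	if (r->write_ptr == r->start + ru_bytes)
		r->full = true;
	return off;
}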
> This xfs implementation also has logic to recover from a power fail. The
> device already does that if you use the LBA abstraction instead of
> tracking sequential write pointers and free blocks.
Every file system has logic to recover from a power fail. I'm not sure
what kind of discussion you're trying to kick off here.
> I think you are underestimating the duplication of efforts going on
> here.
I'm still not sure what discussion you're trying to start here.
There is very little work in here, and it is work required to support
SMR drives anyway.  It turns out that for a fair number of workloads it
also works really well on SSDs, beating everything else we've tried.
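For reference, the user-visible knob this series builds on is just the
existing per-file write lifetime hint; roughly (file name made up, the
fcntl command and RWH_* values are the existing uapi, fallback defines
only for old headers):

#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

#ifndef F_SET_RW_HINT
#define F_SET_RW_HINT		(1024 + 12)
#endif
#ifndef RWH_WRITE_LIFE_SHORT
#define RWH_WRITE_LIFE_SHORT	2
#endif

int main(void)
{
	uint64_t hint = RWH_WRITE_LIFE_SHORT;	/* data expected to be rewritten soon */
	int fd = open("journal.log", O_WRONLY | O_CREAT, 0644);

	if (fd < 0)
		return 1;
	if (fcntl(fd, F_SET_RW_HINT, &hint) < 0)
		perror("F_SET_RW_HINT");	/* fs/device may not support hints */

	/* subsequent writes to fd carry the lifetime hint down the stack */
	close(fd);
	return 0;
}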