[PATCHv10 9/9] scsi: set permanent stream count in block limits

Christoph Hellwig hch at lst.de
Wed Oct 30 09:57:08 PDT 2024


On Wed, Oct 30, 2024 at 10:42:59AM -0600, Keith Busch wrote:
> On Wed, Oct 30, 2024 at 04:50:52PM +0100, Christoph Hellwig wrote:
> > On Wed, Oct 30, 2024 at 09:48:39AM -0600, Keith Busch wrote:
> > > What??? You said to map the temperature hints to a write stream. The
> > > driver offers that here. But you specifically don't want that? I'm so
> > > confused.
> > 
> > In bdev/fops.c (or file systems if they want to do that) not down in the
> > driver forced down everyone's throat.  Which was the whole point of the
> > discussion that we're running in circles here.
> 
> That makes no sense. A change completely isolated to a driver isn't
> forcing anything on anyone. It's the upper layers that are forcing this
> down, whether the driver uses it or not: the hints are already getting
> to the driver, but the driver currently doesn't use it.

And once it uses them by default, taking that away will have someone scream
regression, because we're now taking it away from that super special
use case.

> Here's something recent from rocksdb developers running ycsb workloada
> benchmark. The filesystem used is XFS.

Thanks for finally putting something up.

> It sets temperature hints for different SST levels, which already
> happens today. The last data point made some minor changes with
> level-to-hint mapping.

Do you have a pointer to the changes?

> Without FDP:
> 
> WAF:        2.72
> IOPS:       1465
> READ LAT:   2681us
> UPDATE LAT: 3115us
> 
> With FDP (rocksdb unmodified):
> 
> WAF:        2.26
> IOPS:       1473
> READ LAT:   2415us
> UPDATE LAT: 2807us
> 
> With FDP (with some minor rocksdb changes):
> 
> WAF:        1.67
> IOPS:       1547
> READ LAT:   1978us
> UPDATE LAT: 2267us

Compared to the numbers Hans presented at Plumbers for the Zoned XFS code,
which roughly double read and write IOPS and reduce the WAF to almost 1,
this doesn't look too spectacular to be honest, but it sure is something.
And the Zoned XFS code should work just fine with FDP IFF we exposed real
write streams.
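
For scale, the relative changes in the quoted figures can be worked out
with a short script.  This is only a sketch: the dictionary keys and
helper names are made up for illustration, and only the numbers come from
the quoted results.

```python
# Figures copied from the quoted YCSB results (XFS, rocksdb).
# The run and metric names below are illustrative labels, not from the thread.
RUNS = {
    "no_fdp":         {"waf": 2.72, "iops": 1465, "read_lat_us": 2681, "update_lat_us": 3115},
    "fdp_unmodified": {"waf": 2.26, "iops": 1473, "read_lat_us": 2415, "update_lat_us": 2807},
    "fdp_modified":   {"waf": 1.67, "iops": 1547, "read_lat_us": 1978, "update_lat_us": 2267},
}


def pct_change(baseline, value):
    """Relative change of value against baseline, in percent."""
    return (value - baseline) / baseline * 100.0


def compare(baseline_name, other_name):
    """Per-metric percentage change of one run against a baseline run."""
    base = RUNS[baseline_name]
    other = RUNS[other_name]
    return {metric: round(pct_change(base[metric], other[metric]), 1)
            for metric in base}


if __name__ == "__main__":
    for name in ("fdp_unmodified", "fdp_modified"):
        print(name, compare("no_fdp", name))
```

With these inputs the WAF drops by roughly 17% with unmodified rocksdb and
roughly 39% with the modified mapping, while IOPS rise by about 0.5% and
5.6% respectively.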

I just wish we could get the real infrastructure instead of some
band-aid, which makes it really hard to expose the real thing because now
it's been taken up and directly wired to a UAPI.



More information about the Linux-nvme mailing list