FYI: path walking optimizations pending for 6.11
Linus Torvalds
torvalds at linux-foundation.org
Wed Jun 19 15:08:47 PDT 2024
On Wed, 19 Jun 2024 at 13:45, Matthew Wilcox <willy at infradead.org> wrote:
>
> Funnily, I'm working on rosebush v2 today. It's in no shape to send out
> (it's failing ~all of its selftests) but *should* greatly improve the
> cache friendliness of the hash table. And it's being written with the
> dcache as its first customer.
I'm interested to see if you can come up with something decent, but
I'm not hugely optimistic.
>From what I saw, you planned on comparing with rhashtable hash chains of 10.
But that's not what the dentry cache uses at all. rhashtable is way
too slow. It's been ages since I ran the numbers, but the dcache array
is just sized to be "large enough".
In fact, my comment about my workload being better if the hash table
was smaller was because we really are pretty aggressive with the
dcache hash table size. I think our scaling factor is 13 - as in "one
entry per 8kB of memory".
Which is almost certainly wasting memory, but name lookup really does
show up as a hot thing on many loads.
Anyway, what it means is that the dcache hash chain is usually *one*.
Not ten. And has none of the rhashtable overheads.
So if your "use linear lookups to make the lookup faster" depends on
comparing with ten entry chains of rhashtable, you might be in for a
very nasty surprise.
In my profiles, the first load of the hash table tends to be the
expensive one. Not the chain following.
Of course, my profiles are not only just one random load, they are
also skewed by the fact that I reboot so much. So maybe my dentry
cache just doesn't grow sufficiently big during my testing, and thus
my numbers are skewed even for just my own loads.
Benchmarking is hard.
Anyway, that was just a warning that if you're comparing against
rhashtable, you have almost certainly already lost before you even got
started.
Linus
More information about the linux-arm-kernel
mailing list