[PATCH 0/3] Revert arm64 cache geometry

Ard Biesheuvel ard.biesheuvel at linaro.org
Thu Oct 29 09:28:19 PDT 2015


> On 29 Oct 2015, at 16:43, Russell King - ARM Linux <linux at arm.linux.org.uk> wrote:
> 
>> On Thu, Oct 29, 2015 at 12:22:51PM +0900, Ard Biesheuvel wrote:
>> Fair enough. It is a bit disappointing that we cannot trust these
>> values, but if the architecture does not mandate their accuracy, we
>> obviously should not be using them in the way that we are.
>> 
>> I think we have similar code in the ARM tree, so we should probably
>> make some changes there as well.
> 
> I've opposed exporting the cache dimensions to userspace for several
> reasons:
> 

I agree with the arguments below. However, what I am referring to here
is kernel code that infers whether a given VIPT cache is non-aliasing
from its way size. That way size is calculated from values that are
exposed to software for the sole purpose of enumerating cache lines by
set/way.

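To make the inference concrete, here is a rough sketch of that kind of
check. It is not the actual kernel code: the helper names are
hypothetical, and the field layout assumed is the 32-bit CCSIDR format
(LineSize in bits [2:0], NumSets in bits [27:13]). The way size is the
line size times the number of sets, and a VIPT cache is treated as
non-aliasing when one way fits in a page.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: 32-bit CCSIDR format. All names here are hypothetical. */

/* LineSize, bits [2:0]: log2(words per line) - 2, with 4-byte words. */
static unsigned int ccsidr_line_bytes(uint32_t ccsidr)
{
    return 1u << ((ccsidr & 0x7) + 4);
}

/* NumSets, bits [27:13]: number of sets minus one. */
static unsigned int ccsidr_num_sets(uint32_t ccsidr)
{
    return ((ccsidr >> 13) & 0x7fff) + 1;
}

/* Size of one way in bytes, as implied by the register value. */
static unsigned long way_size(uint32_t ccsidr)
{
    return (unsigned long)ccsidr_line_bytes(ccsidr) * ccsidr_num_sets(ccsidr);
}

/*
 * If one way is no larger than a page, every index bit comes from the
 * page offset, so two virtual aliases of a page select the same set.
 * This is only as trustworthy as the register contents themselves.
 */
static bool vipt_looks_nonaliasing(uint32_t ccsidr, unsigned long page_size)
{
    return way_size(ccsidr) <= page_size;
}
```

For example, a value of 0x1FE00A decodes as 64-byte lines and 256 sets,
i.e. a 16 KB way, which this check would flag as aliasing with 4 KB
pages. The result is only meaningful if the reported geometry matches
the hardware, which is exactly what cannot be relied upon.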

> * it will become a nightmare with the various different register formats
>  to properly decode these values
> * older CPUs don't have the cache ID registers, so we'd need to augment
>  any export with additional static configuration somehow
> * I don't trust userland with this information to make the right choices,
>  especially when faced with further levels of caches.
> 
> The main reason for people wanting the cache dimensions has been "so we
> can select the optimal code for the CPU".  Given all the combinations
> of caches out there, I've always said selecting code based on one level
> of cache is totally insane, and userspace is better off doing some
> performance measurement of its implementations and selecting the most
> appropriate version.
> 
> There's many things that affect the performance of code paths with CPUs.
> It's not just about cache line size, but instruction latencies, write
> delays, branch prediction and so forth.  You can't _say_ "because this
> CPU has a 32K L1 cache, if I optimise my code as X it'll perform
> better everywhere with a 32K L1 cache than optimised Y."
> 
> Selecting code based on cache parameters is just wrong.
> 
> There may be cases where userspace would like to know the cache line
> size, so it can appropriately align data structures - but that again
> depends on what you're trying to achieve, and what if the L1 cache
> line size is different from the L2 cache line size...
> 
> It's a minefield, one which IMHO userspace shouldn't be trusted with.
> Userspace should assume the worst case cache line size seen in ARM
> CPUs and be done with it.
> 
> -- 
> FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
> according to speedtest.net.


