[PATCHv2] ARM64: Add AT_ARM64_MIDR to the aux vector
Pinski, Andrew
Andrew.Pinski at caviumnetworks.com
Wed Sep 2 10:21:24 PDT 2015
> On Sep 3, 2015, at 1:12 AM, Catalin Marinas <catalin.marinas at arm.com> wrote:
>
>> On Wed, Sep 02, 2015 at 10:52:05PM +0800, Andrew Pinski wrote:
>> On Wed, Sep 2, 2015 at 9:57 PM, Siarhei Siamashka <siarhei.siamashka at gmail.com> wrote:
> [...]
>>>>> On Wed, 2 Sep 2015 01:58:56 +0800 pinskia at gmail.com wrote:
>>>>>> Yes, but I guess you are talking about caching the value in userspace;
>>>>>> doing it via the aux vector is the same as your suggestion. The one
>>>>>> difference is that you don't get the aux vector entry if an online CPU
>>>>>> has a different MIDR. Otherwise it is no different from your suggestion
>>>>>> of caching it. Leaving hotplug aside for a second (that is a
>>>>>> huge, different issue altogether), if userland wants to know whether all
>>>>>> online CPUs have the same MIDR, it would either read /sys entries (lots
>>>>>> of syscalls) or bind to each CPU and do the trap. That means at least
>>>>>> two or three syscalls/traps for each CPU. My way needs none and gives
>>>>>> the MIDR value for free if they are all the same.
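
A minimal sketch of the lookup described in the quoted paragraph, for illustration only. The entry is absent when the online CPUs differ, so a failed lookup is the signal to fall back to per-CPU probing. The tag value AT_ARM64_MIDR_SKETCH and the struct aux_entry layout are assumptions made for this example, not allocated ABI; a real process would more likely call getauxval() from <sys/auxv.h> than walk an array by hand.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical tag for the proposed entry; NOT an allocated AT_ number. */
#define AT_ARM64_MIDR_SKETCH 38UL

/* Simplified stand-in for one aux vector entry (assumption for this sketch). */
struct aux_entry {
    unsigned long type;
    unsigned long val;
};

/*
 * Scan an aux vector terminated by a type-0 (AT_NULL) entry.
 * Returns true and stores the value in *val if `type` is present.
 */
bool aux_lookup(const struct aux_entry *auxv, unsigned long type,
                unsigned long *val)
{
    assert(auxv != NULL && val != NULL);
    for (; auxv->type != 0UL; auxv++) {
        if (auxv->type == type) {
            *val = auxv->val;
            return true;
        }
    }
    return false; /* entry absent: CPUs differ, or the kernel lacks the entry */
}
```

The lookup costs no syscall or trap, since the kernel places the aux vector in the process image at exec time.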
>
> Whether we like it or not, big.LITTLE systems are present in the ARM
> ecosystem. You are looking to add a specific AT_ to solve a particular
> problem on fully symmetric systems, ignoring the rest. I want this fixed
> for all systems rather than trying to invent something else for
> big.LITTLE which won't help user space at all.
>
> If you want to avoid parsing all /sys entries, I would rather have a
> HWCAP_ASYMMETRIC bit for big.LITTLE systems and let user space decide
> whether to read all entries or not.
>
>>> I wonder if we still can try to make "sched_getcpu()" use vDSO
>>> instead of the syscall to make it faster? Now that there exists a
>>> vDSO implementation for gettimeofday(), everything should be easy if
>>> we can find an unused userspace readable coprocessor register.
>>> In the past, Catalin Marinas mentioned that "We have a user read-only
>>> thread register unused on arm64":
>>> http://lists.infradead.org/pipermail/linux-arm-kernel/2013-December/220664.html
>
> We have a patch under development internally, it will appear on the list
> at some point.
>
>>> And if I understand it correctly, also one of the "scratch registers"
>>> should exist in 32-bit ARM, which "isn't present in ARMv8/AArch64":
>>> http://lists.infradead.org/pipermail/linux-arm-kernel/2013-December/221056.html
>>> What kind of registers are these exactly?
>>>
>>> In principle, the aux vector can be extended to also contain a pointer
>>> to an array of MIDR values for all the CPU cores if reducing the setup
>>> overhead is critical.
>>
>> That is not a bad idea. Put this array in the data section of the
>> VDSO too. It should be small enough, though on systems with 96 or more
>> cores (dual-socket ThunderX has 96 cores total) it is getting
>> slightly big.
>> The struct would be something like:
>> struct
>> {
>>     int32 numcores;
>>     int32 midr[];
>> };
>
> First of all, I'm against hard-coding (VDSO) data as ABI. So far we used
> VDSO to override some weak glibc functions but the VDSO-specific data is
> parsed by the VDSO function implementation and not directly by glibc (or
> user space). I prefer helper functions that read the VDSO-internal data
> structures.
You don't like the idea of a fixed-structure ABI that resides inside the
vDSO data? Having a fixed struct ABI should be OK. The location inside
the data part would be passed via an aux vector entry. Userland does
not even need to know that it is located in the vDSO at all; it just
happens to reside there. The data structure would be well defined for
the aux vector.
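
As a rough sketch of how userland might consume such a table (an illustration under assumptions, not a proposed ABI): the struct name midr_table and the helper midr_if_uniform are hypothetical, and fixed-width uint32_t is substituted for the `int32` written in the thread. The pointer would come from the aux vector entry; userland never needs to know where the kernel keeps the data.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Layout from the thread, with fixed-width types (assumption). */
struct midr_table {
    uint32_t numcores;
    uint32_t midr[];   /* one MIDR_EL1 value per core */
};

/*
 * Returns true and stores the common value in *out only if every
 * listed core reports the same MIDR; an empty table counts as "no".
 */
bool midr_if_uniform(const struct midr_table *t, uint32_t *out)
{
    assert(t != NULL && out != NULL);
    if (t->numcores == 0)
        return false;
    for (uint32_t i = 1; i < t->numcores; i++) {
        if (t->midr[i] != t->midr[0])
            return false;
    }
    *out = t->midr[0];
    return true;
}
```

With this layout, a 96-core system would need 4 + 96 * 4 = 388 bytes for the table.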
>
> Secondly, you seem to be only interested in MIDR_EL1 but we also have
> REVIDR_EL1 and AIDR_EL1 which may be relevant. Once we realise that more
> information is needed, it's not always clear where the boundaries are
> so I would rather have this exposed via /sys and/or MRS emulation (there
> are patches for both).
>
> Anyway, you need to involve the toolchain people in such discussions,
> they may have different needs (like ifunc).
I am a toolchain person first, and I needed this in the first place for
memset and memcpy on ThunderX. I asked our kernel engineers here before
going forward with my original approach. I only touch the kernel
because I really need to. I also asked on #linaro-kernel and
#linaro-toolchain about this approach two or three months ago, and I
only had time to post it this week.
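
To make the memset/memcpy use case concrete, here is a hedged sketch of MIDR-based selection of the kind an ifunc resolver might perform. The names select_memcpy and memcpy_thunderx are hypothetical, the tuned routine is only a placeholder that forwards to memcpy, and the Cavium implementer and ThunderX part-number constants are the values I believe Linux uses; treat them as assumptions. The bit positions follow the architectural MIDR_EL1 layout (implementer in bits 31:24, part number in bits 15:4).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* MIDR_EL1 field extraction. */
#define MIDR_IMPLEMENTER(m) (((m) >> 24) & 0xffu)
#define MIDR_PARTNUM(m)     (((m) >> 4) & 0xfffu)

#define IMP_CAVIUM    0x43u   /* assumed implementer code for Cavium */
#define PART_THUNDERX 0x0a1u  /* assumed ThunderX part number */

/* Compile-time check of the decoding on a constructed example value. */
static_assert(MIDR_PARTNUM(0x431F0A10u) == PART_THUNDERX,
              "part number field decodes as expected");

typedef void *(*memcpy_fn)(void *, const void *, size_t);

/* Placeholder for a ThunderX-tuned routine; it just forwards to memcpy. */
static void *memcpy_thunderx(void *dst, const void *src, size_t n)
{
    return memcpy(dst, src, n);
}

/* Pick an implementation from a MIDR value, falling back to plain memcpy. */
memcpy_fn select_memcpy(uint32_t midr)
{
    if (MIDR_IMPLEMENTER(midr) == IMP_CAVIUM &&
        MIDR_PARTNUM(midr) == PART_THUNDERX)
        return memcpy_thunderx;
    return memcpy;
}
```

A resolver like this runs once at symbol-binding time, which is why a cheap source for the MIDR value matters more than it would for a one-off query.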
Thanks,
Andrew
>
> --
> Catalin