[PATCH] arm64: Add the arm64.nolse_atomics command line option

Aiqun(Maria) Yu quic_aiquny at quicinc.com
Wed Jul 12 19:24:24 PDT 2023


On 7/12/2023 3:36 PM, Mark Rutland wrote:
> On Wed, Jul 12, 2023 at 11:09:10AM +0800, Aiqun(Maria) Yu wrote:
>> On 7/11/2023 6:25 PM, Will Deacon wrote:
>>> On Tue, Jul 11, 2023 at 06:15:49PM +0800, Aiqun(Maria) Yu wrote:
>>>> On 7/11/2023 4:22 PM, Will Deacon wrote:
>>>>> On Tue, Jul 11, 2023 at 12:02:22PM +0800, Aiqun(Maria) Yu wrote:
>>>>>> On 7/10/2023 5:37 PM, Will Deacon wrote:
>>>>>>> On Mon, Jul 10, 2023 at 01:59:55PM +0800, Maria Yu wrote:
>>>>>>>> In order to be able to disable lse_atomic even if cpu
>>>>>>>> support it, most likely because of memory controller
>>>>>>>> cannot deal with the lse atomic instructions, use a
>>>>>>>> new idreg override to deal with it.
>>>>>>>
>>>>>>> This should not be a problem for cacheable memory though, right?
>>>>>>>
>>>>>>> Given that Linux does not issue atomic operations to non-cacheable mappings,
>>>>>>> I'm struggling to see why there's a problem here.
>>>>>>
>>>>>> The lse atomic operation can be issued on non-cacheable mappings as well.
>>>>>> Even if it is cached data, with different CPUECTLR_EL1 setting, it can also
>>>>>> do far lse atomic operations.
>>>>>
>>>>> Please can you point me to the place in the kernel sources where this
>>>>> happens? The architecture doesn't guarantee that atomics to non-cacheable
>>>>> mappings will work, see "B2.2.6 Possible implementation restrictions on
>>>>> using atomic instructions". Linux, therefore, doesn't issue atomics
>>>>> to non-cacheable memory.
>>>>
>>>> We encounter the issue on third party kernel modules and third party apps
>>>> instead of linux kernel itself.
>>>
>>> Great, so there's nothing to do in the kernel then!
>>>
>>> The third party code needs to be modified not to use atomic instructions
>>> with non-cacheable mappings. No need to involve us with that.
>>
>>>> This is a tradeoff of performance and stability. Per my understanding,
>>>> options can be used to enable the lse_atomic to have the most performance
>>>> cared system, and disable the lse_atomic by stability cared most system.
>>>
>>> Where do livelock and starvation fit in with "stability"? Disabling LSE
>>> atomics for things like qspinlock and the scheduler just because of some
>>> badly written third-party code isn't much of a tradeoff.
> 
>> We also have requirement to have cpus/system fully support lse atomic and
>> cpus/system not fully support lse atomic with a generic kernel image.
> 
> Who *specifically* has this requirement (i.e. what does 'we' mean here)? The

I can describe the requirement without using "we".

There are requirements such as Android's Google GKI, which requires
systems built on different CPU architectures to use the same generic
kernel Image.

> upstream kernel does not require that atomics work on non-cacheable memory, and

The same issue applies to cacheable memory: the system can go down when
a far atomic is needed but LSE atomics are not supported by the system.

> saying "The company I work for want this" doesn't change that.
> 
> AFAICT the system here is architecturally compliant, and what you're relying
> upon something that the architecture doesn't guarantee, and Linux doesn't
> guarantee.

It is not only our company's problem either:
To support the atomic instructions added in the Armv8.1 architecture,
CHI-B provides Atomic Transactions, but Atomic Transactions support is
also *optional* in CHI-B.

So far atomics may not be fully supported even on an ARMv8.1 CPU + CHI-B
system.

from: 
https://developer.arm.com/documentation/102407/0100/Atomic-operations?lang=en 

So CPU support for atomics alone cannot guarantee that the system
supports LSE atomics.
> 
>> Same kernel module wanted to be used by lse atomic fully support cpu and not
>> fully support cpu/system as well.
> 
> Which kernel modules *specifically* need to do atomics to non-cacheable memory?
The driver wants to always do far atomics (not speculatively), using a
single instruction to perform a non-interruptible read-modify-write
sequence.
> 
>> That's why we want to have a runtime option here.
> 
> As per other replies, a runtime option doesn't solve the issue you have
> described, and it will adversely affect the system in other ways (e.g. the
> livelock and starvation issues will mentioned, which we have seen with
> LDXR+STXR atomics).
I have also encountered livelock issues myself, caused by the unfairness
of LDXR+STXR atomics. They are more likely to happen between CPUs of
different performance. So I am also glad to use atomics instead of
exclusive accesses.
So if there is a way to fully utilize the atomic instructions on
current hardware while also supporting far atomics, that would be a much
better solution than disabling the feature as is done currently.
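
To illustrate the two styles being compared, here is a minimal userspace
C11 sketch (not kernel code; the function names are made up for this
example). The first helper is a single atomic read-modify-write, which a
compiler may turn into one LSE instruction when targeting Armv8.1 or
later. The second is a load/compute/store retry loop, which has the same
shape as an exclusive-access sequence and can retry indefinitely under
contention. The exact instructions emitted depend on the compiler and
its target options.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Single atomic read-modify-write: returns the value before the add.
 * Assumption: with LSE enabled in the compiler target this may become
 * one atomic add instruction instead of a loop. */
static uint32_t counter_add_rmw(_Atomic uint32_t *ctr, uint32_t n)
{
    return atomic_fetch_add_explicit(ctr, n, memory_order_relaxed);
}

/* Retry loop: read the value, compute the new one, and try to store it.
 * If another CPU changed *ctr in between, the store fails, 'old' is
 * refreshed with the current value, and the loop tries again. */
static uint32_t counter_add_retry(_Atomic uint32_t *ctr, uint32_t n)
{
    uint32_t old = atomic_load_explicit(ctr, memory_order_relaxed);

    while (!atomic_compare_exchange_weak_explicit(ctr, &old, old + n,
                                                  memory_order_relaxed,
                                                  memory_order_relaxed))
        ; /* 'old' now holds the latest value; retry with it */

    return old;
}
```

Both helpers give the same result when uncontended; the difference is
that the retry loop offers no fairness between CPUs, which is where the
livelock concern comes from.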
> 
> Thanks,
> Mark.

Please feel free to comment. Hopefully our discussion will lead to a
reasonable and usable solution.

-- 
Thx and BRs,
Aiqun(Maria) Yu



