[PATCH v4] arm64: Add support for new control bits CTR_EL0.DIC and CTR_EL0.IDC

Robin Murphy robin.murphy at arm.com
Thu Feb 22 08:53:12 PST 2018


On 22/02/18 16:33, Mark Rutland wrote:
> On Thu, Feb 22, 2018 at 04:28:03PM +0000, Robin Murphy wrote:
>> [Apologies to keep elbowing in, and if I'm being thick here...]
>>
>> On 22/02/18 15:22, Mark Rutland wrote:
>>> On Thu, Feb 22, 2018 at 08:51:30AM -0600, Shanker Donthineni wrote:
>>>> +#define CTR_B31_SHIFT		31
>>>
>>> Since this is just a RES1 bit, I think we don't need a mnemonic for it,
>>> but I'll defer to Will and Catalin on that.
>>>
>>>>    ENTRY(invalidate_icache_range)
>>>> +#ifdef CONFIG_ARM64_SKIP_CACHE_POU
>>>> +alternative_if ARM64_HAS_CACHE_DIC
>>>> +	mov	x0, xzr
>>>> +	dsb	ishst
>>>> +	isb
>>>> +	ret
>>>> +alternative_else_nop_endif
>>>> +#endif
>>>
>>> As commented on v3, I don't believe you need the DSB here. If prior
>>> stores haven't been completed at this point, the existing implementation
>>> would not work correctly here.
>>
>> True in terms of ordering between stores prior to entry and the IC IVAU
>> itself, but what about the DSB ISH currently issued *after* the IC IVAU
>> before returning? Is it provably impossible that existing callers might be
>> relying on that ordering *anything*, or would we risk losing something
>> subtle by effectively removing it?
> 
> AFAIK, the only caller of this is KVM, before page table updates occur
> to add execute permissions to the page this is applied to.
> 
> At least in that case, I do not believe there would be breakage.
> 
> If we're worried about subtleties in callers, then we'd need to stick
> with DSB ISH rather than optimising to DSB ISHST.

Hmm, I probably am just squawking needlessly. It is indeed hard to 
imagine how callers could be relying on invalidating the I-cache for 
ordering unless doing something unreasonably stupid, and if the current 
caller is clearly OK then there should be nothing to worry about.

This *has* helped me realise that I was indeed being somewhat thick 
before, because the existing barrier is of course not about memory 
ordering per se, but about completing the maintenance operation. Hooray 
for overloaded semantics...
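
For illustration only (this is not code from the patch), here is a rough C
sketch of how the two CTR_EL0 bits map onto the maintenance steps being
discussed. The bit positions (IDC = bit 28, DIC = bit 29) are the
architectural ones; the struct, the macro names and plan_icache_sync() are
hypothetical names made up for this example, and the sketch ignores the
case where IDC behaviour is implied by CLIDR_EL1 rather than by the bit
itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural bit positions in CTR_EL0 (local names for this sketch). */
#define SKETCH_CTR_IDC_SHIFT	28	/* D-cache clean to PoU not required */
#define SKETCH_CTR_DIC_SHIFT	29	/* I-cache invalidate to PoU not required */

/* Hypothetical summary of which maintenance steps are still needed. */
struct icache_sync_plan {
	bool need_dc_cvau;	/* clean D-cache by VA to PoU */
	bool need_ic_ivau;	/* invalidate I-cache by VA to PoU */
};

/*
 * Decode a raw CTR_EL0 value. A set bit means the corresponding
 * maintenance operation can be skipped, so each "need" flag is the
 * inverse of its bit.
 */
static struct icache_sync_plan plan_icache_sync(uint64_t ctr_el0)
{
	struct icache_sync_plan plan;

	plan.need_dc_cvau = !((ctr_el0 >> SKETCH_CTR_IDC_SHIFT) & 1);
	plan.need_ic_ivau = !((ctr_el0 >> SKETCH_CTR_DIC_SHIFT) & 1);
	return plan;
}
```

When need_ic_ivau is false there is no maintenance operation left for a
trailing DSB to complete, which is the point made above about what the
existing barrier is actually for.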

On a different track, I'm now wondering whether the extra complexity of 
these alternatives might justify removing some obvious duplication and 
letting __flush_cache_user_range() branch directly into 
invalidate_icache_range(), or might that adversely affect the user fault 
fixup path?

Robin.


