[RFC] arm64: Enforce observed order for spinlock and data

bdegraaf at codeaurora.org bdegraaf at codeaurora.org
Wed Oct 5 08:11:43 PDT 2016


On 2016-10-05 10:55, bdegraaf at codeaurora.org wrote:
> On 2016-10-04 15:12, Mark Rutland wrote:
>> Hi Brent,
>> 
>> Could you *please* clarify if you are trying to solve:
>> 
>> (a) a correctness issue (e.g. data corruption) seen in practice.
>> (b) a correctness issue (e.g. data corruption) found by inspection.
>> (c) A performance issue, seen in practice.
>> (d) A performance issue, found by inspection.
>> 
>> Any one of these is fine; we just need to know in order to be able to
>> help effectively, and so far it hasn't been clear.
>> 
>> On Tue, Oct 04, 2016 at 01:53:35PM -0400, bdegraaf at codeaurora.org 
>> wrote:
>>> After looking at this, the problem is not with the lockref code per
>>> se: it is a problem with arch_spin_value_unlocked(). In the
>>> out-of-order case, arch_spin_value_unlocked() can return TRUE for a
>>> spinlock that is in fact locked but the lock is not observable yet 
>>> via
>>> an ordinary load.
>> 
>> Given arch_spin_value_unlocked() doesn't perform any load itself, I
>> assume the ordinary load that you are referring to is the READ_ONCE()
>> early in CMPXCHG_LOOP().
>> 
>> It's worth noting that even if we ignore ordering and assume a
>> sequentially-consistent machine, READ_ONCE() can give us a stale 
>> value.
>> We could perform the read, then another agent can acquire the lock, 
>> then
>> we can move onto the cmpxchg(), i.e.
>> 
>>     CPU0                              CPU1
>>     old = READ_ONCE(x.lock_val)
>>                                       spin_lock(x.lock)
>>     cmpxchg(x.lock_val, old, new)
>>                                       spin_unlock(x.lock)
>> 
>> If the 'old' value is stale, the cmpxchg *must* fail, and the cmpxchg
>> should return an up-to-date value which we will then retry with.
>> 
>>> Other than ensuring order on the locking side (as the prior patch
>>> did), there is a way to make arch_spin_value_unlock's TRUE return
>>> value deterministic,
>> 
>> In general, this cannot be made deterministic. As above, there is a 
>> race
>> that cannot be avoided.
>> 
>>> but it requires that it does a write-back to the lock to ensure we
>>> didn't observe the unlocked value while another agent was in process
>>> of writing back a locked value.
>> 
>> The cmpxchg gives us this guarantee. If it successfully stores, then 
>> the
>> value it observed was the same as READ_ONCE() saw, and the update was
>> atomic.
>> 
>> There *could* have been an intervening sequence between the READ_ONCE
>> and cmpxchg (e.g. put(); get()) but that's not problematic for 
>> lockref.
>> Until you've taken your reference it was possible that things changed
>> underneath you.
>> 
>> Thanks,
>> Mark.
> 
> Mark,
> 
> I found the problem.
> 
> Back in September of 2013, arm64 atomics were broken due to missing 
> barriers
> in certain situations, but the problem at that time was undiscovered.
> 
> Will Deacon's commit d2212b4dce596fee83e5c523400bf084f4cc816c went in 
> at that
> time and changed the correct cmpxchg64 in lockref.c to 
> cmpxchg64_relaxed.
> 
> d2212b4 appeared to be OK at that time because the additional barrier
> requirements of this specific code sequence were not yet discovered, 
> and
> this change was consistent with the arm64 atomic code of that time.
> 
> Around February of 2014, some discovery led Will to correct the problem 
> with
> the atomic code via commit 8e86f0b409a44193f1587e87b69c5dcf8f65be67, 
> which
> has an excellent explanation of potential ordering problems with the 
> same
> code sequence used by lockref.c.
> 
> With this updated understanding, the earlier commit
> (d2212b4dce596fee83e5c523400bf084f4cc816c) should be reverted.
> 
> Because acquire/release semantics are insufficient for the full 
> ordering,
> the single barrier after the store exclusive is the best approach, 
> similar
> to Will's atomic barrier fix.
> 
> Best regards,
> Brent

FYI, this is a "b" type fix (correctness fix based on code inspection).




More information about the linux-arm-kernel mailing list