CAS implementation may be broken
Russell King - ARM Linux
linux at arm.linux.org.uk
Mon Nov 23 15:06:20 EST 2009
On Mon, Nov 23, 2009 at 08:10:51PM +0100, Toby Douglass wrote:
> Russell King - ARM Linux wrote:
>> First time around the loop, let's say %3 = 1 and *(u32 *)%2 = 1.
>> ldrex %1, [%2]
>> %1 = *(u32 *)%2 (= 1)
>> mov %0, #0
>> %0 = 0
>> teq %1, %3
>> %3 == %1? (yes)
>> strexeq %0, %4, [%2]
>> executed but because of the other access,
>> exclusivity fails. *(u32 *)%2 not written
>> and %0 = 1
>> So, res = 1, and we go around the loop again. Let's say that *(u32 *)%2 = 2
> No - we're dealing with the ABA problem. We're assuming here that this
> thread gets to retry with the same values.
>> I haven't had time to read all your email properly (due to the need to
>> get on a conference call), but please tell me where the problem is above
>> (using a similar worked example).
> So; we go around again, load %2, do the teq, which succeeds, then the
> strexeq, which now succeeds since no-one else has touched %2.
> This was the thrust of the original post; however, Catalin has raised
> arguments against it which I have not yet digested, so what I'm writing
> here, where it is simply an enlargement on the OP, has the same flaws;
> it's only in response to your specific point. I'm not trying to assert
> this *is* what happens, in spite of what Catalin has written.
Well, I've thought it through quite a bit now, and I have an expansive
reply to your email. In summary, there is nothing wrong with the
existing code; your use of it is the problem.
I can post the expansive reply if you need the details.
In short, consider what happens with a slightly different order of
operations, where you have calculated 'ptr', 'old' and 'new' for cmpxchg
but you haven't yet executed the first ldrex.
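
To make that ordering concrete, here is a minimal single-threaded sketch in
C11, offered as an illustration and not as the kernel code. The names
cmpxchg_sketch() and aba_demo() are hypothetical, and the other CPU's
interference is simulated inline with two plain stores. It assumes the
interfering writes land after 'old' and 'new' have been calculated but before
the compare-and-swap begins, so no exclusive access is ever disturbed.

```c
#include <stdatomic.h>

/*
 * Hypothetical stand-in for cmpxchg (not the kernel implementation):
 * store new_val to *ptr only if *ptr == old, and return the value
 * that was observed at *ptr.
 */
static unsigned int cmpxchg_sketch(_Atomic unsigned int *ptr,
                                   unsigned int old, unsigned int new_val)
{
    unsigned int expected = old;

    /* On failure, 'expected' is overwritten with the current value. */
    atomic_compare_exchange_strong(ptr, &expected, new_val);
    return expected;
}

/* Returns 1 if the compare-and-swap succeeded despite the A -> B -> A change. */
static int aba_demo(void)
{
    _Atomic unsigned int word = 1;

    /* The caller calculates 'old' and 'new' first. */
    unsigned int old = atomic_load(&word);   /* old = 1 */
    unsigned int new_val = old + 10;         /* new = 11 */

    /* Simulated interference from another CPU, before any ldrex runs. */
    atomic_store(&word, 2);                  /* A -> B */
    atomic_store(&word, 1);                  /* B -> A */

    /* The compare sees the value 1 again, so the store goes ahead. */
    unsigned int seen = cmpxchg_sketch(&word, old, new_val);

    return seen == old && atomic_load(&word) == new_val;
}
```

Under these assumptions aba_demo() returns 1: the swap succeeds even though
the word was written twice in between, because only values are compared.
Whether that matters depends on what the caller derived from 'old'.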