[PATCH v2] arm64: kernel: implement fast refcount checking
Ard Biesheuvel
ard.biesheuvel at linaro.org
Wed Jul 26 06:25:39 PDT 2017
> On 26 Jul 2017, at 10:21, Li Kun <hw.likun at huawei.com> wrote:
>
>
>
>> On 2017/7/26 16:40, Ard Biesheuvel wrote:
>>> On 26 July 2017 at 05:11, Li Kun <hw.likun at huawei.com> wrote:
>>> Hi Ard,
>>>
>>>
>>>> on 2017/7/26 2:15, Ard Biesheuvel wrote:
>>>> +#define REFCOUNT_OP(op, asm_op, cond, l, clobber...) \
>>>> +__LL_SC_INLINE int \
>>>> +__LL_SC_PREFIX(__refcount_##op(int i, atomic_t *r)) \
>>>> +{ \
>>>> + unsigned long tmp; \
>>>> + int result; \
>>>> + \
>>>> + asm volatile("// refcount_" #op "\n" \
>>>> +" prfm pstl1strm, %2\n" \
>>>> +"1: ldxr %w0, %2\n" \
>>>> +" " #asm_op " %w0, %w0, %w[i]\n" \
>>>> +" st" #l "xr %w1, %w0, %2\n" \
>>>> +" cbnz %w1, 1b\n" \
>>>> + REFCOUNT_CHECK(cond) \
>>>> + : "=&r" (result), "=&r" (tmp), "+Q" (r->counter) \
>>>> + : REFCOUNT_INPUTS(r) [i] "Ir" (i) \
>>>> + clobber); \
>>>> + \
>>>> + return result; \
>>>> +} \
>>>> +__LL_SC_EXPORT(__refcount_##op);
>>>> +
>>>> +REFCOUNT_OP(add_lt, adds, mi, , REFCOUNT_CLOBBERS);
>>>> +REFCOUNT_OP(sub_lt_neg, adds, mi, l, REFCOUNT_CLOBBERS);
>>>> +REFCOUNT_OP(sub_le_neg, adds, ls, l, REFCOUNT_CLOBBERS);
>>>> +REFCOUNT_OP(sub_lt, subs, mi, l, REFCOUNT_CLOBBERS);
>>>> +REFCOUNT_OP(sub_le, subs, ls, l, REFCOUNT_CLOBBERS);
>>>> +
>>> I'm not quite sure whether using b.lt to check that the result of adds is
>>> less than zero is correct.
>>> b.lt means N != V, so take an extreme example: if we operate as below,
>>> b.lt will also be true.
>>>
>>> refcount_set(&ref_c,0x80000000);
>>> refcount_dec_and_test(&ref_c);
>>>
>>> Maybe we should use PL/NE/MI/EQ to check the LT_ZERO or LE_ZERO conditions?
>>>
>> The lt/le is confusing here: the actual condition codes used are mi
>> for negative and ls for negative or zero.
>>
>> I started out using the lt and le condition codes, because that matches
>> the x86 code, but I moved to mi and ls instead. (I kept the lt/le naming,
>> though: I don't think it makes sense to deviate from the x86 naming just
>> because the flags and predicates work a bit differently.)
>>
>> However, I see now that there is one instance of REFCOUNT_CHECK(lt)
>> remaining (in refcount.h). That should be mi as well.
> Sorry for having misunderstood your implementation.
> If we want to catch a refcount_sub that operates on a negative value (like -1 to -2), I think b.ls can't achieve that.
You are right. Decrementing a refcount which is already negative will not be caught. Is that a problem?
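To make the flags explicit for that case (decrementing a counter that already holds -1, assuming a 32-bit counter and a decrement of 1; <error> stands for wherever the check would branch to):

  // w0 = 0xffffffff (-1)
  subs  w0, w0, #1     // w0 = 0xfffffffe (-2); N=1 Z=0 C=1 V=0
  b.ls  <error>        // ls = (C clear || Z set) -> not taken, underflow missed
  b.mi  <error>        // mi = (N set) -> would be taken

so ls on its own lets -1 -> -2 slip through, while eq/mi would catch it.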
> I think (b.eq || b.mi) may be better than b.ls for this case.
>
At the expense of one more instruction, yes. I tend to agree, though. The atomics are fairly costly to begin with, so one more predicted non-taken branch won't really make a difference.
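Roughly like this for the le-zero variants (sketch only, untested; <error> is a placeholder for the real check's branch target). Note that st*xr and cbnz do not touch the NZCV flags, so the result of the subs is still live at the branches:

  subs  %w0, %w0, %w[i]
  stlxr %w1, %w0, %2
  cbnz  %w1, 1b
  b.eq  <error>        // new count == 0
  b.mi  <error>        // new count < 0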
In fact, I realised that we could easily implement the add-from-zero case as well: both ll/sc and lse versions have the old count in a register already, so all we need is a cbz in the right place.
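For the ll/sc version that would look something like the below (sketch, untested; the LSE ldadd returns the old value in its destination register, so it can do the same check, and the error path would probably want a clrex):

  prfm  pstl1strm, %2
1: ldxr  %w0, %2
  cbz   %w0, <error>   // old count was zero: increment-from-zero
  adds  %w0, %w0, %w[i]
  stxr  %w1, %w0, %2
  cbnz  %w1, 1b
  REFCOUNT_CHECK(mi)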