[PATCH 0/9] lib/bitmap: optimize bitmap_weight() usage

Michał Mirosław mirq-linux at rere.qmqm.pl
Mon Nov 29 08:34:07 PST 2021


Dnia 29 listopada 2021 06:38:39 UTC, Yury Norov <yury.norov at gmail.com> napisał/a:
>On Sun, Nov 28, 2021 at 07:03:41PM +0100, mirq-test at rere.qmqm.pl wrote:
>> On Sat, Nov 27, 2021 at 07:56:55PM -0800, Yury Norov wrote:
>> > In many cases people use bitmap_weight()-based functions like this:
>> > 
>> > 	if (num_present_cpus() > 1)
>> > 		do_something();
>> > 
>> > This may take considerable amount of time on many-cpus machines because
>> > num_present_cpus() will traverse every word of underlying cpumask
>> > unconditionally.
>> > 
>> > We can significantly improve on it for many real cases if stop traversing
>> > the mask as soon as we count present cpus to any number greater than 1:
>> > 
>> > 	if (num_present_cpus_gt(1))
>> > 		do_something();
>> > 
>> > To implement this idea, the series adds bitmap_weight_{eq,gt,le}
>> > functions together with corresponding wrappers in cpumask and nodemask.
>> 
>> Having slept on it I have more structured thoughts:
>> 
>> First, I like substituting bitmap_empty/full where possible - I think
>> the change stands on its own, so could be split and sent as is.
>
>Ok, I can do it.
>
>> I don't like the proposed API very much. One problem is that it hides
>> the comparison operator and makes call sites less readable:
>> 
>> 	bitmap_weight(...) > N
>> 
>> becomes:
>> 
>> 	bitmap_weight_gt(..., N)
>> 
>> and:
>> 	bitmap_weight(...) <= N
>> 
>> becomes:
>> 
>> 	bitmap_weight_lt(..., N+1)
>> or:
>> 	!bitmap_weight_gt(..., N)
>> 
>> I'd rather see something resembling memcmp() API that's known enough
>> to be easier to grasp. For above examples:
>> 
>> 	bitmap_weight_cmp(..., N) > 0
>> 	bitmap_weight_cmp(..., N) <= 0
>> 	...
>
>bitmap_weight_cmp() cannot be efficient. Consider this example:
>
>bitmap_weight_lt(1000 0000 0000 0000, 1) == false
>                 ^
>                 stop here
>
>bitmap_weight_cmp(1000 0000 0000 0000, 1) == 0
>                                 ^
>                                 stop here
>
>I agree that '_gt' is less verbose than '>', but the advantage of 
>'_gt' over '>' is proportional to length of bitmap, and it means
>that this API should exist.

Thank you for the example. Indeed, for less-than to be efficient here you would need to replace
 bitmap_weight_cmp(..., N) < 0
with
 bitmap_weight_cmp(..., N-1) <= 0

It would still be more readable, I think.

Best Regards
Michał Mirosław



More information about the linux-snps-arc mailing list