On Mon, Nov 29, 2021 at 04:34:07PM +0000, Michał Mirosław wrote: > Dnia 29 listopada 2021 06:38:39 UTC, Yury Norov <yury.norov@xxxxxxxxx> napisał/a: > >On Sun, Nov 28, 2021 at 07:03:41PM +0100, mirq-test@xxxxxxxxxxxx wrote: > >> On Sat, Nov 27, 2021 at 07:56:55PM -0800, Yury Norov wrote: > >> > In many cases people use bitmap_weight()-based functions like this: > >> > > >> > if (num_present_cpus() > 1) > >> > do_something(); > >> > > >> > This may take considerable amount of time on many-cpus machines because > >> > num_present_cpus() will traverse every word of underlying cpumask > >> > unconditionally. > >> > > >> > We can significantly improve on it for many real cases if stop traversing > >> > the mask as soon as we count present cpus to any number greater than 1: > >> > > >> > if (num_present_cpus_gt(1)) > >> > do_something(); > >> > > >> > To implement this idea, the series adds bitmap_weight_{eq,gt,le} > >> > functions together with corresponding wrappers in cpumask and nodemask. > >> > >> Having slept on it I have more structured thoughts: > >> > >> First, I like substituting bitmap_empty/full where possible - I think > >> the change stands on its own, so could be split and sent as is. > > > >Ok, I can do it. > > > >> I don't like the proposed API very much. One problem is that it hides > >> the comparison operator and makes call sites less readable: > >> > >> bitmap_weight(...) > N > >> > >> becomes: > >> > >> bitmap_weight_gt(..., N) > >> > >> and: > >> bitmap_weight(...) <= N > >> > >> becomes: > >> > >> bitmap_weight_lt(..., N+1) > >> or: > >> !bitmap_weight_gt(..., N) > >> > >> I'd rather see something resembling memcmp() API that's known enough > >> to be easier to grasp. For above examples: > >> > >> bitmap_weight_cmp(..., N) > 0 > >> bitmap_weight_cmp(..., N) <= 0 > >> ... > > > >bitmap_weight_cmp() cannot be efficient. Consider this example: > > > >bitmap_weight_lt(1000 0000 0000 0000, 1) == false > > ^ > > stop here > > > >bitmap_weight_cmp(1000 0000 0000 0000, 1) == 0 > > ^ > > stop here > > > >I agree that '_gt' is less verbose than '>', but the advantage of > >'_gt' over '>' is proportional to length of bitmap, and it means > >that this API should exist. > > Thank you for the example. Indeed, for less-than to be efficient here you would need to replace > bitmap_weight_cmp(..., N) < 0 > with > bitmap_weight_cmp(..., N-1) <= 0 Indeed, thanks for pointing to it. > It would still be more readable, I think. To be honest, I'm not sure that bitmap_weight_cmp(..., N-1) <= 0 would be an obvious replacement for the original bitmap_weight(...) < N comparing to bitmap_weight_lt(..., N) I think the best thing I can do is to add bitmap_weight_cmp() as you suggested, and turn lt and others to be wrappers on it. This will let people choose a better function in each case. I also think that for v2 it would be better to drop the conversion for short bitmaps, except for switching to bitmap_empty(), because in that case readability wins over performance; if no objections. Thanks, Yury