On Sat, Dec 5, 2020 at 3:10 AM Rasmus Villemoes <linux@xxxxxxxxxxxxxxxxxx> wrote: > > On 03/12/2020 19.46, Yury Norov wrote: > > > I would prefer to avoid changing the find*bit() semantics. As for now, > > if any of find_*_bit() > > finds nothing, it returns the size of the bitmap it was passed. > > Yeah, we should actually try to fix that, it causes bad code generation. > It's hard, because callers of course do that "if ret == size" check. But > it's really silly that something like find_first_bit needs to do that > "min(i*BPL + __ffs(word), size)" - the caller does a comparison anyway, > that comparison might as well be "ret >= size" rather than "ret == > size", and then we could get rid of that branch (which min() necessarily > becomes) at the end of find_next_bit. We didn't do that 5 years ago because it's too invasive and the improvement is barely measurable, the difference is 2 instructions (on arm64).e. Has something changed since that? 20000000000000000 <find_first_bit_better>: 0: aa0003e3 mov x3, x0 4: aa0103e0 mov x0, x1 8: b4000181 cbz x1, 38 <find_first_bit_better+0x38> c: f9400064 ldr x4, [x3] 10: d2800802 mov x2, #0x40 // #64 14: 91002063 add x3, x3, #0x8 18: b40000c4 cbz x4, 30 <find_first_bit_better+0x30> 1c: 14000008 b 3c <find_first_bit_better+0x3c> 20: f8408464 ldr x4, [x3], #8 24: 91010045 add x5, x2, #0x40 28: b50000c4 cbnz x4, 40 <find_first_bit_better+0x40> 2c: aa0503e2 mov x2, x5 30: eb00005f cmp x2, x0 34: 54ffff63 b.cc 20 <find_first_bit_better+0x20> // b.lo, b.ul, b.last 38: d65f03c0 ret 3c: d2800002 mov x2, #0x0 // #0 40: dac00084 rbit x4, x4 44: dac01084 clz x4, x4 48: 8b020080 add x0, x4, x2 4c: d65f03c0 ret 0000000000000050 <find_first_bit_worse>: 50: aa0003e4 mov x4, x0 54: aa0103e0 mov x0, x1 58: b4000181 cbz x1, 88 <find_first_bit_worse+0x38> 5c: f9400083 ldr x3, [x4] 60: d2800802 mov x2, #0x40 // #64 64: 91002084 add x4, x4, #0x8 68: b40000c3 cbz x3, 80 <find_first_bit_worse+0x30> 6c: 14000008 b 8c <find_first_bit_worse+0x3c> 70: f8408483 ldr x3, [x4], #8 74: 91010045 add x5, x2, #0x40 78: b50000c3 cbnz x3, 90 <find_first_bit_worse+0x40> 7c: aa0503e2 mov x2, x5 80: eb02001f cmp x0, x2 84: 54ffff68 b.hi 70 <find_first_bit_worse+0x20> // b.pmore 88: d65f03c0 ret 8c: d2800002 mov x2, #0x0 // #0 90: dac00063 rbit x3, x3 94: dac01063 clz x3, x3 98: 8b020062 add x2, x3, x2 9c: eb02001f cmp x0, x2 a0: 9a829000 csel x0, x0, x2, ls // ls = plast a4: d65f03c0 ret > I haven't dug very deep into this, but I could also imagine the > arch-specific parts of this might become a little easier to do if the > semantics were just "if no such bit, return an indeterminate value >= > the size". > > > Changing this for > > a single function would break the consistency, and may cause problems > > for those who > > rely on existing behaviour. > > True. But I think it should be possible - I suppose most users are via > the iterator macros, which could all be updated at once. Changing ret == > size to ret >= size will still work even if the implementations have not > been switched over, so it should be doable. Since there's no assembler users for it, we can do just: #define find_first_bit(bitmap, size) min(better_find_first_bit((bitmap), (size)), (size)) ... and deprecate find_first_bit. > > Passing non-positive size to find_*_bit() should produce undefined > > behaviour, because we cannot dereference a pointer to the bitmap in > > this case; this is most probably a sign of a problem on a caller side > > anyways. > > No, the out-of-line bitmap functions should all handle the case of a > zero-size bitmap sensibly. I could be more specific, the behaviour is defined: don't dereference the address and return undefined value (which now is always 0). > Is bitmap full? Yes (all the 0 bits are set). > Is bitmap empty? Yes, (none of the 0 bits are set). > Find the first bit set (returns 0, there's no such bit) I can't answer because this object is not a map of bits - there's no room for bits inside. > Etc. The static inlines for small_const_nbits do assume that the pointer > can be dereferenced, which is why small_const_nbits was updated to mean > 1<=bits<=BITS_PER_LONG rather than just bits<=BITS_PER_LONG. I don't want to do something like if (size == 0) return -1; ... because it legitimizes this kind of usage and hides problems on callers' side. Instead, I'd add WARN_ON(size == 0), but I don't think it's so critical to bother with it. Yury > Rasmus