Re:

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, Dec 5, 2020 at 3:10 AM Rasmus Villemoes
<linux@xxxxxxxxxxxxxxxxxx> wrote:
>
> On 03/12/2020 19.46, Yury Norov wrote:
>
> > I would prefer to avoid changing the find*bit() semantics. As for now,
> > if any of find_*_bit()
> > finds nothing, it returns the size of the bitmap it was passed.
>
> Yeah, we should actually try to fix that, it causes bad code generation.
> It's hard, because callers of course do that "if ret == size" check. But
> it's really silly that something like find_first_bit needs to do that
> "min(i*BPL + __ffs(word), size)" - the caller does a comparison anyway,
> that comparison might as well be "ret >= size" rather than "ret ==
> size", and then we could get rid of that branch (which min() necessarily
> becomes) at the end of find_next_bit.

We didn't do that 5 years ago because it's too invasive and the improvement
is barely measurable, the difference is 2 instructions (on arm64).e.
Has something
changed since that?

20000000000000000 <find_first_bit_better>:
   0:   aa0003e3        mov     x3, x0
   4:   aa0103e0        mov     x0, x1
   8:   b4000181        cbz     x1, 38 <find_first_bit_better+0x38>
   c:   f9400064        ldr     x4, [x3]
  10:   d2800802        mov     x2, #0x40                       // #64
  14:   91002063        add     x3, x3, #0x8
  18:   b40000c4        cbz     x4, 30 <find_first_bit_better+0x30>
  1c:   14000008        b       3c <find_first_bit_better+0x3c>
  20:   f8408464        ldr     x4, [x3], #8
  24:   91010045        add     x5, x2, #0x40
  28:   b50000c4        cbnz    x4, 40 <find_first_bit_better+0x40>
  2c:   aa0503e2        mov     x2, x5
  30:   eb00005f        cmp     x2, x0
  34:   54ffff63        b.cc    20 <find_first_bit_better+0x20>  //
b.lo, b.ul, b.last
  38:   d65f03c0        ret
  3c:   d2800002        mov     x2, #0x0                        // #0
  40:   dac00084        rbit    x4, x4
  44:   dac01084        clz     x4, x4
  48:   8b020080        add     x0, x4, x2
  4c:   d65f03c0        ret

0000000000000050 <find_first_bit_worse>:
  50:   aa0003e4        mov     x4, x0
  54:   aa0103e0        mov     x0, x1
  58:   b4000181        cbz     x1, 88 <find_first_bit_worse+0x38>
  5c:   f9400083        ldr     x3, [x4]
  60:   d2800802        mov     x2, #0x40                       // #64
  64:   91002084        add     x4, x4, #0x8
  68:   b40000c3        cbz     x3, 80 <find_first_bit_worse+0x30>
  6c:   14000008        b       8c <find_first_bit_worse+0x3c>
  70:   f8408483        ldr     x3, [x4], #8
  74:   91010045        add     x5, x2, #0x40
  78:   b50000c3        cbnz    x3, 90 <find_first_bit_worse+0x40>
  7c:   aa0503e2        mov     x2, x5
  80:   eb02001f        cmp     x0, x2
  84:   54ffff68        b.hi    70 <find_first_bit_worse+0x20>  // b.pmore
  88:   d65f03c0        ret
  8c:   d2800002        mov     x2, #0x0                        // #0
  90:   dac00063        rbit    x3, x3
  94:   dac01063        clz     x3, x3
  98:   8b020062        add     x2, x3, x2
  9c:   eb02001f        cmp     x0, x2
  a0:   9a829000        csel    x0, x0, x2, ls  // ls = plast
  a4:   d65f03c0        ret

> I haven't dug very deep into this, but I could also imagine the
> arch-specific parts of this might become a little easier to do if the
> semantics were just "if no such bit, return an indeterminate value >=
> the size".
>
> > Changing this for
> > a single function would break the consistency, and may cause problems
> > for those who
> > rely on existing behaviour.
>
> True. But I think it should be possible - I suppose most users are via
> the iterator macros, which could all be updated at once. Changing ret ==
> size to ret >= size will still work even if the implementations have not
> been switched over, so it should be doable.

Since there's no assembler users for it, we can do just:
#define find_first_bit(bitmap, size)
min(better_find_first_bit((bitmap), (size)), (size))

... and deprecate find_first_bit.

> > Passing non-positive size to find_*_bit() should produce undefined
> > behaviour, because we cannot dereference a pointer to the bitmap in
> > this case; this is most probably a sign of a problem on a caller side
> > anyways.
>
> No, the out-of-line bitmap functions should all handle the case of a
> zero-size bitmap sensibly.

I could be more specific, the behaviour is defined: don't dereference
the address and return undefined value (which now is always 0).

> Is bitmap full? Yes (all the 0 bits are set).
> Is bitmap empty? Yes, (none of the 0 bits are set).
> Find the first bit set (returns 0, there's no such bit)

I can't answer because this object is not a map of bits - there's no room for
bits inside.

> Etc. The static inlines for small_const_nbits do assume that the pointer
> can be dereferenced, which is why small_const_nbits was updated to mean
> 1<=bits<=BITS_PER_LONG rather than just bits<=BITS_PER_LONG.

I don't want to do something like

if (size == 0)
        return -1;

... because it legitimizes this kind of usage and hides problems on
callers' side.
Instead, I'd add WARN_ON(size == 0), but I don't think it's so
critical to bother with it.

Yury

> Rasmus



[Index of Archives]     [Linux Kernel]     [Kernel Newbies]     [x86 Platform Driver]     [Netdev]     [Linux Wireless]     [Netfilter]     [Bugtraq]     [Linux Filesystems]     [Yosemite Discussion]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Device Mapper]

  Powered by Linux