On Thu, Sep 24, 2020 at 3:46 PM Kent Gibson <warthog618@xxxxxxxxx> wrote:
> On Thu, Sep 24, 2020 at 03:32:48PM +0800, Kent Gibson wrote:
> > On Wed, Sep 23, 2020 at 07:18:08PM +0300, Andy Shevchenko wrote:
> > > On Tue, Sep 22, 2020 at 5:36 AM Kent Gibson <warthog618@xxxxxxxxx> wrote:
> > > >
> > > > Add support for the GPIO_V2_LINE_SET_VALUES_IOCTL.
> > >
> > > > +static long linereq_set_values_unlocked(struct linereq *lr,
> > > > +                                        struct gpio_v2_line_values *lv)
> > > > +{
> > > > +       DECLARE_BITMAP(vals, GPIO_V2_LINES_MAX);
> > > > +       struct gpio_desc **descs;
> > > > +       unsigned int i, didx, num_set;
> > > > +       int ret;
> > > > +
> > > > +       bitmap_zero(vals, GPIO_V2_LINES_MAX);
> > > > +       for (num_set = 0, i = 0; i < lr->num_lines; i++) {
> > > > +               if (lv->mask & BIT_ULL(i)) {
> > >
> > > Similar idea
> > >
> > > DECLARE_BITMAP(mask, 64) = BITMAP_FROM_U64(lv->mask);
> > >
> > > num_set = bitmap_weight();
> > >
> >
> > I had played with this option, but bitmap_weight() counts all
> > the bits set in the mask - which considers bits >= lr->num_lines.
> > So you would need to mask lv->mask before converting it to a bitmap.
> > (I'm ok with ignoring those bits in case userspace wants to be lazy and
> > use an all 1s mask.)
> >
> > But since we're looping over the bitmap anyway we may as well just
> > count as we go.
> >
> > > for_each_set_bit(i, mask, lr->num_lines)
> > >
> >
> > Yeah, that should work.  I vaguely recall trying this and finding it
> > generated larger object code, but I'll give it another try and if it
> > works out then include it in v10.
> >
>
> Tried it again and, while it works, it does increase the size of
> gpiolib-cdev.o as follows:
>
>            u64    -> bitmap
> x86_64    28360      28616
> i386      22056      22100
> aarch64   37392      37600
> mips32    28008      28016

Yes, that's a pity... See below.

> So for 64-bit platforms changing to bitmap generates larger code,
> probably as we are forcing them to use 32-bit array semantics where
> before they could use the native u64.  For 32-bit there is a much
> smaller difference as they were already using 32-bit array semantics
> to realise the u64.
>
> Those are for some of my test builds, so obviously YMMV.
>
> It is also only for changing linereq_get_values(), which has three
> instances of the loop.  linereq_set_values_unlocked() has another two,
> so you could expect another increase of ~2/3 of that seen here if we
> change that as well.
>
> The sizeable increase in x86_64 was what made me revert this last time,
> and I'm still satisfied with that choice.  Are you still eager to switch
> to for_each_set_bit()?

I already asked once about a shortcut for for_each_set_bit() in the case
of a constant nbits parameter that is <= BITS_PER_LONG, but here it seems
we have a variable number of lines, and I don't know whether the compiler
can prove that it fits in a long.

In any case, my point is that code readability takes precedence over
memory footprint (except in hot paths), and if we are going to fix this
it should be done in general.

That said, if the maintainers are okay with that, I would prefer the
bitmap API over open-coded pieces.

Also note that it will be easier to extend in the future if needed
(if we want to handle more than BITS_PER_LONG [64] lines).

-- 
With Best Regards,
Andy Shevchenko
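
To illustrate the bitmap_weight() point raised in the quoted discussion, here is a minimal userspace sketch, not kernel code. The helper names (popcount64, count_open_coded, count_clamped) are hypothetical, and plain uint64_t stands in for the kernel bitmap types. It shows why a weight taken over the whole mask can over-count when userspace passes an all 1s mask, and how clamping the mask to num_lines first gives the same result as the open-coded loop.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of set bits in the whole 64-bit word. */
static unsigned int popcount64(uint64_t v)
{
	unsigned int n = 0;

	while (v) {
		v &= v - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}

/* Open-coded style: test each bit below num_lines, counting as we go. */
static unsigned int count_open_coded(uint64_t mask, unsigned int num_lines)
{
	unsigned int i, num_set = 0;

	assert(num_lines <= 64);
	for (i = 0; i < num_lines; i++)
		if (mask & (UINT64_C(1) << i))
			num_set++;
	return num_set;
}

/* Weight style: drop bits >= num_lines first, then count what is left. */
static unsigned int count_clamped(uint64_t mask, unsigned int num_lines)
{
	uint64_t valid;

	assert(num_lines <= 64);
	/* Shifting a 64-bit value by 64 is undefined, so special-case it. */
	valid = (num_lines == 64) ? UINT64_MAX
				  : (UINT64_C(1) << num_lines) - 1;
	return popcount64(mask & valid);
}
```

With an all 1s mask and four requested lines, popcount64(UINT64_MAX) counts every bit in the word, while count_open_coded(UINT64_MAX, 4) and count_clamped(UINT64_MAX, 4) both count only the four valid lines.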