Re: [RFC 00/10] implement alternative and much simpler id allocator

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, Dec 17 2016, Matthew Wilcox <mawilcox@xxxxxxxxxxxxx> wrote:

> From: Matthew Wilcox
>> From: Rasmus Villemoes [mailto:linux@xxxxxxxxxxxxxxxxxx]
>> > This sounds good. I think there may still be a lot of users that never
>> > allocate more than a handful of IDAs, making a 128 byte allocation still
>> > somewhat excessive. One thing I considered was (exactly as it's done for
>> > file descriptor tables) to embed a single word in the struct ida and
>> > use that initially; I haven't looked closely at newIDA, so I don't know
>> > how easy that would be or if its worth the complexity.
>> 
>> Heh, I was thinking about that too.  The radix tree supports "exceptional
>> entries" which have the bottom bit set.  On a 64-bit machine, we could use 62
>> of the bits in the radix tree root to store the ID bitmap.  I'm a little wary of the
>> potential complexity, but we should try it out.
>
> Test patch here: http://git.infradead.org/users/willy/linux-dax.git/shortlog/refs/heads/idr-2016-12-16
> It passes the test suite ... which I actually had to adjust because it
> now succeeds in cases where it hadn't (allocating ID 0 without
> preallocating), and it will now fail in cases where it hadn't
> previously (assuming a single preallocation would be enough).  There
> shouldn't be any examples of that in the kernel proper; it was simply
> me being lazy when I wrote the test suite.

Nice work! A few random comments/questions:

- It does add some complexity, but I think a few comments would make it
  more digestable.

- Hm, maybe I'm confused, and I certainly don't understand how the whole
  radix tree works. But do you use every leaf node as an exceptional
  entry initially, to allocate the first 62 ids from that level? This
  code

	if ((bit + RADIX_TREE_EXCEPTIONAL_SHIFT) <
					BITS_PER_LONG) {
		bit += RADIX_TREE_EXCEPTIONAL_SHIFT;
		radix_tree_iter_replace(root, &iter, slot,
				(void *)((1UL << bit) |
				RADIX_TREE_EXCEPTIONAL_ENTRY));
		*id = new;
		return 0;
	}

   operates on bit, which is the offset from index*IDA_BITMAP_BITS, and
   it seems to create an exceptional entry somewhere down the tree
   (which may of course be the root).

   But we don't seem to allocate another bit from that exceptional entry
   ever unless it happened to sit at index 0; the code

	unsigned long tmp = (unsigned long)bitmap;
	if (start < BITS_PER_LONG) {
		unsigned tbit = find_next_zero_bit(&tmp,
					BITS_PER_LONG, start);
		if (tbit < BITS_PER_LONG) {
			tmp |= 1UL << tbit;
			rcu_assign_pointer(*slot, (void *)tmp);
			*id = new + tbit -
				RADIX_TREE_EXCEPTIONAL_SHIFT;
			return 0;
		}
	}

   is only used for small values of start (though it does seem to
   account for a non-zero value of new == iter.index * IDA_BITMAP_BITS).
   
- In the micro-optimization department, I think we should avoid
  find_next_zero_bit on a single word, and instead use __ffs. Something
  like (assuming we can use bit instead of start)

  if (bit + RADIX_TREE_EXCEPTIONAL_SHIFT < BITS_PER_LONG) {
    tmp = (~(unsigned long)bitmap) >> (bit + RADIX_TREE_EXCEPTIONAL_SHIFT);
    if (tmp) {
      tbit = __ffs(tmp) + bit + RADIX_TREE_EXCEPTIONAL_SHIFT;
      tmp = (unsigned long)bitmap | (1UL << tbit);
      rcu_assign_pointer(*slot, (void *)tmp);
      *id = new + tbit - RADIX_TREE_EXCEPTIONAL_SHIFT;
      return 0;
    }
  }

Rasmus
--
To unsubscribe from this list: send the line "unsubscribe linux-block" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux