On Sat, Dec 17 2016, Matthew Wilcox <mawilcox@xxxxxxxxxxxxx> wrote: > From: Matthew Wilcox >> From: Rasmus Villemoes [mailto:linux@xxxxxxxxxxxxxxxxxx] >> > This sounds good. I think there may still be a lot of users that never >> > allocate more than a handful of IDAs, making a 128 byte allocation still >> > somewhat excessive. One thing I considered was (exactly as it's done for >> > file descriptor tables) to embed a single word in the struct ida and >> > use that initially; I haven't looked closely at newIDA, so I don't know >> > how easy that would be or if its worth the complexity. >> >> Heh, I was thinking about that too. The radix tree supports "exceptional >> entries" which have the bottom bit set. On a 64-bit machine, we could use 62 >> of the bits in the radix tree root to store the ID bitmap. I'm a little wary of the >> potential complexity, but we should try it out. > > Test patch here: http://git.infradead.org/users/willy/linux-dax.git/shortlog/refs/heads/idr-2016-12-16 > It passes the test suite ... which I actually had to adjust because it > now succeeds in cases where it hadn't (allocating ID 0 without > preallocating), and it will now fail in cases where it hadn't > previously (assuming a single preallocation would be enough). There > shouldn't be any examples of that in the kernel proper; it was simply > me being lazy when I wrote the test suite. Nice work! A few random comments/questions: - It does add some complexity, but I think a few comments would make it more digestable. - Hm, maybe I'm confused, and I certainly don't understand how the whole radix tree works. But do you use every leaf node as an exceptional entry initially, to allocate the first 62 ids from that level? This code if ((bit + RADIX_TREE_EXCEPTIONAL_SHIFT) < BITS_PER_LONG) { bit += RADIX_TREE_EXCEPTIONAL_SHIFT; radix_tree_iter_replace(root, &iter, slot, (void *)((1UL << bit) | RADIX_TREE_EXCEPTIONAL_ENTRY)); *id = new; return 0; } operates on bit, which is the offset from index*IDA_BITMAP_BITS, and it seems to create an exceptional entry somewhere down the tree (which may of course be the root). But we don't seem to allocate another bit from that exceptional entry ever unless it happened to sit at index 0; the code unsigned long tmp = (unsigned long)bitmap; if (start < BITS_PER_LONG) { unsigned tbit = find_next_zero_bit(&tmp, BITS_PER_LONG, start); if (tbit < BITS_PER_LONG) { tmp |= 1UL << tbit; rcu_assign_pointer(*slot, (void *)tmp); *id = new + tbit - RADIX_TREE_EXCEPTIONAL_SHIFT; return 0; } } is only used for small values of start (though it does seem to account for a non-zero value of new == iter.index * IDA_BITMAP_BITS). - In the micro-optimization department, I think we should avoid find_next_zero_bit on a single word, and instead use __ffs. Something like (assuming we can use bit instead of start) if (bit + RADIX_TREE_EXCEPTIONAL_SHIFT < BITS_PER_LONG) { tmp = (~(unsigned long)bitmap) >> (bit + RADIX_TREE_EXCEPTIONAL_SHIFT); if (tmp) { tbit = __ffs(tmp) + bit + RADIX_TREE_EXCEPTIONAL_SHIFT; tmp = (unsigned long)bitmap | (1UL << tbit); rcu_assign_pointer(*slot, (void *)tmp); *id = new + tbit - RADIX_TREE_EXCEPTIONAL_SHIFT; return 0; } } Rasmus -- To unsubscribe from this list: send the line "unsubscribe linux-block" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html