On Monday, September 11, 2017 9:27 PM, Matthew Wilcox wrote > On Mon, Aug 28, 2017 at 06:08:30PM +0800, Wei Wang wrote: > > +/** > > + * xb_zero - zero a range of bits in the xbitmap > > + * @xb: the xbitmap that the bits reside in > > + * @start: the start of the range, inclusive > > + * @end: the end of the range, inclusive */ void xb_zero(struct xb > > +*xb, unsigned long start, unsigned long end) { > > + unsigned long i; > > + > > + for (i = start; i <= end; i++) > > + xb_clear_bit(xb, i); > > +} > > +EXPORT_SYMBOL(xb_zero); > > Um. This is not exactly going to be quick if you're clearing a range of bits. > I think it needs to be more along the lines of this: > > void xb_clear(struct xb *xb, unsigned long start, unsigned long end) { > struct radix_tree_root *root = &xb->xbrt; > struct radix_tree_node *node; > void **slot; > struct ida_bitmap *bitmap; > > for (; end < start; start = (start | (IDA_BITMAP_BITS - 1)) + 1) { > unsigned long index = start / IDA_BITMAP_BITS; > unsigned long bit = start % IDA_BITMAP_BITS; > > bitmap = __radix_tree_lookup(root, index, &node, &slot); > if (radix_tree_exception(bitmap)) { > unsigned long ebit = bit + 2; > unsigned long tmp = (unsigned long)bitmap; > if (ebit >= BITS_PER_LONG) > continue; > tmp &= ... something ...; > if (tmp == RADIX_TREE_EXCEPTIONAL_ENTRY) > __radix_tree_delete(root, node, slot); > else > rcu_assign_pointer(*slot, (void *)tmp); > } else if (bitmap) { > unsigned int nbits = end - start + 1; > if (nbits + bit > IDA_BITMAP_BITS) > nbits = IDA_BITMAP_BITS - bit; > bitmap_clear(bitmap->bitmap, bit, nbits); > if (bitmap_empty(bitmap->bitmap, IDA_BITMAP_BITS)) { > kfree(bitmap); > __radix_tree_delete(root, node, slot); > } > } > } > } > > Also note that this should be called xb_clear(), not xb_zero() to fit in with > bitmap_clear(). And this needs a thorough test suite testing all values for 'start' > and 'end' between 0 and at least 1024; probably much higher. And a variable > number of bits need to be set before calling > xb_clear() in the test suite. > > Also, this implementation above is missing a few tricks. For example, if 'bit' is 0 > and 'nbits' == IDA_BITMAP_BITS, we can simply call kfree without first zeroing > out the bits and then checking if the whole thing is zero. Thanks for the optimization suggestions. We've seen significant improvement of the ballooning time. Some other optimizations (stated in the changelog) haven't been included in the new version. If possible, we can leave that to a second step optimization outside this patch series. Best, Wei