On Mon, Aug 05, 2013 at 01:17:39PM +0200, Jan Kara wrote:
> On Sun 04-08-13 05:17:03, Kirill A. Shutemov wrote:
> > From: "Kirill A. Shutemov" <kirill.shutemov@xxxxxxxxxxxxxxx>
> >
> > The radix tree is variable-height, so an insert operation not only has
> > to build the branch to its corresponding item, it also has to build the
> > branch to existing items if the size has to be increased (by
> > radix_tree_extend).
> >
> > @@ -82,16 +82,24 @@ static struct kmem_cache *radix_tree_node_cachep;
> >   * The worst case is a zero height tree with just a single item at index 0,
> >   * and then inserting an item at index ULONG_MAX. This requires 2 new branches
> >   * of RADIX_TREE_MAX_PATH size to be created, with only the root node shared.
> > + *
> > + * Worst case for adding N contiguous items is adding entries at indexes
> > + * (ULONG_MAX - N) to ULONG_MAX. It requires nodes to insert single worst-case
> > + * item plus extra nodes if you cross the boundary from one node to the next.
> > + *
> >   * Hence:
> >   */
> > -#define RADIX_TREE_PRELOAD_SIZE (RADIX_TREE_MAX_PATH * 2 - 1)
> > +#define RADIX_TREE_PRELOAD_MIN (RADIX_TREE_MAX_PATH * 2 - 1)
> > +#define RADIX_TREE_PRELOAD_MAX \
> > +	(RADIX_TREE_PRELOAD_MIN + \
> > +	 DIV_ROUND_UP(RADIX_TREE_PRELOAD_NR - 1, RADIX_TREE_MAP_SIZE))
>
> Umm, is this really correct? I see two problems:
> 1) You may need internal tree nodes at various levels but you seem to
> account only for the level 1.
> 2) The rounding doesn't seem right because RADIX_TREE_MAP_SIZE+2 nodes may
> require 3 nodes at level 1 if the indexes are like:
> i_0 | i_1 .. i_{RADIX_TREE_MAP_SIZE} | i_{RADIX_TREE_MAP_SIZE+1}
>     ^                                ^
>     node boundary                    node boundary
>
> Otherwise the patch looks good.

You are correct that in the fully general case, these things are needed,
and the patch undercounts the number of nodes needed.
However, in the specific case of THP pagecache, insertions are naturally
aligned, and we end up needing very few internal nodes (so few that we've
never hit the end of this array in some fairly heavy testing).

There are two penalties for getting the general case correct. One is that
the calculation becomes harder to understand, and the other is that we
consume more per-CPU memory.

I think we should document that the current code requires "natural
alignment", and include a note about what things will need to change in
order to support arbitrary alignment in case anybody needs to do it in the
future, but not include support for arbitrary alignment in this patchset.

What do you think?