On Thu, May 19, 2022 at 09:50:30AM +0100, David Howells wrote: > From: Stephen Brennan <stephen.s.brennan@xxxxxxxxxx> > > A rare BUG_ON triggered in assoc_array_gc: > > [3430308.818153] kernel BUG at lib/assoc_array.c:1609! > > Which corresponded to the statement currently at line 1593 upstream: > > BUG_ON(assoc_array_ptr_is_meta(p)); > > Using the data from the core dump, I was able to generate a userspace > reproducer[1] and determine the cause of the bug. > > [1]: https://github.com/brenns10/kernel_stuff/tree/master/assoc_array_gc > > After running the iterator on the entire branch, an internal tree node > looked like the following: > > NODE (nr_leaves_on_branch: 3) > SLOT [0] NODE (2 leaves) > SLOT [1] NODE (1 leaf) > SLOT [2..f] NODE (empty) > > In the userspace reproducer, the pr_devel output when compressing this > node was: > > -- compress node 0x5607cc089380 -- > free=0, leaves=0 > [0] retain node 2/1 [nx 0] > [1] fold node 1/1 [nx 0] > [2] fold node 0/1 [nx 2] > [3] fold node 0/2 [nx 2] > [4] fold node 0/3 [nx 2] > [5] fold node 0/4 [nx 2] > [6] fold node 0/5 [nx 2] > [7] fold node 0/6 [nx 2] > [8] fold node 0/7 [nx 2] > [9] fold node 0/8 [nx 2] > [10] fold node 0/9 [nx 2] > [11] fold node 0/10 [nx 2] > [12] fold node 0/11 [nx 2] > [13] fold node 0/12 [nx 2] > [14] fold node 0/13 [nx 2] > [15] fold node 0/14 [nx 2] > after: 3 > > At slot 0, an internal node with 2 leaves could not be folded into the > node, because there was only one available slot (slot 0). Thus, the > internal node was retained. At slot 1, the node had one leaf, and was > able to be folded in successfully. The remaining nodes had no leaves, > and so were removed. By the end of the compression stage, there were 14 > free slots, and only 3 leaf nodes. The tree was ascended and then its > parent node was compressed. When this node was seen, it could not be > folded, due to the internal node it contained. > > The invariant for compression in this function is: whenever > nr_leaves_on_branch < ASSOC_ARRAY_FAN_OUT, the node should contain all > leaf nodes. The compression step currently cannot guarantee this, given > the corner case shown above. > > To fix this issue, retry compression whenever we have retained a node, > and yet nr_leaves_on_branch < ASSOC_ARRAY_FAN_OUT. This second > compression will then allow the node in slot 1 to be folded in, > satisfying the invariant. Below is the output of the reproducer once the > fix is applied: > > -- compress node 0x560e9c562380 -- > free=0, leaves=0 > [0] retain node 2/1 [nx 0] > [1] fold node 1/1 [nx 0] > [2] fold node 0/1 [nx 2] > [3] fold node 0/2 [nx 2] > [4] fold node 0/3 [nx 2] > [5] fold node 0/4 [nx 2] > [6] fold node 0/5 [nx 2] > [7] fold node 0/6 [nx 2] > [8] fold node 0/7 [nx 2] > [9] fold node 0/8 [nx 2] > [10] fold node 0/9 [nx 2] > [11] fold node 0/10 [nx 2] > [12] fold node 0/11 [nx 2] > [13] fold node 0/12 [nx 2] > [14] fold node 0/13 [nx 2] > [15] fold node 0/14 [nx 2] > internal nodes remain despite enough space, retrying > -- compress node 0x560e9c562380 -- > free=14, leaves=1 > [0] fold node 2/15 [nx 0] > after: 3 > > Changes > ======= > DH: > - Use false instead of 0. > - Reorder the inserted lines in a couple of places to put retained before > next_slot. > > ver #2) > - Fix typo in pr_devel, correct comparison to "<=" > > > Fixes: 3cb989501c26 ("Add a generic associative array implementation.") > Cc: <stable@xxxxxxxxxxxxxxx> > Signed-off-by: Stephen Brennan <stephen.s.brennan@xxxxxxxxxx> > Signed-off-by: David Howells <dhowells@xxxxxxxxxx> > cc: Jarkko Sakkinen <jarkko@xxxxxxxxxx> > cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> > cc: keyrings@xxxxxxxxxxxxxxx > Link: https://lore.kernel.org/r/20220511225517.407935-1-stephen.s.brennan@xxxxxxxxxx/ # v1 > Link: https://lore.kernel.org/r/20220512215045.489140-1-stephen.s.brennan@xxxxxxxxxx/ # v2 > --- > > lib/assoc_array.c | 8 ++++++++ > 1 file changed, 8 insertions(+) > > diff --git a/lib/assoc_array.c b/lib/assoc_array.c > index 079c72e26493..ca0b4f360c1a 100644 > --- a/lib/assoc_array.c > +++ b/lib/assoc_array.c > @@ -1461,6 +1461,7 @@ int assoc_array_gc(struct assoc_array *array, > struct assoc_array_ptr *cursor, *ptr; > struct assoc_array_ptr *new_root, *new_parent, **new_ptr_pp; > unsigned long nr_leaves_on_tree; > + bool retained; > int keylen, slot, nr_free, next_slot, i; > > pr_devel("-->%s()\n", __func__); > @@ -1536,6 +1537,7 @@ int assoc_array_gc(struct assoc_array *array, > goto descend; > } > > +retry_compress: > pr_devel("-- compress node %p --\n", new_n); > > /* Count up the number of empty slots in this node and work out the > @@ -1553,6 +1555,7 @@ int assoc_array_gc(struct assoc_array *array, > pr_devel("free=%d, leaves=%lu\n", nr_free, new_n->nr_leaves_on_branch); > > /* See what we can fold in */ > + retained = false; > next_slot = 0; > for (slot = 0; slot < ASSOC_ARRAY_FAN_OUT; slot++) { > struct assoc_array_shortcut *s; > @@ -1602,9 +1605,14 @@ int assoc_array_gc(struct assoc_array *array, > pr_devel("[%d] retain node %lu/%d [nx %d]\n", > slot, child->nr_leaves_on_branch, nr_free + 1, > next_slot); > + retained = true; > } > } > > + if (retained && new_n->nr_leaves_on_branch <= ASSOC_ARRAY_FAN_OUT) { > + pr_devel("internal nodes remain despite enough space, retrying\n"); > + goto retry_compress; > + } > pr_devel("after: %lu\n", new_n->nr_leaves_on_branch); > > nr_leaves_on_tree = new_n->nr_leaves_on_branch; > > Reviewed-by: Jarkko Sakkinen <jarkko@xxxxxxxxxx> BR, Jarkko