From: Herbert Xu <herbert@xxxxxxxxxxxxxxxxxxx> Sent: Friday, January 10, 2025 1:28 AM
>
> On Thu, Jan 09, 2025 at 02:15:17AM -0800, Breno Leitao wrote:
> >
> > I would suggest we revert this patch until we investigate further. I'll
> > prepare and send a revert patch shortly.
>
> Sorry, I think it was my addition that broke things.  The condition
> for checking whether an entry is inserted is incorrect, thus resulting
> in an underflow of the number of entries after entry removal.
>
> Please test this patch:
>
> ---8<---
> The function rhashtable_insert_one only returns NULL iff the
> insertion was successful, so that alone should be tested before
> increment nelems.  Testing the variable data is redundant, and
> buggy because we will have overwritten the original value of data
> by this point.
>
> Reported-by: Michael Kelley <mhklinux@xxxxxxxxxxx>
> Fixes: e1d3422c95f0 ("rhashtable: Fix potential deadlock by moving schedule_work outside lock")
> Signed-off-by: Herbert Xu <herbert@xxxxxxxxxxxxxxxxxxx>
>
> diff --git a/lib/rhashtable.c b/lib/rhashtable.c
> index bf956b85455a..e196b6f0e35a 100644
> --- a/lib/rhashtable.c
> +++ b/lib/rhashtable.c
> @@ -621,7 +621,7 @@ static void *rhashtable_try_insert(struct rhashtable *ht, const void *key,
>
>  			rht_unlock(tbl, bkt, flags);
>
> -			if (PTR_ERR(data) == -ENOENT && !new_tbl) {
> +			if (!new_tbl) {
>  				atomic_inc(&ht->nelems);
>  				if (rht_grow_above_75(ht, tbl))
>  					schedule_work(&ht->run_work);
> --

This patch fixes the problem I saw with VMs in the Azure cloud. Thanks!

Michael Kelley
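
For anyone following the thread, here is a rough userspace sketch of the failure mode the commit message describes. It is not the kernel code: err_ptr(), ptr_err(), toy_lookup(), toy_insert_one() and nelems_after_insert_remove() are hypothetical stand-ins, and the exact statement that overwrites data is an assumption made for illustration. The point is only that once data has been overwritten, the old two-part condition never fires, so the counter is never incremented and a later removal drives it below zero.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's ERR_PTR()/PTR_ERR() helpers. */
static void *err_ptr(long err) { return (void *)(intptr_t)err; }
static long ptr_err(const void *p) { return (long)(intptr_t)p; }

/* Toy model: the lookup reports "not found", the insert reports
 * success by returning NULL (as the commit message states). */
static void *toy_lookup(void) { return err_ptr(-ENOENT); }
static void *toy_insert_one(void) { return NULL; }

/* Simulate one successful insertion followed by one removal and
 * return the resulting element count. use_fix selects the condition. */
static int nelems_after_insert_remove(int use_fix)
{
	int nelems = 0;
	void *data = toy_lookup();        /* ERR_PTR(-ENOENT) */
	void *new_tbl = toy_insert_one(); /* NULL: insertion succeeded */

	/* Assumed overwrite: data no longer holds the lookup result. */
	if (ptr_err(new_tbl) != -EEXIST)
		data = new_tbl;

	if (use_fix) {
		if (new_tbl == NULL)      /* patched condition */
			nelems++;
	} else {
		/* Old condition: data is NULL here, so this never fires. */
		if (ptr_err(data) == -ENOENT && new_tbl == NULL)
			nelems++;
	}

	assert(nelems == 0 || nelems == 1);
	nelems--;                         /* removal always decrements */
	return nelems;
}
```

Calling nelems_after_insert_remove(0) models the old condition and nelems_after_insert_remove(1) the patched one. The sketch uses a plain signed int for the counter, so the "underflow" simply shows up as a negative value.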