On Sun, Jun 13, 2021 at 12:24:52PM +0000, David Mozes wrote:
> Hi *,
> Under a very high load of io traffic, we got the below BUG trace.
> We can see that:
> plist_for_each_entry_safe(this, next, &hb1->chain, list) {
>         if (match_futex (&this->key, &key1))
>
> were called with hb1 = NULL at futex_wake_up function.
> And there is no protection on the code regarding such a scenario.
>
> The NULL can be geting from:
> hb1 = hash_futex(&key1);
>
> How can we protect against such a situation?

Can you reproduce it without loading proprietary modules?

Your analysis doesn't quite make sense:

	hb1 = hash_futex(&key1);
	hb2 = hash_futex(&key2);

retry_private:
	double_lock_hb(hb1, hb2);

If hb1 were NULL, then the oops would come earlier, in double_lock_hb().

> RIP: 0010:do_futex+0xdf/0xa90
>
> 0xffffffff81138eff is in do_futex (kernel/futex.c:1748).
> 1743                    put_futex_key(&key1);
> 1744                    cond_resched();
> 1745                    goto retry;
> 1746            }
> 1747
> 1748            plist_for_each_entry_safe(this, next, &hb1->chain, list) {
> 1749                    if (match_futex (&this->key, &key1)) {
> 1750                            if (this->pi_state || this->rt_waiter) {
> 1751                                    ret = -EINVAL;
> 1752                                    goto out_unlock;
> (gdb)
>
> plist_for_each_entry_safe(this, next, &hb1->chain, list) {
>         if (match_futex (&this->key, &key1)) {
>
> This happened in kernel 4.19.149 running on Azure vm
>
> Thx
> David
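
To make the ordering argument concrete, here is a minimal userspace sketch, not the kernel code. All names in it (struct bucket, table, hash_key, lock_bucket, double_lock, wake_op) are hypothetical stand-ins, and it assumes the shape of the snippet above: the bucket pointer comes from indexing a hash table, and both buckets are locked before the chain is walked. Under those assumptions the lookup cannot yield NULL, and a bad bucket pointer would be dereferenced in the locking step first.

```c
#include <assert.h>
#include <stddef.h>

#define NBUCKETS 16 /* hypothetical table size, must be a power of two */

struct bucket {
	int lock;     /* stand-in for the bucket spinlock: 0 = free, 1 = held */
	int nwaiters; /* stand-in for the waiter chain hanging off the bucket */
};

static struct bucket table[NBUCKETS];

/* Stand-in for hash_futex(): returns the address of a table slot,
 * so the result is never NULL as long as the table itself exists. */
static struct bucket *hash_key(unsigned long key)
{
	return &table[key & (NBUCKETS - 1)];
}

/* Taking the lock dereferences the bucket pointer. With a NULL bucket,
 * this is where the fault would show up, before any chain walk. */
static void lock_bucket(struct bucket *b)
{
	assert(b != NULL);
	b->lock = 1;
}

static void unlock_bucket(struct bucket *b)
{
	b->lock = 0;
}

/* Stand-in for double_lock_hb(): lock both buckets in address order. */
static void double_lock(struct bucket *b1, struct bucket *b2)
{
	if (b1 <= b2) {
		lock_bucket(b1);
		if (b1 < b2)
			lock_bucket(b2);
	} else {
		lock_bucket(b2);
		lock_bucket(b1);
	}
}

static void double_unlock(struct bucket *b1, struct bucket *b2)
{
	unlock_bucket(b1);
	if (b1 != b2)
		unlock_bucket(b2);
}

/* Same ordering as the snippet: hash, lock both, then look at b1's chain.
 * Returns the number of waiters seen on the first bucket. */
static int wake_op(unsigned long key1, unsigned long key2)
{
	struct bucket *b1 = hash_key(key1);
	struct bucket *b2 = hash_key(key2);
	int seen;

	double_lock(b1, b2);  /* first dereference of b1 happens here */
	seen = b1->nwaiters;  /* stand-in for walking &hb1->chain */
	double_unlock(b1, b2);
	return seen;
}
```

A fault at the chain walk, after the locks were taken without trouble, therefore suggests something other than a NULL bucket pointer, for example the contents of the chain having been corrupted.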