On Mon, 25 Apr 2022 08:13:40 -0700 Eric Dumazet wrote:
> dev_hold() has been an increment of a refcount, and dev_put() a decrement.
>
> Not sure why it is fundamentally broken.

Jann described a case where someone does

    CPU 0         CPU 1         CPU 2

  dev_hold()
   ------  #unregister -------
                dev_hold()
                              dev_put()

Our check for refcount == 0 goes over the CPUs one by one,
so if it sums up CPUs 0 and 1 at the "unregister" point above
and CPU2 after the CPU1 hold and CPU2 release it will "miss"
one refcount.

That's a problem unless doing a dev_hold() on a netdev we only have
a reference on is illegal.

> There are specific steps at device dismantles making sure no more
> users can dev_hold()
>
> It is a contract. Any buggy layer can overwrite any piece of memory,
> including a refcount_t.
>
> Traditionally we could not add a test in dev_hold() to prevent an
> increment if the device is in dismantle phase.
> Maybe the situation is better nowadays.
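
For reference, the three-CPU interleaving from Jann's case can be replayed
in a small user-space model. This is only a sketch, not kernel code: it is
single-threaded with the interleaving hard-coded, and every name in it
(struct model_netdev, model_dev_hold(), model_dev_put(), model_replay_race())
is hypothetical. It also assumes that a dev_hold() after the unregister
point is permitted, which is exactly the question under discussion.

```c
#include <assert.h>

#define MODEL_NR_CPUS 3

/* Hypothetical stand-in for a per-CPU netdev refcount; not kernel code. */
struct model_netdev {
	int pcpu_refcnt[MODEL_NR_CPUS];
};

/* Model of dev_hold(): bump the counter of the CPU we run on. */
void model_dev_hold(struct model_netdev *dev, int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	dev->pcpu_refcnt[cpu]++;
}

/* Model of dev_put(): drop the counter of the CPU we run on. */
void model_dev_put(struct model_netdev *dev, int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	dev->pcpu_refcnt[cpu]--;
}

/* Sum of all counters taken at a single instant: the true refcount. */
int model_true_refcnt(const struct model_netdev *dev)
{
	int cpu, sum = 0;

	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		sum += dev->pcpu_refcnt[cpu];
	return sum;
}

/*
 * Replay the interleaving from the diagram and return what a
 * CPU-by-CPU scan observes. The true count is stored in *true_cnt.
 */
int model_replay_race(int *true_cnt)
{
	struct model_netdev dev = { { 0, 0, 0 } };
	int seen = 0;

	model_dev_hold(&dev, 0);        /* CPU 0: dev_hold() before unregister */

	seen += dev.pcpu_refcnt[0];     /* scan reads CPU 0: 1 */
	seen += dev.pcpu_refcnt[1];     /* scan reads CPU 1: 0 */

	model_dev_hold(&dev, 1);        /* CPU 1: dev_hold(), already scanned */
	model_dev_put(&dev, 2);         /* CPU 2: dev_put(), not yet scanned */

	seen += dev.pcpu_refcnt[2];     /* scan reads CPU 2: -1 */

	*true_cnt = model_true_refcnt(&dev);
	return seen;
}
```

In this model the scan returns 0 while the true count is still 1: the
hold on CPU 1 lands after that CPU was summed, but the put on CPU 2 is
counted, so the scan "misses" one reference.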