On Mon, Apr 25, 2022 at 8:28 AM Jakub Kicinski <kuba@xxxxxxxxxx> wrote:
>
> On Mon, 25 Apr 2022 08:13:40 -0700 Eric Dumazet wrote:
> > dev_hold() has been an increment of a refcount, and dev_put() a decrement.
> >
> > Not sure why it is fundamentally broken.
>
> Jann described a case where someone does
>
>    CPU 0        CPU 1        CPU 2
>
>  dev_hold()
>    ------  #unregister -------
>               dev_hold()
>                            dev_put()
>
> Our check for refcount == 0 goes over the CPUs one by one,
> so if it sums up CPUs 0 and 1 at the "unregister" point above
> and CPU2 after the CPU1 hold and CPU2 release it will "miss"
> one refcount.
>
> That's a problem unless doing a dev_hold() on a netdev we only have
> a reference on is illegal.

What is 'illegal' is trying to keep using the device after #unregister.

We have barriers to prevent that.

Somehow a layer does not care about the barriers and pretends the
device is still good to use.

It is of course perfectly fine to stack multiple dev_hold() from one
path (if these do not leak, but this is a different issue)

>
> > There are specific steps at device dismantles making sure no more
> > users can dev_hold()
> >
> > It is a contract. Any buggy layer can overwrite any piece of memory,
> > including a refcount_t.
> >
> > Traditionally we could not add a test in dev_hold() to prevent an
> > increment if the device is in dismantle phase.
> > Maybe the situation is better nowadays.
>
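
For illustration only, the interleaving quoted above can be replayed
deterministically in userspace. This is a rough sketch, not kernel code:
it assumes the refcount is kept as one counter per CPU and that the
"refcount == 0" check reads those counters one CPU at a time, as the
quoted description says. The names pcpu_refcnt, sim_dev_hold(),
sim_dev_put(), sim_refcnt_sum() and sim_racy_scan() are hypothetical
stand-ins made up for this example.

```c
#include <stdio.h>

#define NR_CPUS 3

/* Hypothetical stand-in for a per-CPU refcount: one signed slot per CPU. */
static int pcpu_refcnt[NR_CPUS];

/* Simulated dev_hold()/dev_put(): touch only the calling CPU's slot. */
static void sim_dev_hold(int cpu) { pcpu_refcnt[cpu]++; }
static void sim_dev_put(int cpu)  { pcpu_refcnt[cpu]--; }

/* Exact sum of all slots, valid only when nothing else is running. */
static int sim_refcnt_sum(void)
{
	int cpu, sum = 0;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		sum += pcpu_refcnt[cpu];
	return sum;
}

/*
 * Replay the quoted interleaving in a single thread and return the
 * total that the CPU-by-CPU scan observes.
 */
static int sim_racy_scan(void)
{
	int cpu, seen = 0;

	/* Start from a clean state so the replay is repeatable. */
	for (cpu = 0; cpu < NR_CPUS; cpu++)
		pcpu_refcnt[cpu] = 0;

	sim_dev_hold(0);		/* CPU 0: the reference held before unregister */

	/* ------ #unregister: the scan starts walking the CPUs ------ */
	seen += pcpu_refcnt[0];		/* reads 1 */
	seen += pcpu_refcnt[1];		/* reads 0 */

	sim_dev_hold(1);		/* CPU 1: hold after its slot was already read */
	sim_dev_put(2);			/* CPU 2: put before its slot is read */

	seen += pcpu_refcnt[2];		/* reads -1 */

	return seen;
}
```

Under these assumptions sim_racy_scan() returns 0, while a following
sim_refcnt_sum() returns 1: the scan concludes the device is unused
although one reference is still outstanding. The sketch only shows the
arithmetic of the miscount; it says nothing about whether a late
dev_hold() like the one on CPU 1 is permitted in the first place.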