On Tue, Jun 06, 2023 at 09:17:24PM +0200, Mirsad Goran Todorovac wrote:
> On 6/6/23 20:50, Guillaume Nault wrote:
> > On Tue, Jun 06, 2023 at 04:28:02PM +0200, Mirsad Todorovac wrote:
> > > On 6/6/23 16:11, Guillaume Nault wrote:
> > > > On Tue, Jun 06, 2023 at 03:57:35PM +0200, Mirsad Todorovac wrote:
> > > > > +	if (oif) {
> > > > > +		rcu_read_lock();
> > > > > +		dev = dev_get_by_index_rcu(net, oif);
> > > > > +		rcu_read_unlock();
> > > >
> > > > You can't assume '*dev' is still valid after rcu_read_unlock() unless
> > > > you hold a reference on it.
> > > >
> > > > > +		rtnl_lock();
> > > > > +		mdev = netdev_master_upper_dev_get(dev);
> > > > > +		rtnl_unlock();
> > > >
> > > > Because of that, 'dev' might have already disappeared at the time
> > > > netdev_master_upper_dev_get() is called. So it may dereference an
> > > > invalid pointer here.
> > >
> > > Good point, thanks. I didn't expect those to change.
> > >
> > > This can be fixed, provided that RCU and RTNL locks can be nested:
> >
> > Well, yes and no. You can call rcu_read_{lock,unlock}() while under the
> > rtnl protection, but not the other way around.
> >
> > > rcu_read_lock();
> > > if (oif) {
> > > 	dev = dev_get_by_index_rcu(net, oif);
> > > 	rtnl_lock();
> > > 	mdev = netdev_master_upper_dev_get(dev);
> > > 	rtnl_unlock();
> > > }
> >
> > This is invalid: rtnl_lock() uses a mutex, so it can sleep and that's
> > forbidden inside an RCU critical section.
>
> Obviously, that's bad. Mea culpa.
>
> > > if (sk->sk_bound_dev_if) {
> > > 	bdev = dev_get_by_index_rcu(net, sk->sk_bound_dev_if);
> > > }
> > >
> > > addr_type = ipv6_addr_type(daddr);
> > > if ((__ipv6_addr_needs_scope_id(addr_type) && !oif) ||
> > >     (addr_type & IPV6_ADDR_MAPPED) ||
> > >     (oif && sk->sk_bound_dev_if && oif != sk->sk_bound_dev_if &&
> > >      !(mdev && sk->sk_bound_dev_if && bdev && mdev == bdev))) {
> > > 	rcu_read_unlock();
> > > 	return -EINVAL;
> > > }
> > > rcu_read_unlock();
> > >
> > > But again this is still probably not race-free (bdev might also disappear before
> > > the mdev == bdev test), even if it passed fcnal-test.sh, there is much duplication
> > > of code, so your one-line solution is obviously by far better. :-)
> >
> > The real problem is choosing the right function for getting the master
> > device. In particular netdev_master_upper_dev_get() was a bad choice.
> > It forces you to take the rtnl, which is unnatural here and obliges you
> > to add extra code, while all this shouldn't be necessary in the first
> > place.
>
> Thank you for the additional insight. I had poor luck with Googling on
> these.
>
> I made a blunder after blunder. But it was insightful and brainstorming.
> Good exercise for my little grey cells.
>
> However, learning without making any errors appears to be simply a lot
> of blunt memorising. :-/
>
> It's good to be in an environment when one can learn from errors.
>
> :-)

I'm happy you found this useful.

> Regards,
> Mirsad
>
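
For reference, here is a minimal sketch of the pattern discussed above. It is
not the one-line fix itself, which is not quoted in this message. The idea is
to look up the device and its master inside a single RCU read-side section,
using the RCU variant netdev_master_upper_dev_get_rcu() so that no rtnl is
needed, and to copy out only the ifindex so that no pointer is used after
rcu_read_unlock(). To keep it compilable outside the kernel tree, the RCU
primitives and the two lookup functions are replaced by userspace stubs that
only borrow the kernel names. The struct layouts, master_ifindex_of() and
oif_matches_bound_dev() are hypothetical and exist only for illustration.

```c
/* Userspace sketch: the stubs below stand in for kernel primitives so the
 * control flow can be compiled and read outside the kernel tree.
 */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct net_device {
	int ifindex;
	struct net_device *master;	/* stub field: upper master, or NULL */
};

struct net {
	struct net_device **devs;	/* stub field: device table */
	size_t ndevs;
};

/* Stubs: in the kernel these delimit an RCU read-side critical section,
 * inside which sleeping (e.g. taking the rtnl mutex) is forbidden.
 */
static void rcu_read_lock(void) { }
static void rcu_read_unlock(void) { }

/* Stub borrowing the kernel function's name: look a device up by ifindex. */
static struct net_device *dev_get_by_index_rcu(struct net *net, int ifindex)
{
	assert(net != NULL);
	for (size_t i = 0; i < net->ndevs; i++)
		if (net->devs[i]->ifindex == ifindex)
			return net->devs[i];
	return NULL;
}

/* Stub borrowing the kernel function's name: master lookup without rtnl. */
static struct net_device *netdev_master_upper_dev_get_rcu(struct net_device *dev)
{
	return dev->master;
}

/* Hypothetical helper: ifindex of oif's master device, or 0 if there is none.
 * Both pointers are only dereferenced between lock and unlock; the caller
 * gets a plain integer that stays meaningful after the section ends.
 */
int master_ifindex_of(struct net *net, int oif)
{
	struct net_device *dev, *mdev;
	int ifindex = 0;

	rcu_read_lock();
	dev = dev_get_by_index_rcu(net, oif);
	if (dev) {
		mdev = netdev_master_upper_dev_get_rcu(dev);
		if (mdev)
			ifindex = mdev->ifindex;
	}
	rcu_read_unlock();

	return ifindex;
}

/* Hypothetical check: oif is acceptable if nothing is bound, if it equals
 * the bound device, or if its master is the bound device.
 */
bool oif_matches_bound_dev(struct net *net, int bound_dev_if, int oif)
{
	if (!bound_dev_if || !oif || oif == bound_dev_if)
		return true;
	return master_ifindex_of(net, oif) == bound_dev_if;
}
```

Comparing interface indexes instead of 'mdev == bdev' also avoids having to
keep 'bdev' alive across the test.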