2024-11-14, 11:32:36 +0100, Antonio Quartulli wrote:
> On 13/11/2024 12:05, Sabrina Dubroca wrote:
> > 2024-11-12, 15:26:59 +0100, Antonio Quartulli wrote:
> > > On 11/11/2024 16:41, Sabrina Dubroca wrote:
> > > > 2024-10-29, 11:47:31 +0100, Antonio Quartulli wrote:
> > > > > +void ovpn_peer_hash_vpn_ip(struct ovpn_peer *peer)
> > > > > +	__must_hold(&peer->ovpn->peers->lock)
> > > >
> > > > Changes to peer->vpn_addrs are not protected by peers->lock, so those
> > > > could be getting updated while we're rehashing (and taking peer->lock
> > > > in ovpn_nl_peer_modify as I'm suggesting above also wouldn't prevent
> > > > that).
> > > >
> > >
> > > /me screams :-D
> >
> > Sorry :)
> >
> > > Indeed peers->lock is only about protecting the lists, not the content of
> > > the listed objects.
> > >
> > > How about acquiring the peers->lock before calling ovpn_nl_peer_modify()?
> >
> > It seems like it would work. Maybe a bit weird to have conditional
> > locking (MP mode only), but ok. You already have this lock ordering
> > (hold peers->lock before taking peer->lock) in
> > ovpn_peer_keepalive_work_mp, so there should be no deadlock from doing
> > the same thing in the netlink code.
>
> Yeah.
>
> >
> > Then I would also do that in ovpn_peer_float to protect that rehash.
>
> I am not extremely comfortable with this, because it means acquiring
> peers->lock on every packet (right now we do so only on peer->lock) and it
> may defeat the advantage of the RCU locking on the hashtables.
> Wouldn't you agree?

Hmpf, yeah. Then I think you could keep most of the current code,
except doing the rehash under both locks (peers + peer), and get
ss+sa_len for the rehash directly from peer->bind (instead of using
the ones we just defined locally in ovpn_peer_float, since they may
have changed while we released peer->lock to grab peers->lock).

We may end up "rehashing" twice into the same bucket if we have 2
concurrent peer_float calls (call 1 sets remote r1, call 2 sets a new
one r2, call 1 hashes according to r2, call 2 also rehashes based on
r2). That should be ok (it can happen anyway that a "real" rehash
lands in the same bucket).

peer_float
{
    spin_lock(peer)
    match/update bind
    spin_unlock(peer)

    if (MP) {
        spin_lock(peers)
        spin_lock(peer)
        rehash using peer->bind->remote rather than ss
        spin_unlock(peer)
        spin_unlock(peers)
    }
}

Does that sound reasonable?

-- 
Sabrina
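
For illustration only, here is a minimal userspace sketch of the peer_float
pseudocode above. It is not the ovpn code: pthread mutexes stand in for the
kernel spinlocks, and the struct layouts, the bucket field, hash_remote() and
the init helpers are all hypothetical names made up for this example. The
early return when the remote did not change is also an assumption. The point
it tries to show is the lock ordering (peers->lock before peer->lock) and that
the rehash reads the remote from peer->bind again instead of reusing the local
ss value.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

#define NBUCKETS 16

/* Hypothetical stand-in for the peer's current remote endpoint. */
struct bind {
	uint32_t remote;
};

struct peer {
	pthread_mutex_t lock;	/* protects bind */
	struct bind bind;
	unsigned int bucket;	/* written only with peers->lock held */
};

struct peers {
	pthread_mutex_t lock;	/* protects the (imaginary) hash lists */
};

/* Hypothetical hash: maps a remote to a bucket index. */
static unsigned int hash_remote(uint32_t remote)
{
	return remote % NBUCKETS;
}

static void peers_init(struct peers *peers)
{
	pthread_mutex_init(&peers->lock, NULL);
}

static void peer_init(struct peer *peer, uint32_t remote)
{
	pthread_mutex_init(&peer->lock, NULL);
	peer->bind.remote = remote;
	peer->bucket = hash_remote(remote);
}

static void peer_float(struct peers *peers, struct peer *peer,
		       uint32_t ss, bool mp)
{
	bool changed;

	/* Step 1: match/update the binding under peer->lock only. */
	pthread_mutex_lock(&peer->lock);
	changed = peer->bind.remote != ss;
	if (changed)
		peer->bind.remote = ss;
	pthread_mutex_unlock(&peer->lock);

	/* Assumption: nothing to rehash if the remote is unchanged or not MP. */
	if (!changed || !mp)
		return;

	/*
	 * Step 2: rehash under both locks, peers->lock first. The remote is
	 * read from peer->bind again because a concurrent peer_float() may
	 * have updated it while no lock was held; ss could be stale here.
	 */
	pthread_mutex_lock(&peers->lock);
	pthread_mutex_lock(&peer->lock);
	peer->bucket = hash_remote(peer->bind.remote);
	pthread_mutex_unlock(&peer->lock);
	pthread_mutex_unlock(&peers->lock);
}
```

With two concurrent callers, both may end up computing the bucket from the
same (latest) remote, which is the harmless "double rehash" described above.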