Re: [PATCH net-next v11 18/23] ovpn: implement peer add/get/dump/delete via netlink

Sabrina Dubroca <sd@xxxxxxxxxxxxxxx> · Fri, 29 Nov 2024 18:00:14 +0100

2024-11-14, 11:32:36 +0100, Antonio Quartulli wrote:
> On 13/11/2024 12:05, Sabrina Dubroca wrote:
> > 2024-11-12, 15:26:59 +0100, Antonio Quartulli wrote:
> > > On 11/11/2024 16:41, Sabrina Dubroca wrote:
> > > > 2024-10-29, 11:47:31 +0100, Antonio Quartulli wrote:
> > > > > +void ovpn_peer_hash_vpn_ip(struct ovpn_peer *peer)
> > > > > +	__must_hold(&peer->ovpn->peers->lock)
> > > > 
> > > > Changes to peer->vpn_addrs are not protected by peers->lock, so those
> > > > could be getting updated while we're rehashing (and taking peer->lock
> > > > in ovpn_nl_peer_modify as I'm suggesting above also wouldn't prevent
> > > > that).
> > > > 
> > > 
> > > /me screams :-D
> > 
> > Sorry :)
> > 
> > > Indeed peers->lock is only about protecting the lists, not the content of
> > > the listed objects.
> > > 
> > > How about acquiring the peers->lock before calling ovpn_nl_peer_modify()?
> > 
> > It seems like it would work. Maybe a bit weird to have conditional
> > locking (MP mode only), but ok. You already have this lock ordering
> > (hold peers->lock before taking peer->lock) in
> > ovpn_peer_keepalive_work_mp, so there should be no deadlock from doing
> > the same thing in the netlink code.
> 
> Yeah.
> 
> > 
> > Then I would also do that in ovpn_peer_float to protect that rehash.
> 
> I am not extremely comfortable with this, because it means acquiring
> peers->lock on every packet (right now we do so only on peer->lock) and it
> may defeat the advantage of the RCU locking on the hashtables.
> Wouldn't you agree?

Hmpf, yeah. Then I think you could keep most of the current code,
except for doing the rehash under both locks (peers->lock, then
peer->lock), and getting ss+sa_len for the rehash directly from
peer->bind instead of using the ones we just defined locally in
ovpn_peer_float, since the bind may have changed while we released
peer->lock to grab peers->lock.

We may end up "rehashing" twice into the same bucket if we have 2
concurrent peer_float calls (call 1 sets remote r1, call 2 sets a new
one r2, call 1 rehashes according to r2, call 2 also rehashes based on
r2). That should be ok (it can happen anyway that a "real" rehash lands
in the same bucket).

peer_float {
  spin_lock(peer->lock)
  match/update bind
  spin_unlock(peer->lock)

  if (MP) {
    spin_lock(peers->lock)
    spin_lock(peer->lock)
    rehash using peer->bind->remote rather than ss
    spin_unlock(peer->lock)
    spin_unlock(peers->lock)
  }
}
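
For illustration only, the flow above can be modelled in userspace C
roughly as follows. This is a sketch under assumptions, not ovpn code:
the spinlock is a toy built on atomic_flag, the "hash table" is just a
per-bucket counter, hash_remote() is a plain modulo, and the early
return when the remote did not change is my own addition. The point is
only the ordering: update under peer->lock, drop it, then take
peers->lock before peer->lock and re-read the remote from the peer
instead of trusting the local copy.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define NBUCKETS 8

/* Toy userspace spinlock standing in for the kernel's spinlock_t. */
struct spinlock { atomic_flag held; };

void spin_init(struct spinlock *l) { atomic_flag_clear(&l->held); }

void spin_lock(struct spinlock *l)
{
	while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
		; /* busy-wait until the holder releases the lock */
}

void spin_unlock(struct spinlock *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Hypothetical stand-ins for struct ovpn_peer and the peers collection. */
struct peer {
	struct spinlock lock;	/* plays the role of peer->lock */
	uint32_t remote;	/* plays the role of peer->bind->remote */
	int bucket;		/* current bucket, -1 if not hashed yet */
};

struct peers {
	struct spinlock lock;	/* plays the role of peers->lock */
	int bucket_len[NBUCKETS];
};

void peers_init(struct peers *peers)
{
	spin_init(&peers->lock);
	for (int i = 0; i < NBUCKETS; i++)
		peers->bucket_len[i] = 0;
}

void peer_init(struct peer *peer, uint32_t remote)
{
	spin_init(&peer->lock);
	peer->remote = remote;
	peer->bucket = -1;
}

/* Placeholder hash; the real code hashes the full sockaddr. */
unsigned int hash_remote(uint32_t remote) { return remote % NBUCKETS; }

void peer_float(struct peers *peers, struct peer *peer, uint32_t ss, bool mp)
{
	bool changed;

	/* Step 1: match/update the binding under peer->lock only. */
	spin_lock(&peer->lock);
	changed = peer->remote != ss;
	if (changed)
		peer->remote = ss;
	spin_unlock(&peer->lock);

	if (!mp || !changed)
		return;

	/* Step 2: rehash under both locks, peers->lock first. */
	spin_lock(&peers->lock);
	spin_lock(&peer->lock);
	/* Use peer->remote, not ss: a concurrent float may have replaced it. */
	unsigned int b = hash_remote(peer->remote);
	if (peer->bucket >= 0) {
		assert(peers->bucket_len[peer->bucket] > 0);
		peers->bucket_len[peer->bucket]--;
	}
	peers->bucket_len[b]++;
	peer->bucket = (int)b;
	spin_unlock(&peer->lock);
	spin_unlock(&peers->lock);
}
```

With two concurrent callers, both may end up in step 2 and both will
hash on whatever peer->remote holds at that point, which is the
harmless "rehash twice into the same bucket" case described above.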

Does that sound reasonable?

-- 
Sabrina