Re: msgr2 and NAT

Jeff Layton <jlayton@xxxxxxxxxx> · Tue, 29 Jan 2019 05:45:39 -0500

On Mon, 2019-01-28 at 08:59 +0000, Sage Weil wrote:
> msgr1 has some super kludgey behavior that's used to detect what IP 
> address the client or daemon should identify as.  If the daemon hasn't 
> explicitly bound to a specific IP (i.e., it bound to 0.0.0.0, [::], or 
> didn't bind at all) then the first time it connects to another peer the 
> peer will send the IP we appear to be connecting from in the initial 
> banner.
> 
> It seems to have worked out mostly okay, but it's definitely a bit weird.  
> The first connection is always to the monitor, so this means that the IP 
> that an OSD or MDS daemon uses is always the one that is on the same network 
> as the monitor (or whichever IP the kernel decides to use to route to it).
> 
> A side-effect of this is that, in theory, a client that is behind NAT 
> could connect to a ceph cluster.  It will end up being identified by the 
> NATed IP that the cluster sees and the random 64-bit nonce.
> 
> Note that (to my knowledge) this has never been tested, so it only 
> theoretically works.
> 
> The initial msgr2 implementation simplifies this by instead calling 
> getsockname(2) on the first outgoing connection to see what IP we're 
> connecting from.  That removes the weird dependency on the other end telling 
> us who we are, but it means that NAT won't work.
> 
> So... should we try to make the NAT scenario work in msgr2?
> 
> We can do it with a minor-ish change to have the accepting end share our 
> apparent IP sooner in the exchange (probably after the initial banner).  
> (The current code shares it as part of the server_ident, but that's too 
> late in the exchange to serve the same role it did in msgr1.)
> 
> sage
> 

I wouldn't jump through hoops to enable NAT here. We might be able to
get it right on some simpler setups, but it could turn out to be a
headache with more complex setups (maybe where some clients are NAT'ed
and some aren't).

If we're concerned about breaking old setups that might rely on
something like this, we may want to provide a workaround though. If it
doesn't already exist, possibly a per-daemon config option that allows
the admin to explicitly specify the addresses it advertises?
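
To make that concrete, here is a rough Python sketch of what I have in
mind. It is not the actual messenger code: local_addr_for() and the
advertised_addr parameter are hypothetical names standing in for the
config option. If the admin supplies an address, it wins; otherwise we
fall back to the getsockname(2) approach described above and use
whatever source IP the kernel picked for the first outgoing connection.

```python
import socket


def local_addr_for(peer_host, peer_port, advertised_addr=None):
    """Return the IP this process should identify itself as.

    advertised_addr is a hypothetical admin-supplied override. When it
    is set, no connection is made and the value is returned as-is.
    """
    if advertised_addr is not None:
        return advertised_addr

    # Connect to the peer (e.g. a monitor) and ask the kernel which
    # local address it chose for this connection.
    with socket.create_connection((peer_host, peer_port)) as sock:
        return sock.getsockname()[0]
```

Behind NAT, the getsockname(2) branch returns the private address rather
than the one the cluster sees, which is exactly the case where the
explicit override would be needed.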
-- 
Jeff Layton <jlayton@xxxxxxxxxx>