Re: Query the verdict for a hypothetical packet

zrm <zrm@xxxxxxxxxxxxxxx> · Wed, 7 Mar 2018 15:26:33 -0500

On 03/07/2018 04:02 AM, Robert White wrote:
> On 02/27/2018 10:56 PM, zrm wrote:
>> It's not that I'm trying to create a high availability gateway, it's
>> that I'm trying to create a daemon that processes Port Control Protocol
>> requests. The protocol is specifically designed to do this, I'm just
>> trying to implement it.

> Properly implemented, the PEER request used to restore a previous dynamic
> mapping (e.g. RFC6887 section 10.4) overrides or circumvents what "would
> have happened" in the dynamic path. Therefore you _never_ want to know
> what treatment a "hypothetical packet" "would have had".

So, it's complicated.

You want to override the default address and port translation, but only 
if that would have been possible given the configuration. For example, 
suppose the system administrator has the following rules:

iptables -A FORWARD -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
iptables -A FORWARD -o inet0 -p tcp --dport 25 -j REJECT
iptables -A FORWARD -i lan0 -j ACCEPT
iptables -A FORWARD -j REJECT
iptables -t nat -A POSTROUTING -o inet0 -j SNAT --to-source 1.2.3.4

Now we get a PCP PEER request that asks for tcp src 192.168.1.5:2345 dst 
5.6.7.8:25 to be translated to external src 9.10.11.12:1024.

That request should be refused. The external IP is no good and the 
destination port is 25 which should be rejected by the second rule. If 
we created the conntrack entry the translation would circumvent the SNAT 
rule and wrongly cause the connection to be accepted by the first rule.

But a PCP PEER request that says tcp src 192.168.2.5:2345 dst 5.6.7.8:22 
should be translated to external src 1.2.3.4:1024 is fine, even if a new 
actual outgoing packet would have been translated to src 1.2.3.4:2345.
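
To make the two cases concrete, here is a toy sketch of the check the
daemon would need. It is not a netfilter interface: forward_verdict and
peer_allowed are hypothetical helpers, the four FORWARD rules are
transcribed by hand, and the external address 1.2.3.4 is hard-coded from
the SNAT rule. The whole point of this thread is that I'd like the
kernel to answer this question instead.

```shell
#!/bin/sh
# Toy model of the example FORWARD chain: the first matching rule wins.
# usage: forward_verdict IN_IF OUT_IF PROTO DPORT CTSTATE
forward_verdict() {
    in_if=$1 out_if=$2 proto=$3 dport=$4 ctstate=$5

    # Rule 1: -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
    case $ctstate in
        RELATED|ESTABLISHED) echo ACCEPT; return ;;
    esac
    # Rule 2: -o inet0 -p tcp --dport 25 -j REJECT
    if [ "$out_if" = inet0 ] && [ "$proto" = tcp ] && [ "$dport" = 25 ]; then
        echo REJECT; return
    fi
    # Rule 3: -i lan0 -j ACCEPT
    if [ "$in_if" = lan0 ]; then
        echo ACCEPT; return
    fi
    # Rule 4: -j REJECT
    echo REJECT
}

# A PEER request is acceptable only if the requested external address is
# the one the SNAT rule would use and the first (ctstate NEW) packet of
# the connection would have been accepted.
# usage: peer_allowed DPORT EXT_ADDR
peer_allowed() {
    [ "$2" = 1.2.3.4 ] || return 1
    [ "$(forward_verdict lan0 inet0 tcp "$1" NEW)" = ACCEPT ]
}
```

With this model, peer_allowed 25 9.10.11.12 fails and peer_allowed 22
1.2.3.4 succeeds. Note that forward_verdict lan0 inet0 tcp 25
ESTABLISHED gives ACCEPT, which is the circumvention described above if
the conntrack entry is created without checking first.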

> For example your rules may well use random port mapping, or conflicting
> port mapping, so the peer request of this type should _coerce_ the new
> mapping to duplicate the old "as early in the NAT flow as possible".
>
> So during the first run the NAT mapping happens naturally.
> Then the gateway dips out of service.
> Then the gateway comes back.
> Then the authorized actor (client) coerces the old mapping.
>
> One of the best ways to do this would be to have some sets, consulted
> at the head of the SNAT and DNAT sections of the postrouting and
> prerouting chains or hooked nft chains.
>
> In prerouting you DNAT the incoming (public-to-private) packets and in
> postrouting you SNAT the outgoing (private-to-public) packets. Whichever
> rule gets hit first by "a real packet" will then create the necessary
> connection tracking information.

That's pretty close, but then the conntrack entry doesn't exist until 
there is traffic. That risks some subsequent other connection getting 
the same translation, but the bigger issue is that without the conntrack 
entry, incoming packets wouldn't be accepted for the connection.
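
For reference, the rules-at-the-head-of-the-chains approach might look
roughly like this with plain iptables rather than sets. This is only a
sketch: peer_nat_rules and the run wrapper are hypothetical helpers,
inet0 is the interface name from the example rules, and by default the
commands are printed rather than executed.

```shell
#!/bin/sh
# Print each command instead of executing it unless DRY_RUN=0 is set.
run() { if [ "${DRY_RUN:-1}" = 1 ]; then echo "$*"; else "$@"; fi; }

# Hypothetical helper: pin one TCP PEER mapping at the head of the nat chains.
# usage: peer_nat_rules INT_ADDR INT_PORT REMOTE_ADDR REMOTE_PORT EXT_ADDR EXT_PORT
peer_nat_rules() {
    # Outgoing (private-to-public): force the requested external source.
    run iptables -t nat -I POSTROUTING 1 -o inet0 -p tcp \
        -s "$1" --sport "$2" -d "$3" --dport "$4" \
        -j SNAT --to-source "$5:$6"
    # Incoming (public-to-private): map the external tuple back inside.
    run iptables -t nat -I PREROUTING 1 -i inet0 -p tcp \
        -s "$3" --sport "$4" -d "$5" --dport "$6" \
        -j DNAT --to-destination "$1:$2"
}

peer_nat_rules 192.168.2.5 2345 5.6.7.8 22 1.2.3.4 1024
```

Neither rule creates a conntrack entry by itself, and neither touches
the FORWARD chain, so the two problems above remain.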

An ACCEPT rule could be added exactly matching the connection, but then 
we're back to square one, needing to know if the original outgoing 
packet would be accepted before we should add the ACCEPT rule.

> In practice, however, there is a non-trivial chance that one or
> both endpoints will already know that the old session is broken. They
> will have gotten various error responses while the gateway was down, or
> in the interval between the restart and the PEER event.
>
> Having a rational back-off-and-restart built into the protocol is _way_
> more useful than implementing PEER service for this purpose. This is
> part of why most of PCP is just kind of laying there on the floor
> instead of being part of every distro.

That is true for new protocols, but that leaves all the existing 
protocols and existing third party servers.

The place where recreating state with PEER really seems to be useful is 
for long-lived mostly-idle connections like ssh, VPN tunnels, messaging 
protocols, etc. Then the PCP reset announcement comes right away and the 
next packet of the connection is minutes or hours away, and the client 
can recreate the NAT state using PEER without the remote server or 
connection protocol having to do or support anything.

> This is the ugly race that page 38 items i and ii are all on about, and
> so you've got way more kernel hacking to do to really, properly restart
> a link this way.

Do those two really require that much kernel hacking? Why not just a 
couple of DROP rules for packets with ! ctstate RELATED,ESTABLISHED from 
the outside and with ctstate INVALID from the inside?
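
Spelled out, I mean something like the following. This is a sketch
only: stale_state_drops and the run wrapper are hypothetical helpers,
the interface names come from the example rules earlier, and the
commands are printed rather than executed by default. Whether two rules
like these are actually sufficient is the question I'm asking.

```shell
#!/bin/sh
# Print each command instead of executing it unless DRY_RUN=0 is set.
run() { if [ "${DRY_RUN:-1}" = 1 ]; then echo "$*"; else "$@"; fi; }

# Hypothetical helper: insert both DROP rules at the head of FORWARD.
stale_state_drops() {
    # From the outside: drop anything that is not RELATED or ESTABLISHED.
    run iptables -I FORWARD 1 -i inet0 -m conntrack ! --ctstate RELATED,ESTABLISHED -j DROP
    # From the inside: drop packets that conntrack classifies as INVALID.
    run iptables -I FORWARD 1 -i lan0 -m conntrack --ctstate INVALID -j DROP
}

stale_state_drops
```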

> ---
>
> ASIDE: yes I was thinking about writing a PCP daemon myself, mostly to
> deal with some of the UPnP nonsense my PlayStation 4 does.
>
> ---

UPnP sure is a terrible protocol. RFC6886 Section 9 is essentially five 
pages of why everybody should stop using it in favor of NAT-PMP/PCP.

> On the opposite side, some protocols like SIP really could do with
> robust MAP operation, even if they then want to just reconnect with
> new ports/addresses after a short outage.

MAP is _safer_ to support because you can put the ACCEPT rule for the 
MAP just before the default REJECT rule and after everything else, so 
any more specific REJECT rules still apply to the MAP. But you're really 
still in the same boat with needing to know which packets are accepted. 
Suppose you have this:

iptables -A FORWARD -p tcp --dport 22 -j REJECT

Then you still want a MAP request with internal tcp port 22 to give 
NOT_AUTHORIZED instead of SUCCESS, even if creating the mapping doesn't 
actually allow the traffic, because it's wrong to tell the client 
SUCCESS when all traffic for the mapping is being rejected.
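
A rough sketch of what that means for the daemon, under loud
assumptions: map_request and the run wrapper are hypothetical helpers,
the port 22 test is a hand-written stand-in for the REJECT rule above
(the real daemon would need to learn it from the ruleset), and
REJECT_POS=4 assumes the default REJECT is rule 4 of FORWARD as in the
first example. iptables -I FORWARD N inserts before the rule currently
at position N, so the new ACCEPT lands just ahead of the default REJECT.
Commands are printed rather than executed by default.

```shell
#!/bin/sh
# Print each command instead of executing it unless DRY_RUN=0 is set.
run() { if [ "${DRY_RUN:-1}" = 1 ]; then echo "$*"; else "$@"; fi; }

# Assumed position of the default REJECT rule in the FORWARD chain.
REJECT_POS=${REJECT_POS:-4}

# Hypothetical helper: handle a TCP MAP request for an internal host.
# Prints the rule it would add (if any), then the PCP result code.
# usage: map_request INT_ADDR INT_PORT
map_request() {
    # Stand-in for "-A FORWARD -p tcp --dport 22 -j REJECT": the mapping
    # would never carry traffic, so do not report SUCCESS.
    if [ "$2" = 22 ]; then
        echo NOT_AUTHORIZED
        return
    fi
    # Insert the ACCEPT just before the default REJECT, after all the
    # more specific rules, so those still apply to the mapping.
    run iptables -I FORWARD "$REJECT_POS" -i inet0 -p tcp \
        -d "$1" --dport "$2" -j ACCEPT
    echo SUCCESS
}

map_request 192.168.1.5 22
map_request 192.168.1.5 8080
```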