Patrick McHardy wrote:
As soon as we removed the SAME target, I got complaints from users who need persistent IPs not only when talking to the same destination, but for all destinations, which NAT currently doesn't provide. I don't want to resurrect the SAME target because of the 32/64 bit compat problems it had; it would be better to handle this in the NAT core.

The IP is currently determined by hashing the source and destination IPs and mapping the hash to the NAT range:

	minip = ntohl(range->min_ip);
	maxip = ntohl(range->max_ip);
	j = jhash_2words((__force u32)tuple->src.u3.ip,
			 (__force u32)tuple->dst.u3.ip, 0);
	j = ((u64)j * (maxip - minip + 1)) >> 32;
	*var_ipp = htonl(minip + j);

We have two options:

- add a flag to the NAT range to ignore the destination IP for SNAT
- always ignore the destination IP for SNAT

I personally prefer the second option since it results in more consistency and avoids adding a new option. I can't think of a reason why we would need to include the destination for SNAT; using jhash should result in good distribution anyway, but I might be missing something.

Any opinions?
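To illustrate the range mapping quoted above, here is a minimal userspace sketch of the multiply-and-shift step. The helper names scale_to_range() and pick_from_range() are made up for this illustration and do not exist in the kernel; addresses are assumed to be in host byte order.

```c
#include <stdint.h>

/* Map a 32-bit hash onto [0, n) without a modulo: hash / 2^32 is treated
 * as a fraction in [0, 1) and multiplied by n. For n >= 1 the result is
 * always smaller than n. (Hypothetical helper, not a kernel function.) */
static uint32_t scale_to_range(uint32_t hash, uint32_t n)
{
	return (uint32_t)(((uint64_t)hash * n) >> 32);
}

/* Pick an address from the inclusive range [minip, maxip] for a given
 * hash value, the same arithmetic as the two lines computing j and
 * *var_ipp in the quoted snippet. */
static uint32_t pick_from_range(uint32_t hash, uint32_t minip, uint32_t maxip)
{
	return minip + scale_to_range(hash, maxip - minip + 1);
}
```

With a four-address range, a hash of 0 selects the first address, 0x80000000 the third, and 0xFFFFFFFF the last, so the hash space is divided into equal slices rather than reduced with a remainder.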
I've queued this patch implementing the second option. I'll push it for 2.6.25 since from a user's perspective this constitutes a regression, even though the removal had been announced for quite some time.
commit c8fe51f524b3098adff20bb79105bb6dfe4db8a4
Author: Patrick McHardy <kaber@xxxxxxxxx>
Date:   Fri Feb 22 17:16:08 2008 +0100

    [NETFILTER]: nf_nat: always select same SNAT source for same host

    We've removed the SAME target in 2.6.25-rc since it had 32/64 bit
    compat problems and the NAT core provides the same behaviour regarding
    IP selection. This turned out to be not entirely correct though, the
    NAT core only selects the same IP from a range for the same src,dst
    combination. Some people need the same IP for all destinations however.

    The easiest way to do this is to ignore the destination IP when doing
    SNAT. Since we're using jhash, we still get good distribution for
    multiple source IPs.

    Signed-off-by: Patrick McHardy <kaber@xxxxxxxxx>

diff --git a/net/ipv4/netfilter/nf_nat_core.c b/net/ipv4/netfilter/nf_nat_core.c
index 0d5fa3a..8e1cae2 100644
--- a/net/ipv4/netfilter/nf_nat_core.c
+++ b/net/ipv4/netfilter/nf_nat_core.c
@@ -188,15 +188,19 @@ find_best_ips_proto(struct nf_conntrack_tuple *tuple,
 	__be32 *var_ipp;
 	/* Host order */
 	u_int32_t minip, maxip, j;
+	__be32 dst;
 
 	/* No IP mapping?  Do nothing. */
 	if (!(range->flags & IP_NAT_RANGE_MAP_IPS))
 		return;
 
-	if (maniptype == IP_NAT_MANIP_SRC)
+	if (maniptype == IP_NAT_MANIP_SRC) {
 		var_ipp = &tuple->src.u3.ip;
-	else
+		dst = 0;
+	} else {
 		var_ipp = &tuple->dst.u3.ip;
+		dst = tuple->dst.u3.ip;
+	}
 
 	/* Fast path: only one choice. */
 	if (range->min_ip == range->max_ip) {
@@ -212,8 +216,7 @@ find_best_ips_proto(struct nf_conntrack_tuple *tuple,
 	 * like this), even across reboots. */
 	minip = ntohl(range->min_ip);
 	maxip = ntohl(range->max_ip);
-	j = jhash_2words((__force u32)tuple->src.u3.ip,
-			 (__force u32)tuple->dst.u3.ip, 0);
+	j = jhash_2words((__force u32)tuple->src.u3.ip, (__force u32)dst, 0);
 	j = ((u64)j * (maxip - minip + 1)) >> 32;
 	*var_ipp = htonl(minip + j);
 }
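
For anyone who wants to check the effect of the patch outside the kernel, the following is a rough userspace sketch of the patched selection logic. It makes several assumptions: mix2() is an arbitrary stand-in mixer and NOT the kernel's jhash_2words(), so the addresses it picks will differ from what the kernel picks; enum manip and select_nat_ip() are names invented for this example; and all addresses are in host byte order.

```c
#include <stdint.h>

/* Stand-in 32-bit mixer for illustration only; this is NOT the kernel's
 * jhash_2words() and produces different values. */
static uint32_t mix2(uint32_t a, uint32_t b)
{
	uint32_t h = a * 0x9E3779B1u;

	h ^= h >> 15;
	h = (h + b) * 0x85EBCA77u;
	h ^= h >> 13;
	h *= 0xC2B2AE3Du;
	h ^= h >> 16;
	return h;
}

/* Hypothetical counterpart of the kernel's manip type. */
enum manip { MANIP_SRC, MANIP_DST };

/* Select an address from [minip, maxip]. As in the patch, the destination
 * is replaced by 0 for source NAT, so the result depends on the source
 * address only; for destination NAT both addresses feed the hash. */
static uint32_t select_nat_ip(enum manip type, uint32_t src, uint32_t dst,
			      uint32_t minip, uint32_t maxip)
{
	uint32_t key_dst = (type == MANIP_SRC) ? 0 : dst;
	uint32_t j = mix2(src, key_dst);

	j = (uint32_t)(((uint64_t)j * (maxip - minip + 1)) >> 32);
	return minip + j;
}
```

Calling select_nat_ip(MANIP_SRC, ...) twice with the same source and two different destinations returns the same address, which is the behaviour users of the SAME target were asking for. With MANIP_DST the destination still takes part in the hash, as before the patch.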