[PATCH] Allow use of 'socket' match in OUTPUT

Hi

A little background: the 'socket' match is used with the tproxy
feature, which lets a process bind to and spoof an arbitrary client IP
address. In iptables, the socket match is used in PREROUTING to match
any traffic addressed to such sockets. We then use that match to set a
mark on the packet and force it to be routed locally rather than being
passed on to the real holder of that IP address.
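
For readers unfamiliar with tproxy, here is a minimal userspace sketch
of how a process ends up bound to an address the machine does not own.
It is an illustration only and not part of the patch; the helper name
bind_transparent_v4 is hypothetical, and the call needs CAP_NET_ADMIN
to succeed.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#ifndef IP_TRANSPARENT
#define IP_TRANSPARENT 19	/* Linux value, for libc headers that lack it */
#endif

/*
 * Hypothetical helper: create a TCP socket bound to an IPv4 address that
 * need not be configured locally. Returns the fd, or -1 with errno set.
 */
static int bind_transparent_v4(const char *addr, uint16_t port)
{
	struct sockaddr_in sin;
	int one = 1;
	int fd, saved;

	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_port = htons(port);
	if (inet_pton(AF_INET, addr, &sin.sin_addr) != 1) {
		errno = EINVAL;
		return -1;
	}

	fd = socket(AF_INET, SOCK_STREAM, 0);
	if (fd < 0)
		return -1;

	/* IP_TRANSPARENT permits binding to a non-local address. */
	if (setsockopt(fd, IPPROTO_IP, IP_TRANSPARENT, &one, sizeof(one)) < 0 ||
	    bind(fd, (struct sockaddr *)&sin, sizeof(sin)) < 0) {
		saved = errno;	/* keep the real error across close() */
		close(fd);
		errno = saved;
		return -1;
	}
	return fd;
}
```

Traffic addressed to a socket created this way is what the socket
match in PREROUTING is meant to catch.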

If tproxy AND iptables-controlled policy routing (i.e. set a mark in
OUTPUT, then use that mark in an ip rule) are in use AND the new egress
interface has a lower MTU than the original AND the server sent us a
SYN-ACK packet with an MSS larger than the new egress interface can
transmit, Linux will generate an ICMP fragmentation needed message.
That message is destined for the tproxy socket's non-local address, and
we don't get to process it since socket cannot be used in OUTPUT.
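
To make the size condition concrete, here is a small sketch of the
arithmetic. It assumes IPv4 and TCP headers without options (20 bytes
each); the function names are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: IPv4 and TCP headers carry no options. */
#define IPV4_HDR_LEN 20u
#define TCP_HDR_LEN  20u

/* Largest TCP payload that fits in one unfragmented packet for this MTU. */
static inline uint32_t max_segment_for_mtu(uint32_t mtu)
{
	uint32_t hdrs = IPV4_HDR_LEN + TCP_HDR_LEN;

	return mtu > hdrs ? mtu - hdrs : 0;
}

/*
 * True if a full-sized segment at the negotiated MSS cannot be sent on
 * the egress interface without fragmentation, which is the case that
 * produces the ICMP fragmentation needed message.
 */
static inline bool segment_exceeds_mtu(uint32_t mss, uint32_t egress_mtu)
{
	return mss > max_segment_for_mtu(egress_mtu);
}
```

For example, an MSS of 1460 (negotiated for a 1500-byte MTU) exceeds
what an egress interface with a 1400-byte MTU can carry, while an MSS
of 1360 does not.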

The attached patch only makes the changes for IPv4. It looks like IPv6
wants similar changes, but I don't have anything available to easily
test that.
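
Apart from the lookup change, the patch adds NF_INET_LOCAL_OUT to the
match's .hooks mask. The sketch below is a userspace model of how such
a mask gates where a match may be used. The enum mirrors the hook
numbering in <linux/netfilter.h> under local names, and match_allowed()
is only my approximation of the x_tables check, not kernel code.

```c
#include <stdbool.h>

/* Local names; values follow enum nf_inet_hooks in <linux/netfilter.h>. */
enum hook {
	HOOK_PRE_ROUTING,
	HOOK_LOCAL_IN,
	HOOK_FORWARD,
	HOOK_LOCAL_OUT,
	HOOK_POST_ROUTING,
};

/* Hook masks of the socket match before and after the patch. */
#define SOCKET_HOOKS_BEFORE ((1u << HOOK_PRE_ROUTING) | (1u << HOOK_LOCAL_IN))
#define SOCKET_HOOKS_AFTER  (SOCKET_HOOKS_BEFORE | (1u << HOOK_LOCAL_OUT))

/*
 * Approximation: a rule is accepted if every hook it can be reached
 * from is in the match's mask. A mask of 0 means "no restriction".
 */
static inline bool match_allowed(unsigned int allowed_hooks,
				 unsigned int rule_hook_mask)
{
	return allowed_hooks == 0 || (rule_hook_mask & ~allowed_hooks) == 0;
}
```

With the old mask, a rule in OUTPUT (1u << HOOK_LOCAL_OUT) is rejected;
with the new mask it is accepted, while FORWARD is still refused.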

I would greatly appreciate any feedback on this patch, pointers to
anything it does wrong, or even alternative ways to solve this
problem.

Thanks

-- 

Daniel Collins
Software Developer

smoothwall
daniel.collins@xxxxxxxxxxxxxx
www.smoothwall.com

Author: Harry Mason <harry.mason@xxxxxxxxxxxxxx>
Author: Daniel Collins <daniel.collins@xxxxxxxxxxxxxx>
Description: Allow use of the 'socket' iptables match in OUTPUT, for the purpose
 of capturing and rerouting fragmentation-needed messages generated in response
 to a tproxy socket trying to send messages larger than permitted by the MTU of
 the egress interface chosen by rerouting.
--- a/net/netfilter/xt_socket.c
+++ b/net/netfilter/xt_socket.c
@@ -115,16 +115,16 @@
 xt_socket_get_sock_v4(struct net *net, const u8 protocol,
 		      const __be32 saddr, const __be32 daddr,
 		      const __be16 sport, const __be16 dport,
-		      const struct net_device *in)
+		      const int ifindex)
 {
 	switch (protocol) {
 	case IPPROTO_TCP:
 		return __inet_lookup(net, &tcp_hashinfo,
 				     saddr, sport, daddr, dport,
-				     in->ifindex);
+				     ifindex);
 	case IPPROTO_UDP:
 		return udp4_lib_lookup(net, saddr, sport, daddr, dport,
-				       in->ifindex);
+				       ifindex);
 	}
 	return NULL;
 }
@@ -183,10 +183,35 @@
 	}
 #endif
 
-	if (!sk)
-		sk = xt_socket_get_sock_v4(dev_net(skb->dev), protocol,
+	/* For input packets, sk is the destination socket, so if it is already
+	 * defined there is no need to search again.
+	 *
+	 * For output packets, sk will be the source socket, but we are
+	 * interested in the destination socket, so force a lookup. This
+	 * supports locally generated ICMP errors for sockets with non-local
+	 * addresses.
+	 */
+	if (!par->in || !sk) {
+		/* Check for sockets in the network namespace associated with
+		 * the packet's device, if it has one (i.e. is an incoming packet),
+		 * else use the outgoing device chosen by the routing decision.
+		 *
+		 * Stolen from net/ipv4/icmp.c
+		 */
+		struct net *net = dev_net(skb->dev ?: skb_dst(skb)->dev);
+
+		/* ifindex is used in the socket lookup when sockets are bound
+		 * to a specific interface. We know the device on the input
+		 * side, but resort to ignoring any such sockets on the
+		 * output side.
+		 */
+		int ifindex = par->in ? par->in->ifindex : 0;
+
+		sk = xt_socket_get_sock_v4(net, protocol,
 					   saddr, daddr, sport, dport,
-					   par->in);
+					   ifindex);
+	}
+
 	if (sk) {
 		bool wildcard;
 		bool transparent = true;
@@ -417,7 +442,8 @@
 		.family		= NFPROTO_IPV4,
 		.match		= socket_mt4_v0,
 		.hooks		= (1 << NF_INET_PRE_ROUTING) |
-				  (1 << NF_INET_LOCAL_IN),
+				  (1 << NF_INET_LOCAL_IN) |
+				  (1 << NF_INET_LOCAL_OUT),
 		.me		= THIS_MODULE,
 	},
 	{
@@ -428,7 +454,8 @@
 		.checkentry	= socket_mt_v1_check,
 		.matchsize	= sizeof(struct xt_socket_mtinfo1),
 		.hooks		= (1 << NF_INET_PRE_ROUTING) |
-				  (1 << NF_INET_LOCAL_IN),
+				  (1 << NF_INET_LOCAL_IN) |
+				  (1 << NF_INET_LOCAL_OUT),
 		.me		= THIS_MODULE,
 	},
 #ifdef XT_SOCKET_HAVE_IPV6
@@ -452,7 +479,8 @@
 		.checkentry	= socket_mt_v2_check,
 		.matchsize	= sizeof(struct xt_socket_mtinfo1),
 		.hooks		= (1 << NF_INET_PRE_ROUTING) |
-				  (1 << NF_INET_LOCAL_IN),
+				  (1 << NF_INET_LOCAL_IN) |
+				  (1 << NF_INET_LOCAL_OUT),
 		.me		= THIS_MODULE,
 	},
 #ifdef XT_SOCKET_HAVE_IPV6
