Hi Matthew,

: I need a virtual firewall/router solution. I'm thinking of a
: netscreen 1000 but I want to know if it can be done in Linux.
: Here is my idea:
: 1 Linux box
: 2 GigE interfaces

What's linux?

: 1 interface setup with a public IP address ($PUBIP)
: 1 interface setup with 802.1q VLAN trunking with 100 vlans assigned
: ($VLAN1-$VLAN100)
:
: a /25 subnet routed to $PUBIP from my core routers
:
: All $VLAN interfaces setup with IP 192.168.1.1/24
: IPs
: I'm sure the kernel will bitch about assigning 192.168.1.1 on a
: bunch of interfaces.

That will not be a problem at all.  You can assign the same IP to all
the interfaces--it's confusing for the humans, but not the kernel.  The
routes are the trick....

: Inbound traffic on $VLAN gets marked with a fwmark ($VLAN1 = fw1,
: $VLAN2 = fw2)
: Outbound traffic gets NAT'ed based on the fwmark to an IP in the subnet
: Returning traffic gets marked based on the dest IP (one of the subnets)
: with the same fwmark for the appropriate VLAN

Clever use of fwmark.

: returning packets are 'unNAT'ed' and then routed down the correct VLAN
: based on the fwmark on the packet.

Something like this, then, right?

  for ID in $( seq 1 100 ) ; do
    iptables -t mangle -A PREROUTING -i $PUBIF -d AAA.BBB.CCC.$ID \
      -j MARK --set-mark $ID
  done

The problem I'd be concerned about (refer to KPTD [1]) would be the
possible interaction/interference with connection tracking.  Perhaps
someone more familiar with the workings of iptables can address this
concern.  Would the connection tracking mechanism keep the return packet
from traversing the PREROUTING mangle chain?  (Connection tracking
happens first, according to the KPTD....)

: Questions:
: How will Linux react if I put 192.168.1.1 on >1 interfaces?

No problem at all!  You'll just have to be smart about choosing the
output interface (dev) with your routes.

: Does the unNAT'ing of the packets destroy the fwmark?

No, but see the concern/question above about connection tracking.

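To make the one-box idea concrete, here is a minimal, untested dry-run
sketch.  It only prints the commands, so nothing touches a live box.  It
assumes a kernel and iptables that provide the CONNMARK target, and it
uses placeholder values that are not from this thread: "eth0" for
$PUBIF and the 192.0.2.0/24 documentation prefix for the public subnet.
The vlan$ID device names follow the convention used elsewhere in this
message.  Instead of matching replies on the destination address, it
saves the mark on the connection and restores it for packets arriving on
the public interface.

```shell
#!/bin/sh
# Dry run: print the rules instead of executing them.
# PUBIF and NET are assumed placeholders, not values from the thread.
PUBIF=${PUBIF:-eth0}
NET=${NET:-192.0.2}

# Print the per-VLAN rules for VLAN number $1.
vlan_rules() {
    id=$1
    # Mark traffic arriving from the VLAN, then save the mark on the
    # connection tracking entry.
    echo "iptables -t mangle -A PREROUTING -i vlan$id -j MARK --set-mark $id"
    echo "iptables -t mangle -A PREROUTING -i vlan$id -j CONNMARK --save-mark"
    # Source-NAT by mark to that VLAN's public address.
    echo "iptables -t nat -A POSTROUTING -o $PUBIF -m mark --mark $id -j SNAT --to-source $NET.$id"
}

# Replies: copy the saved connection mark back onto the packet so the
# fwmark-based routing rules can pick the VLAN.
echo "iptables -t mangle -A PREROUTING -i $PUBIF -j CONNMARK --restore-mark"

# Three VLANs are enough to review the pattern; raise the bound to 100.
for ID in $(seq 1 3); do
    vlan_rules "$ID"
done
```

Review the printed rules before piping them to a shell as root.  The
sketch does not settle whether clients with identical addresses on
different VLANs collide in the connection tracking table, nor the ICMP
question.
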
: Is there a way of handling kernel based packets (ICMP, ARP responses)
: so they go out the correct interface?

Yikes!  Good question on ICMP.  I have no idea about the interaction
between an inbound (already fwmark'd) packet and the generation of ICMP!

: Example: an ARP (who has 192.168.1.1) comes in on VLAN5. How can I get
: the kernel to send its response on VLAN5?

The ARP replies will go out the interface on which the query arrived.
You aren't doing anything "funky" with ARP, are you?  Just straight up
ARP?  No proxy ARPing or anything like that?

: I see the packet flow as something like.
:
: Client (192.168.1.100) sends SYN to www.redhat.com:80
: Client has default gw of 192.168.1.1
: Client is on 802.1q VLAN10
: Client puts packet on Ethernet VLAN10 with MAC address of Linux box
: Packet enters Linux box on VLAN10 Source:ClientIP Dest:www.redhat.com:80
: Packet gets marked by iptables rule. FWMARK = 10
: Packet gets routed out to upstream gateway
: Packet gets NAT'ed to SUBNETIP10 based on FWMARK 10
: Packet now looks like src: SUBNETIP10:NATPORT dst:REDHAT:80

Warning!  The fwmark does not survive the local box.  The fwmark is an
attribute of the in-memory representation of the packet as it's handled
by the linux router.  As soon as the packet has left the box, the fwmark
datum is lost.

Also, I was under the impression from above that the NAT would happen on
the 2 GigE linux box, not on an upstream router.  Which way would it be?

If two routers, you could use some sort of mangling scheme where you
take advantage of the ToS field to carry this information [2], but you'd
then need to strip it out at the SNATting box.  Public routers might not
be prepared to handle nonstandard data in the ToS field and might
consequently harm your data.

Another approach, assuming a separate upstream router.  I speculate
wildly.....

  - upstream router does all of the connection tracking
  - this router does packet rewriting with iproute2 and uses the mangle
    table only

  outbound (request):
    - this router NATs each vlan$ID-192.168.1.$host to 172.16.$ID.$host
    - transmitted across ethernet
    - upstream router SNATs 172.16.$ID.$host to AAA.BBB.CCC.$ID

  inbound (return):
    - upstream router unSNATs AAA.BBB.CCC.$ID to 172.16.$ID.$host per
      connection tracking mechanism
    - transmitted across ethernet
    - this router MARKs inbound packet with $ID
    - this router NATs 172.16.$ID.$host to 192.168.1.$host
    - RPDB lookup keyed to fwmark only to select routing table
    - routing table specifies output interface (vlan$ID)

But this would be ugly, and probably difficult to debug.  Not to mention
that I've never done it, so it's only a paper solution.

: Response packet from redhat flows
: Packet enters Linux box src REDHAT:80 dst SUBNETIP10:NATPORT
: Packet gets tagged with fwmark based on SUBNETIP to FWMARK 10
: Packet gets unNAT'ed by kernel NAT table
: Packet looks like src REDHAT:80 dst CLIENTIP:CLIENTPORT fwmark:10
: iproute2 setup routes CLIENTIP to the correct client on the correct
: VLAN (vlan10)
: arp lookup assigned correct MAC address and sends the packet to the
: switch on VLAN10

Your description of the outbound packet path leads me to believe that
you have an upstream router.  The description of the inbound packet flow
omits any mention of an upstream router.

: Problems I can see biting me:
: ARP tables. Can the kernel maintain separate ARP tables for each VLAN?
: Each VLAN can have a machine with IP 192.168.1.100

The multiple ARP table question is also one I can't answer.  Maybe
Julian....

Certainly, the neighbor table itself supports entries for IP addresses
on multiple interfaces, so the same IP could be in the neighbor table
with different associations on each interface.

An example: Imagine a host has two connections to the same media segment.
After causing an ARP lookup on each interface, there are per-device
entries in the neighbor table:

  # ping -c 1 -I eth0 10.10.20.33 > /dev/null 2>&1
  # ping -c 1 -I eth1 10.10.20.33 > /dev/null 2>&1
  # ip neigh show
  10.10.20.33 dev eth1 lladdr 00:80:c8:f8:4a:51 nud reachable
  10.10.20.33 dev eth0 lladdr 00:80:c8:f8:4a:51 nud reachable

I don't think you'd have any trouble setting up 100 routing tables, each
with a route to 192.168.1.0/24 via its own interface.  I would add the
RPDB rules at a relatively low priority so that other rules could be
inserted above.

  for ID in $( seq 1 100 ) ; do
    ip rule add fwmark $ID table $ID prio $( expr 5000 + $ID )
    ip route add 192.168.1.0/24 dev vlan$ID table $ID
  done
  ip route flush cache

: ICMPs: What happens when a client tries to ping the linux box
: (192.168.1.1). If I fwmark all incoming packets on a VLAN will the
: kernel respond with a packet using the same fwmark?

I don't know.  Maybe somebody else on the list can answer this one....

: ARP requests: Same as the ICMPs. Will the kernel be able to answer an
: ARP request to 192.168.1.1?

This shouldn't be a problem unless you are doing something very funky
with ARP.

So, in summary:

  - I don't think ARP will be a problem for you.  Julian and/or the VLAN
    list might be able to confirm this.
  - You will have to use some sort of NAT on the linux box in order to
    have a way to differentiate the return packets for each VLAN, but
    this you know already.
  - In a one router solution, the trick will probably be the interaction
    between the connection tracking mechanisms and the fwmarking
    mechanism.
  - In a two router solution, the trick will probably be the "second"
    route lookup after the packet has been NATted to 192.168.1.$host.
  - Unanswered question:  ICMP generated by the linux router itself.

Matthew...this is a very interesting question, and I'm quite intrigued
by your approach.  Please let us (the LARTC list) know if you do prove
that this can or cannot be done using the current tools available under
linux.

Sadly, the Netscreen may be able to fulfill your need with less effort.

-Martin

 [1] http://www.docum.org/stef.coene/qos/kptd/
 [2] http://iptables-tutorial.frozentux.net/chunkyhtml/targets.html#TOSTARGET

-- 
Martin A. Brown --- SecurePipe, Inc. --- mabrown@xxxxxxxxxxxxxx