Signed-off-by: Peter Krempa <pkrempa@xxxxxxxxxx> --- docs/firewall.html.in | 523 ------------------------------------------ docs/firewall.rst | 506 ++++++++++++++++++++++++++++++++++++++++ docs/meson.build | 2 +- 3 files changed, 507 insertions(+), 524 deletions(-) delete mode 100644 docs/firewall.html.in create mode 100644 docs/firewall.rst diff --git a/docs/firewall.html.in b/docs/firewall.html.in deleted file mode 100644 index 15b4f397be..0000000000 --- a/docs/firewall.html.in +++ /dev/null @@ -1,523 +0,0 @@ -<?xml version="1.0" encoding="UTF-8"?> -<!DOCTYPE html> -<html xmlns="http://www.w3.org/1999/xhtml"> - <body> - <h1 >Firewall and network filtering in libvirt</h1> - <p>There are three pieces of libvirt functionality which do network - filtering of some type. - <br /><br /> - At a high level they are: - </p> - <ul> - <li>The virtual network driver - <br /><br /> - This provides an isolated bridge device (ie no physical NICs - attached). Guest TAP devices are attached to this bridge. - Guests can talk to each other and the host, and optionally the - wider world. - <br /><br /> - </li> - <li>The QEMU driver MAC filtering - <br /><br /> - This provides a generic filtering of MAC addresses to prevent - the guest spoofing its MAC address. This is mostly obsoleted by - the next item, so won't be discussed further. - <br /><br /> - </li> - <li>The network filter driver - <br /><br /> - This provides fully configurable, arbitrary network filtering - of traffic on guest NICs. Generic rulesets are defined at the - host level to control traffic in some manner. Rules sets are - then associated with individual NICs of a guest. While not as - expressive as directly using iptables/ebtables, this can still - do nearly everything you would want to on a guest NIC filter. 
- </li> - </ul> - - <h3><a id="fw-virtual-network-driver">The virtual network driver</a> - </h3> - <p>The typical configuration for guests is to use bridging of the - physical NIC on the host to connect the guest directly to the LAN. - In RHEL6 there is also the possibility of using macvtap/sr-iov - and VEPA connectivity. None of this stuff plays nicely with wireless - NICs, since they will typically silently drop any traffic with a - MAC address that doesn't match that of the physical NIC. - </p> - <p>Thus the virtual network driver in libvirt was invented. This takes - the form of an isolated bridge device (ie one with no physical NICs - attached). The TAP devices associated with the guest NICs are attached - to the bridge device. This immediately allows guests on a single host - to talk to each other and to the host OS (modulo host IPtables rules). - </p> - <p>libvirt then uses iptables to control what further connectivity is - available. There are three configurations possible for a virtual - network at time of writing: - </p> - <ul> - <li>isolated: all off-node traffic is completely blocked</li> - <li>nat: outbound traffic to the LAN is allowed, but MASQUERADED</li> - <li>forward: outbound traffic to the LAN is allowed</li> - </ul> - <p>The latter 'forward' case requires the virtual network be on a - separate sub-net from the main LAN, and that the LAN admin has - configured routing for this subnet. In the future we intend to - add support for IP subnetting and/or proxy-arp. This allows for - the virtual network to use the same subnet as the main LAN and - should avoid need for the LAN admin to configure special routing. - </p> - <p>Libvirt will optionally also provide DHCP services to the virtual - network using DNSMASQ. In all cases, we need to allow DNS/DHCP - queries to the host OS. 
Since we can't predict whether the host - firewall setup is already allowing this, we insert 4 rules into - the head of the INPUT chain - </p> - <pre> -target prot opt in out source destination -ACCEPT udp -- virbr0 * 0.0.0.0/0 0.0.0.0/0 udp dpt:53 -ACCEPT tcp -- virbr0 * 0.0.0.0/0 0.0.0.0/0 tcp dpt:53 -ACCEPT udp -- virbr0 * 0.0.0.0/0 0.0.0.0/0 udp dpt:67 -ACCEPT tcp -- virbr0 * 0.0.0.0/0 0.0.0.0/0 tcp dpt:67</pre> - <p>Note we have restricted our rules to just the bridge associated - with the virtual network, to avoid opening undesirable holes in - the host firewall wrt the LAN/WAN. - </p> - <p>The next rules depend on the type of connectivity allowed, and go - in the main FORWARD chain: - </p> - <ul> - <li>type=isolated - <br /><br /> -Allow traffic between guests. Deny inbound. Deny outbound. - <pre> -target prot opt in out source destination -ACCEPT all -- virbr1 virbr1 0.0.0.0/0 0.0.0.0/0 -REJECT all -- * virbr1 0.0.0.0/0 0.0.0.0/0 reject-with icmp-port-unreachable -REJECT all -- virbr1 * 0.0.0.0/0 0.0.0.0/0 reject-with icmp-port-unreachable</pre> - </li> - <li>type=nat - <br /><br /> -Allow inbound related to an established connection. Allow -outbound, but only from our expected subnet. Allow traffic -between guests. Deny all other inbound. Deny all other outbound. - <pre> -target prot opt in out source destination -ACCEPT all -- * virbr0 0.0.0.0/0 192.168.122.0/24 state RELATED,ESTABLISHED -ACCEPT all -- virbr0 * 192.168.122.0/24 0.0.0.0/0 -ACCEPT all -- virbr0 virbr0 0.0.0.0/0 0.0.0.0/0 -REJECT all -- * virbr0 0.0.0.0/0 0.0.0.0/0 reject-with icmp-port-unreachable -REJECT all -- virbr0 * 0.0.0.0/0 0.0.0.0/0 reject-with icmp-port-unreachable</pre> - </li> - <li>type=routed - <br /><br /> -Allow inbound, but only to our expected subnet. Allow -outbound, but only from our expected subnet. Allow traffic -between guests. Deny all other inbound. Deny all other outbound. 
- <pre> -target prot opt in out source destination -ACCEPT all -- * virbr2 0.0.0.0/0 192.168.124.0/24 -ACCEPT all -- virbr2 * 192.168.124.0/24 0.0.0.0/0 -ACCEPT all -- virbr2 virbr2 0.0.0.0/0 0.0.0.0/0 -REJECT all -- * virbr2 0.0.0.0/0 0.0.0.0/0 reject-with icmp-port-unreachable -REJECT all -- virbr2 * 0.0.0.0/0 0.0.0.0/0 reject-with icmp-port-unreachable</pre> - </li> - <li>Finally, with type=nat, there is also an entry in the POSTROUTING -chain to apply masquerading: - <pre> -target prot opt in out source destination -MASQUERADE all -- * * 192.168.122.0/24 !192.168.122.0/24</pre> - </li> - </ul> - - <h3><a id="fw-firewalld-and-virtual-network-driver">firewalld and the virtual network driver</a> - </h3> - <p> - If <a href="https://firewalld.org">firewalld</a> is active on - the host, libvirt will attempt to place the bridge interface of - a libvirt virtual network into the firewalld zone named - "libvirt" (thus making all guest->host traffic on that network - subject to the rules of the "libvirt" zone). This is done - because, if firewalld is using its nftables backend (available - since firewalld 0.6.0) the default firewalld zone (which would - be used if libvirt didn't explicitly set the zone) prevents - forwarding traffic from guests through the bridge, as well as - preventing DHCP, DNS, and most other traffic from guests to - host. The zone named "libvirt" is installed into the firewalld - configuration by libvirt (not by firewalld), and allows - forwarded traffic through the bridge as well as DHCP, DNS, TFTP, - and SSH traffic to the host - depending on firewalld's backend - this will be implemented via either iptables or nftables - rules. libvirt's own rules outlined above will *always* be - iptables rules regardless of which backend is in use by - firewalld. - </p> - <p> - NB: It is possible to manually set the firewalld zone for a - network's interface with the "zone" attribute of the network's - "bridge" element. 
- </p> - <p> - NB: Prior to libvirt 5.1.0, the firewalld "libvirt" zone did not - exist, and prior to firewalld 0.7.0 a feature crucial to making - the "libvirt" zone operate properly (rich rule priority - settings) was not implemented in firewalld. In cases where one - or the other of the two packages is missing the necessary - functionality, it's still possible to have functional guest - networking by setting the firewalld backend to "iptables" (in - firewalld prior to 0.6.0, this was the only backend available). - </p> - - <h3><a id="fw-network-filter-driver">The network filter driver</a> - </h3> - <p>This driver provides a fully configurable network filtering capability - that leverages ebtables, iptables and ip6tables. This was written by - the libvirt guys at IBM and although its XML schema is defined by libvirt, - the conceptual model is closely aligned with the DMTF CIM schema for - network filtering: - </p> - <p><a href="https://www.dmtf.org/standards/cim/cim_schema_v2230/CIM_Network.pdf">https://www.dmtf.org/standards/cim/cim_schema_v2230/CIM_Network.pdf</a></p> - <p>The filters are managed in libvirt as a top level, standalone object. - This allows the filters to then be referenced by any libvirt object - that requires their functionality, instead tying them only to use - by guest NICs. In the current implementation, filters can be associated - with individual guest NICs via the libvirt domain XML format. In the - future we might allow filters to be associated with the virtual network - objects. Further we're expecting to define a new 'virtual switch' object - to remove the complexity of configuring bridge/sriov/vepa networking - modes. This make also end up making use of network filters. 
- </p> - <p>There are a new set of virsh commands for managing network filters:</p> - <ul> - <li>virsh nwfilter-define - <br /><br /> - define or update a network filter from an XML file - <br /><br /> - </li> - <li>virsh nwfilter-undefine - <br /><br /> - undefine a network filter - <br /><br /> - </li> - <li>virsh nwfilter-dumpxml - <br /><br /> - network filter information in XML - <br /><br /> - </li> - <li>virsh nwfilter-list - <br /><br /> - list network filters - <br /><br /> - </li> - <li>virsh nwfilter-edit - <br /><br /> - edit XML configuration for a network filter - </li> - </ul> - <p>There are equivalently named C APIs for each of these commands.</p> - <p>As with all objects libvirt manages, network filters are configured -using an XML format. At a high level the format looks like this: - </p> -<pre> -<filter name='no-spamming' chain='XXXX'> - <uuid>d217f2d7-5a04-0e01-8b98-ec2743436b74</uuid> - - <rule ...> - .... - </rule> - - <filterref filter='XXXX'/> -</filter></pre> - <p>Every filter has a name and UUID which serve as unique identifiers. - A filter can have zero-or-more <code><rule></code> elements which - are used to actually define network controls. Filters can be arranged - into a DAG, so zero-or-more <code><filterref/></code> elements are - also allowed. Cycles in the graph are not allowed. - </p> - <p>The <code><rule></code> element is where all the interesting stuff - happens. It has three attributes, an action, a traffic direction and an - optional priority. E.g.: - </p> - <pre><rule action='drop' direction='out' priority='500'></pre> - <p>Within the rule there are a wide variety of elements allowed, which - do protocol specific matching. 
Supported protocols currently include - <code>mac</code>, <code>arp</code>, <code>rarp</code>, <code>ip</code>, - <code>ipv6</code>, <code>tcp/ip</code>, <code>icmp/ip</code>, - <code>igmp/ip</code>, <code>udp/ip</code>, <code>udplite/ip</code>, - <code>esp/ip</code>, <code>ah/ip</code>, <code>sctp/ip</code>, - <code>tcp/ipv6</code>, <code>icmp/ipv6</code>, <code>igmp/ipv6</code>, - <code>udp/ipv6</code>, <code>udplite/ipv6</code>, <code>esp/ipv6</code>, - <code>ah/ipv6</code>, <code>sctp/ipv6</code>. Each protocol defines what - is valid inside the <rule> element. The general pattern though is: - </p> - <pre> -<protocol match='yes|no' attribute1='value1' attribute2='value2'/></pre> - <p>So, eg a TCP protocol, matching ports 0-1023 would be expressed as:</p> - <pre><tcp match='yes' srcportstart='0' srcportend='1023'/></pre> - <p>Attributes can included references to variables defined by the - object using the rule. So the guest XML format allows each NIC - to have a MAC address and IP address defined. These are made - available to filters via the variables <code><b>$IP</b></code> and - <code><b>$MAC</b></code>. - </p> - <p>So to define a filter that prevents IP address spoofing we can - simply match on source IP address <code>!= $IP</code> like this: - </p> - <pre> -<filter name='no-ip-spoofing' chain='ipv4'> - <rule action='drop' direction='out'> - <ip match='no' srcipaddr='<b>$IP</b>' /> - </rule> -</filter></pre> - <p>I'm not going to go into details on all the other protocol - matches you can do, because it'll take far too much space. - You can read about the options - <a href="formatnwfilter.html#nwfelemsRulesProto">here</a>. 
- </p> - <p>Out of the box in RHEL6/Fedora rawhide, libvirt ships with a - set of default useful rules: - </p> - <pre> -# virsh nwfilter-list -UUID Name ----------------------------------------------------------------- -15b1ab2b-b1ac-1be2-ed49-2042caba4abb allow-arp -6c51a466-8d14-6d11-46b0-68b1a883d00f allow-dhcp -7517ad6c-bd90-37c8-26c9-4eabcb69848d allow-dhcp-server -7680776c-77aa-496f-90d6-13097664b925 allow-dhcpv6 -9cdaad60-7631-4172-8ccb-ef774be7485b allow-dhcpv6-server -3d38b406-7cf0-8335-f5ff-4b9add35f288 allow-incoming-ipv4 -908543c1-902e-45f6-a6ca-1a0ad35e7599 allow-incoming-ipv6 -5ff06320-9228-2899-3db0-e32554933415 allow-ipv4 -ce8904cc-ad3a-4454-896c-53452882f817 allow-ipv6 -db0b1767-d62b-269b-ea96-0cc8b451144e clean-traffic -6d6ddcc8-1242-4c43-ac63-63af80493132 clean-traffic-gateway -4cf38077-c7d5-4e25-99bb-6c4c9efad294 no-arp-ip-spoofing -0b11a636-ce58-497f-be90-17f63c92487a no-arp-mac-spoofing -f88f1932-debf-4aa1-9fbe-f10d3aa4bc95 no-arp-spoofing -772f112d-52e4-700c-0250-e178a3d91a7a no-ip-multicast -7ee20370-8106-765d-f7ff-8a60d5aaf30b no-ip-spoofing -f8a51c43-a08f-49b3-b9e2-393d54522dc0 no-ipv6-multicast -a7f0afe9-a428-44b8-8566-c8ee2a669271 no-ipv6-spoofing -d5d3c490-c2eb-68b1-24fc-3ee362fc8af3 no-mac-broadcast -fb57c546-76dc-a372-513f-e8179011b48a no-mac-spoofing -dba10ea7-446d-76de-346f-335bd99c1d05 no-other-l2-traffic -f5c78134-9da4-0c60-a9f0-fb37bc21ac1f no-other-rarp-traffic -7637e405-4ccf-42ac-5b41-14f8d03d8cf3 qemu-announce-self -9aed52e7-f0f3-343e-fe5c-7dcb27b594e5 qemu-announce-self-rarp</pre> - <p>Most of these are just building blocks. The interesting one here - is 'clean-traffic'. This pulls together all the building blocks - into one filter that you can then associate with a guest NIC. - This stops the most common bad things a guest might try, IP - spoofing, arp spoofing and MAC spoofing. 
To look at the rules for - any of these just do: - </p> - <pre>virsh nwfilter-dumpxml FILTERNAME|UUID</pre> - <p>They are all stored in <code>/etc/libvirt/nwfilter</code>, but don't - edit the files there directly. Use <code>virsh nwfilter-define</code> - to update them. This ensures the guests have their iptables/ebtables - rules recreated. - </p> - <p>To associate the clean-traffic filter with a guest, edit the - guest XML config and change the <code><interface></code> element - to include a <code><filterref></code> and also specify the - <code><ip address/></code> that the guest is allowed to - use: - </p> - <pre> -<interface type='bridge'> - <mac address='52:54:00:56:44:32'/> - <source bridge='br1'/> - <ip address='10.33.8.131'/> - <target dev='vnet0'/> - <model type='virtio'/> - <filterref filter='clean-traffic'/> -</interface></pre> - <p>If no <code><ip address></code> is included, the network filter - driver will activate its 'learning mode'. This uses libpcap to snoop on - network traffic the guest sends and attempts to identify the - first IP address it uses. It then locks traffic to this address. - Obviously this isn't entirely secure, but it does offer some - protection against the guest being trojaned once up and running. - In the future we intend to enhance the learning mode so that it - looks for DHCPOFFERS from a trusted DHCP server and only allows - the offered IP address to be used. - </p> - <p>Now, how is all this implemented...?</p> - <p>The network filter driver uses a combination of ebtables, iptables and - ip6tables, depending on which protocols are referenced in a filter. The - out of the box 'clean-traffic' filter rules only require use of - ebtables. If you want to do matching at tcp/udp/etc protocols (eg to add - a new filter 'no-email-spamming' to block port 25), then iptables will - also be used. - </p> - <p>The driver attempts to keep its rules separate from those that - the host admin might already have configured. 
So the first thing - it does with ebtables, is to add two hooks in POSTROUTING and - PREROUTING chains, to redirect traffic to custom chains. These - hooks match on the TAP device name of the guest NIC, so they - should not interact badly with any administrator defined rules: - </p> - <pre> -Bridge chain: PREROUTING, entries: 1, policy: ACCEPT --i vnet0 -j libvirt-I-vnet0 - -Bridge chain: POSTROUTING, entries: 1, policy: ACCEPT --o vnet0 -j libvirt-O-vnet0</pre> - <p>To keep things manageable and easy to follow, the driver will then - create further sub-chains for each protocol then it needs to match - against: - </p> - <pre> -Bridge chain: libvirt-I-vnet0, entries: 5, policy: ACCEPT --p IPv4 -j I-vnet0-ipv4 --p ARP -j I-vnet0-arp --p 0x8035 -j I-vnet0-rarp --p 0x835 -j ACCEPT --j DROP - -Bridge chain: libvirt-O-vnet0, entries: 4, policy: ACCEPT --p IPv4 -j O-vnet0-ipv4 --p ARP -j O-vnet0-arp --p 0x8035 -j O-vnet0-rarp --j DROP</pre> - <p>Finally, here comes the actual implementation of the filters. This - example shows the 'clean-traffic' filter implementation. - I'm not going to explain what this is doing now. :-) - </p> - <pre> -Bridge chain: I-vnet0-ipv4, entries: 2, policy: ACCEPT --s ! 52:54:0:56:44:32 -j DROP --p IPv4 --ip-src ! 10.33.8.131 -j DROP - -Bridge chain: O-vnet0-ipv4, entries: 1, policy: ACCEPT --j ACCEPT - -Bridge chain: I-vnet0-arp, entries: 6, policy: ACCEPT --s ! 52:54:0:56:44:32 -j DROP --p ARP --arp-mac-src ! 52:54:0:56:44:32 -j DROP --p ARP --arp-ip-src ! 10.33.8.131 -j DROP --p ARP --arp-op Request -j ACCEPT --p ARP --arp-op Reply -j ACCEPT --j DROP - -Bridge chain: O-vnet0-arp, entries: 5, policy: ACCEPT --p ARP --arp-op Reply --arp-mac-dst ! 52:54:0:56:44:32 -j DROP --p ARP --arp-ip-dst ! 
10.33.8.131 -j DROP --p ARP --arp-op Request -j ACCEPT --p ARP --arp-op Reply -j ACCEPT --j DROP - -Bridge chain: I-vnet0-rarp, entries: 2, policy: ACCEPT --p 0x8035 -s 52:54:0:56:44:32 -d Broadcast --arp-op Request_Reverse --arp-ip-src 0.0.0.0 --arp-ip-dst 0.0.0.0 --arp-mac-src 52:54:0:56:44:32 --arp-mac-dst 52:54:0:56:44:32 -j ACCEPT --j DROP - -Bridge chain: O-vnet0-rarp, entries: 2, policy: ACCEPT --p 0x8035 -d Broadcast --arp-op Request_Reverse --arp-ip-src 0.0.0.0 --arp-ip-dst 0.0.0.0 --arp-mac-src 52:54:0:56:44:32 --arp-mac-dst 52:54:0:56:44:32 -j ACCEPT --j DROP</pre> - <p>NB, we would have liked to include the prefix 'libvirt-' in all - of our chain names, but unfortunately the kernel limits names - to a very short maximum length. So only the first two custom - chains can include that prefix. The others just include the - TAP device name + protocol name. - </p> - <p>If I define a new filter 'no-spamming' and then add this to the - 'clean-traffic' filter, I can illustrate how iptables usage works: - </p> - <pre> -# cat > /root/spamming.xml <<EOF -<filter name='no-spamming' chain='root'> - <uuid>d217f2d7-5a04-0e01-8b98-ec2743436b74</uuid> - <rule action='drop' direction='out' priority='500'> - <tcp dstportstart='25' dstportend='25'/> - </rule> -</filter> -EOF -# virsh nwfilter-define /root/spamming.xml -# virsh nwfilter-edit clean-traffic</pre> - - <p>...add <code><filterref filter='no-spamming'/></code></p> - <p>All active guests immediately have their iptables/ebtables rules - rebuilt. - </p> - <p>The network filter driver deals with iptables in a very similar - way. 
First it separates out its rules from those the admin may - have defined, by adding a couple of hooks into the INPUT/FORWARD - chains: - </p> - <pre> -Chain INPUT (policy ACCEPT 13M packets, 21G bytes) -target prot opt in out source destination -libvirt-host-in all -- * * 0.0.0.0/0 0.0.0.0/0 - -Chain FORWARD (policy ACCEPT 5532K packets, 3010M bytes) -target prot opt in out source destination -libvirt-in all -- * * 0.0.0.0/0 0.0.0.0/0 -libvirt-out all -- * * 0.0.0.0/0 0.0.0.0/0 -libvirt-in-post all -- * * 0.0.0.0/0 0.0.0.0/0</pre> - <p>These custom chains then do matching based on the TAP device - name, so they won't open holes in the admin defined matches for - the LAN/WAN (if any). - </p> - <pre> -Chain libvirt-host-in (1 references) - target prot opt in out source destination - HI-vnet0 all -- * * 0.0.0.0/0 0.0.0.0/0 [goto] PHYSDEV match --physdev-in vnet0 - -Chain libvirt-in (1 references) - target prot opt in out source destination - FI-vnet0 all -- * * 0.0.0.0/0 0.0.0.0/0 [goto] PHYSDEV match --physdev-in vnet0 - -Chain libvirt-in-post (1 references) - target prot opt in out source destination - ACCEPT all -- * * 0.0.0.0/0 0.0.0.0/0 PHYSDEV match --physdev-in vnet0 - -Chain libvirt-out (1 references) - target prot opt in out source destination - FO-vnet0 all -- * * 0.0.0.0/0 0.0.0.0/0 [goto] PHYSDEV match --physdev-out vnet0</pre> - <p>Finally, we can see the interesting bit which is the actual - implementation of my filter to block port 25 access: - </p> - <pre> -Chain FI-vnet0 (1 references) - target prot opt in out source destination - DROP tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:25 - -Chain FO-vnet0 (1 references) - target prot opt in out source destination - DROP tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp spt:25 - -Chain HI-vnet0 (1 references) - target prot opt in out source destination - DROP tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:25</pre> - <p>One thing in looking at this you may notice is that if there - are many guests all using the same filters, we will be 
duplicating - the iptables rules over and over for each guest. This is merely a - limitation of the current rules engine implementation. At the libvirt - object modelling level you can clearly see we've designed the model - so filter rules are defined in one place, and indirectly referenced - by guests. Thus it should be possible to change the implementation in - the future so we can share the actual iptables/ebtables rules for - each guest to create a more scalable system. The stuff in current libvirt - is more or less the very first working implementation we've had of this, - so there's not been much optimization work done yet. - </p> - <p>Also notice that at the XML level we don't expose the fact we - are using iptables or ebtables at all. The rule definition is done in - terms of network protocols. Thus if we ever find a need, we could - plug in an alternative implementation that calls out to a different - firewall implementation instead of ebtables/iptables (providing that - implementation was suitably expressive of course) - </p> - <p>Finally, in terms of problems we have in deployment. The biggest - problem is that if the admin does <code>service iptables restart</code> - all our work gets blown away. We've experimented with using lokkit - to record our custom rules in a persistent config file, but that - caused different problem. Admins who were not using lokkit for - their config found that all their own rules got blown away. So - we threw away our lokkit code. Instead we document that if you - run <code>service iptables restart</code>, you need to send SIGHUP to - libvirt to make it recreate its rules. 
- </p> - <p>More in depth documentation on this is <a href="formatnwfilter.html">here</a>.</p> - </body> -</html> diff --git a/docs/firewall.rst b/docs/firewall.rst new file mode 100644 index 0000000000..adda0ef1f4 --- /dev/null +++ b/docs/firewall.rst @@ -0,0 +1,506 @@ +========================================= +Firewall and network filtering in libvirt +========================================= + +.. contents:: + +There are three pieces of libvirt functionality which do network filtering of +some type. At a high level they are: + +- The virtual network driver + + This provides an isolated bridge device (ie no physical NICs attached). + Guest TAP devices are attached to this bridge. Guests can talk to each + other and the host, and optionally the wider world. + +- The QEMU driver MAC filtering + + This provides a generic filtering of MAC addresses to prevent the guest + spoofing its MAC address. This is mostly obsoleted by the next item, so + won't be discussed further. + +- The network filter driver + + This provides fully configurable, arbitrary network filtering of traffic on + guest NICs. Generic rulesets are defined at the host level to control + traffic in some manner. Rules sets are then associated with individual NICs + of a guest. While not as expressive as directly using iptables/ebtables, + this can still do nearly everything you would want to on a guest NIC + filter. + +The virtual network driver +-------------------------- + +The typical configuration for guests is to use bridging of the physical NIC on +the host to connect the guest directly to the LAN. In RHEL6 there is also the +possibility of using macvtap/sr-iov and VEPA connectivity. None of this stuff +plays nicely with wireless NICs, since they will typically silently drop any +traffic with a MAC address that doesn't match that of the physical NIC. + +Thus the virtual network driver in libvirt was invented. 
This takes the form of +an isolated bridge device (ie one with no physical NICs attached). The TAP +devices associated with the guest NICs are attached to the bridge device. This +immediately allows guests on a single host to talk to each other and to the host +OS (modulo host IPtables rules). + +libvirt then uses iptables to control what further connectivity is available. +There are three configurations possible for a virtual network at time of +writing: + +- isolated: all off-node traffic is completely blocked +- nat: outbound traffic to the LAN is allowed, but MASQUERADED +- forward: outbound traffic to the LAN is allowed + +The latter 'forward' case requires the virtual network be on a separate sub-net +from the main LAN, and that the LAN admin has configured routing for this +subnet. In the future we intend to add support for IP subnetting and/or +proxy-arp. This allows for the virtual network to use the same subnet as the +main LAN and should avoid need for the LAN admin to configure special routing. + +Libvirt will optionally also provide DHCP services to the virtual network using +DNSMASQ. In all cases, we need to allow DNS/DHCP queries to the host OS. Since +we can't predict whether the host firewall setup is already allowing this, we +insert 4 rules into the head of the INPUT chain + +:: + + target prot opt in out source destination + ACCEPT udp -- virbr0 * 0.0.0.0/0 0.0.0.0/0 udp dpt:53 + ACCEPT tcp -- virbr0 * 0.0.0.0/0 0.0.0.0/0 tcp dpt:53 + ACCEPT udp -- virbr0 * 0.0.0.0/0 0.0.0.0/0 udp dpt:67 + ACCEPT tcp -- virbr0 * 0.0.0.0/0 0.0.0.0/0 tcp dpt:67 + +Note we have restricted our rules to just the bridge associated with the virtual +network, to avoid opening undesirable holes in the host firewall wrt the +LAN/WAN. + +The next rules depend on the type of connectivity allowed, and go in the main +FORWARD chain: + +- | type=isolated + | Allow traffic between guests. Deny inbound. Deny outbound. 
+ + :: + + target prot opt in out source destination + ACCEPT all -- virbr1 virbr1 0.0.0.0/0 0.0.0.0/0 + REJECT all -- * virbr1 0.0.0.0/0 0.0.0.0/0 reject-with icmp-port-unreachable + REJECT all -- virbr1 * 0.0.0.0/0 0.0.0.0/0 reject-with icmp-port-unreachable + +- | type=nat + | Allow inbound related to an established connection. Allow outbound, but + only from our expected subnet. Allow traffic between guests. Deny all other + inbound. Deny all other outbound. + + :: + + target prot opt in out source destination + ACCEPT all -- * virbr0 0.0.0.0/0 192.168.122.0/24 state RELATED,ESTABLISHED + ACCEPT all -- virbr0 * 192.168.122.0/24 0.0.0.0/0 + ACCEPT all -- virbr0 virbr0 0.0.0.0/0 0.0.0.0/0 + REJECT all -- * virbr0 0.0.0.0/0 0.0.0.0/0 reject-with icmp-port-unreachable + REJECT all -- virbr0 * 0.0.0.0/0 0.0.0.0/0 reject-with icmp-port-unreachable + +- | type=routed + | Allow inbound, but only to our expected subnet. Allow outbound, but only + from our expected subnet. Allow traffic between guests. Deny all other + inbound. Deny all other outbound. 
+ + :: + + target prot opt in out source destination + ACCEPT all -- * virbr2 0.0.0.0/0 192.168.124.0/24 + ACCEPT all -- virbr2 * 192.168.124.0/24 0.0.0.0/0 + ACCEPT all -- virbr2 virbr2 0.0.0.0/0 0.0.0.0/0 + REJECT all -- * virbr2 0.0.0.0/0 0.0.0.0/0 reject-with icmp-port-unreachable + REJECT all -- virbr2 * 0.0.0.0/0 0.0.0.0/0 reject-with icmp-port-unreachable + +- Finally, with type=nat, there is also an entry in the POSTROUTING chain to + apply masquerading: + + :: + + target prot opt in out source destination + MASQUERADE all -- * * 192.168.122.0/24 !192.168.122.0/24 + +firewalld and the virtual network driver +---------------------------------------- + +If `firewalld <https://firewalld.org>`__ is active on the host, libvirt will +attempt to place the bridge interface of a libvirt virtual network into the +firewalld zone named "libvirt" (thus making all guest->host traffic on that +network subject to the rules of the "libvirt" zone). This is done because, if +firewalld is using its nftables backend (available since firewalld 0.6.0) the +default firewalld zone (which would be used if libvirt didn't explicitly set the +zone) prevents forwarding traffic from guests through the bridge, as well as +preventing DHCP, DNS, and most other traffic from guests to host. The zone named +"libvirt" is installed into the firewalld configuration by libvirt (not by +firewalld), and allows forwarded traffic through the bridge as well as DHCP, +DNS, TFTP, and SSH traffic to the host - depending on firewalld's backend this +will be implemented via either iptables or nftables rules. libvirt's own rules +outlined above will \*always\* be iptables rules regardless of which backend is +in use by firewalld. + +NB: It is possible to manually set the firewalld zone for a network's interface +with the "zone" attribute of the network's "bridge" element. 
+ +NB: Prior to libvirt 5.1.0, the firewalld "libvirt" zone did not exist, and +prior to firewalld 0.7.0 a feature crucial to making the "libvirt" zone operate +properly (rich rule priority settings) was not implemented in firewalld. In +cases where one or the other of the two packages is missing the necessary +functionality, it's still possible to have functional guest networking by +setting the firewalld backend to "iptables" (in firewalld prior to 0.6.0, this +was the only backend available). + +The network filter driver +------------------------- + +This driver provides a fully configurable network filtering capability that +leverages ebtables, iptables and ip6tables. This was written by the libvirt guys +at IBM and although its XML schema is defined by libvirt, the conceptual model +is closely aligned with the DMTF CIM schema for network filtering: + +https://www.dmtf.org/standards/cim/cim_schema_v2230/CIM_Network.pdf + +The filters are managed in libvirt as a top level, standalone object. This +allows the filters to then be referenced by any libvirt object that requires +their functionality, instead of tying them only to use by guest NICs. In the +current implementation, filters can be associated with individual guest NICs via +the libvirt domain XML format. In the future we might allow filters to be +associated with the virtual network objects. Further we're expecting to define a +new 'virtual switch' object to remove the complexity of configuring +bridge/sriov/vepa networking modes. This may also end up making use of network +filters.
+ +There is a new set of virsh commands for managing network filters: + +- ``virsh nwfilter-define`` + define or update a network filter from an XML file +- ``virsh nwfilter-undefine`` + undefine a network filter +- ``virsh nwfilter-dumpxml`` + network filter information in XML +- ``virsh nwfilter-list`` + list network filters +- ``virsh nwfilter-edit`` + edit XML configuration for a network filter + +There are equivalently named C APIs for each of these commands. + +As with all objects libvirt manages, network filters are configured using an XML +format. At a high level the format looks like this: + +:: + + <filter name='no-spamming' chain='XXXX'> + <uuid>d217f2d7-5a04-0e01-8b98-ec2743436b74</uuid> + + <rule ...> + .... + </rule> + + <filterref filter='XXXX'/> + </filter> + +Every filter has a name and UUID which serve as unique identifiers. A filter can +have zero-or-more ``<rule>`` elements which are used to actually define network +controls. Filters can be arranged into a DAG, so zero-or-more ``<filterref/>`` +elements are also allowed. Cycles in the graph are not allowed. + +The ``<rule>`` element is where all the interesting stuff happens. It has three +attributes, an action, a traffic direction and an optional priority. E.g.: + +:: + + <rule action='drop' direction='out' priority='500'> + +Within the rule there are a wide variety of elements allowed, which do protocol +specific matching. Supported protocols currently include ``mac``, ``arp``, +``rarp``, ``ip``, ``ipv6``, ``tcp/ip``, ``icmp/ip``, ``igmp/ip``, ``udp/ip``, +``udplite/ip``, ``esp/ip``, ``ah/ip``, ``sctp/ip``, ``tcp/ipv6``, ``icmp/ipv6``, +``igmp/ipv6``, ``udp/ipv6``, ``udplite/ipv6``, ``esp/ipv6``, ``ah/ipv6``, +``sctp/ipv6``. Each protocol defines what is valid inside the <rule> element.
+The general pattern though is:
+
+::
+
+   <protocol match='yes|no' attribute1='value1' attribute2='value2'/>
+
+So, e.g., a TCP match on source ports 0-1023 would be expressed as:
+
+::
+
+   <tcp match='yes' srcportstart='0' srcportend='1023'/>
+
+Attributes can include references to variables defined by the object using the
+rule. So the guest XML format allows each NIC to have a MAC address and IP
+address defined. These are made available to filters via the variables ``$IP``
+and ``$MAC``.
+
+So to define a filter that prevents IP address spoofing we can simply match on
+source IP address ``!= $IP`` like this:
+
+::
+
+   <filter name='no-ip-spoofing' chain='ipv4'>
+     <rule action='drop' direction='out'>
+       <ip match='no' srcipaddr='$IP' />
+     </rule>
+   </filter>
+
+I'm not going to go into details on all the other protocol matches you can do,
+because it'll take far too much space. You can read about the options
+`here <formatnwfilter.html#nwfelemsRulesProto>`__.
+
+Out of the box in RHEL6/Fedora rawhide, libvirt ships with a set of default
+useful rules:
+
+::
+
+   # virsh nwfilter-list
+   UUID                                  Name
+   ----------------------------------------------------------------
+   15b1ab2b-b1ac-1be2-ed49-2042caba4abb  allow-arp
+   6c51a466-8d14-6d11-46b0-68b1a883d00f  allow-dhcp
+   7517ad6c-bd90-37c8-26c9-4eabcb69848d  allow-dhcp-server
+   7680776c-77aa-496f-90d6-13097664b925  allow-dhcpv6
+   9cdaad60-7631-4172-8ccb-ef774be7485b  allow-dhcpv6-server
+   3d38b406-7cf0-8335-f5ff-4b9add35f288  allow-incoming-ipv4
+   908543c1-902e-45f6-a6ca-1a0ad35e7599  allow-incoming-ipv6
+   5ff06320-9228-2899-3db0-e32554933415  allow-ipv4
+   ce8904cc-ad3a-4454-896c-53452882f817  allow-ipv6
+   db0b1767-d62b-269b-ea96-0cc8b451144e  clean-traffic
+   6d6ddcc8-1242-4c43-ac63-63af80493132  clean-traffic-gateway
+   4cf38077-c7d5-4e25-99bb-6c4c9efad294  no-arp-ip-spoofing
+   0b11a636-ce58-497f-be90-17f63c92487a  no-arp-mac-spoofing
+   f88f1932-debf-4aa1-9fbe-f10d3aa4bc95  no-arp-spoofing
+   772f112d-52e4-700c-0250-e178a3d91a7a  no-ip-multicast
+   7ee20370-8106-765d-f7ff-8a60d5aaf30b  no-ip-spoofing
+   f8a51c43-a08f-49b3-b9e2-393d54522dc0  no-ipv6-multicast
+   a7f0afe9-a428-44b8-8566-c8ee2a669271  no-ipv6-spoofing
+   d5d3c490-c2eb-68b1-24fc-3ee362fc8af3  no-mac-broadcast
+   fb57c546-76dc-a372-513f-e8179011b48a  no-mac-spoofing
+   dba10ea7-446d-76de-346f-335bd99c1d05  no-other-l2-traffic
+   f5c78134-9da4-0c60-a9f0-fb37bc21ac1f  no-other-rarp-traffic
+   7637e405-4ccf-42ac-5b41-14f8d03d8cf3  qemu-announce-self
+   9aed52e7-f0f3-343e-fe5c-7dcb27b594e5  qemu-announce-self-rarp
+
+Most of these are just building blocks. The interesting one here is
+'clean-traffic'. This pulls together all the building blocks into one filter
+that you can then associate with a guest NIC. This stops the most common bad
+things a guest might try: IP spoofing, ARP spoofing and MAC spoofing. To look at
+the rules for any of these just do:
+
+::
+
+   virsh nwfilter-dumpxml FILTERNAME|UUID
+
+They are all stored in ``/etc/libvirt/nwfilter``, but don't edit the files there
+directly. Use ``virsh nwfilter-define`` to update them. This ensures the guests
+have their iptables/ebtables rules recreated.
+
+To associate the clean-traffic filter with a guest, edit the guest XML config
+and change the ``<interface>`` element to include a ``<filterref>`` and also
+specify the ``<ip address/>`` that the guest is allowed to use:
+
+::
+
+   <interface type='bridge'>
+     <mac address='52:54:00:56:44:32'/>
+     <source bridge='br1'/>
+     <ip address='10.33.8.131'/>
+     <target dev='vnet0'/>
+     <model type='virtio'/>
+     <filterref filter='clean-traffic'/>
+   </interface>
+
+If no ``<ip address>`` is included, the network filter driver will activate its
+'learning mode'. This uses libpcap to snoop on network traffic the guest sends
+and attempts to identify the first IP address it uses. It then locks traffic to
+this address. Obviously this isn't entirely secure, but it does offer some
+protection against the guest being trojaned once up and running. In the future
+we intend to enhance the learning mode so that it looks for DHCPOFFERS from a
+trusted DHCP server and only allows the offered IP address to be used.
+
+Now, how is all this implemented...?
+
+The network filter driver uses a combination of ebtables, iptables and
+ip6tables, depending on which protocols are referenced in a filter. The out of
+the box 'clean-traffic' filter rules only require use of ebtables. If you want
+to do matching at tcp/udp/etc protocols (eg to add a new filter
+'no-email-spamming' to block port 25), then iptables will also be used.
+
+The driver attempts to keep its rules separate from those that the host admin
+might already have configured. So the first thing it does with ebtables is to
+add two hooks in POSTROUTING and PREROUTING chains, to redirect traffic to
+custom chains. These hooks match on the TAP device name of the guest NIC, so
+they should not interact badly with any administrator defined rules:
+
+::
+
+   Bridge chain: PREROUTING, entries: 1, policy: ACCEPT
+   -i vnet0 -j libvirt-I-vnet0
+
+   Bridge chain: POSTROUTING, entries: 1, policy: ACCEPT
+   -o vnet0 -j libvirt-O-vnet0
+
+To keep things manageable and easy to follow, the driver will then create
+further sub-chains for each protocol that it needs to match against:
+
+::
+
+   Bridge chain: libvirt-I-vnet0, entries: 5, policy: ACCEPT
+   -p IPv4 -j I-vnet0-ipv4
+   -p ARP -j I-vnet0-arp
+   -p 0x8035 -j I-vnet0-rarp
+   -p 0x835 -j ACCEPT
+   -j DROP
+
+   Bridge chain: libvirt-O-vnet0, entries: 4, policy: ACCEPT
+   -p IPv4 -j O-vnet0-ipv4
+   -p ARP -j O-vnet0-arp
+   -p 0x8035 -j O-vnet0-rarp
+   -j DROP
+
+Finally, here comes the actual implementation of the filters. This example shows
+the 'clean-traffic' filter implementation. I'm not going to explain what this is
+doing now. :-)
+
+::
+
+   Bridge chain: I-vnet0-ipv4, entries: 2, policy: ACCEPT
+   -s ! 52:54:0:56:44:32 -j DROP
+   -p IPv4 --ip-src ! 10.33.8.131 -j DROP
+
+   Bridge chain: O-vnet0-ipv4, entries: 1, policy: ACCEPT
+   -j ACCEPT
+
+   Bridge chain: I-vnet0-arp, entries: 6, policy: ACCEPT
+   -s ! 52:54:0:56:44:32 -j DROP
+   -p ARP --arp-mac-src ! 52:54:0:56:44:32 -j DROP
+   -p ARP --arp-ip-src ! 10.33.8.131 -j DROP
+   -p ARP --arp-op Request -j ACCEPT
+   -p ARP --arp-op Reply -j ACCEPT
+   -j DROP
+
+   Bridge chain: O-vnet0-arp, entries: 5, policy: ACCEPT
+   -p ARP --arp-op Reply --arp-mac-dst ! 52:54:0:56:44:32 -j DROP
+   -p ARP --arp-ip-dst ! 10.33.8.131 -j DROP
+   -p ARP --arp-op Request -j ACCEPT
+   -p ARP --arp-op Reply -j ACCEPT
+   -j DROP
+
+   Bridge chain: I-vnet0-rarp, entries: 2, policy: ACCEPT
+   -p 0x8035 -s 52:54:0:56:44:32 -d Broadcast --arp-op Request_Reverse --arp-ip-src 0.0.0.0 --arp-ip-dst 0.0.0.0 --arp-mac-src 52:54:0:56:44:32 --arp-mac-dst 52:54:0:56:44:32 -j ACCEPT
+   -j DROP
+
+   Bridge chain: O-vnet0-rarp, entries: 2, policy: ACCEPT
+   -p 0x8035 -d Broadcast --arp-op Request_Reverse --arp-ip-src 0.0.0.0 --arp-ip-dst 0.0.0.0 --arp-mac-src 52:54:0:56:44:32 --arp-mac-dst 52:54:0:56:44:32 -j ACCEPT
+   -j DROP
+
+NB, we would have liked to include the prefix 'libvirt-' in all of our chain
+names, but unfortunately the kernel limits names to a very short maximum length.
+So only the first two custom chains can include that prefix. The others just
+include the TAP device name + protocol name.
+
+If I define a new filter 'no-spamming' and then add this to the 'clean-traffic'
+filter, I can illustrate how iptables usage works:
+
+::
+
+   # cat > /root/spamming.xml <<EOF
+   <filter name='no-spamming' chain='root'>
+     <uuid>d217f2d7-5a04-0e01-8b98-ec2743436b74</uuid>
+     <rule action='drop' direction='out' priority='500'>
+       <tcp dstportstart='25' dstportend='25'/>
+     </rule>
+   </filter>
+   EOF
+   # virsh nwfilter-define /root/spamming.xml
+   # virsh nwfilter-edit clean-traffic
+
+...add ``<filterref filter='no-spamming'/>``
+
+All active guests immediately have their iptables/ebtables rules rebuilt.
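+
+As a rough sketch of that edit (the XML comment stands in for the existing
+contents of 'clean-traffic', which are not reproduced here, and the filter's
+other attributes are omitted), the filter would end up with one extra
+reference:
+
+::
+
+   <filter name='clean-traffic'>
+     <!-- existing <filterref/> and <rule> elements stay unchanged -->
+     <filterref filter='no-spamming'/>
+   </filter>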
+
+The network filter driver deals with iptables in a very similar way. First it
+separates out its rules from those the admin may have defined, by adding a
+couple of hooks into the INPUT/FORWARD chains:
+
+::
+
+   Chain INPUT (policy ACCEPT 13M packets, 21G bytes)
+   target           prot opt in     out     source               destination
+   libvirt-host-in  all  --  *      *       0.0.0.0/0            0.0.0.0/0
+
+   Chain FORWARD (policy ACCEPT 5532K packets, 3010M bytes)
+   target           prot opt in     out     source               destination
+   libvirt-in       all  --  *      *       0.0.0.0/0            0.0.0.0/0
+   libvirt-out      all  --  *      *       0.0.0.0/0            0.0.0.0/0
+   libvirt-in-post  all  --  *      *       0.0.0.0/0            0.0.0.0/0
+
+These custom chains then do matching based on the TAP device name, so they won't
+open holes in the admin defined matches for the LAN/WAN (if any).
+
+::
+
+   Chain libvirt-host-in (1 references)
+   target     prot opt in     out     source               destination
+   HI-vnet0   all  --  *      *       0.0.0.0/0            0.0.0.0/0           [goto] PHYSDEV match --physdev-in vnet0
+
+   Chain libvirt-in (1 references)
+   target     prot opt in     out     source               destination
+   FI-vnet0   all  --  *      *       0.0.0.0/0            0.0.0.0/0           [goto] PHYSDEV match --physdev-in vnet0
+
+   Chain libvirt-in-post (1 references)
+   target     prot opt in     out     source               destination
+   ACCEPT     all  --  *      *       0.0.0.0/0            0.0.0.0/0           PHYSDEV match --physdev-in vnet0
+
+   Chain libvirt-out (1 references)
+   target     prot opt in     out     source               destination
+   FO-vnet0   all  --  *      *       0.0.0.0/0            0.0.0.0/0           [goto] PHYSDEV match --physdev-out vnet0
+
+Finally, we can see the interesting bit which is the actual implementation of my
+filter to block port 25 access:
+
+::
+
+   Chain FI-vnet0 (1 references)
+   target     prot opt in     out     source               destination
+   DROP       tcp  --  *      *       0.0.0.0/0            0.0.0.0/0           tcp dpt:25
+
+   Chain FO-vnet0 (1 references)
+   target     prot opt in     out     source               destination
+   DROP       tcp  --  *      *       0.0.0.0/0            0.0.0.0/0           tcp spt:25
+
+   Chain HI-vnet0 (1 references)
+   target     prot opt in     out     source               destination
+   DROP       tcp  --  *      *       0.0.0.0/0            0.0.0.0/0           tcp dpt:25
+
+One thing in looking at this you may notice is that if there are many guests all
+using the same filters, we will be duplicating the iptables rules over and over
+for each guest. This is merely a limitation of the current rules engine
+implementation. At the libvirt object modelling level you can clearly see we've
+designed the model so filter rules are defined in one place, and indirectly
+referenced by guests. Thus it should be possible to change the implementation in
+the future so we can share the actual iptables/ebtables rules for each guest to
+create a more scalable system. The stuff in current libvirt is more or less the
+very first working implementation we've had of this, so there's not been much
+optimization work done yet.
+
+Also notice that at the XML level we don't expose the fact we are using iptables
+or ebtables at all. The rule definition is done in terms of network protocols.
+Thus if we ever find a need, we could plug in an alternative implementation that
+calls out to a different firewall implementation instead of ebtables/iptables
+(providing that implementation was suitably expressive of course).
+
+Finally, in terms of problems we have in deployment, the biggest problem is that
+if the admin does ``service iptables restart`` all our work gets blown away.
+We've experimented with using lokkit to record our custom rules in a persistent
+config file, but that caused a different problem. Admins who were not using
+lokkit for their config found that all their own rules got blown away. So we
+threw away our lokkit code. Instead we document that if you run
+``service iptables restart``, you need to send SIGHUP to libvirt to make it
+recreate its rules.
+
+More in depth documentation on this is `here <formatnwfilter.html>`__.
diff --git a/docs/meson.build b/docs/meson.build
index 6147f85d16..aa8bad89f0 100644
--- a/docs/meson.build
+++ b/docs/meson.build
@@ -22,7 +22,6 @@ docs_html_in_files = [
   'csharp',
   'dbus',
   'docs',
-  'firewall',
   'format',
   'formatcaps',
   'formatdomaincaps',
@@ -82,6 +81,7 @@ docs_rst_files = [
   'drvvmware',
   'drvxen',
   'errors',
+  'firewall',
   'formatbackup',
   'formatcheckpoint',
   'formatdomain',
-- 
2.35.1