IP MTU and L2 MTU are different animals.
IMHO, IP MTU is for fragmentation at the sender side of a link. There is
no need to drop IP packets at the receiver when their size exceeds the
configured IP MTU. IP packets larger than the receiver's L2 MTU will be
dropped at the sub-IP layer.

For this patch: if veth has some notion of an L2 MTU (e.g. buffer size
limits), there have to be checks for it. I don't know why configuring an
MRU helps; more config means more mistakes. If there is no need to drop
the packet: don't.

Teco

> On 11 May 2017, at 21:10, Fredrik Markström <fredrik.markstrom@xxxxxxxxx> wrote:
>
> On Thu, May 11, 2017 at 6:01 PM, Stephen Hemminger
> <stephen@xxxxxxxxxxxxxxxxxx> wrote:
>> On Thu, 11 May 2017 15:46:27 +0200
>> Fredrik Markstrom <fredrik.markstrom@xxxxxxxxx> wrote:
>>
>>> From: Fredrik Markström <fredrik.markstrom@xxxxxxxxx>
>>>
>>> is_skb_forwardable() currently checks if the packet size is <= mtu of
>>> the receiving interface. This is not consistent with most of the hardware
>>> ethernet drivers that happily receive packets larger than MTU.
>>
>> Wrong.
>
> What is "Wrong"? I was initially skeptical about implementing this patch,
> since it feels odd to have different MTUs set on the two sides of a
> link. After consulting some IP people and the RFCs I kind of changed
> my mind and thought I'd give it a shot. In the RFCs I couldn't find
> anything that defined when a received packet should and should not be
> dropped.
>
>>
>> Hardware interfaces are free to drop any packet greater than MTU (actually MTU + VLAN).
>> The actual limit is a function of the hardware. Some hardware can only limit by
>> power of 2; some can only limit frames larger than 1500; some have no limiting at all.
>
> Agreed. The purpose of these patches is to be able to configure a
> veth interface to mimic these different behaviors. None of the Ethernet
> interfaces I have access to drop packets for being larger than the
> configured MTU the way veth does.
>
> Being able to mimic real Ethernet hardware is useful when
> consolidating hardware using containers/namespaces.
>
> In a reply to a comment from David Miller on my previous version of
> the patch I attached the example below to demonstrate the case in
> detail.
>
> This works with all Ethernet hardware setups I have access to:
>
> ---- 8< ------
> # Host A eth2 and Host B eth0 are on the same network.
>
> # On HOST A
> % ip address add 1.2.3.4/24 dev eth2
> % ip link set eth2 mtu 300 up
>
> % # HOST B
> % ip address add 1.2.3.5/24 dev eth0
> % ip link set eth0 mtu 1000 up
> % ping -c 1 -W 1 -s 400 1.2.3.4
> PING 1.2.3.4 (1.2.3.4) 400(428) bytes of data.
> 408 bytes from 1.2.3.4: icmp_seq=1 ttl=64 time=1.57 ms
>
> --- 1.2.3.4 ping statistics ---
> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> rtt min/avg/max/mdev = 1.573/1.573/1.573/0.000 ms
> ---- 8< ------
>
>
> But it doesn't work with veth:
>
> ---- 8< ------
> # veth0 and veth1 are a veth pair and veth1 has been moved to a separate
> # network namespace.
> % # NS A
> % ip address add 1.2.3.4/24 dev veth0
> % ip link set veth0 mtu 300 up
>
> % # NS B
> % ip address add 1.2.3.5/24 dev veth1
> % ip link set veth1 mtu 1000 up
> % ping -c 1 -W 1 -s 400 1.2.3.4
> PING 1.2.3.4 (1.2.3.4) 400(428) bytes of data.
>
> --- 1.2.3.4 ping statistics ---
> 1 packets transmitted, 0 received, 100% packet loss, time 0ms
> ---- 8< ------
>
> --
> /Fredrik
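
To make the arithmetic in the veth transcript concrete, here is a rough
Python model of a receive-side length check of the kind described for
is_skb_forwardable(). It is only a sketch and not the kernel code: the
function names are hypothetical, and the 14-byte Ethernet header and
4-byte VLAN allowance added to the MTU are my assumptions about how the
limit is computed. The 428-byte IP packet size is taken from the ping
output above.

```python
# Simplified model of a receive-side MTU check (hypothetical names, not kernel API).
ETH_HLEN = 14   # assumed Ethernet header length counted in the frame
VLAN_HLEN = 4   # assumed allowance for one VLAN tag ("MTU + VLAN")


def is_forwardable(frame_len, rx_mtu, is_up=True, is_gso=False):
    """Return True if a frame of frame_len bytes passes the receiver's check."""
    if not is_up:
        return False
    if frame_len <= rx_mtu + ETH_HLEN + VLAN_HLEN:
        return True
    # Assumption: oversized GSO packets are let through, since they can be
    # segmented later.
    return is_gso


def ping_frame_len(payload):
    """Frame length of an ICMP echo: payload + ICMP (8) + IPv4 (20) + Ethernet."""
    return payload + 8 + 20 + ETH_HLEN


frame = ping_frame_len(400)           # 400(428) bytes of data, plus Ethernet header
print(frame)                          # 442
print(is_forwardable(frame, 300))     # False: receiver MTU 300 allows at most 318
print(is_forwardable(frame, 1000))    # True: receiver MTU 1000 allows up to 1018
```

Under these assumptions the request from veth1 (MTU 1000) is sent
unfragmented as a 442-byte frame and fails the check on veth0 (MTU 300),
which matches the 100% packet loss in the second transcript.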