On Wed, Apr 20, 2022 at 10:32 PM Hangbin Liu <liuhangbin@xxxxxxxxx> wrote: > > On Wed, Apr 20, 2022 at 09:45:15AM -0400, Willem de Bruijn wrote: > > On Wed, Apr 20, 2022 at 4:28 AM Hangbin Liu <liuhangbin@xxxxxxxxx> wrote: > > > > > > Currently, the kernel drops GSO VLAN tagged packet if it's created with > > > socket(AF_PACKET, SOCK_RAW, 0) plus virtio_net_hdr. > > > > > > The reason is AF_PACKET doesn't adjust the skb network header if there is > > > a VLAN tag. Then after virtio_net_hdr_set_proto() called, the skb->protocol > > > will be set to ETH_P_IP/IPv6. And in later inet/ipv6_gso_segment() the skb > > > is dropped as network header position is invalid. > > > > > > Let's handle VLAN packets by adjusting network header position in > > > packet_parse_headers(), and move the function in packet_snd() before > > > calling virtio_net_hdr_set_proto(). > > > > The network header is set in > > > > skb_reset_network_header(skb); > > > > err = -EINVAL; > > if (sock->type == SOCK_DGRAM) { > > offset = dev_hard_header(skb, dev, ntohs(proto), addr, > > NULL, len); > > if (unlikely(offset < 0)) > > goto out_free; > > } else if (reserve) { > > skb_reserve(skb, -reserve); > > if (len < reserve + sizeof(struct ipv6hdr) && > > dev->min_header_len != dev->hard_header_len) > > skb_reset_network_header(skb); > > } > > > > If all that is needed is to move the network header beyond an optional > > VLAN tag in the case of SOCK_RAW, then this can be done in the else > > for Ethernet packets. > > Before we set network position, we need to check the skb->protocol to make > sure it's a VLAN packet. > > If we set skb->protocol and adjust network header here, like > packet_parse_headers() does. How should we do with later > > skb->protocol = proto; > skb->dev = dev; > > settings? > > If you are afraid that skb_probe_transport_header(skb) would affect the > virtio_net_hdr operation. How about split the skb_probe_transport_header() > from packet_parse_headers()? Something like > > --- a/net/packet/af_packet.c > +++ b/net/packet/af_packet.c > @@ -1924,13 +1924,19 @@ static int packet_rcv_spkt(struct sk_buff *skb, struct net_device *dev, > > static void packet_parse_headers(struct sk_buff *skb, struct socket *sock) > { > + int depth; > + > if ((!skb->protocol || skb->protocol == htons(ETH_P_ALL)) && > sock->type == SOCK_RAW) { > skb_reset_mac_header(skb); > skb->protocol = dev_parse_header_protocol(skb); > } > > - skb_probe_transport_header(skb); > + /* Move network header to the right position for VLAN tagged packets */ > + if (likely(skb->dev->type == ARPHRD_ETHER) && > + eth_type_vlan(skb->protocol) && > + __vlan_get_protocol(skb, skb->protocol, &depth) != 0) > + skb_set_network_header(skb, depth); > } > > /* > @@ -3047,6 +3055,8 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len) > skb->mark = sockc.mark; > skb->tstamp = sockc.transmit_time; > > + packet_parse_headers(skb, sock); > + > if (has_vnet_hdr) { > err = virtio_net_hdr_to_skb(skb, &vnet_hdr, vio_le()); > if (err) > @@ -3055,7 +3065,7 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len) > virtio_net_hdr_set_proto(skb, &vnet_hdr); > } > > - packet_parse_headers(skb, sock); > + skb_probe_transport_header(skb); > > if (unlikely(extra_len == 4)) > skb->no_fcs = 1; > > > > > > It is not safe to increase reserve, as that would eat into the > > reserved hlen LL_RESERVED_SPACE(dev), which does not account for > > optional VLAN headers. > > > > Instead of setting here first, then patching up again later in > > packet_parse_headers. > > > > This change affects all packets with VLAN headers, not just those with > > GSO. I imagine that moving the network header is safe for all, but > > don't know that code well enough to verify that it does not have > > unintended side effects. Does dev_queue_xmit expect the network header > > to point to the start of the VLAN headers or after, for instance? > > I think adjust the network position should be safe, as tap device also did that. Oh, it's great to be able to point to such prior art. Can you mention that in the commit message? Your approach does sound simpler than the above. Thanks for looking into that alternative, though. > > Thanks > Hangbin _______________________________________________ Virtualization mailing list Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx https://lists.linuxfoundation.org/mailman/listinfo/virtualization