Include target: net vs net-next [PATCH net v3] Chengen Du wrote: > The issue initially stems from libpcap [1]. In the outbound packet path, > if hardware VLAN offloading is unavailable, the VLAN tag is inserted into > the payload but then cleared from the sk_buff struct. Consequently, this > can lead to a false negative when checking for the presence of a VLAN tag, > causing the packet sniffing outcome to lack VLAN tag information (i.e., > TCI-TPID). As a result, the packet capturing tool may be unable to parse > packets as expected. > > The TCI-TPID is missing because the prb_fill_vlan_info() function does not > modify the tp_vlan_tci/tp_vlan_tpid values, as the information is in the > payload and not in the sk_buff struct. The skb_vlan_tag_present() function > only checks vlan_all in the sk_buff struct. In cooked mode, the L2 header > is stripped, preventing the packet capturing tool from determining the > correct TCI-TPID value. Additionally, the protocol in SLL is incorrect, > which means the packet capturing tool cannot parse the L3 header correctly. > This does not add much context over v1 of the patch. But at least a pointer to context. > [1] https://github.com/the-tcpdump-group/libpcap/issues/1105 Prefer Link: $URL Please also add a Link to the conversation on patch 1: Link: https://lore.kernel.org/netdev/20240520070348.26725-1-chengen.du@xxxxxxxxxxxxx/T/#u > Fixes: f6fb8f100b80 ("af-packet: TPACKET_V3 flexible buffer implementation.") The referenced commit only introduces v3. The code changes to tpacket_rcv and packet_recvmsg indicate that this goes back further. Let's say to the introduction of explicitly passing VLAN information: Fixes: 393e52e33c6c ("packet: deliver VLAN TCI to userspace") > Cc: stable@xxxxxxxxxxxxxxx > Signed-off-by: Chengen Du <chengen.du@xxxxxxxxxxxxx> > --- > net/packet/af_packet.c | 18 ++++++++++++++++-- > 1 file changed, 16 insertions(+), 2 deletions(-) > > diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c > index ea3ebc160e25..82b36e90d73b 100644 > --- a/net/packet/af_packet.c > +++ b/net/packet/af_packet.c > @@ -1011,6 +1011,10 @@ static void prb_fill_vlan_info(struct tpacket_kbdq_core *pkc, > ppd->hv1.tp_vlan_tci = skb_vlan_tag_get(pkc->skb); > ppd->hv1.tp_vlan_tpid = ntohs(pkc->skb->vlan_proto); > ppd->tp_status = TP_STATUS_VLAN_VALID | TP_STATUS_VLAN_TPID_VALID; > + } else if (eth_type_vlan(pkc->skb->protocol)) { > + ppd->hv1.tp_vlan_tci = ntohs(vlan_eth_hdr(pkc->skb)->h_vlan_TCI); Careful about packet length. A malicious packet can be inserted that is an Ethernet header with zero payload, but ETH_P_8021Q as h_proto. See how __vlan_get_protocol carefully reads the headers. > + ppd->hv1.tp_vlan_tpid = ntohs(pkc->skb->protocol); > + ppd->tp_status = TP_STATUS_VLAN_VALID | TP_STATUS_VLAN_TPID_VALID; > } else { > ppd->hv1.tp_vlan_tci = 0; > ppd->hv1.tp_vlan_tpid = 0; > @@ -2428,6 +2432,10 @@ static int tpacket_rcv(struct sk_buff *skb, struct net_device *dev, > h.h2->tp_vlan_tci = skb_vlan_tag_get(skb); > h.h2->tp_vlan_tpid = ntohs(skb->vlan_proto); > status |= TP_STATUS_VLAN_VALID | TP_STATUS_VLAN_TPID_VALID; > + } else if (eth_type_vlan(skb->protocol)) { > + h.h2->tp_vlan_tci = ntohs(vlan_eth_hdr(skb)->h_vlan_TCI); > + h.h2->tp_vlan_tpid = ntohs(skb->protocol); > + status |= TP_STATUS_VLAN_VALID | TP_STATUS_VLAN_TPID_VALID; > } else { > h.h2->tp_vlan_tci = 0; > h.h2->tp_vlan_tpid = 0; > @@ -2457,7 +2465,8 @@ static int tpacket_rcv(struct sk_buff *skb, struct net_device *dev, > sll->sll_halen = dev_parse_header(skb, sll->sll_addr); > sll->sll_family = AF_PACKET; > sll->sll_hatype = dev->type; > - sll->sll_protocol = skb->protocol; > + sll->sll_protocol = (skb->protocol == htons(ETH_P_8021Q)) ? > + vlan_eth_hdr(skb)->h_vlan_encapsulated_proto : skb->protocol; In SOCK_RAW mode, the VLAN tag will be present, so should be returned. I'm concerned about returning a different value between SOCK_RAW and SOCK_DGRAM. But don't immediately see a better option. And for SOCK_DGRAM this approach is indistinguishable from the result on a device with hardware offload, so is acceptable. This test for ETH_P_8021Q ignores the QinQ stacked VLAN case. When fixing VLAN encap, both variants should be addressed at the same time. Note that ETH_P_8021AD is included in the eth_type_vlan test you call above. All these extra branches also makes the common case slower. Let's try to mitigate that as much as possible. > sll->sll_pkttype = skb->pkt_type; > if (unlikely(packet_sock_flag(po, PACKET_SOCK_ORIGDEV))) > sll->sll_ifindex = orig_dev->ifindex; > @@ -3482,7 +3491,8 @@ static int packet_recvmsg(struct socket *sock, struct msghdr *msg, size_t len, > /* Original length was stored in sockaddr_ll fields */ > origlen = PACKET_SKB_CB(skb)->sa.origlen; > sll->sll_family = AF_PACKET; > - sll->sll_protocol = skb->protocol; > + sll->sll_protocol = (skb->protocol == htons(ETH_P_8021Q)) ? > + vlan_eth_hdr(skb)->h_vlan_encapsulated_proto : skb->protocol; > } > > sock_recv_cmsgs(msg, sk, skb); > @@ -3539,6 +3549,10 @@ static int packet_recvmsg(struct socket *sock, struct msghdr *msg, size_t len, > aux.tp_vlan_tci = skb_vlan_tag_get(skb); > aux.tp_vlan_tpid = ntohs(skb->vlan_proto); > aux.tp_status |= TP_STATUS_VLAN_VALID | TP_STATUS_VLAN_TPID_VALID; > + } else if (eth_type_vlan(skb->protocol)) { > + aux.tp_vlan_tci = ntohs(vlan_eth_hdr(skb)->h_vlan_TCI); > + aux.tp_vlan_tpid = ntohs(skb->protocol); > + aux.tp_status |= TP_STATUS_VLAN_VALID | TP_STATUS_VLAN_TPID_VALID; > } else { > aux.tp_vlan_tci = 0; > aux.tp_vlan_tpid = 0; > -- > 2.40.1 >