On Thu, Jun 4, 2020 at 5:47 AM Victor Julien <victor@xxxxxxxxxxxx> wrote: > > On 02-06-2020 22:18, Willem de Bruijn wrote: > > On Tue, Jun 2, 2020 at 4:05 PM Victor Julien <victor@xxxxxxxxxxxx> wrote: > >> > >> On 02-06-2020 21:38, Willem de Bruijn wrote: > >>> On Tue, Jun 2, 2020 at 3:22 PM Victor Julien <victor@xxxxxxxxxxxx> wrote: > >>>> > >>>> On 02-06-2020 21:03, Willem de Bruijn wrote: > >>>>> On Tue, Jun 2, 2020 at 2:31 PM Victor Julien <victor@xxxxxxxxxxxx> wrote: > >>>>>> On 02-06-2020 19:37, Willem de Bruijn wrote: > >>>>>>> On Tue, Jun 2, 2020 at 1:03 PM Victor Julien <victor@xxxxxxxxxxxx> wrote: > >>>>>>>> > >>>>>>>> On 02-06-2020 16:29, Willem de Bruijn wrote: > >>>>>>>>> On Tue, Jun 2, 2020 at 4:05 AM Victor Julien <victor@xxxxxxxxxxxx> wrote: > >>>>>>>>>> > >>>>>>>>>> Introduce a new flag (TP_STATUS_CSUM_UNNECESSARY) to indicate > >>>>>>>>>> that the driver has completely validated the checksums in the packet. > >>>>>>>>>> > >>>>>>>>>> The TP_STATUS_CSUM_UNNECESSARY flag differs from TP_STATUS_CSUM_VALID > >>>>>>>>>> in that the new flag will only be set if all the layers are valid, > >>>>>>>>>> while TP_STATUS_CSUM_VALID is set as well if only the IP layer is valid. > >>>>>>>>> > >>>>>>>>> transport, not ip checksum. > >>>>>>>> > >>>>>>>> Allow me a n00b question: what does transport refer to here? Things like > >>>>>>>> ethernet? It isn't clear to me from the doc. > >>>>>>> > >>>>>>> The TCP/UDP/.. transport protocol checksum. > >>>>>> > >>>>>> Hmm that is what I thought originally, but then it didn't seem to work. > >>>>>> Hence my patch. > >>>>>> > >>>>>> However I just redid my testing. I took the example tpacketv3 program > >>>>>> and added the status flag checks to the 'display()' func: > >>>>>> > >>>>>> if (ppd->tp_status & TP_STATUS_CSUM_VALID) { > >>>>>> printf("TP_STATUS_CSUM_VALID, "); > >>>>>> } > >>>>>> if (ppd->tp_status & (1<<8)) { > >>>>>> printf("TP_STATUS_CSUM_UNNECESSARY, "); > >>>>>> > >>>>>> } > >>>>>> > >>>>>> Then using scapy sent some packets in 2 variants: > >>>>>> - default (good csums) > >>>>>> - deliberately bad csums > >>>>>> (then also added a few things like ip6 over ip) > >>>>>> > >>>>>> > >>>>>> srp1(Ether()/IP(src="1.2.3.4", dst="5.6.7.8")/IPv6()/TCP(), > >>>>>> iface="enp1s0") // good csums > >>>>>> > >>>>>> srp1(Ether()/IP(src="1.2.3.4", dst="5.6.7.8")/IPv6()/TCP(chksum=1), > >>>>>> iface="enp1s0") //bad tcp > >>>>> > >>>>> Is this a test between two machines? What is the device driver of the > >>>>> machine receiving and printing the packet? It would be helpful to know > >>>>> whether this uses CHECKSUM_COMPLETE or CHECKSUM_UNNECESSARY. > >>>> > >>>> Yes 2 machines, or actually 2 machines and a VM. The receiving Linux > >>>> sits in a kvm vm with network pass through and uses the virtio driver > >>>> (host uses e1000e). Based on a quick 'git grep CHECKSUM_UNNECESSARY' > >>>> virtio seems to support that. > >>>> > >>>> I've done some more tests. In a pcap replay that I know contains packet > >>>> with bad TCP csums (but good IP csums for those pkts), to a physical > >>>> host running Ubuntu Linux kernel 5.3: > >>>> > >>>> - receiver uses nfp (netronome) driver: TP_STATUS_CSUM_VALID set for > >>>> every packet, including the bad TCP ones > >>>> - receiver uses ixgbe driver: TP_STATUS_CSUM_VALID not set for the bad > >>>> packets. > >>> > >>> Great. Thanks a lot for running all these experiments. > >>> > >>> We might have to drop the TP_STATUS_CSUM_VALID with CHECKSUM_COMPLETE > >>> unless skb->csum_valid. > >>> > >>> For packets with multiple transport layer checksums, > >>> CHECKSUM_UNNECESSARY should mean that all have been verified. > >>> > >>> I believe that in the case of multiple transport headers, csum_valid > >>> similarly ensures all checksums up to csum_start are valid. Will need > >>> to double check. > >>> > >>> If so, there probably is no need for a separate new TP_STATUS. > >>> TP_STATUS_CSUM_VALID is reported only when all checksums are valid. > >> > >> So if I understand you correctly the key may be in the call to > >> `skb_csum_unnecessary`: > >> > >> That reads: > >> > >> static inline int skb_csum_unnecessary(const struct sk_buff *skb) > >> { > >> return ((skb->ip_summed == CHECKSUM_UNNECESSARY) || > >> skb->csum_valid || > >> (skb->ip_summed == CHECKSUM_PARTIAL && > >> skb_checksum_start_offset(skb) >= 0)); > >> } > >> > >> But really only the first 2 conditions are reachable > > > > .. from this codepath. That function is called in other codepaths as well. > > > >> , as we already know > >> skb->ip_summed is not CHECKSUM_PARTIAL when we call it. > >> > >> So our unmodified check is: > >> > >> else if (skb->pkt_type != PACKET_OUTGOING && > >> (skb->ip_summed == CHECKSUM_COMPLETE || > >> skb->ip_summed == CHECKSUM_UNNECESSARY || > >> skb->csum_valid)) > >> > >> Should this become something like: > >> > >> else if (skb->pkt_type != PACKET_OUTGOING && > >> (skb->ip_summed == CHECKSUM_COMPLETE && > >> skb->csum_valid) || > >> skb->ip_summed == CHECKSUM_UNNECESSARY) > >> > >> Is this what you had in mind? > > > > I don't suggest modifying skb_csum_unnecessary probably. Certainly not > > until I've looked at all other callers of it. > > > > But in case of packet sockets, yes, adding that csum_valid check is my > > first rough approximation. > > > > That said, first let's give others more familiar with > > TP_STATUS_CSUM_VALID some time to comment. > > > > I did some more experiments, on real hw this time. I made the following > change to 5.7.0 (wasn't brave enough to remote upgrade a box to netnext): > > diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c > index 29bd405adbbd..3afb1913837a 100644 > --- a/net/packet/af_packet.c > +++ b/net/packet/af_packet.c > @@ -2216,8 +2216,8 @@ static int tpacket_rcv(struct sk_buff *skb, struct > net_device *dev, > if (skb->ip_summed == CHECKSUM_PARTIAL) > status |= TP_STATUS_CSUMNOTREADY; > else if (skb->pkt_type != PACKET_OUTGOING && > - (skb->ip_summed == CHECKSUM_COMPLETE || > - skb_csum_unnecessary(skb))) > + ((skb->ip_summed == CHECKSUM_COMPLETE && > skb->csum_valid) || > + skb->ip_summed == CHECKSUM_UNNECESSARY)) > status |= TP_STATUS_CSUM_VALID; > > if (snaplen > res) > > With this change it seems the TP_STATUS_CSUM_VALID flag is *never* set > for the nfp driver. I was mistaken. skb->csum_valid only signals whether the skb->csum field is initialized. As of commit 573e8fca255a ("net: skb_gro_checksum_* functions") skb->csum_valid it is always set if CHECKSUM_COMPLETE. This does not imply that the checksum field in the header is correct. The checksum field may get checked against the known checksum of the payload in skb->csum before __netif_receive_skb_core and thus before packet sockets during GRO when that is enabled. But not always. Not if the packet gets flushed, for instance, see tcp4_gro_receive. Commit 662880f44203 ("net: Allow GRO to use and set levels of checksum unnecessary") indicates that the original assumption in this patch that CHECKSUM_UNNECESSARY implies all checksums being valid does not necessarily hold. Drivers are expected to set up skb->csum_level when they have verified more than just the inner transport header.