Mon, 20 Jul 2020 at 15:17, Edward Cree <ecree@xxxxxxxxxxxxxx>:
>
> On 20/07/2020 09:15, Alexander Petrovsky wrote:
> > But, the main problem for us it's fragmented IP packets. Some times
> > ago I tried to use for such packets AF_XDP, fast pass them into the
> > user space, accumulate and after that pass back to the network, it was
> > a PoC.
> Not 100% sure this works because I haven't tried it, but as long as
> packets aren't being re-ordered, you can do it without needing to
> save the payload in a map.
> All the map needs to store is (for each IPID being tracked) what host
> this connection goes to.
> If you receive a First Fragment (frag_off=0, MF=1), you look up the
> tuple through the regular LB to pick a server, and record that host
> in the map entry for the IPID.
> For any other fragment, you look up the IPID in the map to get the
> destination host, and if MF=0 you delete the map entry.
> (If the IPID wasn't found, either drop or punt to userspace.)
> Then TX/REDIRECT the packet to the appropriate host.
> You might want to add some kind of simple ageing to this so that map
> entries from interrupted/spurious fragment chains don't stick around
> and build up over time.
>
> The problem comes when 'middle' fragments can either come after the
> last (MF=0) fragment (technically this can be handled by tracking
> the byte range seen for the IPID, and not deleting from the map
> until all bytes up to the frag_off+total_len of the last-frag have
> been seen), or worse, before the first fragment. If the frag_off=0
> fragment isn't the first one received, then this doesn't work
> because you don't know at the time of receiving fragments what L4
> ports they belong to. But I don't know how common that situation is
> and whether having it take the slow-path is acceptable.
>
> HTH,
> -ed

Unfortunately, for UDP I can't just pick some _random_ host when the
first _seen_ fragment is not the First Fragment (frag_off=0, MF=1).
In this case, I have to accumulate ALL fragments in a map and, on each
received fragment, check whether all fragments have been collected. I did
that in my PoC with AF_XDP, but in the PoC it all seemed unreliable.

--
Alexander Petrovsky
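
For reference, the map-only scheme quoted above could be sketched roughly
as follows. This is a plain-Python simulation of the lookup logic only, not
XDP/eBPF code, and nobody in this thread has tested it. The names Fragment,
FragmentSteering, _pick_host and the ttl parameter are hypothetical. Two
things are assumptions on my side rather than part of the quoted proposal:
the map key is (src, dst, proto, IPID) instead of the bare IPID, and the
"regular LB" is stood in for by a CRC32 hash of the 5-tuple. A return value
of None means "drop or punt to the slow path", which is exactly the
out-of-order case discussed above.

```python
import zlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    src: str
    dst: str
    proto: int
    ipid: int
    frag_off: int       # only zero vs. non-zero matters in this sketch
    more_frags: bool    # the MF flag
    sport: int = 0      # L4 ports are only present when frag_off == 0
    dport: int = 0


class FragmentSteering:
    def __init__(self, hosts, ttl=30.0):
        self.hosts = list(hosts)
        self.ttl = ttl      # simple ageing: seconds an idle entry stays valid
        self.flows = {}     # (src, dst, proto, ipid) -> (host, last_seen)

    def _pick_host(self, frag):
        # Hypothetical stand-in for the regular load balancer decision.
        tup = f"{frag.src}|{frag.dst}|{frag.proto}|{frag.sport}|{frag.dport}"
        return self.hosts[zlib.crc32(tup.encode()) % len(self.hosts)]

    def steer(self, frag, now):
        """Return the destination host, or None to drop/punt."""
        key = (frag.src, frag.dst, frag.proto, frag.ipid)

        if frag.frag_off == 0:
            host = self._pick_host(frag)
            if frag.more_frags:
                # First Fragment: remember the decision for the rest.
                self.flows[key] = (host, now)
            return host

        entry = self.flows.get(key)
        if entry is None or now - entry[1] > self.ttl:
            # Unknown or aged-out chain: the L4 ports are not available here.
            self.flows.pop(key, None)
            return None

        host = entry[0]
        if frag.more_frags:
            self.flows[key] = (host, now)   # refresh the timestamp
        else:
            del self.flows[key]             # last fragment (MF=0)
        return host
```

With in-order delivery, every fragment of a datagram gets the same host and
the entry disappears after the MF=0 fragment. If a middle fragment shows up
first, steer() returns None, so the accumulate-everything fallback is still
needed for that case.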