Cong Wang wrote: > On Mon, Feb 15, 2021 at 4:54 PM John Fastabend <john.fastabend@xxxxxxxxx> wrote: > > > > Cong Wang wrote: > > > On Mon, Feb 15, 2021 at 3:57 PM John Fastabend <john.fastabend@xxxxxxxxx> wrote: > > > > > > > > For TCP case we can continue to use CB and not pay the price. For UDP > > > > and AF_UNIX we can do the extra alloc. > > > > > > I see your point, but specializing TCP case does not give much benefit > > > here, the skmsg code would have to check skb->protocol etc. to decide > > > whether to use TCP_SKB_CB() or skb_ext: > > > > > > if (skb->protocol == ...) > > > TCP_SKB_CB(skb) = ...; > > > else > > > ext = skb_ext_find(skb); > > > > > > which looks ugly to me. And I doubt skb->protocol alone is sufficient to > > > distinguish TCP, so we may end up having more checks above. > > > > > > So do you really want to trade code readability with an extra alloc? > > > > Above is ugly. So I look at where the patch replaces things, > > > > sk_psock_tls_strp_read(), this is TLS specific read hook so can't really > > work in generic case anyways. > > > > sk_psock_strp_read(), will you have UDP, AF_UNIX stream parsers? Do these > > even work outside TCP cases. > > > > For these ones: sk_psock_verdict_apply(), sk_psock_verdict_recv(), > > sk_psock_backlog(), can't we just do some refactoring around their > > hook points so we know the context. For example sk_psock_tls_verdict_apply > > is calling sk_psock_skb_redirect(). Why not have a sk_psock_unix_redirect() > > and a sk_psock_udp_redirect(). There are likely some optimizations we can > > deploy this way. We've already don this for tls and sk_msg types for example. > > > > Then the helpers will know their types by program type, just use the right > > variants. > > > > So not suggestiong if/else the checks so much as having per type hooks. > > > > Hmm, but sk_psock_backlog() is still the only one that handles all three > above cases, right? It uses TCP_SKB_CB() too and more importantly it > is also why we can't use a per-cpu struct here (see bpf_redirect_info). Right, but the workqueue is created at init time where we will know the socket type based on the program/map types so can build the redirect backlog queue there based on the type needed. I also have a patch in mind that would do more specific TCP things in that code anyways. I can flush it out this week if anyone cares. The idea is we are wasting lots of cycles using skb_send_sock_locked when we can just inject the packet directlyy into the tcp stack. Also on the original patch here, we can't just kfree_skb() on alloc errors because that will look like a data drop. Errors need to be handled gracefully without dropping data. At least in the TCP case, but probably also in UDP and AF_UNIX cases as well. Original code was pretty loose in this regard, but it caused users to write bug reports and then I've been fixing most of them. If you see more cases let me know. > > Thanks.