On Sat, 14 Mar 2020 at 02:58, Alexei Starovoitov
<alexei.starovoitov@xxxxxxxxx> wrote:
>
> I'm not following. There is skb->sk. Why do you need to lookup sk ? Because
> your hook is before demux and skb->sk is not set? Then move your hook to after?
>
> I think we're arguing in circles because in this thread I haven't seen the
> explanation of the problem you're trying to solve. We argued about your
> proposed solution and got stuck. Can we restart from the beginning with all
> details?

Yes, that's a good idea. I mentioned this in passing in my cover letter,
but should have provided more context.

Jakub is working on a patch series to add a BPF hook to socket dispatch
[1], aka the inet_lookup function. The core idea is to control skb->sk
via a BPF program. Hence, we can't use skb->sk.

Introducing this hook poses another problem: we need to get the struct sk
from somewhere. The canonical way in BPF is to use the lookup_sk helpers.
Of course that doesn't work, since our hook would invoke itself. So we
need a data structure that can hold sockets, to be used by programs
attached on the new hook.

Jakub's RFC patch set used REUSEPORT_SOCKARRAY for this. During LPC '19
we got feedback that sockmap is probably the better choice. As a result,
Jakub started working on extending sockmap TCP support and after a while
I joined to add UDP.

Now, we are looking at what our control plane could look like. Based on
the inet-tool work that Marek Majkowski has done [2], we currently have
the following setup:

* An LPM map that goes from IP prefix and port to an index in a sockmap
* A sockmap that holds sockets
* A BPF program that performs the business logic

inet-tool is used to update the two maps to add and remove mappings on
the fly. Essentially, services donate their sockets either via fork+exec
or SCM_RIGHTS on a Unix socket.

Once we have inserted a socket in the sockmap, it's not possible to
retrieve it again.
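
To make the two-map setup above concrete, here is a rough userspace
model of the lookup path in plain C. It is only a sketch: it is not
kernel or inet-tool code, every name in it (lpm_rule, lpm_lookup,
select_socket, the slot labels) is made up for illustration, and it
assumes the port is matched exactly while only the IP is prefix-matched.
The real key layout of the LPM map may well differ.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names come from the
 * kernel or from inet-tool. */

#define NO_MATCH     UINT32_MAX
#define SOCKMAP_SIZE 4

/* One rule in the "LPM map": (IPv4 prefix, port) -> sockmap index.
 * Assumption: the port must match exactly, only the IP is prefix-matched. */
struct lpm_rule {
    uint32_t prefix;      /* network address, host byte order */
    uint8_t  prefix_len;  /* number of leading bits that must match */
    uint16_t port;
    uint32_t sock_index;  /* slot in the sockmap */
};

/* Stand-in for the sockmap: each slot holds a label instead of a socket. */
static const char *sockmap[SOCKMAP_SIZE] = { "dns", "http", "http-canary", NULL };

static const struct lpm_rule rules[] = {
    { 0xC0000200u, 24, 53, 0 },  /* 192.0.2.0/24   port 53 -> slot 0 */
    { 0xC0000200u, 24, 80, 1 },  /* 192.0.2.0/24   port 80 -> slot 1 */
    { 0xC0000280u, 25, 80, 2 },  /* 192.0.2.128/25 port 80 -> slot 2 */
};

static uint32_t prefix_mask(uint8_t len)
{
    /* Shifting a 32-bit value by 32 is undefined, so handle /0 separately. */
    return len == 0 ? 0 : (uint32_t)(UINT32_MAX << (32 - len));
}

/* Longest-prefix match by linear scan; a real LPM map uses a trie. */
static uint32_t lpm_lookup(const struct lpm_rule *tbl, size_t n,
                           uint32_t ip, uint16_t port)
{
    uint32_t best = NO_MATCH;
    int best_len = -1;

    for (size_t i = 0; i < n; i++) {
        if (tbl[i].port != port)
            continue;
        if ((ip & prefix_mask(tbl[i].prefix_len)) != tbl[i].prefix)
            continue;
        if ((int)tbl[i].prefix_len > best_len) {
            best_len = tbl[i].prefix_len;
            best = tbl[i].sock_index;
        }
    }
    return best;
}

/* The "business logic": map (ip, port) to an index, then to a socket slot.
 * Returns NULL when no rule matches or the slot is empty. */
static const char *select_socket(uint32_t ip, uint16_t port)
{
    uint32_t idx = lpm_lookup(rules, sizeof(rules) / sizeof(rules[0]), ip, port);

    if (idx == NO_MATCH || idx >= SOCKMAP_SIZE)
        return NULL;
    return sockmap[idx];
}
```

In this model, select_socket(0xC0000290, 80) (192.0.2.144, port 80)
picks slot 2 because the /25 rule is more specific than the /24 one.
The indirection through an index is what lets the control plane repoint
a prefix without touching the socket itself.
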
Not being able to read sockets back out of the sockmap makes it
impossible, with our current design, to change the position of a socket
in the map, to resize the map, and so on.

One way to work around this is to add a persistent component to our
control plane: a process can hold on to the sockets and rebuild the map
when necessary. The downsides are that upgrading the service is
non-trivial (since we need to pass the socket fds) and that a failure of
this service is catastrophic. Once it happens, we probably have to reboot
the machine to get it into a workable state again.

We'd like to avoid a persistent service if we can. If fds could be
looked up from the sockmap, we could make this part of our control plane
more robust.

1: https://www.youtube.com/watch?v=qRDoUpqvYjY
2: https://github.com/majek/inet-tool

I hope this explanation helps, sorry for not being more thorough in the
original cover letter!

Lorenz

-- 
Lorenz Bauer | Systems Engineer
6th Floor, County Hall/The Riverside Building, SE1 7PB, UK

www.cloudflare.com