Jakub Sitnicki wrote:
> On Mon, Oct 11, 2021 at 09:16 PM CEST, John Fastabend wrote:
> > We do not need to handle unhash from BPF side we can simply wait for the
> > close to happen. The original concern was a socket could transition from
> > ESTABLISHED state to a new state while the BPF hook was still attached.
> > But, we convinced ourself this is no longer possible and we also
> > improved BPF sockmap to handle listen sockets so this is no longer a
> > problem.
> >
> > More importantly though there are cases where unhash is called when data is
> > in the receive queue. The BPF unhash logic will flush this data which is
> > wrong. To be correct it should keep the data in the receive queue and allow
> > a receiving application to continue reading the data. This may happen when
> > tcp_abort is received for example. Instead of complicating the logic in
> > unhash simply moving all this to tcp_close hook solves this.
> >
> > Fixes: 51199405f9672 ("bpf: skb_verdict, support SK_PASS on RX BPF path")
> > Signed-off-by: John Fastabend <john.fastabend@xxxxxxxxx>
> > ---
>
> Doesn't this open the possibility of having a TCP_CLOSE socket in
> sockmap if I disconnect it, that is call connect(AF_UNSPEC), instead of
> close it?

Correct, it means we may have a TCP_CLOSE socket in the map. I'm not
seeing any problem with this though. A send on the socket would fail the
sk_state checks in the send hooks (tcp.c:1245). Receiving from the TCP
stack would fail with the normal TCP stack checks.

Maybe we also want a check on redirect into ingress that the sock is in
ESTABLISHED state? I might push that in its own patch; it seems related,
but I think we should have that there regardless of this patch.

Did you happen to see any issues on the sock_map side for the close case?
It looks good to me.

.John
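
To make the two state checks above concrete, here is a rough user-space
model. It is only a sketch, not kernel code: the enum and both helper names
are made up for illustration, and it assumes the send-side check lets data
through only in ESTABLISHED or CLOSE_WAIT, while the proposed ingress
redirect check would accept ESTABLISHED only.

```c
#include <stdbool.h>

/* Local stand-ins for the TCP states discussed in this thread.
 * These names are hypothetical and exist only in this sketch. */
enum sketch_tcp_state {
	SKETCH_TCP_ESTABLISHED = 1,
	SKETCH_TCP_CLOSE       = 7,
	SKETCH_TCP_CLOSE_WAIT  = 8,
	SKETCH_TCP_LISTEN      = 10,
};

/* Assumed model of the send-side sk_state check: a send only
 * proceeds when the socket is ESTABLISHED or CLOSE_WAIT, so a
 * disconnected (TCP_CLOSE) socket left in the map is rejected. */
static bool sketch_send_allowed(enum sketch_tcp_state state)
{
	return state == SKETCH_TCP_ESTABLISHED ||
	       state == SKETCH_TCP_CLOSE_WAIT;
}

/* Hypothetical form of the suggested extra check: only redirect
 * into the ingress path of a socket that is ESTABLISHED. */
static bool sketch_ingress_redirect_allowed(enum sketch_tcp_state state)
{
	return state == SKETCH_TCP_ESTABLISHED;
}
```

With this model a TCP_CLOSE socket that stays in the map after
connect(AF_UNSPEC) is refused on both the send and the redirect path.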