Jakub Kicinski wrote:
> On Tue, 14 May 2019 15:34:55 -0700, John Fastabend wrote:
> > John Fastabend wrote:
> > > Jakub Kicinski wrote:
> > > > On Thu, 09 May 2019 21:57:49 -0700, John Fastabend wrote:
> > > > > @@ -2042,12 +2060,14 @@ void tls_sw_free_resources_tx(struct sock *sk)
> > > > >  	if (atomic_read(&ctx->encrypt_pending))
> > > > >  		crypto_wait_req(-EINPROGRESS, &ctx->async_wait);
> > > > >
> > > > > -	release_sock(sk);
> > > > > +	if (locked)
> > > > > +		release_sock(sk);
> > > > >  	cancel_delayed_work_sync(&ctx->tx_work.work);
> > > >
> > > > So in the splat I got (on a slightly hacked up kernel) it seemed like
> > > > unhash may be called in atomic context:
> > > >
> > > > [ 783.232150] tls_sk_proto_unhash+0x72/0x110 [tls]
> > > > [ 783.237497] tcp_set_state+0x484/0x640
> > > > [ 783.241776] ? __sk_mem_reduce_allocated+0x72/0x4a0
> > > > [ 783.247317] ? tcp_recv_timestamp+0x5c0/0x5c0
> > > > [ 783.252265] ? tcp_write_queue_purge+0xa6a/0x1180
> > > > [ 783.257614] tcp_done+0xac/0x260
> > > > [ 783.261309] tcp_reset+0xbe/0x350
> > > > [ 783.265101] tcp_validate_incoming+0xd9d/0x1530
> > > >
> > > > I may have been unclear off-list, I only tested the patch no longer
> > > > crashes the offload :(
> > > >
> > >
> > > Yep, I misread and thought it was resolved here as well. OK I'll dig into
> > > it. I'm not seeing it from selftests but I guess that means we are missing
> > > a testcase. :( yet another version I guess.
> > >
> >
> > Seems we need to call release_sock in the unhash case as well. Will
> > send a new patch shortly.
>
> My reading of the stack trace was that unhash gets called from
> tcp_reset(), IOW from soft IRQ, so we can't cancel_delayed_work_sync()
> in tls_sw_free_resources_tx(), no?

Well, the tcp_close() path has the lock held and can also call unhash().
Anyway, dropping the sock lock in the middle of the block seems a bit
suspect to me.
I think we can defer the free until after the sock is released; this is
how it was solved on the sockmap side.
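
To make the deferral idea concrete, here is a rough userspace toy, not
kernel code and not the actual patch. Every name in it (toy_sock,
toy_unhash, toy_deferred_worker, ...) is made up for illustration. The
sock lock is modeled as a plain flag, and the worker simply skips while
the lock is held, which is a simplification of what a real workqueue
item would do. The point is only that the unhash hook never does the
blocking teardown itself, so it is safe from both the tcp_close() path
(lock held) and the tcp_reset() path (soft IRQ).

```c
#include <stdbool.h>

/* Hypothetical stand-in for the socket plus its tls tx context. */
struct toy_sock {
	bool locked;                /* models lock_sock()/release_sock() */
	bool free_deferred;         /* teardown requested, not yet done */
	int  teardowns;             /* number of blocking teardowns run */
	int  teardowns_while_locked;/* teardowns that ran with lock held */
};

/*
 * Stands in for the part that may sleep, e.g. the
 * cancel_delayed_work_sync() call plus freeing the context.
 */
static void toy_blocking_teardown(struct toy_sock *sk)
{
	if (sk->locked)
		sk->teardowns_while_locked++;
	sk->teardowns++;
}

static void toy_lock(struct toy_sock *sk)
{
	sk->locked = true;
}

static void toy_release(struct toy_sock *sk)
{
	sk->locked = false;
}

/*
 * unhash-like hook: may run with the lock held or from a context
 * that must not sleep, so it only records that a free is wanted.
 */
static void toy_unhash(struct toy_sock *sk)
{
	sk->free_deferred = true;
}

/*
 * Models the deferred worker. It runs the blocking teardown only
 * once the lock has been dropped, and at most once per request.
 * Returns true if the teardown ran in this call.
 */
static bool toy_deferred_worker(struct toy_sock *sk)
{
	if (!sk->free_deferred || sk->locked)
		return false;
	sk->free_deferred = false;
	toy_blocking_teardown(sk);
	return true;
}
```

With this, a close-like sequence is toy_lock(), toy_unhash(),
toy_release(), and then the worker does the teardown; a reset-like
sequence just calls toy_unhash() and leaves the rest to the worker.
Neither path has to drop the lock in the middle of the block.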