-----"Stefan Metzmacher" <metze@xxxxxxxxx> wrote: -----

>To: "Bernard Metzler" <bmt@xxxxxxxxxxxxxx>
>From: "Stefan Metzmacher" <metze@xxxxxxxxx>
>Date: 05/07/2021 01:37AM
>Cc: linux-rdma@xxxxxxxxxxxxxxx, "Stefan Metzmacher" <metze@xxxxxxxxx>
>Subject: [EXTERNAL] [PATCH 00/31] rdma/siw: fix a lot of deadlocks
>and use after free bugs
>
>Hi Bernard,
>
>while testing with my smbdirect driver I hit a lot of
>bugs in the siw.ko driver. They all cause problems where
>the siw driver was not able to unload anymore and I had to
>reboot the machine.
>

Hi Stefan,

Much appreciated! This is quite a number of patches, and I will
need some time to go through them. It would be nice if they were
broken down into smaller bundles (introducing the non-blocking
connect, the __siw_cep_close() subroutine, fixing cep reference
counting, smp_mb() after STag invalidation, ...).
Anyway, many thanks for the effort, it will improve the driver!

First comments:

A non-blocking connect really does make sense, as you point out.
I hope it doesn't complicate the CM code even further.

I think we agreed upon not using BUG() and BUG_ON(), so please
don't introduce them.

'I hit a lot of bugs' is not very helpful on its own; it is just
a statement.

Thanks very much!
Bernard.

>I implemented:
>- a non blocking connect
>- fixed a lot of bugs where siw_cep_put() was called too often.
>- fixed bugs where siw_cm_upcall() confused the core IWCM logic
>
>I have some more changes to follow, but I wanted to send them
>finally out after having them one and a half year sitting in some
>private branch...
>
>Stefan Metzmacher (31):
>  rdma/siw: fix warning in siw_proc_send()
>  rdma/siw: call smp_mb() after mem->stag_valid = 0 in
>    siw_invalidate_stag() too
>  rdma/siw: remove superfluous siw_cep_put() from siw_connect() error
>    path
>  rdma/siw: let siw_accept() deferr RDMA_MODE until EVENT_ESTABLISHED
>  rdma/siw: make use of kernel_{bind,connect,listen}()
>  rdma/siw: make siw_cm_upcall() a noop without valid 'id'
>  rdma/siw: split out a __siw_cep_terminate_upcall() function
>  rdma/siw: use __siw_cep_terminate_upcall() for indirect
>    SIW_CM_WORK_CLOSE_LLP
>  rdma/siw: use __siw_cep_terminate_upcall() for SIW_CM_WORK_PEER_CLOSE
>  rdma/siw: use __siw_cep_terminate_upcall() for SIW_CM_WORK_MPATIMEOUT
>  rdma/siw: introduce SIW_EPSTATE_ACCEPTING/REJECTING for
>    rdma_accept/rdma_reject
>  rdma/siw: add some debugging of state and sk_state to the teardown
>    process
>  rdma/siw: handle SIW_EPSTATE_CONNECTING in
>    __siw_cep_terminate_upcall()
>  rdma/siw: let siw_connect() set AWAIT_MPAREP before
>    siw_send_mpareqrep()
>  rdma/siw: create a temporary copy of private data
>  rdma/siw: use error and out logic at the end of siw_connect()
>  rdma/siw: start mpa timer before calling siw_send_mpareqrep()
>  rdma/siw: call the blocking kernel_bindconnect() just before
>    siw_send_mpareqrep()
>  rdma/siw: split out a __siw_cep_close() function
>  rdma/siw: implement non-blocking connect.
>  rdma/siw: let siw_listen_address() call siw_cep_alloc() first
>  rdma/siw: let siw_listen_address() call siw_cep_set_inuse() early
>  rdma/siw: make use of __siw_cep_close() in siw_accept()
>  rdma/siw: do the full disassociation of cep and qp in
>    siw_qp_llp_close()
>  rdma/siw: fix double siw_cep_put() in siw_cm_work_handler()
>  rdma/siw: make use of __siw_cep_close() in siw_cm_work_handler()
>  rdma/siw: fix the "close" logic in siw_qp_cm_drop()
>  rdma/siw: make use of __siw_cep_close() in siw_qp_cm_drop()
>  rdma/siw: make use of __siw_cep_close() in siw_reject()
>  rdma/siw: make use of __siw_cep_close() in siw_listen_address()
>  rdma/siw: make use of __siw_cep_close() in siw_drop_listeners()
>
> drivers/infiniband/sw/siw/siw_cm.c    | 537 +++++++++++++++-----------
> drivers/infiniband/sw/siw/siw_cm.h    |   3 +
> drivers/infiniband/sw/siw/siw_mem.c   |   2 +
> drivers/infiniband/sw/siw/siw_qp.c    |   3 +
> drivers/infiniband/sw/siw/siw_qp_rx.c |   2 +-
> 5 files changed, 316 insertions(+), 231 deletions(-)
>
>--
>2.25.1
>
>