On 12/10/22 12:48 PM, Mike Christie wrote:
>
> When we do iscsit iscsit_release_commands_from_conn we are:
>
> 1. Waiting on commands in the backend and LIO core.
> 2. Doing the last put on commands that have had queue_status called but
> we haven't freed the cmd because they haven't been ackd.
>
> Are we hitting an issue with #2? We need a proper bug and analysis or we are
> just guessing and am going to mess up.
>
> For example, for isert is the bug you are worried about that we have a missing
> isert_send_done/isert_completion_put call because we disconnected before the
> send callbacks could be done or because the ib layer won't call isert_send_done
> when it detects a failure?

I tested this and it's actually the opposite, and it's broken for a
different reason :)

It looks like we will still call isert_send_done for the cases above, so
we are ok there. The target_wait_for_cmds call also syncs us up with those
calls. So if we move isert's target_wait_for_cmds, we have to flush those
calls as well, or add some more checks/refcounts or something.

It turns out that instead of a hang there is a use after free. We can race
where isert_put_unsol_pending_cmds does an isert_put_cmd while
isert_send_done is running and also does isert_completion_put ->
isert_put_cmd. The second put then touches an se_cmd that the
isert_put_unsol_pending_cmds call has already freed.
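
To make the "checks/refcounts" idea concrete, here is a minimal user-space
sketch of the pattern, not the actual isert code and not a proposed patch.
Everything in it (struct demo_cmd, demo_cmd_alloc, demo_cmd_get,
demo_cmd_put) is a hypothetical name made up for illustration. It assumes
each path that can put the command holds its own reference, so whichever
put runs last does the free. It is single threaded on purpose; real kernel
code would need an atomic count such as a kref.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a command; not the real isert_cmd or se_cmd. */
struct demo_cmd {
	int refcount;     /* paths that may still touch the cmd (not atomic here) */
	int *free_count;  /* bumped on release so a caller can observe the free */
};

/* Allocate a command holding one reference for the allocating path. */
static struct demo_cmd *demo_cmd_alloc(int *free_count)
{
	struct demo_cmd *cmd = malloc(sizeof(*cmd));

	if (!cmd)
		return NULL;
	cmd->refcount = 1;
	cmd->free_count = free_count;
	return cmd;
}

/* Take an extra reference for a second path, e.g. a pending completion. */
static void demo_cmd_get(struct demo_cmd *cmd)
{
	cmd->refcount++;
}

/* Drop one reference. Returns true if this call freed the command. */
static bool demo_cmd_put(struct demo_cmd *cmd)
{
	if (--cmd->refcount > 0)
		return false;

	(*cmd->free_count)++;
	free(cmd);
	return true;
}
```

With one reference per path, the cleanup path and the completion path can
each call demo_cmd_put once, in either order, and only the last one frees.
The use after free above is the case where both paths put a command that
only has a single reference left.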