On Fri, 2015-05-29 at 09:03 +0200, Bart Van Assche wrote: > On 05/28/15 20:29, Vasu Dev wrote: > > We would have needed schedule() in a HBA design if the current LUN RESET > > call flow had some thing to finish during schedule() preemption time but > > then that would have required wait on any such thing with a condition > > after schedule() is over but currently schedule is just a pause as I > > described below since all it does is releasing the CPU and then just > > resumes back without waiting for anything to finish after schedule(). > > The above is not correct. The schedule() call in fc_seq_set_resp() waits > until fc_invoke_resp() invokes wake_up(). > Does it do for all in-flight pending frames of fsp/exch and not yet handled by fc_invoke_resp() after LUN RESET ? It won't since wake_up() will be issued on first frame after LUN RESET and subsequent frames yet to hit fc_invoke_resp() will be curbed by NULL ep->resp handler as I explained in my last response. So I'd rather have all frames dropped by ep->resp == NULL after LUN RESET as it was before scheduled() & ep->resp_wq/ep->resp_active new logic for this purpose, since new logic around schedule() in itself not serving this purpose completely beside causing scheduling while atomic bug. > Regarding LUN RESET behavior, I think the following quotes from SAM 5 > are interesting. From paragraph "5.6 Aborting commands": "After a > command is aborted and TASK ABORTED status, if any, is returned, the > SCSI target device shall send no further requests or responses for that > command." From the paragraph "6.3.3 Logical unit reset": "When > responding to a logical unit reset condition, the logical unit shall > abort all commands as described in 5.6." > > The only way to guarantee that no further responses are sent after a LUN > RESET has completed is to wait until all ongoing ep->resp() calls have > finished. Correct and this could be either by at least these two options discussed so far:- 1. Using schedule() along with waiting for *all* in flight pending frames instead getting out of wait on first pending frame after LUN RESET. 2. Just use ep->resp == NULL which is set to NULL with LUN RESET to discard all frames. I'd go with #2 w/o having risk of atomic sleep bug like this patch is fixing with #1 due to use fo schedule() beside currently #1 is not really implemented to curb all frames as I explained above. Looking at ep->resp_active++ may give impression as if #1 is fully implemented but fact is that LUN RESET's pending fsp wait will be over on very first frame. > I think that is what patch 4/4 posted on May 26 realizes. Or > did I perhaps overlook something ? > We are discussion these above two options to fix scheduling while atomic bug here, your original 4/4 fixes the issue by dropping fsp->scsi_pkt_lock around schedule(), which is ok for now so may be just documents notes on need to drop here and possibly document #2 fix w/o schedule() as TODO. It would be helpful to me if you could point to any other existing HBA user of schedule() for such case due to another reason you mentioned before as:- "The layers above libfc need a way to wait until ongoing invocations of their own response handler have finished. This is why the schedule() call has been added in fc_seq_set_resp(). ", whether that is for SCSI/TCM/SCST. Thanks Bart for quite useful discussion and this patch series. This discussion should be helpful to fix any more issues due to unlocking/locking considerations around added schedule() for ep->resp_wq/ep->resp_active. Thanks Vasu -- To unsubscribe from this list: send the line "unsubscribe target-devel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html