Hi Alex, Apologies for the slightly delayed response. Comments are below. On Fri, 2014-03-21 at 22:20 -0700, Alex Leung wrote: > On 3/19/14, 9:25 PM, Alex Leung wrote: > > Hi Nicholas, > > > > On 3/18/14, 10:04 PM, Nicholas A. Bellinger wrote: > >> On Mon, 2014-03-17 at 19:57 -0700, Alex Leung wrote: > >>> On 3/17/14, 11:56 AM, Nicholas A. Bellinger wrote: > >>> > >>> <SNIP> > >>> > >>>> Here's an updated patch to drop the transport_send_task_abort() in > >>>> core_tmr_abort_task(), and add the transport_cmd_finish_abort() to make > >>>> the final target_put_sess_cmd(), invoking TFO->release_cmd() to release > >>>> the associated descriptor. > >>>> > >>>> Care to test..? ;) > >>> > >>> I'd be happy to -- just give me a day or two. I assume it's okay that > >>> I'll be testing with a non-upstream fabric driver (Emulex OCS SDK)? > >>> > >> > >> For this particular patch that's fine, and I'll also confirm the change > >> with a couple of other fabric drivers before pushing. > > > > I did some testing with your most recent patch applied to the "for-next" > > branch and the following now results in the tfo->release_cmd() path > > instead of the tfo->queue_status() with TASK_ABORTED status path. > > > > 1. Abort Task > > 2. LUN Reset with TAS=0 > > 3. LUN Reset with TAS=1 but IT Nexus(cmd) matches IT Nexus(tmr) > > > > To test TAS=1 where the IT Nexuses do not match, I did not have another > > initiator handy, so I took a shortcut and changed the > > core_tmr_handle_tas_abort() to just be "if (tas)" (removing the IT Nexus > > not matching qualifier) and made sure that tfo->queue_status > > (TASK_ABORTED) was called -- it was. > > > > So far, so good. > > > > However, it looks transport_check_aborted_status() is also not quite > > right. Currently it calls tfo->queue_status() with the TASK_ABORTED > > status regardless of TAS and Abort Task v. LUN Reset. I can fabricate > > a race condition where, just before tfo->release_cmd() is called > > for the abort, the write data phase completes and calls > > target_execute_cmd() (which calls transport_check_aborted_status(), > > which calls tfo->queue_status()). > > > > I believe transport_check_aborted_status() is for the case where > > transport_send_task_abort() is called but the fabric driver is not > > ready to send the response, so it returns non-zero for > > tfo->write_pending_status(). When the dataphase is complete, > > target_execute_cmd() --> transport_check_aborted_status() is called. > > If my understanding correct, it seems like CMD_T_ABORTED has > > two meanings: 1. the command has been aborted and 2. we wish to send > > a task aborted status at a later time (when the data phase has > > completed). If we use an se_cmd_flag to distinguish the aborted w/ > > TASK_ABORTED case from the aborted w/o TASK_ABORTED case, we would not > > send an unexpected tfo->queue_status (TASK_ABORTED) in the above case. > > The below patch (that includes your changes) works in this case (and > > still sends the delayed TASK_ABORTED when appropriate). Thoughts? > > > > I noticed something else during my testing. For Abort Task, there's a > extra target_put_sess_cmd() that I believe now needs to be removed as a > result of adding the call to transport_cmd_finish_abort() in > core_tmr_abort_task(). I'm also not sure the kref_get, > target_put_sess_cmd pair that used to surround the call to > the now removed transport_send_task_abort() is necessary either, but > I left that. See updated patch below. > > The extra kref_put results tfo->release_cmd() being called during > "check_stop_to_fabric" in transport_cmd_finish_abort() for Abort Task > (unexpected), but during "transport_put_cmd()" (expected) for LUN Reset: > > void transport_cmd_finish_abort(struct se_cmd *cmd, int remove) > { > if (cmd->se_cmd_flags & SCF_SE_LUN_CMD) > transport_lun_remove_cmd(cmd); > > // tfo->release_cmd() called here for Task Abort > if (transport_cmd_check_stop_to_fabric(cmd)) > return; > if (remove) > // tfo->release_cmd() called here for LUN Reset > transport_put_cmd(cmd); > } > This is absolutely correct. Note there are a few fabric driver cases (loopback + tcm_fc) where the TFO->check_stop_free() calls transport_generic_free_cmd(), but for this case transport_cmd_check_stop_to_fabric() will return a non-zero value. > If we're in agreement so far, I have a question about the > tfo->release_cmd() path in general. By the time the fabric driver gets > the release_cmd(), the SGL (se_cmd->t_data_sg) has been freed and it is > too late for the fabric driver to unmap it (if it were previously > mapped by the fabric). check_stop_free() seems to be my last chance > to do the unmap while the SGL still exists. Is this the right place? > Mmmm, I think the right thing to do here is add a TFO->aborted_task() caller that allows the fabric driver to perform any SGL unmapping, in place of the original TFO->queue_status() call. I'll be posting for review soon that includes TFO->aborted_task(), and the associated fabric driver changes. Thanks! --nab -- To unsubscribe from this list: send the line "unsubscribe target-devel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html