On 01/04/2017 02:53 PM, Andy Grover wrote: > On 12/30/2016 12:18 AM, lixiubo@xxxxxxxxxxxxxxxxxxxx wrote: >> From: Xiubo Li <lixiubo@xxxxxxxxxxxxxxxxxxxx> >> >> This patch series add TCMU_OP_EXP opcode support for skipping the >> expired cmd in kernel space and user space. >> >> * The 1/2 patch is for target_core_user in kernel space. >> * The 2/2 patch is for tcmu-runner in user space. >> >> Here add one 'void *data' parameter just piont to the entry hdr in >> ring buffer: >> struct tcmu_cmd { >> ... >> + void *data; ===> pionting to entry hdr >> ... >> } >> >> When some cmds were expired, target_core_user will set the opcode in >> the entry hdr to TCMU_OP_EXP(or reuse TCMU_OP_PAD here?), and then >> the tcmu-runner will just skip all the TCMU_OP_EXP cmds, so is the >> target_core_user. > > I think this may cause race conditions. Once we "give" part of the cmd > ring to userspace, it's racy to change it because we don't know if > userspace has already accessed it, or not. > > We could put the new OP_EXP opcode on the cmd ring as a new entry. This > would ensure the old cmd's processing was terminated and not finish > out-of-order with a new cmd the initiator sent to retry the aborted one, > avoiding corruption. Implementing this could get a little complicated When tcmu-runner sees the new OP_EXP opcode, what does it do with it? I was not sure how it prevents retries and new commands from executing before the expired command. Will tcmu-runner not start new commands before the OP_EXP is handled? If you do a HA target, then the initiator could retry IO on other paths, so I think we would want to abort the command in tcmu-runner and the handler, before having LIO return status for the command or a TMF for it. So if OP_EXP means for tcmu-runner and the handlers to kill commands then it would work for me. What is the point of the target_core_user timer though? The initiators already do the same thing, so it seems like one more thing to configure. For example, for tape commands we will want to increase it, so why configure it twice. Is it just a hack around target_core_tmr.c not being able to call into the backend modules to perform TMFs? If there is a good use for the timer, should it be in userspace? It seems like it would be easier for tcmu-runner to drive it. We could have a tcmu-runner timer in userspace, call into the handlers and when they kill a command we can fail it with a task aborted status or something like that. Is it just supposed to detect when tcmu-runner has died? > because we'd need to handle if the cmd ring was full, and prioritize > queued OP_EXP opcodes over OP_CMD when ring space became available. > > Mike made me aware of this as an issue recently -- do you have any > further thoughts for how best to solve this? > > Regards -- Andy > > -- > To unsubscribe from this list: send the line "unsubscribe target-devel" in > the body of a message to majordomo@xxxxxxxxxxxxxxx > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe target-devel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html