From: Roland Dreier <roland@xxxxxxxxxxxxxxx>

 > Oops, looks like it actually gets used.  So my vote for adding the
 > splice fix ASAP.
 >
 > Looking at the surrounding code I'd rather fix the reentrance problem
 > by using workqueues properly, that is:
 >
 >  - remove dev->qf_cmd_list and dev->dev_qf_count entirely
 >  - instead add a work_struct to struct se_cmd
 >  - use that to queue the cmd directly onto a workqueue

Fair enough, here's the splice-only fix.  As I said, I think this is
definitely for 3.1, since it fixes crashes in practice for me.

I agree that the queue-full stuff needs further fixing, e.g. I noticed
the code:

	if (atomic_read(&cmd->se_dev->dev_qf_count) != 0)
		schedule_work(&cmd->se_dev->qf_work_queue);

which even with cargo-cult sprinkling of smp_mb__after_atomic_inc()
obviously can't avoid the race where dev_qf_count gets incremented
from 0 to 1 just after the if statement tests it.

So I'll work on further cleanup for 3.2.

 - R.

---- 8< ----

Subject: target: Fix race between multiple invocations of target_qf_do_work()

When work is scheduled with schedule_work(), the work can end up
running on multiple CPUs at the same time -- this happens if the work
is already running on one CPU and schedule_work() is called on another
CPU.  This leads to list corruption with target_qf_do_work(), which is
roughly doing:

	spin_lock(...);
	list_for_each_entry_safe(...) {
		list_del(...);
		spin_unlock(...);
		// do stuff
		spin_lock(...);
	}

With multiple CPUs running this code, one CPU can end up deleting the
list entry that the other CPU is about to work on.

Fix this by splicing the list entries onto a local list and then
operating on that in the work function.  This way, each invocation of
target_qf_do_work() operates on its own local list and so multiple
invocations don't corrupt each other's list.  This also avoids dropping
and reacquiring the lock for each list entry.

Signed-off-by: Roland Dreier <roland@xxxxxxxxxxxxxxx>
---
 drivers/target/target_core_transport.c |   11 +++++------
 1 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c
index a2f4713..2cd415c 100644
--- a/drivers/target/target_core_transport.c
+++ b/drivers/target/target_core_transport.c
@@ -978,15 +978,17 @@ static void target_qf_do_work(struct work_struct *work)
 {
 	struct se_device *dev = container_of(work, struct se_device,
 					qf_work_queue);
+	LIST_HEAD(qf_cmd_list);
 	struct se_cmd *cmd, *cmd_tmp;
 
 	spin_lock_irq(&dev->qf_cmd_lock);
-	list_for_each_entry_safe(cmd, cmd_tmp, &dev->qf_cmd_list, se_qf_node) {
-
+	list_splice_init(&dev->qf_cmd_list, &qf_cmd_list);
+	spin_unlock_irq(&dev->qf_cmd_lock);
+
+	list_for_each_entry_safe(cmd, cmd_tmp, &qf_cmd_list, se_qf_node) {
 		list_del(&cmd->se_qf_node);
 		atomic_dec(&dev->dev_qf_count);
 		smp_mb__after_atomic_dec();
-		spin_unlock_irq(&dev->qf_cmd_lock);
 
 		pr_debug("Processing %s cmd: %p QUEUE_FULL in work queue"
 			" context: %s\n", cmd->se_tfo->get_fabric_name(), cmd,
@@ -998,10 +1000,7 @@ static void target_qf_do_work(struct work_struct *work)
 		 * has been added to head of queue
 		 */
 		transport_add_cmd_to_queue(cmd, cmd->t_state);
-
-		spin_lock_irq(&dev->qf_cmd_lock);
 	}
-	spin_unlock_irq(&dev->qf_cmd_lock);
 }
 
 unsigned char *transport_dump_cmd_direction(struct se_cmd *cmd)
-- 
1.7.5.4
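
For illustration, here is a minimal, self-contained sketch of the
splice-onto-a-local-list pattern the patch uses.  The my_dev/my_cmd
types and field names are hypothetical stand-ins for this sketch, not
the real target-core structures:

#include <linux/kernel.h>
#include <linux/list.h>
#include <linux/spinlock.h>
#include <linux/workqueue.h>

/* Hypothetical stand-ins for the real se_device / se_cmd structures. */
struct my_dev {
	spinlock_t		lock;		/* protects @pending */
	struct list_head	pending;	/* commands queued for retry */
	struct work_struct	work;
};

struct my_cmd {
	struct list_head	node;		/* entry on my_dev.pending */
	/* ... */
};

static void my_do_work(struct work_struct *work)
{
	struct my_dev *dev = container_of(work, struct my_dev, work);
	struct my_cmd *cmd, *tmp;
	LIST_HEAD(local);			/* private to this invocation */

	/*
	 * Grab everything queued so far in one shot.  Once the entries
	 * are on the local list, no other invocation of this work
	 * function can see them, so the walk below needs no locking.
	 */
	spin_lock_irq(&dev->lock);
	list_splice_init(&dev->pending, &local);
	spin_unlock_irq(&dev->lock);

	list_for_each_entry_safe(cmd, tmp, &local, node) {
		list_del(&cmd->node);
		/* ... process cmd; may sleep or take other locks ... */
	}
}

Because each invocation walks a disjoint set of entries, concurrent
invocations can no longer delete each other's entries, and the lock is
held only for the single splice rather than dropped and retaken per
entry.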
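
And a rough sketch of the direction suggested in the quoted mail for
the later cleanup -- a work_struct embedded in each command, with no
shared per-device list or counter.  Again, the names here are made up
for illustration and this is not an actual implementation:

#include <linux/kernel.h>
#include <linux/workqueue.h>

/* Hypothetical: embed the retry work in the command itself. */
struct my_cmd {
	struct work_struct	qf_work;
	/* ... */
};

static void my_cmd_qf_work(struct work_struct *work)
{
	struct my_cmd *cmd = container_of(work, struct my_cmd, qf_work);

	/* Retry just this one command; nothing shared left to race on. */
	pr_debug("retrying cmd %p after QUEUE_FULL\n", cmd);
}

/* Called on QUEUE_FULL instead of adding the cmd to dev->qf_cmd_list
 * and bumping dev->dev_qf_count.  In real code the work item would
 * probably be initialized once when the command is set up. */
static void my_handle_queue_full(struct my_cmd *cmd)
{
	INIT_WORK(&cmd->qf_work, my_cmd_qf_work);
	schedule_work(&cmd->qf_work);
}

With the work item owned by the command, there is no dev-wide list or
dev_qf_count left to keep consistent, which is the point of the
suggested cleanup.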