On Thu, 22 Sep 2011 16:33:50 -0700
Andy Grover <agrover@xxxxxxxxxx> wrote:

> On 09/21/2011 11:53 PM, FUJITA Tomonori wrote:
> > On Wed, 21 Sep 2011 15:38:45 -0700
> > Andy Grover <agrover@xxxxxxxxxx> wrote:
> >
> >> Both the receiving thread and the bs worker threads access the hash of
> >> lists at it_nexus->cmd_hash_list. It may be the case that this is causing
> >
> > Really? Can you explain how worker threads touch
> > it_nexus->cmd_hash_list?
>
> Sorry, I was wrong, they don't.
>
> But, the backtraces point strongly at list corruption when traversing a
> list off of the cmd_hash_list in abort_task_set(). Kiefer also mentioned
> reducing the number of worker threads seemed to help. But maybe that
> just lowered performance so the issue was less likely? I thought it
> might be interesting to see if a possible race was fixed by locking
> around that structure. Now that I have a better understanding of the tgt
> event loop mechanism, there don't appear to be any accesses to
> cmd_hash_list except by the main thread.
>
> How else could this be happening? Use after free?

We might handle the same command twice. I don't have a clear idea yet.
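
To make the "same command handled twice" hypothesis concrete, here is a minimal sketch in C. It is not tgt's code: struct list_node, struct demo_cmd, list_del_checked() and demo_double_handling() are hypothetical names invented for illustration, and nothing here confirms that this is the actual bug. It only shows how a second unlink of the same entry could be caught by clearing the links on removal, rather than silently writing through stale pointers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT tgt's actual definitions. */
struct list_node {
    struct list_node *prev, *next;
};

struct demo_cmd {
    unsigned long tag;
    struct list_node hash_entry;   /* link in one hash bucket */
};

static void list_init(struct list_node *head)
{
    head->prev = head->next = head;
}

static void list_add_tail(struct list_node *node, struct list_node *head)
{
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
}

/*
 * Unlink a node and clear its links. Without the clearing step, a second
 * unlink of the same node would write through stale prev/next pointers,
 * which corrupts the list if those neighbours have since been removed
 * or freed. Returns 0 on success, -1 if the node is not linked.
 */
static int list_del_checked(struct list_node *node)
{
    if (node->prev == NULL || node->next == NULL)
        return -1;                 /* already unlinked: handled twice? */
    node->prev->next = node->next;
    node->next->prev = node->prev;
    node->prev = node->next = NULL;
    return 0;
}

static int list_length(const struct list_node *head)
{
    int n = 0;
    for (const struct list_node *p = head->next; p != head; p = p->next)
        n++;
    return n;
}

/* Simulate completing the same command twice; returns the second result. */
int demo_double_handling(void)
{
    struct list_node bucket;
    struct demo_cmd a = { .tag = 1 }, b = { .tag = 2 };

    list_init(&bucket);
    list_add_tail(&a.hash_entry, &bucket);
    list_add_tail(&b.hash_entry, &bucket);

    int first = list_del_checked(&a.hash_entry);
    assert(first == 0);

    int second = list_del_checked(&a.hash_entry);  /* duplicate handling */
    assert(list_length(&bucket) == 1);             /* list still intact */
    return second;
}
```

A check of this kind, placed temporarily at the points where commands are removed from the hash list, might help distinguish double handling from a use after free: the former would trip the check, the latter generally would not.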