> On Sep 18, 2015, at 6:59 AM, Tomas Henzl <thenzl@xxxxxxxxxx> wrote: > On 17.9.2015 19:16, Matthew R. Ochs wrote: >>> On Sep 17, 2015, at 7:38 AM, Tomas Henzl <thenzl@xxxxxxxxxx> wrote: >>> >>> On 16.9.2015 18:53, Matthew R. Ochs wrote: >>>> Interrupt processing can run in parallel to a remove operation. This >>>> can lead to a condition where the interrupt handler is processing with >>>> memory that has been freed. >>>> >>>> To avoid processing an interrupt while memory may be yanked, check for >>>> removal while in the interrupt handler. Bail when removal is imminent. >>>> >>>> Signed-off-by: Matthew R. Ochs <mrochs@xxxxxxxxxxxxxxxxxx> >>>> Signed-off-by: Manoj N. Kumar <manoj@xxxxxxxxxxxxxxxxxx> >>>> --- >>>> drivers/scsi/cxlflash/common.h | 2 ++ >>>> drivers/scsi/cxlflash/main.c | 21 +++++++++++++++------ >>>> 2 files changed, 17 insertions(+), 6 deletions(-) >>>> >>>> diff --git a/drivers/scsi/cxlflash/common.h b/drivers/scsi/cxlflash/common.h >>>> index 1abe4e0..03d2cc6 100644 >>>> --- a/drivers/scsi/cxlflash/common.h >>>> +++ b/drivers/scsi/cxlflash/common.h >>>> @@ -103,6 +103,8 @@ struct cxlflash_cfg { >>>> enum cxlflash_lr_state lr_state; >>>> int lr_port; >>>> >>>> + atomic_t remove_active; >>>> + >>>> struct cxl_afu *cxl_afu; >>>> >>>> struct pci_pool *cxlflash_cmd_pool; >>>> diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c >>>> index 6e85c77..89ee648 100644 >>>> --- a/drivers/scsi/cxlflash/main.c >>>> +++ b/drivers/scsi/cxlflash/main.c >>>> @@ -892,6 +892,7 @@ static void cxlflash_remove(struct pci_dev *pdev) >>>> spin_unlock_irqrestore(&cfg->tmf_waitq.lock, lock_flags); >>>> >>>> cfg->state = STATE_FAILTERM; >>>> + atomic_inc(&cfg->remove_active); >>> Hi Matthew, >>> you could just call term_afu at this point, this way you don't >>> need an additional check in all irq functions. >>> Cheers, >>> Tomas >> Hi Tomas, >> >> We actually do call term_afu() a few lines down from here. I don't follow >> how moving it here would help things. > > When you disable ints sooner (that is what term_afu does ?) you'll get no > more ints later isn't this what you want? Correct, that's what we want. >> The reason for the atomic was to provide something lightweight that we >> could check _inside_ the processing loop for the read-response queue >> handler. A check outside that loop doesn't really provide much in terms >> of closing or narrowing down the window of when freed memory can be >> accessed. >> >> As David Laight correctly pointed out, this approach does not completely >> close the window. We'd need something heavier to fully protect (e.g. a lock >> to wrap around the entire loop). I will look into adding this in a future cycle >> when I can adequately test. > > term_afu calls free_irq and this function > does not return until any executing interrupts for have completed. > This is the sync mechanism you need, it's lightweight > (does not add an additional check to your irq functions) > and closes the race window completely. Thanks for clarifying! I looked at this closer and you are correct, free_irq() guarantees not to return until the interrupt handler has completed. The current location of term_afu() is appropriate as the memory that the handler touches is not freed until the very end [by free_mem() and scsi_host_put()] of the remove. Thus we can simply ignore this patch (I'll remove it in a v3). -matt -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html