On Thursday 19 February 2009, Stanislaw Gruszka wrote: > Wednesday 18 February 2009 22:25:19 Bartlomiej Zolnierkiewicz napisał(a): > > > I entered a problem with double decreasing module reference counter > > > where it become "negative", here is the usage scenario: > > > > > > # modprobe at91_ide > > > # modprobe ide_gd_mod > > > # lsmod > > > Module Size Used by Not tainted > > > ide_gd_mod 22948 0 > > > at91_ide 4672 0 > > > ide_core 77020 2 ide_gd_mod,at91_ide > > > # rmmod ide_gd_mod > > > # lsmod > > > Module Size Used by Not tainted > > > at91_ide 4672 4294967295 > > > ide_core 77020 1 at91_ide > > > > > > Note when I first remove at91_ide module and then ide_gd_mod > > > everyting is ok. > > > > > > I tired to debug issue and I did not found any suspicious in at91_ide. > > > I think probable reason is double free in ide-gd.c . Here is patch with > > > workaround (or maybe it is a real fix, but I'm not sure): > > > > > > diff --git a/drivers/ide/ide-gd.c b/drivers/ide/ide-gd.c > > > index 7857b20..31ae04e 100644 > > > --- a/drivers/ide/ide-gd.c > > > +++ b/drivers/ide/ide-gd.c > > > @@ -70,8 +70,6 @@ static void ide_gd_remove(ide_drive_t *drive) > > > del_gendisk(g); > > > > > > drive->disk_ops->flush(drive); > > > - > > > - ide_disk_put(idkp); > > > } > > > > > > static void ide_disk_release(struct kref *kref) > > > > > > If this patch is ok, maybe similar things need to be done also in ide-cd and > > > perhaps other device type modules. > > > > Seems like ide_device_put() needs the same module_refcount() check that > > is present in scsi_device_put() so removal of device driver won't trigger > > a spurious module_put() on a host driver? > > I little surprise about scsi code (linux-scsi ML CC). Is comment inside > scsi_device_put() function correct? Why scsi_device_get() not check > try_module_get() return value? And most importand: there is reference > counter check before put, so it can be 0, but data does it protect is in > use ? > > Adding module_refcount() != 0 to ide_device_put() helps only partially, below > commands sequence give oops [1]. > > # modprobe at91_ide > # modprobe ide_gd_mod > # rmmod ide_gd_mod > # modprobe ide_gd_mod > # rmmod at91_ide > > Oops happens because previous "rmmod ide_gd_mod" decrease some reference > counter in ide_device_put() and in "rmmod at91_ide" function del_gendisk() > cause call to drive_release_dev(), which free drive->id before ide_disk_flush() . > This function oops with NULL driver->id. Uh... we will need some more intrusive changes to the reference counting to fix it -- like to replace idkp->kref by idkp->dev and make drive->gendev a parent of it (so only after the final put on ->dev ->gendev can go away). [ IOW we need to have some changes similar to those done in sd.c by: commit 6bdaa1f17dd32ec62345c7b57842f53e6278a2fa and later by: commit ee959b00c335d7780136c5abda37809191fe52c3 ] > There is no oops with my workaround, when I just remove ide_disk_put() from I suppose that after ide_disk_put() removal ide_disk_release() is simply never called... ;) > ide_gd_remove(). It's strange why there is lack of symmetrical _put/_get calls, > ide_gd_probe() has no call to ide_disk_get(). We have kref_init() in ide_disk_probe(), so there is no need for it and we also don't want to hold an extra reference on host driver... Thanks, Bart -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html