On Fri, Dec 22, 2023 at 04:27:16PM +0800, Yu Kuai wrote:
> Hi,
>
> On 2023/12/22 14:49, Luis Chamberlain wrote:
> > On Fri, Dec 08, 2023 at 04:23:35PM +0800, linan666@xxxxxxxxxxxxxxx wrote:
> > > From: Li Nan <linan122@xxxxxxxxxx>
> > >
> > > "if device_add() succeeds, you should call device_del() when you want to
> > > get rid of it."
> > >
> > > In sd_probe(), device_add_disk() fails when device_add() has already
> > > succeeded, so change put_device() to device_unregister() to ensure device
> > > resources are released.
> > >
> > > Fixes: 2a7a891f4c40 ("scsi: sd: Add error handling support for add_disk()")
> > > Signed-off-by: Li Nan <linan122@xxxxxxxxxx>
> >
> > Nacked-by: Luis Chamberlain <mcgrof@xxxxxxxxxx>
> >
> > > ---
> > >  drivers/scsi/sd.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> > > index 542a4bbb21bc..d81cbeee06eb 100644
> > > --- a/drivers/scsi/sd.c
> > > +++ b/drivers/scsi/sd.c
> > > @@ -3736,7 +3736,7 @@ static int sd_probe(struct device *dev)
> > >  	error = device_add_disk(dev, gd, NULL);
> > >  	if (error) {
> > > -		put_device(&sdkp->disk_dev);
> > > +		device_unregister(&sdkp->disk_dev);
> > >  		put_disk(gd);
> > >  		goto out;
> > >  	}
> >
> > This is incorrect, device_unregister() calls:
> >
> > void device_unregister(struct device *dev)
> > {
> > 	pr_debug("device: '%s': %s\n", dev_name(dev), __func__);
> > 	device_del(dev);
> > 	put_device(dev);
> > }
> >
> > So you're adding what you believe to be a correct missing device_del().
> > But what you missed is that if device_add_disk() fails then device_add()
> > did not succeed because the new code we have in the kernel *today* unwinds
> > this for us now.
>
> I'm confused here, there are two devices here, one is 'sdkp->disk_dev',
> one is gendisk->part0->bd_device, and the order in which they are
> initialized:
>
> sd_probe
>  device_add(&sdkp->disk_dev) -> succeed
>  device_add_disk -> failed, and device_add(bd_device) did not succeed
>  put_device(&sdkp->disk_dev) -> device_del is missed
>
> I don't see that if device_add_disk() fails, device_del() for
> 'sdkp->disk_dev' is called from anywhere. Am I missing anything?

Ah, then the fix is still incorrect and the commit log should describe
that this is for another device. How about this instead?

From c3f6e03f4a82aa253b6c487a293dcd576393b606 Mon Sep 17 00:00:00 2001
From: Luis Chamberlain <mcgrof@xxxxxxxxxx>
Date: Mon, 29 Jan 2024 09:25:18 -0800
Subject: [PATCH] sd: remove extra put_device() for extra scsi device

The sd driver first calls device_add() on its own device, and later uses
device_add_disk() with another device. When we added error handling
for device_add_disk() we started calling put_disk(), which triggers
disk_release() when the refcount drops to 0. That ends up calling the
block driver's disk->fops->free_disk() if one is defined. The sd driver
has scsi_disk_free_disk() as its free_disk() and that does the proper
put_device(&sdkp->disk_dev) for us, so we should not need to call it;
however, we are still missing the device_del() for it.

While at it, unwind with scsi_autopm_put_device(sdp) *prior* to putting
the device as we do in sd_remove().

Reported-by: Li Nan <linan122@xxxxxxxxxx>
Reported-by: Yu Kuai <yukuai1@xxxxxxxxxxxxxxx>
Fixes: 2a7a891f4c40 ("scsi: sd: Add error handling support for add_disk()")
Signed-off-by: Luis Chamberlain <mcgrof@xxxxxxxxxx>
---
 drivers/scsi/sd.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 7f949adbadfd..6475a3c947f8 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -3693,8 +3693,9 @@ static int sd_probe(struct device *dev)
 
 	error = device_add(&sdkp->disk_dev);
 	if (error) {
+		scsi_autopm_put_device(sdp);
 		put_device(&sdkp->disk_dev);
-		goto out;
+		return error;
 	}
 
 	dev_set_drvdata(dev, sdkp);
@@ -3734,9 +3735,10 @@ static int sd_probe(struct device *dev)
 
 	error = device_add_disk(dev, gd, NULL);
 	if (error) {
-		put_device(&sdkp->disk_dev);
+		scsi_autopm_put_device(sdp);
+		device_del(&sdkp->disk_dev);
 		put_disk(gd);
-		goto out;
+		return error;
 	}
 
 	if (sdkp->security) {
-- 
2.42.0
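
For anyone following the refcount argument, here is a minimal userspace
sketch of the unwind order discussed in this thread. It is not kernel
code: every toy_* name and run_error_path() are hypothetical, and the
model assumes (as a simplification) that a successful add pins one extra
reference which only the matching del drops. The final toy_put_device()
stands in for the put that scsi_disk_free_disk() performs via put_disk().

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct device; NOT the kernel's definition. */
struct toy_device {
	int refcount;     /* models the device reference count */
	bool registered;  /* true between toy_device_add() and toy_device_del() */
	bool released;    /* set once the last reference is dropped */
};

/* Models device_initialize(): the caller holds the initial reference. */
static void toy_device_initialize(struct toy_device *d)
{
	d->refcount = 1;
	d->registered = false;
	d->released = false;
}

/* Models a successful device_add(); assumption: registration pins a reference. */
static void toy_device_add(struct toy_device *d)
{
	d->refcount++;
	d->registered = true;
}

/* Models device_del(): unregister and drop the reference taken by add. */
static void toy_device_del(struct toy_device *d)
{
	assert(d->registered);
	d->registered = false;
	d->refcount--;
}

/* Models put_device(): the release step runs when the count hits zero. */
static void toy_put_device(struct toy_device *d)
{
	assert(d->refcount > 0);
	if (--d->refcount == 0)
		d->released = true;
}

/*
 * Simulate the error path taken after the second registration step fails.
 * call_del == false models "put only"; call_del == true models "del, then put".
 */
struct toy_device run_error_path(bool call_del)
{
	struct toy_device d;

	toy_device_initialize(&d);
	toy_device_add(&d);        /* first add succeeded */
	if (call_del)
		toy_device_del(&d);    /* the step the thread says was missing */
	toy_put_device(&d);        /* drop the caller's reference */
	return d;
}
```

In this model, run_error_path(false) leaves the device registered with
one reference outstanding and never released, while run_error_path(true)
ends with a zero count and the release step run. The real driver core
does considerably more in both device_add() and device_del(), so treat
this only as an illustration of why the del has to be paired with the add.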