On 2022/12/1 7:06, Alex Williamson wrote: > [Cc +vfio-ap, vfio-ccw] > > On Fri, 18 Nov 2022 11:28:27 +0800 > ruanjinjie <ruanjinjie@xxxxxxxxxx> wrote: > >> Inject fault while probing module, if device_register() fails, >> but the refcount of kobject is not decreased to 0, the name >> allocated in dev_set_name() is leaked. Fix this by calling >> put_device(), so that name can be freed in callback function >> kobject_cleanup(). >> >> unreferenced object 0xffff88807d687008 (size 8): >> comm "modprobe", pid 8280, jiffies 4294807686 (age 12.378s) >> hex dump (first 8 bytes): >> 6d 64 70 79 00 6b 6b a5 mdpy.kk. >> backtrace: >> [<ffffffff8174f19e>] __kmalloc_node_track_caller+0x4e/0x150 >> [<ffffffff81731d53>] kstrdup+0x33/0x60 >> [<ffffffff83aa1421>] kobject_set_name_vargs+0x41/0x110 >> [<ffffffff82d91abb>] dev_set_name+0xab/0xe0 >> [<ffffffffa0260105>] 0xffffffffa0260105 >> [<ffffffff81001c27>] do_one_initcall+0x87/0x2e0 >> [<ffffffff813739cb>] do_init_module+0x1ab/0x640 >> [<ffffffff81379d20>] load_module+0x5d00/0x77f0 >> [<ffffffff8137bc40>] __do_sys_finit_module+0x110/0x1b0 >> [<ffffffff83c944a5>] do_syscall_64+0x35/0x80 >> [<ffffffff83e0006a>] entry_SYSCALL_64_after_hwframe+0x46/0xb0 >> >> unreferenced object 0xffff888101ccbcf8 (size 8): >> comm "modprobe", pid 15662, jiffies 4295164481 (age 13.282s) >> hex dump (first 8 bytes): >> 6d 74 74 79 00 6b 6b a5 mtty.kk. >> backtrace: >> [<ffffffff8174f19e>] __kmalloc_node_track_caller+0x4e/0x150 >> [<ffffffff81731d53>] kstrdup+0x33/0x60 >> [<ffffffff83aa1421>] kobject_set_name_vargs+0x41/0x110 >> [<ffffffff82d91abb>] dev_set_name+0xab/0xe0 >> [<ffffffffa0248134>] 0xffffffffa0248134 >> [<ffffffff81001c27>] do_one_initcall+0x87/0x2e0 >> [<ffffffff813739cb>] do_init_module+0x1ab/0x640 >> [<ffffffff81379d20>] load_module+0x5d00/0x77f0 >> [<ffffffff8137bc40>] __do_sys_finit_module+0x110/0x1b0 >> [<ffffffff83c944a5>] do_syscall_64+0x35/0x80 >> [<ffffffff83e0006a>] entry_SYSCALL_64_after_hwframe+0x46/0xb0 >> >> unreferenced object 0xffff88810177c6c8 (size 8): >> comm "modprobe", pid 23657, jiffies 4295314656 (age 13.227s) >> hex dump (first 8 bytes): >> 6d 62 6f 63 68 73 00 a5 mbochs.. >> backtrace: >> [<ffffffff8174f19e>] __kmalloc_node_track_caller+0x4e/0x150 >> [<ffffffff81731d53>] kstrdup+0x33/0x60 >> [<ffffffff83aa1421>] kobject_set_name_vargs+0x41/0x110 >> [<ffffffff82d91abb>] dev_set_name+0xab/0xe0 >> [<ffffffffa0248124>] 0xffffffffa0248124 >> [<ffffffff81001c27>] do_one_initcall+0x87/0x2e0 >> [<ffffffff813739cb>] do_init_module+0x1ab/0x640 >> [<ffffffff81379d20>] load_module+0x5d00/0x77f0 >> [<ffffffff8137bc40>] __do_sys_finit_module+0x110/0x1b0 >> [<ffffffff83c944a5>] do_syscall_64+0x35/0x80 >> [<ffffffff83e0006a>] entry_SYSCALL_64_after_hwframe+0x46/0xb0 >> >> Fixes: d61fc96f47fd ("sample: vfio mdev display - host device") >> Fixes: 9d1a546c53b4 ("docs: Sample driver to demonstrate how to use Mediated device framework.") >> Fixes: a5e6e6505f38 ("sample: vfio bochs vbe display (host device for bochs-drm)") >> Signed-off-by: ruanjinjie <ruanjinjie@xxxxxxxxxx> >> --- >> samples/vfio-mdev/mbochs.c | 4 +++- >> samples/vfio-mdev/mdpy.c | 4 +++- >> samples/vfio-mdev/mtty.c | 4 +++- >> 3 files changed, 9 insertions(+), 3 deletions(-) >> >> diff --git a/samples/vfio-mdev/mbochs.c b/samples/vfio-mdev/mbochs.c >> index 117a8d799f71..1c47672be815 100644 >> --- a/samples/vfio-mdev/mbochs.c >> +++ b/samples/vfio-mdev/mbochs.c >> @@ -1430,8 +1430,10 @@ static int __init mbochs_dev_init(void) >> dev_set_name(&mbochs_dev, "%s", MBOCHS_NAME); >> >> ret = device_register(&mbochs_dev); >> - if (ret) >> + if (ret) { >> + put_device(&mbochs_dev); >> goto err_class; >> + } >> >> ret = mdev_register_parent(&mbochs_parent, &mbochs_dev, &mbochs_driver, >> mbochs_mdev_types, > > > vfio-ap has a similar unwind as the sample drivers, but actually makes > an attempt to catch this ex: I think the reason is vfio-ap driver error path has common unwind to do before device_register, otherwise it can return just after put_device or device_unregister. > > ... > ret = device_register(&matrix_dev->device); > if (ret) > goto matrix_reg_err; > > ret = driver_register(&matrix_driver); > if (ret) > goto matrix_drv_err; > > return 0; > > matrix_drv_err: > device_unregister(&matrix_dev->device); > matrix_reg_err: > put_device(&matrix_dev->device); > ... > > So of the vfio drivers calling device_register(), vfio-ap is the only > one that does a put_device() if device_register() fails, but it also > seems sketchy to call both device_unregister() and put_device() in the > case that we exit via matrix_drv_err. The patch do not change the original error path, just add missing put_device out of the normal error path if device_register fails , so there is no risk of calling both device_unregister() and put_device(). > > I wonder if all of these shouldn't adopt a flow like: > > ret = device_register(&dev); > if (ret) > goto err1; > > .... > > return 0; > > err2: > device_del(&dev); > err1: > put_device(&dev); > > Thanks, > > Alex > >> diff --git a/samples/vfio-mdev/mdpy.c b/samples/vfio-mdev/mdpy.c >> index 946e8cfde6fd..bfb93eaf535b 100644 >> --- a/samples/vfio-mdev/mdpy.c >> +++ b/samples/vfio-mdev/mdpy.c >> @@ -717,8 +717,10 @@ static int __init mdpy_dev_init(void) >> dev_set_name(&mdpy_dev, "%s", MDPY_NAME); >> >> ret = device_register(&mdpy_dev); >> - if (ret) >> + if (ret) { >> + put_device(&mdpy_dev); >> goto err_class; >> + } >> >> ret = mdev_register_parent(&mdpy_parent, &mdpy_dev, &mdpy_driver, >> mdpy_mdev_types, >> diff --git a/samples/vfio-mdev/mtty.c b/samples/vfio-mdev/mtty.c >> index e72085fc1376..dddb0619846c 100644 >> --- a/samples/vfio-mdev/mtty.c >> +++ b/samples/vfio-mdev/mtty.c >> @@ -1330,8 +1330,10 @@ static int __init mtty_dev_init(void) >> dev_set_name(&mtty_dev.dev, "%s", MTTY_NAME); >> >> ret = device_register(&mtty_dev.dev); >> - if (ret) >> + if (ret) { >> + put_device(&mtty_dev.dev); >> goto err_class; >> + } >> >> ret = mdev_register_parent(&mtty_dev.parent, &mtty_dev.dev, >> &mtty_driver, mtty_mdev_types, >