On Tue, Jun 02, 2020 at 08:25:14AM -0700, James Bottomley wrote:
> On Tue, 2020-06-02 at 05:10 -0700, Matthew Wilcox wrote:
> > On Tue, Jun 02, 2020 at 07:50:33PM +0800, Wang Hai wrote:
> > > syzkaller reports for memory leak when kobject_init_and_add()
> > > returns an error in the function sysfs_slab_add() [1]
> > >
> > > When this happened, the function kobject_put() is not called for the
> > > corresponding kobject, which potentially leads to memory leak.
> > >
> > > This patch fixes the issue by calling kobject_put() even if
> > > kobject_init_and_add() fails.
> >
> > I think this speaks to a deeper problem with kobject_init_and_add()
> > -- the need to call kobject_put() if it fails is not readily apparent
> > to most users.  This same bug appears in the first three users of
> > kobject_init_and_add() that I checked --
> > arch/ia64/kernel/topology.c
> > drivers/firmware/dmi-sysfs.c
> > drivers/firmware/efi/esrt.c
> > drivers/scsi/iscsi_boot_sysfs.c
> >
> > Some do get it right --
> > arch/powerpc/kernel/cacheinfo.c
> > drivers/gpu/drm/ttm/ttm_bo.c
> > drivers/gpu/drm/ttm/ttm_memory.c
> > drivers/infiniband/hw/mlx4/sysfs.c
> >
> > I'd argue that the current behaviour is wrong,
>
> Absolutely agree with this.  We have a big meta pattern here where we
> introduce functions with tortuous semantics then someone creates a
> checker for the semantics and misuses come crawling out of the woodwork
> leading to floods of patches, usually for little or never used error
> paths, which really don't buy anything apart from theoretical
> correctness.  Just insisting on simple semantics would have avoided
> this.

I "introduced" this way back at the end of 2007.  It's not exactly a new
function.

> > that kobject_init_and_add() should call kobject_put() if the add
> > fails.  This would need a tree-wide audit.  But somebody needs to do
> > that anyway because based on my random sampling, half of the users
> > currently get it wrong.
>
> Well, the semantics of kobject_init() are free on fail, so these are
> the ones everyone seems to be using.  The semantics of kobject_add are
> put on fail.  The problem is that put on fail isn't necessarily correct
> in the kobject_init() case: the release function may make assumptions
> about the object hierarchy which aren't satisfied in the kobject_init()
> failure case.  This argues that kobject_init_and_add() can't ever have
> correct semantics and we should eliminate it.

At the time, it did reduce common functionality and error handling all
into a simpler function.  And, given its history, it must have somehow
worked for the past 12 years or so :)

Odds are, lots of the callers shouldn't be messing around with kobjects
in the first place.  Originally it was only assumed that there would be
very few users.  But it has spread to filesystems and firmware
subsystems.  Drivers should never use it though, so it's a good hint
something is wrong there...

Anyway, patches to fix this up to make a "sane" api for kobjects are
always appreciated.  Personally I don't have the time at the moment.

thanks,

greg k-h
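
For readers following the thread, the error-handling rule under discussion
(drop the reference with a put when init-and-add fails, rather than freeing
the object directly) can be sketched in userspace C.  This is only an
illustration under stated assumptions, not kernel code: struct mock_kobj,
mock_put(), mock_init_and_add() and register_thing() are hypothetical
stand-ins invented here, and the mock models nothing beyond a reference
count and a release hook.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for a kobject: a refcount plus a release hook. */
struct mock_kobj {
	int refcount;
	void (*release)(struct mock_kobj *kobj);
};

/* Counts how many objects the release hook has freed, so callers can inspect it. */
static int released_count;

static void mock_release(struct mock_kobj *kobj)
{
	released_count++;
	free(kobj);
}

/* Drop one reference; the release hook runs when the count reaches zero. */
static void mock_put(struct mock_kobj *kobj)
{
	assert(kobj->refcount > 0);
	if (--kobj->refcount == 0)
		kobj->release(kobj);
}

/*
 * Assumed contract, mirroring the one discussed in the thread: the init step
 * always takes the initial reference, and a failing add step leaves that
 * reference in place for the caller to drop.  add_error simulates the result
 * of the add step (0 for success, negative for failure).
 */
static int mock_init_and_add(struct mock_kobj *kobj,
			     void (*release)(struct mock_kobj *kobj),
			     int add_error)
{
	kobj->refcount = 1;
	kobj->release = release;
	return add_error;
}

/* A caller that follows the rule: on failure it puts instead of freeing. */
static struct mock_kobj *register_thing(int add_error)
{
	struct mock_kobj *kobj = malloc(sizeof(*kobj));

	if (!kobj)
		return NULL;

	if (mock_init_and_add(kobj, mock_release, add_error)) {
		mock_put(kobj);	/* not free(kobj): the release hook owns cleanup */
		return NULL;
	}

	return kobj;	/* on success the caller holds the reference and puts it later */
}
```

In this sketch a failed register_thing() runs the release hook exactly once,
and a successful one leaves a single reference for the caller to drop with
mock_put() at teardown.  It does not capture the objection quoted above that
a real release function may assume hierarchy state that is not set up on the
failure path.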