On Tue, Jul 27, 2021 at 08:38:50PM +0200, gregkh@xxxxxxxxxxxxxxxxxxx wrote: > On Tue, Jul 27, 2021 at 11:18:03AM -0700, Luis Chamberlain wrote: > > On Tue, Jul 27, 2021 at 07:46:34PM +0200, gregkh@xxxxxxxxxxxxxxxxxxx wrote: > > > On Tue, Jul 27, 2021 at 10:30:36AM -0700, Luis Chamberlain wrote: > > > > On Sat, Jul 24, 2021 at 12:15:10PM +0000, David Laight wrote: > > > > > From: Luis Chamberlain > > > > > > Sent: 22 July 2021 23:19 > > > > The sysfs store / read file operations are gauranteed to exist using > > > > kernfs's active reference (see kernfs_active()). > > > > > > But that has nothing to do with module reference counts. kernfs knows > > > nothing about modules. > > > > Yes but we are talking about sysfs files which the module creates. So > > but inference again, an active reference protects a module. > > What active reference? kernfs_active() > > > > In fact, this documentation patch was motivated by my own solution to a > > > > possible deadlock when sysfs is used. Using the same example above, if > > > > the same sysfs file uses *any* lock, which is *also* used on the exit > > > > routine, you can easily trigger a deadlock. This can happen for example > > > > by the lock being obtained by the removal routine, then the sysfs file > > > > gets called, waits for the lock to complete, then the module's exit > > > > routine starts cleaning up and removing sysfs files, but we won't be > > > > able to remove the sysfs file (due to kernefs active reference) until > > > > the sysfs file complets, but it cannot complete because the lock is > > > > already held. > > > > > > > > Yes, this is a generic problem. Yes I have proof [0]. Yes, a generic > > > > solution has been proposed [1], and because Greg is not convinced and I > > > > need to move on with life, I am suggesting a temporary driver specific > > > > solution (to which Greg is still NACK'ing, without even proposing any > > > > alternatives) [2]. > > > > > > > > [0] https://lkml.kernel.org/r/20210703004632.621662-5-mcgrof@xxxxxxxxxx > > > > [1] https://lkml.kernel.org/r/20210401235925.GR4332@xxxxxxxxxxxxxxxxxxx > > > > [2] https://lkml.kernel.org/r/20210723174919.ka3tzyre432uilf7@garbanzo > > > > > > My problem with your proposed solution is that it is still racy, you can > > > not increment your own module reference count from 0 -> 1 and expect it > > > to work properly. You need external code to do that somewhere. > > > > You are not providing *any* proof for this. > > I did provide proof of that. Here it is again. <irrelevant example> sysfs files are safe to use try_module_get() because once they are active a removal of the file cannot happen, and so removal will wait. > > And even so, I believe I have clarified as best as possible how a > > kernfs active reference implicitly protects the module when we are > > talking about sysfs files. > > I do not see any link anywhere between kernfs and modules, what am I > missing? Pointers to lines of code would be appreciated. I provided a selftests with error injections inserted all over kernfs_fop_write_iter(). Please study that and my error injection code. > > > Now trying to tie sysfs files to the modules that own them would be > > > nice, but as we have seen, that way lies way too many kernel changes, > > > right? > > > > It's not a one-liner fix. Yes. > > > > > Hm, maybe. Did we think about this from the kobj_attribute level? If > > > we use the "wrapper" logic there and the use of the macros we already > > > have for attributes, we might be able to get the module pointer directly > > > "for free". > > > > > > Did we try that? > > > > That was my hope. I tried that first. Last year in November I determined > > kernfs is kobject stupid. But more importantly *neither* are struct device > > specific, so neither of them have semantics for modules or even devices. > > But what about at the kobject level? kernfs is kobject stupid. > I will try to look at that this week, can't promise anything... Luis