Re: [PATCH v6 2/3] zram: fix deadlock with sysfs attribute usage and module removal

Luis Chamberlain <mcgrof@xxxxxxxxxx> · Fri, 23 Jul 2021 10:49:19 -0700

On Fri, Jul 23, 2021 at 01:15:49PM +0200, Greg KH wrote:
> On Thu, Jul 22, 2021 at 03:17:05PM -0700, Luis Chamberlain wrote:
> > On Wed, Jul 21, 2021 at 01:29:29PM +0200, Greg KH wrote:
> > > On Fri, Jul 02, 2021 at 05:19:57PM -0700, Luis Chamberlain wrote:
> > > > +#define MODULE_DEVICE_ATTR_FUNC_STORE(_name) \
> > > > +static ssize_t module_ ## _name ## _store(struct device *dev, \
> > > > +				   struct device_attribute *attr, \
> > > > +				   const char *buf, size_t len) \
> > > > +{ \
> > > > +	ssize_t __ret; \
> > > > +	if (!try_module_get(THIS_MODULE)) \
> > > > +		return -ENODEV; \
> > > 
> > > I feel like this needs to be written down somewhere as I see it come up
> > > all the time.
> > 
> > I'll go ahead and cook up a patch to do just this after I send this
> > email out.
> > 
> > > Again, this is racy and broken code.  You can NEVER try to increment
> > > your own module reference count unless it has already been incremented
> > > by someone external first.
> > 
> > In the zram driver's case the sysfs files are still pegged on, because
> > as we noted before the kernfs active reference will ensure the store
> > operation still exists.
> 
> How does that happen without a module lock?

If a read / write operations is happening on a sysfs file created by a
module, the module cannot be removed because it is the module's own
responsibility to remove the sysfs file on module exit. There is no
module lock. It is inferred.

> > If the driver removes the operation prior to
> > getting the active reference, the write will just fail. kernfs ensures
> > once a file is opened the op is not removed until the operation completes.
> 
> How does it do that?

Using an active reference.

> > If a file is opened then, the module cannot possibly be removed. The
> > piece of information we realy care about is the use of module_is_live()
> > inside try_module_get() which does:
> > 
> > static inline bool module_is_live(struct module *mod)
> > {                                                                               
> > 	return mod->state != MODULE_STATE_GOING;
> > }
> > 
> > The try allows module removal to trump use of the sysfs file. If
> > userspace wants the module removed, it gives up in favor for that
> > operation.
> 
> I do not see the tie in kernfs to module reference counts, what am I
> missing?

Let me try to describe this again. Let's take it step by step, premise
by premise on the inference assumption. Let me know at which point you
disagree.

We are talking about sysfs files and you're argument is that
try_module_get() should lock the module, and so cannot be used
in sysfs files. My point is that such module lock is inferred:

1) Sysfs files are created by a module, that same module is responsible
   for removing the same sysfs files.
2) The module can only be removed and gone, once *all* sysfs files are
   removed first.
3) If any of the module's sysfs files are present the module must
   still be present
4) kernfs ensures that if a file is opened the file will not be
   removed until any pending operation completes
5) If a sysfs file is used to write something, that means the
   sysfs file has not yet been removed, and we know it will
   remain in existance throughout its entire operation
6) When a sysfs file operation is being run, the module must
   always exist

  Luis