Re: [PATCH v6 2/3] zram: fix deadlock with sysfs attribute usage and module removal

Luis Chamberlain <mcgrof@xxxxxxxxxx> · Thu, 22 Jul 2021 15:17:05 -0700

On Wed, Jul 21, 2021 at 01:29:29PM +0200, Greg KH wrote:
> On Fri, Jul 02, 2021 at 05:19:57PM -0700, Luis Chamberlain wrote:
> > +#define MODULE_DEVICE_ATTR_FUNC_STORE(_name) \
> > +static ssize_t module_ ## _name ## _store(struct device *dev, \
> > +				   struct device_attribute *attr, \
> > +				   const char *buf, size_t len) \
> > +{ \
> > +	ssize_t __ret; \
> > +	if (!try_module_get(THIS_MODULE)) \
> > +		return -ENODEV; \
> 
> I feel like this needs to be written down somewhere as I see it come up
> all the time.

I'll go ahead and cook up a patch to do just this after I send this
email out.

> Again, this is racy and broken code.  You can NEVER try to increment
> your own module reference count unless it has already been incremented
> by someone external first.

In the zram driver's case the sysfs files are still pegged on, because
as we noted before the kernfs active reference will ensure the store
operation still exists. If the driver removes the operation prior to
getting the active reference, the write will just fail. kernfs ensures
once a file is opened the op is not removed until the operation completes.

If a file is opened then, the module cannot possibly be removed. The
piece of information we realy care about is the use of module_is_live()
inside try_module_get() which does:

static inline bool module_is_live(struct module *mod)
{                                                                               
	return mod->state != MODULE_STATE_GOING;
}

The try allows module removal to trump use of the sysfs file. If
userspace wants the module removed, it gives up in favor for that
operation.

> As "proof", what happens if this module is unloaded right _before_ this
> call happens?  The module will be unloaded, memory zeroed out (or
> overridden), and then the processor will resume here and try to call (or
> return into) this code path.

The use of try_module_get() is protected to be correct by the kernfs active
reference, which in turn ensures the module is not gone. That is, when
a sysfs file read / write op is issued, if the file was opened we *know*
the module is not gone yet. It cannot possibly be removed. But once
inside the operation, try_module_get() can check to see if userspace did
want to remove the module, and if so it would immediately bail out and
yield to that operation.

Userspace cannot open a sysfs file with the module being gone.
The kernfs active prevents that.

> Boom.

I think it would be good we add a self test for this particular case.
I'll go ahead and extend my sysfs tests with one test for this case.

I could do this by adding a new sysfs file to the test driver and where
all it does is this try_module_get() thing. This can then be raced with
module removals attempts.

But it does not mean all you say is wrong.

I think the value of what you are saying requires documenting as it
was not clear to me either. I'll send a patch now.

> Just say no to "try_module_get(THIS_MODULE)" as it is totally wrong.

Context is required and documented. I'll end a patch.

  Luis