Re: [PATCH v4.4.y] drivers: core: Remove glue dirs from sysfs earlier

Guenter Roeck <groeck@xxxxxxxxxx> · Tue, 29 Jan 2019 11:22:16 -0800

On Tue, Jan 29, 2019 at 7:19 AM Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
>
> On Tue, Jan 29, 2019 at 06:35:48AM -0800, Guenter Roeck wrote:
> > On Mon, Jan 28, 2019 at 10:58 PM Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> > >
> > > On Mon, Jan 28, 2019 at 09:31:30AM -0800, Zubin Mithra wrote:
> > > > From: Benjamin Herrenschmidt <benh@xxxxxxxxxxxxxxxxxxx>
> > > >
> > > > commit 726e41097920a73e4c7c33385dcc0debb1281e18 upstream
> > > >
> > > > For devices with a class, we create a "glue" directory between
> > > > the parent device and the new device with the class name.
> > > >
> > > > This directory is never "explicitely" removed when empty however,
> > > > this is left to the implicit sysfs removal done by kobject_release()
> > > > when the object loses its last reference via kobject_put().
> > > >
> > > > This is problematic because as long as it's not been removed from
> > > > sysfs, it is still present in the class kset and in sysfs directory
> > > > structure.
> > > >
> > > > The presence in the class kset exposes a use after free bug fixed
> > > > by the previous patch, but the presence in sysfs means that until
> > > > the kobject is released, which can take a while (especially with
> > > > kobject debugging), any attempt at re-creating such as binding a
> > > > new device for that class/parent pair, will result in a sysfs
> > > > duplicate file name error.
> > > >
> > > > This fixes it by instead doing an explicit kobject_del() when
> > > > the glue dir is empty, by keeping track of the number of
> > > > child devices of the gluedir.
> > > >
> > > > This is made easy by the fact that all glue dir operations are
> > > > done with a global mutex, and there's already a function
> > > > (cleanup_glue_dir) called in all the right places taking that
> > > > mutex that can be enhanced for this. It appears that this was
> > > > in fact the intent of the function, but the implementation was
> > > > wrong.
> > > >
> > > > Backport Note: kref_read() is not present in 4.4. Hence,
> > > > use atomic_read(&kref.refcount) instead of kref_read(&kref).
> > > >
> > > > Signed-off-by: Benjamin Herrenschmidt <benh@xxxxxxxxxxxxxxxxxxx>
> > > > Acked-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
> > > > Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
> > > > Signed-off-by: Zubin Mithra <zsm@xxxxxxxxxxxx>
> > > > ---
> > > >  drivers/base/core.c     |  2 ++
> > > >  include/linux/kobject.h | 17 +++++++++++++++++
> > > >  2 files changed, 19 insertions(+)
> > >
> > > Wait, why is this needed?
> > >
> >
> > We have syzcaller/syzbot hits, and there is a reproducer. Is this sufficient ?
>
> Nice, where?
>
See below.

> > > And why only for 4.4?  What about 4.9 and 4.14?  Do you want to upgrade
> > > and suddenly hit the same "bug" that you fixed before?
> > >
> >
> > Good point. Sorry, that got lost. We are, after all, oly human. I'd
> > ask Zubin to provide backports, or do it myself, but we'll have to
> > resolve the issue you bring up below first.
> >
> > > There was a reason that I did not backport this to the stable tree when
> > > it was submitted, and that was because this was an odd race to ever hit.
> > > Are you hitting this in the real world without kobject deferred
> > > release enabled?  And if so, are you hitting the WARN_ON that is added
> > > here?
> > >
> > I think we may need updated rules for stable. Many bug fixes are
> > backported to stable releases without having been seen in the "real
> > world" (whatever that means). At the same time we do see many races in
> > the real world, many of them not fully understood. Our policy so far
> > is to fix as many problems as possible if they are understood, in the
> > hope that it fixes at least some of those problems seen in the field.
>
> I don't have a problem with backporting this, but it went in as a "fixes
> a theoritical issue, and let's WARN_ON if it really ever is hit".
>
> So I would like to see where we are hitting that WARN_ON, as it sounds
> like you found an easy-to-reproduce way to do it.
>
> > If there is a new rule that problems have to have been observed in the
> > real world (ie without debugging options enabled, and without specific
> > reproducer) before a patch is applied to stable, can we have that as
> > generic rule that is not only selectively applied ?
>
> It's not a specific rule, it's just that this specific patch went
> through a 30+ email chain of us arguing about it, so I really want to
> see why this is needed, and why Ben was right and I was wrong :)
>
> I'm not trying to be extra hard here, it's just that I have a lot of
> history with this patch...
>
> So if you all have a reproducer, I would love to see it.
>
Turns out you were already on Cc: for one of the reports:

https://b.corp.google.com/issues/112630408

I added you to the other:

https://b.corp.google.com/issues/112630534

The syzbot dashboard links in those bugs point to the reproducers.

> Oh, and the backports to the other kernel versions as well, you don't
> want to have to fix this again in a year when you all upgrade to a new
> release.
>
For v4.14: The upstream patch (726e41097920) applies directly.
For v4.9.y: The patch provided for v4.4.y also applies to v4.9.y.

Thanks,
Guenter