Re: [PATCH v4.4.y] drivers: core: Remove glue dirs from sysfs earlier

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Jan 29, 2019 at 11:22 AM Guenter Roeck <groeck@xxxxxxxxxx> wrote:
>
> On Tue, Jan 29, 2019 at 7:19 AM Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> >
> > On Tue, Jan 29, 2019 at 06:35:48AM -0800, Guenter Roeck wrote:
> > > On Mon, Jan 28, 2019 at 10:58 PM Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> > > >
> > > > On Mon, Jan 28, 2019 at 09:31:30AM -0800, Zubin Mithra wrote:
> > > > > From: Benjamin Herrenschmidt <benh@xxxxxxxxxxxxxxxxxxx>
> > > > >
> > > > > commit 726e41097920a73e4c7c33385dcc0debb1281e18 upstream
> > > > >
> > > > > For devices with a class, we create a "glue" directory between
> > > > > the parent device and the new device with the class name.
> > > > >
> > > > > This directory is never "explicitely" removed when empty however,
> > > > > this is left to the implicit sysfs removal done by kobject_release()
> > > > > when the object loses its last reference via kobject_put().
> > > > >
> > > > > This is problematic because as long as it's not been removed from
> > > > > sysfs, it is still present in the class kset and in sysfs directory
> > > > > structure.
> > > > >
> > > > > The presence in the class kset exposes a use after free bug fixed
> > > > > by the previous patch, but the presence in sysfs means that until
> > > > > the kobject is released, which can take a while (especially with
> > > > > kobject debugging), any attempt at re-creating such as binding a
> > > > > new device for that class/parent pair, will result in a sysfs
> > > > > duplicate file name error.
> > > > >
> > > > > This fixes it by instead doing an explicit kobject_del() when
> > > > > the glue dir is empty, by keeping track of the number of
> > > > > child devices of the gluedir.
> > > > >
> > > > > This is made easy by the fact that all glue dir operations are
> > > > > done with a global mutex, and there's already a function
> > > > > (cleanup_glue_dir) called in all the right places taking that
> > > > > mutex that can be enhanced for this. It appears that this was
> > > > > in fact the intent of the function, but the implementation was
> > > > > wrong.
> > > > >
> > > > > Backport Note: kref_read() is not present in 4.4. Hence,
> > > > > use atomic_read(&kref.refcount) instead of kref_read(&kref).
> > > > >
> > > > > Signed-off-by: Benjamin Herrenschmidt <benh@xxxxxxxxxxxxxxxxxxx>
> > > > > Acked-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
> > > > > Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
> > > > > Signed-off-by: Zubin Mithra <zsm@xxxxxxxxxxxx>
> > > > > ---
> > > > >  drivers/base/core.c     |  2 ++
> > > > >  include/linux/kobject.h | 17 +++++++++++++++++
> > > > >  2 files changed, 19 insertions(+)
> > > >
> > > > Wait, why is this needed?
> > > >
> > >
> > > We have syzcaller/syzbot hits, and there is a reproducer. Is this sufficient ?
> >
> > Nice, where?
> >
> See below.
>
> > > > And why only for 4.4?  What about 4.9 and 4.14?  Do you want to upgrade
> > > > and suddenly hit the same "bug" that you fixed before?
> > > >
> > >
> > > Good point. Sorry, that got lost. We are, after all, oly human. I'd
> > > ask Zubin to provide backports, or do it myself, but we'll have to
> > > resolve the issue you bring up below first.
> > >
> > > > There was a reason that I did not backport this to the stable tree when
> > > > it was submitted, and that was because this was an odd race to ever hit.
> > > > Are you hitting this in the real world without kobject deferred
> > > > release enabled?  And if so, are you hitting the WARN_ON that is added
> > > > here?
> > > >
> > > I think we may need updated rules for stable. Many bug fixes are
> > > backported to stable releases without having been seen in the "real
> > > world" (whatever that means). At the same time we do see many races in
> > > the real world, many of them not fully understood. Our policy so far
> > > is to fix as many problems as possible if they are understood, in the
> > > hope that it fixes at least some of those problems seen in the field.
> >
> > I don't have a problem with backporting this, but it went in as a "fixes
> > a theoritical issue, and let's WARN_ON if it really ever is hit".
> >
> > So I would like to see where we are hitting that WARN_ON, as it sounds
> > like you found an easy-to-reproduce way to do it.
> >
> > > If there is a new rule that problems have to have been observed in the
> > > real world (ie without debugging options enabled, and without specific
> > > reproducer) before a patch is applied to stable, can we have that as
> > > generic rule that is not only selectively applied ?
> >
> > It's not a specific rule, it's just that this specific patch went
> > through a 30+ email chain of us arguing about it, so I really want to
> > see why this is needed, and why Ben was right and I was wrong :)
> >
> > I'm not trying to be extra hard here, it's just that I have a lot of
> > history with this patch...
> >
> > So if you all have a reproducer, I would love to see it.
> >
> Turns out you were already on Cc: for one of the reports:
>
> https://b.corp.google.com/issues/112630408
>
> I added you to the other:
>
> https://b.corp.google.com/issues/112630534
>

Public link:

https://groups.google.com/forum/#!msg/syzkaller-bugs/LruHGgq_a-k/bd0RED9nAAAJ

Unfortunately, that is the only one I found. There may be others - the
internal links above suggest that the problem was reported several
times - but I wasn't able to find it.

Guenter

> The syzbot dashboard links in those bugs point to the reproducers.
>
> > Oh, and the backports to the other kernel versions as well, you don't
> > want to have to fix this again in a year when you all upgrade to a new
> > release.
> >
> For v4.14: The upstream patch (726e41097920) applies directly.
> For v4.9.y: The patch provided for v4.4.y also applies to v4.9.y.
>
> Thanks,
> Guenter



[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux