On Fri, 3 Jun 2022 at 18:12, Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
>
> > > > > > syzbot has bisected this issue to:
> > > > > >
> > > > > > commit a9c4cf299f5f79d5016c8a9646fa1fc49381a8c1
> > > > > > Author: Andy Shevchenko <andriy.shevchenko@xxxxxxxxxxxxxxx>
> > > > > > Date: Fri Jun 18 13:41:27 2021 +0000
> > > > > >
> > > > > >     ACPI: sysfs: Use __ATTR_RO() and __ATTR_RW() macros
> > > > >
> > > > > Hmm... It's not obvious at all how this change can alter the behaviour so
> > > > > drastically. device_add() is called from USB core with intf->dev.name == NULL
> > > > > by some reason. A-ha, seems like fault injector, which looks like
> > > > >
> > > > >     dev_set_name(&intf->dev, "%d-%s:%d.%d", dev->bus->busnum,
> > > > >                  dev->devpath, configuration, ifnum);
> > > > >
> > > > > missed the return code check.
> > > > >
> > > > > But I'm not familiar with that code at all, adding Linux USB ML and Alan.
> > > >
> > > > I can't see any connection between this bug and acpi/sysfs.c. Is it a
> > > > bad bisection?
> > > >
> > > > It looks like you're right about dev_set_name() failing. In fact, the
> > > > kernel appears to be littered with calls to that routine which do not
> > > > check the return code (the entire subtree below drivers/usb/ contains
> > > > only _one_ call that does check the return code!). The function doesn't
> > > > have any __must_check annotation, and its kerneldoc doesn't mention the
> > > > return code or the possibility of a failure.
> > > >
> > > > Apparently the assumption is that if dev_set_name() fails then
> > > > device_add() later on will also fail, and the problem will be detected
> > > > then.
> > > >
> > > > So now what should happen when device_add() for an interface fails in
> > > > usb_set_configuration()?
> > >
> > > But how can that really fail on a real system?
> > >
> > > Is this just due to error-injection stuff? If so, I'm really loath to
> > > rework the world for something that can never happen in real life.
> > >
> > > Or is this a real syzbot-found-with-reproducer issue?
> >
> > Aren't there quite a few reasons why device_add() might fail? (Although
> > most of them probably are memory allocation errors...)
>
> I was thinking of the dev_set_name() issue further back in the call
> chain.
>
> > Basically, you have to make up your mind. If a function can fail, you
> > should be prepared to handle the failure. If it can't fail, there's no
> > point in even checking the return code.
>
> True, ok, we should unwind the mess. I'll try to look at it after the
> merge window...
>
> But again, is this a "real and able to be triggered from userspace"
> problem, or just fault-injection-induced?

Then this is something to fix in the fault injection subsystem.
Testing systems shouldn't be reporting false positives.
What allocations cannot fail in real life? Is it <=page_size?
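
To make the pattern under discussion concrete, here is a minimal userspace
sketch of the checked call. This is not kernel code: everything prefixed
fake_ and the inject_alloc_failure switch are hypothetical stand-ins that
only mimic the behaviour described in this thread (naming can fail on
allocation, and the add step rejects an unnamed device). Only the format
string is taken from the quoted USB core call.

```c
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct device; only the name matters here. */
struct fake_device {
	char *name;
};

/* Assumption: set to nonzero to simulate an injected allocation failure. */
static int inject_alloc_failure;

/*
 * Userspace mock of dev_set_name(): formats the name into freshly
 * allocated memory. Returns 0 on success, -ENOMEM if allocation fails.
 */
static int fake_dev_set_name(struct fake_device *dev, const char *fmt, ...)
{
	va_list ap;
	char *buf;
	int len;

	/* First pass: compute the length of the formatted name. */
	va_start(ap, fmt);
	len = vsnprintf(NULL, 0, fmt, ap);
	va_end(ap);
	if (len < 0)
		return -EINVAL;

	buf = inject_alloc_failure ? NULL : malloc((size_t)len + 1);
	if (!buf)
		return -ENOMEM;

	/* Second pass: actually format into the buffer. */
	va_start(ap, fmt);
	vsnprintf(buf, (size_t)len + 1, fmt, ap);
	va_end(ap);

	free(dev->name);
	dev->name = buf;
	return 0;
}

/* Mock of device_add(): refuses a device that has no name. */
static int fake_device_add(struct fake_device *dev)
{
	return dev->name ? 0 : -EINVAL;
}

/* Checked variant: bail out as soon as naming fails. */
static int register_interface(struct fake_device *dev, int busnum,
			      const char *devpath, int configuration,
			      int ifnum)
{
	int ret;

	ret = fake_dev_set_name(dev, "%d-%s:%d.%d", busnum, devpath,
				configuration, ifnum);
	if (ret)
		return ret;

	return fake_device_add(dev);
}
```

With inject_alloc_failure set, register_interface() returns -ENOMEM at the
naming step rather than relying on the later add to notice the missing name.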