On Fri, Jun 8, 2018 at 5:16 PM, Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
>> On Fri, Jun 8, 2018 at 4:31 AM, Tetsuo Handa
>> <penguin-kernel@xxxxxxxxxxxxxxxxxxx> wrote:
>>> Dmitry Vyukov wrote:
>>>> On Tue, Jun 5, 2018 at 3:45 PM, Tetsuo Handa
>>>> <penguin-kernel@xxxxxxxxxxxxxxxxxxx> wrote:
>>>> > Dmitry, can you assign VM resources for a git tree for this bug? This bug wants to fight
>>>> > against https://github.com/google/syzkaller/blob/master/docs/syzbot.md#no-custom-patches ...
>>>>
>>>> Hi Tetsuo,
>>>>
>>>> Most of the reasons for not doing it still stand. A syzkaller instance
>>>> will produce not just this bug, it will produce hundreds of different
>>>> bugs. Then the question is: what to do with these bugs? Report all to
>>>> mailing lists?
>>>
>>> Is it possible to add linux-next.git tree as a target for fuzzing? If yes,
>>> we can try debug patches easily, in addition to find bugs earlier than now.
>>
>> syzbot tested linux-next and mmotm initially, but they were removed at
>> the request of kernel developers. See:
>> https://groups.google.com/d/msg/syzkaller/0H0LHW_ayR8/dsK5qGB_AQAJ
>> and:
>> https://groups.google.com/d/msg/syzkaller-bugs/FeAgni6Atlk/U0JGoR0AAwAJ
>> Indeed, linux-next produces around 50 assorted one-off unexplainable
>> bug reports.
>>
>>
>>>> I think the solution here is just to run syzkaller instance locally.
>>>> It's just a program anybody can run it on any kernel with any custom
>>>> patches. Moreover for local instance it's also possible to limit set
>>>> of tested syscalls to increase probability of hitting this bug and at
>>>> the same time filter out most of other bugs.
>>>
>>> If this bug is reproducible with VM resources individual developer can afford...
>>>
>>> Since my Linux development environment is VMware guests on a Windows PC, I can't
>>> run VM instance which needs KVM acceleration. Also, due to security policy, I can't
>>> utilize external VM resources available on the Internet, as well as I can't use ssh
>>> and git protocols. Speak of this bug, even with a lot of VM instances, syzbot can
>>> reproduce this bug only once or twice per a day. Thus, the question for me boils
>>> down to, whether I can reproduce this bug using one VMware guest instance with 4GB
>>> of memory. Effectively, I don't have access to environments for running syzkaller
>>> instance...
>>
>> Well, I don't know what to say, it does require some resources.
>>
>>>> Do we have any idea about the guilty subsystem? You mentioned
>>>> bdi_unregister, why? What would be the set of syscalls to concentrate
>>>> on?
>>>> I will do a custom run when I get around to it, if nobody else beats me to it.
>>>
>>> Because bdi_unregister() does "bdi->dev = NULL;" which wb_workfn() is hitting
>>> NULL pointer dereference.
>>
>> Right, wb_workfn is not a generic function, it's fs-specific function.
>>
>> Trying to reproduce this locally now.
>
>
> No luck so far.
>
> Trying to look from a different angle: is it possible that bdi->dev is
> not set yet, rather then already reset?

I was able to reproduce this once locally by running the syz-crush
utility to replay one of the crash logs. Now running with Tetsuo's patch.
I can say we are hunting a very subtle race condition with a short
inconsistency window, perhaps a few instructions.
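
To make "a short inconsistency window" concrete, here is a toy Python
model, not kernel code. It assumes, purely for illustration, a
check-then-use pattern on a shared pointer racing with a single store
of NULL. The step names (check_dev, use_dev, clear_dev) are
hypothetical and do not mirror the real wb_workfn() or
bdi_unregister() code paths; whether the actual bug has this shape is
exactly what is still unknown. The sketch enumerates every ordering of
the two threads' steps and reports which ones fault:

```python
from itertools import combinations

# Toy model only: the step names are hypothetical and do not mirror the
# real wb_workfn()/bdi_unregister() code paths.
WORKER = ["check_dev", "use_dev"]   # reader: test the pointer, then dereference it
UNREGISTER = ["clear_dev"]          # writer: stands in for the "bdi->dev = NULL" store


def interleavings(a, b):
    """Yield every merge of a and b that preserves each thread's own order."""
    n = len(a) + len(b)
    for slots in combinations(range(n), len(a)):
        ia, ib = iter(a), iter(b)
        yield [next(ia) if i in slots else next(ib) for i in range(n)]


def crashes(schedule):
    """Return True if the worker dereferences dev after it was cleared."""
    dev = object()        # stands in for a valid device pointer
    checked_ok = False
    for step in schedule:
        if step == "clear_dev":
            dev = None
        elif step == "check_dev":
            checked_ok = dev is not None
        elif step == "use_dev" and checked_ok and dev is None:
            return True   # passed the check, but the pointer is gone now
    return False


schedules = list(interleavings(WORKER, UNREGISTER))
bad = [s for s in schedules if crashes(s)]
print(len(schedules), "schedules,", len(bad), "crashing:", bad)
```

In this model only one of the three orderings faults: the store has to
land exactly between the check and the use. With real code the window
is a handful of instructions among everything else both paths do, which
would be consistent with seeing the crash only once or twice a day.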