Bjorn Helgaas <helgaas@xxxxxxxxxx> writes: > On Thu, Jul 19, 2018 at 02:18:09PM +1000, Benjamin Herrenschmidt wrote: >> On Wed, 2018-07-18 at 18:29 -0500, Bjorn Helgaas wrote: >> > [+cc Paul, Michael, linuxppc-dev] >> > >> >> ..../... >> >> > > Debugging revealed a race condition between pcie core driver >> > > enabling is_added bit(pci_bus_add_device()) and nvme driver >> > > reset work-queue enabling is_busmaster bit (by pci_set_master()). >> > > As both fields are not handled in atomic manner and that clears >> > > is_added bit. >> > > >> > > Fix moves device addition is_added bit to separate private flag >> > > variable and use different atomic functions to set and retrieve >> > > device addition state. As is_added shares different memory >> > > location so race condition is avoided. >> > >> > Really nice bit of debugging! >> >> Indeed. However I'm not fan of the solution. Shouldn't we instead have >> some locking for the content of pci_dev ? I've always been wary of us >> having other similar races in there. >> >> As for the powerpc bits, I'm probably the one who wrote them, however, >> I'm on vacation this week and right now, no bandwidth to context switch >> all that back in :-) So give me a few days and/or ping me next week. > > OK, here's a ping :) > > Some powerpc cleanup would be ideal, but I'd like to fix the race for > v4.19, so I'm fine with this patch as-is. But I'd definitely want > your ack before inserting the ugly #include path in the powerpc code. Sorry, the patch didn't hit linuxppc so I forgot about it. I'm OK with the patch, the include is a bit gross, but I guess it's fine. I have a change to pseries/setup.c queued that might collide, though it's just an addition of another include so it's a trivial fixup. Acked-by: Michael Ellerman <mpe@xxxxxxxxxxxxxx> In terms of longer term clean up, do you have a sketch of what you'd like to see? cheers