On 09/15/2017 01:36 PM, Luca Coelho wrote: > On Fri, 2017-09-15 at 13:32 -0600, Jens Axboe wrote: >> On 09/14/2017 02:36 PM, Jens Axboe wrote: >>> On 09/14/2017 02:04 PM, Srinath Mannam wrote: >>>> Hi Jens Axboe, >>>> >>>> >>>> On Thu, Sep 14, 2017 at 11:14 PM, Jens Axboe <axboe@xxxxxxxxx> wrote: >>>>> On 09/14/2017 11:35 AM, Jens Axboe wrote: >>>>>> On 09/14/2017 11:28 AM, Srinath Mannam wrote: >>>>>>> Hi Bjorn, >>>>>>> >>>>>>> On Thu, Sep 14, 2017 at 10:52 PM, Jens Axboe <axboe@xxxxxxxxx> wrote: >>>>>>>> >>>>>>>> On 09/14/2017 11:11 AM, Bjorn Helgaas wrote: >>>>>>>>> [+cc linux-pci] >>>>>>>>> >>>>>>>>> On Thu, Sep 14, 2017 at 12:00 PM, Jens Axboe <axboe@xxxxxxxxx> wrote: >>>>>>>>>> On 09/12/2017 02:04 PM, Johannes Berg wrote: >>>>>>>>>>> On Tue, 2017-09-12 at 13:43 -0600, Jens Axboe wrote: >>>>>>>>>>> >>>>>>>>>>>> CC'ing the guilty part and Bjorn. I'm assuming it's the >>>>>>>>>>>> pci_is_enabled() check, since the rest of the patch shouldn't have >>>>>>>>>>>> functional changes. >>>>>>>>>>> >>>>>>>>>>> and pci_enable_bridge() already checks if it's already enabled, but >>>>>>>>>>> still enables mastering in that case if it isn't: >>>>>>>>>>> >>>>>>>>>>> static void pci_enable_bridge(struct pci_dev *dev) >>>>>>>>>>> { >>>>>>>>>>> [...] >>>>>>>>>>> if (pci_is_enabled(dev)) { >>>>>>>>>>> if (!dev->is_busmaster) >>>>>>>>>>> pci_set_master(dev); >>>>>>>>>>> return; >>>>>>>>>>> } >>>>>>>>>>> >>>>>>>>>>> so I guess due to the new check we end up with mastering disabled, and >>>>>>>>>>> thus the firmware can't load since that's a DMA thing? >>>>>>>>>> >>>>>>>>>> Bjorn/Srinath, any input here? This is a regression that prevents wifi >>>>>>>>>> from working on a pretty standard laptop. It'd suck to have this be in >>>>>>>>>> -rc1. Seems like the trivial fix would be: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c >>>>>>>>>> index b0002daa50f3..ffbe11dbdd61 100644 >>>>>>>>>> --- a/drivers/pci/pci.c >>>>>>>>>> +++ b/drivers/pci/pci.c >>>>>>>>>> @@ -1394,7 +1394,7 @@ static int pci_enable_device_flags(struct pci_dev *dev, unsigned long flags) >>>>>>>>>> return 0; /* already enabled */ >>>>>>>>>> >>>>>>>>>> bridge = pci_upstream_bridge(dev); >>>>>>>>>> - if (bridge && !pci_is_enabled(bridge)) >>>>>>>>>> + if (bridge) >>>>>>> >>>>>>> With this change and keeping "mutex_lock(&pci_bridge_mutex);" in >>>>>>> pci_enable_bridge functoin will causes a nexted lock. >>>>>> >>>>>> Took a look, and looks like you are right. That code looks like a mess, >>>>>> fwiw. >>>>>> >>>>>> I'd strongly suggest that the bad commit is reverted until a proper >>>>>> solution is found, since the simple one-liner could potentially >>>>>> introduce a deadlock with your patch applied. >>>>> >>>>> BTW, your patch looks pretty bad too, introducing a random mutex >>>>> deep on code that can be recursive. Why isn't this check in >>>>> pci_enable_device_flags() enough to prevent double-enable of >>>>> an upstream bridge? >>>>> >>>>> if (atomic_inc_return(&dev->enable_cnt) > 1) >>>>> return 0; /* already enabled */ >>>>> >>>> >>>> This check only to verify device enable not for the bus master check. >>>> But device enable function calls the bridge enable if it has the bridge. >>>> Bridge enable function enables both device and bus master. >>>> >>>> Here the issue might be because, bridge of endpoint has already set >>>> device enable without set bus master in some other context. which is >>>> wrong. >>>> because all bridges should enable with bridge_enable function only. >>>> So we see the problem In this context, because "if (bridge && >>>> !pci_is_enabled(bridge))" check stops bridge enable which intern stops >>>> bus master. >>>> pci_enable_bridge function always makes sure that both device and bus >>>> master are enabled in any case. If pci_enable_bridge function is not >>>> called means, that bridge is already has device enable flag set. which >>>> is not from pci_enable_bridge function. >>> >>> In any case, your patch introduces a regression on systems. Please get >>> it reverted now, and then you can come up with a new approach to fix the >>> double enable of the upstream bridge. >> >> Who's sending in the revert? I can certainly do it if no one else does, >> but it needs to be done. >> >> I'm not seeing any patches coming out of Srinath to fix up the >> situation, so we should revert the broken patch until a better solution >> exists. > > Bjorn already sent it: > > https://lkml.org/lkml/2017/9/15/46 Huh ok, I would have thought the various folks in this discussion would have been CC'ed on that. But good to know, that takes care of my concern. -- Jens Axboe