Re: A question about the patch: [PATCH] PCI/PM: Keep runtime PM enabled for unbound PCI devices

On Wednesday, November 27, 2013 02:31:32 PM Huang Ying wrote:
> On Wed, 2013-11-27 at 13:32 +0800, Mike Qiu wrote:
> > On 11/27/2013 04:32 AM, Rafael J. Wysocki wrote:
> > > On Tuesday, November 26, 2013 01:41:13 PM Mike Qiu wrote:
> > >> On 11/14/2013 04:54 PM, Huang Ying wrote:
> > >>> On Thu, 2013-11-14 at 16:37 +0800, mike wrote:
> > >>>> On 11/14/2013 04:25 PM, Huang Ying wrote:
> > >>>>> On Thu, 2013-11-14 at 16:12 +0800, mike wrote:
> > >>>>>> On 11/14/2013 03:53 PM, Huang Ying wrote:
> > >>>>>>> On Thu, 2013-11-14 at 15:19 +0800, mike wrote:
> > >>>>>>>> On 11/14/2013 01:59 PM, Huang Ying wrote:
> > >>>>>>>>> On Thu, 2013-11-14 at 11:23 +0800, mike wrote:
> > >>>>>>>>>> On 11/14/2013 03:20 AM, Alan Stern wrote:
> > >>>>>>>>>>> On Wed, 13 Nov 2013, Bjorn Helgaas wrote:
> > >>>>>>>>>>>
> > >>>>>>>>>>>> [+cc Rafael, linux-pm]
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> On Wed, Nov 13, 2013 at 6:09 AM, mike<qiudayu@xxxxxxxxxxxxxxxxxx>  wrote:
> > >>>>>>>>>>>>> Hi Huang Ying,
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> I see you are the author of this patch, commit id is:
> > >>>>>>>>>>>>> 967577b062417b4e4b8e27b711220f4124f5153a
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> I have a question while I try to understand this patch,
> > >>>>>>>>>>>>> so I would be very grateful if you or others could reply.....
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> ............
> > >>>>>>>>>>>>> -       rc = ddi->drv->probe(ddi->dev, ddi->id);
> > >>>>>>>>>>>>> +       pm_runtime_get_sync(dev);
> > >>>>>>>>>>>>> +       pci_dev->driver = pci_drv;
> > >>>>>>>>>>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^
> > >>>>>>>>>>>>> I see that here you set the driver before calling probe,
> > >>>>>>>>>>>>> but I have no idea why you made this change.....
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Looking at the code, it may be related to pm_runtime??
> > >>>>>>>>>>> Yes, it is related to runtime PM.  In the PCI subsystem, runtime PM
> > >>>>>>>>>>> doesn't do anything unless pci_dev->driver is set.  You can see this at
> > >>>>>>>>>>> the start of pci_pm_runtime_suspend().
> > >>>>>>>>>>>
> > >>>>>>>>>>> Since we want the driver's probe routine to be able to carry out
> > >>>>>>>>>>> runtime PM operations, we have to set pci_dev->driver before the probe
> > >>>>>>>>>>> routine runs.
> > >>>>>>>>>> Is there any situation, e.g. during probe, where pci_dev->driver
> > >>>>>>>>>> has been set and pci_pm_runtime_xxx() has passed the
> > >>>>>>>>>> pci_dev->driver NULL check, but at that point the probe fails
> > >>>>>>>>>> and pci_dev->driver is set back to NULL?
> > >>>>>>>>>>
> > >>>>>>>>>> What will happen?  Or will this situation never happen?
> > >>>>>>>>>> I'm confused about this.
> > >>>>>>>>> I think that will never happen.  Before ->probe(), pm_runtime_get_sync()
> > >>>>>>>>> is called, so pci_pm_runtime_xxx() will not be called until
> > >>>>>>>>> pm_runtime_put_noidle() is called in ->probe().  And
> > >>>>>>>>> that call should be done as one of the last actions in
> > >>>>>>>>> ->probe(), after the normal probe actions succeeded.
> > >>>>>>>> OK, just as your description, it seems OK.
> > >>>>>>>> But this is really an issue, as I explained in my last email.
> > >>>>>>>>
> > >>>>>>>> So I want to know if there are any side-effect of changing the code
> > >>>>>>>> in pci_pm_runtime_xxx()
> > >>>>>>>>
> > >>>>>>>>       if (!pci_dev->driver)
> > >>>>>>>>              return 0;
> > >>>>>>>>       to
> > >>>>>>>>
> > >>>>>>>>       if (!dev->driver)
> > >>>>>>>>              return 0;
> > >>>>>>>>
> > >>>>>>> If you make this change, we can not put devices into low power state
> > >>>>>>> (runtime suspend the device) in ->probe().  That is expected in some
> > >>>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > >>>>>> Does this mean dev->driver is NULL while pci_dev->driver is set?
> > >>>>>>
> > >>>>>> If checking pci_dev->driver allows the low power state, then
> > >>>>>> pci_dev->driver is set.  If checking dev->driver does not, then
> > >>>>>> dev->driver == NULL.  But I have not found any case where
> > >>>>>> dev->driver == NULL but pci_dev->driver != NULL.
> > >>>>> Sorry, I made a mistake here.  dev->driver != NULL in
> > >>>>> local_pci_probe().  We use pci_dev->driver instead of dev->driver in
> > >>>>> pci_pm_runtime_xxx() because we want the device to be kept in the
> > >>>>> normal power state (D0) and in SUSPENDED state when unbound.  The
> > >>>>> pm_runtime_put/get_sync in pci_device_remove/local_pci_probe will not
> > >>>>> change the power state of the device because of the check in
> > >>>>> pci_pm_runtime_xxx().
> > >>>> Yes, you are right, but what confuses me is why checking dev->driver
> > >>>> in pci_pm_runtime_xxx() can't keep the device in the normal power
> > >>>> state (D0) and in SUSPENDED state when unbound.
> > >>>>
> > >>>> Maybe a logic issue?
> > >>> Because dev->driver is set before local_pci_probe() and cleared after
> > >>> pci_device_remove().  But we need a flag to be changed in
> > >>> local_pci_probe() and pci_device_remove().
> > >> Hi Ying,
> > >>
> > >> I'm now facing a bug, and the root cause is a problem with this logic.
> > >>
> > >> Another component calls the driver's ops during probe, when a lot of
> > >> critical data structures haven't been set up yet.
> > >>
> > >> This never happened with the old logic, because dev->driver is unset
> > >> during probe, so it could check dev->driver to see whether the device
> > >> driver can work.  But with the new logic it is really a big issue.
> > > What is the other component and why is it doing that?
> > 
> > A component like EEH on the Power architecture; it needs to check
> > whether the driver is working or not.
> > 
> > In the old logic, dev->driver is set once the device is probed and is
> > NULL otherwise, so it is safe to do so.
> > 
> > But in the new logic there is a problem: it can call the driver API,
> > which is very dangerous during probe, since many key data structures
> > may not have been set up yet.  This leads to the kernel going down and
> > the machine rebooting.  This could also be fixed in the drivers, e.g. by
> > checking the driver data itself, but that solution needs every driver
> > to be fixed, which may be a huge job.
> > 
> > So I think we need a new flag, or is there an existing flag we can use
> > to solve this issue?
> 
> I think a flag is not safe for you.  The driver may be removed while you
> operate on it.

Precisely.  The old code is still unsafe although it happens to work in the
given test conditions.

> Better to use device_lock() if possible, which will be
> held during device probe and driver remove.

Or generally synchronize it properly.
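
As a rough illustration of the locking idea only (this is a userspace
model, not kernel code): a pthread mutex stands in for device_lock(), and
every model_* and sample_* name below is hypothetical.  The assumption is
that bind, unbind and the outside caller all take the same per-device lock,
so the caller can never see a half-probed or disappearing driver.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these names exist in the kernel. */
struct model_driver {
	/* Callback that an outside component (EEH-like) wants to invoke. */
	int (*error_detected)(void *drvdata);
};

struct model_device {
	pthread_mutex_t lock;        /* plays the role of device_lock(dev) */
	struct model_driver *driver; /* set before probe runs, like pci_dev->driver */
	void *drvdata;               /* only valid once probe has finished */
};

static void model_device_init(struct model_device *dev)
{
	pthread_mutex_init(&dev->lock, NULL);
	dev->driver = NULL;
	dev->drvdata = NULL;
}

/* Bind: the lock is held across the whole probe, including the failure path. */
static int model_bind(struct model_device *dev, struct model_driver *drv,
		      int (*probe)(struct model_device *dev))
{
	int rc;

	pthread_mutex_lock(&dev->lock);
	dev->driver = drv;           /* visible early, before probe runs */
	rc = probe(dev);
	if (rc) {
		dev->driver = NULL;
		dev->drvdata = NULL;
	}
	pthread_mutex_unlock(&dev->lock);
	return rc;
}

/* Unbind: also under the lock, so a caller never sees the driver vanish mid-call. */
static void model_unbind(struct model_device *dev)
{
	pthread_mutex_lock(&dev->lock);
	dev->driver = NULL;
	dev->drvdata = NULL;
	pthread_mutex_unlock(&dev->lock);
}

/*
 * Outside component: take the lock first, then look at ->driver.
 * Returns -1 if no driver is bound, otherwise the callback's result.
 */
static int model_notify(struct model_device *dev)
{
	int rc = -1;

	pthread_mutex_lock(&dev->lock);
	if (dev->driver && dev->driver->error_detected) {
		/* We hold the lock, so probe has completed and drvdata is set. */
		assert(dev->drvdata != NULL);
		rc = dev->driver->error_detected(dev->drvdata);
	}
	pthread_mutex_unlock(&dev->lock);
	return rc;
}

/* Sample driver pieces, used only to exercise the model. */
static int sample_state = 42;

static int sample_probe(struct model_device *dev)
{
	dev->drvdata = &sample_state;
	return 0;
}

static int sample_failing_probe(struct model_device *dev)
{
	(void)dev;
	return -1;
}

static int sample_error_detected(void *drvdata)
{
	return *(int *)drvdata;
}
```

With this shape, a check of ->driver made without the lock is what would
be racy; the check itself is fine once it is done under the same lock that
probe and remove hold.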

Thanks,
Rafael

--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



