Re: [PATCH v5] PCI: dwc: Wait for link up only if link is started

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Feb 06, 2024 at 10:40:43PM +0530, Manivannan Sadhasivam wrote:
> + Joao
> 
> On Mon, Feb 05, 2024 at 04:30:20PM +0530, Ajay Agarwal wrote:
> > On Wed, Jan 31, 2024 at 12:06:26AM +0530, Manivannan Sadhasivam wrote:
> > > On Tue, Jan 30, 2024 at 10:48:59PM +0530, Ajay Agarwal wrote:
> > > > On Tue, Jan 30, 2024 at 05:59:06PM +0530, Manivannan Sadhasivam wrote:
> > > > > On Tue, Jan 30, 2024 at 02:30:27PM +0530, Ajay Agarwal wrote:
> > > > > 
> > > > > [...]
> > > > > 
> > > > > > > > > > > If that's the case with your driver, when are you starting the link training?
> > > > > > > > > > > 
> > > > > > > > > > The link training starts later based on a userspace/debugfs trigger.
> > > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > Why does it happen as such? What's the problem in starting the link during
> > > > > > > > > probe? Keep it in mind that if you rely on the userspace for starting the link
> > > > > > > > > based on a platform (like Android), then if the same SoC or peripheral instance
> > > > > > > > > get reused in other platform (non-android), the it won't be a seamless user
> > > > > > > > > experience.
> > > > > > > > > 
> > > > > > > > > If there are any other usecases, please state them.
> > > > > > > > > 
> > > > > > > > > - Mani
> > > > > > > > >
> > > > > > > > This SoC is targeted for an android phone usecase and the endpoints
> > > > > > > > being enumerated need to go through an appropriate and device specific
> > > > > > > > power sequence which gets triggered only when the userspace is up. The
> > > > > > > > PCIe probe cannot assume that the EPs have been powered up already and
> > > > > > > > hence the link-up is not attempted.
> > > > > > > 
> > > > > > > Still, I do not see the necessity to not call start_link() during probe. If you
> > > > > > I am not adding any logic/condition around calling the start_link()
> > > > > > itself. I am only avoiding the wait for the link to be up if the
> > > > > > controller driver has not defined start_link().
> > > > > > 
> > > > > 
> > > > > I'm saying that not defining the start_link() callback itself is wrong.
> > > > > 
> > > > Whether the start_link() should be defined or not, is a different
> > > > design discussion. We currently have 2 drivers in upstream (intel-gw and
> > > > dw-plat) which do not have start_link() defined. Waiting for the link to
> > > > come up for the platforms using those drivers is not a good idea. And
> > > > that is what we are trying to avoid.
> > > > 
> > > 
> > > NO. The sole intention of this patch is to fix the delay observed with _your_
> > > out-of-tree controller driver as you explicitly said before. Impact for the
> > > existing 2 drivers are just a side effect.
> > >
> > Hi Mani,
> > What is the expectation from the pcie-designware-host driver? If the
> > .start_link() has to be defined by the vendor driver, then shouldn't the
> > probe be failed if the vendor has not defined it? Thereby failing the
> > probe for intel-gw and pcie-dw-plat drivers?
> > 
> 
> intel-gw maintainer agreed to fix the driver [1], but I cannot really comment on
> the other one. It is not starting the link at all, so don't know how it works.
> I've CCed the driver author to get some idea.
> 
> [1] https://lore.kernel.org/linux-pci/BY3PR19MB50764E90F107B3256189804CBD432@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/
> 
> > Additionally, if the link fails to come up even after 1 sec of wait
> > time, shouldn't the probe be failed in that case too?
> > 
> 
> Why? The device can be attached at any point of time. What I'm stressing is, the
> driver should check for the link to be up during probe and if there is no
> device, then it should just continue and hope for the device to show up later.
My change is still checking whether the link is up during probe.
If yes, print the link status (negotiated speed and width).
If no, and the .start_link() exists, then call the same and wait for 1
second for the link to be up.

> This way, the driver can detect the powered up devices during boot and also
> detect the hotplug devices.
>
If the .start_link() is not defined, how will the link come up? Are you
assuming that the bootloader might have enabled link training?

> > My understanding of these drivers is that the .start_link() is an
> > OPTIONAL callback and that the dw_pcie_host_init should help setup the
> > SW structures regardless of whether the .start_link() has been defined
> > or not, and whether the link is up or not. The vendor should be allowed
> > to train the link at a later point of time as well.
> > 
> 
> What do you mean by later point of time? Bringing the link through debugfs? NO.
> We cannot let each driver behave differently, unless there is a really valid
> reason.
> 
pci_rescan_bus() is an exported API. I am assuming that this means that
it is supposed to be used by modules when they know that the link is up.
If a module wishes to bring the link up using debugfs or some other HW
trigger, why is that a bad design? In my opinion, this is providing more
options to the HW/product designer and the module author, in addition to
the already existing .start_link() callback.

> > Please let me know your thoughts.
> > > > > > > add PROBE_PREFER_ASYNCHRONOUS to your controller driver, this delay would become
> > > > > > > negligible. The reason why I'm against not calling start_link() is due to below
> > > > > > > reasons:
> > > > > > > 
> > > > > > > 1. If the same SoC gets reused for other platforms, even other android phones
> > > > > > > that powers up the endpoints during boot, then it creates a dependency with
> > > > > > > userspace to always start the link even though the devices were available.
> > > > > > > That's why we should never fix the behavior of the controller drivers based on a
> > > > > > > single platform.
> > > > > > I wonder how the behavior is changing with this patch. Do you have an
> > > > > > example of a platform which does not have start_link() defined but would
> > > > > > like to still wait for a second for the link to come up?
> > > > > > 
> > > > > 
> > > > > Did you went through my reply completely? I mentioned that the 1s delay would be
> > > > > gone if you add the async flag to your driver and you are ignoring that.
> > > > > 
> > The async probe might not help in all the cases. Consider a situation
> > where the PCIe is probed after the boot is already completed. The user
> > will face the delay then. Do you agree?
> > 
> 
> You mean loading the driver module post boot? If the link is still not up, yes
> the users will see the 1sec delay.
>
> But again, the point I'm trying to make here is, all the drivers have to follow
> one flow. We cannot let each driver do its way of starting the link. There could
> be exceptions if we get a valid reason for a driver to not do so, but so far I
> haven't come across such reason. (the existing drivers, intel-gw and
> designware-plat are not exceptions, but they need fixing).
> 
> For your driver, you said that the userspace brings up the link, post boot when
> the devices are powered on. So starting the link during probe incurs 1s delay,
> as there would be no devices. But I suggested you to enable async probe to
> nullify the 1s delay during probe and you confirmed that it fixes the issue for
> you.
> 
There are emulation/simulation platforms in which 1 second of runtime
can correspond to ~1 hour of real-world time. Even if PCIe is probed
asyncronously, it will still block the next set of processes.

> Then you are again debating about not defining the start_link() callback :(
> 
I am not sure why you think I am debating against defining the
.start_link() callback. It is clearly an optional pointer and I am
choosing to not use it because of the usecase. And if it is optional and
I am not using it, why waste 1 sec of runtime waiting for the link? Do
we have an example in upstream of platforms where this 1 sec could prove
useful where link training is not being started by the platform driver
but EPs have to be detected because they are present and powered-up?

> I've made by point clear, please do not take inspiration from those drivers,
> they really need fixing. And for your usecase, allowing the controller driver
> to start the link post boot just because a device on your Pixel phone comes up
> later is not a good argument. You _should_not_ define the behavior of a
> controller driver based on one platform, it is really a bad design.
> 
> - Mani
> 
> > > > Yes, I did go through your suggestion of async probe and that might
> > > > solve my problem of the 1 sec delay. But I would like to fix the problem
> > > > at the core.
> > > > 
> > > 
> > > There is no problem at the core. The problem is with some controller drivers.
> > > Please do not try to fix a problem which is not there. There are no _special_
> > > reasons for those 2 drivers to not define start_link() callback. I'm trying to
> > > point you in the right path, but you are always chosing the other one.
> > > 
> > > > > But again, I'm saying that not defining start_link() itself is wrong and I've
> > > > > already mentioned the reasons.
> > > > > 
> > > > > > For example, consider the intel-gw driver. The 1 sec wait time in its
> > > > > > probe path is also a waste because it explicitly starts link training
> > > > > > later in time.
> > > > > > 
> > > > > 
> > > > > I previously mentioned that the intel-gw needs fixing since there is no point in
> > > > > starting the link and waiting for it to come up in its probe() if the DWC core
> > > > > is already doing that.
> > > > > 
> > > > > - Mani
> > > > > 
> > > > > -- 
> > > > > மணிவண்ணன் சதாசிவம்
> > > > I think we are at a dead-end in terms of agreeing to a policy. I would
> > > > like the maintainers to pitch in here with their views.
> > > 
> > > I'm the maintainer of the DWC drivers that you are proposing the patch for. If
> > > you happen to spin future revision of this series, please carry:
> > > 
> > > Nacked-by: Manivannan Sadhasivam <manivannan.sadhasivam@xxxxxxxxxx>
> > > 
> > > - Mani
> > > 
> > > -- 
> > > மணிவண்ணன் சதாசிவம்
> 
> -- 
> மணிவண்ணன் சதாசிவம்




[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux