[+cc Mika, Maciej since they've worked on similar delays recently]

On Tue, Jul 25, 2023 at 03:46:35PM -0500, Bjorn Helgaas wrote:
> On Mon, Jul 24, 2023 at 06:48:47PM +0800, Kevin Xie wrote:
> > On 2023/7/21 0:15, Bjorn Helgaas wrote:
> > > On Thu, Jul 20, 2023 at 06:11:59PM +0800, Kevin Xie wrote:
> > >> On 2023/7/20 0:48, Bjorn Helgaas wrote:
> > >> > On Wed, Jul 19, 2023 at 06:20:56PM +0800, Minda Chen wrote:
> > >> >> Add StarFive JH7110 SoC PCIe controller platform
> > >> >> driver codes.
>
> > >> However, in the compatibility testing with several NVMe SSD, we
> > >> found that Lenovo Thinklife ST8000 NVMe can not get ready in 100ms,
> > >> and it actually needs almost 200ms. Thus, we increased the T_PVPERL
> > >> value to 300ms for the better device compatibility.
> > > ...
> > >
> > > Thanks for this valuable information! This NVMe issue potentially
> > > affects many similar drivers, and we may need a more generic fix so
> > > this device works well with all of them.
> > >
> > > T_PVPERL is defined to start when power is stable. Do you have a way
> > > to accurately determine that point? I'm guessing this:
> > >
> > >   gpiod_set_value_cansleep(pcie->power_gpio, 1)
> > >
> > > turns the power on? But of course that doesn't mean it is instantly
> > > stable. Maybe your testing is telling you that your driver should
> > > have a hardware-specific 200ms delay to wait for power to become
> > > stable, followed by the standard 100ms for T_PVPERL?
> >
> > You are right, we did not take the power stable cost into account.
> > T_PVPERL is enough for Lenovo Thinklife ST8000 NVMe SSD to get ready,
> > and the extra cost is from the power circuit of a PCIe to M.2 connector,
> > which is used to verify M.2 SSD with our EVB at early stage.
>
> Hmm. That sounds potentially interesting. I assume you're talking
> about something like this: https://www.amazon.com/dp/B07JKH5VTL
>
> I'm not familiar with the timing requirements for something like this.
> There is a PCIe M.2 spec with some timing requirements, but I don't
> know whether or how software is supposed to manage this. There is a
> T_PVPGL (power valid to PERST# inactive) parameter, but it's
> implementation specific, so I don't know what the point of that is.
> And I don't see a way for software to even detect the presence of such
> an adapter.

I intended to ask about this on the PCI-SIG forum, but after reading
this thread [1], I don't think we would learn anything. The question
was:

  The M.2 device has 5 voltage rails generated from the 3.3V input
  supply voltage
  -------------------------------------------
  This is re. Table 17 in PCI Express M.2 Specification Revision 1.1
  Power Valid* to PERST# input inactive : Implementation specific;
  recommended 50 ms

  What exactly does this mean ?

  The Note says

    *Power Valid when all the voltage supply rails have reached their
    respective Vmin.

  Does this mean that the 50ms to PERSTn is counted from the instant
  when all *5 voltage rails* on the M.2 device have become "good" ?

and the answer was:

  You wrote;
  Does this mean that the 50ms to PERSTn is counted from the instant
  when all 5 voltage rails on the M.2 device have become "good" ?

  Reply:
  This means that counting the recommended 50 ms begins from the time
  when the power rails coming to the device/module, from the host, are
  stable *at the device connector*.

  As for the time it takes voltages derived inside the device from any
  of the host power rails (e.g., 3.3V rail) to become stable, that is
  part of the 50ms the host should wait before de-asserting PERST#, in
  order ensure that most devices will be ready by then.

  Strictly speaking, nothing disastrous happens if a host violates the
  50ms. If it de-asserts too soon, the device may not be ready, but
  most hosts will try again. If the host de-asserts too late, the
  device has even more time to stabilize.
  This is why the WG felt that an exact minimum number for >>Tpvpgl,
  was not valid in practice, and we made it a recommendation.

Since T_PVPGL is implementation-specific, we can't really base
anything in software on the 50ms recommendation. It sounds to me like
they are counting on software to retry config reads when enumerating.

I guess the delays we *can* observe are:

  100ms T_PVPERL "Power stable to PERST# inactive" (CEM 2.9.2)
  100ms software delay between reset and config request (Base 6.6.1)

The PCI core doesn't know how to assert PERST#, so the T_PVPERL delay
definitely has to be in the host controller driver.

The PCI core observes the second 100ms delay after a reset in
pci_bridge_wait_for_secondary_bus(). But this 100ms delay does not
happen during initial enumeration. I think the assumption of the PCI
core is that when the host controller driver calls pci_host_probe(),
we can issue config requests immediately.

So I think that to be safe, we probably need to do both of those 100ms
delays in the host controller driver. Maybe there's some hope of
supporting the latter one in the PCI core someday, but that's not
today.

Bjorn

[1] https://forum.pcisig.com/viewtopic.php?f=74&t=1037
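
For illustration only, the ordering argued for above (wait for stable
power, then 100ms T_PVPERL before releasing PERST#, then another 100ms
before the first config request) can be sketched as a small userspace
simulation in C. Everything named here is hypothetical: struct
sim_host, sim_delay_ms(), sim_host_power_up() and the two macro names
are made up for this sketch and are not kernel APIs, and the clock is
simulated rather than slept. A real host controller driver would use
its own GPIO/reset handling and msleep(), and would call
pci_host_probe() at the point where the sketch records the first
config request.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; the values are the two delays listed above. */
#define T_PVPERL_MS          100 /* power stable to PERST# inactive (CEM 2.9.2) */
#define T_RESET_TO_CONFIG_MS 100 /* reset exit to first config request (Base 6.6.1) */

/* Simulated host state; timestamps are in simulated milliseconds. */
struct sim_host {
	unsigned int now_ms;          /* simulated clock */
	unsigned int power_stable_ms; /* board-specific: power-on to stable rails */
	bool power_on;
	bool perst_asserted;
	unsigned int t_power_stable;  /* when the rails were considered stable */
	unsigned int t_perst_release; /* when PERST# was deasserted */
	unsigned int t_first_config;  /* earliest allowed config request */
};

/* Advance the simulated clock instead of sleeping. */
static void sim_delay_ms(struct sim_host *h, unsigned int ms)
{
	h->now_ms += ms;
}

static void sim_host_power_up(struct sim_host *h)
{
	/* Hold the device in reset while power comes up. */
	h->perst_asserted = true;
	h->power_on = true;

	/* Board-specific wait for stable power; T_PVPERL does not cover it. */
	sim_delay_ms(h, h->power_stable_ms);
	h->t_power_stable = h->now_ms;

	/* First 100ms: power stable to PERST# inactive. */
	sim_delay_ms(h, T_PVPERL_MS);
	assert(h->perst_asserted);
	h->perst_asserted = false;
	h->t_perst_release = h->now_ms;

	/* Second 100ms: reset exit to first config request. */
	sim_delay_ms(h, T_RESET_TO_CONFIG_MS);
	h->t_first_config = h->now_ms;
}
```

With power_stable_ms set to 200 (the hardware-specific figure guessed
at earlier in the thread, used here purely as an example), the first
config request is allowed 400ms after power-on: 200 + 100 + 100.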