On Mon, Jun 12, 2023 at 01:19:02PM +0000, Yoshihiro Shimoda wrote: > Hello Serge, > > > From: Serge Semin, Sent: Friday, June 9, 2023 7:54 PM > <snip> > > > > > static int rcar_gen4_pcie_start_link(struct dw_pcie *dw) > > > > > { > > > > > struct rcar_gen4_pcie *rcar = to_rcar_gen4_pcie(dw); > > > > > int i; > > > > > > > > > > rcar_gen4_pcie_ltssm_enable(rcar, true); > > > > > > > > > > /* > > > > > * Require direct speed change here. Otherwise RDLH_LINK_UP of > > > > > * PCIEINTSTS0 which is this controller specific register may not > > > > > * be set. > > > > > */ > > > > > > > > > if (rcar->needs_speed_change) { > > > > > > > > Seeing this is specified for the root port only what about > > > > replacing the statement with just test whether the rcar_gen4_pcie.mode == > > > > DW_PCIE_RC_TYPE? Thus you'll be ablt to drop the needs_speed_change field. > > > > > > Thank you for the comment. I'll fix it. > > > > > > > BTW Just curious. Why is the loop below enabled for the Root Port > > > > only? What about the end-point controller? It's the same hardware > > > > after all.. > > > > > > This is reused from v16 and then it used "link retraining" which is only for > > > the Root Port. As you mentioned, it seems endpoint controller is also needed > > > if we use direct speed change. > > > > > > > > for (i = 0; i < SPEED_CHANGE_MAX_RETRIES; i++) { > > > > > rcar_gen4_pcie_speed_change(dw); > > > > > msleep(100); > > > > > if (rcar_gen4_pcie_check_current_link(dw)) > > > > > return 0; > > > > > } > > > > > > > > Did you trace how many iterations this loop normally takes? > > > > > > i = 0 or 1 (if the max-link-speed is suitable for a connected device.) > > > > > > > Is it > > > > constant or varies for the same platform setup and a connected link > > > > partner? Does the number of iterations depend on the target link speed > > > > specified via the "max-link-speed" property? > > > > > > > > This is not related to the "max-link-speed". It seems to related to > > > a link partner. > > > LinkCap max-link-speed loop > > > Device A 4 4 1 > > > Device A 4 3 1 > > > Device B 3 3 0 > > > > Great! If so I would have just left a single unconditional > > rcar_gen4_pcie_speed_change() call placed right after the > > rcar_gen4_pcie_ltssm_enable() method with no delays afterwards. These > > methods would have been invoked in the framework of > > dw_pcie_start_link() after which the dw_pcie_wait_for_link() method is > > called with several checks parted with the ~100ms delay. It will make > > sure that at least some link is up with the link state printed to the > > system log. If for some reason the performance degradation happens > > then it will be up to the system administrator to investigate what was > > wrong. Your driver did as much is it could to reach the best link gen. > > IIUC, is your suggestion like the following code? > --- > rcar_gen4_pcie_ltssm_enable(rcar, true); > if (!dw_pcie_wait_for_link(dw)) { > rcar_gen4_pcie_speed_change(dw); > return 0; > } > --- > > Unfortunately, it doesn't work correctly... > The following code can work correctly. The value of i is still 1 on the device A. > What do you think that the following code is acceptable? > --- > rcar_gen4_pcie_ltssm_enable(rcar, true); > for (i = 0; i < SPEED_CHANGE_MAX_RETRIES; i++) { > msleep(100); > rcar_gen4_pcie_speed_change(dw); > if (dw_pcie_link_up(dw)) { > printk("%s:%d\n", __func__, i); > return 0; > } > } > --- My idea was to implement something like this: +static int rcar_gen4_pcie_start_link(struct dw_pcie *dw) +{ + struct rcar_gen4_pcie *rcar = to_rcar_gen4_pcie(dw); + + rcar_gen4_pcie_ltssm_enable(rcar, true); + + rcar_gen4_pcie_speed_change(dw); + + return 0; +} and retain the rcar_gen4_pcie_link_up() method as is. * Note: originally your loop used to have the msleep() call performed after the first rcar_gen4_pcie_speed_change() invocation. Thus the delay can be dropped if there is only one iteration implemented (see further to understand why). You don't need to wait for the link to actually get up in the start_link() callback because there is the link_up() callback, which is called from the dw_pcie_wait_for_link() method during the generic DWC PCIe setup procedure. See: dw_pcie_host_init(): +-> ops->host_init() +-> ... +-> dw_pcie_setup_rc() | +-> ... | +-> dw_pcie_setup() | +-> ... +-> if !dw_pcie_link_up() | | +-> ops->link_up() | +-> dw_pcie_start_link() | +-> ops->start_link() +-> dw_pcie_wait_for_link(); // See, wait-procedure is already performed | +-> loop 10 times // for you in the core driver together | +-> dw_pcie_link_up() // with the delays between the checks | +-> ops->link_up() | +-> msleep(~100) +-> ... -Serge(y) > > Best regards, > Yoshihiro Shimoda > > > -Serge(y) >