On Wed, Dec 21, 2022 at 02:59:42PM +0000, Matthias Kaehlcke wrote: > On Wed, Dec 21, 2022 at 11:19:53AM +0530, Manivannan Sadhasivam wrote: > > On Mon, Dec 05, 2022 at 06:18:36PM +0000, Matthias Kaehlcke wrote: > > > On Mon, Dec 05, 2022 at 04:55:00PM +0530, Manivannan Sadhasivam wrote: > > > > On Fri, Sep 16, 2022 at 01:38:37PM +0530, Krishna chaitanya chundru wrote: > > > > > In ASPM driver, LTR threshold scale and value are updated based on > > > > > tcommon_mode and t_poweron values. In Kioxia NVMe L1.2 is failing due to > > > > > LTR threshold scale and value are greater values than max snoop/non-snoop > > > > > value. > > > > > > > > > > Based on PCIe r4.1, sec 5.5.1, L1.2 substate must be entered when > > > > > reported snoop/no-snoop values is greater than or equal to > > > > > LTR_L1.2_THRESHOLD value. > > > > > > > > > > Signed-off-by: Prasad Malisetty <quic_pmaliset@xxxxxxxxxxx> > > > > > Signed-off-by: Krishna chaitanya chundru <quic_krichai@xxxxxxxxxxx> > > > > > Acked-by: Manivannan Sadhasivam <manivannan.sadhasivam@xxxxxxxxxx> > > > > > > > > I take my Ack back... Sorry that I did not look into this patch closer. > > > > > > > > > --- > > > > > > > > > > I am taking this patch forward as prasad is no more working with our org. > > > > > changes since v6: > > > > > - Rebasing with pci/next. > > > > > changes since v5: > > > > > - no changes, just reposting as standalone patch instead of reply to > > > > > previous patch. > > > > > Changes since v4: > > > > > - Replaced conditional statements with min and max. > > > > > changes since v3: > > > > > - Changed the logic to include this condition "snoop/nosnoop > > > > > latencies are not equal to zero and lower than LTR_L1.2_THRESHOLD" > > > > > Changes since v2: > > > > > - Replaced LTRME logic with max snoop/no-snoop latencies check. > > > > > Changes since v1: > > > > > - Added missing variable declaration in v1 patch > > > > > --- > > > > > drivers/pci/pcie/aspm.c | 30 ++++++++++++++++++++++++++++++ > > > > > 1 file changed, 30 insertions(+) > > > > > > > > > > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c > > > > > index 928bf64..2bb8470 100644 > > > > > --- a/drivers/pci/pcie/aspm.c > > > > > +++ b/drivers/pci/pcie/aspm.c > > > > > @@ -486,13 +486,35 @@ static void aspm_calc_l1ss_info(struct pcie_link_state *link, > > > > > { > > > > > struct pci_dev *child = link->downstream, *parent = link->pdev; > > > > > u32 val1, val2, scale1, scale2; > > > > > + u32 max_val, max_scale, max_snp_scale, max_snp_val, max_nsnp_scale, max_nsnp_val; > > > > > u32 t_common_mode, t_power_on, l1_2_threshold, scale, value; > > > > > u32 ctl1 = 0, ctl2 = 0; > > > > > u32 pctl1, pctl2, cctl1, cctl2; > > > > > + u16 ltr; > > > > > + u16 max_snoop_lat, max_nosnoop_lat; > > > > > > > > > > if (!(link->aspm_support & ASPM_STATE_L1_2_MASK)) > > > > > return; > > > > > > > > > > + ltr = pci_find_ext_capability(child, PCI_EXT_CAP_ID_LTR); > > > > > + if (!ltr) > > > > > + return; > > > > > + > > > > > + pci_read_config_word(child, ltr + PCI_LTR_MAX_SNOOP_LAT, &max_snoop_lat); > > > > > + pci_read_config_word(child, ltr + PCI_LTR_MAX_NOSNOOP_LAT, &max_nosnoop_lat); > > > > > + > > > > > + max_snp_scale = (max_snoop_lat & PCI_LTR_SCALE_MASK) >> PCI_LTR_SCALE_SHIFT; > > > > > + max_snp_val = max_snoop_lat & PCI_LTR_VALUE_MASK; > > > > > + > > > > > + max_nsnp_scale = (max_nosnoop_lat & PCI_LTR_SCALE_MASK) >> PCI_LTR_SCALE_SHIFT; > > > > > + max_nsnp_val = max_nosnoop_lat & PCI_LTR_VALUE_MASK; > > > > > + > > > > > + /* choose the greater max scale value between snoop and no snoop value*/ > > > > > + max_scale = max(max_snp_scale, max_nsnp_scale); > > > > > + > > > > > + /* choose the greater max value between snoop and no snoop scales */ > > > > > + max_val = max(max_snp_val, max_nsnp_val); > > > > > + > > > > > /* Choose the greater of the two Port Common_Mode_Restore_Times */ > > > > > val1 = (parent_l1ss_cap & PCI_L1SS_CAP_CM_RESTORE_TIME) >> 8; > > > > > val2 = (child_l1ss_cap & PCI_L1SS_CAP_CM_RESTORE_TIME) >> 8; > > > > > @@ -525,6 +547,14 @@ static void aspm_calc_l1ss_info(struct pcie_link_state *link, > > > > > */ > > > > > l1_2_threshold = 2 + 4 + t_common_mode + t_power_on; > > > > > encode_l12_threshold(l1_2_threshold, &scale, &value); > > > > > + > > > > > + /* > > > > > + * Based on PCIe r4.1, sec 5.5.1, L1.2 substate must be entered when reported > > > > > + * snoop/no-snoop values are greater than or equal to LTR_L1.2_THRESHOLD value. > > > > > > > > Apart from the bug in calculating the LTR_Threshold as reported by Matthias > > > > and Bjorn, I'm wondering if we are covering up for the device firmware issue. > > > > > > Yes, I think the patch is doing exactly that. > > > > > > > As per section 6.18, if the device reports snoop/no-snoop scale/value as 0, then > > > > it implies that the device won't tolerate any additional delays from the host. > > > > > > > > In that case, how can we allow the link to go into L1.2 since that would incur > > > > high delay compared to L1.1? > > > > > > I had the same doubt, a value of 0 doesn't make sense, if it literally means > > > 'max delay of 0ns'. I did some debugging around this issue. One thing I found > > > is that there are NVMe models that don't have issues with entering L1.2 with > > > max (no-)snoop latencies of 0. From that I infer that a value of 0 does not > > > literally mean a max delay of 0ns. > > > > > > > This is interesting. > > > > > The PCIe spec doesn't say specifically what a value of 0 in those registers > > > means, but chapter "6.18 Latency Tolerance Reporting (LTR) Mechanism" of the > > > PCIe 4.0 base spec says something about the latency requirements in LTR > > > messages: > > > > > > Setting the value and scale fields to all 0’s indicates that the device will > > > be impacted by any delay and that the best possible service is requested. > > > > > > With that and the fact that several NVMe's don't have issues with all 0 values > > > I deduce that all 0's means 'best possible service' and not 'max latency of > > > 0ns'. It seems the Kioxia firmware has a bug which interprets all 0 values as > > > a max latency of 0ns. > > > > > > Another finding is that the Kioxia NVMe can enter L1.2 if the max latencies > > > are set to values >= the LTR threshold. Unfortunately that isn't a viable > > > fix for existing devices in the field, devices under development could possibly > > > adjust the latencies in the BIOS (coreboot code [1] suggests that this is done > > > at least in some cases). > > > > > > > I fully agree that it is a firmware issue. And yes, we should refrain to fixes > > in the bootloader if possible. > > > > Another option would be to add a quirk for specific devices in the ASPM code. > > But in that case, I'm not sure what would be the optimal snoop/no-snoop value > > that could be used. > > I had/have the same doubt. > > > There is another issue where if we have some other device on the same bus > > that explicitly requires 0ns latency. > > Would that be reasonable requirement, i.e. can 0ns latency ever be achieved? Well, not in an ideal world. But the device vendors do not fail to surprise us ;) Thanks, Mani -- மணிவண்ணன் சதாசிவம்