On Thu, Mar 24, 2022 at 04:25:16AM +0300, Serge Semin wrote:
> It was wrong to use the region size parameter in order to determine
> whether the INCREASE_REGION_SIZE flag needs to be set for the outbound
> iATU entry because in general there are cases when combining a region base
> address and size together produces the out of bounds upper range limit
> while upper_32_bits(size) still returns zero. So having a region size
> within the permitted values doesn't mean the region limit address will fit
> to the corresponding CSR. Here is the way iATU calculates the in- and
> outbound untranslated regions if the INCREASE_REGION_SIZE flag is cleared
> [1]:
>
>               Start address:                       End address:
> 63              31              0    63              31              0
> +---------------+---------------+    +---------------+---------------+
> |               |         |  0s |    |               |         |  Fs |
> +---------------+---------------+    +---------------+---------------+
>    upper base   |   lower base       !upper! base    | limit address
>      address         address           address
>
> So the region start address is determined by the iATU lower and upper base
> address registers, while the region upper boundary is calculated based on
> the 32-bits limit address register and the upper part of the base address.
> In accordance with that logic for instance the range
> 0xf0000000 @ 0x20000000 does have the size smaller than 4GB, but the
> actual limit address turns to be invalid forming the untranslated address
> map as [0xf0000000; 0x0000FFFF], which isn't what the original range was.

I find the example confusing:

If the start address is 0x0-0xf0000000 and the size is 0x20000000, then the
end address without INCREASE_REGION_SIZE is going to be:

0x0-0x0FFFFFFF and this is wrong.

If INCREASE_REGION_SIZE is set, then the end address will be:

0x1-0x0FFFFFFF and this is correct.

> In order to fix that we need to check whether the size being added to the
> lower part of the base address causes the 4GB range overflow.
> If it does then we need to set the INCREASE_REGION_SIZE flag thus activating
> the extended limit address by means of an additional iATU CSR (upper limit
> address register) [2]:
>
>               Start address:                       End address:
> 63              31              0    63      x       31              0
> +---------------+---------------+    +---------------+---------------+
> |               |         |  0s |    |       |       |         |  Fs |
> +---------------+---------------+    +---------------+---------------+
>    upper base   |   lower base         upper | upper | limit address
>      address         address           base  | limit |
>                                       address|address|
>
> Otherwise there is enough room in the 32-bits wide limit address register,
> and the flag can be left unset.
>
> Note the case when the size-based flag setting approach is correct implies
> requiring to have the size-aligned base addresses only. But that
> restriction isn't relevant to the PCIe ranges accepted by the kernel.
> There is also no point in implementing it either seeing the problem can be
> easily fixed by checking the whole limit address instead of the region
> size.
>
> [1] DesignWare Cores PCI Express Controller Databook - DWC PCIe Root Port,
>     v5.40a, March 2019, fig.3-36, p.175
> [2] DesignWare Cores PCI Express Controller Databook - DWC PCIe Root Port,
>     v5.40a, March 2019, fig.3-37, p.176
>
> Fixes: 5b4cf0f65324 ("PCI: dwc: Add upper limit address for outbound iATU")
> Signed-off-by: Serge Semin <Sergey.Semin@xxxxxxxxxxxxxxxxxxxx>

With the example fixed,

Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@xxxxxxxxxx>

Thanks,
Mani

> ---
>  drivers/pci/controller/dwc/pcie-designware.c | 16 ++++++++++------
>  1 file changed, 10 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/pci/controller/dwc/pcie-designware.c b/drivers/pci/controller/dwc/pcie-designware.c
> index 7dc8c360a0d4..d737af058903 100644
> --- a/drivers/pci/controller/dwc/pcie-designware.c
> +++ b/drivers/pci/controller/dwc/pcie-designware.c
> @@ -287,8 +287,8 @@ static void dw_pcie_prog_outbound_atu_unroll(struct dw_pcie *pci, u8 func_no,
>  	dw_pcie_writel_ob_unroll(pci,
>  				 index, PCIE_ATU_UNR_UPPER_TARGET,
>  				 upper_32_bits(pci_addr));
>  	val = type | PCIE_ATU_FUNC_NUM(func_no);
> -	val = upper_32_bits(size - 1) ?
> -		val | PCIE_ATU_INCREASE_REGION_SIZE : val;
> +	if (upper_32_bits(limit_addr) > upper_32_bits(cpu_addr))
> +		val |= PCIE_ATU_INCREASE_REGION_SIZE;
>  	if (pci->version == 0x490A)
>  		val = dw_pcie_enable_ecrc(val);
>  	dw_pcie_writel_ob_unroll(pci, index, PCIE_ATU_UNR_REGION_CTRL1, val);
> @@ -315,6 +315,7 @@ static void __dw_pcie_prog_outbound_atu(struct dw_pcie *pci, u8 func_no,
>  					u64 pci_addr, u64 size)
>  {
>  	u32 retries, val;
> +	u64 limit_addr;
>
>  	if (pci->ops && pci->ops->cpu_addr_fixup)
>  		cpu_addr = pci->ops->cpu_addr_fixup(pci, cpu_addr);
> @@ -325,6 +326,8 @@ static void __dw_pcie_prog_outbound_atu(struct dw_pcie *pci, u8 func_no,
>  		return;
>  	}
>
> +	limit_addr = cpu_addr + size - 1;
> +
>  	dw_pcie_writel_dbi(pci, PCIE_ATU_VIEWPORT,
>  			   PCIE_ATU_REGION_OUTBOUND | index);
>  	dw_pcie_writel_dbi(pci, PCIE_ATU_LOWER_BASE,
> @@ -332,17 +335,18 @@ static void __dw_pcie_prog_outbound_atu(struct dw_pcie *pci, u8 func_no,
>  	dw_pcie_writel_dbi(pci, PCIE_ATU_UPPER_BASE,
>  			   upper_32_bits(cpu_addr));
>  	dw_pcie_writel_dbi(pci, PCIE_ATU_LIMIT,
> -			   lower_32_bits(cpu_addr + size - 1));
> +			   lower_32_bits(limit_addr));
>  	if (pci->version >= 0x460A)
>  		dw_pcie_writel_dbi(pci, PCIE_ATU_UPPER_LIMIT,
> -				   upper_32_bits(cpu_addr + size - 1));
> +				   upper_32_bits(limit_addr));
>  	dw_pcie_writel_dbi(pci, PCIE_ATU_LOWER_TARGET,
>  			   lower_32_bits(pci_addr));
>  	dw_pcie_writel_dbi(pci, PCIE_ATU_UPPER_TARGET,
>  			   upper_32_bits(pci_addr));
>  	val = type | PCIE_ATU_FUNC_NUM(func_no);
> -	val = ((upper_32_bits(size - 1)) && (pci->version >= 0x460A)) ?
> -		val | PCIE_ATU_INCREASE_REGION_SIZE : val;
> +	if (upper_32_bits(limit_addr) > upper_32_bits(cpu_addr) &&
> +	    pci->version >= 0x460A)
> +		val |= PCIE_ATU_INCREASE_REGION_SIZE;
>  	if (pci->version == 0x490A)
>  		val = dw_pcie_enable_ecrc(val);
>  	dw_pcie_writel_dbi(pci, PCIE_ATU_CR1, val);
> --
> 2.35.1
>
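
For reference, the arithmetic for a region with base 0xf0000000 and size
0x20000000 can be checked with a small userspace sketch like the one below.
It is not driver code: upper32()/lower32() are local stand-ins for the
kernel's upper_32_bits()/lower_32_bits(), and the other helper names
(atu_limit_addr, increase_by_size, increase_by_limit, end_without_flag) are
made up for illustration. It assumes the flag-cleared end address is formed
from the upper base address plus the 32-bit limit register, as described in
the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for the kernel's upper_32_bits()/lower_32_bits(). */
static inline uint32_t upper32(uint64_t v) { return (uint32_t)(v >> 32); }
static inline uint32_t lower32(uint64_t v) { return (uint32_t)v; }

/* Last address covered by the region, same formula as limit_addr in the patch. */
static inline uint64_t atu_limit_addr(uint64_t cpu_addr, uint64_t size)
{
	assert(size != 0);
	return cpu_addr + size - 1;
}

/* Old check: looks at the region size only. */
static inline bool increase_by_size(uint64_t size)
{
	return upper32(size - 1) != 0;
}

/* New check: compares the upper halves of the limit and base addresses. */
static inline bool increase_by_limit(uint64_t cpu_addr, uint64_t size)
{
	return upper32(atu_limit_addr(cpu_addr, size)) > upper32(cpu_addr);
}

/* End address seen with the flag cleared: upper base + 32-bit limit register. */
static inline uint64_t end_without_flag(uint64_t cpu_addr, uint64_t size)
{
	uint64_t limit = atu_limit_addr(cpu_addr, size);

	return ((uint64_t)upper32(cpu_addr) << 32) | lower32(limit);
}
```

For base 0xf0000000 and size 0x20000000, atu_limit_addr() gives 0x10fffffff,
increase_by_size() returns false while increase_by_limit() returns true, and
end_without_flag() gives 0x0fffffff, which lies below the region start.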