On Fri, Aug 09, 2019 at 07:57:41PM +0200, marek.vasut@xxxxxxxxx wrote:
> From: Marek Vasut <marek.vasut+renesas@xxxxxxxxx>
>
> Due to hardware constraints, the size of each inbound range entry
> populated into the controller cannot be larger than the alignment
> of the entry's start address. Currently, the alignment for each
> "dma-ranges" inbound range is calculated only once for each range
> and the increment for programming the controller is also derived
> from it only once. Thus, a "dma-ranges" entry describing a memory
> at 0x48000000 and size 0x38000000 would lead to multiple controller
> entries, each 0x08000000 long.
>
> This is inefficient, especially considering that by adding the size
> to the start address, the alignment increases. This patch moves the
> alignment calculation into the loop populating the controller entries,
> thus updating the alignment for each controller entry.
>
> Signed-off-by: Marek Vasut <marek.vasut+renesas@xxxxxxxxx>
> Cc: Geert Uytterhoeven <geert+renesas@xxxxxxxxx>
> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@xxxxxxx>
> Cc: Wolfram Sang <wsa@xxxxxxxxxxxxx>
> Cc: linux-renesas-soc@xxxxxxxxxxxxxxx
> To: linux-pci@xxxxxxxxxxxxxxx
> Reviewed-by: Simon Horman <horms+renesas@xxxxxxxxxxxx>
> ---
> V2: Update on top of 1/3
> V3: No change
> ---
>  drivers/pci/controller/pcie-rcar.c | 37 +++++++++++++++---------------
>  1 file changed, 19 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/pci/controller/pcie-rcar.c b/drivers/pci/controller/pcie-rcar.c
> index e2735005ffd3..d820aa64d0b7 100644
> --- a/drivers/pci/controller/pcie-rcar.c
> +++ b/drivers/pci/controller/pcie-rcar.c
> @@ -1029,30 +1029,31 @@ static int rcar_pcie_inbound_ranges(struct rcar_pcie *pcie,
>  	if (restype & IORESOURCE_PREFETCH)
>  		flags |= LAM_PREFETCH;
>
> -	/*
> -	 * If the size of the range is larger than the alignment of the start
> -	 * address, we have to use multiple entries to perform the mapping.
> -	 */
> -	if (cpu_addr > 0) {
> -		unsigned long nr_zeros = __ffs64(cpu_addr);
> -		u64 alignment = 1ULL << nr_zeros;
> -
> -		size = min(range->size, alignment);
> -	} else {
> -		size = range->size;
> -	}
> -	/* Hardware supports max 4GiB inbound region */
> -	size = min(size, 1ULL << 32);
> -
> -	mask = roundup_pow_of_two(size) - 1;
> -	mask &= ~0xf;
> -
>  	while (cpu_addr < cpu_end) {
>  		if (idx >= MAX_NR_INBOUND_MAPS - 1) {
>  			dev_warn(pcie->dev,
>  				 "Too many inbound regions, not all are mapped.\n");
>  			break;
>  		}
> +		/*
> +		 * If the size of the range is larger than the alignment of
> +		 * the start address, we have to use multiple entries to
> +		 * perform the mapping.
> +		 */
> +		if (cpu_addr > 0) {
> +			unsigned long nr_zeros = __ffs64(cpu_addr);
> +			u64 alignment = 1ULL << nr_zeros;
> +
> +			size = min(range->size, alignment);
> +		} else {
> +			size = range->size;
> +		}
> +		/* Hardware supports max 4GiB inbound region */
> +		size = min(size, 1ULL << 32);
> +
> +		mask = roundup_pow_of_two(size) - 1;
> +		mask &= ~0xf;
> +

Indeed, as the cpu address increases so does the size you can fit in each
window.

Reviewed-by: Andrew Murray <andrew.murray@xxxxxxx>

(Though you'll have to rebase this without patch 2).

>  		/*
>  		 * Set up 64-bit inbound regions as the range parser doesn't
>  		 * distinguish between 32 and 64-bit types.
> --
> 2.20.1
>
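
To make the effect on the commit message's example concrete, here is a small
userspace sketch, not driver code. The helper names (addr_alignment,
count_windows_fixed, count_windows_per_entry) are made up for illustration.
It mirrors the size calculation from the patch, including the clamp against
the full range size, but it ignores the MAX_NR_INBOUND_MAPS limit and the
mask programming, and it assumes start + range_size does not overflow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_WINDOW (1ULL << 32)	/* hardware limit quoted in the patch */

/* Largest power of two dividing addr, i.e. 1ULL << __ffs64(addr). */
static inline uint64_t addr_alignment(uint64_t addr)
{
	assert(addr != 0);
	return addr & (~addr + 1);
}

static inline uint64_t min_u64(uint64_t a, uint64_t b)
{
	return a < b ? a : b;
}

/* Before the patch: window size derived once from the range start. */
size_t count_windows_fixed(uint64_t start, uint64_t range_size)
{
	uint64_t end = start + range_size;
	uint64_t size = start ? min_u64(range_size, addr_alignment(start))
			      : range_size;
	size_t n = 0;

	size = min_u64(size, MAX_WINDOW);
	for (uint64_t addr = start; addr < end; addr += size)
		n++;

	return n;
}

/* After the patch: window size recomputed for every entry. */
size_t count_windows_per_entry(uint64_t start, uint64_t range_size)
{
	uint64_t end = start + range_size;
	uint64_t addr = start;
	size_t n = 0;

	while (addr < end) {
		uint64_t size = addr ? min_u64(range_size, addr_alignment(addr))
				     : range_size;

		size = min_u64(size, MAX_WINDOW);
		addr += size;
		n++;
	}

	return n;
}
```

For the range at 0x48000000 with size 0x38000000, the fixed variant needs 7
windows of 0x08000000 each, while the per-entry variant needs 3: 0x08000000
at 0x48000000, 0x10000000 at 0x50000000 and 0x20000000 at 0x60000000.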