On Tue, Jul 09, 2019 at 03:15:59AM +0200, marek.vasut@xxxxxxxxx wrote:
> From: Marek Vasut <marek.vasut+renesas@xxxxxxxxx>
>
> Due to hardware constraints, the size of each inbound range entry
> populated into the controller cannot be larger than the alignment
> of the entry's start address. Currently, the alignment for each
> "dma-ranges" inbound range is calculated only once per range, and
> the increment for programming the controller is also derived from
> it only once. Thus, a "dma-ranges" entry describing memory at
> 0x48000000 with size 0x38000000 would lead to multiple controller
> entries, each 0x08000000 bytes long.
>
> This is inefficient, especially since advancing the start address
> by each mapped entry's size increases its alignment. This patch
> moves the alignment calculation into the loop populating the
> controller entries, thus updating the alignment for each entry.
>
> Signed-off-by: Marek Vasut <marek.vasut+renesas@xxxxxxxxx>
> Cc: Geert Uytterhoeven <geert+renesas@xxxxxxxxx>
> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@xxxxxxx>
> Cc: Wolfram Sang <wsa@xxxxxxxxxxxxx>
> Cc: linux-renesas-soc@xxxxxxxxxxxxxxx
> To: linux-pci@xxxxxxxxxxxxxxx

Reviewed-by: Simon Horman <horms+renesas@xxxxxxxxxxxx>

> ---
>  drivers/pci/controller/pcie-rcar.c | 33 +++++++++++++++---------------
>  1 file changed, 17 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/pci/controller/pcie-rcar.c b/drivers/pci/controller/pcie-rcar.c
> index 938adff4148f..48f361b5d690 100644
> --- a/drivers/pci/controller/pcie-rcar.c
> +++ b/drivers/pci/controller/pcie-rcar.c
> @@ -1029,25 +1029,26 @@ static int rcar_pcie_inbound_ranges(struct rcar_pcie *pcie,
>  	if (restype & IORESOURCE_PREFETCH)
>  		flags |= LAM_PREFETCH;
>
> -	/*
> -	 * If the size of the range is larger than the alignment of the start
> -	 * address, we have to use multiple entries to perform the mapping.
> -	 */
> -	if (cpu_addr > 0) {
> -		unsigned long nr_zeros = __ffs64(cpu_addr);
> -		u64 alignment = 1ULL << nr_zeros;
> +	while (cpu_addr < cpu_end) {
> +		/*
> +		 * If the size of the range is larger than the alignment of
> +		 * the start address, we have to use multiple entries to
> +		 * perform the mapping.
> +		 */
> +		if (cpu_addr > 0) {
> +			unsigned long nr_zeros = __ffs64(cpu_addr);
> +			u64 alignment = 1ULL << nr_zeros;
>
> -		size = min(range->size, alignment);
> -	} else {
> -		size = range->size;
> -	}
> -	/* Hardware supports max 4GiB inbound region */
> -	size = min(size, 1ULL << 32);
> +			size = min(range->size, alignment);
> +		} else {
> +			size = range->size;
> +		}
> +		/* Hardware supports max 4GiB inbound region */
> +		size = min(size, 1ULL << 32);
>
> -	mask = roundup_pow_of_two(size) - 1;
> -	mask &= ~0xf;
> +		mask = roundup_pow_of_two(size) - 1;
> +		mask &= ~0xf;
>
> -	while (cpu_addr < cpu_end) {
>  		/*
>  		 * Set up 64-bit inbound regions as the range parser doesn't
>  		 * distinguish between 32 and 64-bit types.
> --
> 2.20.1
>
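
For anyone wanting to see the effect of the per-entry recalculation, here
is a minimal user-space sketch of the loop above. It is not kernel code:
main(), the hard-coded range, and the use of cpu_addr & -cpu_addr in place
of 1ULL << __ffs64(cpu_addr) are illustrative stand-ins, and the mask and
hardware programming are omitted.

#include <stdio.h>
#include <stdint.h>

int main(void)
{
	/* Example range from the commit message. */
	uint64_t cpu_addr = 0x48000000ULL;
	uint64_t range_size = 0x38000000ULL;
	uint64_t cpu_end = cpu_addr + range_size;
	int entries = 0;

	while (cpu_addr < cpu_end) {
		uint64_t size;

		if (cpu_addr > 0) {
			/*
			 * Alignment of the current start address;
			 * lowest set bit == 1ULL << __ffs64(cpu_addr).
			 */
			uint64_t alignment = cpu_addr & -cpu_addr;

			size = range_size < alignment ? range_size : alignment;
		} else {
			size = range_size;
		}
		/* Hardware supports max 4GiB inbound region. */
		if (size > (1ULL << 32))
			size = 1ULL << 32;

		printf("entry %d: addr 0x%llx size 0x%llx\n", ++entries,
		       (unsigned long long)cpu_addr,
		       (unsigned long long)size);
		cpu_addr += size;
	}
	return 0;
}

Run against that range, it prints three entries (0x08000000, 0x10000000,
0x20000000), where the old once-per-range alignment would have produced
seven 0x08000000 entries.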