On 19 March 2018 at 15:27, Ard Biesheuvel <ard.biesheuvel@xxxxxxxxxx> wrote: > Commit 4f2c7583e33e upstream. > > When struct its_device instances are created, the nr_ites member > will be set to a power of 2 that equals or exceeds the requested > number of MSIs passed to the msi_prepare() callback. At the same > time, the LPI map is allocated to be some multiple of 32 in size, > where the allocated size may be less than the requested size > depending on whether a contiguous range of sufficient size is > available in the global LPI bitmap. > > This may result in the situation where the nr_ites < nr_lpis, and > since nr_ites is what we program into the hardware when we map the > device, the additional LPIs will be non-functional. > > For bog standard hardware, this does not really matter. However, > in cases where ITS device IDs are shared between different PCIe > devices, we may end up allocating these additional LPIs without > taking into account that they don't actually work. > > So let's make nr_ites at least 32. This ensures that all allocated > LPIs are 'live', and that its_alloc_device_irq() will fail when > attempts are made to allocate MSIs beyond what was allocated in > the first place. > > Signed-off-by: Ard Biesheuvel <ard.biesheuvel@xxxxxxxxxx> > [maz: updated comment] > Signed-off-by: Marc Zyngier <marc.zyngier@xxxxxxx> > [ardb: trivial tweak of unrelated context] > Signed-off-by: Ard Biesheuvel <ard.biesheuvel@xxxxxxxxxx> > --- Please apply to v4.9 > drivers/irqchip/irq-gic-v3-its.c | 9 ++++----- > 1 file changed, 4 insertions(+), 5 deletions(-) > > diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c > index acb9d250a905..ac15e5d5d9b2 100644 > --- a/drivers/irqchip/irq-gic-v3-its.c > +++ b/drivers/irqchip/irq-gic-v3-its.c > @@ -684,7 +684,7 @@ static struct irq_chip its_irq_chip = { > * This gives us (((1UL << id_bits) - 8192) >> 5) possible allocations. > */ > #define IRQS_PER_CHUNK_SHIFT 5 > -#define IRQS_PER_CHUNK (1 << IRQS_PER_CHUNK_SHIFT) > +#define IRQS_PER_CHUNK (1UL << IRQS_PER_CHUNK_SHIFT) > > static unsigned long *lpi_bitmap; > static u32 lpi_chunks; > @@ -1320,11 +1320,10 @@ static struct its_device *its_create_device(struct its_node *its, u32 dev_id, > > dev = kzalloc(sizeof(*dev), GFP_KERNEL); > /* > - * At least one bit of EventID is being used, hence a minimum > - * of two entries. No, the architecture doesn't let you > - * express an ITT with a single entry. > + * We allocate at least one chunk worth of LPIs bet device, > + * and thus that many ITEs. The device may require less though. > */ > - nr_ites = max(2UL, roundup_pow_of_two(nvecs)); > + nr_ites = max(IRQS_PER_CHUNK, roundup_pow_of_two(nvecs)); > sz = nr_ites * its->ite_size; > sz = max(sz, ITS_ITT_ALIGN) + ITS_ITT_ALIGN - 1; > itt = kzalloc(sz, GFP_KERNEL); > -- > 2.11.0 >