Hi Marc, Thanks for your review. Please see my comments inline. Thanks, Minghuan > -----Original Message----- > From: Marc Zyngier [mailto:marc.zyngier@xxxxxxx] > Sent: Thursday, January 05, 2017 11:33 PM > To: M.H. Lian <minghuan.lian@xxxxxxx>; linux-arm- > kernel@xxxxxxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; > devicetree@xxxxxxxxxxxxxxx > Cc: Rob Herring <robh@xxxxxxxxxx>; Jason Cooper > <jason@xxxxxxxxxxxxxx>; Roy Zang <roy.zang@xxxxxxx>; Mingkai Hu > <mingkai.hu@xxxxxxx>; Stuart Yoder <stuart.yoder@xxxxxxx>; Leo Li > <leoyang.li@xxxxxxx>; Scott Wood <scott.wood@xxxxxxx> > Subject: Re: [PATCH v2,9/9] irqchip/ls-scfg-msi: add MSI affinity support > > On 05/01/17 08:10, Minghuan Lian wrote: > > For LS1046a and LS1043a v1.1, the MSI controller has 4 MSIRs and 4 > > CPUs. A GIC SPI interrupt of MSIR can be associated with a CPU. > > When changing MSI interrupt affinity, this MSI will be moved to the > > corresponding MSIR and MSI message data will be changed according to > > MSIR. when requesting a MSI, the bits of all 4 MSIR will be reserved. > > The parameter 'msi_affinity_flag' is provide to change this mode. > > "lsmsi=no-affinity" will disable affinity, all MSI can only be > > associated with CPU 0. > > > > Signed-off-by: Minghuan Lian <Minghuan.Lian@xxxxxxx> > > --- > > v2-v1: > > - None > > > > drivers/irqchip/irq-ls-scfg-msi.c | 75 > > ++++++++++++++++++++++++++++++++++++--- > > 1 file changed, 70 insertions(+), 5 deletions(-) > > > > diff --git a/drivers/irqchip/irq-ls-scfg-msi.c > > b/drivers/irqchip/irq-ls-scfg-msi.c > > index dc19569..753fe39 100644 > > --- a/drivers/irqchip/irq-ls-scfg-msi.c > > +++ b/drivers/irqchip/irq-ls-scfg-msi.c > > @@ -40,6 +40,7 @@ struct ls_scfg_msir { > > unsigned int gic_irq; > > unsigned int bit_start; > > unsigned int bit_end; > > + unsigned int srs; /* Shared interrupt register select */ > > void __iomem *reg; > > }; > > > > @@ -70,6 +71,19 @@ struct ls_scfg_msi { > > .chip = &ls_scfg_msi_irq_chip, > > }; > > > > +static int msi_affinity_flag = 1; > > + > > +static int __init early_parse_ls_scfg_msi(char *p) { > > + if (p && strncmp(p, "no-affinity", 11) == 0) > > + msi_affinity_flag = 0; > > + else > > + msi_affinity_flag = 1; > > + > > + return 0; > > +} > > +early_param("lsmsi", early_parse_ls_scfg_msi); > > What is the point of this option? If feels like an unnecessary complexity. [Minghuan Lian] For LS1043a rev1.1, there are only 8 MSI interrupts for each MSI controller when enable affinity. (32 MSI interrupts are evenly distributed to 4 cores). 3 MSI controllers can only provide 32 MSI interrupts. When disable affinity, the MSI interrupts number of 3 controllers can up to 32*3= 96. 32 MSI interrupts may be not enough. For example: an Ethernet card with 4 ports, each port needs 4 RX, 4TX and 1 error interrupts, totally need 4*(4+4+1) 36 MSI interrupts. With the parameter "lsmsi=no-affinity" this Ethernet card can work well, although the performance is poor. > > > + > > static void ls_scfg_msi_compose_msg(struct irq_data *data, struct > > msi_msg *msg) { > > struct ls_scfg_msi *msi_data = irq_data_get_irq_chip_data(data); > @@ > > -77,12 +91,43 @@ static void ls_scfg_msi_compose_msg(struct irq_data > *data, struct msi_msg *msg) > > msg->address_hi = upper_32_bits(msi_data->msiir_addr); > > msg->address_lo = lower_32_bits(msi_data->msiir_addr); > > msg->data = data->hwirq; > > + > > + if (msi_affinity_flag) { > > + u32 msir_index; > > + > > + msir_index = cpumask_first(data->common->affinity); > > + if (msir_index >= msi_data->msir_num) > > + msir_index = 0; > > How can this happen? [Minghuan Lian] This will never happen. I will remove this "if" sentence. > > > + > > + msg->data |= msir_index; > > How do you guarantee that the low bits were clear? Please document the > way you encode your MSI payload. [Minghuan Lian] the available message data is a bytes. 7 6 5 4 3 2 1 0 | - | ibs | srs | SRS bit 0-1 is used to select MSIR register/CPU. A MSIR is associated with a CPU. IBS bit2-6 of ls1046, bit2-4 for ls1043 is used to select bit of the MSIR. When enable affinity, only bits of MSIR0(srs = 0 cpu0) is be freed for requesting MSI. all bits of the MSIR1-3(cpu1-3) are reserved to be sure this MSI can be successfully associated with cpu 1-3. So, When requesting a MSI interrupt, srs always is 0. The hwirq always equals bits number of the MSIR0(SRS = 0). First, MSI message data payload equals hwirq, and then plus the real SRS according to smp_affinity. This MSI interrupt will be routed to the corresponding the MSIR/CPU. > > > + } > > } > > > > static int ls_scfg_msi_set_affinity(struct irq_data *irq_data, > > const struct cpumask *mask, bool force) { > > - return -EINVAL; > > + struct ls_scfg_msi *msi_data = irq_data_get_irq_chip_data(irq_data); > > + u32 cpu; > > + > > + if (!msi_affinity_flag) > > + return -EINVAL; > > + > > + if (!force) > > + cpu = cpumask_any_and(mask, cpu_online_mask); > > + else > > + cpu = cpumask_first(mask); > > + > > + if (cpu >= msi_data->msir_num) > > + return -EINVAL; > > + > > + if (msi_data->msir[cpu].gic_irq <= 0) { > > + pr_warn("cannot bind the irq to cpu%d\n", cpu); > > Please don't. Returning an error is enough. If you really want to have > something, turn it into a proper debug message. [Minghuan Lian] ok, I will change it. > > > + return -EINVAL; > > + } > > + > > + cpumask_copy(irq_data->common->affinity, mask); > > + > > + return IRQ_SET_MASK_OK; > > } > > > > static struct irq_chip ls_scfg_msi_parent_chip = { @@ -158,7 +203,7 > > @@ static void ls_scfg_msi_irq_handler(struct irq_desc *desc) > > > > for_each_set_bit_from(pos, &val, size) { > > hwirq = ((msir->bit_end - pos) << msi_data->cfg->ibs_shift) | > > - msir->index; > > + msir->srs; > > virq = irq_find_mapping(msi_data->parent, hwirq); > > if (virq) > > generic_handle_irq(virq); > > @@ -221,10 +266,19 @@ static int ls_scfg_msi_setup_hwirq(struct > ls_scfg_msi *msi_data, int index) > > ls_scfg_msi_irq_handler, > > msir); > > > > + if (msi_affinity_flag) { > > + /* Associate MSIR interrupt to the cpu */ > > + irq_set_affinity(msir->gic_irq, get_cpu_mask(index)); > > + msir->srs = 0; /* This value is determined by the CPU */ > > + } else > > + msir->srs = index; > > + > > /* Release the hwirqs corresponding to this MSIR */ > > - for (i = 0; i < msi_data->cfg->msir_irqs; i++) { > > - hwirq = i << msi_data->cfg->ibs_shift | msir->index; > > - bitmap_clear(msi_data->used, hwirq, 1); > > + if (!msi_affinity_flag || msir->index == 0) { > > + for (i = 0; i < msi_data->cfg->msir_irqs; i++) { > > + hwirq = i << msi_data->cfg->ibs_shift | msir->index; > > + bitmap_clear(msi_data->used, hwirq, 1); > > + } > > } > > > > return 0; > > @@ -316,6 +370,17 @@ static int ls_scfg_msi_probe(struct > platform_device *pdev) > > bitmap_set(msi_data->used, 0, msi_data->irqs_num); > > > > msi_data->msir_num = of_irq_count(pdev->dev.of_node); > > + > > + if (msi_affinity_flag) { > > + u32 cpu_num; > > + > > + cpu_num = num_possible_cpus(); > > + if (msi_data->msir_num >= cpu_num) > > + msi_data->msir_num = cpu_num; > > + else > > + msi_affinity_flag = 0; > > + } > > + > > msi_data->msir = devm_kcalloc(&pdev->dev, msi_data->msir_num, > > sizeof(*msi_data->msir), > > GFP_KERNEL); > > > > This is a very confusing patch. Please get rid of this useless option and > document how you encode the routing in the hwirq. [Minghuan Lian] Both LS1021a and LS1043av1.0 have only a MSIR, a gic interrupt. MSI controllers cannot support affinity. Then LS1046a/LS1043av1.1 extends MSIR number to 4 same to cpu number. So each MSIR with a GIC interrupt can be associated with a cpu. To keep simple, the first patch for ls1046 and ls1043av1.1 keep the same way with ls1021 and ls1043av1.0 that does not support affinity and all interrupts of MSIR0-3 are different and can be used for requesting MSI interrupts. This patch is to enable affinity, that means, for ls1046a and ls1043av1.1, the same bit of MSIR0-3 will be looked as one interrupt using the same hwirq. And MSIRN only is used when the MSI interrupt is associated with the corresponding cpu. > > Thanks, > > M. > -- > Jazz is not dead. It just smells funny... -- To unsubscribe from this list: send the line "unsubscribe devicetree" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html