On Fri, Nov 20, 2015 at 01:45:37PM -0500, Mohsin Zaidi wrote:
> Some more observations.
>
> When I said yesterday that changing the unbanned CPUs to 19/55 or
> 18/54 worked correctly for all IRQs, I failed to notice that of the
> 256 IRQs for the interfaces, 3 would never have their affinities get
> updated correctly.
>
> For example, with the banning mask set to "ff,ff7fffff,fff7ffff", the
> smp_affinity_list values for the last 10 IRQs are as follows:
>
> 19
> 55
> 26
> 55
> 24
> 55
> 19
> 19
> 19
> 22
>
> 3 of these are set to whatever was set for them last (my last test was
> to unban all CPUs). I see this pattern repeated every time.
>
> I changed the test to unban 18-19,54-55 at the same time, and this
> problem went away. When I unbanned just 19/55 and reduced the number
> of queues per interface by one, the problem also went away.
>
> It's as if 2 CPUs can't be successfully assigned 256 IRQs. This also
> holds true if the CPUs are not siblings (e.g. 19/54).
>
I wonder if this is a hardware limitation (i.e. if you're hitting the upper
limit of the eligible cpu set in an MSI write or some such). If you manually
set all irqs to a single cpu, what happens?

Neil

> So there are two dimensions to the problem. One is choosing CPUs just
> on NUMA node 0 doesn't work, and the other is that assigning 256 IRQs
> to 2 CPUs on NUMA node 1 doesn't work.
> Regards,
> Mohsin
>
>
> On Fri, Nov 20, 2015 at 9:45 AM, Neil Horman <nhorman at tuxdriver.com> wrote:
> > On Thu, Nov 19, 2015 at 01:32:58PM -0500, Mohsin Zaidi wrote:
> >> Thanks, Neil. I'll have the results for you shortly.
> >>
> >> I wanted to point out that each of the 4 interfaces on the server have
> >> 64 queues, so there are a total of 256 queues. Also, the banning is
> >> attempting to direct interrupts to just two processors (#1 and #37) on
> >> the same NUMA node, which is also not the same as the NUMA node that
> >> "owns" the interface I am looking at (eth03).
> >>
> >> Does any of this matter?
> > It really shouldn't, but given that I'm at a loss to explain the behavior yet,
> > anything is on the table.
> > Neil
> >
> >> Regards,
> >> Mohsin
> >>
> >>
> >> On Thu, Nov 19, 2015 at 9:58 AM, Neil Horman <nhorman at tuxdriver.com> wrote:
> >> > On Wed, Nov 18, 2015 at 10:42:41AM -0500, Mohsin Zaidi wrote:
> >> >> I'm using the irqbalance daemon with the following config file. The
> >> >> only thing I've changed is the banned CPUs list, and I've banned all
> >> >> but CPUs #1 and #37. Interrupts *never* go to #1, and go to #18 and
> >> >> #37, even though #18 has also been banned.
> >> >>
> >> >> # irqbalance is a daemon process that distributes interrupts across
> >> >> # CPUS on SMP systems. The default is to rebalance once every 10
> >> >> # seconds. This is the environment file that is specified to systemd via the
> >> >> # EnvironmentFile key in the service unit file (or via whatever method the init
> >> >> # system you're using has.
> >> >> #
> >> >> # ONESHOT=yes
> >> >> # after starting, wait for a minute, then look at the interrupt
> >> >> # load and balance it once; after balancing exit and do not change
> >> >> # it again.
> >> >> #IRQBALANCE_ONESHOT=
> >> >>
> >> >> #
> >> >> # IRQBALANCE_BANNED_CPUS
> >> >> # 64 bit bitmask which allows you to indicate which cpu's should
> >> >> # be skipped when rebalancing irqs. Cpu numbers which have their
> >> >> # corresponding bits set to one in this mask will not have any
> >> >> # irq's assigned to them on rebalance
> >> >> #
> >> >> #IRQBALANCE_BANNED_CPUS=
> >> >> IRQBALANCE_BANNED_CPUS=000000ff,ffffffdf,fffffffd
> >> >>
> >> >> #
> >> >> # IRQBALANCE_ARGS
> >> >> # append any args here to the irqbalance daemon as documented in the man page
> >> >> #
> >> >> #IRQBALANCE_ARGS=
> >> >> Regards,
> >> >> Mohsin
> >> >>
> >> >>
> >> >> On Wed, Nov 18, 2015 at 10:28 AM, Neil Horman <nhorman at tuxdriver.com> wrote:
> >> >> > On Wed, Nov 18, 2015 at 10:04:56AM -0500, Mohsin Zaidi wrote:
> >> >> >> Sorry about that, Neil.
> >> >> >>
> >> >> >> I haven't specified any hint policy in IRQBALANCE_ARGS (for the daemon).
> >> >> >> Regards,
> >> >> >> Mohsin
> >> >> >>
> >> >> > Ok, well, I'm at a bit of a loss. irqbalance, based on your output from the
> >> >> > debug log, is working properly, presuming you actually listed cpus 18 and 37 as
> >> >> > your only unbanned one, which you indicate is the opposite of what you've
> >> >> > configured.
> >> >> >
> >> >> > Can you please send me the command line you use to start irqbalance?
> >> >> >
> >> >> > Neil
> >> >> >
> >> >> >>
> >> >> >> On Wed, Nov 18, 2015 at 6:36 AM, Neil Horman <nhorman at tuxdriver.com> wrote:
> >> >> >> > On Fri, Nov 13, 2015 at 04:39:08PM -0500, Neil Horman wrote:
> >> >> >> >> On Fri, Nov 13, 2015 at 01:39:20PM -0500, Mohsin Zaidi wrote:
> >> >> >> >> > Thanks for your reply, Neil.
> >> >> >> >> >
> >> >> >> >> > Yes, when I manually set the irq affinity to avoid #18, it works.
> >> >> >> >> >
> >> >> >> >> > I just downloaded and applied the latest irqbalance code, but it's
> >> >> >> >> > showing the same behavior.
> >> >> >> >> >
> >> >> >> >> What hint policy are you using?
> >> >> >> >>
> >> >> >> >> Neil
> >> >> >> >>
> >> >> >> > Ping, any response regarding hint policy?
> >> >> >> >
> >> >> >> > Neil
> >> >> >> >
> >> >> >>
> >> >>
> >> >
> >> > I'm at something of a loss here. I can see no reason why this would fail on
> >> > only one system. In an effort to get additional data, please apply this patch,
> >> > run irqbalance in debug mode and post the output please.
> >> >
> >> > Thanks!
> >> > Neil
> >> >
> >> >
> >> > diff --git a/activate.c b/activate.c
> >> > index c8453d5..d92e770 100644
> >> > --- a/activate.c
> >> > +++ b/activate.c
> >> > @@ -113,6 +113,7 @@ static void activate_mapping(struct irq_info *info, void *data __attribute__((un
> >> >                 return;
> >> >
> >> >         cpumask_scnprintf(buf, PATH_MAX, applied_mask);
> >> > +       printf("Applying mask for irq %d: %s\n", info->irq, buf);
> >> >         fprintf(file, "%s", buf);
> >> >         fclose(file);
> >> >         info->moved = 0; /*migration is done*/
> >> >
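
For reference, the banned-CPU masks quoted in this thread can be checked by hand with a short script. What follows is only a minimal Python sketch and is not part of irqbalance. The helper name unbanned_cpus is hypothetical, and the CPU count of 72 is an assumption inferred from the width of the quoted masks (the thread never states how many CPUs the machine has). It also assumes the masks are comma-separated 32-bit hex words with the most significant word first, and that a set bit means "banned", as the config file comment describes.

```python
def unbanned_cpus(mask, ncpus):
    """Return the CPU numbers whose bit is 0 (not banned) in the mask.

    mask  -- comma-separated hex words, most significant first (assumed layout)
    ncpus -- number of CPUs to examine (assumed; not stated in the thread)
    """
    # Pad each word to 8 hex digits so a short leading word such as "ff"
    # still lands in the right bit positions once the words are joined.
    hex_digits = "".join(word.zfill(8) for word in mask.split(","))
    value = int(hex_digits, 16)
    # Bit N of the combined value corresponds to CPU N.
    return [cpu for cpu in range(ncpus) if not (value >> cpu) & 1]


if __name__ == "__main__":
    # Mask from the Nov 20 message and mask from the quoted config file.
    print(unbanned_cpus("ff,ff7fffff,fff7ffff", 72))
    print(unbanned_cpus("000000ff,ffffffdf,fffffffd", 72))
```

Under those assumptions the first mask leaves CPUs 19 and 55 unbanned and the second leaves CPUs 1 and 37 unbanned, which agrees with what Mohsin describes in the messages above.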