Thanks, Neil. I'll have the results for you shortly.

I wanted to point out that each of the 4 interfaces on the server has 64
queues, so there are 256 queues in total. Also, the banning is attempting to
direct interrupts to just two processors (#1 and #37) on the same NUMA node,
which is not the NUMA node that "owns" the interface I am looking at (eth03).
Does any of this matter?
Regards,
Mohsin

On Thu, Nov 19, 2015 at 9:58 AM, Neil Horman <nhorman at tuxdriver.com> wrote:
> On Wed, Nov 18, 2015 at 10:42:41AM -0500, Mohsin Zaidi wrote:
>> I'm using the irqbalance daemon with the following config file. The
>> only thing I've changed is the banned CPUs list, and I've banned all
>> but CPUs #1 and #37. Interrupts *never* go to #1; they go to #18 and
>> #37, even though #18 has also been banned.
>>
>> # irqbalance is a daemon process that distributes interrupts across
>> # CPUs on SMP systems. The default is to rebalance once every 10
>> # seconds. This is the environment file that is specified to systemd via the
>> # EnvironmentFile key in the service unit file (or via whatever method the
>> # init system you're using has).
>> #
>> # ONESHOT=yes
>> # after starting, wait for a minute, then look at the interrupt
>> # load and balance it once; after balancing, exit and do not change
>> # it again.
>> #IRQBALANCE_ONESHOT=
>>
>> #
>> # IRQBALANCE_BANNED_CPUS
>> # 64-bit bitmask which allows you to indicate which CPUs should
>> # be skipped when rebalancing IRQs. CPU numbers which have their
>> # corresponding bits set to one in this mask will not have any
>> # IRQs assigned to them on rebalance.
>> #
>> #IRQBALANCE_BANNED_CPUS=
>> IRQBALANCE_BANNED_CPUS=000000ff,ffffffdf,fffffffd
>>
>> #
>> # IRQBALANCE_ARGS
>> # append any args here to the irqbalance daemon as documented in the man page
>> #
>> #IRQBALANCE_ARGS=
>> Regards,
>> Mohsin
>>
>>
>> On Wed, Nov 18, 2015 at 10:28 AM, Neil Horman <nhorman at tuxdriver.com> wrote:
>> > On Wed, Nov 18, 2015 at 10:04:56AM -0500, Mohsin Zaidi wrote:
>> >> Sorry about that, Neil.
>> >>
>> >> I haven't specified any hint policy in IRQBALANCE_ARGS (for the daemon).
>> >> Regards,
>> >> Mohsin
>> >>
>> > OK, well, I'm at a bit of a loss. irqbalance, based on your output from the
>> > debug log, is working properly, presuming you actually listed CPUs 18 and
>> > 37 as your only unbanned ones, which you indicate is the opposite of what
>> > you've configured.
>> >
>> > Can you please send me the command line you use to start irqbalance?
>> >
>> > Neil
>> >
>> >> On Wed, Nov 18, 2015 at 6:36 AM, Neil Horman <nhorman at tuxdriver.com> wrote:
>> >> > On Fri, Nov 13, 2015 at 04:39:08PM -0500, Neil Horman wrote:
>> >> >> On Fri, Nov 13, 2015 at 01:39:20PM -0500, Mohsin Zaidi wrote:
>> >> >> > Thanks for your reply, Neil.
>> >> >> >
>> >> >> > Yes, when I manually set the IRQ affinity to avoid #18, it works.
>> >> >> >
>> >> >> > I just downloaded and applied the latest irqbalance code, but it's
>> >> >> > showing the same behavior.
>> >> >> >
>> >> >> What hint policy are you using?
>> >> >>
>> >> >> Neil
>> >> >>
>> >> > Ping, any response regarding hint policy?
>> >> >
>> >> > Neil
>> >> >
>
> I'm at something of a loss here. I can see no reason why this would fail on
> only one system. In an effort to get additional data, please apply this
> patch, run irqbalance in debug mode, and post the output.
>
> Thanks!
> Neil
>
>
> diff --git a/activate.c b/activate.c
> index c8453d5..d92e770
> --- a/activate.c
> +++ b/activate.c
> @@ -113,6 +113,7 @@ static void activate_mapping(struct irq_info *info, void *data __attribute__((un
>  		return;
>
>  	cpumask_scnprintf(buf, PATH_MAX, applied_mask);
> +	printf("Applying mask for irq %d: %s\n", info->irq, buf);
>  	fprintf(file, "%s", buf);
>  	fclose(file);
>  	info->moved = 0; /*migration is done*/
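
For readers untangling the mask: IRQBALANCE_BANNED_CPUS is comma-separated
32-bit hex groups, most significant group first, and a set bit bans the
corresponding CPU. The sketch below is not irqbalance code; it assumes a
72-CPU machine (consistent with CPU numbers #1, #18, and #37 appearing in
this thread) and simply decodes 000000ff,ffffffdf,fffffffd to show which
CPUs the mask leaves unbanned.

/* maskcheck.c -- hypothetical standalone helper, not part of irqbalance.
 * Decodes an IRQBALANCE_BANNED_CPUS value: bit N set means CPU N is
 * banned. Groups are 32-bit words, written most significant first. */
#include <stdio.h>
#include <stdint.h>

int main(void)
{
	/* 000000ff,ffffffdf,fffffffd, stored least significant word first */
	uint32_t words[] = { 0xfffffffd, 0xffffffdf, 0x000000ff };
	int ncpus = 72; /* assumption about the machine in this thread */
	int cpu;

	for (cpu = 0; cpu < ncpus; cpu++)
		if (!((words[cpu / 32] >> (cpu % 32)) & 1))
			printf("CPU %d is not banned\n", cpu);
	return 0;
}

Compiled and run, this prints only "CPU 1 is not banned" and "CPU 37 is not
banned", so the mask itself matches what Mohsin intended; interrupts landing
on #18 are not explained by a typo in the mask.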
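
The manual workaround Mohsin mentions amounts to writing the same mask
format into /proc/irq/<N>/smp_affinity for each of eth03's queue interrupts.
A minimal sketch follows; the IRQ number 87 is a made-up placeholder, so
substitute real vectors from /proc/interrupts.

/* setaff.c -- illustrative only; requires root. IRQ 87 is a
 * placeholder, not a real vector from this thread. */
#include <stdio.h>

int main(void)
{
	FILE *f = fopen("/proc/irq/87/smp_affinity", "w");

	if (!f) {
		perror("fopen /proc/irq/87/smp_affinity");
		return 1;
	}
	/* bit 1 (CPU #1) and bit 37 (CPU #37): 00000020,00000002 */
	fprintf(f, "00000020,00000002\n");
	return fclose(f) ? 1 : 0;
}

After writing, reading /proc/irq/<N>/smp_affinity back confirms whether the
kernel accepted the mask.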