On Mon, Oct 12, 2015 at 11:52:30PM +0530, Kashyap Desai wrote:
> > > What should be the solution if we really want to slow down IO
> > > submission to avoid CPU lockup. We don't want only one CPU to keep
> > > busy for completion.
> > >
> > > Any suggestion ?
> > >
> > Yup, file a bug with Oracle :)
>
> Neil -
>
> Thanks for info. I understood to use latest <irqbalance>...that was
> already attempted. I tried with latest irqbalance and I see expected
> behavior as long as I provide <exact> or <subset> + <--policyscript>.
> We are planning for the same, but wanted to understand what is latest
> <irqbalance> default settings. Is there any reason we are seeing default
> settings changed from subset to ignore ?
>
The latest defaults are that hinting is ignored, but hinting can also be
set via a policyscript on an irq-by-irq basis. The reasons for changing the
default behavior are documented in commit
d9138c78c3e8cb286864509fc444ebb4484c3d70. Irq affinity hinting is
effectively a holdover from back in the days when irqbalance couldn't
easily understand a device's locality and irq count. Now that it can, there
is really no need for an irq affinity hint, unless your driver doesn't
properly participate in sysfs device enumeration.

> >
> > What you're seeing looks like at least in part a bug with your (very old)
> > version of irqbalance. I seem to recall fixing more than a few bugs dealing
> > with affinity masks from the hint files and banned_cpu options. I strongly
> > suggest that you test with an upstream version of irqbalance and contact
> > oracle to update their version to something more recent.
>
> I see CPU lock up issue does not go if <rq_affinity> is set to 1 in
> storage stack and if <irqbalance> policy set to <ignore>. With <ignore>
> policy, I see only limited logic cpu of local NUMA node is busy doing
> completion. We are still seeing many IO pumping from remote NUMA node. This
> will cause CPU lockup as <rq_affinity> does not migrate softirq to _exact_
> submitter. Not sure what majority of h/w require from <irqbalance> ? Is
> it <ignore> kind of policy good choice or <subset> ?
>
I'm sorry, you'll have to try that again; I'm afraid I can't really parse
what you just wrote there. I _think_ what you're saying is that you're
observing irqbalance allowing cpu0 (or a small subset of cpus) to handle
the interrupts from your storage devices. As I said in my last note, I
recall there being a bug about that which was fixed in a later version. I
also note, however, that you mention above that you are using a policy
script, which I'm guessing may have some culpability in terms of you having
irqs with multi-bit affinity masks, which as I mentioned will not give you
the expected behavior. If you post your policy script, I may be able to
point out where you are going wrong.

Neil

> ` Kashyap
> >
> > Regards
> > Neil
> >
> > > ` Kashyap
> > >
> > > _______________________________________________
> > > irqbalance mailing list
> > > irqbalance at lists.infradead.org
> > > http://lists.infradead.org/mailman/listinfo/irqbalance
> > >
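
One way to check the multi-bit affinity mask concern raised above is to read each irq's mask from /proc/irq/<n>/smp_affinity, which the kernel exposes as a hex bitmask in comma-separated 32-bit groups. The following is only a minimal sketch under that assumption. It is not part of irqbalance, and the function names parse_affinity_mask and multi_bit_irqs are hypothetical, chosen for illustration.

```python
import os


def parse_affinity_mask(text):
    """Convert an smp_affinity string into a sorted list of CPU ids.

    The mask is hex, split into comma-separated 32-bit groups with the
    most significant group first, e.g. "00000000,0000000f".
    """
    value = int(text.strip().replace(",", ""), 16)
    return [cpu for cpu in range(value.bit_length()) if (value >> cpu) & 1]


def multi_bit_irqs(proc_irq="/proc/irq"):
    """Return (irq, cpus) pairs for irqs whose mask names more than one CPU.

    Hypothetical helper: it only reads the masks, it never writes them.
    """
    if not os.path.isdir(proc_irq):
        return []
    found = []
    irqs = sorted((e for e in os.listdir(proc_irq) if e.isdigit()), key=int)
    for entry in irqs:
        path = os.path.join(proc_irq, entry, "smp_affinity")
        if not os.path.isfile(path):
            continue
        with open(path) as handle:
            cpus = parse_affinity_mask(handle.read())
        if len(cpus) > 1:
            found.append((int(entry), cpus))
    return found
```

Calling multi_bit_irqs() on the affected host and filtering for the storage controller's irqs would show whether the policy script leaves any of them spread across several CPUs. Any irq listed there is a candidate for the behavior described in the reply.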