On 2016/8/1 21:11, Neil Horman wrote:
> On Thu, Jul 28, 2016 at 02:15:48PM +0800, Yun Wu (Abel) wrote:
>> Hi Neil et al,
>>
>> The question comes from commit 93ed801, in which a condition was added
>> to judge whether irqbalance needs to rescan:
>>
>>     /* IRQ removed and reinserted, need restart or this will
>>      * cause an overflow and IRQ won't be rebalanced again
>>      */
>>     if (count < info->irq_count) {
>>         need_rescan = 1;
>>         break;
>>     }
>>
>> This works well for most situations, but not all. During one
>> SLEEP_INTERVAL, when an IRQ is removed and reinserted as the comment
>> above describes, AND the number of times the IRQ is serviced after
>> reinsertion grows past the count recorded before removal, the IRQ can
>> hardly be rebalanced again. This problem shows up only occasionally in
>> my recent hotplug tests, but once it hits a performance-critical IRQ,
>> it is undoubtedly a disaster.
>>
>> The problem can be even worse when the removed IRQ and the reinserted
>> one belong to different kinds of devices, in which case the wrong
>> balancing policy might be applied.
>>
>> To solve this problem, I think we can make efforts in two aspects
>> (given the removed IRQ is A and the reinserted one is B):
>> a) If A != B, set need_rescan to 1. This can be achieved by comparing
>>    the two IRQs' name strings.
>> b) If A == B, we simply treat this as a modification of its affinity.
>>    An unexpected modification of affinity can cause inconsistency
>>    between the IRQ's real affinity and the affinity recorded inside
>>    irqbalance's data structures, leading to inappropriate load
>>    calculations.
>>
>> I haven't yet figured out a proper way to resolve the inconsistency.
>> Or is there already a solution that I missed?
>>
>> Any comments are appreciated.
>>
>> Thanks,
>> Abel
>
> Yeah, you look to be right. My first thought is to be heavy-handed and
> use the listening interface on libudev to detect hotplug events, and
> just set need_rescan any time we get one.
>
> Thoughts?
> Neil

Hi Neil,

I think this will work for detecting hotplug events. What also concerns
me is affinity being changed manually without the IRQ being banned,
which can cause the inconsistency I mentioned in my last email.

A way to (possibly) work this out is to monitor the per-CPU statistics
of every interrupt parsed from /proc/interrupts. When comparing the
latest statistics against the recorded ones, a rise in the count on a
specific CPU means the IRQ has been serviced on that CPU recently,
whereas a fall reveals a hotplug. By doing this, if there is no explicit
sign of a hotplug, we finally get a cpumask for each IRQ showing its
recent real affinity, which should be a subset of the mask of the
object the IRQ is assigned to.

I will get patches ready for review soon. And still, any comments are
appreciated. :)

Thanks,
Abel
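
P.S. To make the idea concrete, here is a minimal sketch of the per-CPU
comparison described above. The structure and function names, and the
fixed-size mask, are made up for illustration only and are not
irqbalance's actual data structures; the real patches will build on the
existing irq_info bookkeeping and cpumask helpers.

    #include <stdint.h>

    #define NR_CPUS 128  /* fixed-size mask, for the sketch only */

    struct irq_percpu_stats {
        /* per-CPU service counts parsed from one /proc/interrupts row */
        uint64_t count[NR_CPUS];
    };

    /*
     * Compare the latest per-CPU counts of one IRQ against the
     * recorded ones. Returns 1 if a hotplug (remove and reinsert) is
     * detected, since a per-CPU counter can only fall if it was reset.
     * Otherwise fills in real_affinity[] with every CPU that serviced
     * the IRQ since the last scan, i.e. its recent real affinity.
     */
    static int scan_one_irq(const struct irq_percpu_stats *recorded,
                            const struct irq_percpu_stats *latest,
                            unsigned char real_affinity[NR_CPUS])
    {
        int cpu;

        for (cpu = 0; cpu < NR_CPUS; cpu++) {
            if (latest->count[cpu] < recorded->count[cpu])
                return 1;  /* a fall on any CPU reveals a hotplug */
            real_affinity[cpu] =
                (latest->count[cpu] > recorded->count[cpu]);
        }
        return 0;  /* no hotplug; real_affinity holds the recent mask */
    }

The caller would then verify that the resulting mask is a subset of the
mask of the object the IRQ is assigned to, and treat any CPU outside
that mask as a sign of a manual affinity change.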