On 2019/7/12 3:53 PM, Peter Zijlstra wrote:
[snip]
>>>> return target;
>>>> }
>>>
>>> Select idle sibling should never cross node boundaries and is thus the
>>> entirely wrong place to fix anything.
>>
>> Hmm.. in our early testing the printk show both select_task_rq_fair() and
>> task_numa_find_cpu() will call select_idle_sibling with prev and target on
>> different node, thus we pick this point to save few lines.
>
> But it will never return @prev if it is not in the same cache domain as
> @target. See how everything is gated by:
>
>   && cpus_share_cache(x, target)

Yeah, that's right.

>
>> But if the semantics of select_idle_sibling() is to return cpu on the same
>> node of target, what about move the logical after select_idle_sibling() for
>> the two callers?
>
> No, that's insane. You don't do select_idle_sibling() to then ignore the
> result. You have to change @target before calling select_idle_sibling().
>

I see, we should not override the decision of select_idle_sibling().

Actually, the original design we were trying to achieve is:

  let wake affine select the target
  try to find an idle sibling of target
  if got one
    pick it
  else if task clings to prev
    pick prev

That is, it treats wake affine as superior to NUMA cling.

But after rethinking, maybe this is not necessary, since NUMA cling is also
a kind of strong wake-affine hint, and maybe even a better one for filtering
out the bad cases.

I'll try changing @target instead and retest.

Regards,
Michael Wang
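
For illustration only, here is a minimal user-space sketch of the two
orderings discussed above. It is a toy model, not kernel code: the toy_*
names, the fixed 8-CPU / 2-node layout, and the "clings" flag are all
assumptions made up for this example, and toy_select_idle_sibling() only
loosely imitates the cpus_share_cache() gating of the real function.

```c
#include <stdbool.h>

/* Hypothetical topology: 8 CPUs, 4 per node, one cache domain per node. */
#define TOY_NR_CPUS        8
#define TOY_CPUS_PER_NODE  4

/* Toy stand-in for cpus_share_cache(): same node means same cache domain. */
static bool toy_cpus_share_cache(int a, int b)
{
	return a / TOY_CPUS_PER_NODE == b / TOY_CPUS_PER_NODE;
}

/*
 * Toy stand-in for select_idle_sibling(): only ever returns a CPU in
 * target's cache domain, falling back to target if nothing is idle.
 * idle[] must have TOY_NR_CPUS entries.
 */
static int toy_select_idle_sibling(const bool *idle, int prev, int target)
{
	int first = (target / TOY_CPUS_PER_NODE) * TOY_CPUS_PER_NODE;
	int cpu;

	if (idle[target])
		return target;
	if (prev != target && toy_cpus_share_cache(prev, target) && idle[prev])
		return prev;
	for (cpu = first; cpu < first + TOY_CPUS_PER_NODE; cpu++)
		if (idle[cpu])
			return cpu;
	return target;
}

/*
 * Original design: wake affine picks the target, and prev is used only
 * when no idle sibling of that target was found.
 */
static int toy_pick_original(const bool *idle, int prev, int affine_target,
			     bool clings)
{
	int cpu = toy_select_idle_sibling(idle, prev, affine_target);

	if (idle[cpu])
		return cpu;
	if (clings)
		return prev;	/* overrides the search result */
	return cpu;
}

/*
 * Suggested ordering: decide the target first, then let the idle-sibling
 * search have the final say.
 */
static int toy_pick_revised(const bool *idle, int prev, int affine_target,
			    bool clings)
{
	int target = clings ? prev : affine_target;

	return toy_select_idle_sibling(idle, prev, target);
}
```

In this model, with prev = 1, a wake-affine target of 6 and only CPU 5 idle,
a clinging task lands on CPU 5 under the first ordering and stays on CPU 1
under the second.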