On 06/29/2012 04:01 PM, Nai Xia wrote:
> Hey guys, can I say NAK to these patches?
Not necessarily the patches, but thinking about your points some more, I thought of a much more serious potential problem with Andrea's code.
> Now I am aware that this sampling algorithm is completely broken, if we take a few seconds to see what it is trying to solve:
>
> Andrea's patch can only approximate the pages_accessed number in a time unit (scan interval). I don't think it can catch even 1% of average_page_access_frequence on a busy workload.
It is much more "interesting" than that.

Once the first thread gets a NUMA page fault on a particular page, the page is made present in the page tables and NO OTHER THREAD will get NUMA page faults.

That means when trying to compare the weighting of NUMA accesses between different threads in a 10 second interval, we only know THE FIRST FAULT.

We have no information on whether any other threads tried to access the same page, because we do not get faults more frequently.

Not only do we not get use frequency information, we may not get the information on which threads use which memory at all.

Somehow Andrea's code still seems to work. It would be very interesting to know why.

How much sense does the following code still make, considering we may never get all the info on which threads use which memory?

+	/*
+	 * Generate the w_nid/w_cpu_nid from the
+	 * pre-computed mm/task_numa_weight[] and
+	 * compute w_other using the w_m/w_t info
+	 * collected from the other process.
+	 */
+	if (mm == p->mm) {
+		if (w_t > w_t_t)
+			w_t_t = w_t;
+		w_other = w_t*AUTONUMA_BALANCE_SCALE/w_t_t;
+		w_nid = task_numa_weight[nid];
+		w_cpu_nid = task_numa_weight[cpu_nid];
+		w_type = W_TYPE_THREAD;

Andrea, what is the real reason your code works?
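
To make the first-fault problem concrete, here is a toy model in plain C. It is not the AutoNUMA code and uses no kernel API: `struct toy_page`, `toy_access()` and `toy_scan_interval()` are made-up names, and the only assumption is the behaviour described above, namely that one page is armed for a hinting fault once per scan interval and the first access disarms it for every thread.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_THREADS 3

/* Hypothetical stand-in for one page that is shared by all threads. */
struct toy_page {
	bool hinting_armed;	/* true: the next access takes a hinting fault */
};

/*
 * One memory access by thread 'tid'. Only the access that finds the page
 * armed is recorded as a fault; it also disarms the page for everybody.
 * Returns true if this access took the fault.
 */
static bool toy_access(struct toy_page *page, int tid, unsigned faults[])
{
	assert(tid >= 0 && tid < NR_THREADS);

	if (!page->hinting_armed)
		return false;		/* page present: access is invisible */

	page->hinting_armed = false;
	faults[tid]++;
	return true;
}

/*
 * Replay one scan interval: arm the page once, then run the access trace.
 * 'accesses' counts what really happened, 'faults' what the sampler saw.
 */
static void toy_scan_interval(const int trace[], int n,
			      unsigned faults[], unsigned accesses[])
{
	struct toy_page page = { .hinting_armed = true };
	int i;

	for (i = 0; i < n; i++) {
		accesses[trace[i]]++;
		toy_access(&page, trace[i], faults);
	}
}
```

With the trace {0, 1, 1, 2, 1, 1}, thread 1 does four of the six accesses, yet the only fault recorded in the interval is charged to thread 0, which merely touched the page first. Any per-thread weight derived from the fault counts alone would rank thread 0 above thread 1 for this page.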