Bharata B Rao <bharata@xxxxxxx> writes:

> On 14-Feb-23 10:25 AM, Bharata B Rao wrote:
>> On 13-Feb-23 12:00 PM, Huang, Ying wrote:
>>>> I have a microbenchmark where two sets of threads bound to two
>>>> NUMA nodes access the two different halves of memory which is
>>>> initially allocated on the 1st node.
>>>>
>>>> On a two node Zen4 system, with 64 threads in each set accessing
>>>> 8G of memory each from the initial allocation of 16G, I see that
>>>> IBS driven NUMA balancing (i.e., this patchset) takes 50% less time
>>>> to complete a fixed number of memory accesses. This could well
>>>> be the best case and real workloads/benchmarks may not get this much
>>>> uplift, but it does show the potential gain to be had.
>>>
>>> Can you find a way to show the overhead of the original implementation
>>> and your method? Then we can compare between them? Because you think
>>> the improvement comes from the reduced overhead.
>>
>> Sure, will measure the overhead.
>
> I used the ftrace function_graph tracer to measure the amount of time
> (in us) spent in fault handling and task_work handling in both methods
> while the above mentioned benchmark was running.
>
>                                Default          IBS
> Fault handling                 29879668.71      1226770.84
> Task work handling             24878.894        10635593.82
> Sched switch handling                           78159.846
>
> Total                          29904547.6       11940524.51

Thanks!  You have shown the large overhead difference between the
original method and your method.  Can you show the number of pages
migrated too?  I think the overhead per migrated page can be a good
indicator as well.

Can the reduced overhead be translated into the performance improvement?
Per my understanding, the total overhead is small compared with the
total run time.

Best Regards,
Huang, Ying

> In the default case, the fault handling duration is measured by
> tracing do_numa_page() and the task_work duration by tracing
> task_numa_work().
>
> In the IBS case, the fault handling is tracked by the NMI handler
> ibs_overflow_handler(), the task_work is tracked by
> task_ibs_access_work(), and the sched switch time overhead is tracked
> by hw_access_sched_in(). Note that in the IBS case, not much is done
> in the NMI handler; the bulk of the work (page migration etc.) happens
> in task_work context, unlike the default case.
>
> The breakup of the numbers (in us) is given below:
>
> Default
> =======
>                        Duration         Min     Max       Avg
> do_numa_page           29879668.71      0.08    317.166   17.16
> task_numa_work         24878.894        0.2     3424.19   388.73
>
> Total                  29904547.6
>
> IBS
> ===
>                        Duration         Min     Max       Avg
> ibs_overflow_handler   1226770.84       0.15    104.918   1.26
> task_ibs_access_work   10635593.82      0.21    398.428   29.81
> hw_access_sched_in     78159.846        0.15    247.922   1.29
>
> Total                  11940524.51
>
> Regards,
> Bharata.
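
(For reference, per-function durations like the ones above can be
collected with the function_graph tracer through tracefs roughly as
follows. This is only a minimal sketch assuming the standard tracefs
interface; the exact filter list and the post-processing used to
produce the min/max/avg numbers above are not shown in this thread.)

    # Assumes tracefs is mounted at /sys/kernel/tracing
    # (older kernels: /sys/kernel/debug/tracing).
    cd /sys/kernel/tracing
    echo 0 > tracing_on
    echo function_graph > current_tracer
    # Restrict graph tracing to the handlers of interest; shown here
    # for the default NUMA balancing case.
    echo do_numa_page > set_graph_function
    echo task_numa_work >> set_graph_function
    echo 1 > tracing_on
    # ... run the benchmark ...
    echo 0 > tracing_on
    # Per-call durations (in us) appear in the duration column of the
    # trace output; totals, min, max and avg are computed from these.
    cat trace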