Hi,

I am working on the x86_64 architecture. I profiled a Linux kernel module with oprofile and noticed that a large share of cycles is spent in the copy_from_user call. I compared the same flow run from the kernel proper and noticed that, even at higher data throughput, far fewer cycles are spent in copy_from_user there: the kernel proper spends about 1/8 of the cycles the module does. (There is a user process which keeps sending data, like iperf.)

I used the perf tool to gather some statistics and found the following.

Call from kernel proper:

    185,719,857,837 cpu-cycles           #    3.318 GHz                    [90.01%]
     99,886,030,243 instructions         #    0.54  insns per cycle        [95.00%]
      1,696,072,702 cache-references     #   30.297 M/sec                  [94.99%]
        786,929,244 cache-misses         #   46.397 % of all cache refs    [95.00%]
     16,867,747,688 branch-instructions  #  301.307 M/sec                  [95.03%]
         86,752,646 branch-misses        #    0.51% of all branches        [95.00%]
      5,482,768,332 bus-cycles           #   97.938 M/sec                  [20.08%]
       55967.269801 cpu-clock
       55981.842225 task-clock           #    0.933 CPUs utilized

Call from kernel module:

      9,388,787,678 cpu-cycles           #    1.527 GHz                    [89.77%]
      1,706,203,221 instructions         #    0.18  insns per cycle        [94.59%]
        551,010,961 cache-references     #   89.588 M/sec                  [94.73%]
        369,632,492 cache-misses         #   67.083 % of all cache refs    [95.18%]
        291,358,658 branch-instructions  #   47.372 M/sec                  [94.68%]
         10,291,678 branch-misses        #    3.53% of all branches        [95.01%]
        582,651,999 bus-cycles           #   94.733 M/sec                  [20.55%]
        6112.471585 cpu-clock
        6150.490210 task-clock           #    0.102 CPUs utilized
                367 page-faults          #    0.000 M/sec
                367 minor-faults         #    0.000 M/sec
                  0 major-faults         #    0.000 M/sec
             25,770 context-switches     #    0.004 M/sec
                 23 cpu-migrations       #    0.000 M/sec

So the CPU is evidently stalling while it copies data in the module case: instructions per cycle are lower (0.18 vs. 0.54) and the cache-miss rate is higher (about 67% vs. 46%).

My question is: is there a difference between calling copy_from_user from the kernel proper and calling it from an LKM?
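For anyone who wants to check the ratios, here is a minimal sketch in Python that recomputes the derived figures (instructions per cycle, cache-miss rate, branch-miss rate) from the raw counters quoted above. The dictionary and function names are my own, chosen only for illustration, and the script only redoes the arithmetic perf already printed. It does not measure anything.

```python
# Raw counters copied from the two perf runs quoted above.
# The names KERNEL_PROPER, KERNEL_MODULE and derived_ratios are
# illustrative assumptions, not part of perf or the kernel.
KERNEL_PROPER = {
    "cpu_cycles": 185_719_857_837,
    "instructions": 99_886_030_243,
    "cache_references": 1_696_072_702,
    "cache_misses": 786_929_244,
    "branch_instructions": 16_867_747_688,
    "branch_misses": 86_752_646,
}

KERNEL_MODULE = {
    "cpu_cycles": 9_388_787_678,
    "instructions": 1_706_203_221,
    "cache_references": 551_010_961,
    "cache_misses": 369_632_492,
    "branch_instructions": 291_358_658,
    "branch_misses": 10_291_678,
}


def derived_ratios(counters):
    """Return IPC and miss rates (in percent) for one set of counters."""
    return {
        "ipc": counters["instructions"] / counters["cpu_cycles"],
        "cache_miss_pct": 100.0 * counters["cache_misses"]
        / counters["cache_references"],
        "branch_miss_pct": 100.0 * counters["branch_misses"]
        / counters["branch_instructions"],
    }


if __name__ == "__main__":
    for name, counters in (
        ("kernel proper", KERNEL_PROPER),
        ("kernel module", KERNEL_MODULE),
    ):
        r = derived_ratios(counters)
        print(
            f"{name}: IPC={r['ipc']:.2f} "
            f"cache-miss={r['cache_miss_pct']:.3f}% "
            f"branch-miss={r['branch_miss_pct']:.2f}%"
        )
```

Note that the two runs differ a lot in total cycles and CPU utilization (0.933 vs. 0.102 CPUs), so the ratios are easier to compare than the absolute counts.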