On Wed, Dec 4, 2013 at 3:19 AM, Abu Rasheda <rcpilot2010@xxxxxxxxx> wrote:
> On Mon, Dec 2, 2013 at 11:31 PM, Mulyadi Santosa
> <mulyadi.santosa@xxxxxxxxx> wrote:
>> On Tue, Dec 3, 2013 at 8:33 AM, Abu Rasheda <rcpilot2010@xxxxxxxxx> wrote:
>>> I have my implementation of socket APIs,
>>>
>>> I sock_unregister(AF_INET); & sock_register(&inet_family_ops), this replaces
>>> kernel resident socket related calls with my socket related calls. My code
>>> is loaded as kernel module.
>>>
>>> My question, is Linux kernel able to call its own socket call more
>>> efficiently (less overhead, fewer CPU cycles) than mine ? code is running on
>>> Intel x86_64 arch.
>>>
>>> Any pointer is appreciated.
>>
>>
>> IMHO, strictly from module vs core kernel code perspective, there is
>> no speed difference.
>>
>> The reason is, once the kernel is loaded, it is the kernel itself. So
>> it is not separated into isolated segment or something like that. Yes
>> module is loaded into vmalloc-ed memory area, but that's it.
>
> Is it possible that since my code of socket API is located in
> different part of memory, so that it needs to make a long jump
> (results in d-cache miss) ?
>
> here is statistics: from per stats tool:
>
> Linux code:
> 112,880,163,053 instructions:HG # 0.72 insns per cycle
>
> My code:
> 32,074,097,170 instructions:HG # 0.34 insns per cycle

Hi...

Please Cc: kernelnewbies as well.

As for the d-cache misses, I don't think they are caused by the long
jump. More likely you are accessing data that is not cache-line
aligned. You could also look for data that can be prefetched before it
is needed. That way you make better use of the processor pipeline, but
use prefetching wisely.

--
regards,

Mulyadi Santosa
Freelance Linux trainer and consultant

blog: the-hydra.blogspot.com
training: mulyaditraining.blogspot.com
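
To make the two suggestions above concrete, here is a rough user-space
sketch, not code from the module under discussion. The struct
pkt_stats, the function sum_bytes(), the 64-byte line size and the
prefetch distance of 4 are all made up for illustration.
__builtin_prefetch() is the GCC/Clang builtin; inside the kernel the
rough equivalents would be ____cacheline_aligned and prefetch() from
<linux/prefetch.h>.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines, which is typical for x86_64. */
#define CACHE_LINE 64

/*
 * Hypothetical per-socket counters. The aligned attribute makes every
 * element start on its own cache line, so a single access never
 * straddles two lines.
 */
struct pkt_stats {
    uint64_t packets;
    uint64_t bytes;
} __attribute__((aligned(CACHE_LINE)));

/* The compiler pads the struct up to a full cache line. */
static_assert(sizeof(struct pkt_stats) == CACHE_LINE,
              "pkt_stats should occupy exactly one cache line");

/*
 * Sum the byte counters, hinting the CPU to fetch the element four
 * iterations ahead (read access, low temporal locality). The distance
 * of 4 is a guess and has to be tuned by measurement.
 */
uint64_t sum_bytes(const struct pkt_stats *s, size_t n)
{
    uint64_t total = 0;

    for (size_t i = 0; i < n; i++) {
        if (i + 4 < n)
            __builtin_prefetch(&s[i + 4], 0, 1);
        total += s[i].bytes;
    }
    return total;
}
```

A hardware prefetcher usually handles a plain linear walk like this one
on its own, so an explicit hint tends to pay off only for access
patterns the CPU cannot predict (linked lists, hash buckets). Comparing
the cache-miss counts before and after with the perf tool is the way to
check whether it helps.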