* Waiman Long <waiman.long@xxxxxx> wrote:

> > Mind posting the microbenchmark?
>
> I have attached the tool that I used for testing.

Thanks, that's interesting!

Btw., we could also do something like this in user-space, in
tools/perf/bench/; we have no 'perf bench locking' subcommand yet.

We already build and measure simple x86 kernel methods there, such as
memset() and memcpy():

  triton:~/tip> perf bench mem memcpy -r all
  # Running 'mem/memcpy' benchmark:
  Routine default (Default memcpy() provided by glibc)
  # Copying 1MB Bytes ...

         1.385195 GB/Sec
         4.982462 GB/Sec (with prefault)
  Routine x86-64-unrolled (unrolled memcpy() in arch/x86/lib/memcpy_64.S)
  # Copying 1MB Bytes ...

         1.627604 GB/Sec
         5.336407 GB/Sec (with prefault)
  Routine x86-64-movsq (movsq-based memcpy() in arch/x86/lib/memcpy_64.S)
  # Copying 1MB Bytes ...

         2.132233 GB/Sec
         4.264465 GB/Sec (with prefault)
  Routine x86-64-movsb (movsb-based memcpy() in arch/x86/lib/memcpy_64.S)
  # Copying 1MB Bytes ...

         1.490935 GB/Sec
         7.128193 GB/Sec (with prefault)

Locking primitives would certainly be more complex to build in
user-space - but we could shuffle things around in kernel headers as
well to make it easier to test in user-space.

That's how we can build lockdep in user-space for example, see
tools/lib/lockdep.

Just a thought.

Thanks,

	Ingo