On Sat, 2015-10-31 at 14:23 -0700, Linus Torvalds wrote:

> Mind testing something really stupid, and making the __clear_bit() in
> __clear_close_on_exec() conditional, something like this:
>
>      static inline void __clear_close_on_exec(int fd, struct fdtable *fdt)
>      {
>     -       __clear_bit(fd, fdt->close_on_exec);
>     +       if (test_bit(fd, fdt->close_on_exec))
>     +               __clear_bit(fd, fdt->close_on_exec);
>      }
>
> and see if it makes a difference.

It does ;)

About a 4% qps increase.

3 runs:

lpaa24:~# taskset ff0ff ./opensock -t 16 -n 10000000 -l 10
total = 4176651
total = 4178012
total = 4105226

instead of:

total = 3910620
total = 3874567
total = 3971028

Perf profile:

    69.12%  opensock  opensock           [.] memset
                |
                --- memset

    12.37%  opensock  [kernel.kallsyms]  [k] queued_spin_lock_slowpath
                |
                --- queued_spin_lock_slowpath
                   |
                   |--99.99%-- _raw_spin_lock
                   |          |
                   |          |--51.99%-- __close_fd
                   |          |          sys_close
                   |          |          entry_SYSCALL_64_fastpath
                   |          |          __libc_close
                   |          |          |
                   |          |           --100.00%-- 0x0
                   |          |
                   |          |--47.79%-- __alloc_fd
                   |          |          get_unused_fd_flags
                   |          |          sock_map_fd
                   |          |          sys_socket
                   |          |          entry_SYSCALL_64_fastpath
                   |          |          __socket
                   |          |          |
                   |          |           --100.00%-- 0x0
                   |           --0.21%-- [...]
                    --0.01%-- [...]

     1.92%  opensock  [kernel.kallsyms]  [k] _find_next_bit.part.0
                |
                --- _find_next_bit.part.0
                   |
                   |--66.93%-- find_next_zero_bit
                   |          __alloc_fd
                   |          get_unused_fd_flags
                   |          sock_map_fd
                   |          sys_socket
                   |          entry_SYSCALL_64_fastpath
                   |          __socket
                   |
                    --33.07%-- __alloc_fd
                              get_unused_fd_flags
                              sock_map_fd
                              sys_socket
                              entry_SYSCALL_64_fastpath
                              __socket
                              |
                               --100.00%-- 0x0

     1.63%  opensock  [kernel.kallsyms]  [k] _raw_spin_lock
                |
                --- _raw_spin_lock
                   |
                   |--28.66%-- get_unused_fd_flags
                   |          sock_map_fd
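
For anyone who wants to play with the idea outside the kernel, here is a minimal userspace sketch of the "test before clear" pattern from the quoted patch. The demo_* helpers are hypothetical stand-ins I made up for this illustration; they are not the kernel's test_bit()/__clear_bit(). The presumed benefit is that when the bit is already clear we only read the word and never store to it, so the cache line holding the bitmap does not get dirtied. That explanation is my assumption; the numbers above only show that the change helps.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical userspace stand-in: non-atomic read of bit nr in map. */
static inline bool demo_test_bit(size_t nr, const unsigned long *map)
{
	return (map[nr / DEMO_BITS_PER_LONG] >> (nr % DEMO_BITS_PER_LONG)) & 1UL;
}

/* Hypothetical userspace stand-in: non-atomic, unconditional clear (always stores). */
static inline void demo_clear_bit(size_t nr, unsigned long *map)
{
	map[nr / DEMO_BITS_PER_LONG] &= ~(1UL << (nr % DEMO_BITS_PER_LONG));
}

/*
 * Conditional clear: only store when the bit is actually set.
 * Returns true if a store was performed, false if the word was left alone.
 */
static inline bool demo_clear_bit_if_set(size_t nr, unsigned long *map)
{
	if (!demo_test_bit(nr, map))
		return false;	/* read-only path, no store issued */
	demo_clear_bit(nr, map);
	return true;
}
```

Like __clear_bit(), this is not atomic, so a caller would need the same external locking as the unconditional version.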