On Fri, 4 Jun 2021 15:43:48 +0800
Guoqing Jiang <jgq516@xxxxxxxxx> wrote:

> Just curious '8' is chose for group_thread_cnt. IIUC, group means one
> numa node, and better to set it to match the number of cores in one
> numa node in case better performance is expected.

Since the worker threads have very little impact on 100%-read
workloads, group_thread_cnt has no effect here. It performs just as
well with group_thread_cnt set to 0, 8 or 96. Thinking about it, I
should probably omit that comment from the patch.

btw, in write workloads, the worker threads actually contend on
device_lock, so choosing any number >8 often results in degraded
performance in my tests. I have a p.o.c. branch that reduces that
contention by replacing device_lock with hash locks, but it's a big,
risky patch so I'd need to break it down first.

Thanks,
Gal