Any performance gains from using per-thread (thread-local) urings?

Hello,

I am writing a small web + embedded database application that takes
advantage of the multicore performance of the latest AMD Epyc (up to
128 threads/CPU).

Is there any performance advantage to using a per-thread uring setup,
i.e. one where every thread owns its own sq+cq pair?

My feeling is that there are no gains, since internally, in the Linux
kernel, the uring system is represented as a single queue pickup thread
anyway(?), and sharing one sq+cq pair (through exclusive locks) across
all threads would be enough to achieve maximum throughput.
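
To make the two layouts being compared concrete, here is a rough
sketch in plain pthreads. It is not io_uring code and says nothing
about how the kernel behaves: toy_ring, worker_arg and run_layout are
hypothetical names, and the "ring" is only a counter. It shows who
owns the queue in each case: one ring shared by all threads behind an
exclusive lock, versus one ring per thread with no lock at all.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Toy stand-in for a submission queue: it only counts entries.
 * This is NOT io_uring; it models queue ownership, nothing more. */
struct toy_ring {
    pthread_mutex_t lock;   /* taken only in the shared layout */
    long submitted;
};

struct worker_arg {
    struct toy_ring *ring;  /* ring this thread submits to */
    long count;             /* number of entries to submit */
    int use_lock;           /* 1 = shared ring, 0 = thread-owned ring */
};

static void *worker(void *p)
{
    struct worker_arg *a = p;
    for (long i = 0; i < a->count; i++) {
        if (a->use_lock)
            pthread_mutex_lock(&a->ring->lock);
        a->ring->submitted++;           /* "submit" one entry */
        if (a->use_lock)
            pthread_mutex_unlock(&a->ring->lock);
    }
    return NULL;
}

/* shared != 0: every thread uses rings[0] under its mutex.
 * shared == 0: thread i uses rings[i] and never locks.
 * Returns the total number of entries submitted, or -1 on failure. */
long run_layout(int nthreads, long per_thread, int shared)
{
    if (nthreads <= 0)
        return -1;

    int nrings = shared ? 1 : nthreads;
    struct toy_ring *rings = calloc(nrings, sizeof(*rings));
    struct worker_arg *args = calloc(nthreads, sizeof(*args));
    pthread_t *tids = calloc(nthreads, sizeof(*tids));
    long total = -1;

    if (rings && args && tids) {
        int started = 0;
        for (int i = 0; i < nrings; i++)
            pthread_mutex_init(&rings[i].lock, NULL);
        for (int i = 0; i < nthreads; i++) {
            args[i].ring = &rings[shared ? 0 : i];
            args[i].count = per_thread;
            args[i].use_lock = shared;
            if (pthread_create(&tids[i], NULL, worker, &args[i]) != 0)
                break;
            started++;
        }
        for (int i = 0; i < started; i++)
            pthread_join(tids[i], NULL);
        if (started == nthreads) {
            total = 0;
            for (int i = 0; i < nrings; i++)
                total += rings[i].submitted;
        }
        for (int i = 0; i < nrings; i++)
            pthread_mutex_destroy(&rings[i].lock);
    }
    free(tids);
    free(args);
    free(rings);
    return total;
}
```

Calling run_layout(n, m, 1) exercises the shared layout and
run_layout(n, m, 0) the per-thread one (build with -pthread). Both
submit the same n * m entries; only the locking differs, and that
difference is what the question is about.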

I want to squeeze the maximum performance out of uring in a
multithreaded clients <-> server environment, where the number of
threads is always bounded by the number of CPU cores.

Regards, Dmitry


