Re: Re: [PATCH v8] io_uring: releasing CPU resources when polling

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 9/25/2024 12:12, Pavel Begunkov wrote:
>I don't have a strong opinion on the feature, but the open question
>we should get some decision on is whether it's really well applicable to
>a good enough set of apps / workloads, if it'll even be useful in the
>future and/or for other vendors, and if the merit outweighs extra
>8 bytes + 1 flag per io_kiocb and the overhead of 1-2 static key'able
>checks in hot paths.

IMHO, releasing some of the CPU resources during the polling
process may be appropriate for some performance bottlenecks
due to CPU resource constraints, such as some database
applications, in addition to completing IO operations, CPU
also needs to peocess data, like compression and decompression.
In a high-concurrency state, not only polling takes up a lot of
CPU time, but also operations like calculation and processing
also need to compete for CPU time. In this case, the performance
of the application may be difficult to improve.

The MultiRead interface of Rocksdb has been adapted to io_uring,
I used db_bench to construct a situation with high CPU pressure
and compared the performance. The test configuration is as follows,

-------------------------------------------------------------------
CPU Model 	Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz
CPU Cores	8
Memory		16G
SSD			Samsung PM9A3
-------------------------------------------------------------------

Test case:
./db_bench --benchmarks=multireadrandom,stats
--duration=60
--threads=4/8/16
--use_direct_reads=true
--db=/mnt/rocks/test_db
--wal_dir=/mnt/rocks/test_db
--key_size=4
--value_size=4096
-cache_size=0
-use_existing_db=1
-batch_size=256
-multiread_batched=true
-multiread_stride=0
------------------------------------------------------
Test result:
			National	Optimization
threads		ops/sec		ops/sec		CPU Utilization
16			139300		189075		100%*8
8			138639		133191		90%*8
4			71475		68361		90%*8
------------------------------------------------------

When the number of threads exceeds the number of CPU cores,the
database throughput does not increase significantly. However,
hybrid polling can releasing some CPU resources during the polling
process, so that part of the CPU time can be used for frequent
data processing and other operations, which speeds up the reading
process, thereby improving throughput and optimizaing database
performance.I tried different compression strategies and got
results similar to the above table.(~30% throughput improvement)

As more database applications adapt to the io_uring engine, I think
the application of hybrid poll may have potential in some scenarios.
--
Xue




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux