Re: why performance difference between 'rados bench seq' and 'rados bench rand' quite significant

Louisa <lushasha08@xxxxxxx> · Wed, 30 Oct 2024 12:30:53 +0800

rep3datapool pg num is 512, Average number of PG replicas per OSD is 139
Scrubs, the balancer, and the pg autoscaler were disabled
RAM is 128G, swap is 0
From: Anthony D'Atri
Date: 2024-10-30 12:03
To: Louisa
CC: ceph-users
Subject: Re:  why performance difference between 'rados bench seq' and 'rados bench rand' quite significant
The good Mr. Nelson and others may have more to contribute, but a few thoughts:

* Running for 60 or 120 seconds isn’t quantitative:  rados bench typically exhibits a clear ramp-up; watch the per-second stats.
* Suggest running for 10 minutes, three times in a row and averaging the results
* How many PGs in rep3datapool?  Average number of PG replicas per OSD shown by `ceph osd df` ?  I would shoot for 150 - 200 in your case.
* Disable scrubs, the balancer, and the pg autoscaler during benching.
* If you have OS swap configured, disable it and reboot.  How much RAM?
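
The suggestion above to watch the per-second stats and average several long runs can be sketched roughly as follows. This is only a minimal illustration, not part of rados bench: the function names, the 60-second warm-up window, and the sample numbers are assumptions made for the example, and the per-second IOPS values are assumed to have already been collected from the bench output by hand or by a separate script.

```python
from statistics import mean


def steady_state_mean(per_second_iops, warmup_seconds):
    """Average per-second IOPS after discarding the ramp-up samples."""
    steady = per_second_iops[warmup_seconds:]
    if not steady:
        raise ValueError("run is shorter than the warm-up window")
    return mean(steady)


def summarize_runs(runs, warmup_seconds=60):
    """Summarize several runs; each run is a list of per-second IOPS values."""
    per_run = [steady_state_mean(run, warmup_seconds) for run in runs]
    return {
        "per_run": per_run,                      # steady-state mean of each run
        "mean": mean(per_run),                   # average across runs
        "spread": max(per_run) - min(per_run),   # how far the runs disagree
    }


if __name__ == "__main__":
    # Made-up sample data: three very short "runs" with a one-second ramp-up.
    sample_runs = [[500, 2000, 2100], [450, 1900, 2000], [600, 2050, 2150]]
    print(summarize_runs(sample_runs, warmup_seconds=1))
```

A large spread between runs would suggest the results are not yet stable enough to compare.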

> On Oct 29, 2024, at 11:43 PM, Louisa <lushasha08@xxxxxxx> wrote:
> 
> Hi all,
> We used 'rados bench' to test 4k object read and write operations.  
> Our cluster is Pacific, one node, 11 BlueStore OSDs; DB and WAL share the block device.  The block devices are HDDs.
> 
> 1. We tested 4k writes with the command 'rados bench 120 write -t 16 -b 4K -p rep3datapool --run-name 4kreadwrite --no-cleanup'
> 
> 2. Before testing 4k reads, we restarted all OSD daemons.  The performance of 'rados bench 120 seq -t 16 -p rep3datapool --run-name 4kreadwrite' was very good, with average IOPS: 17735.
> Using 'ceph daemon osd.1 perf dump rocksdb', we found rocksdb:get_latency avgcount: 15189, avgtime: 0.000012947 (12.9us)
> 
> 3. Before testing 4k rand reads, we restarted all OSD daemons.  'rados bench 60 rand -t 16 -p rep3datapool --run-name 4kreadwrite' average IOPS: 2071
> rocksdb:get_latency avgcount: 8756, avgtime: 0.001761293 (1.7ms)
> 
> Q1: Why is the performance difference between 'rados bench seq' and 'rados bench rand' so significant? How can the difference in rocksdb get_latency between these two scenarios be explained?
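
To put the figures quoted above side by side, here is a small sketch of the arithmetic. The IOPS and get_latency values are copied from the post. The per-operation latency estimate is an assumption, not a measurement: it applies Little's law and presumes that all 16 concurrent operations (-t 16) stayed in flight for the whole run. The names are illustrative only.

```python
# Figures copied from the post; constant names are illustrative.
SEQ_IOPS = 17735
RAND_IOPS = 2071
SEQ_GET_LATENCY_S = 0.000012947   # rocksdb:get_latency avgtime, seq run
RAND_GET_LATENCY_S = 0.001761293  # rocksdb:get_latency avgtime, rand run
CONCURRENCY = 16                  # rados bench -t 16


def ratio(a, b):
    """How many times larger a is than b."""
    return a / b


def implied_latency_ms(concurrency, iops):
    """Little's law: mean latency = ops in flight / throughput.

    Assumes the client kept `concurrency` operations outstanding throughout.
    """
    return concurrency / iops * 1000.0


if __name__ == "__main__":
    print("IOPS ratio seq/rand:", round(ratio(SEQ_IOPS, RAND_IOPS), 1))
    print("get_latency ratio rand/seq:",
          round(ratio(RAND_GET_LATENCY_S, SEQ_GET_LATENCY_S), 1))
    print("implied op latency seq (ms):",
          round(implied_latency_ms(CONCURRENCY, SEQ_IOPS), 2))
    print("implied op latency rand (ms):",
          round(implied_latency_ms(CONCURRENCY, RAND_IOPS), 2))
```

On these numbers the IOPS differ by roughly 8.6x while rocksdb get_latency differs by roughly 136x.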
> 
> 4. We wrote 400,000 (40w) more 4k objects to the pool, restarted all OSD daemons, and ran 'rados bench 120 seq -t 16 -p rep3datapool --run-name 4kreadwrite' again. Average IOPS ~= 2000.
> rocksdb:get_latency avgtime also reached the millisecond level.
> Q2: Why does 'rados bench seq' performance decrease so sharply after writing some more 4k objects to the pool?
> 
> Q3: Are there any methods or suggestions to optimize read performance for this scenario under this hardware configuration?
> 
> 
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx