Hi, don't expect a complete solution from the list, just some direction.

Here is a link to the blog post: https://ceph.io/en/news/blog/2024/ceph-a-journey-to-1tibps/ and the presentation from Ceph Days NYC is on YouTube.

Look at performance from the client's perspective: run the measurement tools from inside the virtual machine. That way you see the performance as experienced by the client. The most commonly used tool for performance measurement is fio, and I strongly recommend it for your evaluation. Also use ioping to measure latency: while fio gives you IOPS and latency metrics under load, ioping shows how latency behaves when the machine is not under heavy load.

Based on my previous experience (not only mine, but also my team's), many performance issues were related to network configuration or to problems somewhere in the network infrastructure. As an example, we hit a situation where a change made by the network team on the spine switches caused disk latency to increase from 3 ms to 80-120 ms. Another example, which almost burned me, was an issue with one of the spine cards that was not fully broken: monitoring did not catch it and tests showed everything was fine, but on the Ceph side we had many, many issues, such as flapping OSDs (at times roughly half of the 500 OSDs went down) and occasional latency spikes. The card misbehaved from time to time, but never during the tests :) And of course the AMD nodes, before I discovered the iommu=pt kernel parameter. Believe me, C-states and power management on the nodes are important.

You have already received very good advice from others, so there is not much to add: look at your network drivers and the rx/tx queue settings.
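To make that a bit more concrete, here is a minimal sketch of the kind of commands I mean. The device name /dev/vdb, the interface name ens1f0 and the run lengths are only placeholders, adjust them to your setup, and point the fio write test at a scratch device or test file because it is destructive.

Inside the VM (load test with fio, idle latency with ioping; ioping only reads by default):

  fio --name=vm-randwrite --filename=/dev/vdb --ioengine=libaio --direct=1 \
      --rw=randwrite --bs=4k --iodepth=32 --numjobs=1 \
      --runtime=60 --time_based --group_reporting
  ioping -c 30 /dev/vdb

On the hosts (NIC ring and queue settings):

  ethtool -g ens1f0     (current and maximum rx/tx ring sizes)
  ethtool -l ens1f0     (number of rx/tx channels)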
For your information, this cluster was not fine-tuned and end-to-end encryption is enabled: 6 nodes, all NVMe, 8x NVMe per node, 512 GB RAM, 4x25GbE LACP for the public network and another 4x25GbE for the cluster network (Mellanox cards).

# rados bench -p test 10 write -t 8 -b 16K

Rados bench results:
Total time run:         10.0003
Total writes made:      113195
Write size:             16384
Object size:            16384
Bandwidth (MB/sec):     176.862
Stddev Bandwidth:       27.047
Max bandwidth (MB/sec): 195.828
Min bandwidth (MB/sec): 107.906
Average IOPS:           11319
Stddev IOPS:            1731.01
Max IOPS:               12533
Min IOPS:               6906
Average Latency(s):     0.000705734
Stddev Latency(s):      0.00224331
Max latency(s):         0.325178
Min latency(s):         0.000413413

This is a test with fio using the librbd engine; it shows more or less the VM performance.

[test]
ioengine=rbd
clientname=admin
pool=test
rbdname=bench
rw=randwrite
bs=4k
iodepth=256
direct=1
numjobs=1
fsync=0
size=10G
runtime=300
time_based
invalidate=0

test: (groupid=0, jobs=1): err= 0: pid=3495143: Tue Jun 11 11:56:04 2024
  write: IOPS=83.6k, BW=326MiB/s (342MB/s)(95.6GiB/300002msec); 0 zone resets
    slat (nsec): min=975, max=2665.0k, avg=3943.68, stdev=2820.21
    clat (usec): min=399, max=225434, avg=3058.67, stdev=1801.25

And for iodepth=1:

test: (groupid=0, jobs=1): err= 0: pid=3503647: Tue Jun 11 11:57:48 2024
  write: IOPS=1845, BW=7382KiB/s (7559kB/s)(159MiB/22033msec); 0 zone resets
    slat (nsec): min=2966, max=41133, avg=4381.81, stdev=1062.40
    clat (usec): min=367, max=202364, avg=537.05, stdev=1009.49

And for iodepth=256 and bs=16k:

test: (groupid=0, jobs=1): err= 0: pid=3505339: Tue Jun 11 12:03:27 2024
  write: IOPS=79.6k, BW=1244MiB/s (1305MB/s)(365GiB/300002msec); 0 zone resets
    slat (nsec): min=1815, max=4497.4k, avg=5671.20, stdev=3540.33
    clat (usec): min=446, max=267567, avg=3208.34, stdev=2038.58
     lat (usec): min=451, max=267571, avg=3214.01, stdev=2038.60

BR,
Sebastian

> On 11 Jun 2024, at 02:23, Mark Lehrer <lehrer@xxxxxxxxx> wrote:
>
> If they can do 1 TB/s with a single 16K write thread, that will be
> quite impressive :D  Otherwise not really applicable.  Ceph scaling
> has always been good.
>
> More seriously, would you mind sending a link to this?
>
> Thanks!
> Mark
>
> On Mon, Jun 10, 2024 at 12:01 PM Anthony D'Atri <anthony.datri@xxxxxxxxx> wrote:
>>
>> Eh?  cf. Mark and Dan's 1TB/s presentation.
>>
>> On Jun 10, 2024, at 13:58, Mark Lehrer <lehrer@xxxxxxxxx> wrote:
>>
>> It seems like Ceph still hasn't adjusted to SSD performance.
>>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx