Re: 3-node cluster with 3 x Intel Optane 900P - very low benchmarked performance (200 IOPS)?

Hi,

I have retested with 4K blocks - results are below.

I am currently using 4 OSDs per Optane 900P drive. This was based on some posts I found on Proxmox Forums, and what seems to be "tribal knowledge" there.

I also saw this presentation, which mentions on page 14:

2-4 OSDs/NVMe SSD and 4-6 NVMe SSDs per node are sweet spots

Has anybody done much testing with pure Optane drives for Ceph? (The presentation above seems to use them mixed with traditional SSDs.)

Would increasing the number of OSDs help in this scenario? I am happy to try that - I assume I will need to blow away all the existing OSDs/Ceph setup and start again, of course.
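
If it is worth trying, my rough plan per drive (a sketch only - using ceph-volume directly rather than the Proxmox GUI, with /dev/nvme0n1 and the OSD count as placeholder examples) would be to take the old OSDs out and purge them from the cluster, then:

# ceph-volume lvm zap /dev/nvme0n1 --destroy
# ceph-volume lvm batch --osds-per-device 8 /dev/nvme0n1

Please correct me if that is the wrong approach.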

Here are the rados bench results with 4K blocks. The write IOPS are still a tad short of 15,000 - is that what I should be aiming for?

Write result:
# rados bench -p proxmox_vms 60 write -b 4K -t 16 --no-cleanup
Total time run:         60.001016
Total writes made:      726749
Write size:             4096
Object size:            4096
Bandwidth (MB/sec):     47.3136
Stddev Bandwidth:       2.16408
Max bandwidth (MB/sec): 48.7344
Min bandwidth (MB/sec): 38.5078
Average IOPS:           12112
Stddev IOPS:            554
Max IOPS:               12476
Min IOPS:               9858
Average Latency(s):     0.00132019
Stddev Latency(s):      0.000670617
Max latency(s):         0.065541
Min latency(s):         0.000689406

Sequential read result:

# rados bench -p proxmox_vms 60 seq -t 16
Total time run:       17.098593
Total reads made:     726749
Read size:            4096
Object size:          4096
Bandwidth (MB/sec):   166.029
Average IOPS:         42503
Stddev IOPS:          218
Max IOPS:             42978
Min IOPS:             42192
Average Latency(s):   0.000369021
Max latency(s):       0.00543175
Min latency(s):       0.000170024

Random read result:

# rados bench -p proxmox_vms 60 rand -t 16
Total time run:       60.000282
Total reads made:     2708799
Read size:            4096
Object size:          4096
Bandwidth (MB/sec):   176.353
Average IOPS:         45146
Stddev IOPS:          310
Max IOPS:             45754
Min IOPS:             44506
Average Latency(s):   0.000347637
Max latency(s):       0.00457886
Min latency(s):       0.000138381
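
(As a sanity check, the write run is consistent with its own latency: 16 threads / 0.00132 s average latency ~ 12,100 iops, which matches the reported 12112 average - so the limit really does look like per-operation latency rather than raw drive speed.)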

I am happy to try fio with -ioengine=rbd (I have been using rados bench because that is what the Proxmox Ceph benchmark paper used). However, is there a common, community-suggested starting command line that makes results easy to compare? fio seems quite complex in terms of options.
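
For what it is worth, the kind of starting point I had in mind is below (a sketch only - it assumes fio is built with rbd support, a cephx client named admin, and a throwaway test image called fiotest in the same pool; the block size, queue depth and runtime mirror the rados bench run above):

# rbd create proxmox_vms/fiotest --size 10G
# fio --ioengine=rbd --clientname=admin --pool=proxmox_vms --rbdname=fiotest \
      --name=4k-randwrite --rw=randwrite --bs=4k --iodepth=16 --numjobs=1 \
      --direct=1 --time_based --runtime=60

If there is a more standard set of options people compare against, I will use that instead.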

Thanks,
Victor

On Sun, Mar 10, 2019 at 6:15 AM Vitaliy Filippov <vitalif@xxxxxxxxxx> wrote:
Welcome to our "slow ceph" party :)))

However I have to note that:

1) 500000 iops is for 4 KB blocks. You're testing it with 4 MB ones.
That's kind of an unfair comparison.

2) fio -ioengine=rbd is better than rados bench for testing.

3) You can't "compensate" for Ceph's overhead even by having infinitely 
fast disks.

At its simplest, imagine that disk I/O takes X microseconds and Ceph's
overhead adds another Y microseconds to each operation.

Suppose there is no parallelism. Then raw disk IOPS = 1000000/X and Ceph
IOPS = 1000000/(X+Y). Y is currently quite long, something around 400-800
microseconds. So even with an infinitely fast disk (X close to 0), the best
IOPS number you can squeeze out of a single client thread (a DBMS, for
example) is 1000000/400 = only ~2500 iops.

Parallel iops are of course better, but still you won't get anything close 
to 500000 iops from a single OSD. The expected number is around 15000. 
Create multiple OSDs on a single NVMe and sacrifice your CPU usage if you 
want better results.
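
To put rough (assumed, not measured) numbers on the parallel case: say X = 10
microseconds for an Optane-class drive and Y = 600. One thread then gets at
most 1000000/610 ~ 1600 iops, and with 16 requests in flight the idealized
ceiling is still only about 16 * 1600 ~ 26000 iops - the same order of
magnitude as the ~15000 per-OSD figure above.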

--
With best regards,
   Vitaliy Filippov
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
