[SSD NVM FOR JOURNAL] Performance issues

Hello!

I recently installed an Intel SSD 750 Series 400 GB PCIe 3.0 x4 in 3 of my OSD nodes.

First of all, here is a diagram describing how my cluster is laid out:

[Inline image 1]

[Inline image 2]

I primarily use my Ceph cluster as a backend for OpenStack Nova, Glance, Swift and Cinder. My CRUSH map is configured with rulesets for SAS disks, SATA disks, and another ruleset that resides on the HPE nodes, also using SATA disks.

Before installing the new journal in the HPE nodes, I was using one of the disks that are OSDs today (osd.35, osd.34 and osd.33). After upgrading the journal, I noticed that a dd command writing 1 GB blocks inside OpenStack Nova instances doubled its throughput, but the expected gain was actually 400% or 500%, since on the Dell nodes, where we have another Nova pool, the throughput is around that value.

Here is a demonstration of the scenario and the difference in performance between the Dell nodes and the HPE nodes:



Scenario:

  •    Using pools to store instance disks for OpenStack
  •    Pool nova in ruleset "SAS", placed on c4-osd201, c4-osd202 and c4-osd203, with 5 OSDs per host
  •    Pool nova_hpedl180 in ruleset "NOVA_HPEDL180", placed on c4-osd204, c4-osd205 and c4-osd206, with 3 OSDs per host
  •    Every OSD has one 35 GB partition on an Intel SSD 750 Series 400 GB PCIe 3.0 x4
  •    Dedicated 10 Gbps links for the cluster and public networks
  •    Deployment via ceph-ansible, with the same configuration defined in Ansible for every host in the cluster
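As a sanity check (these commands are my addition, not part of the original report), the pool-to-ruleset mapping described above can be verified along these lines; note that the last option is spelled crush_ruleset on pre-Luminous releases:

```shell
# Show the CRUSH hierarchy and which hosts/OSDs sit under each root
ceph osd tree

# Dump the CRUSH rules so the SAS / NOVA_HPEDL180 rules can be inspected
ceph osd crush rule dump

# Confirm which rule each pool actually uses
# ("crush_ruleset" instead of "crush_rule" on older releases)
ceph osd pool get nova crush_rule
ceph osd pool get nova_hpedl180 crush_rule
```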


Instance on pool nova in ruleset SAS:
    
    
   # dd if=/dev/zero of=/mnt/bench bs=1G count=1 oflag=direct
       1+0 records in
       1+0 records out
       1073741824 bytes (1.1 GB, 1.0 GiB) copied, 2.56255 s, 419 MB/s


Instance on pool nova_hpedl180 in ruleset NOVA_HPEDL180:

   # dd if=/dev/zero of=/mnt/bench bs=1G count=1 oflag=direct
       1+0 records in
       1+0 records out
       1073741824 bytes (1.1 GB, 1.0 GiB) copied, 11.8243 s, 90.8 MB/s
    

I ran some fio benchmarks as suggested by Sébastien Han ( https://www.sebastien-han.fr/blog/2014/10/10/ceph-how-to-test-if-your-ssd-is-suitable-as-a-journal-device/ ), and with 1 job the command returned about 180 MB/s of throughput on the recently installed nodes (the HPE nodes). I also ran hdparm benchmarks on all the SSDs and everything seems normal.
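For reference, the journal test from that blog post looks roughly like this (the device path is a placeholder; adjust it to the SSD partition under test, and be aware this writes directly to the device):

```shell
# Journal-suitability test per Sebastien Han's post: O_DIRECT + O_DSYNC
# sequential 4k writes with a single job, mimicking Ceph journal writes.
# WARNING: destructive to data on the target; /dev/sdX is a placeholder.
fio --filename=/dev/sdX --direct=1 --sync=1 --rw=write --bs=4k \
    --numjobs=1 --iodepth=1 --runtime=60 --time_based \
    --group_reporting --name=journal-test
```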


I can't see what is causing this difference in throughput, since the network is not a problem, and I don't think CPU or memory are crucial either: I was monitoring the cluster with atop and didn't notice any saturation of resources. My only thought is that I have less workload on the nova_hpedl180 pool in the HPE nodes, and fewer disks per node, and that this can influence the throughput of the journal.
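To put a rough number on that last hypothesis, here is a back-of-envelope sketch. All figures are assumptions: writes spreading evenly across OSDs, a replication factor of 3 (not stated above), the ~180 MB/s single-job journal throughput from the fio test, and the SSD itself not being the bottleneck:

```python
# Back-of-envelope ceiling on client write throughput per pool, assuming
# writes fan out evenly and every replica write hits a journal partition.
JOURNAL_MBPS = 180   # per-partition sequential write, from the 1-job fio run
REPLICATION = 3      # assumed pool size; adjust to the real value

def pool_ceiling(hosts, osds_per_host, journal_mbps=JOURNAL_MBPS,
                 replication=REPLICATION):
    """Rough upper bound (MB/s) on aggregate client writes for one pool."""
    total_journal = hosts * osds_per_host * journal_mbps
    return total_journal / replication

sas = pool_ceiling(3, 5)   # ruleset SAS: 3 hosts x 5 OSDs
hpe = pool_ceiling(3, 3)   # ruleset NOVA_HPEDL180: 3 hosts x 3 OSDs
print(sas, hpe)            # 900.0 540.0
```

By this crude estimate the HPE pool should be roughly 40% slower than the SAS pool, not 4-5x slower, so the OSD count alone probably doesn't explain the whole gap.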


Any clue about what is missing or what is happening?

Thanks in advance.
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
