Hi Team,
Could you please help us understand the write IOPS inside our
Ceph cluster? There seems to be a mismatch between the
theoretical IOPS and what we see in the disk statistics.
Our platform is a 5-node cluster with 120 OSDs, each node
having 24 HDDs (data, rocksdb and rocksdb.WAL all reside on
the same disk).
We use EC 4+1
We do only write operations, averaging 1500 write IOPS in
total (750 objects/s and 750 attribute requests/s, with a
single key-value entry per object). In ceph status we
consistently see 1500 write IOPS from the client.
Please correct us if our assumptions are wrong.
1. For the 750 object write requests, data is written directly
to the data partition, and since we use EC 4+1 there will be 5
IOPS across the cluster for each object write. This makes
750 * 5 = 3750 IOPS.
2. For the 750 attribute requests, each is written first to
rocksdb.WAL and then to rocksdb, so 2 IOPS per disk for every
attribute request. This makes 750 * 2 * 5 = 7500 IOPS inside
the cluster.
Now the total IOPS inside the cluster would be 3750 + 7500 =
11250 IOPS. We have 120 OSDs, hence each OSD should see
11250 / 120 = ~94 IOPS.
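To make that arithmetic explicit, here is a small Python sketch
of our calculation (the factor of 5 shards for EC 4+1 and the 2
writes per attribute are our assumptions, not measured values):

    # Back-of-the-envelope model of the expected write IOPS
    client_object_writes = 750   # object writes per second
    client_attr_writes = 750     # attribute requests per second
    ec_shards = 5                # EC 4+1 -> 5 shards per write (assumption)
    wal_plus_db = 2              # rocksdb.WAL + rocksdb write (assumption)
    num_osds = 120

    data_iops = client_object_writes * ec_shards                # 3750
    attr_iops = client_attr_writes * wal_plus_db * ec_shards    # 7500
    total_iops = data_iops + attr_iops                          # 11250
    print(total_iops / num_osds)                                # ~94 per OSD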
Currently we see an average of 200 IOPS per OSD for the same
load in iostat, whereas the theoretical calculation gives only
~94 IOPS.
Could you please let us know what accounts for the remaining
IOPS inside the cluster for 1500 write IOPS from the client?
Does each object write also end up writing one metadata entry
into rocksdb? If so, we need to add another 3750 to the total
IOPS, which makes ~125 IOPS per OSD, but there is still a
difference of ~75 IOPS per OSD.
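Extending the same sketch with that guess (one rocksdb metadata
write per object shard, which is only an assumption on our part):

    # Add one rocksdb metadata write per object shard (assumption)
    object_metadata_iops = 750 * 5                    # 3750
    total_iops = 3750 + 7500 + object_metadata_iops   # 15000
    per_osd = total_iops / 120                        # ~125 IOPS per OSD
    print(200 - per_osd)                              # ~75 IOPS per OSD unaccounted for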
Thanks,
Muthu