Re: Slow ceph io. High iops. Compared to hadoop.

And here are the perfcounters, a few seconds after Ceph starts
writing data (disk util goes up in dstat):

2012/1/16 Andrey Stepachev <octo47@xxxxxxxxx>:
> Hi all.
>
> Last week I investigated the state of Hadoop on Ceph.
> I created some patches to fix a few bugs and crashes.
> It looks like it works; even HBase runs on top.
>
> For reference, all sources and patches are here:
>
> https://github.com/octo47/hadoop-common/tree/branch-1.0-ceph
> https://github.com/octo47/ceph/tree/v0.40-hadoop
>
> After YCSB and TestDFSIO ran without crashes, I started investigating
> performance.
>
> I have a 5-node cluster with 4 SATA disks per node (btrfs, RAID) and 24 cores
> on each node. iozone shows up to 520 MB/s.
>
> Performance differs by 2-3x. After some tests I noticed a strange thing:
> Hadoop drives the disks much like iozone does, with a small number of IOPS and
> high throughput (about the same as iozone), whereas
> Ceph uses them very inefficiently, with a huge number of IOPS and up to 3 times
> less throughput (I think because of the high IOPS count); see the rough
> per-write size estimate after the ceph dstat output below.
>
> hadoop dstat output:
> sda--sdb--sdc--sdd- ----total-cpu-usage---- -dsk/total- --io/total-
> util:util:util:util|usr sys idl wai hiq siq| read  writ| read  writ
>  100: 100: 100: 100|  1   5  83  11   0   0|   0   529M|   0   247
>  100: 100: 100: 100|  1   0  83  16   0   0|   0   542M|   0   168
>  100: 100: 100: 100|  1   0  81  18   0   0|  28k  518M|6.00   149
>  100: 100: 100: 100|  1   4  77  17   0   0|   0   533M|   0   243
>  100: 100: 100: 100|  1   3  83  13   0   0|   0   523M|   0   264
>
> ceph dstat output:
> ===================================================
> sda--sdb--sdc--sdd- ----total-cpu-usage---- -dsk/total- --io/total-
> util:util:util:util|usr sys idl wai hiq siq| read  writ| read  writ
> 68.0:70.0:79.0:76.0|  1   2  93   4   0   0|   0   195M|   0  1723
> 86.0:85.0:93.0:91.0|  1   2  91   5   0   0|   0   226M|   0  1816
> 85.0:85.0:85.0:84.0|  1   3  92   4   0   0|   0   235M|   0  2316
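>
> To put rough numbers on that, dividing throughput by write IOPS in the rows
> above gives a back-of-the-envelope average write size:
>
>   hadoop:  ~529 MB/s / ~247 writes/s  = ~2.1 MB per write
>   ceph:    ~195 MB/s / ~1723 writes/s = ~115 KB per write
>
> i.e. ceph appears to issue writes roughly 20x smaller under the same kind of
> load.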
>
>
> So, my question is: can someone point me at
> a) whether this could be caused by an inefficient buffer size on the OSD side
> (I tried increasing the CephOutputStream buffer to 256 KB; it didn't help), and
> b) what other problems there could be and which options I can tune
> to find out what is going on (a sketch of the queue-related settings I'm aware
> of is right below).
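>
> The OSD-side queue limits I know of are the journal and filestore queue
> settings in ceph.conf; here they are with the values my OSDs currently report,
> assuming I'm reading the counter names right (not a recommendation, just where
> I would start looking):
>
>   [osd]
>     journal queue max ops     = 500          # journal_queue_max_ops
>     journal queue max bytes   = 104857600    # journal_queue_max_bytes
>     filestore queue max ops   = 500          # op_queue_max_ops
>     filestore queue max bytes = 104857600    # op_queue_max_bytes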
>
> PS: I can't run iozone on a kernel-mounted fs: something
> hangs in the kernel and only a reboot helps.
> In /var/log/messages I see the attached kern.log.
>
>
>
> --
> Andrey.



-- 
Andrey.
{ "filestore" : { "apply_latency" : { "avgcount" : 3752,
          "sum" : 107.00700000000001
        },
      "bytes" : 1084026405,
      "commitcycle" : 10,
      "commitcycle_interval" : { "avgcount" : 10,
          "sum" : 57.902500000000003
        },
      "commitcycle_latency" : { "avgcount" : 10,
          "sum" : 7.89201
        },
      "committing" : 0,
      "journal_bytes" : 974956029,
      "journal_full" : 0,
      "journal_latency" : { "avgcount" : 3739,
          "sum" : 1361.1199999999999
        },
      "journal_ops" : 3739,
      "journal_queue_bytes" : 109070376,
      "journal_queue_max_bytes" : 104857600,
      "journal_queue_max_ops" : 500,
      "journal_queue_ops" : 13,
      "op_queue_bytes" : 26742974,
      "op_queue_max_bytes" : 104857600,
      "op_queue_max_ops" : 500,
      "op_queue_ops" : 3,
      "ops" : 3752
    },
  "osd" : { "buffer_bytes" : 0,
      "heartbeat_from_peers" : 4,
      "heartbeat_to_peers" : 4,
      "loadavg" : 0.41999999999999998,
      "map_message_epoch_dups" : 10,
      "map_message_epochs" : 14,
      "map_messages" : 11,
      "numpg" : 625,
      "numpg_primary" : 209,
      "numpg_replica" : 416,
      "numpg_stray" : 0,
      "op" : 137,
      "op_in_bytes" : 160997316,
      "op_latency" : { "avgcount" : 137,
          "sum" : 190.98699999999999
        },
      "op_out_bytes" : 26871,
      "op_r" : 9,
      "op_r_latency" : { "avgcount" : 9,
          "sum" : 3.02433
        },
      "op_r_out_bytes" : 26871,
      "op_rw" : 0,
      "op_rw_in_bytes" : 0,
      "op_rw_latency" : { "avgcount" : 0,
          "sum" : 0
        },
      "op_rw_out_bytes" : 0,
      "op_rw_rlat" : { "avgcount" : 0,
          "sum" : 0
        },
      "op_w" : 128,
      "op_w_in_bytes" : 160997316,
      "op_w_latency" : { "avgcount" : 128,
          "sum" : 187.96299999999999
        },
      "op_w_rlat" : { "avgcount" : 128,
          "sum" : 75.012200000000007
        },
      "op_wip" : 5,
      "opq" : 7,
      "pull" : 0,
      "push" : 0,
      "push_out_bytes" : 0,
      "recovery_ops" : 0,
      "subop" : 334,
      "subop_in_bytes" : 735611145,
      "subop_latency" : { "avgcount" : 334,
          "sum" : 238.98599999999999
        },
      "subop_pull" : 0,
      "subop_pull_latency" : { "avgcount" : 0,
          "sum" : 0
        },
      "subop_push" : 0,
      "subop_push_in_bytes" : 0,
      "subop_push_latency" : { "avgcount" : 0,
          "sum" : 0
        },
      "subop_w" : 0,
      "subop_w_in_bytes" : 735611145,
      "subop_w_latency" : { "avgcount" : 334,
          "sum" : 238.98599999999999
        }
    }
}
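
To read the latency counters above: each one is a (sum, avgcount) pair, so for
the latency counters sum / avgcount is the average per-op latency in seconds.
A minimal sketch of pulling that out, assuming the dump is saved to a file
(perf.json here is just a placeholder name; the field names are exactly those
in the output above):

import json

# Load the perfcounter dump shown above ("perf.json" is a placeholder path).
with open("perf.json") as f:
    perf = json.load(f)

def avg(counter):
    # Average per-op value: sum / avgcount, guarding against an empty counter.
    return counter["sum"] / counter["avgcount"] if counter["avgcount"] else 0.0

fs, osd = perf["filestore"], perf["osd"]

print("journal latency:   %.3f s/op" % avg(fs["journal_latency"]))
print("apply latency:     %.3f s/op" % avg(fs["apply_latency"]))
print("client write lat:  %.3f s/op" % avg(osd["op_w_latency"]))
print("replica write lat: %.3f s/op" % avg(osd["subop_w_latency"]))
print("avg client write:  %.0f KB" % (osd["op_w_in_bytes"] / osd["op_w"] / 1024.0))

On the dump above that works out to roughly 0.36 s average journal latency,
about 1.5 s per client write op, and about 1.2 MB per client write, so the
client writes reaching the OSD look large in this snapshot.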
