Re: Hadoop/Ceph and DFS IO tests

For this particular test I turned off replication for both HDFS and Ceph, so there is only one copy of the data lying around.

hadoop@vega7250:~$ ceph osd dump | grep rep
pool 0 'data' rep size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 960 pgp_num 960 last_change 26 owner 0 crash_replay_interval 45
pool 1 'metadata' rep size 2 min_size 1 crush_ruleset 1 object_hash rjenkins pg_num 960 pgp_num 960 last_change 1 owner 0
pool 2 'rbd' rep size 2 min_size 1 crush_ruleset 2 object_hash rjenkins pg_num 960 pgp_num 960 last_change 1 owner 0
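
For reference, a single-copy 'data' pool like the one shown above can be configured with something along these lines (exact syntax may differ slightly by Ceph release):

  $ ceph osd pool set data size 1
  $ ceph osd pool set data min_size 1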

From hdfs-site.xml:

  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>

On Tue, Jul 9, 2013 at 2:44 PM, Noah Watkins <noah.watkins@xxxxxxxxxxx> wrote:
On Tue, Jul 9, 2013 at 12:35 PM, ker can <kercan74@xxxxxxxxx> wrote:
> hi Noah,
>
> while we're still on the Hadoop topic ... I was also trying out the
> TestDFSIO tests, Ceph vs. HDFS.  The read tests on Ceph take about 1.5x
> the HDFS time.  The write tests are worse, about 2.5x the time on HDFS,
> but I guess we have additional journaling overhead for the writes on Ceph.
> There should be no such overhead for the reads, though?
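
(For reference, the TestDFSIO runs mentioned above are usually launched along these lines; the jar name, file count, and file size are placeholders and will differ by Hadoop version and test setup:

  $ hadoop jar $HADOOP_HOME/hadoop-test-*.jar TestDFSIO -write -nrFiles 10 -fileSize 1000
  $ hadoop jar $HADOOP_HOME/hadoop-test-*.jar TestDFSIO -read -nrFiles 10 -fileSize 1000

The same invocation is pointed at HDFS or CephFS by switching the default filesystem, fs.default.name in Hadoop 1.x, between an hdfs:// and a ceph:// URI.)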

Out of the box, Hadoop keeps 3 copies and Ceph keeps 2, so it could be
that reads are slower because there is less opportunity for scheduling
local reads. You can create a new pool with replication=3 and test this
out (documentation on how to do this is at
http://ceph.com/docs/wip-hadoop-doc/cephfs/hadoop/).
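
A rough sketch of what that could look like (the pool name 'hadoop3' and the pg counts are placeholders; the linked docs cover the exact steps, including how to point the Hadoop bindings at the new pool):

  $ ceph osd pool create hadoop3 960 960
  $ ceph osd pool set hadoop3 size 3

On releases from this era the new pool also has to be added as a CephFS data pool (ceph mds add_data_pool) before the Hadoop shim can place files in it.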

As for writes, Hadoop writes 2 remote blocks and 1 local block, whereas
Ceph writes all copies remotely, so there is some overhead for the extra
remote object write (compared to Hadoop), but I wouldn't have expected
2.5x. It might be useful to run dd or something similar against Ceph to
see whether the raw numbers make sense and to rule out Hadoop as the
bottleneck.
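
For example, something along these lines against a CephFS mount (the /mnt/ceph path and the sizes are just placeholders):

  $ dd if=/dev/zero of=/mnt/ceph/ddtest bs=1M count=1024 conv=fsync
  $ dd if=/mnt/ceph/ddtest of=/dev/null bs=1M

Dropping the page cache (or unmounting and remounting) between the write and the read keeps the read from being served from memory.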

-Noah

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
