Re: ceph & hbase:

On Thu, Jul 18, 2013 at 3:13 PM, ker can <kercan74@xxxxxxxxx> wrote:

the hbase+hdfs throughput results were 38x better.
Any thoughts on what might be going on?


Looks like this might be a data locality issue. After loading the table, when I look at the data block map of a region's store files, it's spread out across disks on all the nodes. For my test 'usertable' hbase table, OSDs 0-6 are on one node and OSDs 7-13 are on another. This is the map of region "da3b3bf6c0c5a9b387d23944122f208b" store file "0c43d345e3ea42abb5ce5a98b162218a":

hadoop@dmse-141:/mnt/mycephfs/hbase/usertable/da3b3bf6c0c5a9b387d23944122f208b/family$ cephfs 0c43d345e3ea42abb5ce5a98b162218a map
    FILE OFFSET                    OBJECT        OFFSET        LENGTH  OSD
              0      10000001abd.00000000             0      67108864  2
       67108864      10000001abd.00000001             0      67108864  4
      134217728      10000001abd.00000002             0      67108864  8
      201326592      10000001abd.00000003             0      67108864  6
      268435456      10000001abd.00000004             0      67108864  3
      335544320      10000001abd.00000005             0      67108864  6
      402653184      10000001abd.00000006             0      67108864  9
      469762048      10000001abd.00000007             0      67108864  9
      536870912      10000001abd.00000008             0      67108864  0
      603979776      10000001abd.00000009             0      67108864  2
      671088640      10000001abd.0000000a             0      67108864  8
      738197504      10000001abd.0000000b             0      67108864  13
      805306368      10000001abd.0000000c             0      67108864  1
      872415232      10000001abd.0000000d             0      67108864  1
      939524096      10000001abd.0000000e             0      67108864  3
     1006632960      10000001abd.0000000f             0      67108864  7
     1073741824      10000001abd.00000010             0      67108864  3
     1140850688      10000001abd.00000011             0      67108864  13
     1207959552      10000001abd.00000012             0      67108864  13
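
For what it's worth, a couple of commands make the map above easier to read as per-node placement (the pool name "data" in the second one is just a guess at the default CephFS data pool):

hadoop@dmse-141:~$ ceph osd tree                            # which host each OSD lives on
hadoop@dmse-141:~$ ceph osd map data 10000001abd.00000000   # which OSDs hold a given object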


For hbase+hdfs, all blocks within a single region were on the same region server/data node. So in the region server stats with hdfs you see a 100% data locality index and much better cache hit ratios.

hbase + hdfs region server stats:
blockCacheSizeMB=201.31, blockCacheFreeMB=45.57, blockCacheCount=3013,
blockCacheHitCount=9464863, blockCacheMissCount=10633061, blockCacheEvictedCount=9305729, blockCacheHitRatio=47%, blockCacheHitCachingRatio=50%,
hdfsBlocksLocalityIndex=100,

hbase + ceph region server stats:
blockCacheSizeMB=205.59, blockCacheFreeMB=41.29, blockCacheCount=2989,
blockCacheHitCount=1038372, blockCacheMissCount=1042117, blockCacheEvictedCount=397801, blockCacheHitRatio=49%, blockCacheHitCachingRatio=72%,
hdfsBlocksLocalityIndex=47
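
Side note on the numbers: blockCacheHitRatio looks like it's just blockCacheHitCount / (blockCacheHitCount + blockCacheMissCount), e.g. for the hdfs run:

$ awk 'BEGIN { printf "%.0f%%\n", 100 * 9464863 / (9464863 + 10633061) }'
47%

so the percentages above are consistent with the raw counters.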


With ceph, is there any way to influence the data block placement for a single file?
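
For context, the closest per-file knob I'm aware of is the file layout (stripe unit, stripe count, object size, pool), which the same cephfs tool can show and, for new/empty files, set — the path below is only a placeholder and the set_layout flags are untested here:

hadoop@dmse-141:/mnt/mycephfs$ cephfs somefile show_layout
hadoop@dmse-141:/mnt/mycephfs$ cephfs somefile set_layout -u 67108864 -s 67108864 -c 1

But as far as I understand it, the layout only controls striping and the target pool; CRUSH still decides which OSDs each object lands on, so that doesn't look like a way to pin a file's blocks to a particular node — hence the question.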



