Hi KC,

The locality information is now collected and available to Hadoop
through the CephFS API, so fixing this is certainly possible. However,
there has not been extensive testing. I think the tasks that need to be
completed are (1) make sure that `CephFileSystem` is encoding the correct
block location in `getFileBlockLocations` (which I think is already done,
but does need to be verified), and (2) make sure rack information is
available in the jobtracker, or optionally use a flat hierarchy
(i.e. default-rack).

On Mon, Jul 8, 2013 at 12:47 PM, ker can <kercan74@xxxxxxxxx> wrote:
> Hi There,
>
> I'm test driving Hadoop with CephFS as the storage layer. I was running the
> Terasort benchmark and I noticed a lot of network IO activity when compared
> to an HDFS storage layer setup. (It's a half-a-terabyte sort workload over two
> data nodes.)
>
> Digging into the job tracker logs a little, I noticed that all the map tasks
> were being assigned to process a split (block) on non-local nodes (which
> explains all the network activity during the map phase).
>
> With Ceph:
>
> 2013-07-08 11:19:53,535 INFO org.apache.hadoop.mapred.JobInProgress: Input
> size for job job_201307081115_0001 = 500000000000. Number of splits = 7452
> 2013-07-08 11:19:53,538 INFO org.apache.hadoop.mapred.JobInProgress: Job
> job_201307081115_0001 initialized successfully with 7452 map tasks and 32
> reduce tasks.
>
> 2013-07-08 11:19:54,836 INFO org.apache.hadoop.mapred.JobInProgress:
> Choosing a non-local task task_201307081115_0001_m_000000
> 2013-07-08 11:19:54,836 INFO org.apache.hadoop.mapred.JobTracker: Adding
> task (MAP) 'attempt_201307081115_0001_m_000000_0' to tip
> task_201307081115_0001_m_000000, for tracker
> 'tracker_vega7250:localhost/127.0.0.1:35422'
>
> 2013-07-08 11:19:54,990 INFO org.apache.hadoop.mapred.JobInProgress:
> Choosing a non-local task task_201307081115_0001_m_000001
> 2013-07-08 11:19:54,990 INFO org.apache.hadoop.mapred.JobTracker: Adding
> task (MAP) 'attempt_201307081115_0001_m_000001_0' to tip
> task_201307081115_0001_m_000001, for tracker
> 'tracker_vega7249:localhost/127.0.0.1:36725'
>
> ... and so on.
>
> In comparison, with HDFS the job tracker logs looked something like this.
> The map tasks were being assigned to process data blocks on the local
> nodes.
>
> 2013-07-08 03:55:32,656 INFO org.apache.hadoop.mapred.JobInProgress: Input
> size for job job_201307080351_0001 = 500000000000. Number of splits = 7452
> 2013-07-08 03:55:32,657 INFO org.apache.hadoop.mapred.JobInProgress:
> tip:task_201307080351_0001_m_000000 has split on node:/default-rack/vega7247
> 2013-07-08 03:55:32,657 INFO org.apache.hadoop.mapred.JobInProgress:
> tip:task_201307080351_0001_m_000001 has split on node:/default-rack/vega7247
> 2013-07-08 03:55:34,474 INFO org.apache.hadoop.mapred.JobTracker: Adding
> task (MAP) 'attempt_201307080351_0001_m_000000_0' to tip
> task_201307080351_0001_m_000000, for tracker
> 'tracker_vega7247:localhost/127.0.0.1:43320'
> 2013-07-08 03:55:34,475 INFO org.apache.hadoop.mapred.JobInProgress:
> Choosing data-local task task_201307080351_0001_m_000000
> 2013-07-08 03:55:34,475 INFO org.apache.hadoop.mapred.JobTracker: Adding
> task (MAP) 'attempt_201307080351_0001_m_000001_0' to tip
> task_201307080351_0001_m_000001, for tracker
> 'tracker_vega7247:localhost/127.0.0.1:43320'
> 2013-07-08 03:55:34,475 INFO org.apache.hadoop.mapred.JobInProgress:
> Choosing data-local task task_201307080351_0001_m_000001
>
> Version Info:
> ceph version 0.61.4
> hadoop 1.1.2
>
> Has anyone else run into this?
>
> Thanks
> KC
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
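
To check whether a change actually improves locality, it may help to count how often the jobtracker picks a data-local versus a non-local map task, as in the log comparison quoted in this thread. The following is only a rough sketch, not part of Hadoop or Ceph: the class name `LocalityTally` is hypothetical, and the two marker strings are taken directly from the `JobInProgress` lines quoted in this thread. Other locality messages a given Hadoop version may emit are not counted.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical helper: tallies task-locality decisions in a jobtracker log.
class LocalityTally {
    // Marker strings as they appear in the JobInProgress lines quoted in this thread.
    static final String DATA_LOCAL = "Choosing data-local task";
    static final String NON_LOCAL = "Choosing a non-local task";

    /** Returns {dataLocalCount, nonLocalCount} for the given log lines. */
    static int[] tally(List<String> lines) {
        int dataLocal = 0;
        int nonLocal = 0;
        for (String line : lines) {
            if (line.contains(DATA_LOCAL)) {
                dataLocal++;
            } else if (line.contains(NON_LOCAL)) {
                nonLocal++;
            }
        }
        return new int[] {dataLocal, nonLocal};
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java LocalityTally <jobtracker-log>");
            System.exit(1);
        }
        // Read the whole log; fine for a sketch, stream it for very large files.
        List<String> lines = Files.readAllLines(Paths.get(args[0]), StandardCharsets.UTF_8);
        int[] counts = tally(lines);
        System.out.println("data-local: " + counts[0]);
        System.out.println("non-local:  " + counts[1]);
    }
}
```

Run it against the jobtracker log from a Ceph-backed job and from an HDFS-backed job (the log path is whatever your installation uses) and compare the two counts. If the Ceph run still shows only non-local choices after the block locations have been verified, the rack/topology mapping in item (2) would be the next thing to look at.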