Error using Hadoop with CephFS

Hi,

I have a 5-node cluster (each node running Ubuntu 14.04) on which I have set up Hadoop 1.1.1 and Ceph v0.87. I want to use Hadoop with CephFS and run some experiments. The wordcount example ran fine with the normal Hadoop settings. But after changing the Hadoop configuration as described in the "Using Hadoop with CephFS" documentation (http://ceph.com/docs/master/cephfs/hadoop/), the job fails with the following error:

ceph@admin-node:/usr/local/hadoop-1.1.1$ bin/hadoop jar hadoop*examples*.jar wordcount /tmp/wc-input /tmp/wc-output-r8
15/03/26 02:54:35 INFO util.NativeCodeLoader: Loaded the native-hadoop library
15/03/26 02:54:35 INFO input.FileInputFormat: Total input paths to process : 1
15/03/26 02:54:35 WARN snappy.LoadSnappy: Snappy native library not loaded
15/03/26 02:54:35 INFO mapred.JobClient: Running job: job_201503260253_0001
15/03/26 02:54:36 INFO mapred.JobClient: map 0% reduce 0%
15/03/26 02:54:36 INFO mapred.JobClient: Task Id : attempt_201503260253_0001_m_000021_0, Status : FAILED
Error initializing attempt_201503260253_0001_m_000021_0:
java.io.FileNotFoundException: File file:/tmp/hadoop-ceph/mapred/system/job_201503260253_0001/jobToken does not exist.
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:397)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
at org.apache.hadoop.mapred.TaskTracker.localizeJobTokenFile(TaskTracker.java:4445)
at org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1272)
at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1213)
at org.apache.hadoop.mapred.TaskTracker$5.run(TaskTracker.java:2568)
at java.lang.Thread.run(Thread.java:745)

15/03/26 02:54:36 WARN mapred.JobClient: Error reading task outputhttp://node2:50060/tasklog?plaintext=true&attemptid=attempt_201503260253_0001_m_000021_0&filter=stdout
15/03/26 02:54:37 WARN mapred.JobClient: Error reading task outputhttp://node2:50060/tasklog?plaintext=true&attemptid=attempt_201503260253_0001_m_000021_0&filter=stderr
15/03/26 02:54:38 INFO mapred.JobClient: Task Id : attempt_201503260253_0001_r_000002_0, Status : FAILED
Error initializing attempt_201503260253_0001_r_000002_0:
java.io.FileNotFoundException: File file:/tmp/hadoop-ceph/mapred/system/job_201503260253_0001/jobToken does not exist.
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:397)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
at org.apache.hadoop.mapred.TaskTracker.localizeJobTokenFile(TaskTracker.java:4445)
at org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1272)
at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1213)
at org.apache.hadoop.mapred.TaskTracker$5.run(TaskTracker.java:2568)
at java.lang.Thread.run(Thread.java:745)

15/03/26 02:54:38 WARN mapred.JobClient: Error reading task outputhttp://node2:50060/tasklog?plaintext=true&attemptid=attempt_201503260253_0001_r_000002_0&filter=stdout
15/03/26 02:54:38 WARN mapred.JobClient: Error reading task outputhttp://node2:50060/tasklog?plaintext=true&attemptid=attempt_201503260253_0001_r_000002_0&filter=stderr
15/03/26 02:54:38 INFO mapred.JobClient: Task Id : attempt_201503260253_0001_m_000021_1, Status : FAILED
Error initializing attempt_201503260253_0001_m_000021_1:
java.io.FileNotFoundException: File file:/tmp/hadoop-ceph/mapred/system/job_201503260253_0001/jobToken does not exist.
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:397)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
at org.apache.hadoop.mapred.TaskTracker.localizeJobTokenFile(TaskTracker.java:4445)
at org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1272)
at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1213)
at org.apache.hadoop.mapred.TaskTracker$5.run(TaskTracker.java:2568)
at java.lang.Thread.run(Thread.java:745) …………………….
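
One detail that stands out in the trace: the missing path carries the file: scheme and is resolved through RawLocalFileSystem, not through ceph://. As a rough sketch (Python, not part of my Hadoop setup; the function name missing_file_schemes is only illustrative), the scheme can be pulled out of a log like the one above:

```python
import re
from urllib.parse import urlparse

# Matches the path reported in Hadoop's FileNotFoundException message.
MISSING_FILE = re.compile(r"FileNotFoundException: File (\S+) does not exist")


def missing_file_schemes(log_text):
    """Return the set of URI schemes of the files the log reports as missing."""
    return {urlparse(m.group(1)).scheme for m in MISSING_FILE.finditer(log_text)}


# One line copied from the job output above.
sample = (
    "java.io.FileNotFoundException: File "
    "file:/tmp/hadoop-ceph/mapred/system/job_201503260253_0001/jobToken "
    "does not exist."
)

if __name__ == "__main__":
    print(missing_file_schemes(sample))
```

For the line above this reports the scheme as file, so the TaskTracker is looking for jobToken on its local disk.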

Since using CephFS instead of HDFS requires only the MapReduce daemons, only the JobTracker and the TaskTrackers are running on the cluster. My Hadoop core-site.xml file:

<configuration>

<property>
<name>fs.ceph.impl</name>
<value>org.apache.hadoop.fs.ceph.CephFileSystem</value>
<description>
</description>
</property>

<property>
<name>fs.default.name</name>
<value>ceph:///</value>
</property>

<property>
<name>ceph.conf.file</name>
<value>/etc/ceph/ceph.conf</value>
</property>

<property>
<name>ceph.root.dir</name>
<value>/</value>
</property>

<property>
<name>ceph.mon.address</name>
<value>10.242.144.225:6789</value>
<description>This is the primary monitor node IP address in our installation.</description>
</property>

<property>
<name>ceph.auth.id</name>
<value>admin</value>
</property>

<property>
<name>ceph.auth.keyring</name>
<value>/etc/ceph/ceph.client.admin.keyring</value>
</property>

<property>
<name>ceph.object.size</name>
<value>67108864</value>
</property>

<property>
<name>ceph.data.pools</name>
<value>data</value>
</property>

<property>
<name>ceph.localize.reads</name>
<value>true</value>
</property>
</configuration>
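
To rule out a node reading a different or stale core-site.xml, a small check like the following could be run on each node. This is only a sketch in Python: the helper names are made up for illustration, the embedded XML is a shortened copy of the file above, and the file:/// fallback is what I understand Hadoop 1.x to use when fs.default.name is unset.

```python
import xml.etree.ElementTree as ET


def load_hadoop_properties(xml_text):
    """Parse a Hadoop *-site.xml document into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def default_fs_scheme(props):
    """Return the scheme of fs.default.name (assumed fallback: file:///)."""
    return props.get("fs.default.name", "file:///").split(":", 1)[0]


# Shortened copy of the core-site.xml shown above.
sample_xml = """
<configuration>
<property>
<name>fs.ceph.impl</name>
<value>org.apache.hadoop.fs.ceph.CephFileSystem</value>
</property>
<property>
<name>fs.default.name</name>
<value>ceph:///</value>
</property>
</configuration>
"""

if __name__ == "__main__":
    props = load_hadoop_properties(sample_xml)
    print(default_fs_scheme(props))
```

On a real node the XML would be read from that node's own core-site.xml instead of the embedded string. If any node reports file rather than ceph, that node is not picking up the CephFS settings.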

Please let me know how I can solve this problem.


_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
