Re: Ceph & Hbase

Jose M <soloninguno@xxxxxxxxxxx> · Thu, 7 Jan 2016 13:56:56 +0000

Hi,

Following Yan's suggestion that something could be wrong with the ceph configuration, I started again from scratch, this time configuring ceph with three nodes (one mon, two osds).

After starting hbase, it seems to get a few steps further, but it fails again, this time while trying to create a file whose name starts with a dot (a hidden file).

2016-01-06 14:36:08,509 INFO  [main] mortbay.log: Started SelectChannelConnector@0.0.0.0:16010
2016-01-06 14:36:08,516 INFO  [main] master.HMaster: hbase.rootdir=ceph://ceph-mon:6789/hbase, hbase.cluster.distributed=true
2016-01-06 14:36:08,537 INFO  [main] master.HMaster: Adding backup master ZNode /hbase/backup-masters/192.168.1.196,16000,1452090965392
2016-01-06 14:36:08,750 INFO  [192.168.1.196:16000.activeMasterManager] master.ActiveMasterManager: Deleting ZNode for /hbase/backup-masters/192.168.1.196,16000,1452090965392 from backup master directory
2016-01-06 14:36:08,771 INFO  [192.168.1.196:16000.activeMasterManager] master.ActiveMasterManager: Registered Active Master=192.168.1.196,16000,1452090965392
2016-01-06 14:36:08,845 INFO  [master//192.168.1.196:16000] zookeeper.RecoverableZooKeeper: Process identifier=hconnection-0x4d0894c1 connecting to ZooKeeper ensemble=localhost:2181
2016-01-06 14:36:08,845 INFO  [master//192.168.1.196:16000] zookeeper.ZooKeeper: Initiating client connection, connectString=localhost:2181 sessionTimeout=90000 watcher=hconnection-0x4d0894c10x0, quorum=localhost:2181, baseZNode=/hbase
2016-01-06 14:36:08,866 INFO  [master//192.168.1.196:16000-SendThread(localhost:2181)] zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2016-01-06 14:36:08,868 INFO  [master//192.168.1.196:16000-SendThread(localhost:2181)] zookeeper.ClientCnxn: Socket connection established to localhost/127.0.0.1:2181, initiating session
2016-01-06 14:36:08,873 INFO  [master//192.168.1.196:16000-SendThread(localhost:2181)] zookeeper.ClientCnxn: Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x15213a42e100007, negotiated timeout = 90000
2016-01-06 14:36:08,875 INFO  [master//192.168.1.196:16000] client.ZooKeeperRegistry: ClusterId read in ZooKeeper is null
2016-01-06 14:36:09,022 FATAL [192.168.1.196:16000.activeMasterManager] master.HMaster: Failed to become active master
java.io.IOException: Error accessing ceph://ceph-mon:6789/hbase/data/hbase/meta/.tabledesc
        at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1485)
        at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1523)
        at org.apache.hadoop.hbase.util.FSUtils.listStatus(FSUtils.java:1721)
        at org.apache.hadoop.hbase.util.FSTableDescriptors.getCurrentTableInfoStatus(FSTableDescriptors.java:369)
        at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableInfoPath(FSTableDescriptors.java:350)
        at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableInfoPath(FSTableDescriptors.java:331)
        at org.apache.hadoop.hbase.util.FSTableDescriptorMigrationToSubdir.needsMigration(FSTableDescriptorMigrationToSubdir.java:58)
        at org.apache.hadoop.hbase.util.FSTableDescriptorMigrationToSubdir.migrateFSTableDescriptorsIfNecessary(FSTableDescriptorMigrationToSubdir.java:45)
        at org.apache.hadoop.hbase.master.MasterFileSystem.checkRootDir(MasterFileSystem.java:481)
        at org.apache.hadoop.hbase.master.MasterFileSystem.createInitialFileSystemLayout(MasterFileSystem.java:146)
        at org.apache.hadoop.hbase.master.MasterFileSystem.<init>(MasterFileSystem.java:126)
        at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:649)
        at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:182)
        at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1646)
        at java.lang.Thread.run(Thread.java:745)
2016-01-06 14:36:09,025 FATAL [192.168.1.196:16000.activeMasterManager] master.HMaster: Unhandled exception. Starting shutdown.
java.io.IOException: Error accessing ceph://ceph-mon:6789/hbase/data/hbase/meta/.tabledesc
        at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1485)
        at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1523)
        at org.apache.hadoop.hbase.util.FSUtils.listStatus(FSUtils.java:1721)
        at org.apache.hadoop.hbase.util.FSTableDescriptors.getCurrentTableInfoStatus(FSTableDescriptors.java:369)
        at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableInfoPath(FSTableDescriptors.java:350)
        at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableInfoPath(FSTableDescriptors.java:331)
        at org.apache.hadoop.hbase.util.FSTableDescriptorMigrationToSubdir.needsMigration(FSTableDescriptorMigrationToSubdir.java:58)
        at org.apache.hadoop.hbase.util.FSTableDescriptorMigrationToSubdir.migrateFSTableDescriptorsIfNecessary(FSTableDescriptorMigrationToSubdir.java:45)
        at org.apache.hadoop.hbase.master.MasterFileSystem.checkRootDir(MasterFileSystem.java:481)
        at org.apache.hadoop.hbase.master.MasterFileSystem.createInitialFileSystemLayout(MasterFileSystem.java:146)
        at org.apache.hadoop.hbase.master.MasterFileSystem.<init>(MasterFileSystem.java:126)
        at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:649)
        at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:182)
        at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1646)
        at java.lang.Thread.run(Thread.java:745)

I found an old message on the ceph mailing list describing the same problem, but with no real answers:
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2014-April/039001.html

Then I realized that .tabledesc is a directory, so I decided to create it manually with:
     hadoop fs -mkdir /hbase/data/hbase/meta/.tabledesc
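
For anyone who wants to repeat this step, here is a minimal Python sketch of the same idea: check whether the directory exists and create it only if it does not. The helper names (mkdir_commands, ensure_dir) are hypothetical. It assumes the hadoop CLI is on the PATH and uses only the standard "hadoop fs -test -d" and "hadoop fs -mkdir -p" shell commands.

```python
import shutil
import subprocess


def mkdir_commands(path):
    """Build the two argv lists: test for a directory, then create it."""
    return [
        ["hadoop", "fs", "-test", "-d", path],   # exit code 0 if path is a directory
        ["hadoop", "fs", "-mkdir", "-p", path],  # -p also creates missing parents
    ]


def ensure_dir(path):
    """Create path unless it already exists.

    Returns False without doing anything if the hadoop CLI is not installed.
    """
    if shutil.which("hadoop") is None:
        return False
    test_cmd, mkdir_cmd = mkdir_commands(path)
    if subprocess.run(test_cmd).returncode != 0:
        subprocess.run(mkdir_cmd, check=True)
    return True


# Show the commands that would be run for the directory from this thread.
for cmd in mkdir_commands("/hbase/data/hbase/meta/.tabledesc"):
    print(" ".join(cmd))
```

This only automates the workaround; it does not explain why hbase could not create the directory itself.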

After starting the hbase master again, I got a different error, a NullPointerException in Globber.java.

2016-01-06 19:38:03,067 INFO  [192.168.1.196:16000.activeMasterManager] master.ActiveMasterManager: Registered Active Master=192.168.1.196,16000,1452109080969
2016-01-06 19:38:03,133 INFO  [master//192.168.1.196:16000] zookeeper.RecoverableZooKeeper: Process identifier=hconnection-0x4ad81458 connecting to ZooKeeper ensemble=localhost:2181
2016-01-06 19:38:03,133 INFO  [master//192.168.1.196:16000] zookeeper.ZooKeeper: Initiating client connection, connectString=localhost:2181 sessionTimeout=90000 watcher=hconnection-0x4ad814580x0, quorum=localhost:2181, baseZNode=/hbase
2016-01-06 19:38:03,135 INFO  [master//192.168.1.196:16000-SendThread(localhost:2181)] zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2016-01-06 19:38:03,136 INFO  [master//192.168.1.196:16000-SendThread(localhost:2181)] zookeeper.ClientCnxn: Socket connection established to localhost/127.0.0.1:2181, initiating session
2016-01-06 19:38:03,140 INFO  [master//192.168.1.196:16000-SendThread(localhost:2181)] zookeeper.ClientCnxn: Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x15213a42e10000b, negotiated timeout = 90000
2016-01-06 19:38:03,143 INFO  [master//192.168.1.196:16000] client.ZooKeeperRegistry: ClusterId read in ZooKeeper is null
2016-01-06 19:38:03,272 INFO  [192.168.1.196:16000.activeMasterManager] util.FSTableDescriptorMigrationToSubdir: Migrating user tables
2016-01-06 19:38:03,290 INFO  [192.168.1.196:16000.activeMasterManager] util.FSTableDescriptorMigrationToSubdir: Migrating system tables
2016-01-06 19:38:03,292 INFO  [192.168.1.196:16000.activeMasterManager] util.FSTableDescriptorMigrationToSubdir: Migration complete.
2016-01-06 19:38:03,305 INFO  [192.168.1.196:16000.activeMasterManager] ceph.CephFileSystem: selectDataPool path=ceph://ceph-mon:6789/hbase/data/hbase/meta/.tmp/.tableinfo.0000000001 pool:repl=cephfs_data:2 wanted=3
2016-01-06 19:38:03,418 FATAL [192.168.1.196:16000.activeMasterManager] master.HMaster: Failed to become active master
java.lang.NullPointerException
        at org.apache.hadoop.fs.Globber.glob(Globber.java:218)
        at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:1623)
        at org.apache.hadoop.hbase.util.FSUtils.getTableDirs(FSUtils.java:1368)
        at org.apache.hadoop.hbase.master.MasterFileSystem.checkTempDir(MasterFileSystem.java:506)
        at org.apache.hadoop.hbase.master.MasterFileSystem.createInitialFileSystemLayout(MasterFileSystem.java:149)
        at org.apache.hadoop.hbase.master.MasterFileSystem.<init>(MasterFileSystem.java:126)
        at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:649)
        at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:182)
        at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1646)
        at java.lang.Thread.run(Thread.java:745)
2016-01-06 19:38:03,422 FATAL [192.168.1.196:16000.activeMasterManager] master.HMaster: Unhandled exception. Starting shutdown.
java.lang.NullPointerException
        at org.apache.hadoop.fs.Globber.glob(Globber.java:218)
        at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:1623)
        at org.apache.hadoop.hbase.util.FSUtils.getTableDirs(FSUtils.java:1368)
        at org.apache.hadoop.hbase.master.MasterFileSystem.checkTempDir(MasterFileSystem.java:506)
        at org.apache.hadoop.hbase.master.MasterFileSystem.createInitialFileSystemLayout(MasterFileSystem.java:149)
        at org.apache.hadoop.hbase.master.MasterFileSystem.<init>(MasterFileSystem.java:126)
        at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:649)
        at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:182)
        at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1646)
        at java.lang.Thread.run(Thread.java:745)
2016-01-06 19:38:03,424 INFO  [192.168.1.196:16000.activeMasterManager] regionserver.HRegionServer: STOPPED: Unhandled exception. Starting shutdown.
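
To compare the two failures side by side, here is a minimal Python sketch that pulls each FATAL entry out of a master log together with its exception line and top stack frame. The sample text is abridged from the logs in this message. The function name and the regular expressions are my own assumptions about the log layout, not anything provided by hbase.

```python
import re

# Abridged sample taken from the two master logs in this message.
LOG = """\
2016-01-06 14:36:09,022 FATAL [192.168.1.196:16000.activeMasterManager] master.HMaster: Failed to become active master
java.io.IOException: Error accessing ceph://ceph-mon:6789/hbase/data/hbase/meta/.tabledesc
        at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1485)
        at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1523)
2016-01-06 19:38:03,418 FATAL [192.168.1.196:16000.activeMasterManager] master.HMaster: Failed to become active master
java.lang.NullPointerException
        at org.apache.hadoop.fs.Globber.glob(Globber.java:218)
        at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:1623)
2016-01-06 19:38:03,424 INFO  [192.168.1.196:16000.activeMasterManager] regionserver.HRegionServer: STOPPED: Unhandled exception. Starting shutdown.
"""

FATAL_RE = re.compile(r"^\S+ \S+ FATAL +\[[^\]]*\] (\S+): (.*)$")
LOG_LINE_RE = re.compile(r"^\d{4}-\d{2}-\d{2} ")
FRAME_RE = re.compile(r"^\s+at (\S+)\((\S+)\)$")


def fatal_entries(text):
    """Return one dict per FATAL line: logger, message, exception, top frame."""
    entries = []
    current = None
    for line in text.splitlines():
        fatal = FATAL_RE.match(line)
        if fatal:
            current = {
                "logger": fatal.group(1),
                "message": fatal.group(2),
                "exception": None,
                "top_frame": None,
            }
            entries.append(current)
        elif LOG_LINE_RE.match(line):
            # Any other timestamped line ends the current stack trace.
            current = None
        elif current is not None:
            frame = FRAME_RE.match(line)
            if frame:
                if current["top_frame"] is None:
                    current["top_frame"] = frame.group(1)
            elif current["exception"] is None:
                current["exception"] = line.strip()
    return entries


for entry in fatal_entries(LOG):
    print(entry["exception"], "<-", entry["top_frame"])
```

On the sample above this reports the IOException raised in FileSystem.listStatus for the first run and the NullPointerException in Globber.glob for the second.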

Maybe someone can give me a hint on this? It seems there aren't a lot of people using ceph+hbase, but I don't lose anything by asking :)

This is my current hbase-site.xml, just in case:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>ceph://ceph-mon:6789/hbase</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>ceph://ceph-mon:6789/zookeeper</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>localhost</value>
  </property>
  <property>
    <name>fs.ceph.impl</name>
    <value>org.apache.hadoop.fs.ceph.CephFileSystem</value>
  </property>
  <property>
    <name>fs.AbstractFileSystem.ceph.impl</name>
    <value>org.apache.hadoop.fs.ceph.CephFs</value>
  </property>
</configuration>
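
As a quick sanity check of a file like the one above, here is a minimal Python sketch that loads the properties and confirms that the scheme used in hbase.rootdir has a matching fs.<scheme>.impl entry. The embedded XML is an abridged copy of my config, and the helper names (load_properties, missing_fs_impl) are hypothetical, not part of hbase or hadoop.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Abridged copy of the hbase-site.xml above (ZooKeeper properties omitted).
HBASE_SITE = """<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>ceph://ceph-mon:6789/hbase</value>
  </property>
  <property>
    <name>fs.ceph.impl</name>
    <value>org.apache.hadoop.fs.ceph.CephFileSystem</value>
  </property>
</configuration>"""


def load_properties(xml_text):
    """Return the <property> entries of a Hadoop-style config as a dict."""
    root = ET.fromstring(xml_text)
    return {
        prop.findtext("name"): prop.findtext("value")
        for prop in root.findall("property")
    }


def missing_fs_impl(props, key="hbase.rootdir"):
    """Return the fs.<scheme>.impl key if it is absent from props, else None."""
    scheme = urlparse(props[key]).scheme
    impl_key = "fs.{}.impl".format(scheme)
    return None if impl_key in props else impl_key


props = load_properties(HBASE_SITE)
print(missing_fs_impl(props))
```

This only inspects the file itself. It says nothing about whether the class is actually on the classpath or whether the ceph side is configured correctly.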

Thanks in advance!
________________________________________
From: Yan, Zheng <ukernel@xxxxxxxxx>
Sent: Thursday, December 31, 2015 02:55 a.m.
To: Jose M
Subject: Re: Ceph & Hbase

I have no knowledge of hadoop/hbase. The "Permission denied" exception
on mount is likely caused by an incorrect ceph configuration (I didn't
see any ceph-related options in the hbase config).

The following URL is a mail from a user who claims to have run hbase successfully:
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2013-July/002856.html