On Wed, Jan 9, 2013 at 4:38 PM, Sage Weil <sage@xxxxxxxxxxx> wrote:
> On Wed, 9 Jan 2013, Ian Pye wrote:
>> Hi,
>>
>> Every time I try an bring up an OSD, it crashes and I get the
>> following: "error (121) Remote I/O error not handled on operation 20"
>
> This error code (EREMOTEIO) is not used by Ceph. What fs are you using?
> Which kernel version? Anything else unusual happen with your hardware
> recently that might have wreaked havoc on your underlying fs?

3.7.1 kernel with XFS. It's a demo-box from a vendor, so should be brand new.

I'm going to say it's a disk error, given the following:

mkfs.xfs: read failed: Input/output error

Interestingly, running an osd and btrfs worked fine on the same disk.

Thanks for the help,

Ian

>
> sage
>
>
>
>> The cluster is new and only has a little bit of data on it. Any ideas
>> what is going on? Does Remote I/O mean a network error? Full log
>> below:
>>
>> -9> 2013-01-10 00:00:20.182237 7f2ddde8f910 0
>> filestore(/mnt/dist_j/ceph) error (121) Remote I/O error not handled
>> on operation 20 (12.0.0, or op 0, counting from 0)
>> -8> 2013-01-10 00:00:20.182275 7f2ddde8f910 0
>> filestore(/mnt/dist_j/ceph) unexpected error code
>> -7> 2013-01-10 00:00:20.182285 7f2ddde8f910 0
>> filestore(/mnt/dist_j/ceph) transaction dump:
>> { "ops": [
>> { "op_num": 0,
>> "op_name": "mkcoll",
>> "collection": "0.2c0_head"},
>> { "op_num": 1,
>> "op_name": "collection_setattr",
>> "collection": "0.2c0_head",
>> "name": "info",
>> "length": 5},
>> { "op_num": 2,
>> "op_name": "truncate",
>> "collection": "meta",
>> "oid": "a04c46e9\/pginfo_0.2c0\/0\/\/-1",
>> "offset": 0},
>> { "op_num": 3,
>> "op_name": "write",
>> "collection": "meta",
>> "oid": "a04c46e9\/pginfo_0.2c0\/0\/\/-1",
>> "length": 531,
>> "offset": 0,
>> "bufferlist length": 531},
>> { "op_num": 4,
>> "op_name": "remove",
>> "collection": "meta",
>> "oid": "1f9ede85\/pglog_0.2c0\/0\/\/-1"},
>> { "op_num": 5,
>> "op_name": "write",
>> "collection": "meta",
>> "oid": "1f9ede85\/pglog_0.2c0\/0\/\/-1",
>> "length": 0,
>> "offset": 0,
>> "bufferlist length": 0},
>> { "op_num": 6,
>> "op_name": "collection_setattr",
>> "collection": "0.2c0_head",
>> "name": "ondisklog",
>> "length": 34},
>> { "op_num": 7,
>> "op_name": "nop"}]}
>> -6> 2013-01-10 00:00:20.183085 7f2dd5e7f910 10 monclient:
>> _send_mon_message to mon.a at 108.162.209.120:6789/0
>> -5> 2013-01-10 00:00:20.183108 7f2dd5e7f910 1 --
>> 108.162.209.120:6834/6359 --> 108.162.209.120:6789/0 -- osd_pgtemp(e22
>> {0.110=[8,9],0.147=[3,9],0.155=[1,9],0.171=[0,9],0.194=[3,9],0.1ad=[10,9],0.1c2=[1,9],0.1cb=[7,9],0.1df=[6,9],0.1e8=[7,9],0.1e9=[11,9],0.1f1=[7,9]}
>> v22) v1 -- ?+0 0x5b15600 con 0x34629a0
>> -4> 2013-01-10 00:00:20.183772 7f2dd6680910 10 monclient:
>> _send_mon_message to mon.a at 108.162.209.120:6789/0
>> -3> 2013-01-10 00:00:20.183797 7f2dd6680910 1 --
>> 108.162.209.120:6834/6359 --> 108.162.209.120:6789/0 -- osd_pgtemp(e22
>> {0.110=[8,9],0.147=[3,9],0.155=[1,9],0.171=[0,9],0.194=[3,9],0.1ad=[10,9],0.1c2=[1,9],0.1cb=[7,9],0.1df=[6,9],0.1e8=[7,9],0.1e9=[11,9],0.1f1=[7,9]}
>> v22) v1 -- ?+0 0x5f75600 con 0x34629a0
>> -2> 2013-01-10 00:00:20.184315 7f2dd5e7f910 10 monclient:
>> _send_mon_message to mon.a at 108.162.209.120:6789/0
>> -1> 2013-01-10 00:00:20.184338 7f2dd5e7f910 1 --
>> 108.162.209.120:6834/6359 --> 108.162.209.120:6789/0 -- osd_pgtemp(e22
>> {0.110=[8,9],0.147=[3,9],0.155=[1,9],0.171=[0,9],0.194=[3,9],0.1ad=[10,9],0.1c2=[1,9],0.1cb=[7,9],0.1df=[6,9],0.1e8=[7,9],0.1e9=[11,9],0.1f1=[7,9]}
>> v22) v1 -- ?+0 0x5b15400 con 0x34629a0
>> 0> 2013-01-10 00:00:20.184755 7f2ddde8f910 -1 os/FileStore.cc: In
>> function 'unsigned int
>> FileStore::_do_transaction(ObjectStore::Transaction&, uint64_t, int)'
>> thread 7f2ddde8f910 time 2013-01-10 00:00:20.182422
>> os/FileStore.cc: 2681: FAILED assert(0 == "unexpected error")
>>
>> ceph version 0.56.1 (e4a541624df62ef353e754391cbbb707f54b16f7)
>> 1: (FileStore::_do_transaction(ObjectStore::Transaction&, unsigned
>> long, int)+0x90a) [0x73e14a]
>> 2: (FileStore::do_transactions(std::list<ObjectStore::Transaction*,
>> std::allocator<ObjectStore::Transaction*> >&, unsigned long)+0x4c)
>> [0x7455dc]
>> 3: (FileStore::_do_op(FileStore::OpSequencer*)+0xab) [0x72428b]
>> 4: (ThreadPool::worker(ThreadPool::WorkThread*)+0x82b) [0x894feb]
>> 5: (ThreadPool::WorkThread::entry()+0x10) [0x8977d0]
>> 6: /lib/libpthread.so.0 [0x7f2de6d087aa]
>> 7: (clone()+0x6d) [0x7f2de518159d]
>> NOTE: a copy of the executable, or `objdump -rdS <executable>` is
>> needed to interpret this.
>>
>> --- logging levels ---
>> 0/ 5 none
>> 0/ 1 lockdep
>> 0/ 1 context
>> 1/ 1 crush
>> 1/ 5 mds
>> 1/ 5 mds_balancer
>> 1/ 5 mds_locker
>> 1/ 5 mds_log
>> 1/ 5 mds_log_expire
>> 1/ 5 mds_migrator
>> 0/ 1 buffer
>> 0/ 1 timer
>> 0/ 1 filer
>> 0/ 1 striper
>> 0/ 1 objecter
>> 0/ 5 rados
>> 0/ 5 rbd
>> 0/ 5 journaler
>> 0/ 5 objectcacher
>> 0/ 5 client
>> 0/ 5 osd
>> 0/ 5 optracker
>> 0/ 5 objclass
>> 1/ 3 filestore
>> 1/ 3 journal
>> 0/ 5 ms
>> 1/ 5 mon
>> 0/10 monc
>> 0/ 5 paxos
>> 0/ 5 tp
>> 1/ 5 auth
>> 1/ 5 crypto
>> 1/ 1 finisher
>> 1/ 5 heartbeatmap
>> 1/ 5 perfcounter
>> 1/ 5 rgw
>> 1/ 5 hadoop
>> 1/ 5 javaclient
>> 1/ 5 asok
>> 1/ 1 throttle
>> -2/-2 (syslog threshold)
>> -1/-1 (stderr threshold)
>> max_recent 100000
>> max_new 1000
>> log_file /var/log/ceph/ceph-osd.9.log
>> --- end dump of recent events ---
>> 2013-01-10 00:00:20.227763 7f2ddde8f910 -1 *** Caught signal (Aborted) **
>> in thread 7f2ddde8f910
>>
>> ceph version 0.56.1 (e4a541624df62ef353e754391cbbb707f54b16f7)
>> 1: /cf/ceph/bin/ceph-osd [0x7a5309]
>> 2: /lib/libpthread.so.0 [0x7f2de6d10a60]
>> 3: (gsignal()+0x35) [0x7f2de50e7f05]
>> 4: (abort()+0x180) [0x7f2de50ead10]
>> 5: (__gnu_cxx::__verbose_terminate_handler()+0x115) [0x7f2de596ed45]
>> 6: /usr/lib/libstdc++.so.6 [0x7f2de596d176]
>> 7: /usr/lib/libstdc++.so.6 [0x7f2de596d1a3]
>> 8: /usr/lib/libstdc++.so.6 [0x7f2de596d29e]
>> 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char
>> const*)+0x7c9) [0x898029]
>> 10: (FileStore::_do_transaction(ObjectStore::Transaction&, unsigned
>> long, int)+0x90a) [0x73e14a]
>> 11: (FileStore::do_transactions(std::list<ObjectStore::Transaction*,
>> std::allocator<ObjectStore::Transaction*> >&, unsigned long)+0x4c)
>> [0x7455dc]
>> 12: (FileStore::_do_op(FileStore::OpSequencer*)+0xab) [0x72428b]
>> 13: (ThreadPool::worker(ThreadPool::WorkThread*)+0x82b) [0x894feb]
>> 14: (ThreadPool::WorkThread::entry()+0x10) [0x8977d0]
>> 15: /lib/libpthread.so.0 [0x7f2de6d087aa]
>> 16: (clone()+0x6d) [0x7f2de518159d]
>> NOTE: a copy of the executable, or `objdump -rdS <executable>` is
>> needed to interpret this.
>>
>> --- begin dump of recent events ---
>> -17> 2013-01-10 00:00:20.184913 7f2dd6680910 10 monclient:
>> _send_mon_message to mon.a at 108.162.209.120:6789/0
>> -16> 2013-01-10 00:00:20.184936 7f2dd6680910 1 --
>> 108.162.209.120:6834/6359 --> 108.162.209.120:6789/0 -- osd_pgtemp(e22
>> {0.110=[8,9],0.147=[3,9],0.155=[1,9],0.171=[0,9],0.194=[3,9],0.1ad=[10,9],0.1c2=[1,9],0.1cb=[7,9],0.1df=[6,9],0.1e8=[7,9],0.1e9=[11,9],0.1f1=[7,9]}
>> v22) v1 -- ?+0 0x5f75400 con 0x34629a0
>> -15> 2013-01-10 00:00:20.185444 7f2dd5e7f910 10 monclient:
>> _send_mon_message to mon.a at 108.162.209.120:6789/0
>> -14> 2013-01-10 00:00:20.185461 7f2dd5e7f910 1 --
>> 108.162.209.120:6834/6359 --> 108.162.209.120:6789/0 -- osd_pgtemp(e22
>> {0.110=[8,9],0.147=[3,9],0.155=[1,9],0.171=[0,9],0.194=[3,9],0.1ad=[10,9],0.1c2=[1,9],0.1cb=[7,9],0.1df=[6,9],0.1e8=[7,9],0.1e9=[11,9],0.1f1=[7,9]}
>> v22) v1 -- ?+0 0x5b15200 con 0x34629a0
>> -13> 2013-01-10 00:00:20.186028 7f2dd6680910 10 monclient:
>> _send_mon_message to mon.a at 108.162.209.120:6789/0
>> -12> 2013-01-10 00:00:20.186049 7f2dd6680910 1 --
>> 108.162.209.120:6834/6359 --> 108.162.209.120:6789/0 -- osd_pgtemp(e22
>> {0.110=[8,9],0.147=[3,9],0.155=[1,9],0.171=[0,9],0.194=[3,9],0.1ad=[10,9],0.1c2=[1,9],0.1cb=[7,9],0.1df=[6,9],0.1e8=[7,9],0.1e9=[11,9],0.1f1=[7,9]}
>> v22) v1 -- ?+0 0x5f75200 con 0x34629a0
>> -11> 2013-01-10 00:00:20.186585 7f2dd5e7f910 10 monclient:
>> _send_mon_message to mon.a at 108.162.209.120:6789/0
>> -10> 2013-01-10 00:00:20.186596 7f2dd5e7f910 1 --
>> 108.162.209.120:6834/6359 --> 108.162.209.120:6789/0 -- osd_pgtemp(e22
>> {0.110=[8,9],0.147=[3,9],0.155=[1,9],0.171=[0,9],0.194=[3,9],0.1ad=[10,9],0.1c2=[1,9],0.1cb=[7,9],0.1df=[6,9],0.1e8=[7,9],0.1e9=[11,9],0.1f1=[7,9]}
>> v22) v1 -- ?+0 0x5b15000 con 0x34629a0
>> -9> 2013-01-10 00:00:20.186956 7f2dd6680910 10 monclient:
>> _send_mon_message to mon.a at 108.162.209.120:6789/0
>> -8> 2013-01-10 00:00:20.186973 7f2dd6680910 1 --
>> 108.162.209.120:6834/6359 --> 108.162.209.120:6789/0 -- osd_pgtemp(e22
>> {0.110=[8,9],0.147=[3,9],0.155=[1,9],0.171=[0,9],0.194=[3,9],0.1ad=[10,9],0.1c2=[1,9],0.1cb=[7,9],0.1df=[6,9],0.1e8=[7,9],0.1e9=[11,9],0.1f1=[7,9]}
>> v22) v1 -- ?+0 0x5f75000 con 0x34629a0
>> -7> 2013-01-10 00:00:20.187288 7f2dd5e7f910 10 monclient:
>> _send_mon_message to mon.a at 108.162.209.120:6789/0
>> -6> 2013-01-10 00:00:20.187298 7f2dd5e7f910 1 --
>> 108.162.209.120:6834/6359 --> 108.162.209.120:6789/0 -- osd_pgtemp(e22
>> {0.110=[8,9],0.147=[3,9],0.155=[1,9],0.171=[0,9],0.194=[3,9],0.1ad=[10,9],0.1c2=[1,9],0.1cb=[7,9],0.1df=[6,9],0.1e8=[7,9],0.1e9=[11,9],0.1f1=[7,9]}
>> v22) v1 -- ?+0 0x387ce00 con 0x34629a0
>> -5> 2013-01-10 00:00:20.187671 7f2dd6680910 10 monclient:
>> _send_mon_message to mon.a at 108.162.209.120:6789/0
>> -4> 2013-01-10 00:00:20.187688 7f2dd6680910 1 --
>> 108.162.209.120:6834/6359 --> 108.162.209.120:6789/0 -- osd_pgtemp(e22
>> {0.110=[8,9],0.147=[3,9],0.155=[1,9],0.171=[0,9],0.194=[3,9],0.1ad=[10,9],0.1c2=[1,9],0.1cb=[7,9],0.1df=[6,9],0.1e8=[7,9],0.1e9=[11,9],0.1f1=[7,9]}
>> v22) v1 -- ?+0 0x393ae00 con 0x34629a0
>> -3> 2013-01-10 00:00:20.187946 7f2dd5e7f910 10 monclient:
>> _send_mon_message to mon.a at 108.162.209.120:6789/0
>> -2> 2013-01-10 00:00:20.187957 7f2dd5e7f910 1 --
>> 108.162.209.120:6834/6359 --> 108.162.209.120:6789/0 -- osd_pgtemp(e22
>> {0.110=[8,9],0.147=[3,9],0.155=[1,9],0.171=[0,9],0.194=[3,9],0.1ad=[10,9],0.1c2=[1,9],0.1cb=[7,9],0.1df=[6,9],0.1e8=[7,9],0.1e9=[11,9],0.1f1=[7,9]}
>> v22) v1 -- ?+0 0x387cc00 con 0x34629a0
>> -1> 2013-01-10 00:00:20.200448 7f2dcfb4d910 1 --
>> 108.162.209.120:6836/6359 >> :/0 pipe(0x38616c0 sd=49 :6836 pgs=0 cs=0
>> l=0).accept sd=49 108.162.209.120:13844/0
>> 0> 2013-01-10 00:00:20.227763 7f2ddde8f910 -1 *** Caught signal
>> (Aborted) **
>> in thread 7f2ddde8f910
>>
>> ceph version 0.56.1 (e4a541624df62ef353e754391cbbb707f54b16f7)
>> 1: /cf/ceph/bin/ceph-osd [0x7a5309]
>> 2: /lib/libpthread.so.0 [0x7f2de6d10a60]
>> 3: (gsignal()+0x35) [0x7f2de50e7f05]
>> 4: (abort()+0x180) [0x7f2de50ead10]
>> 5: (__gnu_cxx::__verbose_terminate_handler()+0x115) [0x7f2de596ed45]
>> 6: /usr/lib/libstdc++.so.6 [0x7f2de596d176]
>> 7: /usr/lib/libstdc++.so.6 [0x7f2de596d1a3]
>> 8: /usr/lib/libstdc++.so.6 [0x7f2de596d29e]
>> 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char
>> const*)+0x7c9) [0x898029]
>> 10: (FileStore::_do_transaction(ObjectStore::Transaction&, unsigned
>> long, int)+0x90a) [0x73e14a]
>> 11: (FileStore::do_transactions(std::list<ObjectStore::Transaction*,
>> std::allocator<ObjectStore::Transaction*> >&, unsigned long)+0x4c)
>> [0x7455dc]
>> 12: (FileStore::_do_op(FileStore::OpSequencer*)+0xab) [0x72428b]
>> 13: (ThreadPool::worker(ThreadPool::WorkThread*)+0x82b) [0x894feb]
>> 14: (ThreadPool::WorkThread::entry()+0x10) [0x8977d0]
>> 15: /lib/libpthread.so.0 [0x7f2de6d087aa]
>> 16: (clone()+0x6d) [0x7f2de518159d]
>> NOTE: a copy of the executable, or `objdump -rdS <executable>` is
>> needed to interpret this.
>>
>> --- logging levels ---
>> 0/ 5 none
>> 0/ 1 lockdep
>> 0/ 1 context
>> 1/ 1 crush
>> 1/ 5 mds
>> 1/ 5 mds_balancer
>> 1/ 5 mds_locker
>> 1/ 5 mds_log
>> 1/ 5 mds_log_expire
>> 1/ 5 mds_migrator
>> 0/ 1 buffer
>> 0/ 1 timer
>> 0/ 1 filer
>> 0/ 1 striper
>> 0/ 1 objecter
>> 0/ 5 rados
>> 0/ 5 rbd
>> 0/ 5 journaler
>> 0/ 5 objectcacher
>> 0/ 5 client
>> 0/ 5 osd
>> 0/ 5 optracker
>> 0/ 5 objclass
>> 1/ 3 filestore
>> 1/ 3 journal
>> 0/ 5 ms
>> 1/ 5 mon
>> 0/10 monc
>> 0/ 5 paxos
>> 0/ 5 tp
>> 1/ 5 auth
>> 1/ 5 crypto
>> 1/ 1 finisher
>> 1/ 5 heartbeatmap
>> 1/ 5 perfcounter
>> 1/ 5 rgw
>> 1/ 5 hadoop
>> 1/ 5 javaclient
>> 1/ 5 asok
>> 1/ 1 throttle
>> -2/-2 (syslog threshold)
>> -1/-1 (stderr threshold)
>> max_recent 100000
>> max_new 1000
>> log_file /var/log/ceph/ceph-osd.9.log
>> --- end dump of recent events ---
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
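
On the question of what "error (121)" in the filestore line refers to: the number is an errno value, and on Linux 121 is EREMOTEIO ("Remote I/O error"). As noted in the thread, Ceph does not use that code itself, so the name alone does not point at the network. Below is a rough Python sketch, not anything from Ceph, that pulls the code and operation number out of such a log line and looks the code up on the local platform. The function names and the regular expression are my own assumptions based only on the single message quoted above; the errno lookup is platform dependent and may return "unknown" outside Linux.

```python
import errno
import os
import re

# Assumed pattern, derived only from the one FileStore message quoted in
# this thread; other Ceph versions may word the message differently.
FILESTORE_ERROR = re.compile(
    r"error \((?P<code>\d+)\) (?P<text>.+?) not handled on operation (?P<op>\d+)"
)


def parse_filestore_error(line):
    """Return (errno_code, message_text, operation), or None if no match."""
    match = FILESTORE_ERROR.search(line)
    if match is None:
        return None
    return int(match.group("code")), match.group("text"), int(match.group("op"))


def describe_errno(code):
    """Return the symbolic name and message for an errno on this platform."""
    # errno.errorcode only contains codes defined by the local C library.
    name = errno.errorcode.get(code, "unknown")
    return name, os.strerror(code)


if __name__ == "__main__":
    line = (
        "filestore(/mnt/dist_j/ceph) error (121) Remote I/O error "
        "not handled on operation 20 (12.0.0, or op 0, counting from 0)"
    )
    parsed = parse_filestore_error(line)
    print(parsed)
    if parsed is not None:
        print(describe_errno(parsed[0]))
```

Since the lookup only names the code, the underlying filesystem and disk still need to be checked separately, which is where the mkfs.xfs read failure above comes in.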