Blargh, sounds like we need a better error message there, thanks!
-Sam

On Mon, Jun 6, 2016 at 12:16 PM, Tu Holmes <tu.holmes@xxxxxxxxx> wrote:
> It was a permission issue. Although I followed the process and haven't
> changed any data, the "current map" files on each OSD were still owned by
> root, because they were created while the older ceph processes were still
> running.
>
> Changing that after the fact was still necessary, and I will make sure that
> those are also properly changed.
>
> On Mon, Jun 6, 2016 at 12:12 PM Samuel Just <sjust@xxxxxxxxxx> wrote:
>>
>> Oh, what was the problem (for posterity)?
>> -Sam
>>
>> On Mon, Jun 6, 2016 at 12:11 PM, Tu Holmes <tu.holmes@xxxxxxxxx> wrote:
>> > It totally did and I see what the problem is.
>> >
>> > Thanks for your input. I truly appreciate it.
>> >
>> >
>> > On Mon, Jun 6, 2016 at 12:01 PM Samuel Just <sjust@xxxxxxxxxx> wrote:
>> >>
>> >> If you reproduce with
>> >>
>> >> debug osd = 20
>> >> debug filestore = 20
>> >> debug ms = 1
>> >>
>> >> that might make it clearer what is going on.
>> >> -Sam
>> >>
>> >> On Mon, Jun 6, 2016 at 11:53 AM, Tu Holmes <tu.holmes@xxxxxxxxx> wrote:
>> >> > Hey cephers. I have been following the upgrade documents and I have
>> >> > done everything regarding upgrading the client to the latest version
>> >> > of Hammer, then to Jewel.
>> >> >
>> >> > I made sure that the owner of log partitions and all other items is
>> >> > the ceph user and I've gone through the process as was described in
>> >> > the documents, but I am getting this on my nodes as I upgrade.
>> >> >
>> >> > --- begin dump of recent events ---
>> >> > -72> 2016-06-06 11:49:37.720315 7f09d0152800 5 asok(0x7f09dbb78280) register_command perfcounters_dump hook 0x7f09dbb58050
>> >> > -71> 2016-06-06 11:49:37.720328 7f09d0152800 5 asok(0x7f09dbb78280) register_command 1 hook 0x7f09dbb58050
>> >> > -70> 2016-06-06 11:49:37.720330 7f09d0152800 5 asok(0x7f09dbb78280) register_command perf dump hook 0x7f09dbb58050
>> >> > -69> 2016-06-06 11:49:37.720332 7f09d0152800 5 asok(0x7f09dbb78280) register_command perfcounters_schema hook 0x7f09dbb58050
>> >> > -68> 2016-06-06 11:49:37.720333 7f09d0152800 5 asok(0x7f09dbb78280) register_command 2 hook 0x7f09dbb58050
>> >> > -67> 2016-06-06 11:49:37.720334 7f09d0152800 5 asok(0x7f09dbb78280) register_command perf schema hook 0x7f09dbb58050
>> >> > -66> 2016-06-06 11:49:37.720335 7f09d0152800 5 asok(0x7f09dbb78280) register_command perf reset hook 0x7f09dbb58050
>> >> > -65> 2016-06-06 11:49:37.720337 7f09d0152800 5 asok(0x7f09dbb78280) register_command config show hook 0x7f09dbb58050
>> >> > -64> 2016-06-06 11:49:37.720338 7f09d0152800 5 asok(0x7f09dbb78280) register_command config set hook 0x7f09dbb58050
>> >> > -63> 2016-06-06 11:49:37.720339 7f09d0152800 5 asok(0x7f09dbb78280) register_command config get hook 0x7f09dbb58050
>> >> > -62> 2016-06-06 11:49:37.720340 7f09d0152800 5 asok(0x7f09dbb78280) register_command config diff hook 0x7f09dbb58050
>> >> > -61> 2016-06-06 11:49:37.720342 7f09d0152800 5 asok(0x7f09dbb78280) register_command log flush hook 0x7f09dbb58050
>> >> > -60> 2016-06-06 11:49:37.720343 7f09d0152800 5 asok(0x7f09dbb78280) register_command log dump hook 0x7f09dbb58050
>> >> > -59> 2016-06-06 11:49:37.720344 7f09d0152800 5 asok(0x7f09dbb78280) register_command log reopen hook 0x7f09dbb58050
>> >> > -58> 2016-06-06 11:49:37.723459 7f09d0152800 0 set uid:gid to 1000:1000 (ceph:ceph)
>> >> > -57> 2016-06-06 11:49:37.723476 7f09d0152800 0 ceph version 10.2.1 (3a66dd4f30852819c1bdaa8ec23c795d4ad77269), process ceph-osd, pid 9943
>> >> > -56> 2016-06-06 11:49:37.727080 7f09d0152800 1 -- 10.253.50.213:0/0 learned my addr 10.253.50.213:0/0
>> >> > -55> 2016-06-06 11:49:37.727092 7f09d0152800 1 accepter.accepter.bind my_inst.addr is 10.253.50.213:6806/9943 need_addr=0
>> >> > -54> 2016-06-06 11:49:37.727104 7f09d0152800 1 -- 172.16.1.3:0/0 learned my addr 172.16.1.3:0/0
>> >> > -53> 2016-06-06 11:49:37.727109 7f09d0152800 1 accepter.accepter.bind my_inst.addr is 172.16.1.3:6806/9943 need_addr=0
>> >> > -52> 2016-06-06 11:49:37.727119 7f09d0152800 1 -- 172.16.1.3:0/0 learned my addr 172.16.1.3:0/0
>> >> > -51> 2016-06-06 11:49:37.727129 7f09d0152800 1 accepter.accepter.bind my_inst.addr is 172.16.1.3:6807/9943 need_addr=0
>> >> > -50> 2016-06-06 11:49:37.727139 7f09d0152800 1 -- 10.253.50.213:0/0 learned my addr 10.253.50.213:0/0
>> >> > -49> 2016-06-06 11:49:37.727143 7f09d0152800 1 accepter.accepter.bind my_inst.addr is 10.253.50.213:6807/9943 need_addr=0
>> >> > -48> 2016-06-06 11:49:37.727148 7f09d0152800 0 pidfile_write: ignore empty --pid-file
>> >> > -47> 2016-06-06 11:49:37.728364 7f09d0152800 5 asok(0x7f09dbb78280) init /var/run/ceph/ceph-osd.8.asok
>> >> > -46> 2016-06-06 11:49:37.728417 7f09d0152800 5 asok(0x7f09dbb78280) bind_and_listen /var/run/ceph/ceph-osd.8.asok
>> >> > -45> 2016-06-06 11:49:37.728472 7f09d0152800 5 asok(0x7f09dbb78280) register_command 0 hook 0x7f09dbb54110
>> >> > -44> 2016-06-06 11:49:37.728488 7f09d0152800 5 asok(0x7f09dbb78280) register_command version hook 0x7f09dbb54110
>> >> > -43> 2016-06-06 11:49:37.728493 7f09d0152800 5 asok(0x7f09dbb78280) register_command git_version hook 0x7f09dbb54110
>> >> > -42> 2016-06-06 11:49:37.728498 7f09d0152800 5 asok(0x7f09dbb78280) register_command help hook 0x7f09dbb58230
>> >> > -41> 2016-06-06 11:49:37.728502 7f09d0152800 5 asok(0x7f09dbb78280) register_command get_command_descriptions hook 0x7f09dbb58220
>> >> > -40> 2016-06-06 11:49:37.728544 7f09d0152800 10 monclient(hunting): build_initial_monmap
>> >> > -39> 2016-06-06 11:49:37.734765 7f09c9df5700 5 asok(0x7f09dbb78280) entry start
>> >> > -38> 2016-06-06 11:49:37.736541 7f09d0152800 5 adding auth protocol: cephx
>> >> > -37> 2016-06-06 11:49:37.736552 7f09d0152800 5 adding auth protocol: cephx
>> >> > -36> 2016-06-06 11:49:37.736672 7f09d0152800 5 asok(0x7f09dbb78280) register_command objecter_requests hook 0x7f09dbb58270
>> >> > -35> 2016-06-06 11:49:37.736710 7f09d0152800 1 -- 10.253.50.213:6806/9943 messenger.start
>> >> > -34> 2016-06-06 11:49:37.736732 7f09d0152800 1 -- :/0 messenger.start
>> >> > -33> 2016-06-06 11:49:37.736748 7f09d0152800 1 -- 10.253.50.213:6807/9943 messenger.start
>> >> > -32> 2016-06-06 11:49:37.736763 7f09d0152800 1 -- 172.16.1.3:6807/9943 messenger.start
>> >> > -31> 2016-06-06 11:49:37.736774 7f09d0152800 1 -- 172.16.1.3:6806/9943 messenger.start
>> >> > -30> 2016-06-06 11:49:37.736786 7f09d0152800 1 -- :/0 messenger.start
>> >> > -29> 2016-06-06 11:49:37.736821 7f09d0152800 2 osd.8 0 mounting /var/lib/ceph/osd/ceph-8 /var/lib/ceph/osd/ceph-8/journal
>> >> > -28> 2016-06-06 11:49:37.736864 7f09d0152800 0 filestore(/var/lib/ceph/osd/ceph-8) backend xfs (magic 0x58465342)
>> >> > -27> 2016-06-06 11:49:37.737113 7f09d0152800 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-8) detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config option
>> >> > -26> 2016-06-06 11:49:37.737127 7f09d0152800 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-8) detect_features: SEEK_DATA/SEEK_HOLE is disabled via 'filestore seek data hole' config option
>> >> > -25> 2016-06-06 11:49:37.737144 7f09d0152800 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-8) detect_features: splice is supported
>> >> > -24> 2016-06-06 11:49:37.748618 7f09d0152800 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-8) detect_features: syncfs(2) syscall fully supported (by glibc and kernel)
>> >> > -23> 2016-06-06 11:49:37.748704 7f09d0152800 0 xfsfilestorebackend(/var/lib/ceph/osd/ceph-8) detect_feature: extsize is disabled by conf
>> >> > -22> 2016-06-06 11:49:37.749225 7f09d0152800 1 leveldb: Recovering log #62532
>> >> > -21> 2016-06-06 11:49:37.837041 7f09d0152800 1 leveldb: Delete type=3 #62531
>> >> > -20> 2016-06-06 11:49:37.837098 7f09d0152800 1 leveldb: Delete type=0 #62532
>> >> > -19> 2016-06-06 11:49:37.837539 7f09d0152800 0 filestore(/var/lib/ceph/osd/ceph-8) mount: enabling WRITEAHEAD journal mode: checkpoint is not enabled
>> >> > -18> 2016-06-06 11:49:37.839069 7f09d0152800 2 journal open /var/lib/ceph/osd/ceph-8/journal fsid 12375982-ca61-4170-93a0-03f1930ced83 fs_op_seq 58991306
>> >> > -17> 2016-06-06 11:49:37.839106 7f09d0152800 1 journal _open /var/lib/ceph/osd/ceph-8/journal fd 18: 10737418240 bytes, block size 4096 bytes, directio = 1, aio = 1
>> >> > -16> 2016-06-06 11:49:37.873359 7f09d0152800 2 journal No further valid entries found, journal is most likely valid
>> >> > -15> 2016-06-06 11:49:37.873378 7f09d0152800 2 journal No further valid entries found, journal is most likely valid
>> >> > -14> 2016-06-06 11:49:37.873382 7f09d0152800 3 journal journal_replay: end of journal, done.
>> >> > -13> 2016-06-06 11:49:37.873430 7f09d0152800 1 journal _open /var/lib/ceph/osd/ceph-8/journal fd 18: 10737418240 bytes, block size 4096 bytes, directio = 1, aio = 1
>> >> > -12> 2016-06-06 11:49:37.896293 7f09d0152800 1 filestore(/var/lib/ceph/osd/ceph-8) upgrade
>> >> > -11> 2016-06-06 11:49:37.896346 7f09d0152800 2 osd.8 0 boot
>> >> > -10> 2016-06-06 11:49:37.896597 7f09d0152800 1 <cls> cls/replica_log/cls_replica_log.cc:141: Loaded replica log class!
>> >> > -9> 2016-06-06 11:49:37.898335 7f09d0152800 1 <cls> cls/refcount/cls_refcount.cc:232: Loaded refcount class!
>> >> > -8> 2016-06-06 11:49:37.898423 7f09d0152800 1 <cls> cls/timeindex/cls_timeindex.cc:259: Loaded timeindex class!
>> >> > -7> 2016-06-06 11:49:37.898507 7f09d0152800 1 <cls> cls/statelog/cls_statelog.cc:306: Loaded log class!
>> >> > -6> 2016-06-06 11:49:37.898587 7f09d0152800 1 <cls> cls/log/cls_log.cc:317: Loaded log class!
>> >> > -5> 2016-06-06 11:49:37.908832 7f09d0152800 1 <cls> cls/user/cls_user.cc:375: Loaded user class!
>> >> > -4> 2016-06-06 11:49:37.910755 7f09d0152800 1 <cls> cls/rgw/cls_rgw.cc:3207: Loaded rgw class!
>> >> > -3> 2016-06-06 11:49:37.920661 7f09d0152800 0 <cls> cls/cephfs/cls_cephfs.cc:202: loading cephfs_size_scan
>> >> > -2> 2016-06-06 11:49:37.920774 7f09d0152800 0 <cls> cls/hello/cls_hello.cc:305: loading cls_hello
>> >> > -1> 2016-06-06 11:49:37.920926 7f09d0152800 1 <cls> cls/version/cls_version.cc:228: Loaded version class!
>> >> > 0> 2016-06-06 11:49:37.940701 7f09d0152800 -1 osd/OSD.h: In function 'OSDMapRef OSDService::get_map(epoch_t)' thread 7f09d0152800 time 2016-06-06 11:49:37.921083
>> >> > osd/OSD.h: 885: FAILED assert(ret)
>> >> >
>> >> > ceph version 10.2.1 (3a66dd4f30852819c1bdaa8ec23c795d4ad77269)
>> >> > 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x8b) [0x7f09d0b50e2b]
>> >> > 2: (OSDService::get_map(unsigned int)+0x3d) [0x7f09d055c85d]
>> >> > 3: (OSD::init()+0x1ed2) [0x7f09d0512052]
>> >> > 4: (main()+0x29d1) [0x7f09d0479cb1]
>> >> > 5: (__libc_start_main()+0xf5) [0x7f09cd075ec5]
>> >> > 6: (()+0x353987) [0x7f09d04c2987]
>> >> > NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>> >> >
>> >> > --- logging levels ---
>> >> > 0/ 5 none
>> >> > 0/ 1 lockdep
>> >> > 0/ 1 context
>> >> > 1/ 1 crush
>> >> > 1/ 5 mds
>> >> > 1/ 5 mds_balancer
>> >> > 1/ 5 mds_locker
>> >> > 1/ 5 mds_log
>> >> > 1/ 5 mds_log_expire
>> >> > 1/ 5 mds_migrator
>> >> > 0/ 1 buffer
>> >> > 0/ 1 timer
>> >> > 0/ 1 filer
>> >> > 0/ 1 striper
>> >> > 0/ 1 objecter
>> >> > 0/ 5 rados
>> >> > 0/ 5 rbd
>> >> > 0/ 5 rbd_mirror
>> >> > 0/ 5 rbd_replay
>> >> > 0/ 5 journaler
>> >> > 0/ 5 objectcacher
>> >> > 0/ 5 client
>> >> > 0/ 5 osd
>> >> > 0/ 5 optracker
>> >> > 0/ 5 objclass
>> >> > 1/ 3 filestore
>> >> > 1/ 3 journal
>> >> > 0/ 5 ms
>> >> > 1/ 5 mon
>> >> > 0/10 monc
>> >> > 1/ 5 paxos
>> >> > 0/ 5 tp
>> >> > 1/ 5 auth
>> >> > 1/ 5 crypto
>> >> > 1/ 1 finisher
>> >> > 1/ 5 heartbeatmap
>> >> > 1/ 5 perfcounter
>> >> > 1/ 5 rgw
>> >> > 1/10 civetweb
>> >> > 1/ 5 javaclient
>> >> > 1/ 5 asok
>> >> > 1/ 1 throttle
>> >> > 0/ 0 refs
>> >> > 1/ 5 xio
>> >> > 1/ 5 compressor
>> >> > 1/ 5 newstore
>> >> > 1/ 5 bluestore
>> >> > 1/ 5 bluefs
>> >> > 1/ 3 bdev
>> >> > 1/ 5 kstore
>> >> > 4/ 5 rocksdb
>> >> > 4/ 5 leveldb
>> >> > 1/ 5 kinetic
>> >> > 1/ 5 fuse
>> >> > -2/-2 (syslog threshold)
>> >> > -1/-1 (stderr threshold)
>> >> > max_recent 10000
>> >> > max_new 1000
>> >> > log_file /var/log/ceph/ceph-osd.8.log
>> >> > --- end dump of recent events ---
>> >> >
>> >> >
>> >> > Now, if I do a zap on the disk and add everything back manually, that
>> >> > works; however, I would really rather not do that for 12 x 10 nodes.
>> >> >
>> >> > Does anyone have any ideas?
>> >> >
>> >> > Thanks.
>> >> >
>> >> > -Tu Holmes
>> >> >
>> >> >
>> >> > _______________________________________________
>> >> > ceph-users mailing list
>> >> > ceph-users@xxxxxxxxxxxxxx
>> >> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >> >
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
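
For posterity, since the root cause in this thread was map files under the OSD data directory still being owned by root: a minimal sketch of how one might list leftover non-ceph-owned paths before starting the upgraded daemon. The function name find_foreign_owned is hypothetical and not part of Ceph; the path /var/lib/ceph/osd/ceph-8 is taken from the log above, and the ceph user name is an assumption based on the "set uid:gid to 1000:1000 (ceph:ceph)" line. It only reports paths and changes nothing.

```python
import os


def find_foreign_owned(root, expected_uid):
    """Yield every path at or below `root` whose owner uid is not `expected_uid`.

    Hypothetical helper for illustration; it is not part of Ceph.
    Uses lstat so symlinks (e.g. a journal link) are checked themselves
    rather than followed.
    """
    # os.walk never reports the top directory as an entry, so check it first.
    if os.lstat(root).st_uid != expected_uid:
        yield root
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                # The entry vanished or is unreadable; skip it.
                continue
            if st.st_uid != expected_uid:
                yield path
```

On an OSD host it could be called along these lines (run as root, with the OSD stopped; both the user name and the path are assumptions to adjust):

    import pwd
    ceph_uid = pwd.getpwnam("ceph").pw_uid
    for p in find_foreign_owned("/var/lib/ceph/osd/ceph-8", ceph_uid):
        print(p)

An empty result would suggest the ownership change was complete for that OSD; any paths printed are candidates for the kind of leftover root-owned files described above.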