Hi Yan,

I've just checked my build process again... Your patch did get applied to journal.cc in the tree cloned from git. However, when I ran make-dist, the resulting ceph-12.0.3-1744-g84d57eb.tar.bz2 contains an un-patched journal.cc - presumably make-dist is downloading a new copy of journal.cc from github?

If I run ./do_cmake.sh ; cd build ; make, the new ceph-mds binary works perfectly, so I've copied the good binary over /usr/bin/ceph-mds and, good news, my MDS servers now work and the file system is accessible. (A rough sketch of the checks I'm running is at the very end of this mail.)

By the way, I recall Greg Farnum warning against snapshots in June 2016:
<http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-June/010812.html>

Are snapshots still considered to be highly dangerous? And if so, is there a likelihood of this changing in the next year?

thanks again,

Jake

On 16/06/17 09:19, Jake Grimmett wrote:
> Hi Yan,
>
> Many thanks for getting back to me - sorry to cause you bother.
>
> I think I'm patching OK, but can you please check my methodology?
>
> git clone git://github.com/ceph/ceph ; cd ceph
>
> git apply ceph-mds.patch ; ./make-srpm.sh
>
> rpmbuild --rebuild /root/ceph/ceph/ceph-12.0.3-1661-g3ddbfcd.el7.src.rpm
>
> here is the section of the patched src/mds/journal.cc
>
> 2194   // note which segments inodes belong to, so we don't have to start rejournaling them
> 2195   for (const auto &ino : inos) {
> 2196     CInode *in = mds->mdcache->get_inode(ino);
> 2197     if (!in) {
> 2198       dout(0) << "EOpen.replay ino " << ino << " not in metablob" << dendl;
> 2199       assert(in);
> 2200     }
> 2201     _segment->open_files.push_back(&in->item_open_file);
> 2202   }
> 2203   for (const auto &vino : snap_inos) {
> 2204     CInode *in = mds->mdcache->get_inode(vino);
> 2205     if (!in) {
> 2206       dout(0) << "EOpen.replay ino " << vino << " not in metablob" << dendl;
> 2207       continue;
> 2208     }
>
> many thanks for your time,
>
> Jake
>
>
> On 16/06/17 08:04, Yan, Zheng wrote:
>> On Thu, Jun 15, 2017 at 7:32 PM, Jake Grimmett <jog@xxxxxxxxxxxxxxxxx> wrote:
>>> Hi Yan,
>>>
>>> Many thanks for looking into this and providing a patch.
>>>
>>> I've downloaded ceph 12.0.3-1661-g3ddbfcd, applied your patch, rebuilt
>>> the rpms, and installed across my cluster.
>>>
>>> Unfortunately, the MDS are still crashing, any ideas welcome :)
>>>
>>> With "debug_mds = 10" the full Log is 140MB, a truncated version of the
>>> log immediately preceding the crash follows:
>>>
>>> best,
>>>
>>> Jake
>>>
>>>     -5> 2017-06-15 12:21:14.084373 7f77fe590700 10 mds.0.journal EMetaBlob.replay added (full) [dentry #1/isilon/sc/users/spc/JessComb_AB_230115/JessB_TO_190115_F6_1/n0/JessB_TO_190115_F6_1.peaks_int [9f,head] auth NULL (dversion lock) v=3104 inode=0 state=1073741888|bottomlru 0x7f781a3f1860]
>>>     -4> 2017-06-15 12:21:14.084375 7f77fe590700 10 mds.0.journal EMetaBlob.replay added [inode 1000147f773 [9f,head] /isilon/sc/users/spc/JessComb_AB_230115/JessB_TO_190115_F6_1/n0/JessB_TO_190115_F6_1.peaks_int auth v3104 s=4 n(v0 b4 1=1+0) (iversion lock) cr={3554272=0-4194304@9e} 0x7f781a3f5800]
>>>     -3> 2017-06-15 12:21:14.084379 7f77fe590700 10 mds.0.journal EMetaBlob.replay added (full) [dentry #1/isilon/sc/users/spc/JessComb_AB_230115/JessB_TO_190115_F6_1/n0/JessB_TO_190115_F6_1.peaks_maxt [9f,head] auth NULL (dversion lock) v=3132 inode=0 state=1073741888|bottomlru 0x7f781a3f1d40]
>>>     -2> 2017-06-15 12:21:14.084381 7f77fe590700 10 mds.0.journal EMetaBlob.replay added [inode 1000147f775 [9f,head] /isilon/sc/users/spc/JessComb_AB_230115/JessB_TO_190115_F6_1/n0/JessB_TO_190115_F6_1.peaks_maxt auth v3132 s=4 n(v0 b4 1=1+0) (iversion lock) cr={3554272=0-4194304@9e} 0x7f781a3f5e00]
>>>     -1> 2017-06-15 12:21:14.084406 7f77fe590700 0 mds.0.journal EOpen.replay ino 1000147761b.9a not in metablob
>>>      0> 2017-06-15 12:21:14.085348 7f77fe590700 -1 /root/rpmbuild/BUILD/ceph-12.0.3-1661-g3ddbfcd/src/mds/journal.cc: In function 'virtual void EOpen::replay(MDSRank*)' thread 7f77fe590700 time 2017-06-15 12:21:14.084409
>>> /root/rpmbuild/BUILD/ceph-12.0.3-1661-g3ddbfcd/src/mds/journal.cc: 2207: FAILED assert(in)
>>>
>> The assertion should be removed by my patch. Maybe you didn't cleanly
>> apply the patch.
>>
>>
>> Regards
>> Yan, Zheng
>>
>>> ceph version 12.0.3-1661-g3ddbfcd (3ddbfcd4357ab3a3c2f17f86f88dc83172d4ce0d) luminous (dev)
>>> 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x110) [0x7f780d290500]
>>> 2: (EOpen::replay(MDSRank*)+0x3e5) [0x7f780d2397b5]
>>> 3: (MDLog::_replay_thread()+0x5f2) [0x7f780d1efd12]
>>> 4: (MDLog::ReplayThread::entry()+0xd) [0x7f780cf9b6ad]
>>> 5: (()+0x7dc5) [0x7f780adb4dc5]
>>> 6: (clone()+0x6d) [0x7f7809e9476d]
>>> NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>>>
>>> --- logging levels ---
>>> 0/ 5 none
>>> 0/ 1 lockdep
>>> 0/ 1 context
>>> 1/ 1 crush
>>> 10/10 mds
>>> 1/ 5 mds_balancer
>>> 1/ 5 mds_locker
>>> 1/ 5 mds_log
>>> 1/ 5 mds_log_expire
>>> 1/ 5 mds_migrator
>>> 0/ 1 buffer
>>> 0/ 1 timer
>>> 0/ 1 filer
>>> 0/ 1 striper
>>> 0/ 1 objecter
>>> 0/ 5 rados
>>> 0/ 5 rbd
>>> 0/ 5 rbd_mirror
>>> 0/ 5 rbd_replay
>>> 0/ 5 journaler
>>> 0/ 5 objectcacher
>>> 0/ 5 client
>>> 1/ 5 osd
>>> 0/ 5 optracker
>>> 0/ 5 objclass
>>> 1/ 3 filestore
>>> 1/ 3 journal
>>> 0/ 5 ms
>>> 1/ 5 mon
>>> 0/10 monc
>>> 1/ 5 paxos
>>> 0/ 5 tp
>>> 1/ 5 auth
>>> 1/ 5 crypto
>>> 1/ 1 finisher
>>> 1/ 5 heartbeatmap
>>> 1/ 5 perfcounter
>>> 1/ 5 rgw
>>> 1/10 civetweb
>>> 1/ 5 javaclient
>>> 1/ 5 asok
>>> 1/ 1 throttle
>>> 0/ 0 refs
>>> 1/ 5 xio
>>> 1/ 5 compressor
>>> 1/ 5 bluestore
>>> 1/ 5 bluefs
>>> 1/ 3 bdev
>>> 1/ 5 kstore
>>> 4/ 5 rocksdb
>>> 4/ 5 leveldb
>>> 4/ 5 memdb
>>> 1/ 5 kinetic
>>> 1/ 5 fuse
>>> 1/ 5 mgr
>>> 1/ 5 mgrc
>>> 1/ 5 dpdk
>>> 1/ 5 eventtrace
>>> -2/-2 (syslog threshold)
>>> -1/-1 (stderr threshold)
>>> max_recent 10000
>>> max_new 1000
>>> log_file /var/log/ceph/ceph-mds.cephfs1.log
>>> --- end dump of recent events ---
>>> 2017-06-15 12:21:14.101761 7f77fe590700 -1 *** Caught signal (Aborted) **
>>> in thread 7f77fe590700 thread_name:md_log_replay
>>>
>>> ceph version 12.0.3-1661-g3ddbfcd (3ddbfcd4357ab3a3c2f17f86f88dc83172d4ce0d) luminous (dev)
>>> 1: (()+0x57d7ff) [0x7f780d2507ff]
>>> 2: (()+0xf370) [0x7f780adbc370]
>>> 3: (gsignal()+0x37) [0x7f7809dd21d7]
>>> 4: (abort()+0x148) [0x7f7809dd38c8]
>>> 5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x284) [0x7f780d290674]
>>> 6: (EOpen::replay(MDSRank*)+0x3e5) [0x7f780d2397b5]
>>> 7: (MDLog::_replay_thread()+0x5f2) [0x7f780d1efd12]
>>> 8: (MDLog::ReplayThread::entry()+0xd) [0x7f780cf9b6ad]
>>> 9: (()+0x7dc5) [0x7f780adb4dc5]
>>> 10: (clone()+0x6d) [0x7f7809e9476d]
>>> NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>>>
>>> --- begin dump of recent events ---
>>> 0> 2017-06-15 12:21:14.101761 7f77fe590700 -1 *** Caught signal (Aborted) **
>>> in thread 7f77fe590700 thread_name:md_log_replay
>>>
>>> ceph version 12.0.3-1661-g3ddbfcd (3ddbfcd4357ab3a3c2f17f86f88dc83172d4ce0d) luminous (dev)
>>> 1: (()+0x57d7ff) [0x7f780d2507ff]
>>> 2: (()+0xf370) [0x7f780adbc370]
>>> 3: (gsignal()+0x37) [0x7f7809dd21d7]
>>> 4: (abort()+0x148) [0x7f7809dd38c8]
>>> 5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x284) [0x7f780d290674]
>>> 6: (EOpen::replay(MDSRank*)+0x3e5) [0x7f780d2397b5]
>>> 7: (MDLog::_replay_thread()+0x5f2) [0x7f780d1efd12]
>>> 8: (MDLog::ReplayThread::entry()+0xd) [0x7f780cf9b6ad]
>>> 9: (()+0x7dc5) [0x7f780adb4dc5]
>>> 10: (clone()+0x6d) [0x7f7809e9476d]
>>> NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>>>
>>> --- logging levels ---
>>> 0/ 5 none
>>> 0/ 1 lockdep
>>> 0/ 1 context
>>> 1/ 1 crush
>>> 10/10 mds
>>> 1/ 5 mds_balancer
>>> 1/ 5 mds_locker
>>> 1/ 5 mds_log
>>> 1/ 5 mds_log_expire
>>> 1/ 5 mds_migrator
>>> 0/ 1 buffer
>>> 0/ 1 timer
>>> 0/ 1 filer
>>> 0/ 1 striper
>>> 0/ 1 objecter
>>> 0/ 5 rados
>>> 0/ 5 rbd
>>> 0/ 5 rbd_mirror
>>> 0/ 5 rbd_replay
>>> 0/ 5 journaler
>>> 0/ 5 objectcacher
>>> 0/ 5 client
>>> 1/ 5 osd
>>> 0/ 5 optracker
>>> 0/ 5 objclass
>>> 1/ 3 filestore
>>> 1/ 3 journal
>>> 0/ 5 ms
>>> 1/ 5 mon
>>> 0/10 monc
>>> 1/ 5 paxos
>>> 0/ 5 tp
>>> 1/ 5 auth
>>> 1/ 5 crypto
>>> 1/ 1 finisher
>>> 1/ 5 heartbeatmap
>>> 1/ 5 perfcounter
>>> 1/ 5 rgw
>>> 1/10 civetweb
>>> 1/ 5 javaclient
>>> 1/ 5 asok
>>> 1/ 1 throttle
>>> 0/ 0 refs
>>> 1/ 5 xio
>>> 1/ 5 compressor
>>> 1/ 5 bluestore
>>> 1/ 5 bluefs
>>> 1/ 3 bdev
>>> 1/ 5 kstore
>>> 4/ 5 rocksdb
>>> 4/ 5 leveldb
>>> 4/ 5 memdb
>>> 1/ 5 kinetic
>>> 1/ 5 fuse
>>> 1/ 5 mgr
>>> 1/ 5 mgrc
>>> 1/ 5 dpdk
>>> 1/ 5 eventtrace
>>> -2/-2 (syslog threshold)
>>> -1/-1 (stderr threshold)
>>> max_recent 10000
>>> max_new 1000
>>> log_file /var/log/ceph/ceph-mds.cephfs1.log
>>> --- end dump of recent events ---
>>>
>>>
>>> On 15/06/17 08:10, Yan, Zheng wrote:
>>>> On Wed, Jun 14, 2017 at 11:49 PM, Jake Grimmett <jog@xxxxxxxxxxxxxxxxx> wrote:
>>>>> Dear All,
>>>>>
>>>>> Sorry, but I need to add +1 to the mds crash reports with ceph
>>>>> 12.0.3-1507-g52f0deb.
>>>>>
>>>>> This happened to me after updating from 12.0.2.
>>>>> All was fairly OK for a few hours, I/O around 500MB/s, then both MDS
>>>>> servers crashed, and have not worked since.
>>>>>
>>>>> The two MDS servers are active:standby; both now crash immediately
>>>>> after being started.
>>>>>
>>>>> This cluster has been upgraded from Kraken, through several Luminous
>>>>> versions, so I did a clean install of SL7.3 on one MDS server, and still
>>>>> have crashes on this machine.
>>>>>
>>>>> The cluster has 40 x 8TB drives (EC 4+1), with dual replicated NVMe
>>>>> providing a hot pool to drive the CephFS layer. df -h /cephfs is/was
>>>>> 200TB. All OSDs are bluestore, and were generated on Luminous.
>>>>>
>>>>> I enabled snapshots a few days ago, and keep 144 snapshots (one taken
>>>>> every 10 minutes, each kept for 24 hours only); about 30TB is copied
>>>>> into the fs each day. If snapshots caused the crash, I can regenerate
>>>>> the data, but they are very useful.
>>>>>
>>>>> One MDS gave this log...
>>>>>
>>>>> <http://www.mrc-lmb.cam.ac.uk/jog/ceph-mds.cephfs1.log>
>>>> It is a snapshot-related bug. The attached patch should prevent the mds
>>>> from crashing.
>>>> Next time you restart the mds, please set debug_mds=10 and upload the log.
>>>>
>>>> Regards
>>>> Yan, Zheng
>>>>
>>>>> many thanks for any suggestions, and it's great to see the experimental
>>>>> flag removed from bluestore!
>>>>>
>>>>> Jake
>>>>> --
>>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>>>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
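
P.S. for anyone hitting the same make-dist oddity, this is roughly the sanity check I'm running before restarting the MDS. It is only a sketch: the top-level directory name inside the tarball and the build/bin path are what I see in my tree and may well differ on yours.

# 1) does the git working tree contain the patched hunk, i.e. a "continue"
#    rather than an assert in the snap_inos loop of EOpen::replay?
grep -n -A 3 'snap_inos' src/mds/journal.cc

# 2) does the tarball produced by make-dist contain the same hunk?
#    (assumes the sources unpack into ceph-12.0.3-1744-g84d57eb/)
tar -xjOf ceph-12.0.3-1744-g84d57eb.tar.bz2 \
    ceph-12.0.3-1744-g84d57eb/src/mds/journal.cc | grep -n -A 3 'snap_inos'

# 3) is the installed binary really the one built from the patched tree?
md5sum build/bin/ceph-mds /usr/bin/ceph-mds
/usr/bin/ceph-mds --version

# 4) raise mds debugging to 10 on the running daemon, as Yan asked
#    (mds name taken from the log file name, ceph-mds.cephfs1.log)
ceph tell mds.cephfs1 injectargs '--debug_mds 10'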