Re: ceph-mds crash v12.0.3

Hi Yan,

Many thanks for getting back to me - sorry for the bother.

I think I'm patching OK, but can you please check my methodology?

git clone git://github.com/ceph/ceph ; cd ceph

git apply ceph-mds.patch ; ./make-srpm.sh 

rpmbuild --rebuild /root/ceph/ceph/ceph-12.0.3-1661-g3ddbfcd.el7.src.rpm
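
One way to double-check that the patch has actually landed in the tree before
building (a rough sketch, re-using the same patch file and checkout as above):

# dry run: exits non-zero if the patch would not apply cleanly
git apply --check --verbose ceph-mds.patch

# show which files/hunks it touches, then apply for real
git apply --stat ceph-mds.patch
git apply ceph-mds.patch

# confirm the change is really in the working tree before make-srpm.sh / rpmbuild
git diff --stat
sed -n '2194,2208p' src/mds/journal.cc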


Here is the relevant section of the patched src/mds/journal.cc:

   2194   // note which segments inodes belong to, so we don't have to start rejournaling them
   2195   for (const auto &ino : inos) {
   2196     CInode *in = mds->mdcache->get_inode(ino);
   2197     if (!in) {
   2198       dout(0) << "EOpen.replay ino " << ino << " not in metablob" << dendl;
   2199       assert(in);
   2200     }
   2201     _segment->open_files.push_back(&in->item_open_file);
   2202   }
   2203   for (const auto &vino : snap_inos) {
   2204     CInode *in = mds->mdcache->get_inode(vino);
   2205     if (!in) {
   2206       dout(0) << "EOpen.replay ino " << vino << " not in metablob" << dendl;
   2207       continue;
   2208     }
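
Since the backtrace below still reports "journal.cc: 2207: FAILED assert(in)"
while the patched source above has "continue" on line 2207, it may also be worth
me confirming that the MDS hosts are really running the rebuilt ceph-mds rather
than an older, unpatched build. A rough sanity check (the ceph-mds@cephfs1 unit
name is only a guess based on the log file name; adjust to the actual service):

# on each MDS host: check the installed package and binary version
rpm -q ceph-mds
ceph-mds --version

# make sure the daemon has been restarted onto the new binary
systemctl restart ceph-mds@cephfs1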

Many thanks for your time,

Jake


On 16/06/17 08:04, Yan, Zheng wrote:
> On Thu, Jun 15, 2017 at 7:32 PM, Jake Grimmett <jog@xxxxxxxxxxxxxxxxx> wrote:
>> Hi Yan,
>>
>> Many thanks for looking into this and providing a patch.
>>
>> I've downloaded ceph 12.0.3-1661-g3ddbfcd, applied your patch, rebuilt
>> the rpms, and installed across my cluster.
>>
>> Unfortunately, the MDS are still crashing; any ideas welcome :)
>>
>> With "debug_mds = 10" the full Log is 140MB, a truncated version of the
>> log immediately preceding the crash follows:
>>
>> best,
>>
>> Jake
>>
>>     -5> 2017-06-15 12:21:14.084373 7f77fe590700 10 mds.0.journal
>> EMetaBlob.replay added (full) [dentry
>> #1/isilon/sc/users/spc/JessComb_AB_230115/JessB_TO_190115_F6_1/n0/JessB_TO_190115_F6_1.peaks_int
>> [9f,head] auth NULL (dversion lock) v=3104 inode=0
>> state=1073741888|bottomlru 0x7f781a3f1860]
>>     -4> 2017-06-15 12:21:14.084375 7f77fe590700 10 mds.0.journal
>> EMetaBlob.replay added [inode 1000147f773 [9f,head]
>> /isilon/sc/users/spc/JessComb_AB_230115/JessB_TO_190115_F6_1/n0/JessB_TO_190115_F6_1.peaks_int
>> auth v3104 s=4 n(v0 b4 1=1+0) (iversion lock) cr={3554272=0-4194304@9e}
>> 0x7f781a3f5800]
>>     -3> 2017-06-15 12:21:14.084379 7f77fe590700 10 mds.0.journal
>> EMetaBlob.replay added (full) [dentry
>> #1/isilon/sc/users/spc/JessComb_AB_230115/JessB_TO_190115_F6_1/n0/JessB_TO_190115_F6_1.peaks_maxt
>> [9f,head] auth NULL (dversion lock) v=3132 inode=0
>> state=1073741888|bottomlru 0x7f781a3f1d40]
>>     -2> 2017-06-15 12:21:14.084381 7f77fe590700 10 mds.0.journal
>> EMetaBlob.replay added [inode 1000147f775 [9f,head]
>> /isilon/sc/users/spc/JessComb_AB_230115/JessB_TO_190115_F6_1/n0/JessB_TO_190115_F6_1.peaks_maxt
>> auth v3132 s=4 n(v0 b4 1=1+0) (iversion lock) cr={3554272=0-4194304@9e}
>> 0x7f781a3f5e00]
>>     -1> 2017-06-15 12:21:14.084406 7f77fe590700  0 mds.0.journal
>> EOpen.replay ino 1000147761b.9a not in metablob
>>      0> 2017-06-15 12:21:14.085348 7f77fe590700 -1
>> /root/rpmbuild/BUILD/ceph-12.0.3-1661-g3ddbfcd/src/mds/journal.cc: In
>> function 'virtual void EOpen::replay(MDSRank*)' thread 7f77fe590700 time
>> 2017-06-15 12:21:14.084409
>> /root/rpmbuild/BUILD/ceph-12.0.3-1661-g3ddbfcd/src/mds/journal.cc: 2207:
>> FAILED assert(in)
>>
> The assertion should be removed by my patch. Maybe you didn't cleanly
> apply the patch.
>
>
> Regards
> Yan, Zheng
>
>>  ceph version 12.0.3-1661-g3ddbfcd
>> (3ddbfcd4357ab3a3c2f17f86f88dc83172d4ce0d) luminous (dev)
>>  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
>> const*)+0x110) [0x7f780d290500]
>>  2: (EOpen::replay(MDSRank*)+0x3e5) [0x7f780d2397b5]
>>  3: (MDLog::_replay_thread()+0x5f2) [0x7f780d1efd12]
>>  4: (MDLog::ReplayThread::entry()+0xd) [0x7f780cf9b6ad]
>>  5: (()+0x7dc5) [0x7f780adb4dc5]
>>  6: (clone()+0x6d) [0x7f7809e9476d]
>>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is
>> needed to interpret this.
>>
>> --- logging levels ---
>>    0/ 5 none
>>    0/ 1 lockdep
>>    0/ 1 context
>>    1/ 1 crush
>>   10/10 mds
>>    1/ 5 mds_balancer
>>    1/ 5 mds_locker
>>    1/ 5 mds_log
>>    1/ 5 mds_log_expire
>>    1/ 5 mds_migrator
>>    0/ 1 buffer
>>    0/ 1 timer
>>    0/ 1 filer
>>    0/ 1 striper
>>    0/ 1 objecter
>>    0/ 5 rados
>>    0/ 5 rbd
>>    0/ 5 rbd_mirror
>>    0/ 5 rbd_replay
>>    0/ 5 journaler
>>    0/ 5 objectcacher
>>    0/ 5 client
>>    1/ 5 osd
>>    0/ 5 optracker
>>    0/ 5 objclass
>>    1/ 3 filestore
>>    1/ 3 journal
>>    0/ 5 ms
>>    1/ 5 mon
>>    0/10 monc
>>    1/ 5 paxos
>>    0/ 5 tp
>>    1/ 5 auth
>>    1/ 5 crypto
>>    1/ 1 finisher
>>    1/ 5 heartbeatmap
>>    1/ 5 perfcounter
>>    1/ 5 rgw
>>    1/10 civetweb
>>    1/ 5 javaclient
>>    1/ 5 asok
>>    1/ 1 throttle
>>    0/ 0 refs
>>    1/ 5 xio
>>    1/ 5 compressor
>>    1/ 5 bluestore
>>    1/ 5 bluefs
>>    1/ 3 bdev
>>    1/ 5 kstore
>>    4/ 5 rocksdb
>>    4/ 5 leveldb
>>    4/ 5 memdb
>>    1/ 5 kinetic
>>    1/ 5 fuse
>>    1/ 5 mgr
>>    1/ 5 mgrc
>>    1/ 5 dpdk
>>    1/ 5 eventtrace
>>   -2/-2 (syslog threshold)
>>   -1/-1 (stderr threshold)
>>   max_recent     10000
>>   max_new         1000
>>   log_file /var/log/ceph/ceph-mds.cephfs1.log
>> --- end dump of recent events ---
>> 2017-06-15 12:21:14.101761 7f77fe590700 -1 *** Caught signal (Aborted) **
>>  in thread 7f77fe590700 thread_name:md_log_replay
>>
>>  ceph version 12.0.3-1661-g3ddbfcd
>> (3ddbfcd4357ab3a3c2f17f86f88dc83172d4ce0d) luminous (dev)
>>  1: (()+0x57d7ff) [0x7f780d2507ff]
>>  2: (()+0xf370) [0x7f780adbc370]
>>  3: (gsignal()+0x37) [0x7f7809dd21d7]
>>  4: (abort()+0x148) [0x7f7809dd38c8]
>>  5: (ceph::__ceph_assert_fail(char const*, char const*, int, char
>> const*)+0x284) [0x7f780d290674]
>>  6: (EOpen::replay(MDSRank*)+0x3e5) [0x7f780d2397b5]
>>  7: (MDLog::_replay_thread()+0x5f2) [0x7f780d1efd12]
>>  8: (MDLog::ReplayThread::entry()+0xd) [0x7f780cf9b6ad]
>>  9: (()+0x7dc5) [0x7f780adb4dc5]
>>  10: (clone()+0x6d) [0x7f7809e9476d]
>>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is
>> needed to interpret this.
>>
>> --- begin dump of recent events ---
>>      0> 2017-06-15 12:21:14.101761 7f77fe590700 -1 *** Caught signal
>> (Aborted) **
>>  in thread 7f77fe590700 thread_name:md_log_replay
>>
>>  ceph version 12.0.3-1661-g3ddbfcd
>> (3ddbfcd4357ab3a3c2f17f86f88dc83172d4ce0d) luminous (dev)
>>  1: (()+0x57d7ff) [0x7f780d2507ff]
>>  2: (()+0xf370) [0x7f780adbc370]
>>  3: (gsignal()+0x37) [0x7f7809dd21d7]
>>  4: (abort()+0x148) [0x7f7809dd38c8]
>>  5: (ceph::__ceph_assert_fail(char const*, char const*, int, char
>> const*)+0x284) [0x7f780d290674]
>>  6: (EOpen::replay(MDSRank*)+0x3e5) [0x7f780d2397b5]
>>  7: (MDLog::_replay_thread()+0x5f2) [0x7f780d1efd12]
>>  8: (MDLog::ReplayThread::entry()+0xd) [0x7f780cf9b6ad]
>>  9: (()+0x7dc5) [0x7f780adb4dc5]
>>  10: (clone()+0x6d) [0x7f7809e9476d]
>>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is
>> needed to interpret this.
>>
>> --- logging levels ---
>>    0/ 5 none
>>    0/ 1 lockdep
>>    0/ 1 context
>>    1/ 1 crush
>>   10/10 mds
>>    1/ 5 mds_balancer
>>    1/ 5 mds_locker
>>    1/ 5 mds_log
>>    1/ 5 mds_log_expire
>>    1/ 5 mds_migrator
>>    0/ 1 buffer
>>    0/ 1 timer
>>    0/ 1 filer
>>    0/ 1 striper
>>    0/ 1 objecter
>>    0/ 5 rados
>>    0/ 5 rbd
>>    0/ 5 rbd_mirror
>>    0/ 5 rbd_replay
>>    0/ 5 journaler
>>    0/ 5 objectcacher
>>    0/ 5 client
>>    1/ 5 osd
>>    0/ 5 optracker
>>    0/ 5 objclass
>>    1/ 3 filestore
>>    1/ 3 journal
>>    0/ 5 ms
>>    1/ 5 mon
>>    0/10 monc
>>    1/ 5 paxos
>>    0/ 5 tp
>>    1/ 5 auth
>>    1/ 5 crypto
>>    1/ 1 finisher
>>    1/ 5 heartbeatmap
>>    1/ 5 perfcounter
>>    1/ 5 rgw
>>    1/10 civetweb
>>    1/ 5 javaclient
>>    1/ 5 asok
>>    1/ 1 throttle
>>    0/ 0 refs
>>    1/ 5 xio
>>    1/ 5 compressor
>>    1/ 5 bluestore
>>    1/ 5 bluefs
>>    1/ 3 bdev
>>    1/ 5 kstore
>>    4/ 5 rocksdb
>>    4/ 5 leveldb
>>    4/ 5 memdb
>>    1/ 5 kinetic
>>    1/ 5 fuse
>>    1/ 5 mgr
>>    1/ 5 mgrc
>>    1/ 5 dpdk
>>    1/ 5 eventtrace
>>   -2/-2 (syslog threshold)
>>   -1/-1 (stderr threshold)
>>   max_recent     10000
>>   max_new         1000
>>   log_file /var/log/ceph/ceph-mds.cephfs1.log
>> --- end dump of recent events ---
>>
>>
>> On 15/06/17 08:10, Yan, Zheng wrote:
>>> On Wed, Jun 14, 2017 at 11:49 PM, Jake Grimmett <jog@xxxxxxxxxxxxxxxxx> wrote:
>>>> Dear All,
>>>>
>>>> Sorry, but I need to add +1 to the mds crash reports with ceph
>>>> 12.0.3-1507-g52f0deb
>>>>
>>>> This happened to me after updating from 12.0.2
>>>> All was fairly OK for a few hours, with I/O around 500MB/s, then both MDS
>>>> servers crashed and have not worked since.
>>>>
>>>> The two MDS servers are active:standby; both now crash immediately
>>>> after being started.
>>>>
>>>> This cluster has been upgraded from Kraken, through several Luminous
>>>> versions, so I did a clean install of SL7.3 on one MDS server, and still
>>>> have crashes on this machine.
>>>>
>>>> The cluster has 40 x 8TB drives (EC 4+1), with dual replicated NVMe
>>>> providing a hot pool for the CephFS layer. df -h /cephfs is/was
>>>> 200TB. All OSDs are bluestore, and were created on Luminous.
>>>>
>>>> I enabled snapshots a few days ago, and keep 144 snapshots (one taken
>>>> every 10 minutes, each kept for 24 hours only). About 30TB is copied
>>>> into the fs each day. If snapshots caused the crash, I can regenerate
>>>> the data, but they are very useful.
>>>>
>>>> One MDS gave this log...
>>>>
>>>> <http://www.mrc-lmb.cam.ac.uk/jog/ceph-mds.cephfs1.log>
>>> It is a snapshot-related bug. The attached patch should prevent the mds
>>> from crashing.
>>> Next time you restart mds, please set debug_mds=10 and upload the log.
>>>
>>> Regards
>>> Yan, Zheng
>>>
>>>> Many thanks for any suggestions, and it's great to see the experimental
>>>> flag removed from bluestore!
>>>>
>>>> Jake
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


