Re: MDS crashing

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Sep 9, 2010 at 3:27 AM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
> Hi Nat,
>
> Can you add
>
>        debug mds = 20
>        debug ms = 1
>
> to the [mds] section of your ceph.conf, restart the mds, and post the
> resulting mds log (normally in /var/log/ceph)?
>
> It looks like interval_set is throwing the assertion because it's
> inserting a value into the set that is already there.. the question is
> why.  Once we identify where the error came from, we can also make the
> journal replay tolerate the condition as best it can.
>
> Thanks!
> sage
>

Sage,

I have completed this please see below for the last little bit before the crash

10.09.09_08:01:28.390543 44cef940 mds0.journal EMetaBlob.replay dir 10000385002
10.09.09_08:01:28.390550 44cef940 mds0.journal EMetaBlob.replay
updated dir [dir 10000385002 /data/md_public/tmp_krx/ [2,head] auth
v=9175 cv=0/0 state=1610612738|complete f(v0 m10.09.03_13:49:31.000144
190=1+189) n(v123 rc10.09.03_14:51:22.000677 b9509807549 1032=655+377)
hs=190+1,ss=0+0 dirty=191 | child dirty 0x7fe8ccaced38]
10.09.09_08:01:28.390574 44cef940 mds0.journal EMetaBlob.replay for
[2,head] had [dentry #1/data/md_public/tmp_krx/99 [2,head] auth
(dversion lock) v=9174 inode=0x7fe8d06e5228 | inodepin dirty
0x7fe8c22dae30]
10.09.09_08:01:28.390590 44cef940 mds0.journal EMetaBlob.replay for
[2,head] had [inode 100003850a8 [...2,head]
/data/md_public/tmp_krx/99/ auth v9174 f(v2 m10.09.03_13:23:04.000779
1=0+1) n(v10 rc10.09.03_13:47:44.000284 b36219707 7=5+2) (inest sync
dirty) (ifile sync dirty) (iversion lock) | dirtyscattered dirfrag
dirty 0x7fe8d06e5228]
10.09.09_08:01:28.390609 44cef940 mds0.journal EMetaBlob.replay
request client6312:2695 707
10.09.09_08:01:28.390626 44cef940 mds0.log _replay 16078321005~1330 /
16113176521 : EUpdate scatter_writebehind [metablob 10000385002, 2
dirs]
10.09.09_08:01:28.390634 44cef940 mds0.journal EMetaBlob.replay 2 dirlumps
10.09.09_08:01:28.390639 44cef940 mds0.journal EMetaBlob.replay dir 10000385002
10.09.09_08:01:28.390646 44cef940 mds0.cache.dir(10000385002)
mark_dirty (already dirty) [dir 10000385002 /data/md_public/tmp_krx/
[2,head] auth v=9177 cv=0/0 state=1610612738|complete f(v0
m10.09.03_13:49:31.000144 190=1+189) n(v123 rc10.09.03_14:51:22.000677
b9509807549 1032=655+377) hs=190+1,ss=0+0 dirty=191 | child dirty
0x7fe8ccaced38] version 9177
10.09.09_08:01:28.390668 44cef940 mds0.journal EMetaBlob.replay
clean nestinfo on [dir 10000385002 /data/md_public/tmp_krx/ [2,head]
auth v=9177 cv=0/0 state=1610612738|complete f(v0
m10.09.03_13:49:31.000144 190=1+189) n(v123 rc10.09.03_14:51:22.000677
b9509807549 1032=655+377) hs=190+1,ss=0+0 dirty=191 | child dirty
0x7fe8ccaced38]
10.09.09_08:01:28.390687 44cef940 mds0.journal EMetaBlob.replay
updated dir [dir 10000385002 /data/md_public/tmp_krx/ [2,head] auth
v=9177 cv=0/0 state=1610612738|complete f(v0 m10.09.03_13:49:31.000144
190=1+189) n(v123 rc10.09.03_14:51:22.000677 b9509807549 1032=655+377)
hs=190+1,ss=0+0 dirty=191 | child dirty 0x7fe8ccaced38]
10.09.09_08:01:28.390709 44cef940 mds0.journal EMetaBlob.replay for
[2,head] had [dentry #1/data/md_public/tmp_krx/99 [2,head] auth
(dversion lock) v=9176 inode=0x7fe8d06e5228 | inodepin dirty
0x7fe8c22dae30]
10.09.09_08:01:28.390721 44cef940 mds0.journal EMetaBlob.replay for
[2,head] had [inode 100003850a8 [...2,head]
/data/md_public/tmp_krx/99/ auth v9176 f(v2 m10.09.03_13:23:04.000779
1=0+1) n(v10 rc10.09.03_13:47:44.000284 b36219707 7=5+2) (inest sync
dirty) (ifile sync dirty) (iversion lock) | dirtyscattered dirfrag
dirty 0x7fe8d06e5228]
10.09.09_08:01:28.390741 44cef940 mds0.journal EMetaBlob.replay dir 100003850a8
10.09.09_08:01:28.390747 44cef940 mds0.journal EMetaBlob.replay
updated dir [dir 100003850a8 /data/md_public/tmp_krx/99/ [2,head] auth
v=207 cv=0/0 state=1610612738|complete f(v2 m10.09.03_13:23:04.000779
1=0+1) n(v10 rc10.09.03_13:47:44.000284 b36219707 6=5+1) hs=1+0,ss=0+0
dirty=1 | child dirty 0x7fe8ccb0b428]
10.09.09_08:01:28.390771 44cef940 mds0.journal EMetaBlob.replay for
[2,head] had [dentry #1/data/md_public/tmp_krx/99/output [2,head] auth
(dversion lock) v=206 inode=0x7fe8d06e5a70 | inodepin dirty
0x7fe8c22db000]
10.09.09_08:01:28.390787 44cef940 mds0.journal EMetaBlob.replay for
[2,head] had [inode 100003850a9 [...2,head]
/data/md_public/tmp_krx/99/output/ auth v206 f(v2
m10.09.03_13:47:44.000284 5=5+0) n(v0 rc10.09.03_13:47:44.000284
b36219707 6=5+1) (inest sync dirty) (ifile sync dirty) (iversion lock)
| dirtyscattered dirfrag dirty 0x7fe8d06e5a70]
10.09.09_08:01:28.390826 44cef940 mds0.log _replay 16078322339~1343 /
16113176521 : EUpdate setattr [metablob 10000385002, 2 dirs]
10.09.09_08:01:28.390835 44cef940 mds0.journal EMetaBlob.replay 2 dirlumps
10.09.09_08:01:28.390841 44cef940 mds0.journal EMetaBlob.replay dir 10000385002
10.09.09_08:01:28.390847 44cef940 mds0.cache.dir(10000385002)
mark_dirty (already dirty) [dir 10000385002 /data/md_public/tmp_krx/
[2,head] auth v=9179 cv=0/0 state=1610612738|complete f(v0
m10.09.03_13:49:31.000144 190=1+189) n(v123 rc10.09.03_14:51:22.000677
b9509807549 1032=655+377) hs=190+1,ss=0+0 dirty=191 | child dirty
0x7fe8ccaced38] version 9179
10.09.09_08:01:28.390868 44cef940 mds0.journal EMetaBlob.replay
clean nestinfo on [dir 10000385002 /data/md_public/tmp_krx/ [2,head]
auth v=9179 cv=0/0 state=1610612738|complete f(v0
m10.09.03_13:49:31.000144 190=1+189) n(v123 rc10.09.03_14:51:22.000677
b9509807549 1032=655+377) hs=190+1,ss=0+0 dirty=191 | child dirty
0x7fe8ccaced38]
10.09.09_08:01:28.390887 44cef940 mds0.journal EMetaBlob.replay
updated dir [dir 10000385002 /data/md_public/tmp_krx/ [2,head] auth
v=9179 cv=0/0 state=1610612738|complete f(v0 m10.09.03_13:49:31.000144
190=1+189) n(v123 rc10.09.03_14:51:22.000677 b9509807549 1032=655+377)
hs=190+1,ss=0+0 dirty=191 | child dirty 0x7fe8ccaced38]
10.09.09_08:01:28.390911 44cef940 mds0.journal EMetaBlob.replay for
[2,head] had [dentry #1/data/md_public/tmp_krx/99 [2,head] auth
(dversion lock) v=9178 inode=0x7fe8d06e5228 | inodepin dirty
0x7fe8c22dae30]
10.09.09_08:01:28.390924 44cef940 mds0.journal EMetaBlob.replay for
[2,head] had [inode 100003850a8 [...2,head]
/data/md_public/tmp_krx/99/ auth v9178 f(v2 m10.09.03_13:23:04.000779
1=0+1) n(v10 rc10.09.03_13:47:44.000284 b36219707 7=5+2) (inest sync
dirty) (ifile sync dirty) (iversion lock) | dirtyscattered dirfrag
dirty 0x7fe8d06e5228]
10.09.09_08:01:28.390943 44cef940 mds0.journal EMetaBlob.replay dir 100003850a8
10.09.09_08:01:28.390950 44cef940 mds0.journal EMetaBlob.replay
updated dir [dir 100003850a8 /data/md_public/tmp_krx/99/ [2,head] auth
v=209 cv=0/0 state=1610612738|complete f(v2 m10.09.03_13:23:04.000779
1=0+1) n(v10 rc10.09.03_13:47:44.000284 b36219707 6=5+1) hs=1+0,ss=0+0
dirty=1 | child dirty 0x7fe8ccb0b428]
10.09.09_08:01:28.390974 44cef940 mds0.journal EMetaBlob.replay for
[2,head] had [dentry #1/data/md_public/tmp_krx/99/output [2,head] auth
(dversion lock) v=208 inode=0x7fe8d06e5a70 | inodepin dirty
0x7fe8c22db000]
10.09.09_08:01:28.390988 44cef940 mds0.journal EMetaBlob.replay for
[2,head] had [inode 100003850a9 [...2,head]
/data/md_public/tmp_krx/99/output/ auth v208 f(v2
m10.09.03_13:47:44.000284 5=5+0) n(v0 rc10.09.03_13:47:44.000284
b36219707 6=5+1) (inest sync dirty) (ifile sync dirty) (iversion lock)
| dirtyscattered dirfrag dirty 0x7fe8d06e5a70]
10.09.09_08:01:28.391008 44cef940 mds0.journal EMetaBlob.replay
request client6312:2697 707
10.09.09_08:01:28.391025 44cef940 mds0.log _replay 16078323686~1406 /
16113176521 : EUpdate openc [metablob 100003850a8, 2 dirs
alloc_ino=10000385e80 prealloc_ino=[10000385e81~3e8] v3685237]
10.09.09_08:01:28.391034 44cef940 mds0.journal EMetaBlob.replay 2 dirlumps
10.09.09_08:01:28.391040 44cef940 mds0.journal EMetaBlob.replay dir 100003850a8
10.09.09_08:01:28.391046 44cef940 mds0.cache.dir(100003850a8)
mark_dirty (already dirty) [dir 100003850a8
/data/md_public/tmp_krx/99/ [2,head] auth v=211 cv=0/0
state=1610612738|complete f(v2 m10.09.03_13:23:04.000779 1=0+1) n(v10
rc10.09.03_14:56:31.000710 b36219707 7=6+1)/n(v10
rc10.09.03_13:47:44.000284 b36219707 6=5+1) hs=1+0,ss=0+0 dirty=1 |
child dirty 0x7fe8ccb0b428] version 211
10.09.09_08:01:28.391068 44cef940 mds0.journal EMetaBlob.replay
dirty nestinfo on [dir 100003850a8 /data/md_public/tmp_krx/99/
[2,head] auth v=211 cv=0/0 state=1610612738|complete f(v2
m10.09.03_13:23:04.000779 1=0+1) n(v10 rc10.09.03_14:56:31.000710
b36219707 7=6+1)/n(v10 rc10.09.03_13:47:44.000284 b36219707 6=5+1)
hs=1+0,ss=0+0 dirty=1 | child dirty 0x7fe8ccb0b428]
10.09.09_08:01:28.391099 44cef940 mds0.locker mark_updated_scatterlock
(inest sync dirty) - already on list since 10.09.09_08:01:25.431917
10.09.09_08:01:28.391111 44cef940 mds0.journal EMetaBlob.replay
updated dir [dir 100003850a8 /data/md_public/tmp_krx/99/ [2,head] auth
v=211 cv=0/0 state=1610612738|complete f(v2 m10.09.03_13:23:04.000779
1=0+1) n(v10 rc10.09.03_14:56:31.000710 b36219707 7=6+1)/n(v10
rc10.09.03_13:47:44.000284 b36219707 6=5+1) hs=1+0,ss=0+0 dirty=1 |
child dirty 0x7fe8ccb0b428]
10.09.09_08:01:28.391136 44cef940 mds0.journal EMetaBlob.replay for
[2,head] had [dentry #1/data/md_public/tmp_krx/99/output [2,head] auth
(dversion lock) v=210 inode=0x7fe8d06e5a70 | inodepin dirty
0x7fe8c22db000]
10.09.09_08:01:28.391150 44cef940 mds0.journal EMetaBlob.replay for
[2,head] had [inode 100003850a9 [...2,head]
/data/md_public/tmp_krx/99/output/ auth v210 f(v2
m10.09.03_14:56:31.000710 6=6+0) n(v0 rc10.09.03_14:56:31.000710
b36219707 7=6+1) (inest sync dirty) (ifile sync dirty) (iversion lock)
| dirtyscattered dirfrag dirty 0x7fe8d06e5a70]
10.09.09_08:01:28.391174 44cef940 mds0.journal EMetaBlob.replay dir 100003850a9
10.09.09_08:01:28.391180 44cef940 mds0.journal EMetaBlob.replay
updated dir [dir 100003850a9 /data/md_public/tmp_krx/99/output/
[2,head] auth v=185 cv=0/0 state=1610612738|complete f(v2
m10.09.03_14:56:31.000710 6=6+0) n(v0 rc10.09.03_14:56:31.000710
b36219707 6=6+0) hs=5+15,ss=0+0 dirty=20 | child dirty 0x7fe8ccb0ba20]
10.09.09_08:01:28.391212 44cef940 mds0.cache.dir(100003850a9)
add_null_dentry [dentry
#1/data/md_public/tmp_krx/99/output/.Kexpiring_2D_futures.zblock.bGxmqP
[2,head] auth NULL (dversion lock) pv=0 v=185 inode=0 0x7fe8c245df80]
10.09.09_08:01:28.391228 44cef940 mds0.journal EMetaBlob.replay added
[dentry #1/data/md_public/tmp_krx/99/output/.Kexpiring_2D_futures.zblock.bGxmqP
[2,head] auth NULL (dversion lock) v=184 inode=0 | dirty
0x7fe8c245df80]
10.09.09_08:01:28.391250 44cef940 mds0.cache.dir(100003850a9)
link_primary_inode [dentry
#1/data/md_public/tmp_krx/99/output/.Kexpiring_2D_futures.zblock.bGxmqP
[2,head] auth NULL (dversion lock) v=184 inode=0 | dirty
0x7fe8c245df80] [inode 10000385e80 [2,head] #10000385e80 auth v184 s=0
nl=1 n(v0 1=1+0) (iversion lock) cr={6312=0-4194304@0} 0x7fe8d09ecfc0]
10.09.09_08:01:28.391278 44cef940 mds0.journal EMetaBlob.replay added
[inode 10000385e80 [2,head]
/data/md_public/tmp_krx/99/output/.Kexpiring_2D_futures.zblock.bGxmqP
auth v184 s=0 nl=1 n(v0 1=1+0) (iversion lock) cr={6312=0-4194304@0} |
dirty 0x7fe8d09ecfc0]
10.09.09_08:01:28.391292 44cef940 mds0.journal EMetaBlob.replay noting
opened inode [inode 10000385e80 [2,head]
/data/md_public/tmp_krx/99/output/.Kexpiring_2D_futures.zblock.bGxmqP
auth v184 s=0 nl=1 n(v0 1=1+0) (iversion lock) cr={6312=0-4194304@0} |
dirty 0x7fe8d09ecfc0]
10.09.09_08:01:28.391308 44cef940 mds0.journal EMetaBlob.replay
inotable v 3685237 - 1 == table 3685235 allocated+used 10000385e80
prealloc [10000385e81~3e8]
10.09.09_08:01:28.391317 44cef940 mds0.inotable: replay_alloc_id 10000385e80
10.09.09_08:01:28.391329 44cef940 mds0.inotable: replay_alloc_ids
[10000385e81~3e8]
10.09.09_08:01:28.391338 44cef940 mds0.journal EMetaBlob.replay
sessionmap v7370409 -(1|2) == table 7370408 prealloc [10000385e81~3e8]
used 0
10.09.09_08:01:28.391345 44cef940 mds0.journal  (session prealloc
[10000385e80~3e8])
./include/interval_set.h: In function 'void interval_set<T>::insert(T,
T) [with T = inodeno_t]':
./include/interval_set.h:202: FAILED assert(0)
 1: (EMetaBlob::replay(MDS*, LogSegment*)+0x3f75) [0x691625]
 2: (EUpdate::replay(MDS*)+0x38) [0x694d28]
 3: (MDLog::_replay_thread()+0x68e) [0x68801e]
 4: (MDLog::ReplayThread::entry()+0xd) [0x4bb3cd]
 5: (Thread::_entry_func(void*)+0xa) [0x49c71a]
 6: /lib64/libpthread.so.0 [0x3909c0673d]
 7: (clone()+0x6d) [0x39094d3d1d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is
needed to interpret this.
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [CEPH Users]     [Ceph Large]     [Information on CEPH]     [Linux BTRFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]
  Powered by Linux