Re: Failed assert when starting new OSDs in 0.60

Hi Guys,

Any additional thoughts on this?  There was a bit of information shared off-list I wanted to bring back:

Sam mentioned that the metadata looked odd, and suspected "some form of 32bit shenanigans in the key name construction".

However, that might not have been the case, because he later came back with:

"Hmm.  Based on the omap and logs, the omap directory is simply a bunch
of updates behind.  Was the node rebooted as part of the osd restart?
FS is xfs?  What are your fs mount options?"

There was no node restart.  We are using XFS.

From ceph.conf:

osd mount options xfs = "rw,noatime,inode64,logbufs=8,logbsize=256k"

And of course, as soon as I paste that, I look at "inode64" on these 32-bit ARM systems and think, "hmm".  I know 64-bit inodes are recommended for filesystems > 1TB (these are 4TB drives), but I have never thought about whether this is supported on a 32-bit system.  Quick web searches appear to indicate this may be okay...
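
One way to check whether inode64 is actually in play on a given OSD is to see if any inode numbers on the data filesystem exceed 32 bits.  The following is only a rough sketch: the helper names are made up for illustration, and finding (or not finding) large inodes says nothing by itself about the cause of the failed assert.

```python
import os
import struct

MAX_32BIT = 2**32 - 1  # largest value that fits in an unsigned 32-bit integer


def is_32bit_userspace():
    """Return True if the running interpreter uses 4-byte pointers (e.g. armhf)."""
    return struct.calcsize("P") == 4


def exceeds_32bit(inode):
    """Return True if the inode number does not fit in 32 bits."""
    return inode > MAX_32BIT


def find_large_inodes(root, limit=10):
    """Walk root and collect up to `limit` (path, inode) pairs with inodes > 32 bits."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                ino = os.lstat(path).st_ino
            except OSError:
                # Skip entries that vanish or cannot be stat'ed.
                continue
            if exceeds_32bit(ino):
                found.append((path, ino))
                if len(found) >= limit:
                    return found
    return found
```

Calling find_large_inodes() on an OSD data directory (the path depends on the "osd data" setting in ceph.conf) and getting back an empty list would suggest that no 64-bit inode numbers have been allocated there yet.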

Sorry if some of this is a duplicate.  I wanted to bring it back on-list in case someone looked at that and said "no, you can't use those XFS options on 32-bit ARM."  =)

On a side note, I've been using the cluster heavily for the last couple of days, with no other problems.  I'm just not doing any cluster or OSD restarts for fear of an OSD not coming back.

 - Travis


On Tue, Apr 30, 2013 at 12:17 PM, Travis Rhoden <trhoden@xxxxxxxxx> wrote:
On the OSD node:

root@cepha0:~# lsb_release -a
No LSB modules are available.
Distributor ID:    Ubuntu
Description:    Ubuntu 12.10
Release:    12.10
Codename:    quantal
root@cepha0:~# dpkg -l "*leveldb*"
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                                   Version                  Architecture             Description
+++-======================================-========================-========================-==================================================================================
ii  libleveldb1:armhf                      0+20120530.gitdd0d562-2  armhf                    fast key-value storage library
root@cepha0:~# uname -a
Linux cepha0 3.5.0-27-highbank #46-Ubuntu SMP Mon Mar 25 23:19:40 UTC 2013 armv7l armv7l armv7l GNU/Linux


On the MON node:
# lsb_release -a
No LSB modules are available.
Distributor ID:    Ubuntu
Description:    Ubuntu 12.10
Release:    12.10
Codename:    quantal
# uname -a
Linux  3.5.0-27-generic #46-Ubuntu SMP Mon Mar 25 19:58:17 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
# dpkg -l "*leveldb*"
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                                   Version                  Architecture             Description
+++-======================================-========================-========================-==================================================================================
un  leveldb-doc                            <none>                                            (no description available)
ii  libleveldb-dev:amd64                   0+20120530.gitdd0d562-2  amd64                    fast key-value storage library (development files)
ii  libleveldb1:amd64                      0+20120530.gitdd0d562-2  amd64                    fast key-value storage library


On Tue, Apr 30, 2013 at 12:11 PM, Samuel Just <sam.just@xxxxxxxxxxx> wrote:
What version of leveldb is installed?  Ubuntu/version?
-Sam

On Tue, Apr 30, 2013 at 8:50 AM, Travis Rhoden <trhoden@xxxxxxxxx> wrote:
> Interestingly, the down OSD does not get marked out after 5 minutes.
> Probably that is already fixed by http://tracker.ceph.com/issues/4822.
>
>
> On Tue, Apr 30, 2013 at 11:42 AM, Travis Rhoden <trhoden@xxxxxxxxx> wrote:
>>
>> Hi Sam,
>>
>> I was prepared to write in and say that the problem had gone away.  I
>> tried restarting several OSDs last night in the hopes of capturing the
>> problem on an OSD that hadn't failed yet, but didn't have any luck.  So I
>> did indeed re-create the cluster from scratch (using mkcephfs), and what do
>> you know -- everything worked.  I got everything in a nice stable state,
>> then decided to do a full cluster restart, just to be sure.  Sure enough,
>> one OSD failed to come up, and has the same stack trace.  So I believe I
>> have the log you want -- just from the OSD that failed, right?
>>
>> Question -- any feeling for what parts of the log you need?  It's 688MB
>> uncompressed (two hours!), so I'd like to be able to trim some off for you
>> before making it available.  Do you only need/want the part from after the
>> OSD was restarted?  Or perhaps the corruption happens on OSD shutdown and
>> you need some before that?  If you are fine with that large of a file, I can
>> just make that available too.  Let me know.
>>
>>  - Travis
>>
>>
>> On Mon, Apr 29, 2013 at 6:26 PM, Travis Rhoden <trhoden@xxxxxxxxx> wrote:
>>>
>>> Hi Sam,
>>>
>>> No problem, I'll leave that debugging turned up high, and do a mkcephfs
>>> from scratch and see what happens.  Not sure if it will happen again or not.
>>> =)
>>>
>>> Thanks again.
>>>
>>>  - Travis
>>>
>>>
>>> On Mon, Apr 29, 2013 at 5:51 PM, Samuel Just <sam.just@xxxxxxxxxxx>
>>> wrote:
>>>>
>>>> Hmm, I need logging from when the corruption happened.  If this is
>>>> reproducible, can you enable that logging on a clean osd (or better, a
>>>> clean cluster) until the assert occurs?
>>>> -Sam
>>>>
>>>> On Mon, Apr 29, 2013 at 2:45 PM, Travis Rhoden <trhoden@xxxxxxxxx>
>>>> wrote:
>>>> > Also, I can note that it does not take a full cluster restart to
>>>> > trigger
>>>> > this.  If I just restart an OSD that was up/in previously, the same
>>>> > error
>>>> > can happen (though not every time).  So restarting OSD's for me is a
>>>> > bit
>>>> > like Russian roulette.  =)  Even though restarting an OSD may not
>>>> > always
>>>> > result in the error, it seems that once it happens that OSD is gone
>>>> > for
>>>> > good.  No amount of restart has brought any of the dead ones back.
>>>> >
>>>> > I'd really like to get to the bottom of it.  Let me know if I can do
>>>> > anything to help.
>>>> >
>>>> > I may also have to try completely wiping/rebuilding to see if I can
>>>> > make
>>>> > this thing usable.
>>>> >
>>>> >
>>>> > On Mon, Apr 29, 2013 at 2:38 PM, Travis Rhoden <trhoden@xxxxxxxxx>
>>>> > wrote:
>>>> >>
>>>> >> Hi Sam,
>>>> >>
>>>> >> Thanks for being willing to take a look.
>>>> >>
>>>> >> I applied the debug settings on one host that has 3 out of 3 OSDs with
>>>> >> this
>>>> >> problem.  Then tried to start them up.  Here are the resulting logs:
>>>> >>
>>>> >> https://dl.dropboxusercontent.com/u/23122069/cephlogs.tgz
>>>> >>
>>>> >>  - Travis
>>>> >>
>>>> >>
>>>> >> On Mon, Apr 29, 2013 at 1:04 PM, Samuel Just <sam.just@xxxxxxxxxxx>
>>>> >> wrote:
>>>> >>>
>>>> >>> You appear to be missing pg metadata for some reason.  If you can
>>>> >>> reproduce it with
>>>> >>> debug osd = 20
>>>> >>> debug filestore = 20
>>>> >>> debug ms = 1
>>>> >>> on all of the OSDs, I should be able to track it down.
>>>> >>>
>>>> >>> I created a bug: #4855.
>>>> >>>
>>>> >>> Thanks!
>>>> >>> -Sam
>>>> >>>
>>>> >>> On Mon, Apr 29, 2013 at 9:52 AM, Travis Rhoden <trhoden@xxxxxxxxx>
>>>> >>> wrote:
>>>> >>> > Thanks Greg.
>>>> >>> >
>>>> >>> > I quit playing with it because every time I restarted the cluster
>>>> >>> > (service
>>>> >>> > ceph -a restart), I lost more OSDs..  First time it was 1, 2nd 10,
>>>> >>> > 3rd
>>>> >>> > time
>>>> >>> > 13...  All 13 down OSDs show the same stack trace.
>>>> >>> >
>>>> >>> >  - Travis
>>>> >>> >
>>>> >>> >
>>>> >>> > On Mon, Apr 29, 2013 at 11:56 AM, Gregory Farnum
>>>> >>> > <greg@xxxxxxxxxxx>
>>>> >>> > wrote:
>>>> >>> >>
>>>> >>> >> This sounds vaguely familiar to me, and I see
>>>> >>> >> http://tracker.ceph.com/issues/4052, which is marked as "Can't
>>>> >>> >> reproduce" — I think maybe this is fixed in "next" and "master",
>>>> >>> >> but
>>>> >>> >> I'm not sure. For more than that I'd have to defer to Sage or
>>>> >>> >> Sam.
>>>> >>> >> -Greg
>>>> >>> >> Software Engineer #42 @ http://inktank.com | http://ceph.com
>>>> >>> >>
>>>> >>> >>
>>>> >>> >> On Sat, Apr 27, 2013 at 6:43 PM, Travis Rhoden
>>>> >>> >> <trhoden@xxxxxxxxx>
>>>> >>> >> wrote:
>>>> >>> >> > Hey folks,
>>>> >>> >> >
>>>> >>> >> > I'm helping put together a new test/experimental cluster, and
>>>> >>> >> > hit
>>>> >>> >> > this
>>>> >>> >> > today
>>>> >>> >> > when bringing the cluster up for the first time (using
>>>> >>> >> > mkcephfs).
>>>> >>> >> >
>>>> >>> >> > After doing the normal "service ceph -a start", I noticed one
>>>> >>> >> > OSD
>>>> >>> >> > was
>>>> >>> >> > down,
>>>> >>> >> > and a lot of PGs were stuck creating.  I tried restarting the
>>>> >>> >> > down
>>>> >>> >> > OSD,
>>>> >>> >> > but
>>>> >>> >> > it would not come up.  It always had this error:
>>>> >>> >> >
>>>> >>> >> >     -1> 2013-04-27 18:11:56.179804 b6fcd000  2 osd.1 0 boot
>>>> >>> >> >      0> 2013-04-27 18:11:56.402161 b6fcd000 -1 osd/PG.cc: In
>>>> >>> >> > function
>>>> >>> >> > 'static epoch_t PG::peek_map_epoch(ObjectStore*, coll_t,
>>>> >>> >> > hobject_t&,
>>>> >>> >> > ceph::bufferlist*)' thread b6fcd000 time 2013-04-27
>>>> >>> >> > 18:11:56.399089
>>>> >>> >> > osd/PG.cc: 2556: FAILED assert(values.size() == 1)
>>>> >>> >> >
>>>> >>> >> >  ceph version 0.60-401-g17a3859
>>>> >>> >> > (17a38593d60f5f29b9b66c13c0aaa759762c6d04)
>>>> >>> >> >  1: (PG::peek_map_epoch(ObjectStore*, coll_t, hobject_t&,
>>>> >>> >> > ceph::buffer::list*)+0x1ad) [0x2c3c0a]
>>>> >>> >> >  2: (OSD::load_pgs()+0x357) [0x28cba0]
>>>> >>> >> >  3: (OSD::init()+0x741) [0x290a16]
>>>> >>> >> >  4: (main()+0x1427) [0x2155c0]
>>>> >>> >> >  5: (__libc_start_main()+0x99) [0xb69bcf42]
>>>> >>> >> >  NOTE: a copy of the executable, or `objdump -rdS <executable>`
>>>> >>> >> > is
>>>> >>> >> > needed to
>>>> >>> >> > interpret this.
>>>> >>> >> >
>>>> >>> >> >
>>>> >>> >> > I then did a full cluster restart, and now I have ten OSDs down
>>>> >>> >> > --
>>>> >>> >> > each
>>>> >>> >> > showing the same exception/failed assert.
>>>> >>> >> >
>>>> >>> >> > Anybody seen this?
>>>> >>> >> >
>>>> >>> >> > I know I'm running a weird version -- it's compiled from
>>>> >>> >> > source, and
>>>> >>> >> > was
>>>> >>> >> > provided to me.  The OSDs are all on ARM, and the mon is
>>>> >>> >> > x86_64.
>>>> >>> >> > Just
>>>> >>> >> > looking to see if anyone has seen this particular stack trace
>>>> >>> >> > of
>>>> >>> >> > load_pgs()/peek_map_epoch() before....
>>>> >>> >> >
>>>> >>> >> >  - Travis
>>>> >>> >> >
>>>> >>> >> > _______________________________________________
>>>> >>> >> > ceph-users mailing list
>>>> >>> >> > ceph-users@xxxxxxxxxxxxxx
>>>> >>> >> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>> >>> >> >
>>>> >>> >
>>>> >>> >
>>>> >>
>>>> >>
>>>> >
>>>
>>>
>>
>


