Re: CephFS FAILED assert(dn->get_linkage()->is_null())

Chris Sarginson <csargiso@xxxxxxxxx> · Fri, 09 Dec 2016 23:37:22 +0000

Hi Goncalo,
In the end we ascertained that the assert was caused by reading corrupt data in the MDS journal.  We followed the sections of the disaster recovery documentation (http://docs.ceph.com/docs/jewel/cephfs/disaster-recovery/) in order, down to and including "MDS table wipes" (wiping only the "session" table in that final step).  This resolved the problem we had with our MDS asserting.
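
For anyone following the same path, the sequence described above can be sketched as a small shell script. This is a sketch based on the sections of the Jewel disaster recovery page, not a record of the exact commands run on this cluster. The backup file name and the DRY_RUN switch are assumptions introduced here, the script only prints the commands unless DRY_RUN=0 is set, and it is assumed that the MDS daemons are stopped first.

```shell
#!/bin/sh
# Sketch only: prints each command by default; set DRY_RUN=0 to execute.
DRY_RUN="${DRY_RUN:-1}"
BACKUP="${BACKUP:-backup.bin}"   # hypothetical backup file name

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

recover_mds_journal() {
    # 1. Keep a copy of the journal before modifying anything.
    run cephfs-journal-tool journal export "$BACKUP" || return 1
    # 2. Write back whatever dentries can still be read from the journal.
    run cephfs-journal-tool event recover_dentries summary || return 1
    # 3. Truncate the journal.
    run cephfs-journal-tool journal reset || return 1
    # 4. Wipe only the session table, as in the message above.
    run cephfs-table-tool all reset session
}

recover_mds_journal
```

Check each command against the documentation for the Ceph release in use before running with DRY_RUN=0, since these steps discard metadata.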

We have also run a CephFS scrub to validate the data (ceph daemon mds.0 scrub_path / recursive repair), which has resulted in a "metadata damage detected" health warning.  The scrub seems to read all objects in the CephFS RADOS pools (anecdotally, the scan of the data pool ran much faster than the scan of the metadata pool).

We are now working with the output of "ceph tell mds.0 damage ls", and looking at the following mailing list post as a starting point for proceeding with that: http://ceph-users.ceph.narkive.com/EfFTUPyP/how-to-fix-the-mds-damaged-issue
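
As a starting point for working through that output, a rough per-type count can help. The sketch below assumes that "damage ls" prints a JSON list whose entries carry a "damage_type" field; that field name and the file name damage.json are assumptions, not something confirmed in this thread. It uses only grep, sed, sort, uniq and awk.

```shell
#!/bin/sh
# Count damage entries per type in a saved "damage ls" dump.
# Assumes each entry contains a field like "damage_type": "<type>".
summarize_damage() {
    grep -o '"damage_type": *"[^"]*"' "$1" \
        | sed 's/.*: *"\(.*\)"/\1/' \
        | sort | uniq -c \
        | awk '{print $2 "=" $1}'
}

# Usage (hypothetical file name):
#   ceph tell mds.0 damage ls > damage.json
#   summarize_damage damage.json
```

Each output line has the form type=count, which gives a quick idea of which kind of damage dominates before looking at individual entries.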

Chris

On Fri, 9 Dec 2016 at 19:26 Goncalo Borges <goncalo.borges@xxxxxxxxxxxxx> wrote:
Hi Sean, Rob.

I saw on the tracker that you were able to resolve the mds assert by manually cleaning the corrupted metadata. Since I am also hitting that issue and I suspect that I will face an mds assert of the same type sooner or later, can you please explain in a bit more detail what operations you performed to clean up the problem?

Cheers

Goncalo

________________________________________

From: ceph-users [ceph-users-bounces@xxxxxxxxxxxxxx] on behalf of Rob Pickerill [r.pickerill@xxxxxxxxx]

Sent: 09 December 2016 07:13

To: Sean Redmond; John Spray

Cc: ceph-users

Subject: Re:  CephFS FAILED assert(dn->get_linkage()->is_null())

Hi John / All

Thank you for the help so far.

To add a further point to Sean's previous email, I see this log entry before the assertion failure:

    -6> 2016-12-08 15:47:08.483700 7fb133dca700 12 mds.0.cache.dir(1000a453344) remove_dentry [dentry #100/stray9/1000a453344/config [2,head] auth NULL (dversion lock) v=540 inode=0 0x55e8664fede0]

    -5> 2016-12-08 15:47:08.484882 7fb133dca700 -1 mds/CDir.cc: In function 'void CDir::try_remove_dentries_for_stray()' thread 7fb133dca700 time 2016-12-08 15:47:08.483704

mds/CDir.cc: 699: FAILED assert(dn->get_linkage()->is_null())

And I can reference this with:

root@ceph-mon1:~/1000a453344# rados -p ven-ceph-metadata-1 listomapkeys 1000a453344.00000000

1470734502_head

config_head

Would we also need to clean up this object? If so, is there a safe way we can do this?
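
Before changing anything on such an object, it seems prudent to save its current omap values. The following is a minimal sketch of that, assuming the pool and object names from the listing above; the output directory and the DRY_RUN switch are hypothetical additions. It reads key names on standard input (one per line, blank lines ignored) and only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: prints each command by default; set DRY_RUN=0 to execute.
DRY_RUN="${DRY_RUN:-1}"
POOL="${POOL:-ven-ceph-metadata-1}"
OBJ="${OBJ:-1000a453344.00000000}"
OUTDIR="${OUTDIR:-./omap-backup}"   # hypothetical backup directory

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Save the value of every omap key read from stdin to its own file.
backup_omap() {
    run mkdir -p "$OUTDIR"
    while IFS= read -r key; do
        [ -n "$key" ] || continue
        run rados -p "$POOL" getomapval "$OBJ" "$key" "$OUTDIR/$key.bin"
    done
}

# Usage:
#   rados -p "$POOL" listomapkeys "$OBJ" | backup_omap
```

This does not answer whether removing the keys is safe; it only keeps a copy of each value so that it can be inspected later.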

Rob

On Thu, 8 Dec 2016 at 19:58 Sean Redmond <sean.redmond1@xxxxxxxxx> wrote:

Hi John,

Thanks for your pointers. I have extracted the omap keys and omap values for an object I found in the metadata pool called '600.00000000' and put them at the location below:

https://www.dropbox.com/sh/wg6irrjg7kie95p/AABk38IB4PXsn2yINpNa9Js5a?dl=0

Could you explain how it is possible to identify stray directory fragments?

Thanks

On Thu, Dec 8, 2016 at 6:30 PM, John Spray <jspray@xxxxxxxxxx> wrote:

On Thu, Dec 8, 2016 at 3:45 PM, Sean Redmond <sean.redmond1@xxxxxxxxx> wrote:

> Hi,

>

> We had no changes going on with the ceph pools or ceph servers at the time.

>

> We have, however, been hitting this in the last week and it may be related:

>

> http://tracker.ceph.com/issues/17177

Oh, okay -- so you've got corruption in your metadata pool as a result of hitting that issue, presumably.

I think in the past people have managed to get past this by taking their MDSs offline and manually removing the omap entries in their stray directory fragments (i.e. using the `rados` cli on the objects starting "600.").
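
A rough sketch of what that could look like with the `rados` cli follows. It rests on assumptions that are not stated in this thread: that the stray directories of an MDS rank have inode numbers 0x600 + rank * 10 + i for i in 0..9, and that the first fragment of each is stored as an object named <inode in hex>.00000000. Fragmented stray directories would have further objects with other suffixes, which this does not cover. The pool name is taken from Rob's listing, the DRY_RUN switch is a hypothetical addition, and nothing is executed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: prints each command by default; set DRY_RUN=0 to execute.
DRY_RUN="${DRY_RUN:-1}"
POOL="${POOL:-ven-ceph-metadata-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Print the assumed object names of the ten stray directories of a rank.
stray_objects() {
    rank="${1:-0}"
    i=0
    while [ "$i" -lt 10 ]; do
        printf '%x.00000000\n' "$((0x600 + rank * 10 + i))"
        i=$((i + 1))
    done
}

# List the omap keys (dentries) held in each stray directory object.
list_stray_keys() {
    for obj in $(stray_objects "${1:-0}"); do
        run rados -p "$POOL" listomapkeys "$obj"
    done
}

# Remove a single omap key from one object (MDS must be offline).
remove_stray_key() {
    run rados -p "$POOL" rmomapkey "$1" "$2"
}

stray_objects 0
```

For rank 0 this prints 600.00000000 through 609.00000000. Removing keys is destructive, so it is worth saving the omap values first and keeping the dry-run default until the object and key names have been checked by hand.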

John

> Thanks

>

> On Thu, Dec 8, 2016 at 3:34 PM, John Spray <jspray@xxxxxxxxxx> wrote:

>>

>> On Thu, Dec 8, 2016 at 3:11 PM, Sean Redmond <sean.redmond1@xxxxxxxxx> wrote:

>> > Hi,

>> >

>> > I have a CephFS cluster that is currently unable to start the mds server as it is hitting an assert, the extract from the mds log is below, any pointers are welcome:

>> >

>> > ceph version 10.2.3 (ecc23778eb545d8dd55e2e4735b53cc93f92e65b)

>> >

>> > 2016-12-08 14:50:18.577038 7f7d9faa3700  1 mds.0.47077 handle_mds_map state change up:rejoin --> up:active
>> > 2016-12-08 14:50:18.577048 7f7d9faa3700  1 mds.0.47077 recovery_done -- successful recovery!
>> > 2016-12-08 14:50:18.577166 7f7d9faa3700  1 mds.0.47077 active_start
>> > 2016-12-08 14:50:19.460208 7f7d9faa3700  1 mds.0.47077 cluster recovered.
>> > 2016-12-08 14:50:19.495685 7f7d9abfc700 -1 mds/CDir.cc: In function 'void CDir::try_remove_dentries_for_stray()' thread 7f7d9abfc700 time 2016-12-08 14:50:19.494508
>> > mds/CDir.cc: 699: FAILED assert(dn->get_linkage()->is_null())

>> >

>> >  ceph version 10.2.3 (ecc23778eb545d8dd55e2e4735b53cc93f92e65b)

>> >  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x80) [0x55f0f789def0]

>> >  2: (CDir::try_remove_dentries_for_stray()+0x1a0) [0x55f0f76666c0]

>> >  3: (StrayManager::__eval_stray(CDentry*, bool)+0x8c9) [0x55f0f75e7799]

>> >  4: (StrayManager::eval_stray(CDentry*, bool)+0x22) [0x55f0f75e7cf2]

>> >  5: (MDCache::scan_stray_dir(dirfrag_t)+0x16d) [0x55f0f753b30d]

>> >  6: (MDSInternalContextBase::complete(int)+0x18b) [0x55f0f76e93db]

>> >  7: (MDSRank::_advance_queues()+0x6a7) [0x55f0f749bf27]

>> >  8: (MDSRank::ProgressThread::entry()+0x4a) [0x55f0f749c45a]

>> >  9: (()+0x770a) [0x7f7da6bdc70a]

>> >  10: (clone()+0x6d) [0x7f7da509d82d]

>> >  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

>>

>> Last time someone had this issue they had tried to create a filesystem using pools that had another filesystem's old objects in:
>> http://tracker.ceph.com/issues/16829

>>

>> What was going on on your system before you hit this?

>>

>> John

>>

>> > Thanks

>> >

>> > _______________________________________________

>> > ceph-users mailing list

>> > ceph-users@xxxxxxxxxxxxxx<mailto:ceph-users@xxxxxxxxxxxxxx>

>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

>> >

>

>

_______________________________________________

ceph-users mailing list

ceph-users@xxxxxxxxxxxxxx<mailto:ceph-users@xxxxxxxxxxxxxx>

http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
