On Sat, Dec 10, 2016 at 11:50 PM, Sean Redmond <sean.redmond1@xxxxxxxxx> wrote:
> Hi Goncalo,
>
> With the output from "ceph tell mds.0 damage ls" we tracked the inodes of
> two damaged directories using 'find /mnt/ceph/ -inum $inode'. After
> reviewing the paths involved we confirmed a backup was available for this
> data, so we ran "ceph tell mds.0 damage rm $inode" on the two inodes. We
> then marked the mds as repaired with "ceph mds repaired 0".
>
> We have restarted the mds to confirm it is not hitting any asserts; we are
> now just enabling scrubs and running "ls -R /mnt/ceph" to see if we hit
> any further problems.

Hi Sean,

What level of confidence do you have that
http://tracker.ceph.com/issues/17177 was the root cause (just so we know
whether we should be looking for something else or not)?

--
Cheers,
Brad

> Thanks
>
> On Fri, Dec 9, 2016 at 11:37 PM, Chris Sarginson <csargiso@xxxxxxxxx> wrote:
>>
>> Hi Goncalo,
>>
>> In the end we ascertained that the assert was coming from reading corrupt
>> data in the mds journal. We followed the sections at the following link
>> (http://docs.ceph.com/docs/jewel/cephfs/disaster-recovery/) in order,
>> down to (and including) MDS table wipes (only wiping the "session" table
>> in the final step). This resolved the problem we had with our mds
>> asserting.
>>
>> We have also run a cephfs scrub to validate the data (ceph daemon mds.0
>> scrub_path / recursive repair), which has resulted in a "metadata damage
>> detected" health warning. This seems to perform a read of all objects in
>> the cephfs rados pools (anecdotal: the scan against the data pool was
>> much faster than the scan against the metadata pool itself).
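For anyone wanting to replay Sean's steps from the top of the thread (damage ls -> find -inum -> damage rm -> mds repaired), a dry-run sketch is below. The JSON is invented sample output and the field names ("id", "ino") are assumptions about the damage table format; the ceph/find commands are only printed, never executed:

```shell
#!/bin/sh
# Dry-run sketch of the repair workflow. The JSON is made-up sample
# "ceph tell mds.0 damage ls" output; the inode values are invented.
damage_json='[{"damage_type":"dir_frag","id":101,"ino":1099514000000},
              {"damage_type":"dir_frag","id":102,"ino":1099514000001}]'

# Pull the inode numbers out of the damage list.
inodes=$(printf '%s' "$damage_json" | python3 -c \
  'import sys, json; print(" ".join(str(d["ino"]) for d in json.load(sys.stdin)))')

for ino in $inodes; do
  # Map each damaged inode back to a path, and (only after confirming a
  # backup exists for that path) drop its damage table entry.
  echo "find /mnt/ceph/ -inum $ino"
  echo "ceph tell mds.0 damage rm $ino"
done
# Finally clear the rank's damaged flag:
echo "ceph mds repaired 0"
```

Obviously the damage rm step is destructive on a real cluster, hence the echo-only form here.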
>>
>> We are now working with the output of "ceph tell mds.0 damage ls", and
>> looking at the following mailing list post as a starting point for
>> proceeding with that:
>> http://ceph-users.ceph.narkive.com/EfFTUPyP/how-to-fix-the-mds-damaged-issue
>>
>> Chris
>>
>> On Fri, 9 Dec 2016 at 19:26 Goncalo Borges <goncalo.borges@xxxxxxxxxxxxx>
>> wrote:
>>>
>>> Hi Sean, Rob.
>>>
>>> I saw on the tracker that you were able to resolve the mds assert by
>>> manually cleaning the corrupted metadata. Since I am also hitting that
>>> issue, and I suspect that I will face an mds assert of the same type
>>> sooner or later, can you please explain a bit further what operations
>>> you did to clean up the problem?
>>>
>>> Cheers
>>> Goncalo
>>>
>>> ________________________________________
>>> From: ceph-users [ceph-users-bounces@xxxxxxxxxxxxxx] on behalf of Rob
>>> Pickerill [r.pickerill@xxxxxxxxx]
>>> Sent: 09 December 2016 07:13
>>> To: Sean Redmond; John Spray
>>> Cc: ceph-users
>>> Subject: Re: CephFS FAILED assert(dn->get_linkage()->is_null())
>>>
>>> Hi John / All
>>>
>>> Thank you for the help so far.
>>>
>>> To add a further point to Sean's previous email, I see this log entry
>>> before the assertion failure:
>>>
>>>     -6> 2016-12-08 15:47:08.483700 7fb133dca700 12 mds.0.cache.dir(1000a453344) remove_dentry [dentry #100/stray9/1000a453344/config [2,head] auth NULL (dversion lock) v=540 inode=0 0x55e8664fede0]
>>>     -5> 2016-12-08 15:47:08.484882 7fb133dca700 -1 mds/CDir.cc: In function 'void CDir::try_remove_dentries_for_stray()' thread 7fb133dca700 time 2016-12-08 15:47:08.483704
>>> mds/CDir.cc: 699: FAILED assert(dn->get_linkage()->is_null())
>>>
>>> And I can cross-reference this with:
>>>
>>> root@ceph-mon1:~/1000a453344# rados -p ven-ceph-metadata-1 listomapkeys 1000a453344.00000000
>>> 1470734502_head
>>> config_head
>>>
>>> Would we also need to clean up this object, and if so, is there a safe
>>> way we can do this?
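A side note on Rob's listomapkeys command above: CephFS names a file or directory's objects "<inode in hex>.<chunk index as 8 hex digits>", so a decimal inode (e.g. from "find -inum") can be mapped straight to the metadata object to inspect. A small sketch (the pool name is reused from Rob's output; the decimal inode here is just his hex value 1000a453344 converted back):

```shell
# Map a decimal inode number to the name of its first object in the
# metadata pool: "<hex inode>.<8-hex-digit chunk index>".
ino_dec=1099683935044                 # decimal form of hex 1000a453344
obj=$(printf '%x.%08x' "$ino_dec" 0)
echo "$obj"                           # -> 1000a453344.00000000
# The inspection command would then be (not executed here):
echo "rados -p ven-ceph-metadata-1 listomapkeys $obj"
```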
>>>
>>> Rob
>>>
>>> On Thu, 8 Dec 2016 at 19:58 Sean Redmond <sean.redmond1@xxxxxxxxx> wrote:
>>> Hi John,
>>>
>>> Thanks for your pointers. I have extracted the omap keys and omap values
>>> for an object I found in the metadata pool called '600.00000000' and
>>> dropped them at the below location:
>>>
>>> https://www.dropbox.com/sh/wg6irrjg7kie95p/AABk38IB4PXsn2yINpNa9Js5a?dl=0
>>>
>>> Could you explain how it is possible to identify stray directory
>>> fragments?
>>>
>>> Thanks
>>>
>>> On Thu, Dec 8, 2016 at 6:30 PM, John Spray <jspray@xxxxxxxxxx> wrote:
>>> On Thu, Dec 8, 2016 at 3:45 PM, Sean Redmond <sean.redmond1@xxxxxxxxx> wrote:
>>> > Hi,
>>> >
>>> > We had no changes going on with the ceph pools or ceph servers at the
>>> > time.
>>> >
>>> > We have however been hitting this in the last week and it may be
>>> > related:
>>> >
>>> > http://tracker.ceph.com/issues/17177
>>>
>>> Oh, okay -- so you've got corruption in your metadata pool as a result
>>> of hitting that issue, presumably.
>>>
>>> I think in the past people have managed to get past this by taking
>>> their MDSs offline and manually removing the omap entries in their
>>> stray directory fragments (i.e. using the `rados` cli on the objects
>>> starting "600.").
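To expand on John's hint about the objects starting "600.": if I recall the jewel-era layout correctly, each MDS rank has ten stray directories, and for rank 0 their inodes are 0x600 through 0x609, so their first dirfrags live in objects 600.00000000 through 609.00000000 in the metadata pool. A dry-run sketch (pool name reused from Rob's output, the dentry key is a placeholder, and nothing is executed against a cluster):

```shell
# Print the inspection command for each of rank 0's ten stray dirfrag
# objects (stray0..stray9 -> inodes 0x600..0x609).
pool=ven-ceph-metadata-1
for i in 0 1 2 3 4 5 6 7 8 9; do
  obj=$(printf '%x.%08x' $((0x600 + i)) 0)
  echo "rados -p $pool listomapkeys $obj"
done
# A bad dentry found this way could then be removed -- with the MDS
# stopped first -- along these lines, where <name>_head stands in for an
# actual omap key from the listing above:
echo "rados -p $pool rmomapkey 609.00000000 <name>_head"
```

The #100/stray9/1000a453344/config dentry in Rob's log would land in the stray9 dirfrag, i.e. object 609.00000000, which is consistent with this naming.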
>>>
>>> John
>>>
>>> > Thanks
>>> >
>>> > On Thu, Dec 8, 2016 at 3:34 PM, John Spray <jspray@xxxxxxxxxx> wrote:
>>> >>
>>> >> On Thu, Dec 8, 2016 at 3:11 PM, Sean Redmond <sean.redmond1@xxxxxxxxx>
>>> >> wrote:
>>> >> > Hi,
>>> >> >
>>> >> > I have a CephFS cluster that is currently unable to start the mds
>>> >> > server, as it is hitting an assert; the extract from the mds log is
>>> >> > below, any pointers are welcome:
>>> >> >
>>> >> > ceph version 10.2.3 (ecc23778eb545d8dd55e2e4735b53cc93f92e65b)
>>> >> >
>>> >> > 2016-12-08 14:50:18.577038 7f7d9faa3700  1 mds.0.47077 handle_mds_map state change up:rejoin --> up:active
>>> >> > 2016-12-08 14:50:18.577048 7f7d9faa3700  1 mds.0.47077 recovery_done -- successful recovery!
>>> >> > 2016-12-08 14:50:18.577166 7f7d9faa3700  1 mds.0.47077 active_start
>>> >> > 2016-12-08 14:50:19.460208 7f7d9faa3700  1 mds.0.47077 cluster recovered.
>>> >> > 2016-12-08 14:50:19.495685 7f7d9abfc700 -1 mds/CDir.cc: In function 'void CDir::try_remove_dentries_for_stray()' thread 7f7d9abfc700 time 2016-12-08 14:50:19.494508
>>> >> > mds/CDir.cc: 699: FAILED assert(dn->get_linkage()->is_null())
>>> >> >
>>> >> > ceph version 10.2.3 (ecc23778eb545d8dd55e2e4735b53cc93f92e65b)
>>> >> > 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x80) [0x55f0f789def0]
>>> >> > 2: (CDir::try_remove_dentries_for_stray()+0x1a0) [0x55f0f76666c0]
>>> >> > 3: (StrayManager::__eval_stray(CDentry*, bool)+0x8c9) [0x55f0f75e7799]
>>> >> > 4: (StrayManager::eval_stray(CDentry*, bool)+0x22) [0x55f0f75e7cf2]
>>> >> > 5: (MDCache::scan_stray_dir(dirfrag_t)+0x16d) [0x55f0f753b30d]
>>> >> > 6: (MDSInternalContextBase::complete(int)+0x18b) [0x55f0f76e93db]
>>> >> > 7: (MDSRank::_advance_queues()+0x6a7) [0x55f0f749bf27]
>>> >> > 8: (MDSRank::ProgressThread::entry()+0x4a) [0x55f0f749c45a]
>>> >> > 9: (()+0x770a) [0x7f7da6bdc70a]
>>> >> > 10: (clone()+0x6d) [0x7f7da509d82d]
>>> >> > NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>>> >>
>>> >> Last time someone had this issue they had tried to create a filesystem
>>> >> using pools that had another filesystem's old objects in them:
>>> >> http://tracker.ceph.com/issues/16829
>>> >>
>>> >> What was going on on your system before you hit this?
>>> >>
>>> >> John
>>> >>
>>> >> > Thanks

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com