Re: MDS cluster degraded after upgrade to dumpling

Ah, this appears to be http://tracker.ceph.com/issues/6087.
The fix is already in the dumpling branch and will be in the next
release; if you're able and willing to install dev packages, you can
pick it up now.
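
On an Ubuntu box, pulling in the gitbuilder branch packages looks
roughly like the sketch below. Treat the key URL, repo path, and
distro codename as assumptions; adjust them for your distro/arch, or
follow the dev packages instructions in the docs.

  # assumes Ubuntu precise on x86_64; swap in your codename/arch
  wget -q -O- 'https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/autobuild.asc' \
    | sudo apt-key add -
  echo deb http://gitbuilder.ceph.com/ceph-deb-precise-x86_64-basic/ref/dumpling precise main \
    | sudo tee /etc/apt/sources.list.d/ceph-dumpling-branch.list
  sudo apt-get update && sudo apt-get install ceph ceph-mds
  ceph --version                  # confirm the MDS nodes picked up the branch build
  sudo service ceph restart mds   # or restart just the affected mds daemon

Once the mds is running the patched build it should be able to get
past the rejoin/open_ino loop.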
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com


On Wed, Aug 21, 2013 at 2:06 AM, Damien Churchill <damoxc@xxxxxxxxx> wrote:
> I've uploaded the complete log[0]. Just as a warning, it's about 70MB.
>
> [0] damoxc.net/ceph-mds.ceph2.log.1.gz
>
> On 21 August 2013 07:07, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
>> Do you have full logs from the beginning of replay? I believe you
>> should only see this when a client is reconnecting to the MDS with
>> files that the MDS doesn't know about already, which shouldn't happen
>> at all in a single-MDS system. That "pool -1" also looks suspicious,
>> though, and makes me wonder about data corruption or disk format
>> issues; either way, a log would be good.
>> -Greg
>> Software Engineer #42 @ http://inktank.com | http://ceph.com
>>
>> On Tue, Aug 20, 2013 at 5:51 AM, Damien Churchill <damoxc@xxxxxxxxx> wrote:
>>> Hi,
>>>
>>> After upgrading to dumpling I'm unable to get the MDS cluster
>>> running. The active server just sits in the rejoin state, spinning and
>>> causing lots of I/O on the OSDs. Looking at the logs, it appears to be
>>> working through a vast number of missing inodes.
>>>
>>> 2013-08-20 13:50:29.129624 7fde8bd62700 10 mds.0.cache open_ino 10000123fce pool -1 want_replica 0
>>> 2013-08-20 13:50:29.129627 7fde8bd62700 10 mds.0.cache do_open_ino_peer 10000123fce active 0 all 0 checked 0
>>> 2013-08-20 13:50:29.129628 7fde8bd62700 10 mds.0.cache  all MDS peers have been checked
>>> 2013-08-20 13:50:29.129675 7fde8bd62700 10 mds.0.cache   opening missing ino 10000123fcf
>>> 2013-08-20 13:50:29.129677 7fde8bd62700 10 mds.0.cache open_ino 10000123fcf pool -1 want_replica 0
>>> 2013-08-20 13:50:29.129680 7fde8bd62700 10 mds.0.cache do_open_ino_peer 10000123fcf active 0 all 0 checked 0
>>> 2013-08-20 13:50:29.129682 7fde8bd62700 10 mds.0.cache  all MDS peers have been checked
>>> 2013-08-20 13:50:29.129732 7fde8bd62700 10 mds.0.cache   opening missing ino 10000123fd0
>>> 2013-08-20 13:50:29.129735 7fde8bd62700 10 mds.0.cache open_ino 10000123fd0 pool -1 want_replica 0
>>> 2013-08-20 13:50:29.129737 7fde8bd62700 10 mds.0.cache do_open_ino_peer 10000123fd0 active 0 all 0 checked 0
>>> 2013-08-20 13:50:29.129739 7fde8bd62700 10 mds.0.cache  all MDS peers have been checked
>>> 2013-08-20 13:50:29.129784 7fde8bd62700 10 mds.0.cache   opening missing ino 10000123fd1
>>>
>>> Does anyone have any suggestions?
>>>
>>> Thanks in advance!



