Re: MDS stuck in 'rejoin' after network fragmentation caused OSD flapping

ceph version 13.2.0 (79a10589f1f80dfe21e8f9794365ed98143071c4) mimic (stable)


On Wed, Aug 15, 2018 at 10:51 PM, Yan, Zheng <ukernel@xxxxxxxxx> wrote:
> On Thu, Aug 16, 2018 at 10:50 AM Jonathan Woytek <woytek@xxxxxxxxxxx> wrote:
>>
>> Actually, I missed it--I do see the wipe start, wipe done in the log.
>> However, it is still doing verify_diri_backtrace, as described
>> previously.
>>
>
> Which version of mds do you use?
>
>> jonathan
>>
>> On Wed, Aug 15, 2018 at 10:42 PM, Jonathan Woytek <woytek@xxxxxxxxxxx> wrote:
>> > On Wed, Aug 15, 2018 at 9:40 PM, Yan, Zheng <ukernel@xxxxxxxxx> wrote:
>> >> How many clients reconnected when the mds restarted?  The issue is likely
>> >> that the reconnected clients held too many inodes, and the mds was opening
>> >> these inodes in the rejoin state.  Try starting the mds with the option
>> >> mds_wipe_sessions = true. The option makes the mds ignore old clients
>> >> during recovery.  You need to unset the option and remount the clients
>> >> after the mds becomes 'active'.
>> >
>> >
>> > Thank you for the suggestion! I set that in the global section of
>> > ceph.conf on the node where I am starting ceph-mds. After setting it
>> > and starting ceph-mds, I'm not seeing markedly different behavior.
>> > After flying through replay and then flying through a bunch of the
>> > messages posted earlier, it begins to eat up memory again and slows
>> > down, still outputting the log messages as in the original post.
>> > Looking in the ceph-mds...log, I'm not seeing any reference to 'wipe',
>> > so I'm not sure if it is being honored. Am I putting that in the right
>> > place?
>> >
>> > jonathan
>> > --
>> > Jonathan Woytek
>> > http://www.dryrose.com
>> > KB3HOZ
>> > PGP:  462C 5F50 144D 6B09 3B65  FCE8 C1DC DEC4 E8B6 AABC
>>
>>
>>
>> --
>> Jonathan Woytek
>> http://www.dryrose.com
>> KB3HOZ
>> PGP:  462C 5F50 144D 6B09 3B65  FCE8 C1DC DEC4 E8B6 AABC
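
For anyone else wondering whether the option quoted above ended up in the right place, here is a rough sketch of a check that reports which sections of a ceph.conf set a given option. It is only an illustration: the helper names (find_option, normalize) and the SAMPLE text are made up for this example, and it assumes Python's configparser is close enough to Ceph's own parser for a simple file. It also assumes that spaces, dashes and underscores in option names can be treated alike.

```python
import configparser


def normalize(name):
    # Assumption: "mds wipe sessions", "mds-wipe-sessions" and
    # "mds_wipe_sessions" all refer to the same option.
    return name.strip().lower().replace(" ", "_").replace("-", "_")


def find_option(conf_text, option):
    """Return {section: value} for every section that sets `option`."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    wanted = normalize(option)
    found = {}
    for section in parser.sections():
        for key, value in parser.items(section):
            if normalize(key) == wanted:
                found[section] = value
    return found


# Hypothetical ceph.conf contents, not taken from the cluster in this thread.
SAMPLE = """
[global]
fsid = 00000000-0000-0000-0000-000000000000
mds wipe sessions = true

[mds]
keyring = /var/lib/ceph/mds/$cluster-$id/keyring
"""

if __name__ == "__main__":
    print(find_option(SAMPLE, "mds_wipe_sessions"))
```

A setting in [global] should apply to every daemon on that host, including ceph-mds, so finding it there (rather than in [mds]) is not by itself a problem. This only inspects the file; the value the running daemon actually picked up can usually be queried through its admin socket, for example with "ceph daemon mds.<id> config get mds_wipe_sessions".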



-- 
Jonathan Woytek
http://www.dryrose.com
KB3HOZ
PGP:  462C 5F50 144D 6B09 3B65  FCE8 C1DC DEC4 E8B6 AABC
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


