Re: mds laggy issue

On Thu, Mar 29, 2018 at 9:09 AM, 陶冬冬 <tdd21151186@xxxxxxxxx> wrote:
> Patrick, we are using 32 GB of memory for the MDS.
> Zheng, calling mds->heartbeat_reset() would make the health check pass, so
> the monitor won't kick the MDS out.
> What frustrates me more is the laggy issue: from the monitor log, I can
> see that the MDSBeacon messages are sent without delay,
> but the MDS only starts handling that MDSBeacon message about 50 seconds later.
> So I'm wondering whether the message could have stayed in the mqueue for
> that long (for example, if the previous message is an MDSMap with the rejoin
> state, and that rejoin takes a long time).
> (In the meantime, I will investigate this issue further.)
>

If it's really caused by a long wait in the mqueue, we should limit the
number of concurrent open_ino operations started by
MDCache::process_imported_caps().
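
To illustrate the idea, here is a minimal standalone sketch of capping the
number of asynchronous operations in flight. It is not Ceph code:
InFlightLimiter, Op and Done are hypothetical names, and the sketch is
single-threaded and ignores locking. An actual change would have to fit into
MDCache and its existing completion contexts.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical helper (not part of Ceph): allows at most max_in_flight
// asynchronous operations to run at once and queues the rest.
class InFlightLimiter {
public:
  using Done = std::function<void()>;    // the op calls this when it completes
  using Op = std::function<void(Done)>;  // starts one asynchronous operation

  explicit InFlightLimiter(std::size_t max_in_flight) : max_(max_in_flight) {
    assert(max_ > 0);
  }

  // Queue the op; it starts immediately if we are below the cap.
  void submit(Op op) {
    pending_.push_back(std::move(op));
    pump();
  }

  std::size_t in_flight() const { return in_flight_; }
  std::size_t queued() const { return pending_.size(); }

private:
  // Start queued ops until the cap is reached or the queue is empty.
  void pump() {
    while (in_flight_ < max_ && !pending_.empty()) {
      Op op = std::move(pending_.front());
      pending_.pop_front();
      ++in_flight_;
      // Each completion frees one slot and tries to start the next op.
      op([this] {
        --in_flight_;
        pump();
      });
    }
  }

  std::size_t max_;
  std::size_t in_flight_ = 0;
  std::deque<Op> pending_;
};
```

With a cap like this, each batch of work started under the lock stays
bounded, so other queued messages would get a chance to be dispatched between
batches. Whether that is enough to fix the beacon delay here is an assumption
that still needs to be verified.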


> On Mar 29, 2018, at 8:10 AM, Yan, Zheng <ukernel@xxxxxxxxx> wrote:
>
> On Wed, Mar 28, 2018 at 11:14 PM, 陶冬冬 <tdd21151186@xxxxxxxxx> wrote:
>
> Hi Zheng & Patrick,
>
> We are using v12.2.2.
> Recently we hit an MDS laggy issue (a significant delay of about 50 seconds).
> I traced the monitor and MDS logs and found that the MMDSBeacon message
> had actually been sent to the MDS 50 seconds earlier.
> So it looks like the monitor isn't laggy. Worse, I also found
> that the MDS's health check failed, and eventually the monitor
> kicked this MDS out and made it respawn.
> By the way, this happened during the rejoin phase.
>
> Here is my analysis:
> The MDS health check fails because the MDS tick thread cannot get
> the mds_lock during rejoin (I found that rejoin has many missing inodes that
> it needs to fetch).
> This also makes the Dispatcher consume the mqueue of the DispatchQueue
> very slowly, so the MMDSBeacon in the mqueue is eventually dispatched after
> a long delay.
>
>
> How about calling mds->heartbeat_reset() in the loop that fetches inodes?
>
>
>
> I want to know whether my analysis makes sense to you. If so, I'm wondering
> whether we can make MMDSBeacon use fast dispatch.
>
> Regards,
> Dongdong
>
>