On Thu, Mar 29, 2018 at 9:09 AM, 陶冬冬 <tdd21151186@xxxxxxxxx> wrote:
> Patrick, we are using 32G of memory for the mds.
> Zheng, calling mds->heartbeat_reset() makes the health check pass, so
> the monitor won't kick the mds out.
> What frustrates me more is the laggy issue: from the monitor log, I can
> see that the MDSBeacon messages are sent without delay,
> but the mds only starts handling an MDSBeacon about 50 seconds later.
> So I'm wondering whether the message could have stayed in the mqueue for
> that long (if the previous message is an MDSMap with the rejoin state, and
> that rejoin takes a long time).
> (Meanwhile, I will investigate this issue further.)

If it's really caused by a long wait in the mqueue, we should limit the
number of concurrent open_ino operations started by
MDCache::process_imported_caps().

> On Mar 29, 2018, at 8:10 AM, Yan, Zheng <ukernel@xxxxxxxxx> wrote:
>
> > On Wed, Mar 28, 2018 at 11:14 PM, 陶冬冬 <tdd21151186@xxxxxxxxx> wrote:
> > >
> > > Hi Zheng & Patrick,
> > >
> > > We are using v12.2.2.
> > > Recently we hit an mds laggy issue (a significant one, about 50 seconds).
> > > I traced the monitor and mds logs and found that the MMDSBeacon message
> > > had actually been sent to the mds 50 seconds earlier.
> > > So it looks like the monitor isn't laggy. Worse, I also found
> > > that the mds's health check failed, and eventually the monitor
> > > kicked this mds out and made it respawn.
> > > By the way, this happened in the rejoin phase.
> > >
> > > Here is my analysis:
> > > The mds health check fails because the mds tick thread cannot get
> > > the mds_lock during rejoin. (I found that rejoin has many missing inos
> > > that need to be fetched.)
> > > This also makes the Dispatcher consume the mqueue of the DispatchQueue
> > > very slowly, so the MMDSBeacon in the mqueue is eventually dispatched
> > > after a long delay.
> >
> > How about calling mds->heartbeat_reset() in the loop that fetches inodes?
> >
> > > I want to know whether my analysis makes sense to you. If so, I'm
> > > wondering whether we can make MMDSBeacon fast dispatch.
> > >
> > > Regards,
> > > Dongdong
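
The queueing effect suspected in this thread can be illustrated with a toy model. The sketch below is not Ceph code: `Msg`, `dispatch_delays`, and the `fast_type` parameter are hypothetical names made up for illustration, and the 50-second handler cost is simply the delay reported above. It assumes a single dispatcher that drains the queue strictly in order, and it models "fast dispatch" as nothing more than bypassing that queue.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Toy model only; none of these names are Ceph types.
struct Msg {
    std::string type;   // e.g. "MDSMap(rejoin)" or "MDSBeacon"
    double arrival_s;   // time the message entered the queue
    double cost_s;      // time its handler keeps the dispatcher busy
};

// For each message, return the seconds between its arrival and the moment
// its handler starts, assuming one dispatcher drains the queue in order.
// If fast_type is non-empty, messages of that type skip the queue and are
// handled on arrival (an assumed simplification of fast dispatch).
std::vector<double> dispatch_delays(const std::vector<Msg>& queue,
                                    const std::string& fast_type = "") {
    std::vector<double> delays;
    delays.reserve(queue.size());
    double busy_until = 0.0;    // when the dispatcher next becomes free
    double last_arrival = 0.0;  // used to check the input ordering
    for (const Msg& m : queue) {
        // Precondition: messages are listed in arrival order.
        assert(m.arrival_s >= last_arrival);
        last_arrival = m.arrival_s;
        if (!fast_type.empty() && m.type == fast_type) {
            delays.push_back(0.0);  // bypasses the queue entirely
            continue;
        }
        double start = std::max(busy_until, m.arrival_s);
        delays.push_back(start - m.arrival_s);
        busy_until = start + m.cost_s;  // head-of-line blocking starts here
    }
    return delays;
}
```

With a queue of `{"MDSMap(rejoin)", 0.0, 50.0}` followed by `{"MDSBeacon", 0.0, 0.001}`, the model reports a 50-second delay for the beacon. Passing `"MDSBeacon"` as `fast_type` brings that delay to zero. The model does not cover the mds_lock contention described in the analysis.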