On Thu, Oct 10, 2013 at 1:31 AM, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
> Right now the best way to deal with this is unfortunately to get logs
> and figure out what operation got blocked. Can you add
>
> debug mds = 20
> debug ms = 1
> debug journaler = 20
>
> to your mds config, restart, and then search through/post that log
> somewhere we can check it out? (It will be large.)
> -Greg
> Software Engineer #42 @ http://inktank.com | http://ceph.com
>
> On Tue, Oct 8, 2013 at 10:21 PM, Dong Yuan <yuandong1222@xxxxxxxxx> wrote:
>> I think I found a bug in the clientreplay handling of the mds.
>> (different from #4742)
>>
>> After failover, the standby mds begins to recover, enters the
>> clientreplay state, and never moves to the next state (Active).
>>
>> I took a core dump of the mds process with gcore, examined it in gdb,
>> and found that the main thread (dispatch thread) is idle and
>> mdcache->active_request is empty, but mds->replay_queue still has one
>> element, which is strange.
>>
>> According to the code, replay_queue holds all requests that need to be
>> replayed. When the mds enters the clientreplay state,
>> MDS::queue_one_replay is called to pick a request from the
>> replay_queue and put it into finished_queue. That starts the replay.
>>
>> After the first replay request has finished, MDS::queue_one_replay
>> should be called again to deal with the next replay request. There are
>> three paths that do this:
>> 1) Server::journal_and_reply
>> 2) MDCache::request_cleanup
>> 3) Server::handle_client_request
>>
>> But it seems that none of these paths called MDS::queue_one_replay. As
>> a result, the mds is stuck in the clientreplay state.
>>
>> Maybe there is a request-processing path that never goes through the
>> three methods above. But I can't find the previous request; it seems
>> to have completed and been cleaned up from the MDCache.
>>
>> Does anyone have any ideas about this problem?

Could you please check whether MDCache::open is true?

Regards
Yan, Zheng

>>
>> I can give more details if needed.
>> I have the core dump but it is too big (300MB+) to upload.
>>
>> Thanks for any help.
>>
>> --
>> Dong Yuan
>> Email:yuandong1222@xxxxxxxxx
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
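
For readers following the thread, the replay chaining described in the quoted report can be sketched as a toy model. This is not Ceph code: ToyMDS, dispatch, run, and the completion_requeues callback are hypothetical names made up for illustration, and the switch to "active" once replay_queue is empty is an assumption of the model. It only shows why a single completion path that skips queue_one_replay would leave the daemon idle with entries still sitting in replay_queue.

```python
from collections import deque


class ToyMDS:
    """Toy model of the replay chaining from the report; not Ceph code."""

    def __init__(self, requests):
        self.replay_queue = deque(requests)  # requests still waiting to be replayed
        self.finished_queue = deque()        # requests handed to the dispatcher
        self.state = "clientreplay"

    def queue_one_replay(self):
        """Hand one request to the dispatcher.

        Assumption of this model: an empty replay_queue means replay is
        done and the state becomes "active".
        """
        if not self.replay_queue:
            self.state = "active"
            return False
        self.finished_queue.append(self.replay_queue.popleft())
        return True

    def dispatch(self, completion_requeues):
        """Process handed-over requests until the dispatcher is idle.

        completion_requeues(req) models whether the completion path taken
        by req calls queue_one_replay again.
        """
        while self.finished_queue:
            req = self.finished_queue.popleft()
            if completion_requeues(req):
                self.queue_one_replay()


def run(requests, completion_requeues):
    """Enter clientreplay, replay what we can, and report where we ended up."""
    mds = ToyMDS(requests)
    mds.queue_one_replay()  # entering clientreplay starts the first request
    mds.dispatch(completion_requeues)
    return mds.state, list(mds.replay_queue)


# Every completion path re-queues: replay drains and the model goes active.
healthy = run(["req1", "req2"], lambda req: True)

# The completion path of req1 never re-queues: the dispatcher goes idle
# while req2 is still waiting in replay_queue.
stuck = run(["req1", "req2"], lambda req: req != "req1")
```

In the second run the model ends in the same shape as the core dump in the report: nothing in flight, an idle dispatcher, and one element left in replay_queue. Which real code path, if any, behaves like the skipped completion is exactly what the debug logs requested above would need to show.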