Hi,

On Tue, 21 Aug 2018, longer_mail@xxxxxxx wrote:
> Hi, Sage
> I am a ceph learner, and I have encountered the problem below:
> In src/mds/mdstypes.h, there is a map completed_requests which records the requests that the mds has replied to the client.
> But in some cases, the mds reply message may be lost; in other words, the client (kernel) doesn't receive the reply message from
> the mds. Then the client doesn't do the __unregister_request(), so the oldest_tid is still the last r_tid.
> When the next req comes from the client, the mds calls trim_completed_requests(), but the oldest_tid is still the last one, so nothing is erased
> from completed_requests. In this way, completed_requests becomes bigger and bigger, and when it is more than 90M,
> the osd returns -OSD_WRITETOOBIG, which leads to the mds going readonly.

Can you explain what circumstances lead to the reply being lost? The
client<->MDS session should be stateful and provide reliable, ordered
delivery of messages... including the reply.

Is this something you've encountered in production? Is it reproducible?

> Here are my solutions,
> 1. in the osd do_op() process, divide the data.length into more than one op
> I think this method may not be good, because the osd handles one op after another, so it may be complex this way.

Yeah, the problem is clearly that we are not trimming completed_requests
properly.

> 2. in the kernel code, add a member last_reply_tid which records the last req that was replied to by the mds and whose reply the kernel
> received. When the mds receives a new req from the kernel, erase the last_reply_tid from completed_requests; this way
> completed_requests will not become too big.

The requests will not necessarily be replied to in the order they were
submitted, so I don't think this would work. But... the reply delivery
should be reliable, so I'm not sure why this would be happening.

Thanks!
sage
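
For readers following the thread, the trimming behaviour under discussion can be sketched roughly as below. This is a simplified standalone model, not the actual Ceph code: `SessionModel`, `Tid`, and `simulate` are hypothetical names invented for illustration, and the only assumption taken from the thread is that the MDS erases completed_requests entries whose tid is below the oldest_tid reported by the client.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

using Tid = std::uint64_t;  // hypothetical stand-in for the request tid type

// Hypothetical, simplified model of per-session state on the MDS side.
struct SessionModel {
  // tid -> inode created by the request (0 if none); models completed_requests.
  std::map<Tid, std::uint64_t> completed_requests;

  void add_completed_request(Tid tid, std::uint64_t created_ino) {
    completed_requests[tid] = created_ino;
  }

  // Erase every entry older than the oldest tid the client still tracks.
  void trim_completed_requests(Tid oldest_tid) {
    while (!completed_requests.empty() &&
           completed_requests.begin()->first < oldest_tid) {
      completed_requests.erase(completed_requests.begin());
    }
  }
};

// Send tids 1..n. Each request carries the client's current oldest_tid.
// If the client never sees a reply, it never unregisters the request,
// so oldest_tid never advances (the situation described in the thread).
std::size_t simulate(Tid n, bool client_sees_replies) {
  SessionModel session;
  Tid oldest_tid = 1;
  for (Tid tid = 1; tid <= n; ++tid) {
    session.trim_completed_requests(oldest_tid);
    session.add_completed_request(tid, 0);
    if (client_sees_replies) {
      oldest_tid = tid + 1;  // client unregistered the request
    }
  }
  return session.completed_requests.size();
}
```

In this model, `simulate(1000, true)` leaves a single entry in the map, while `simulate(1000, false)` leaves all 1000, which is the unbounded growth the original poster describes. It says nothing about why a reply would be lost in the first place.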