On Fri, 21 May 2010, Henry C Chang wrote:
> These errors are similar to the issue mentioned in
> http://ceph.newdream.net/git/?p=ceph-client-standalone.git;a=commit;h=f131f56ea2a09edc5de227975ef7b57ba1592880.
> But, it does not work for my case.
>
> So, I traced the client and mds logs. It looks like
> ceph_mdsc_do_request() does not handle ERESTARTSYS properly.
> The logs show that the file creation request had been sent to MDS, and
> MDS did complete the request.
> However, before the reply has been received, the request on the client
> side was aborted due to some interrupt.
> VFS got the ERESTARTSYS error and tried to create the file again.
> Since MDS has already created the file, "File exists" errors were returned.
>
> Then, I replaced "wait_for_completion_interruptible" with
> "wait_for_completion" in ceph_mdsc_do_request().
> The above errors were gone, but I don't feel it is a right way to fix
> it. Any ideas?

Hmm. The problem is that ideally we want to be able to control-c an
operation if the mds is slow or hung. However, hitting control-c triggers
an ERESTARTSYS, same as a bunch of other signals that come up.

If there is a way for the client to tell that it is retrying the _same_
vfs op and somehow match that up with the pending request, then we could
do that, but I'm skeptical that would work (or be very pretty).

It may be we need to drop the ability to control-c on a slow/hung
operation. :/ There is probably a way to tell which signal is pending
when we get ERESTARTSYS? I wonder what NFS does in this case?

I created
	http://tracker.newdream.net/issues/141

sage