On 22/11/11 11:38, Trond Myklebust wrote:
> On Mon, 2011-11-21 at 18:14 +0000, Andrew Cooper wrote:
>> Following some debugging, I believe that the attached patch fixes the
>> problem.
>>
>> Simply returning EAGAIN is not sufficient, as the task does not get
>> requeued, and times out 13 seconds later (as per our mount options).
>> Setting the SOCK_ASYNC_NOSPACE bit causes the requeue to happen.
>>
>> I realize that this is a gross hack and I should probably not be using
>> SOCK_ASYNC_NOSPACE in that way. Is there a better way to achieve the
>> same solution?
>>
> What you are doing will cause the request to be put to sleep with no
> guarantee that it will ever be woken up. Why would we want to do that if
> there is no report of a tcp window/buffer space congestion?

But the reason we get to this code is because there was a report of
space collision. What would you suggest instead? Changing
xs_{tcp,udp}_send_request() to retry in this case would defeat the
point of having xs_nospace().

What should happen is the request getting re-queued to run at the next
available opportunity, rather than perhaps sleeping for a certain
length of time. At the moment, leaving SOCK_ASYNC_NOSPACE unset causes
the request to never be woken, whereas with that bit set the request
always seems to be re-queued at some near point in the future.

-- 
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com
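To make the two positions in this thread concrete, here is a rough user-space sketch of the decision being argued about: sleep only when a no-space condition was actually reported (so a write-space wakeup can be expected), and otherwise keep the request runnable so it is retried at the next opportunity. This is a toy model under my own assumptions, not the xprtsock code. Every name in it (toy_sock, toy_on_eagain, toy_send_request) is hypothetical, and nospace_reported only stands in for the SOCK_ASYNC_NOSPACE bit.

```c
#include <stdbool.h>

/*
 * Toy user-space model, NOT kernel code. All names are hypothetical;
 * nospace_reported merely stands in for the SOCK_ASYNC_NOSPACE bit.
 */
enum toy_action {
    TOY_SENT,                 /* data was accepted by the socket */
    TOY_WAIT_FOR_WRITE_SPACE, /* sleep; a write-space event is expected to wake us */
    TOY_REQUEUE_SOON          /* stay runnable and retry at the next opportunity */
};

struct toy_sock {
    int  free_bytes;       /* space left in the modelled send buffer */
    bool nospace_reported; /* was a no-space condition flagged on the socket? */
};

/* Decide what to do with a request whose send attempt hit EAGAIN. */
static enum toy_action toy_on_eagain(bool nospace_reported)
{
    if (nospace_reported)
        return TOY_WAIT_FOR_WRITE_SPACE; /* a wakeup should follow */

    /* Nothing promises a wakeup, so do not sleep until the timeout. */
    return TOY_REQUEUE_SOON;
}

/* Try to send len bytes; fall back to the EAGAIN policy when they do not fit. */
static enum toy_action toy_send_request(struct toy_sock *sk, int len)
{
    if (len <= sk->free_bytes) {
        sk->free_bytes -= len;
        return TOY_SENT;
    }
    /* A real send would return -EAGAIN at this point. */
    return toy_on_eagain(sk->nospace_reported);
}
```

In this model the TOY_REQUEUE_SOON branch is the case the patch works around by setting the bit by hand. Whether the real transport can safely requeue there without a write-space event is exactly the open question above.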