Hi Dave,

I found the problem that is causing this issue. The logic around the threshold calculation works as expected: I saw the problem even when there was a lot of space left and xlog_grant_push_ail() returned with free space available.

The problem is in the way l_reserveq and l_writeq are handled. When we wake the processes sleeping on l_reserveq and l_writeq through wake_up(), we do not remove them from the queue; we expect each process to remove itself from the list (and we drop the lock). But before a woken process gets a chance to remove itself, some other process p1 comes in, sees that the queue is not empty, and puts itself at the end of the queue. All the woken processes then remove themselves from the queue and move on, whereas p1 just stays stuck in the queue. Any new process that comes in also goes to the end of the queue, and all of them get stuck.

The problem doesn't happen if there is a lot of activity, because some process then calls xfs_log_move_tail() or pushes the AIL (through xlog_grant_push_ail()). But with no activity, these processes are never woken up.

IMO, the right solution is to remove each item from the list when we wake it up. I tried the change and it works as expected. I will send the patch to the list shortly.

Regards,
Chandra
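
P.S. To make the interleaving described above concrete, here is a minimal single-threaded sketch in C. It is not the XFS code and not the patch: struct waitq, struct waiter, arrive() and run_scenario() are hypothetical names made up for this illustration, and a wake-up is modeled as a plain flag rather than a real wake_up() call. It only replays the order of events (wake, late arrival, self-removal) with and without dequeuing at wake-up time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a wait queue such as l_reserveq; not the XFS code. */
#define MAX_WAITERS 8

struct waiter {
    bool queued;  /* still linked on the queue */
    bool woken;   /* a wake-up has been delivered to this waiter */
};

struct waitq {
    struct waiter *slots[MAX_WAITERS];
    size_t count;
};

static void waitq_add_tail(struct waitq *q, struct waiter *w)
{
    assert(q->count < MAX_WAITERS);
    w->queued = true;
    w->woken = false;
    q->slots[q->count++] = w;
}

static void waitq_remove(struct waitq *q, struct waiter *w)
{
    for (size_t i = 0; i < q->count; i++) {
        if (q->slots[i] != w)
            continue;
        /* Close the gap left by the removed entry. */
        for (size_t j = i + 1; j < q->count; j++)
            q->slots[j - 1] = q->slots[j];
        q->count--;
        w->queued = false;
        return;
    }
}

/* Waker side: mark every waiter woken; optionally dequeue them as well. */
static void waitq_wake_all(struct waitq *q, bool dequeue_on_wake)
{
    for (size_t i = 0; i < q->count; i++)
        q->slots[i]->woken = true;
    if (dequeue_on_wake) {
        for (size_t i = 0; i < q->count; i++)
            q->slots[i]->queued = false;
        q->count = 0;
    }
}

/* New arrival: queues itself if anyone is already waiting.
 * Returns true if the caller went to sleep on the queue. */
static bool arrive(struct waitq *q, struct waiter *w)
{
    if (q->count > 0) {
        waitq_add_tail(q, w);
        return true;
    }
    return false;
}

/* Replays the interleaving and returns how many waiters are left on the
 * queue with no wake-up delivered, i.e. stuck until some later activity. */
static size_t run_scenario(bool dequeue_on_wake)
{
    struct waitq q = { .count = 0 };
    struct waiter a = { false, false };
    struct waiter b = { false, false };
    struct waiter p1 = { false, false };
    size_t stuck = 0;

    waitq_add_tail(&q, &a);
    waitq_add_tail(&q, &b);

    waitq_wake_all(&q, dequeue_on_wake); /* 1: waker runs, drops the lock */
    arrive(&q, &p1);                     /* 2: p1 arrives before a and b run */

    /* 3: a and b finally run and remove themselves if still queued. */
    if (a.queued)
        waitq_remove(&q, &a);
    if (b.queued)
        waitq_remove(&q, &b);

    for (size_t i = 0; i < q.count; i++)
        if (!q.slots[i]->woken)
            stuck++;
    return stuck;
}
```

In this model, run_scenario(false) leaves p1 on the queue with no wake-up pending, while run_scenario(true) lets p1 see an empty queue and proceed, which is the behavior I would expect from removing the items at wake-up time.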