A special case in write_flush which cause the umount busy

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

I have a test case on mount/umount on a partition from nfs server side. And encounter a problem of umount busy in a low probability.

The Linux version is 3.10.64 with the patch "sunrpc/cache: make cache flushing more reliable".
https://patchwork.kernel.org/patch/7410021/

After some analysis and test in many times, I find that when it failed to mount, the time "then" and "now" are different, which caused the last_refresh is far beyond the flush_time. So this cache is not expired and won't be clean at once. 
Then the ref in cache_head won't be released, and mntput_no_expire didn't be called to decrease the count. That caused the umount busy.

Below are logs in my test.

kernel: [  292.767801] write_flush 1480 then = 249, now = 250
kernel: [  292.767817] cache_clean 451, cd name nfsd.fh expiry_time 790485253, cd flush_time 249, last_refresh 369, seconds_since_boot 250
kernel: [  292.767913] write_flush 1480 then = 249, now = 250
kernel: [  292.767928] cache_clean 451, cd name nfsd.export expiry_time 2049, cd flush_time 249, last_refresh 369, seconds_since_boot 250
kernel: [  292.773229] do_refcount_check 283 mycount 4
kernel: [  292.773245] do_umount 1344 retval -16

I think this happens in such case that the exportfs writes the flush with current time, the time of "then". But when seconds_since_boot being called in function write_flush, the time is on the next second, so the "now" is one second after "then".
Because "then" is less than "now", the flush_time is set directly to original "then", rather than "cd->flush_time + 1".


And I want to change the condition as below. I'm not sure it's OK and has no effects to other part.

--------------------------------------------------------------------------
       then = get_expiry(&bp);
       now = seconds_since_boot();
       cd->nextcheck = now;
       /* Can only set flush_time to 1 second beyond "now", or
       * possibly 1 second beyond flushtime.  This is because
       * flush_time never goes backwards so it mustn't get too far
       * ahead of time.
       */
       if (then != now)
              printk("%s %d then = %d, now = %d\n", __func__, __LINE__, then, now);
-      if (then >= now) {
+     if (then >= now - 1) {
              /* Want to flush everything, so behave like cache_purge() */
              if (cd->flush_time >= now)
                     now = cd->flush_time + 1;
              then = now;
       }

       cd->flush_time = then;
--------------------------------------------------------------------------

Best Regards,
Alan Wang

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux