The return of the hanging "ls"...

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Trond,
 I just noticed commit acdc53b2146c7ee67feb1f02f7bc3020126514b8 from 2010
reverts the effect commit 28c494c5c8d425e15b7b82571e4df6d6bc34594d from Chunk
in 2007.

Specifically it removes mutex lock/unlock in nfs_getattr.
Chuck added them:

-       /* Flush out writes to the server in order to update c/mtime */
-       if (S_ISREG(inode->i_mode))
+       /*
+        * Flush out writes to the server in order to update c/mtime.
+        *
+        * Hold the i_mutex to suspend application writes temporarily;
+        * this prevents long-running writing applications from blocking
+        * nfs_wb_nocommit.
+        */
+       if (S_ISREG(inode->i_mode)) {
+               mutex_lock(&inode->i_mutex);
                nfs_wb_nocommit(inode);
+               mutex_unlock(&inode->i_mutex);
+       }


You removed them.

-       /*
-        * Flush out writes to the server in order to update c/mtime.
-        *
-        * Hold the i_mutex to suspend application writes temporarily;
-        * this prevents long-running writing applications from blocking
-        * nfs_wb_nocommit.
-        */
+       /* Flush out writes to the server in order to update c/mtime.  */
        if (S_ISREG(inode->i_mode)) {
-               mutex_lock(&inode->i_mutex);
-               nfs_wb_nocommit(inode);
-               mutex_unlock(&inode->i_mutex);
+               err = filemap_write_and_wait(inode->i_mapping);
+               if (err)
+                       goto out;
        }
 
        /*


Do you recall why?

I noticed because a customer reported exactly the same symptoms the were
fixed by Chucks patch some years ago.


The comment on your patch says (in part):

    Also replace nfs_wb_nocommit() with a
    call to filemap_write_and_wait(), which doesn't need to hold the
    inode->i_mutex.

It is certainly true that filemap_write_and_wait doesn't need to hold the
mutex, but neither did nfs_wb_nocommit.  The mutex is held to stop "suspend
application writes temporarily" so no more pages get dirtied until all
the current dirty pages have been written out.  i.e. to stop
generic_file_aio_write() from proceeding.

The particular test that shows the problem is a large write like
  dd if=/dev/zero of=/mnt/nfs/somefile count=2000000
then in another window
  ls -l /mnt/nfs

the "ls -l" will hang until the "dd" completes.

Can we put the mutex lock/unlock back please?

Thanks,
NeilBrown

Attachment: signature.asc
Description: PGP signature


[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux