Re: [PATCH 2/9] vfs: export do_splice_direct() to modules

Al Viro <viro@xxxxxxxxxxxxxxxxxx> · Sat, 23 Mar 2013 04:41:40 +0000

On Sat, Mar 23, 2013 at 11:49:11AM +0900, J. R. Okajima wrote:
> 
> Al Viro:
> > The scenario, BTW, looks so:
> > process A does sb_start_write() (on your upper layer)
> > process B tries to freeze said upper layer and blocks, waiting for A to finish
> > process C grabs ->i_mutex in your upper layer
> > process C does vfs_write(), which blocks, since there's a pending attempt to
> > freeze
> > process A tries to grab ->i_mutex held by C and blocks
> 
> According to latest mm/filemap.c:generic_file_aio_write(),
> 	sb_start_write(inode->i_sb);
> 	mutex_lock(&inode->i_mutex);
> 	ret = __generic_file_aio_write(iocb, iov, nr_segs, &iocb->ki_pos);
> 	mutex_unlock(&inode->i_mutex);
> 	:::
> 	sb_end_write(inode->i_sb);
> 
> Process C would block *BEFORE* i_mutex by sb_start_write()? No?

Different ->i_mutex; you are holding one on the parent directory already.

That's the problem - you have ->i_mutex nested both inside that sucker (as
it ought to) and outside.  Which tends to do bad things, obviously, in
particular because something like mkdir(2) will do sb_start_write() (from
mnt_want_write(), called by kern_path_create()) before grabbing directory
->i_mutex.

Thus the activity with lifting the bastard out of ->aio_write(), etc. in
vfs.git#experimental - *any* union-like variant will need the ability to
pull sb_start_write() outside of locking the parent directory on copyup.
And yes, it's a common prerequisite to anything doing copyups - aufs is
in the same boat as overlayfs and unionmount.  Same deadlock for all three
of them.
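
For illustration only, here is a minimal userspace sketch of the ordering problem described above. It is not kernel code, and none of it comes from vfs.git. The lock classes, struct lock_graph, record_nesting() and has_inversion() are hypothetical names invented for this example. The model records which lock class gets taken while another is held, then looks for a cycle: mkdir(2) takes sb_start_write() and then the directory ->i_mutex, while a copyup that calls vfs_write() under the parent's ->i_mutex takes them in the opposite order.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock classes, named for this illustration only. */
enum lock_class { SB_WRITE, DIR_I_MUTEX, FILE_I_MUTEX, NR_CLASSES };

/* order[a][b] is true when some code path takes b while holding a. */
struct lock_graph {
	bool order[NR_CLASSES][NR_CLASSES];
};

static void graph_init(struct lock_graph *g)
{
	memset(g, 0, sizeof(*g));
}

/* Record that "inner" is acquired while "outer" is already held. */
static void record_nesting(struct lock_graph *g, enum lock_class outer,
			   enum lock_class inner)
{
	assert(outer < NR_CLASSES && inner < NR_CLASSES);
	g->order[outer][inner] = true;
}

/* Depth-first search: can "to" be reached from "from" via recorded nestings? */
static bool reaches(const struct lock_graph *g, int from, int to, bool *seen)
{
	if (g->order[from][to])
		return true;
	seen[from] = true;
	for (int mid = 0; mid < NR_CLASSES; mid++)
		if (g->order[from][mid] && !seen[mid] &&
		    reaches(g, mid, to, seen))
			return true;
	return false;
}

/* True if any lock class can end up nested inside itself (a cycle). */
static bool has_inversion(const struct lock_graph *g)
{
	for (int c = 0; c < NR_CLASSES; c++) {
		bool seen[NR_CLASSES] = { false };

		if (reaches(g, c, c, seen))
			return true;
	}
	return false;
}
```

Feeding it record_nesting(&g, SB_WRITE, DIR_I_MUTEX) for the mkdir(2) path and record_nesting(&g, DIR_I_MUTEX, SB_WRITE) for a copyup that writes under the parent's ->i_mutex makes has_inversion() return true. If the copyup instead takes write access before locking the parent, it records only SB_WRITE -> DIR_I_MUTEX, the same order mkdir(2) uses, and no cycle is found.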