Re: [RFC][PATH 4/4] ovl: relax lock_rename when moving files between work and upper dir

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Nov 11, 2016 at 7:44 PM, Miklos Szeredi <miklos@xxxxxxxxxx> wrote:
> On Fri, Nov 11, 2016 at 6:27 PM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
>> On Fri, Nov 11, 2016 at 06:17:07PM +0200, Amir Goldstein wrote:
>>
>>> I'm afraid so.
>>> It seems to me that most of the time, lock_rename() is being used in
>>> overlayfs for no better alternative to lock 2 directories (work+upper).
>>>
>>> My suggestion is a small modification to the overlayfs locking scheme.
>>> ---Instead of this:
>>> assert(lock_rename(workdir, upperdir) != NULL));
>>>    copy_up(src, tmp);
>>>    vfs_rename(tmp, dst);
>>> unlock_rename(workdir, upperdir);
>>>
>>> +++Use this:
>>> assert(lock_rename(workdir, upperdir) != NULL));
>>> mutex_unlock(s_vfs_rename_mutex);
>>>    copy_up(src, tmp);
>>> inode_unlock(upperdir);
>>> inode_unlock(workdir);
>>> assert(lock_rename(workdir, upperdir) != NULL));
>>>   vfs_rename(tmp, dst);
>>> unlock_rename(workdir, upperdir);
>>>
>>> Miklos,
>>>
>>> Do you see any problem with the proposed scheme?
>>> Anything that can go wrong while releasing the workdir lock before vfs_rename()?
>>
>> Huh???
>>
>> ->rename() definitely counts upon parents being locked; please, read the
>> damn Documentation/filesystems/locking, it's there for a reason.
>>
>> The real question is why the fsck do you need to lock the workdir for the
>> duration of copying at all.  O_TMPFILE open + writes there doesn't need
>> lock_rename() *or* parents being locked.  You need the parent locked when
>> you link the sucker in place, but that's it.  IDGI...
>
> There's that.  The other thing that lock_rename does is prevent
> multiple copy ups on the same file.   Arguably it's an overkill, but a
> replacement needs to be added.
>

Miklos,

FYI, I started to work on copy up using O_TMPFILE with i_mutex
held only for upper dir.

> There's that.  The other thing that lock_rename does is prevent
> multiple copy ups on the same file.   Arguably it's an overkill, but a
> replacement needs to be added.
>

I think you suggested here to use a waitqueue for pending copyups?
Did you mean a waitqueue per overlay parent directory inode or
did you mean something else?

> Fact is, nobody ever reported this blatant performance bottleneck.
> Probably because copying up gigabyte files is not the usual use case
> for union filesystems...

Still, it's a loophole for some serious DoS. An unprivileged user inside
container can touch a file and block all copy up on all other containers
on that host and all renames on that host fs as well!

Amir.
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux