Re: Union mount and lockdep design issues

On 11 July 2011 19:23, Ian Kent <ikent@xxxxxxxxxx> wrote:
> On Mon, 2011-07-11 at 18:17 +0200, Michal Suchanek wrote:
>> On 11 July 2011 15:50, Ian Kent <ikent@xxxxxxxxxx> wrote:
>> > On Mon, 2011-07-11 at 15:36 +0200, Michal Suchanek wrote:
>> >> On 11 July 2011 14:00, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>> >> > On Mon, 2011-07-11 at 12:01 +0100, David Howells wrote:
>> >> >> Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>> >> >>
>> >>
>> >> >> > Also, why would you want to have a class per sb-instance? From last
>> >> >> > talking to David, he said there could only ever be 2 filesystems
>> >> >> > involved in this, the top and bottom, and it is determined on (union)
>> >> >> > mount time which is which.
>> >> >>
>> >> >> There can be more than 2 - one upperfs (the actual union) and many lowerfs -
>> >> >> though I think only one lowerfs is accessed at a time.
>> >> >
>> >> > Right, however I understood from our earlier discussion that the vfs
>> >> > would only ever try to lock 2 filesystems at a time, the top and one
>> >> > lower.
>> >>
>> >> This is true from a local point of view. However, it is technically
>> >> possible to use overlayfs as the upper layer of another overlayfs,
>> >> which allows layering multiple readonly "branches" into a single
>> >> overlay. Since the vfs will lock the "union" and one (or possibly
>> >> both) of its branches, and one of the branches may itself be a union,
>> >> you can get arbitrary depth (which is currently limited by a constant
>> >> in the code to cut recursion depth and stack usage).
>> >
>> > Off topic but can you elaborate on that?
>> >
>> > Are you saying the "unioned stack" can consist of more than two file
>> > systems and can have more than two layers and possibly a mix of multiple
>> > read-only and read-write file systems?
>> >
>>
>> This is how requirements are described in documentation:
>>
>> > The lower filesystem can be any filesystem supported by Linux and does
>> > not need to be writable.  The lower filesystem can even be another
>> > overlayfs.  The upper filesystem will normally be writable and if it
>> > is it must support the creation of trusted.* extended attributes, and
>> > must provide valid d_type in readdir responses, at least for symbolic
>> > links - so NFS is not suitable.
>>
>> Nowhere does it say that the lower filesystem is required to be
>> readonly, only that it should not be modified.
>>
>>
>> This is what the documentation gives as example:
>>
>> > mount -t overlayfs overlayfs -olowerdir=/lower,upperdir=/upper /overlay
>>
>> This is how it can be expanded:
>>
>> mount -t overlayfs overlayfs -olowerdir=/lower2,upperdir=/upper /tmpoverlay
>> mount -t overlayfs overlayfs -olowerdir=/lower1,upperdir=/tmpoverlay /overlay
>
> OK, I'll have to think about what this means but I suspect that it is
> broken. I'll have a look at the overlayfs code and see if there is a
> globally enforced ordering of stacked file systems. If there is none
> then I believe overlayfs is probably open to AB <-> BA deadlock due to
> the possibility of locking two file systems in one overlayfs stack in
> one order and the same two file systems in the opposite order in
> another.

I think this is fine as long as the same layer does not appear in two
different unions.

The locking order is likely determined by the structure of the union
and not by some system-wide order of filesystems. So, assuming the
readonly layers are locked as well, you will probably get a deadlock
with these technically correct mounts:

mount -t overlayfs overlayfs -olowerdir=/lower2,upperdir=/upper /tmpoverlay
mount -t overlayfs overlayfs -olowerdir=/lower1,upperdir=/tmpoverlay /overlay

mount -t overlayfs overlayfs -olowerdir=/lower1,upperdir=/upper2 /tmpoverlay2
mount -t overlayfs overlayfs -olowerdir=/lower2,upperdir=/tmpoverlay2 /overlay2

because now lower1 and lower2 are differently ordered in the two overlays.
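
To make the inversion concrete, here is a minimal sketch in Python. It is
purely illustrative and is not overlayfs code. It assumes, as guessed above,
that locks are taken top to bottom following the structure of each union;
the function names are made up for this example, and it only reports
pairwise (AB <-> BA) inversions, not longer cycles.

```python
from itertools import combinations


def lock_order_edges(stack):
    """Return (earlier, later) pairs for a stack listed top to bottom.

    Assumption: a layer higher in the stack is locked before a lower one.
    """
    return set(combinations(stack, 2))


def find_inversions(stacks):
    """Return sorted layer pairs that different stacks lock in opposite orders."""
    edges = set()
    for stack in stacks:
        edges |= lock_order_edges(stack)
    return sorted((a, b) for (a, b) in edges if a < b and (b, a) in edges)


# The two stacks from the mount commands above, flattened top to bottom.
overlay = ["/upper", "/lower2", "/lower1"]
overlay2 = ["/upper2", "/lower1", "/lower2"]

print(find_inversions([overlay, overlay2]))  # [('/lower1', '/lower2')]
```

With only the first stack mounted, the same check reports no inversion,
which matches the point that a single union is fine on its own.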

A system-wide locking order and some optimizations are reasonably
possible only when the mount is actually aware that it has multiple
branches, as in:

mount -t overlayfs overlayfs -olowerdirs=/lower1:/lower2,upperdir=/upper3 /not-possible-overlay

Note also that there is no guarantee that /lower1 and /lower2 are in
any way distinct or don't have intermingled hardlinks or symlinks.
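
As a rough user-space analogy of what a system-wide order buys you, the
sketch below takes the locks of all branches in one fixed global order and
drops duplicates first, since two branches may turn out to be the same
object. This is Python with threading.Lock standing in for per-filesystem
locks. The helper name lock_branches and the choice of id() as the global
sort key are assumptions made for illustration, not anything overlayfs does.

```python
import threading
from contextlib import ExitStack


def lock_branches(locks):
    """Acquire each distinct lock once, in a single global order.

    Sorting by id() stands in for whatever system-wide key would be used.
    Returns an ExitStack that releases the locks when closed.
    """
    ordered = sorted({id(lock): lock for lock in locks}.values(), key=id)
    with ExitStack() as stack:
        for lock in ordered:
            stack.enter_context(lock)
        # Hand ownership of the acquired locks to the caller.
        return stack.pop_all()


lower1 = threading.Lock()
lower2 = threading.Lock()

# The same branch is listed twice on purpose; without deduplication a
# non-reentrant lock would block on itself.
with lock_branches([lower2, lower1, lower2]):
    held = lower1.locked() and lower2.locked()

released = not lower1.locked() and not lower2.locked()
```

Because every caller sorts by the same key, two callers that name the
branches in different orders still acquire them in the same order.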


Thanks

Michal