On 11 July 2011 19:23, Ian Kent <ikent@xxxxxxxxxx> wrote:
> On Mon, 2011-07-11 at 18:17 +0200, Michal Suchanek wrote:
>> On 11 July 2011 15:50, Ian Kent <ikent@xxxxxxxxxx> wrote:
>> > On Mon, 2011-07-11 at 15:36 +0200, Michal Suchanek wrote:
>> >> On 11 July 2011 14:00, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>> >> > On Mon, 2011-07-11 at 12:01 +0100, David Howells wrote:
>> >> >> Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>> >> >>
>> >> >> > Also, why would you want to have a class per sb-instance? From last
>> >> >> > talking to David, he said there could only ever be 2 filesystems
>> >> >> > involved in this, the top and bottom, and it is determined on (union)
>> >> >> > mount time which is which.
>> >> >>
>> >> >> There can be more than 2 - one upperfs (the actual union) and many lowerfs -
>> >> >> though I think only one lowerfs is accessed at a time.
>> >> >
>> >> > Right, however I understood from our earlier discussion that the vfs
>> >> > would only ever try to lock 2 filesystems at a time, the top and one
>> >> > lower.
>> >>
>> >> This is true from local point of view. However, it is technically
>> >> possible to use overlayfs as the upper layer of another overlayfs
>> >> which allows layering multiple readonly "branches" into a single
>> >> overlay. Since the vfs will lock the "union" and one (or possibly
>> >> both) of its branches and one of the branches may be itself an union
>> >> you can get arbitrary depth (which is currently limited by a constant
>> >> in the code to cut recursion depth and stack usage).
>> >
>> > Off topic but can you elaborate on that?
>> >
>> > Are you saying the "unioned stack" can consist of more than two file
>> > systems and can have more than two layers and possibly a mix of multiple
>> > read-only and read-write file systems?
>> >
>>
>> This is how requirements are described in documentation:
>>
>> > The lower filesystem can be any filesystem supported by Linux and does
>> > not need to be writable.
>> > The lower filesystem can even be another
>> > overlayfs. The upper filesystem will normally be writable and if it
>> > is it must support the creation of trusted.* extended attributes, and
>> > must provide valid d_type in readdir responses, at least for symbolic
>> > links - so NFS is not suitable.
>>
>> In no place it says that the lower filesystem is required to be
>> readonly, only that it should not be modified.
>>
>>
>> This is what the documentation gives as example:
>>
>> > mount -t overlayfs overlayfs -olowerdir=/lower,upperdir=/upper /overlay
>>
>> This is how it can be expanded:
>>
>> mount -t overlayfs overlayfs -olowerdir=/lower2,upperdir=/upper /tmpoverlay
>> mount -t overlayfs overlayfs -olowerdir=/lower1,upperdir=/tmpoverlay /overlay
>
> OK, I'll have to think about what this means but I suspect that it is
> broken. I'll have a look at the overlayfs code and see if there are
> globally enforced ordering of stacked file systems. If there is none
> then I believe overlayfs is probably open to AB <-> BA deadlock due to
> the possibility of locking two file systems in one overlayfs stack in
> one order and the same two file systems in the opposite order in
> another.

I think this is fine as long as the same layer does not appear in two
different unions.

The locking order is likely determined by the structure of the union
and not some system-wide order of filesystems so assuming the readonly
layers are locked as well you will probably get a deadlock with
technically correct mount:

mount -t overlayfs overlayfs -olowerdir=/lower2,upperdir=/upper /tmpoverlay
mount -t overlayfs overlayfs -olowerdir=/lower1,upperdir=/tmpoverlay /overlay

mount -t overlayfs overlayfs -olowerdir=/lower1,upperdir=/upper2 /tmpoverlay2
mount -t overlayfs overlayfs -olowerdir=/lower2,upperdir=/tmpoverlay2 /overlay2

because now lower1 and lower2 are differently ordered in the two overlays.
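
To make the ordering problem concrete, here is a rough Python sketch
(not kernel code). It assumes, purely for illustration, that locks are
taken top-down: the union first, then its upper branch, then its lower
branch. The MOUNTS table and the lock_order() and inversions() helpers
are made-up names for this example; the real overlayfs locking may well
differ.

```python
from itertools import combinations

# Hypothetical model of the four mounts above:
# mount point -> (upperdir, lowerdir)
MOUNTS = {
    "/tmpoverlay": ("/upper", "/lower2"),
    "/overlay": ("/tmpoverlay", "/lower1"),
    "/tmpoverlay2": ("/upper2", "/lower1"),
    "/overlay2": ("/tmpoverlay2", "/lower2"),
}


def lock_order(mount, mounts):
    """Flatten a stack into the ASSUMED lock order: union, upper, lower."""
    if mount not in mounts:
        return [mount]  # a plain filesystem, not an overlay
    upper, lower = mounts[mount]
    return [mount] + lock_order(upper, mounts) + lock_order(lower, mounts)


def inversions(order_a, order_b):
    """Return pairs of layers that the two stacks lock in opposite orders."""
    pos_b = {fs: i for i, fs in enumerate(order_b)}
    found = []
    for x, y in combinations(order_a, 2):  # x is locked before y in order_a
        if x in pos_b and y in pos_b and pos_b[y] < pos_b[x]:
            found.append((x, y))
    return found


first = lock_order("/overlay", MOUNTS)
second = lock_order("/overlay2", MOUNTS)
print(inversions(first, second))
```

Under that assumption the only layers shared by the two stacks are
/lower1 and /lower2, and they come out in opposite orders, which is the
AB <-> BA pattern mentioned above.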

System-wide locking order and some optimizations are reasonably
possible only when the mount is actually aware that it has multiple
branches like

mount -t overlayfs overlayfs -olowerdirs=/lower1:/lower2,upperdir=/upper3 /not-possible-overlay

Note also that there is no guarantee that /lower1 and /lower2 are in
any way distinct or don't have intermingled hardlinks or symlinks.

Thanks

Michal
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html