On 02.03.2014 08:10, Ian Kent wrote:
On Sun, 2014-03-02 at 10:22 +0800, Ian Kent wrote:
On Fri, 2014-02-28 at 08:29 -0500, Alexander Viro wrote:
On Fri, Feb 28, 2014 at 01:12:58PM +0100, Donald Buczek wrote:

Obviously, "cleared mounted on dentry" is missing. It looks like we enter put_mountpoint() but don't get to

    dentry->d_flags &= ~DCACHE_MOUNTED;

mp->m_count is not zero probably. What does it mean? The mount is still locked but not in the mount hash?

No, it means that something else is mounted on the same dentry (in another part of mount tree, obviously). If you mount the same fs on two different mountpoints, e.g.

    mount /dev/sda1 /mnt
    mount /dev/sda1 /tmp/foo

you will have the same dentries seen in two places. Now,

    mount /dev/sdb11 /mnt/a
    mount /dev/sdc5 /tmp/foo/a

and you've got two different filesystems mounted on two different places (/mnt/a and /tmp/foo/a). These two places have different vfsmounts, but the same dentry. struct mountpoint is associated with dentry, so it's also the same for both. And it serves as a mountpoint for two vfsmounts - one for fs from sdb11, another for fs from sdc5.

Now umount /mnt/a; one of those two vfsmounts is gone now. struct mountpoint survives, of course, and dentry is *still* a mountpoint. sdc5 is still mounted on /tmp/foo/a, after all...

Good example but for autofs file systems doesn't this amount to saying it's been bound somewhere else? Illegal as far as autofs is concerned because an autofs mount is strictly associated with a path defined by its map.

And, yes, bind mounting an autofs file system elsewhere isn't vetoed by the kernel. This makes me start thinking about implications wrt. containers ....

Ahh, right ... I'll need to think about my use (misuse) of d_mountpoint().

So maybe I don't need to worry about this just yet.
I think you should, because exactly this is the bug.

d_mountpoint(dentry) only says that a struct mountpoint exists for the dentry. It does not say that the path is mounted in the current namespace. The struct mountpoint might exist because the path is mounted in other namespaces but not in ours.
The problem at our site is clear now: we have only one service with PrivateTmp=yes, which is colord.service. And here is the missing mount:
root:kasslerbraten:/lib/systemd/system/# ps -Af|fgrep colord
root      7670     1  0 Feb28 ?        00:00:00 /usr/lib/colord/colord
root      7897  7329  0 14:46 pts/8    00:00:00 fgrep colord
root:kasslerbraten:/lib/systemd/system/# cat /proc/7670/mounts|grep mariux32
pille:/amd/pille/1/project/mariux32 /project/mariux32 nfs rw,nosuid,relatime,vers=3,rsize=524288,wsize=524288,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=141.14.28.250,mountvers=3,mountport=56263,mountproto=udp,local_lock=none,addr=141.14.28.250 0 0
colord.service is dbus-started. So it is started quite randomly, depending on user usage patterns, mostly but not exclusively on workstations. That is exactly how we have seen the bug appear.
When the service is started, systemd uses unshare(CLONE_NEWNS) to clone the namespace. This new namespace inherits the existing mounts, including automounted ones. These mounts may expire at a later time. When this occurs, they are unmounted from the automount daemon's namespace, which is the global, pid 1 namespace. But because they are still mounted in another namespace, the dentry stays flagged as DCACHE_MOUNTED, which prevents autofs from remounting it on access. The mount, however, exists only in the other namespace and is useless for anybody else.
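The sequence described above can be replayed with a small user-space model. This is only a sketch, not kernel code: the toy_ names are hypothetical, and the reference count is folded into the dentry for brevity, whereas the kernel keeps it in a separate struct mountpoint (m_count). It assumes that the flag is set on the first mount and cleared only when the last mount on the dentry goes away, as explained in the quoted text.

```c
/* Toy user-space model of the DCACHE_MOUNTED bookkeeping.
 * NOT kernel code: all toy_ names are hypothetical simplifications. */
#include <assert.h>
#include <stdbool.h>

#define TOY_DCACHE_MOUNTED 0x1u

struct toy_dentry {
    unsigned int d_flags;
    int m_count; /* stands in for struct mountpoint's m_count */
};

/* A mount is attached on this dentry, in whatever namespace. */
void toy_get_mountpoint(struct toy_dentry *d)
{
    d->m_count++;
    d->d_flags |= TOY_DCACHE_MOUNTED;
}

/* One mount goes away; the flag is cleared only with the last one. */
void toy_put_mountpoint(struct toy_dentry *d)
{
    assert(d->m_count > 0);
    if (--d->m_count == 0)
        d->d_flags &= ~TOY_DCACHE_MOUNTED;
}

bool toy_d_mountpoint(const struct toy_dentry *d)
{
    return (d->d_flags & TOY_DCACHE_MOUNTED) != 0;
}

/* Replays: automount, namespace clone, expiry in the global
 * namespace, then exit of the process holding the private namespace. */
void toy_replay_expiry(bool *after_expire, bool *after_exit)
{
    struct toy_dentry d = { 0, 0 };

    toy_get_mountpoint(&d);   /* automounted in the global namespace */
    toy_get_mountpoint(&d);   /* copy inherited by the cloned namespace */

    toy_put_mountpoint(&d);   /* expires in the global namespace */
    *after_expire = toy_d_mountpoint(&d);

    toy_put_mountpoint(&d);   /* private namespace goes away */
    *after_exit = toy_d_mountpoint(&d);
}
```

In this model the dentry is still flagged after the expiry in the global namespace and becomes unflagged only once the second reference is dropped, which matches the behaviour seen after killing the colord process.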
Final proof that this is the true story:
root:kasslerbraten:/lib/systemd/system/# ls /project/mariux32
ls: cannot open directory /project/mariux32: Too many levels of symbolic links
root:kasslerbraten:/lib/systemd/system/# kill -9 7670
root:kasslerbraten:/lib/systemd/system/# ls /project/mariux32
beeroot home i686 svnroot
root:kasslerbraten:/lib/systemd/system/#
Of course, I can easily work around that in our environment (e.g. just remove PrivateTmp=yes from the service). So I'm pretty sure it will work for me now. The bug, however, is in autofs. systemd is doing perfectly legal user-mode things.
Perhaps autofs should use lookup_mnt() to decide along this pattern:

    if ( dentry->d_flags & DCACHE_MOUNTED && lookup_mnt(path) ) {
        /* mounted */
    } else {
        /* not mounted */
    }

That doesn't solve the problem, however, that mounts cloned by an unshare(CLONE_NEWNS) would never expire. Also, there is another bug somewhere, because I see that the mount visible to the /usr/lib/colord/colord process was logged as "unmounted" on the NFS server when it expired in the global namespace. So I doubt it would be working even for that process. So possibly automounted mounts shouldn't be cloned at all? Together with chroot or pivot_root the semantics would be more than unclear anyway. Your problem now :-)
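To make the intent of the suggested check concrete, here is a user-space sketch of it. It is not kernel code and not a proposed patch: the toy_ types and functions are hypothetical, and the namespace is passed explicitly as a simplification, whereas the snippet above calls lookup_mnt() on a path. The point is only that the dentry flag is global, while the lookup answers the question for one namespace.

```c
/* Toy user-space model of "flag set AND mounted in this namespace".
 * NOT kernel code: all toy_ names are hypothetical simplifications. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_DCACHE_MOUNTED 0x1u
#define TOY_MAX_MOUNTS 8

struct toy_dentry {
    unsigned int d_flags;
};

/* Each namespace records the dentries it has something mounted on. */
struct toy_namespace {
    const struct toy_dentry *mounted_on[TOY_MAX_MOUNTS];
    size_t nr_mounts;
};

/* Stand-in for lookup_mnt(): is anything mounted on d in ns? */
bool toy_lookup_mnt(const struct toy_namespace *ns,
                    const struct toy_dentry *d)
{
    assert(ns->nr_mounts <= TOY_MAX_MOUNTS);
    for (size_t i = 0; i < ns->nr_mounts; i++)
        if (ns->mounted_on[i] == d)
            return true;
    return false;
}

/* The cheap flag test comes first; the per-namespace lookup runs
 * only when the flag says some namespace has a mount here. */
bool toy_mounted_here(const struct toy_namespace *ns,
                      const struct toy_dentry *d)
{
    return (d->d_flags & TOY_DCACHE_MOUNTED) && toy_lookup_mnt(ns, d);
}
```

With a flagged dentry that is recorded only in a private namespace, toy_mounted_here() is false for the global namespace and true for the private one, so the global side would be free to trigger a new automount. As noted above, this does nothing about the cloned mount never expiring.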
Thanks for your help with this!

Regards

  Donald

--
Donald Buczek
buczek@xxxxxxxxxxxxx
Tel: +49 30 8413 1433