Re: "Too many levels of symbolic links"

Donald Buczek <buczek@xxxxxxxxxxxxx> · Sun, 02 Mar 2014 15:55:56 +0100

On 02.03.2014 08:10, Ian Kent wrote:
On Sun, 2014-03-02 at 10:22 +0800, Ian Kent wrote:
On Fri, 2014-02-28 at 08:29 -0500, Alexander Viro wrote:
On Fri, Feb 28, 2014 at 01:12:58PM +0100, Donald Buczek wrote:

Obviously, "cleared mounted on dentry" is missing.

It looks like we enter put_mountpoint() but don't get to
dentry->d_flags &= ~DCACHE_MOUNTED;

mp->m_count is not zero probably.

What does it mean? The mount is still locked but not in the mount hash?
No, it means that something else is mounted on the same dentry (in another
part of mount tree, obviously).

If you mount the same fs on two different mountpoints, e.g.
mount /dev/sda1 /mnt
mount /dev/sda1 /tmp/foo
you will have the same dentries seen in two places.  Now,
mount /dev/sdb11 /mnt/a
mount /dev/sdc5 /tmp/foo/a

and you've got two different filesystems mounted on two different places
(/mnt/a and /tmp/foo/a).  These two places have different vfsmounts,
but the same dentry.  struct mountpoint is associated with dentry, so
it's also the same for both.  And it serves as a mountpoint for two
vfsmounts - one for fs from sdb11, another for fs from sdc5.

Now umount /mnt/a; one of those two vfsmounts is gone now.  struct mountpoint
survives, of course, and dentry is *still* a mountpoint.  sdc5 is still
mounted on /tmp/foo/a, after all...
Good example, but for autofs file systems doesn't this amount to saying
it's been bound somewhere else?

Illegal as far as autofs is concerned because an autofs mount is
strictly associated with a path defined by its map.

And, yes, bind mounting an autofs file system elsewhere isn't vetoed by
the kernel.

This makes me start thinking about implications wrt. containers ....

Ahh, right ... I'll need to think about my use (misuse) of
d_mountpoint().
So maybe I don't need to worry about this just yet.

I think you should, because exactly this is the bug.
d_mountpoint(dentry) only says that a struct mountpoint exists for the
dentry. It does not say that the path is mounted in the current
namespace. The struct mountpoint might exist because the path is
mounted in other namespaces but not in ours.

The problem at our site is clear now:

We have only one service with PrivateTmp=yes which is colord.service. 
And here is the missing mount:

root:kasslerbraten:/lib/systemd/system/# ps -Af|fgrep colord
root      7670     1  0 Feb28 ?        00:00:00 /usr/lib/colord/colord
root      7897  7329  0 14:46 pts/8    00:00:00 fgrep colord
root:kasslerbraten:/lib/systemd/system/# cat /proc/7670/mounts|grep mariux32
pille:/amd/pille/1/project/mariux32 /project/mariux32 nfs rw,nosuid,relatime,vers=3,rsize=524288,wsize=524288,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=141.14.28.250,mountvers=3,mountport=56263,mountproto=udp,local_lock=none,addr=141.14.28.250 0 0
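
The same check (is a path listed in a given process's mount table?) can be done programmatically. The following is only a minimal sketch: the function name mount_listed() is made up for illustration, and it assumes the usual whitespace-separated layout of /proc/<pid>/mounts with the mount point in the second field. It does not decode escaped characters such as \040 in paths.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: return 1 if `mountpoint` is the second field of
 * some line in the mount table file `mounts_path` (for example
 * "/proc/7670/mounts"), 0 if it is not listed, and -1 if the file
 * cannot be opened.
 */
int mount_listed(const char *mounts_path, const char *mountpoint)
{
    char line[4096];
    char dev[1024], dir[1024];
    int found = 0;
    FILE *f;

    assert(mounts_path != NULL && mountpoint != NULL);

    f = fopen(mounts_path, "r");
    if (!f)
        return -1;

    while (fgets(line, sizeof line, f)) {
        /* field 1: device or server:/export, field 2: mount point */
        if (sscanf(line, "%1023s %1023s", dev, dir) == 2 &&
            strcmp(dir, mountpoint) == 0) {
            found = 1;
            break;
        }
    }
    fclose(f);
    return found;
}
```

Calling it once with "/proc/1/mounts" and once with "/proc/7670/mounts" for "/project/mariux32" would show the difference between the global namespace and the service's private one, given sufficient permission to read the other process's entry.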

colord.service is dbus-started. So it is started quite randomly,
depending on user usage patterns, mostly but not exclusively on
workstations. That is exactly how we've seen the bug appear.

When the service is started, systemd uses unshare(CLONE_NEWNS) to clone
the namespace. This new namespace inherits existing mounts, including
automounted ones.
These mounts might eventually expire at a later time. When this occurs,
they are unmounted from the automount daemon's namespace, which is the
global, pid 1 namespace. But because they are still mounted in another
namespace, the dentry stays flagged as DCACHE_MOUNTED, which prevents
autofs from remounting it on access. The mount, however, exists only in
another namespace and is useless for anybody else.
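
The sequence can be modelled in a few lines of userspace C. This is a toy model, not kernel code: all toy_* names are invented, the flag value is arbitrary, and a single counter stands in for the m_count of the struct mountpoint attached to the dentry. It tracks only one path per namespace.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_DCACHE_MOUNTED 0x1u   /* arbitrary value, not the kernel's */

/* One dentry, shared by every namespace that can see it. */
struct toy_dentry {
    unsigned int d_flags;
    int m_count;              /* stands in for struct mountpoint's m_count */
};

/* A namespace, reduced to "is our one path mounted here?". */
struct toy_ns {
    bool mounted;
};

/* Mounting in any namespace takes a reference and sets the flag. */
void toy_mount(struct toy_ns *ns, struct toy_dentry *d)
{
    assert(!ns->mounted);
    ns->mounted = true;
    d->m_count++;
    d->d_flags |= TOY_DCACHE_MOUNTED;
}

/* The flag is cleared only when the last reference goes away. */
void toy_umount(struct toy_ns *ns, struct toy_dentry *d)
{
    assert(ns->mounted && d->m_count > 0);
    ns->mounted = false;
    if (--d->m_count == 0)
        d->d_flags &= ~TOY_DCACHE_MOUNTED;
}

/* Model of unshare(CLONE_NEWNS): the child inherits the parent's mounts. */
struct toy_ns toy_clone_ns(const struct toy_ns *parent, struct toy_dentry *d)
{
    struct toy_ns child = { false };

    if (parent->mounted)
        toy_mount(&child, d);
    return child;
}

/* What a d_mountpoint()-style test sees: only the flag, no namespace. */
bool toy_d_mountpoint(const struct toy_dentry *d)
{
    return (d->d_flags & TOY_DCACHE_MOUNTED) != 0;
}
```

Mounting in the initial namespace, cloning it, and then expiring the mount in the initial namespace leaves toy_d_mountpoint() true although the initial namespace no longer has the mount. Only the unmount in the cloned namespace (here: killing the process) clears the flag.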

Final proof that this is the true story:

root:kasslerbraten:/lib/systemd/system/# ls /project/mariux32
ls: cannot open directory /project/mariux32: Too many levels of symbolic links
root:kasslerbraten:/lib/systemd/system/# kill -9 7670
root:kasslerbraten:/lib/systemd/system/# ls /project/mariux32
beeroot  home  i686  svnroot
root:kasslerbraten:/lib/systemd/system/#

Of course, I can easily work around that in our environment (e.g. just
remove PrivateTmp=yes from the service). So I'm pretty sure it will
work for me now.
The bug, however, is in autofs. systemd is doing perfectly legal
user-mode things.

Perhaps autofs should use lookup_mnt() to decide, along this pattern:

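/* Pattern only: lookup_mnt() takes a reference on the vfsmount it
 * returns, so real code would have to keep the result and mntput() it. */
if ((dentry->d_flags & DCACHE_MOUNTED) && lookup_mnt(path)) {
  /* mounted */
} else {
  /* not mounted */
}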
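
To make the intent of that pattern concrete, here is a toy userspace model. It is only a sketch under assumptions: the toy_* names are invented, the flag value is arbitrary, and an integer ns_id stands in for the per-namespace vfsmount that the real lookup_mnt() receives as part of its struct path argument.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_DCACHE_MOUNTED 0x1u   /* arbitrary value, not the kernel's */

/* One entry of a toy mount table: "something is mounted on dentry
 * dentry_id, as seen from namespace ns_id". */
struct toy_mount {
    int ns_id;
    int dentry_id;
};

/* Stand-in for lookup_mnt(): search for a mount on this dentry that
 * belongs to the given namespace. */
bool toy_lookup_mnt(const struct toy_mount *tab, size_t n,
                    int ns_id, int dentry_id)
{
    for (size_t i = 0; i < n; i++)
        if (tab[i].ns_id == ns_id && tab[i].dentry_id == dentry_id)
            return true;
    return false;
}

/* The proposed test: the flag is a cheap first filter, the lookup
 * decides whether the mount is actually visible in this namespace. */
bool toy_mounted_here(unsigned int d_flags,
                      const struct toy_mount *tab, size_t n,
                      int ns_id, int dentry_id)
{
    assert(tab != NULL || n == 0);
    return (d_flags & TOY_DCACHE_MOUNTED) &&
           toy_lookup_mnt(tab, n, ns_id, dentry_id);
}
```

With a table that holds the mount only for the service's namespace, the test reports "mounted" there and "not mounted" in the global namespace, even though the flag on the shared dentry is set in both cases.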

That doesn't solve the problem, however, that mounts cloned by an
unshare(CLONE_NEWNS) would never expire. Also, there is another bug
somewhere, because I see that the mount visible to the
/usr/lib/colord/colord process was logged as "unmounted" on the nfs
server when it expired in the global namespace. So I doubt it would be
working even for that process. So possibly automounted mounts shouldn't
be cloned at all? Together with chroot or pivot_root the semantics would
be more than unclear anyway. Your problem now :-)

Thanks for your help with this!

Regards
  Donald

--
Donald Buczek
buczek@xxxxxxxxxxxxx
Tel: +49 30 8413 1433
