Re: spurios failures in tests/encryption/crypt.t

On Tue, May 20, 2014 at 10:54 PM, Pranith Kumar Karampuri <pkarampu@xxxxxxxxxx> wrote:


----- Original Message -----
> From: "Anand Avati" <avati@xxxxxxxxxxx>
> To: "Pranith Kumar Karampuri" <pkarampu@xxxxxxxxxx>
> Cc: "Edward Shishkin" <edward@xxxxxxxxxx>, "Gluster Devel" <gluster-devel@xxxxxxxxxxx>
> Sent: Wednesday, May 21, 2014 10:53:54 AM
> Subject: Re: spurios failures in tests/encryption/crypt.t
>
> There are a few suspicious things going on here..
>
> On Tue, May 20, 2014 at 10:07 PM, Pranith Kumar Karampuri <
> pkarampu@xxxxxxxxxx> wrote:
>
> >
> > > > hi,
> > > >      crypt.t is failing regression builds once in a while, and most of
> > > > the time it is because of failures just after the remount in the
> > > > script.
> > > >
> > > > TEST rm -f $M0/testfile-symlink
> > > > TEST rm -f $M0/testfile-link
> > > >
> > > > Both of these are failing with ENOTCONN. I got a chance to look at
> > > > the logs. According to the brick logs, this is what I see:
> > > > [2014-05-17 05:43:43.363979] E [posix.c:2272:posix_open]
> > > > 0-patchy-posix: open on /d/backends/patchy1/testfile-symlink:
> > > > Transport endpoint is not connected
> >
>
> posix_open() happening on a symlink? This should NEVER happen. glusterfs
> itself should NEVER EVER be triggering symlink resolution on the server. In
> this case, for whatever reason an open() is attempted on a symlink, and it
> is getting followed back onto gluster's own mount point (test case is
> creating an absolute link).
>
> So first find out: who is triggering fop->open() on a symlink. Fix the
> caller.
>
> Next: add a check in posix_open() to fail with ELOOP or EINVAL if the inode
> is a symlink.

I think I understood what you are saying. An open call for a symlink on the fuse mount led to another open call for the target on the same fuse mount.

It's not that simple. The client VFS is intelligent enough to resolve symlinks and send open() only on non-symlinks. And the test case script was doing an obvious unlink() (TEST rm -f <filename>), so it was not initiated by an open() attempt in the first place. My guess is that some xlator (probably crypt?) is doing an open() on an inode and that is going through unchecked in posix. It is a bug in both the caller and posix, but the onus/responsibility is on posix to disallow open() on anything but regular files (even open() on character or block devices should not happen in posix).
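
To make the posix-side check concrete, here is a minimal sketch of the kind of guard meant above. It is not the actual posix xlator code: check_openable() and its real_path argument are hypothetical names, and the choice of ELOOP for symlinks and EINVAL for other non-regular files simply follows the suggestion earlier in this thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/stat.h>

/* Hypothetical helper (assumption, not gluster code): returns 0 if
 * real_path names a regular file, otherwise a positive errno value
 * that an open fop could fail with. */
static int
check_openable(const char *real_path)
{
    struct stat st;

    /* lstat() does not follow a trailing symlink, so the type reported
     * is that of the entry itself, never of its target. */
    if (lstat(real_path, &st) != 0)
        return errno;

    if (S_ISLNK(st.st_mode))
        return ELOOP;   /* never resolve symlinks on the server side */

    if (!S_ISREG(st.st_mode))
        return EINVAL;  /* directories, char/block devices, fifos, sockets */

    return 0;
}
```

A caller would run this on the backend path before calling open() and unwind with the returned errno on failure. Note that checking first and opening afterwards leaves a small window in which the entry could be replaced.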

 
Which led to deadlock :). Is that why we disallow opens on symlinks in gluster?

That's not the only reason open on a symlink is disallowed in gluster; it is part of the more generic problem of following symlinks inside gluster. Symlink resolution must strictly happen only in the outermost VFS. Following symlinks inside the filesystem is not only an invalid operation, but can lead to all kinds of deadlocks, security holes (what if you opened a symlink which points to /etc/passwd: should it show the contents of the client machine's /etc/passwd or the server's? Now what if you wrote to the file through the symlink? You get the idea.) and wrong/weird/dangerous behaviors. This is not just about following symlinks; the same applies to open()ing special devices. E.g., if you create a char device file with the major/minor number of an audio device and write PCM data into it, should it play music on the client machine or on the server machine? The summary is: following symlinks and opening non-regular files are VFS/client operations, and are invalid operations in a filesystem context.
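
As an illustration of "never follow symlinks inside the filesystem" at the system-call level, the sketch below opens a path with O_NOFOLLOW and then verifies the type of the descriptor it actually holds. This is an assumption-laden example, not what gluster does: open_regular_nofollow() is a hypothetical name, and O_NOFOLLOW only guards the final path component, not symlinks in parent directories.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper (assumption, not gluster code): open path
 * read-only without following a symlink in the last component.
 * Returns an fd >= 0 on success or a negated errno on failure. */
static int
open_regular_nofollow(const char *path)
{
    struct stat st;

    /* O_NONBLOCK keeps the open from blocking on a fifo; it has no
     * effect on regular files. */
    int fd = open(path, O_RDONLY | O_NOFOLLOW | O_NONBLOCK);
    if (fd < 0)
        return -errno;  /* a symlink as the last component fails here */

    /* fstat() inspects the descriptor itself, so the type check cannot
     * be raced by someone replacing the directory entry. */
    if (fstat(fd, &st) != 0) {
        int err = errno;
        close(fd);
        return -err;
    }

    if (!S_ISREG(st.st_mode)) {
        close(fd);
        return -EINVAL;  /* directory, device node, fifo, socket */
    }

    return fd;
}
```

Because the open() itself still happens before the type is known, this does not avoid side effects of opening a device node; a type check on the already-resolved inode remains the cleaner place to reject those.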

