On 5/29/13 1:19 PM, Jeremy Allison wrote:
On Wed, May 29, 2013 at 12:23:04PM -0700, Anand Avati wrote:
We just uncovered this issue in our QE testing -
On Wed, May 29, 2013 at 4:21 AM, Anand Avati <avati@xxxxxxxxxx> wrote:
+static DIR *vfs_gluster_fdopendir(struct vfs_handle_struct *handle,
+ files_struct *fsp, const char *mask,
+ uint32 attributes)
+{
+ return (DIR *) glfd_fd_get(fsp->fh->fd);
+}
When code takes this vfs_fdopendir() path (happened when testing
fsstress, most of the times vfs_opendir() is called - not sure why),
we are just passing a pointer of glfs_fd structure ...
The vfs_fdopendir() will get called when SMB2 is being
used and a directory listing is being done on an open
SMB2 directory handle.
+static int vfs_gluster_closedir(struct vfs_handle_struct *handle,
DIR *dirp)
+{
+ return glfs_closedir((void *)dirp);
+}
... and Samba does a vfs_closedir() _and_ vfs_close(), essentially
doing a double free on the glfs_fd structure.
Let me look into the mainline Samba code here. As I recall
it should set the fsp->fh->fd to -1 after doing the vfs_closedir()
when the file handle on the directory is closed. Thus the
vfs_close should see the fsp->fh->fd as already zero and
just ignore it.
Our testing was on a backport of this patch in v3-6-stable, which has
this interesting piece of code:
void dptr_CloseDir(files_struct *fsp)
{
if (fsp->dptr) {
/*
* Ugly hack. We have defined fdopendir to return ENOSYS if dirfd also
isn't
* present. I hate Solaris. JRA.
*/
#ifdef HAVE_DIRFD
if (fsp->fh->fd != -1 &&
fsp->dptr->dir_hnd &&
dirfd(fsp->dptr->dir_hnd->dir)) {
/* The call below closes the underlying fd. */
fsp->fh->fd = -1;
}
#endif
dptr_close_internal(fsp->dptr);
fsp->dptr = NULL;
}
}
1. The code is calling dirfd() on the value returned by vfs_fdopendir()
and vfs_opendir(). We return 'struct glfs_fd' typecasted to DIR *
(technique I copied from vfs_ceph). So dirfd() is getting called against
'struct glfs_fd *' in this case.
2. The logic around dirfd() is checking for > 0 as the condition of a
valid fd. Since dirfd is now working on a differnet object, it probably
picked an arbitrary location in glfs_fd and found the value to be 0, and
this lead to fsp->fh->fd to _not_ be 0.
3. In our test case the fd number returned by vfs_gluster for the
initial open file _was_ 0. That dirfd() condition needs a fix, I think?
I don't know if the master branch has equivalent code elsewhere,
couldn't find it easily..
Avati