[ Added Al explicitly, although I guess he sees the fsdevel posts too ] On Fri, May 18, 2012 at 2:27 AM, Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx> wrote: > > I noticed that /proc/self/fd/ returns wrong information to lstat() when files > are open()ed after already opened files have been close()d in accordance with > entries returned from opendir("/proc/self/fd"). So if I understand this correctly, the wrong information is otherwise fine, except the link has the S_IWUSR bit set. Correct? The code that determines the mode of the link is proc_fd_instantiate(), which does inode->i_mode = S_IFLNK; /* * We are not taking a ref to the file structure, so we must * hold ->file_lock. */ spin_lock(&files->file_lock); file = fcheck_files(files, fd); if (!file) goto out_unlock; if (file->f_mode & FMODE_READ) inode->i_mode |= S_IRUSR | S_IXUSR; if (file->f_mode & FMODE_WRITE) inode->i_mode |= S_IWUSR | S_IXUSR; spin_unlock(&files->file_lock); and what *seems* to happen is that your test program basically triggers a situation where the old /proc/self/fd/<x> dentry has not been replaced, so it contains the previous one that was writable (because you used to have a writable fd there before you closed it). The "readdir()" you have probably only matters because it instantiates the dentries in /proc/self/fd, and then the "close()" just doesn't invalidate the cached dentry - and neither does the lookup(). In fact, I can show the issue much more simply with this program, no readdir() required: --- test.c --- #include <stdio.h> #include <sys/types.h> #include <sys/stat.h> #include <fcntl.h> #include <unistd.h> #include <string.h> static void show_stat(const char *mode, int fd) { char buffer[100]; struct stat st; snprintf(buffer, sizeof(buffer), "/proc/self/fd/%u", fd); if (!lstat(buffer, &st)) printf("st_mode = %06o (%s)\n", st.st_mode, mode); } int main(int argc, char *argv[]) { int fdrw = open("/dev/null", O_RDWR); int fdro = open("/dev/null", O_RDONLY); show_stat("read-write", fdrw); show_stat("read-only", fdro); dup2(fdro, fdrw); show_stat("read-only", fdrw); show_stat("read-only", fdro); return 0; } --- end test.c --- and running it results in: [torvalds@i5 ~]$ ./a.out st_mode = 120700 (read-write) st_mode = 120500 (read-only) st_mode = 120700 (read-only) st_mode = 120500 (read-only) notice how the "dup2()" that closed the read-write fd and replaced it with the read-only fd still shows 0700 in the lstat information. The good news is that we still do check the inode permission when actually following the link and doing the open(), so you cannot actually open a read-only file using that link. So it's a "stale information" thing, and ugly, but not a security issue. I would suggest just moving the i_mode initialization from proc_fd_instantiate() into the revalidate function that we already have, and that already fixes up i_uid/i_gid etc. Attached is a TOTALLY UNTESTED patch that does this, and actually seems to simplify things in the process. Al, Eric? Linus
Attachment:
patch.diff
Description: Binary data