On Wed, Aug 28, 2013 at 12:59 PM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> On Wed, Aug 28, 2013 at 12:04:43PM -0700, Andy Lutomirski wrote:
>> On Tue, Aug 27, 2013 at 11:16 PM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
>> > On Tue, Aug 27, 2013 at 01:28:27PM -0700, Andy Lutomirski wrote:
>> >> There are also O_PATH fds, and I'm not sure what the semantics of
>> >> O_PATH fds are or should be when they refer to something other than a
>> >> directory.
>> >
>> > O_PATH file just points to specific location in the tree, no more and
>> > no less.
>>
>> I don't know whether ftruncate(some O_PATH fd) should work. But this
>> probably barely matters.
>
> It shouldn't. No IO on these guys at all.
>
>> > AFAICS, the *only* cases when we might possibly care are linkat() target,
>> > truncate() and open(). Note, BTW, that right now we *do* allow an attempt
>> > to reopen a file via procfs symlink r/w, even when file had been r/o.
>> > It's subject to permissions on the object being opened, but that's it.
>> >
>> > I'm not sure we can change that - again, it's a user-visible API, and
>> > the change is very likely to break some scripts. In fact, it's about
>> > as dangerous as a full-blown switch to dup-style semantics for procfs
>> > opens, and it's a lot less attractive.
>> >
>> > For truncate() we would only need to have FMODE_WRITE reported, more or
>> > less the same way as FMODE_FLINK. And without open() changes it doesn't
>> > buy us anything at all...
>> >
>> > I've no problem with unrolling the user_path_at() in do_sys_truncate()
>> > into an explicit loop by trailing symlinks and checking for indication
>> > left by proc_pid_follow_link(), more or less the same way as with
>> > LOOKUP_LINK in lookup_last(). It's _far_ less invasive than playing
>> > with "oh, here we fill a struct path or maybe a struct file" horrors,
>> > pinning struct file for no reason, etc.
>> >
>> > AFAICS, the real question is whether we dare to change open() behaviour on
>> > /proc/*/fd/*. I've played with that a bit and I believe that I can do
>> > the switch to dup-style with very localized changes in fs/namei.c and
>> > fs/proc/{base,fd}.c. Will be even binary compatible kernel-side -
>> > ->atomic_open() returns NULL/ERR_PTR where it used to return 0/-error,
>> > not that we had many instances to convert. *IF* that variant is not
>> > out of consideration for userland API stability reasons, I would certainly
>> > prefer to go that way; turns out that these days we can pull it off without
>> > black magic in descriptor handling, etc. Linus?
>>
>> I personally find the check-mode-but-get-a-new-struct-file version to
>> be less weird than the dup approach. Either approach will break
>> scripts that try to write to /dev/stdin (which is the whole point).
>
> What, breaking existing userland? IMO that's a thing to avoid, unless we
> have really, really strong reasons not to. And yes, it goes for both
> variants... FWIW, I'm not convinced that the reasons you are giving for
> it are strong enough - passing somebody a read-only file descriptor to
> a file they could open for write and relying on their inability to truncate
> the fscker just because it's not reachable via any path they've got search
> permissions to looks like a Bloody Bad Idea(tm), and not only because it won't
> do what you hope it'll do on existing kernels. It's very easy to fuck up
> and end up with a searchable path to the damn thing; e.g. /proc/<pid>/cwd/foo
> will bypass the grandparent of foo not being searchable for you, etc.
>

This affects O_TMPFILE, for example -- create a file with
O_TMPFILE | O_RDWR and mode 0666 (by accident), write something, then
open("/proc/self/fd/N", O_RDONLY) and send the resulting fd to someone.
They can't directly write it, but they can reopen it O_RDWR.

I agree that flink is the main issue here, FWIW.
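To make the O_TMPFILE scenario concrete, here is a rough sketch in C. It is an illustration only, not code from this thread: the helper name try_reopen_rdwr() and its directory argument are made up, and it assumes Linux 3.11 or later, a mounted /proc, and a filesystem under the given directory that supports O_TMPFILE. For simplicity both opens happen in one process; in the real scenario the read-only fd would be handed to another process, which could do the same reopen as long as the file mode allows it.

```c
#define _GNU_SOURCE            /* needed for O_TMPFILE with glibc */
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper (name and signature are illustrative only).
 * Returns  1 if a read-only fd could be reopened O_RDWR via procfs,
 *          0 if the kernel refused the reopen,
 *         -1 if the setup failed (no O_TMPFILE support, no /proc, ...).
 */
int try_reopen_rdwr(const char *dir)
{
    char link[64];
    int tmp, ro, rw;
    int result = -1;
    mode_t old_umask;

    /* Clear the umask so the file really ends up 0666 ("by accident"). */
    old_umask = umask(0);
    tmp = open(dir, O_TMPFILE | O_RDWR, 0666);
    umask(old_umask);
    if (tmp < 0)
        return -1;

    if (write(tmp, "secret\n", 7) != 7)
        goto out_tmp;

    /* Get a read-only fd through procfs; this is the one we would pass on. */
    snprintf(link, sizeof(link), "/proc/self/fd/%d", tmp);
    ro = open(link, O_RDONLY);
    if (ro < 0)
        goto out_tmp;

    /* What the recipient can try: reopen the read-only fd for writing. */
    snprintf(link, sizeof(link), "/proc/self/fd/%d", ro);
    rw = open(link, O_RDWR);
    if (rw >= 0) {
        result = 1;
        close(rw);
    } else {
        result = 0;
    }

    close(ro);
out_tmp:
    close(tmp);
    return result;
}
```

Calling something like try_reopen_rdwr("/tmp") from a small main() should, with the semantics described above, report that the O_RDWR reopen succeeds, since the only check is the permission bits on the file itself.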
I think that the current semantics are just too screwed up to make it
really safe to pass around fds with reduced access modes and expect
those modes to stick.

--Andy
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html