On Wed, May 17, 2023 at 9:17 AM Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote:
>
> On Wed, May 17, 2023 at 5:05 AM Christoph Hellwig <hch@xxxxxx> wrote:
> >
> > On Wed, May 17, 2023 at 11:11:24AM +0200, Christian Brauner wrote:
> > > Adding fsdevel so we're aware of this quirk.
> > >
> > > So I'm not sure whether this was ever discussed on fsdevel when you took
> > > the decision to treat fd 0 as AT_FDCWD or in general treat fd 0 as an
> > > invalid value.
> >
> > I've never heard of this before, and I think it is completely
> > unacceptable. 0 is just a normal FD, although one that happens to
> > have specific meaning in userspace as stdin.
> >
> > >
> > > If it was discussed then great but if not then I would like to make it
> > > very clear that if in the future you decide to introduce custom
> > > semantics for vfs provided infrastructure - especially when exposed to
> > > userspace - that you please Cc us.
> >
> > I don't think it's just the future. We really need to undo this ASAP.
>
> Christian is not correct in stating that treatment of fd==0 as invalid
> bpf object applies to vfs fd-s.
> The path_fd addition in this patch is really the very first one of this kind.
> At the same time bpf anon fd-s (progs, maps, links, btfs) with fd == 0
> are invalid and this is not going to change. It's been uapi for a long time.
>
> More so fd-s 0,1,2 are not "normal FDs".
> Unix has made two mistakes:
> 1. fd==0 being valid fd
> 2. establishing convention that fd-s 0,1,2 are stdin, stdout, stderr.
>
> The first mistake makes it hard to pass FD without an extra flag.
> The 2nd mistake is just awful.
> We've seen plenty of severe datacenter wide issues because some
> library or piece of software assumes stdin/out/err.
> Various services have been hurt badly by this "convention".
> In libbpf we added ensure_good_fd() to make sure none of bpf objects
> (progs, maps, etc) are ever seen with fd=0,1,2.
> Other pieces of datacenter software enforce the same.
>
> In other words fds=0,1,2 are taken. They must not be anything but
> stdin/out/err or gutted to /dev/null.
> Otherwise expect horrible bugs and multi day debugging.
>
> Because of that, several years ago, we've decided to fix unix mistake #1
> when it comes to bpf objects and started reserving fd=0 as invalid.
> This patch is proposing to do the same for path_fd (normal vfs fd) when
> it is passed to bpf syscall. I think it's a good trade-off and fits
> the rest of bpf uapi.
>
> Everyone who's hiding behind statements: but POSIX is a standard..
> or this is how we've been doing things... are ignoring the practical
> situation at hand. fd-s 0,1,2 are taken. Make sure your sw never produces them.

Summarizing an offlist discussion with Christian and Andrii.
The key issue is that fd == 0 must not mean AT_FDCWD and that's clear.
We'll respin with an extra flag to indicate that path_fd should be used.
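
For illustration only, a rough userspace sketch of what "an extra flag" could
look like. This is not the actual patch: the struct layout, the flag name
EXAMPLE_F_PATH_FD and the helper example_resolve_dirfd() are all hypothetical.
The only point is that path_fd is interpreted when the flag is set, so
path_fd == 0 with the flag means fd 0, and AT_FDCWD is chosen only when the
flag is absent.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <fcntl.h>	/* AT_FDCWD */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bit; the name and value are made up for this sketch. */
#define EXAMPLE_F_PATH_FD (1U << 0)

/* Hypothetical attribute layout, loosely modelled on a syscall attr struct. */
struct example_pin_attr {
	uint32_t flags;
	int32_t path_fd;
};

/*
 * Decide which directory fd a pathname should be resolved against.
 *
 * - flag set:                 use path_fd exactly as given (fd 0 included)
 * - flag clear, path_fd == 0: caller did not supply one, use AT_FDCWD
 * - flag clear, path_fd != 0: reject, the field is set but not announced
 *
 * Returns 0 on success and stores the result in *dirfd, or -EINVAL.
 */
static int example_resolve_dirfd(const struct example_pin_attr *attr,
				 int *dirfd)
{
	assert(attr != NULL && dirfd != NULL);

	if (attr->flags & EXAMPLE_F_PATH_FD) {
		*dirfd = attr->path_fd;
		return 0;
	}

	if (attr->path_fd != 0)
		return -EINVAL;

	*dirfd = AT_FDCWD;
	return 0;
}
```

With this shape an old caller that zero-fills the struct still gets AT_FDCWD,
while a new caller that sets the flag can pass any fd, including 0, without
it being silently reinterpreted.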