Re: fd == 0 means AT_FDCWD BPF_OBJ_GET commands

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, May 17, 2023 at 09:17:36AM -0700, Alexei Starovoitov wrote:
> On Wed, May 17, 2023 at 5:05 AM Christoph Hellwig <hch@xxxxxx> wrote:
> >
> > On Wed, May 17, 2023 at 11:11:24AM +0200, Christian Brauner wrote:
> > > Adding fsdevel so we're aware of this quirk.
> > >
> > > So I'm not sure whether this was ever discussed on fsdevel when you took
> > > the decision to treat fd 0 as AT_FDCWD or in general treat fd 0 as an
> > > invalid value.
> >
> > I've never heard of this before, and I think it is compltely
> > unacceptable. 0 ist just a normal FD, although one that happens to
> > have specific meaning in userspace as stdin.
> >
> > >
> > > If it was discussed then great but if not then I would like to make it
> > > very clear that if in the future you decide to introduce custom
> > > semantics for vfs provided infrastructure - especially when exposed to
> > > userspace - that you please Cc us.
> >
> > I don't think it's just the future.  We really need to undo this ASAP.
> 
> Christian is not correct in stating that treatment of fd==0 as invalid
> bpf object applies to vfs fd-s.
> The path_fd addition in this patch is really the very first one of this kind.
> At the same time bpf anon fd-s (progs, maps, links, btfs) with fd == 0
> are invalid and this is not going to change. It's been uapi for a long time.
> 
> More so fd-s 0,1,2 are not "normal FDs".
> Unix has made two mistakes:
> 1. fd==0 being valid fd
> 2. establishing convention that fd-s 0,1,2 are stdin, stdout, stderr.
> 
> The first mistake makes it hard to pass FD without an extra flag.
> The 2nd mistake is just awful.
> We've seen plenty of severe datacenter wide issues because some
> library or piece of software assumes stdin/out/err.
> Various services have been hurt badly by this "convention".
> In libbpf we added ensure_good_fd() to make sure none of bpf objects
> (progs, maps, etc) are ever seen with fd=0,1,2.
> Other pieces of datacenter software enforce the same.
> 
> In other words fds=0,1,2 are taken. They must not be anything but
> stdin/out/err or gutted to /dev/null.
> Otherwise expect horrible bugs and multi day debugging.
> 
> Because of that, several years ago, we've decided to fix unix mistake #1
> when it comes to bpf objects and started reserving fd=0 as invalid.
> This patch is proposing to do the same for path_fd (normal vfs fd) when

It isn't as you now realized but I'm glad we cleared that up off-list.

> it is passed to bpf syscall. I think it's a good trade-off and fits
> the rest of bpf uapi.
> 
> Everyone who's hiding behind statements: but POSIX is a standard..
> or this is how we've been doing things... are ignoring the practical
> situation at hand. fd-s 0,1,2 are taken. Make sure your sw never produces them.

(Prefix: Imagine me calmly writing this and in a relaxed tone.)

Just to clarify. I do think that deciding that 0 is an invalid file
descriptor number is weird and I wish you'd have discussed this with us
before you took that decision. You've seen the reaction that other
low-level userspace people you talked to had to these news...

I'm not sure what to make of the POSIX excursion. I think that it is a
complete sideshow to the issue here in a way. But fwiw...

We don't follow arbitrary conventions such as 0, 1, and 2 because we all
have sworn allegiance to The Secret Order of the POSIX but because the
alternative is that one subsystem finds it neat to use fd 0 to refer to
AT_FDCWD and another one to AT_MY_CUSTOM_FD0_MEANING. Which is exactly
what would've happened if this patch would have made it unnoticed.

This doesn't scale and our interfaces aren't designed around
Shakespeare's dictum "What's in a name?".

This will quickly devolve into a situation similar to letting a step on
a staircase be off by a few millimeters. See those users falling.

I'm glad we cleared this up. My main issue is indeed that fd 0 now isn't
just forbidden it would be given an entirely different meaning which is
not acceptable from the vfs perspective.




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux