On Thu, Oct 26, 2023 at 2:30 PM Maciej Żenczykowski <maze@xxxxxxxxxx> wrote: > > Alexei said he's travelling busy and to resend to bpf@vger, so here it goes: > > https://github.com/libbpf/libbpf/blob/master/src/libbpf.c#L4525 > > int bpf_map__reuse_fd(struct bpf_map *map, int fd) { > ... > /* > * Like dup(), but make sure new FD is >= 3 and has O_CLOEXEC set. > * This is similar to what we do in ensure_good_fd(), but without > * closing original FD. > */ > new_fd = fcntl(fd, F_DUPFD_CLOEXEC, 3); > if (new_fd < 0) { > err = -errno; > goto err_free_new_name; > } > > err = zclose(map->fd); > if (err) { > err = -errno; > goto err_close_new_fd; > } > > This is based *purely* on very local code inspection of the ~50 lines > up above - so I may well be wrong - but: > > (a) optimization opportunity: there should be no need to > F_DUPFD_CLOEXEC 3 if fd is already >= 3 > I assume in this case map->fd == fd, though it's not clear from > looking at the function... > why is fd a direct argument? maybe it's trying to take ownership > while still allowing the caller to close the passed in fd? > > it feels like this function should: > close(map->fd) almost on entry > if (fd < 3) {new_fd = dupfd_cloexec(fd,3) and close(fd); fd = new_fd; } > map->fd = fd > > Note: I'm assuming here that kernel returned map fds already have > cloexec set, which I seem to recall was the case? > > Maybe pass in *fd instead of fd? > bpf_map__reuse_fd() doesn't take ownership of passed FD, so it always has to dup() it. Doesn't seem like we do anything unnecessary here? > (b) the close() system call closes the file descriptor *even* if it > returns an error > it's likely a mistake to be checking what value zclose() returns. > (close() returns *pending* errors but it always releases the file descriptor) > looking at the implementation of zclose() earlier in the file, it's > aware of this. > as such I think: > > err = zclose(map->fd); > if (err) { > err = -errno; > goto err_close_new_fd; > } > > should probably be *just* > (void)zclose(map->fd); > or even just > close(map->fd) > since map->fd will be overwritten in a moment. This is true, I think we can safely just zclose() unconditionally and proceed. I think we still need zclose() to avoid close(-1) situation, though. Please send a patch. > > btw. happened to look at this because I ran across the > "fd == 0 means AT_FDCWD BPF_OBJ_GET commands" thread. > It looks like the kernel community is perfectly okay with the kernel > throwing land mines at userspace... > I totally agree with you that for most new fd-like things (incl. > timerfd's, bpf, etc.) > the kernel should never allocate them with index <3 because it's *too* > ripe for abuse. > > Anyway, the above comments *might* not be correct. > I didn't do a deep analysis or anything, just something I noticed / > was confused by on a quick glance. > There may be deeper reasons why the code should be the way it is > (though perhaps it could use more comments then) > Thanks for taking a time to look at this and report what you think is wrong. I think the F_DUPFD_CLOEXEC is a non-issue, but zclose() one we can improve, thanks! > - Maciej >