Sargun Dhillon <sargun@xxxxxxxxx> wrote:
> I discovered an interesting behaviour in epoll today. If I register the same
> file twice, under two different file descriptor numbers, and then I close one of
> the two file descriptors, epoll "leaks" the first event. This is fine, because
> one would think I could just go ahead and remove the event, but alas, that isn't
> the case. Some example python code follows to show the issue at hand.
>
> I'm not sure if this is really considered a "bug" or just "interesting epoll
> behaviour", but in my opinion this is kind of a bug, especially because leaks
> may happen by accident -- especially if files are not immediately freed.

"Interesting epoll behavior" combined with a quirk of the Python
wrapper for epoll. It passes the FD as epoll_event.data (.data
could also be any void *ptr, a u64, or a u32).

Not knowing Python myself (but knowing Ruby and Perl5 well), I
assume Python developers chose the safest route in passing an
integer FD for .data. Passing a pointer to an arbitrary
Perl/Ruby object would cause tricky lifetime issues with the
automatic memory management of those languages; I expect Python
would have the same problem.

> I'm also not sure why epoll events are registered by file, and not just fd.
> Is the expectation that you can share a single epoll amongst multiple
> "users" and register different files that have the same file descriptor

No, the other way around: different FDs for the same file.

Having registration keyed by [file+fd] allows users to pass
different pointers for different events on the same file, which
could have its uses.

Registering by FD alone isn't enough, since the epoll FD itself
can be shared across fork (which is of limited usefulness[1]),
and after a fork the same FD number may refer to different files
in each process.

Original iterations of epoll were keyed only by the file, with
the FD being added later.

> number (at least for purposes other than CRIU). Maybe someone can shed
> light on the behaviour.

CRIU? Checkpoint/Restore In Userspace?

[1] In contrast, kqueue has a unique close-on-fork behavior which greatly simplifies usage from C code (but less so for high-level runtimes which auto-close FDs).
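
To make the [file+fd] keying discussed above concrete, here is a
minimal sketch, assuming Linux and Python 3's select.epoll. It is
not the original poster's script; the function name
demo_stale_epoll_entry and the dictionary keys are hypothetical and
chosen only for illustration.

```python
import os
import select


def demo_stale_epoll_entry():
    """Hypothetical demo: register one pipe under two FDs, then close one."""
    r, w = os.pipe()
    dup_r = os.dup(r)          # second FD number for the same open file
    ep = select.epoll()
    out = {"r": r, "dup_r": dup_r}
    try:
        # Python stores the FD number in epoll_event.data for us.
        ep.register(r, select.EPOLLIN)
        ep.register(dup_r, select.EPOLLIN)
        os.write(w, b"x")      # make the read end readable

        os.close(r)            # the file stays open through dup_r
        out["after_close"] = sorted(fd for fd, _ in ep.poll(0.1))

        # Deleting by the closed FD number fails in epoll_ctl() with EBADF.
        # (Python before 3.9 silently ignores that error in unregister().)
        try:
            ep.unregister(r)
            out["unregister_errno"] = None
        except OSError as e:
            out["unregister_errno"] = e.errno

        # Removing the surviving FD only drops the [file, dup_r] entry.
        ep.unregister(dup_r)
        out["after_unregister"] = sorted(fd for fd, _ in ep.poll(0.1))

        # Closing the last FD releases the file, and with it the stale entry.
        os.close(dup_r)
        dup_r = None
        out["after_last_close"] = ep.poll(0.1)
    finally:
        ep.close()
        os.close(w)
        if dup_r is not None:
            os.close(dup_r)
    return out


if __name__ == "__main__":
    for key, value in demo_stale_epoll_entry().items():
        print(key, value)
```

The closed FD number keeps showing up in poll() results until the
last FD referring to the file is closed, which is why a runtime that
passes the bare FD as .data makes the stale entry look like a leak.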