Hello wonderful kernel dev team!

I noticed a weird performance behavior when closing an inotify file
descriptor. Here is a test C program which demonstrates the issue:
https://gist.github.com/1fa8ae0e0d16a0618691d896315d93e8

My test C program's job is to call inotify_init1(), inotify_add_watch(),
then close(). Each watch is for a different directory. Each inotify fd
gets one watch. The program times how long close()ing the inotify fd
takes.

When I run my test C program on my machine, I get this output:
https://gist.github.com/b396f2379cc066e78e15938b5490cb4d

close() is very slow depending on how you call it in relation to other
inotify fds in the process.

I looked at inotify's implementation and I think the slowness is because
of the synchronize_srcu() call in fsnotify_mark_destroy_workfn()
(fs/notify/mark.c).

Why does close() performance matter to me? I am writing a test suite for
a program which uses inotify. Many test cases in the test suite do the
following:

1. Create a temporary directory
2. Add files into the temporary directory
3. Create an inotify fd
4. Watch some directories and files in the temporary directory
5. Manipulate the filesystem in interesting ways
6. Read the inotify fd and do application-specific logic
7. Assert that the application did the right thing
8. Close the inotify fd
9. Delete the temporary directory and its contents

I noticed that my test suite started becoming slow. With only a handful
of test cases, the test suite was taking half a second. I tracked the
problem down to close(), so I created a test C program to demonstrate
the performance behavior of close() (linked above).

I naively expected close() for an inotify fd to be pretty fast. (I do
understand that close() can be slow for files on NFS, though.)

I found a workaround for the slowness: at the end of each test case,
don't close() the inotify fd. Instead, unwatch everything associated
with that inotify fd, and every few test cases, close all the inotify
fds.
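
A minimal sketch of the workaround described above, for illustration
only. The helper names retire_inotify_fd() and
close_deferred_inotify_fds() and the batch size of 16 are hypothetical;
they are not taken from the linked test program, which may structure
this differently.

```c
#include <assert.h>
#include <sys/inotify.h>
#include <unistd.h>

/* Hypothetical batch size: how many inotify fds may pile up before
 * they are all closed together. */
#define MAX_DEFERRED_FDS 16

static int deferred_fds[MAX_DEFERRED_FDS];
static int deferred_fd_count = 0;

/* Close every deferred inotify fd in one batch.
 * Returns the number of fds that were closed successfully. */
int close_deferred_inotify_fds(void)
{
    int closed = 0;
    for (int i = 0; i < deferred_fd_count; i++) {
        if (close(deferred_fds[i]) == 0)
            closed++;
    }
    deferred_fd_count = 0;
    return closed;
}

/* Call at the end of a test case instead of close(fd): remove all
 * watches from the fd, then park the fd for a later batched close.
 * Returns 0 if every watch was removed, -1 otherwise. */
int retire_inotify_fd(int fd, const int *wds, int wd_count)
{
    int rc = 0;
    for (int i = 0; i < wd_count; i++) {
        if (inotify_rm_watch(fd, wds[i]) != 0)
            rc = -1;
    }
    /* Flush the batch first if there is no room left for this fd. */
    if (deferred_fd_count == MAX_DEFERRED_FDS)
        close_deferred_inotify_fds();
    assert(deferred_fd_count < MAX_DEFERRED_FDS);
    deferred_fds[deferred_fd_count++] = fd;
    return rc;
}
```

A test case would call retire_inotify_fd() in place of step 8, and the
test runner would call close_deferred_inotify_fds() once more at exit so
no fd stays open.
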
This amortizes the RCU synchronization in my test suite. This workaround
is codified by TEST_WATCH_AND_UNWATCH_EACH_THEN_CLOSE_ALL in my test C
program.

With this workaround, I don't need close() to be faster. I thought I'd
bring the issue to your attention regardless.

Have a nice day,
strager