On Tue, Aug 20, 2024 at 12:34:14PM GMT, Eric Biggers wrote:
> On Mon, Aug 19, 2024 at 10:41:15AM +0200, Christian Brauner wrote:
> > On Sat, Aug 17, 2024 at 08:58:18PM GMT, Eric Biggers wrote:
> > > Hi Christian,
> > >
> > > On Wed, Jul 31, 2024 at 12:01:12PM +0200, Christian Brauner wrote:
> > > > It's currently possible to create pidfds for kthreads but it is unclear
> > > > what that is supposed to mean. Until we have use-cases for it and we
> > > > figured out what behavior we want block the creation of pidfds for
> > > > kthreads.
> > > >
> > > > Fixes: 32fcb426ec00 ("pid: add pidfd_open()")
> > > > Cc: stable@xxxxxxxxxxxxxxx
> > > > Signed-off-by: Christian Brauner <brauner@xxxxxxxxxx>
> > > > ---
> > > >  kernel/fork.c | 25 ++++++++++++++++++++++---
> > > >  1 file changed, 22 insertions(+), 3 deletions(-)
> > >
> > > Unfortunately this commit broke systemd-shutdown's ability to kill processes,
> > > which makes some filesystems no longer get unmounted at shutdown.
> > >
> > > It looks like systemd-shutdown relies on being able to create a pidfd for any
> > > process listed in /proc (even a kthread), and if it gets EINVAL it treats it a
> > > fatal error and stops looking for more processes...
> >
> > Thanks for the report!
> > I talked to Daan De Meyer who made that change and he said that this
> > must a systemd version that hasn't gotten his fixes yet. In any case, if
> > this causes regression then I'll revert it right now. See the appended
> > revert.
>
> Thanks for queueing up a revert.
>
> This was on systemd 256.4 which was released less than a month ago.
>
> I'm not sure what systemd fix you are talking about. Looking at killall() in
> src/shared/killall.c on the latest "main" branch of systemd, it calls
> proc_dir_read_pidref() => pidref_set_pid() => pidfd_open(), and EINVAL gets
> passed back up to killall() and treated as a fatal error. ignore_proc() skips
> kernel threads but is executed too late. I didn't test it, so I could be wrong,
> but based on the code it does not appear to be fixed.

Yeah, I think you're right. What they fixed is
ead48ec35c86 ("cgroup-util: Don't try to open pidfd for kernel threads")
when reading pids from cgroup.procs. Daan is currently prepping a fix
for reading pids from /proc as well.
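
For illustration only, here is a minimal C sketch of the ordering discussed above: check for a kernel thread before calling pidfd_open(), and treat EINVAL as "skip this pid" rather than as a fatal error. This is not systemd's code. The helper names (stat_line_is_kthread, pid_is_kthread, open_pidfd_or_skip) and the PIDFD_SKIP value are hypothetical, and the sketch assumes the kernel-thread check is done by reading the flags field of /proc/<pid>/stat and testing PF_KTHREAD.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef SYS_pidfd_open
#define SYS_pidfd_open 434      /* same syscall number on all architectures */
#endif

#define PF_KTHREAD 0x00200000   /* task flag from the kernel's include/linux/sched.h */
#define PIDFD_SKIP (-2)         /* hypothetical "not an error, just skip" return value */

/* Hypothetical helper: parse one /proc/<pid>/stat line.
 * Returns 1 if PF_KTHREAD is set, 0 if not, -1 if the line cannot be parsed. */
static int stat_line_is_kthread(const char *line)
{
    unsigned long flags;
    /* comm is wrapped in parentheses and may itself contain ')' or spaces,
     * so resume parsing after the last ')'. */
    const char *p = strrchr(line, ')');

    if (!p)
        return -1;
    /* Remaining fields: state ppid pgrp session tty_nr tpgid flags ... */
    if (sscanf(p + 1, " %*c %*d %*d %*d %*d %*d %lu", &flags) != 1)
        return -1;
    return (flags & PF_KTHREAD) ? 1 : 0;
}

/* Hypothetical helper: same answer for a live pid, -1 if it cannot be read. */
static int pid_is_kthread(pid_t pid)
{
    char path[64], line[1024];
    FILE *f;
    int r = -1;

    snprintf(path, sizeof(path), "/proc/%ld/stat", (long) pid);
    f = fopen(path, "r");
    if (!f)
        return -1;
    if (fgets(line, sizeof(line), f))
        r = stat_line_is_kthread(line);
    fclose(f);
    return r;
}

/* Returns a pidfd (>= 0), PIDFD_SKIP if the pid should be skipped,
 * or -1 with errno set for any other failure. */
static int open_pidfd_or_skip(pid_t pid)
{
    int fd;

    /* Filter kernel threads *before* asking for a pidfd. */
    if (pid_is_kthread(pid) == 1)
        return PIDFD_SKIP;

    fd = (int) syscall(SYS_pidfd_open, pid, 0);
    if (fd >= 0)
        return fd;
    /* EINVAL: assumed here to mean the kernel refused this target (for
     * example a kthread); ESRCH: the process exited. Neither should stop
     * the scan of the remaining pids. */
    if (errno == EINVAL || errno == ESRCH)
        return PIDFD_SKIP;
    return -1;
}
```

A caller iterating over /proc would continue with the next entry whenever open_pidfd_or_skip() returns PIDFD_SKIP and abort only on -1.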