On Wed, Feb 12, 2025 at 12:16:44PM +0100, Christian Brauner wrote:
> On Tue, Feb 11, 2025 at 01:56:41PM -0500, Jeff Layton wrote:
> > On Tue, 2025-02-11 at 18:15 +0100, Christian Brauner wrote:
> > > In [1] it was reported that the acct(2) system call can be used to
> > > trigger a NULL deref in cases where it is set to write to a file that
> > > triggers an internal lookup.
> > >
> > > This can e.g., happen when pointing acct(2) to /sys/power/resume. At the
> > > point where the write to this file happens the calling task has
> > > already exited and called exit_fs() but an internal lookup might be
> > > triggered through lookup_bdev(). This may trigger a NULL-deref
> > > when accessing current->fs.
> > >
> > > This series does two things:
> > >
> > > - Reorganize the code so that the final write happens from the
> > >   workqueue but with the caller's credentials. This preserves the
> > >   (strange) permission model and has almost no regression risk.
> > >
> > > - Block access to kernel internal filesystems as well as procfs and
> > >   sysfs in the first place.
> > >
> > > This api should stop to exist imho.
> > >
> >
> > I wonder who uses it these days, and what would we suggest they replace
> > it with? Maybe syscall auditing?
>
> Someone pointed me to atop but that also works without it. Since this is
> a privileged api I think the natural candidate to replace all of this is
> bpf. I'm pretty sure that it's relatively straightforward to get a lot
> more information out of it than with acct(2) and it will probably be
> more performant too.
>
> Without any limitations as it is right now, acct(2) can easily lock up
> the system by pointing it to various things in sysfs and I'm sure it
> can be abused in other ways. So I wouldn't enable it.

And I totally forgot about taskstats via Netlink:

https://www.kernel.org/doc/Documentation/accounting/taskstats.txt
include/uapi/linux/taskstats.h
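
For anyone who wants to see what the second item in the series (refusing
procfs and sysfs targets) amounts to, here is a rough userspace analogue.
This is only a sketch and not the kernel patch: it assumes a statfs(2)
magic-number comparison is a fair stand-in for the in-kernel check, and the
helper name acct_target_is_proc_or_sysfs() is made up for illustration. It
also does not cover the "kernel internal filesystems" part of that item.

```c
/* Userspace sketch only; NOT the in-kernel check from the series. */
#include <assert.h>
#include <stddef.h>
#include <sys/vfs.h>      /* statfs(2) */
#include <linux/magic.h>  /* PROC_SUPER_MAGIC, SYSFS_MAGIC */

/*
 * Hypothetical helper: returns 1 if @path lives on procfs or sysfs,
 * 0 if it lives on some other filesystem, -1 if statfs() fails
 * (errno is left as set by statfs()).
 */
static int acct_target_is_proc_or_sysfs(const char *path)
{
	struct statfs sfs;

	assert(path != NULL);

	if (statfs(path, &sfs) != 0)
		return -1;

	/* f_type carries the superblock magic of the backing filesystem. */
	return sfs.f_type == PROC_SUPER_MAGIC || sfs.f_type == SYSFS_MAGIC;
}
```

On a system with sysfs mounted at /sys, calling the helper on
/sys/power/resume should report 1, which is the kind of target the series
wants acct(2) to refuse up front.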