Topi Miettinen <toiwoton@xxxxxxxxx> writes:

> On 4.11.2019 17.44, Eric W. Biederman wrote:
>> Topi Miettinen <toiwoton@xxxxxxxxx> writes:
>>
>>> On 3.11.2019 20.50, Eric W. Biederman wrote:
>>>> Topi Miettinen <toiwoton@xxxxxxxxx> writes:
>>>>
>>>>> Several items in /proc/sys need not be accessible to unprivileged
>>>>> tasks. Let the system administrator change the permissions, but only
>>>>> to more restrictive modes than what the sysctl tables allow.
>>>>
>>>> This looks quite buggy. You neither update table->mode nor
>>>> do you ever read from table->mode to initialize the inode.
>>>> I am missing something in my quick reading of your patch?
>>>
>>> inode->i_mode gets initialized in proc_sys_make_inode().
>>>
>>> I didn't want to touch the table, so that the original permissions can
>>> be used to restrict the changes made. In case the restrictions are
>>> removed as suggested by Theodore Ts'o, table->mode could be
>>> changed. Otherwise I'd rather add a new field to store the current
>>> mode and the mode field can remain for reference. As the original
>>> author of the code from 2007, would you let the administrator to
>>> chmod/chown the items in /proc/sys without restrictions (e.g. 0400 ->
>>> 0777)?
>>
>> At an architectural level I think we need to do this carefully and have
>> a compelling reason. The code has survived nearly the entire life of
>> linux without this capability.
>
> I'd be happy with only allowing restrictions to access for
> now. Perhaps later with more analysis, also relaxing changes and maybe
> UID/GID changes can be allowed.

Let's find the use case where someone cares before we think about that.

>> I think right now the common solution is to mount another file over the
>> file you are trying to hide/limit. Changing the permissions might be
>> better but that is not at all clear.
>>
>> Do you have specific examples of the cases where you would like to
>> change the permissions?
>
> Unprivileged applications typically do not need to access most items
> in /proc/sys, so I'd like to gradually find out which are needed. So
> far I've seen no problems with 0500 mode for directories abi, crypto,
> debug, dev, fs, user or vm.

But if there is no problem in letting everyone access the information
why reduce the permissions?

> I'm also using systemd's InaccessiblePaths to limit access (which
> mounts an inaccessible directory over the path), but that's a bit too
> big hammer. For example there are over 100 files in /proc/sys/kernel,
> perhaps there will be issues when creating a mount for each, and that
> multiplied by a number of services.

My sense is that if there is any kind of compelling reason to make
world-readable values not world-readable, and it doesn't break anything
(except malicious applications) then a kernel patch is probably the way
to go.

Policy knobs like this on proc tend to break in normal maintenance
because they are not used enough so I am not a big fan of adding policy
knobs just because we can.

> I see no problems by using Firejail (which uses PID namespacing) with
> v2, the permissions in /proc/sys are the same as outside the
> namespace.

Thank you for testing.

Eric
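
For illustration only: the rule discussed in this thread ("only to more
restrictive modes than what the sysctl tables allow") can be read as a
bitmask test on the permission bits. The sketch below is not taken from
the patch; the helper name mode_is_restriction() and the 0777 mask are
assumptions made for this example, and the real check in the patch may
differ.

```c
#include <stdbool.h>

/*
 * Hypothetical helper, not from the patch under discussion.
 * Returns true when new_mode sets no permission bit that is absent
 * from table_mode, i.e. the change only removes access.
 * Only the rwx bits for user/group/other (0777) are compared.
 */
static bool mode_is_restriction(unsigned int table_mode, unsigned int new_mode)
{
    const unsigned int perm_mask = 0777;
    unsigned int allowed = table_mode & perm_mask;
    unsigned int wanted = new_mode & perm_mask;

    /* Any bit in wanted that is not in allowed would widen access. */
    return (wanted & ~allowed) == 0;
}
```

Under this reading, tightening a 0555 directory to 0500 passes the
test, while the 0400 -> 0777 change mentioned above does not.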