On 24/11/13 11:07AM, Chris PeBenito wrote: > On 11/6/2024 10:21 AM, Chris PeBenito wrote: > > I've recently become aware of systemd's credentials feature[1]. In a > > nutshell, the intent is to reduce privilege in units by systemd itself > > copying the credentials (crypto materials, passwords, though in practice > > it could be anything) and placing it in /run/credentials, with access > > managed by namespacing. This is intended to eliminate the need for the > > daemon in the unit directly accessing the data. My concern is the > > possibility of inadvertently leaking credentials or abuse. i.e. putting > > in > > > > LoadCredential=foobar:/etc/shadow > > > > This illustrative, as systemd can't read shadow if it's confined, but > > you could replace shadow with a private key or any other highly > > confidential data systemd needs to access. The common response to my > > concern is systemd units should be protected at high integrity; only > > root can modify them, etc. However, I think we can do better to reduce > > the possibility of errors. > > > > I've discussed this with one of the systemd maintainers, and I'm > > proposing this change: > > > > 1. pid1 forks (to pidN) to run the unit, as usual. > > 2. pidN does security_compute_create() using the current domain and the > > label of the unit to get a new domain. > > 3. pidN does setcon() to the new domain. > > 4. pidN runs the unit as per usual (including the credentials stuff) > > > > Then we'd need to add something like this to the policy: > > > > allow init_t httpd_initrc_t:process dyntransition; > > type_transition init_t httpd_unit_t:process httpd_initrc_t; > > > > I have not yet prototyped this, but based on my discussion with the > > systemd maintainers, this should be doable. I believe the added benefit > > is we can decompose initrc_t's privilege and maybe even reduce init_t's > > privilege. > > Hearing no objections, I've done an initial implementation: > > https://github.com/systemd/systemd/compare/main...pebenito:systemd:pidN-selinux-setcon > > If there is no policy in place, it does not incur new denials. One nice > thing I found is that the unit name is available, so I used that in the > security_compute_create_name_raw() call. I tested by adding the following > systemd-networkd.service drop-in: > > [Service] > LoadCredential=shadow:/etc/shadow > > > I added the following to the policy: > > type systemd_networkd_initrc_t; > domain_type(systemd_networkd_initrc_t) > role system_r types systemd_networkd_initrc_t; > allow init_t self:process setcurrent; > domain_dyntrans_type(init_t) > allow init_t systemd_networkd_initrc_t:process dyntransition; > type_transition init_t systemd_networkd_unit_t:process > systemd_networkd_initrc_t; > domtrans_pattern(systemd_networkd_initrc_t, systemd_networkd_exec_t, > systemd_networkd_t) > > > These changes resulted in this denial: > > Nov 13 15:10:54 azurelinux-vm audit[605]: AVC avc: denied { read } for > pid=605 comm="(sd-mkdcreds)" name="shadow" dev="sda2" ino=18058 > scontext=system_u:system_r:systemd_networkd_initrc_t:s0 > tcontext=system_u:object_r:shadow_t:s0 tclass=file permissive=1 > > > The remaining policy for systemd_networkd_initrc_t would look like (denials > summarized by audit2allow): > > allow systemd_networkd_initrc_t autofs_t:dir getattr; > allow systemd_networkd_initrc_t autofs_t:filesystem unmount; > allow systemd_networkd_initrc_t bin_t:dir { getattr search }; > allow systemd_networkd_initrc_t bin_t:file { execute execute_no_trans > getattr map open read }; > allow systemd_networkd_initrc_t boot_t:dir search; > allow systemd_networkd_initrc_t cgroup_t:dir { getattr search }; > allow systemd_networkd_initrc_t cgroup_t:file { getattr mounton }; > allow systemd_networkd_initrc_t cgroup_t:filesystem { getattr remount }; > allow systemd_networkd_initrc_t device_t:dir mounton; > allow systemd_networkd_initrc_t devlog_t:sock_file write; > allow systemd_networkd_initrc_t dosfs_t:filesystem remount; > allow systemd_networkd_initrc_t fs_t:filesystem { remount unmount }; > allow systemd_networkd_initrc_t home_root_t:dir { getattr mounton }; > allow systemd_networkd_initrc_t init_runtime_t:dir { add_name create getattr > mounton remove_name rmdir search write }; > allow systemd_networkd_initrc_t init_t:dir search; > allow systemd_networkd_initrc_t init_t:fd use; > allow systemd_networkd_initrc_t init_t:file { getattr ioctl open read }; > allow systemd_networkd_initrc_t init_t:unix_stream_socket getattr; > allow systemd_networkd_initrc_t kernel_t:unix_dgram_socket sendto; > allow systemd_networkd_initrc_t kmsg_device_t:chr_file { getattr mounton }; > allow systemd_networkd_initrc_t modules_object_t:dir { getattr mounton }; > allow systemd_networkd_initrc_t proc_kmsg_t:file { getattr mounton }; > allow systemd_networkd_initrc_t proc_t:file { getattr open read }; > allow systemd_networkd_initrc_t proc_t:filesystem { mount remount unmount }; > allow systemd_networkd_initrc_t root_t:dir mounton; > allow systemd_networkd_initrc_t self:capability { dac_read_search fowner > net_admin setgid setpcap setuid sys_resource }; > allow systemd_networkd_initrc_t self:key { search setattr write }; > allow systemd_networkd_initrc_t self:netlink_route_socket { bind create > getattr getopt nlmsg_read read setopt write }; > allow systemd_networkd_initrc_t self:process { getcap setcap setfscreate > setrlimit }; > allow systemd_networkd_initrc_t self:unix_dgram_socket { connect create > getopt setopt }; > allow systemd_networkd_initrc_t shell_exec_t:file getattr; > allow systemd_networkd_initrc_t sysctl_fs_t:dir { getattr mounton search }; > allow systemd_networkd_initrc_t sysctl_kernel_t:dir search; > allow systemd_networkd_initrc_t sysctl_kernel_t:file { getattr ioctl open > read }; > allow systemd_networkd_initrc_t syslogd_runtime_t:dir search; > allow systemd_networkd_initrc_t systemd_networkd_runtime_t:dir { getattr > mounton open read search watch }; > allow systemd_networkd_initrc_t systemd_networkd_runtime_t:file { getattr > open read }; > allow systemd_networkd_initrc_t systemd_networkd_t:process2 nnp_transition; > ### other than mounton this tmpfs dir/file access is for creating the > /run/credentials content > allow systemd_networkd_initrc_t tmpfs_t:dir { add_name create getattr > mounton open read remove_name search setattr write }; > contents: > allow systemd_networkd_initrc_t tmpfs_t:file { create getattr open read > rename setattr write }; > allow systemd_networkd_initrc_t tmpfs_t:filesystem { mount remount unmount > }; > allow systemd_networkd_initrc_t unlabeled_t:dir mounton; > allow systemd_networkd_initrc_t user_home_dir_t:dir { getattr mounton }; > allow systemd_networkd_initrc_t user_runtime_root_t:dir { getattr mounton }; > > This seems like a very promising way to break up initrc_t, limit privileges, > and prevent administrator errors. What do you think? > > > -- > Chris > Overall I like the direction this is going! I am curious, though, about whether this will affect systemd units' ExecStartPre=, ExecStartPost=, and similar directives. One of the problems I have when writing policy for some systemd units is these directives will run commands normally under init_t instead of the resulting daemon domain. A unit may, as an example, want to remove a file on ExecStartPre=, which will run as init_t and therefore will need to be allowed to do so by policy if the command there is a simple /bin/rm. Will this be extended to have these commands run under the unit's corresponding initrc_t domain if there is one? That would solve so much headache when dealing with these types of units and also allow breaking up init_t potentially further. -Kenton Groombridge
Attachment:
signature.asc
Description: PGP signature