Ping, for review feedback.

On Tue, Sep 26, 2023 at 05:11:44PM +0100, Daniel P. Berrangé wrote:
> The 'systemd-analyze security' command looks at the unit file
> configuration and reports on any settings which increase the
> attack surface for the daemon. Since most systemd units are
> fairly minimalist, this is generally informing us about settings
> that we never put any thought into using before.
>
> In its current configuration it reports
>
>   # systemd-analyze security virtlogd.service
>   ...snip...
>   → Overall exposure level for virtlogd.service: 9.6 UNSAFE 😨
>
> which is pretty terrible as a score.
>
> If we apply all of the recommendations that appear possible
> without (knowingly) breaking functionality it reports:
>
>   # systemd-analyze security virtlogd.service
>   ...snip...
>   → Overall exposure level for virtlogd.service: 2.2 OK 🙂
>
> which is a pretty decent improvement.
>
> Some of the settings we would like to enable require a systemd
> version that is newer than that available in our oldest distro
> target - RHEL-8 at v239.
>
> NB, RestrictSUIDSGID is technically newer than 239, but RHEL-8
> backported it, and other distros we target have it by default.
>
> Remaining recommendations are
>
>   ✗ CapabilityBoundingSet=~CAP_(DAC_*|FOWNER|IPC_OWNER)
>
>     We block FOWNER/IPC_OWNER, but can't block the two DAC
>     capabilities. Historically apps/users might point QEMU
>     to log files in $HOME, pre-created with their own user
>     ID.
>
>   ✗ IPAddressDeny=
>
>     Not required since RestrictAddressFamilies blocks IP
>     usage. Ignoring this avoids the overhead of creating
>     a traffic filter that will never be used.
>
>   ✗ NoNewPrivileges=
>
>     Highly desirable, but cannot enable it yet, because it
>     will block the ability to transition to the virtlogd_t
>     SELinux domain during execve. The SELinux policy needs
>     fixing to permit this transition under NNP first.
>
>   ✗ PrivateTmp=
>
>     There is a decent chance people have VMs configured
>     with a serial port logfile pointing at /tmp.
>     Using a private /tmp for logging would cause a regression.
>
>   ✗ PrivateUsers=
>
>     This would put virtlogd inside a user namespace where
>     its root is in fact unprivileged. Same problem as the
>     User= setting below.
>
>   ✗ ProcSubset=
>
>     Libraries we link to might read certain non-PID related
>     files from /proc.
>
>   ✗ ProtectClock=
>
>     Requires v245
>
>   ✗ ProtectHome=
>
>     Same problem as PrivateTmp=. There's a decent chance
>     that someone has a VM configured to write a logfile
>     to /home.
>
>   ✗ ProtectHostname=
>
>     Requires v241
>
>   ✗ ProtectKernelLogs
>
>     Requires v244
>
>   ✗ ProtectProc
>
>     Requires v247
>
>   ✗ ProtectSystem=
>
>     We only set it to 'full', as 'strict' is not viable for
>     our required usage.
>
>   ✗ RootDirectory=/RootImage=
>
>     We are not capable of running inside a custom chroot
>     given the need to write log files to arbitrary places.
>
>   ✗ RestrictAddressFamilies=~AF_UNIX
>
>     We need AF_UNIX to communicate with other libvirt daemons.
>
>   ✗ SystemCallFilter=~@resources
>
>     We link to libvirt.so which links to libnuma.so which has
>     a constructor that calls set_mempolicy. This is highly
>     undesirable to do during a constructor.
>
>   ✗ User=/DynamicUser=
>
>     This is highly desirable, but we currently read/write
>     logs as root, and directories we're told to write into
>     could be anywhere. So using a non-root user would have
>     a major risk of regressions for applications and also
>     have upgrade implications.
>
> Signed-off-by: Daniel P. Berrangé <berrange@xxxxxxxxxx>
> ---
>  src/logging/virtlogd.service.in | 94 +++++++++++++++++++++++++++++++++
>  1 file changed, 94 insertions(+)
>
> diff --git a/src/logging/virtlogd.service.in b/src/logging/virtlogd.service.in
> index 8e245ddb43..9e3838ff34 100644
> --- a/src/logging/virtlogd.service.in
> +++ b/src/logging/virtlogd.service.in
> @@ -20,5 +20,99 @@ OOMScoreAdjust=-900
>  # per systemd recommendations
>  LimitNOFILE=1024:524288
>
> +CapabilityBoundingSet=~CAP_AUDIT_CONTROL
> +CapabilityBoundingSet=~CAP_AUDIT_READ
> +CapabilityBoundingSet=~CAP_AUDIT_WRITE
> +CapabilityBoundingSet=~CAP_BLOCK_SUSPEND
> +CapabilityBoundingSet=~CAP_CHOWN
> +# Mgmt app/user might have pre-created log files that we're
> +# told to open and write to, or be storing them in otherwise
> +# inaccessible locations like $HOME. So we need to ignore
> +# DAC permission checks.
> +#CapabilityBoundingSet=~CAP_DAC_OVERRIDE
> +#CapabilityBoundingSet=~CAP_DAC_READ_SEARCH
> +CapabilityBoundingSet=~CAP_FOWNER
> +CapabilityBoundingSet=~CAP_FSETID
> +CapabilityBoundingSet=~CAP_IPC_LOCK
> +CapabilityBoundingSet=~CAP_IPC_OWNER
> +CapabilityBoundingSet=~CAP_KILL
> +CapabilityBoundingSet=~CAP_LEASE
> +CapabilityBoundingSet=~CAP_LINUX_IMMUTABLE
> +CapabilityBoundingSet=~CAP_MAC_ADMIN
> +CapabilityBoundingSet=~CAP_MAC_OVERRIDE
> +CapabilityBoundingSet=~CAP_MKNOD
> +CapabilityBoundingSet=~CAP_NET_ADMIN
> +CapabilityBoundingSet=~CAP_NET_BIND_SERVICE
> +CapabilityBoundingSet=~CAP_NET_BROADCAST
> +CapabilityBoundingSet=~CAP_NET_RAW
> +CapabilityBoundingSet=~CAP_SETFCAP
> +CapabilityBoundingSet=~CAP_SETPCAP
> +CapabilityBoundingSet=~CAP_SETGID
> +CapabilityBoundingSet=~CAP_SETUID
> +CapabilityBoundingSet=~CAP_SYSLOG
> +CapabilityBoundingSet=~CAP_SYS_ADMIN
> +CapabilityBoundingSet=~CAP_SYS_BOOT
> +CapabilityBoundingSet=~CAP_SYS_CHROOT
> +CapabilityBoundingSet=~CAP_SYS_MODULE
> +CapabilityBoundingSet=~CAP_SYS_NICE
> +CapabilityBoundingSet=~CAP_SYS_PACCT
> +CapabilityBoundingSet=~CAP_SYS_PTRACE
> +CapabilityBoundingSet=~CAP_SYS_RAWIO
> +CapabilityBoundingSet=~CAP_SYS_RESOURCE
> +CapabilityBoundingSet=~CAP_SYS_TIME
> +CapabilityBoundingSet=~CAP_SYS_TTY_CONFIG
> +CapabilityBoundingSet=~CAP_WAKE_ALARM
> +
> +LockPersonality=true
> +MemoryDenyWriteExecute=true
> +# Cannot enable this as it prevents transitioning to
> +# the confined SELinux virtlogd_t domain on execve
> +# unless we modify the policy to allow this.
> +#NoNewPrivileges=true
> +PrivateDevices=true
> +PrivateMounts=true
> +PrivateNetwork=true
> +# XXX someone could configure QEMU to log a serial port to an
> +# arbitrary directory, including /tmp, even if this is ill-advised
> +#PrivateTmp=true
> +# Not until oldest build target has systemd >= v245
> +#ProtectClock=true
> +ProtectControlGroups=true
> +# Not until oldest build target has systemd >= v241
> +#ProtectHostname=true
> +# Not until oldest build target has systemd >= v244
> +#ProtectKernelLogs=true
> +ProtectKernelModules=true
> +ProtectKernelTunables=true
> +# Not until oldest build target has systemd >= v247
> +#ProtectProc=invisible
> +ProtectSystem=full
> +RestrictAddressFamilies=AF_UNIX
> +RestrictNamespaces=~cgroup
> +RestrictNamespaces=~ipc
> +RestrictNamespaces=~mnt
> +RestrictNamespaces=~net
> +RestrictNamespaces=~pid
> +RestrictNamespaces=~user
> +RestrictNamespaces=~uts
> +RestrictRealtime=true
> +RestrictSUIDSGID=true
> +SystemCallArchitectures=native
> +SystemCallFilter=~@clock
> +SystemCallFilter=~@debug
> +SystemCallFilter=~@module
> +SystemCallFilter=~@mount
> +SystemCallFilter=~@raw-io
> +SystemCallFilter=~@reboot
> +SystemCallFilter=~@swap
> +SystemCallFilter=~@privileged
> +# Unfortunately we link to libnuma via libvirt.so which
> +# has a constructor that runs unconditionally that invokes
> +# set_mempolicy()
> +#SystemCallFilter=~@resources
> +SystemCallFilter=~@cpu-emulation
> +SystemCallFilter=~@obsolete
> +UMask=077
> +
>  [Install]
>  Also=virtlogd.socket
> --
> 2.41.0
>

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|