Following up on xfs and reflinks:
they appear to be enabled on my out-of-box RHEL9.0 install. Fwiw, this
is a VBox VM; however, so is the FC34 system, which works
correctly but is using btrfs.
As always, appreciate any help/references.
TIA
-Tom
[toma@localhost ~]$ xfs_info /
meta-data=""                     isize=512    agcount=4, agsize=4185600 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=1    bigtime=1 inobtcount=1
data     =                       bsize=4096   blocks=16742400, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=8175, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
[toma@localhost ~]$
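The reflink=1 flag in the output above can also be checked from a script. A minimal Python sketch, assuming the xfs_info text has already been captured as a string; the names parse_xfs_features and has_reflink are made up for illustration and are not part of xfsprogs:

```python
import re


def parse_xfs_features(xfs_info_text: str) -> dict:
    """Collect key=value pairs from xfs_info output into a dict.

    Keys that occur in several sections (bsize, sectsz, ...) keep the
    first value seen, so this is only meant for unique flags such as
    reflink or crc.
    """
    features = {}
    for key, value in re.findall(r"(\b[\w-]+)=([^\s,]+)", xfs_info_text):
        features.setdefault(key, value)
    return features


def has_reflink(xfs_info_text: str) -> bool:
    """True if the captured xfs_info text reports reflink=1."""
    return parse_xfs_features(xfs_info_text).get("reflink") == "1"
```

Note that this only says the filesystem was created with reflink support; it does not say whether a given copy actually used it.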
-------- Forwarded Message --------
Subject: Re: systemd-devel Digest, Vol 148, Issue 2
Date: Thu, 4 Aug 2022 11:22:32 -0400
From: Thomas Archambault <toma@xxxxxxxxxxxxxxxxx>
Reply-To: toma@xxxxxxxxxxxxxxxxx
To: systemd-devel-request@xxxxxxxxxxxxxxxxxxxxx
Thank you Lennart. Very much appreciate the quick and clear response.
You're absolutely correct about the btrfs/xfs difference between the working FC34 system and the problematic RHEL9.0 system:
/dev/mapper/rhel-root on / type xfs (rw,relatime,seclabel,attr2,inode64,logbufs=8,logbsize=32k,noquota)
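The backing filesystem can also be looked up from a script rather than read off the mount output. A rough Python sketch, assuming the /proc/self/mounts layout of "device mountpoint fstype options ..."; fs_type_of and root_fs_type are hypothetical helper names:

```python
from typing import Optional


def fs_type_of(mount_point: str, mounts_text: str) -> Optional[str]:
    """Return the filesystem type mounted at mount_point, or None.

    mounts_text is expected in /proc/self/mounts format. Later entries
    for the same mount point shadow earlier ones, so the last match wins.
    """
    fs_type = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[1] == mount_point:
            fs_type = fields[2]
    return fs_type


def root_fs_type() -> Optional[str]:
    """Filesystem type of / on the running Linux system."""
    with open("/proc/self/mounts") as fh:
        return fs_type_of("/", fh.read())
```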
My strace output did indicate that there is copying going on, but I did not know whether that was a problem or not. Obviously it can be, in terms of start-up time and UX w/xfs.
- Tom
On 8/4/22 08:00, systemd-devel-request@xxxxxxxxxxxxxxxxxxxxx wrote:
Send systemd-devel mailing list submissions to
systemd-devel@xxxxxxxxxxxxxxxxxxxxx
To subscribe or unsubscribe via the World Wide Web, visit
https://lists.freedesktop.org/mailman/listinfo/systemd-devel
or, via email, send a message with subject or body 'help' to
systemd-devel-request@xxxxxxxxxxxxxxxxxxxxx
You can reach the person managing the list at
systemd-devel-owner@xxxxxxxxxxxxxxxxxxxxx
When replying, please edit your Subject line so it is more specific
than "Re: Contents of systemd-devel digest..."
Today's Topics:
1. systemd-nspawn container not starting on RHEL9.0
(Thomas Archambault)
2. Re: systemd-nspawn container not starting on RHEL9.0
(Lennart Poettering)
----------------------------------------------------------------------
Message: 1
Date: Wed, 3 Aug 2022 15:40:21 -0400
From: Thomas Archambault <toma@xxxxxxxxxxxxxxxxx>
To: systemd-devel@xxxxxxxxxxxxxxxxxxxxx
Subject: systemd-nspawn container not starting on
RHEL9.0
Message-ID: <2d4567ae-f0e5-9e6a-10fe-9592498c6c6e@xxxxxxxxxxxxxxxxx>
Content-Type: text/plain; charset="utf-8"; Format="flowed"
Good day everyone on the dev list,
We are adding an analysis tool to our application that uses the host's
rootfs as one of its inputs.
As a proof of concept, we used systemd-nspawn on Fedora 34 to create an
isolated container environment using the host's rootfs as the
container's rootfs and things worked correctly and as expected. The
host's rootfs is analyzed with tmp and results files generated within
the container without persistent modifications affecting the host's
rootfs. Since RHEL is our ultimate target platform, I've been trying to
duplicate our work over RHEL9.0 without success with the container not
being instantiated.
I've tried to boil down the duplication code to the simplest example,
which is also an example in the man page $ sudo systemd-nspawn -xbD/. As
with my prototyping, the container does not seem to be instantiated.
Any help with troubleshooting, or specific known issues, or requests for
more data would be appreciated.
TIA
tparchambault
ps: Regarding security - selinux is in Permissive mode. I do not know if
seccomp filters are getting in the way or not. This is an out-of-the-box
RHEL9.0 base workstation install. In the FC34 prototype, I did need to
allow certain syscalls via --system-call-filter in order to get a daemon
within the container to run correctly, but afaik that should have no
bearing on the instantiation of the container.
==== On a RHEL9.0 host bash session ====
[toma@localhost ~]$ systemctl --version
systemd 250 (250-6.el9_0)
+PAM +AUDIT +SELINUX -APPARMOR +IMA +SMACK +SECCOMP +GCRYPT +GNUTLS
+OPENSSL +ACL +BLKID +CURL +ELFUTILS -FIDO2 +IDN2 -IDN -IPTC +KMOD
+LIBCRYPTSETUP +LIBFDISK +PCRE2 -PWQUALITY +P11KIT -QRENCODE +BZIP2 +LZ4
+XZ +ZLIB +ZSTD -BPF_FRAMEWORK +XKBCOMMON +UTMP +SYSVINIT
default-hierarchy=unified
[toma@localhost ~]$ uname -a
Linux localhost.localdomain 5.14.0-70.17.1.el9_0.x86_64 #1 SMP PREEMPT
Tue Jun 14 11:32:10 EDT 2022 x86_64 x86_64 x86_64 GNU/Linux
[toma@localhost ~]$
[toma@localhost ~]$ sudo time systemd-nspawn -D / -xb
^C^C^C^C^CCommand terminated by signal 15
40.81user 298.75system 6:29.72elapsed 87%CPU (0avgtext+0avgdata
8524maxresident)k
205032inputs+0outputs (0major+3287minor)pagefaults 0swaps
[toma@localhost ~]$
==== In another bash session on the same host ====
[toma@localhost ~]$ sudo machinectl list
[sudo] password for toma:
No machines.
[toma@localhost ~]$ sudo pkill nspawn
[toma@localhost ~]$
== In the original host bash session, w/increased logging and strace capture ==
[toma@localhost ~]$ sudo SYSTEMD_LOG_LEVEL=debug strace -o Development/nspawn.strace.rhel90.out systemd-nspawn -D / -xb
[sudo] password for toma:
Setting RLIMIT_CPU to infinity.
Setting RLIMIT_FSIZE to infinity.
Setting RLIMIT_DATA to infinity.
Setting RLIMIT_STACK to 8388608:infinity.
Setting RLIMIT_CORE to 0:infinity.
Setting RLIMIT_RSS to infinity.
Setting RLIMIT_NPROC to 14657.
Setting RLIMIT_NOFILE to 1024:524288.
Setting RLIMIT_MEMLOCK to 65536.
Setting RLIMIT_AS to infinity.
Setting RLIMIT_LOCKS to infinity.
Setting RLIMIT_SIGPENDING to 14657.
Setting RLIMIT_MSGQUEUE to 819200.
Setting RLIMIT_NICE to 0.
Setting RLIMIT_RTPRIO to 0.
Setting RLIMIT_RTTIME to infinity.
Found cgroup2 on /sys/fs/cgroup/, full unified hierarchy
Terminated
[toma@localhost ~]$
As with the first run, killed via pkill from the other terminal session.
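To get a rough idea of what nspawn was busy with before it was killed, the strace capture can be tallied by syscall name. A Python sketch, assuming the plain "strace -o" line format (an optional PID prefix, then "name(") without timestamps; tally_syscalls is a hypothetical helper. A large count of data-moving calls such as copy_file_range, sendfile, read or write would be consistent with a full copy of the tree, though that reading is an assumption and not something the capture above confirms:

```python
import re
from collections import Counter

# Optional PID column (present with "strace -f"), then the syscall name.
SYSCALL_RE = re.compile(r"^(?:\d+\s+)?(\w+)\(")


def tally_syscalls(strace_text: str) -> Counter:
    """Count how often each syscall name appears in strace output.

    Lines that do not start with "name(" (exit markers, resumed calls,
    signal notes) are skipped.
    """
    counts = Counter()
    for line in strace_text.splitlines():
        match = SYSCALL_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def top_syscalls(path: str, n: int = 10):
    """Read a strace output file and return its n most frequent syscalls."""
    with open(path, errors="replace") as fh:
        return tally_syscalls(fh.read()).most_common(n)
```

Calling top_syscalls on the file written by "strace -o" gives a list of (name, count) pairs, most frequent first.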
Fwiw, on Fedora 34, the log debug output shows the instantiation of the
container after the "Found cgroup2..." line, with the container working as
documented and eventually presenting the login prompt, i.e.
...
Setting RLIMIT_RTTIME to infinity.
Found cgroup2 on /sys/fs/cgroup/, full unified hierarchy
Spawning container fedora-1aabc34e0a52a82b on /.#machine.6e49b8aa974c6f37.
Press ^] three times within 1s to kill container.
Outer child is initializing.
Mounting / (MS_REC|MS_SLAVE "")...
...
[  OK  ] Finished Update UTMP about System Runlevel Changes.
Fedora 34 (Workstation Edition)
Kernel 5.11.12-300.fc34.x86_64 on an x86_64 (console)
fedora-1aabc34e0a52a82b login:
------------------------------
Message: 2
Date: Thu, 4 Aug 2022 09:30:26 +0200
From: Lennart Poettering <lennart@xxxxxxxxxxxxxx>
To: Thomas Archambault <toma@xxxxxxxxxxxxxxxxx>
Cc: systemd-devel@xxxxxxxxxxxxxxxxxxxxx
Subject: Re: systemd-nspawn container not starting on
RHEL9.0
Message-ID: <Yut1kq+IsLkSYdeg@gardel-login>
Content-Type: text/plain; charset=us-ascii
On Mi, 03.08.22 15:40, Thomas Archambault (toma@xxxxxxxxxxxxxxxxx) wrote:
> Good day everyone on the dev list,
> We are adding an analysis tool to our application that uses the host's
> rootfs as one of its inputs.
> As a proof of concept, we used systemd-nspawn on Fedora 34 to create an
> isolated container environment using the host's rootfs as the container's
> rootfs and things worked correctly and as expected. The host's rootfs is
> analyzed with tmp and results files generated within the container without
> persistent modifications affecting the host's rootfs. Since RHEL is our
> ultimate target platform, I've been trying to duplicate our work over
> RHEL9.0 without success with the container not being instantiated.
> I've tried to boil down the duplication code to the simplest example, which
> is also an example in the man page $ sudo systemd-nspawn -xbD/. As with my
> prototyping, the container does not seem to be instantiated.
> Any help with troubleshooting, or specific known issues, or requests for
> more data would be appreciated.

"-x" is ephemeral mode. This means nspawn will make a copy of the OS
tree before booting into it, and remove it afterwards.
"-x" on btrfs is very fast and space efficient, because btrfs supports
both snapshots and reflinks. nspawn will make a subvol snapshot if the
root you specify is a subvol. It will make reflink-based file copies
otherwise.
Other file systems have a more 1990's feature set, i.e. no reflinks
nor snapshots. (modern xfs on very new kernels can support reflinks if
this is opt-in'ed to.) In that case we have to copy the data files
with their contents, and that's slow.
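The difference between a reflink and a full data copy can be illustrated roughly as follows. This is a sketch only, not nspawn's actual code: it tries the Linux FICLONE ioctl and falls back to a byte-for-byte copy when the filesystem refuses. The FICLONE value is the one from <linux/fs.h> on x86_64 and is an assumption for other architectures; clone_or_copy and demo are hypothetical names.

```python
import fcntl
import os
import shutil
import tempfile

# _IOW(0x94, 9, int) from <linux/fs.h>; value assumed for x86_64.
FICLONE = 0x40049409


def clone_or_copy(src: str, dst: str) -> str:
    """Copy src to dst, sharing extents if the filesystem allows it.

    Returns "reflink" if the FICLONE ioctl succeeded, "copy" if the
    data had to be copied byte by byte instead.
    """
    with open(src, "rb") as fsrc, open(dst, "wb") as fdst:
        try:
            # Ask the kernel to make dst share src's data blocks.
            fcntl.ioctl(fdst.fileno(), FICLONE, fsrc.fileno())
            return "reflink"
        except OSError:
            # No reflink support here (or across these two files):
            # fall back to reading and writing the contents.
            shutil.copyfileobj(fsrc, fdst)
            return "copy"


def demo() -> str:
    """Copy a small temp file and report which path was taken."""
    payload = b"x" * 4096
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src")
        dst = os.path.join(tmp, "dst")
        with open(src, "wb") as fh:
            fh.write(payload)
        how = clone_or_copy(src, dst)
        with open(dst, "rb") as fh:
            assert fh.read() == payload
        return how
```

Running demo() inside a directory on the filesystem in question (by pointing tempfile at it) is one way to see which of the two paths that filesystem takes. The "copy" path costs time proportional to the amount of data, which for a whole OS tree is where the slowness would come from.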
Hence, what backing fs do you use?
If you use non-btrfs, it might simply be that we are busy
individually copying all files...
Lennart
--
Lennart Poettering, Berlin
------------------------------
End of systemd-devel Digest, Vol 148, Issue 2
*********************************************