On 18/06/2024 at 11:51, Karel Zak wrote:
> Hi Laurent,
Hi Karel,
> On Tue, Jun 11, 2024 at 10:43:14AM +0200, Laurent Vivier wrote:
>> +*-l*, **--load-interp=**__file__::
>> +Load binfmt_misc definition in the namespace (implies *--mount-binfmt*).
>
> Is it actually a file, or does the argument have a more complex format? If there is something more, then it should be described here. It is fine to describe more about the interpreters in the man page.
You're right, the argument here is not actually a file; it defines how to use the file provided in the parameter as an interpreter.
We provide here what we will write in /proc/sys/fs/binfmt_misc/register and the format is described in https://www.kernel.org/doc/Documentation/admin-guide/binfmt-misc.rst:
"To actually register a new binary type, you have to set up a string looking like ``:name:type:offset:magic:mask:interpreter:flags``
[...]

- ``name`` is an identifier string. A new /proc file will be created with this name below ``/proc/sys/fs/binfmt_misc``
- ``type`` is the type of recognition. Give ``M`` for magic and ``E`` for extension.
- ``offset`` is the offset of the magic/mask in the file
- ``magic`` is the byte sequence binfmt_misc is matching for.
- ``mask`` is an (optional, defaults to all 0xff) mask.
- ``interpreter`` is the program that should be invoked with the binary as first argument
- ``flags`` is an optional field that controls several aspects of the invocation of the interpreter.

  ``P`` - preserve-argv[0]
      Legacy behavior of binfmt_misc is to overwrite the original argv[0] with the full path to the binary. When this flag is included, binfmt_misc will add an argument to the argument vector for this purpose, thus preserving the original ``argv[0]``.
  ``O`` - open-binary
      Legacy behavior of binfmt_misc is to pass the full path of the binary to the interpreter as an argument. When this flag is included, binfmt_misc will open the file for reading and pass its descriptor as an argument
  ``C`` - credentials
      Currently, the behavior of binfmt_misc is to calculate the credentials and security token of the new process according to the interpreter. When this flag is included, these attributes are calculated according to the binary
  ``F`` - fix binary
      The usual behaviour of binfmt_misc is to spawn the binary lazily when the misc format file is invoked. However, this doesn't work very well in the face of mount namespaces and changeroots, so the ``F`` mode opens the binary as soon as the emulation is installed and uses the opened image to spawn the emulator"
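To make the seven-field layout quoted above concrete, here is a minimal sketch in C that splits a registration string into its fields. It is only an illustration: `split_binfmt()` and `BINFMT_FIELDS` are hypothetical names that are not part of unshare.c or the kernel, the first character of the string is taken as the field separator, and rejecting anything after the flags field is a choice made for this sketch rather than a statement about what the kernel accepts.

```c
#include <string.h>

#define BINFMT_FIELDS 7	/* name, type, offset, magic, mask, interpreter, flags */

/*
 * Hypothetical helper (not from unshare.c): split a binfmt_misc
 * registration string in place. The first character of the string is
 * used as the field separator. On success, fields[0..6] point into
 * spec and 0 is returned; -1 is returned if the string does not
 * contain exactly seven fields.
 */
static int split_binfmt(char *spec, char *fields[BINFMT_FIELDS])
{
	char delim;
	char *p;
	int i;

	if (!spec || !spec[0])
		return -1;

	delim = spec[0];
	p = spec + 1;

	for (i = 0; i < BINFMT_FIELDS; i++) {
		char *end;

		fields[i] = p;
		if (i == BINFMT_FIELDS - 1)
			break;		/* last field runs to the end of the string */

		end = strchr(p, delim);
		if (!end)
			return -1;	/* too few fields */
		*end = '\0';
		p = end + 1;
	}

	/* This sketch rejects any extra separator after the flags field. */
	if (strchr(fields[BINFMT_FIELDS - 1], delim))
		return -1;

	return 0;
}
```

Applied to the string from the man page example, `fields[2]` (the offset) is empty, `fields[5]` is `/bin/qemu-ppc-static`, and `fields[6]` is `OCF`, which is where the `F` flag lives.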
>> +
>>  *--monotonic* _offset_::
>>  Set the offset of *CLOCK_MONOTONIC* which will be used in the entered time namespace. This option requires unsharing a time namespace with *--time*.
>> @@ -256,6 +259,13 @@ up 21 hours, 30 minutes
>>  up 9 years, 28 weeks, 1 day, 2 hours, 50 minutes
>>  ....
>> +The following example execute a chroot into the directory /chroot/powerpc/jessie and install the interpreter /bin/qemu-ppc-static to execute the powerpc binaries.
>> +If the interpreter is defined with the flag F, the interpreter is loaded before the chroot otherwise the interpreter is loaded from inside the chroot.
>> +
>> +....
>> +$ unshare --map-root-user --fork --pid --load-interp=":qemu-ppc:M::\\x7fELF\x01\\x02\\x01\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x02\\x00\\x14:\\xff\\xff\\xff\\xff\\xff\\xff\\xff\\x00\\xff\\xff\\xff\\xff\\xff\\xff\\xff\\xff\\xff\\xfe\\xff\\xff:/bin/qemu-ppc-static:OCF" --root=/chroot/powerpc/jessie /bin/bash -l
>> +....
>
> As an uneducated reader, I am confused by the flags. Where is the 'F' flag? Perhaps you could provide more explanation to make it easier for readers to understand.
I think this option should be used by an educated user who is aware of the binfmt_misc format. Do you want me to copy part of the binfmt_misc documentation into the unshare documentation?
>>  == AUTHORS
>>
>>  mailto:dottedmag@xxxxxxxxxxxxx[Mikhail Gusarov],
>> diff --git a/sys-utils/unshare.c b/sys-utils/unshare.c
>> index d79aa1125955..f8e1141840ca 100644
>> --- a/sys-utils/unshare.c
>> +++ b/sys-utils/unshare.c
>> @@ -725,6 +725,35 @@ static pid_t map_ids_from_child(int *fd, uid_t mapuser,
>>  	exit(EXIT_SUCCESS);
>>  }
>>
>> +static int is_fixed(const char *interp)
>> +{
>> +	const char *flags;
>> +
>> +	flags = strrchr(interp, ':');
>> +
>> +	return strchr(flags, 'F') != NULL;
>> +}
>> +
>> +static void load_interp(const char *binfmt_mnt, const char *interp)
>> +{
>> +	int dirfd, fd;
>> +
>> +	dirfd = open(binfmt_mnt, O_PATH | O_DIRECTORY);
>> +	if (dirfd < 0)
>> +		err(EXIT_FAILURE, _("cannot open %s"), binfmt_mnt);
>> +
>> +	fd = openat(dirfd, "register", O_WRONLY);
>> +	if (fd < 0)
>> +		err(EXIT_FAILURE, _("cannot open %s/register"), binfmt_mnt);
>> +
>> +	if (write_all(fd, interp, strlen(interp)))
>> +		err(EXIT_FAILURE, _("write failed %s/register"), binfmt_mnt);
>> +
>> +	close(fd);
>> +
>> +	close(dirfd);
>> +}
>> +
>>  static void __attribute__((__noreturn__)) usage(void)
>>  {
>>  	FILE *out = stdout;
>> @@ -772,6 +801,7 @@ static void __attribute__((__noreturn__)) usage(void)
>>  	fputs(_(" -G, --setgid <gid> set gid in entered namespace\n"), out);
>>  	fputs(_(" --monotonic <offset> set clock monotonic offset (seconds) in time namespaces\n"), out);
>>  	fputs(_(" --boottime <offset> set clock boottime offset (seconds) in time namespaces\n"), out);
>> +	fputs(_(" -l, --load-interp <file> load binfmt definition in the namespace (implies --mount-binfmt)\n"), out);
>>
>>  	fputs(USAGE_SEPARATOR, out);
>>  	fprintf(out, USAGE_HELP_OPTIONS(27));
>> @@ -830,6 +860,7 @@ int main(int argc, char *argv[])
>>  		{ "wd", required_argument, NULL, 'w' },
>>  		{ "monotonic", required_argument, NULL, OPT_MONOTONIC },
>>  		{ "boottime", required_argument, NULL, OPT_BOOTTIME },
>> +		{ "load-interp", required_argument, NULL, 'l' },
>>  		{ NULL, 0, NULL, 0 }
>>  	};
>>
>> @@ -846,6 +877,7 @@ int main(int argc, char *argv[])
>>  	const char *newroot = NULL;
>>  	const char *newdir = NULL;
>>  	pid_t pid_bind = 0, pid_idmap = 0;
>> +	const char *newinterp = NULL;
>>  	pid_t pid = 0;
>>  #ifdef UL_HAVE_PIDFD
>>  	int fd_parent_pid = -1;
>> @@ -868,7 +900,7 @@ int main(int argc, char *argv[])
>>  	textdomain(PACKAGE);
>>  	close_stdout_atexit();
>>
>> -	while ((c = getopt_long(argc, argv, "+fhVmuinpCTUrR:w:S:G:c", longopts, NULL)) != -1) {
>> +	while ((c = getopt_long(argc, argv, "+fhVmuinpCTUrR:w:S:G:cl:", longopts, NULL)) != -1) {
>>  		switch (c) {
>>  		case 'f':
>>  			forkit = 1;
>> @@ -1011,6 +1043,15 @@ int main(int argc, char *argv[])
>>  			boottime = strtos64_or_err(optarg, _("failed to parse boottime offset"));
>>  			force_boottime = 1;
>>  			break;
>> +		case 'l':
>> +			unshare_flags |= CLONE_NEWNS | CLONE_NEWUSER;
>> +			if (!binfmt_mnt) {
>> +				if (!procmnt)
>> +					procmnt = "/proc";
>> +				binfmt_mnt = _PATH_PROC_BINFMT_MISC;
>> +			}
>> +			newinterp = optarg;
>> +			break;
>>
>>  		case 'h':
>>  			usage();
>> @@ -1165,6 +1206,13 @@ int main(int argc, char *argv[])
>>  	if ((unshare_flags & CLONE_NEWNS) && propagation)
>>  		set_propagation(propagation);
>>
>> +	if (newinterp && is_fixed(newinterp)) {
>> +		if (mount("binfmt_misc", _PATH_PROC_BINFMT_MISC, "binfmt_misc",
>> +			  MS_NOSUID|MS_NOEXEC|MS_NODEV, NULL) != 0)
>> +			err(EXIT_FAILURE, _("mount %s failed"), _PATH_PROC_BINFMT_MISC);
>> +		load_interp(_PATH_PROC_BINFMT_MISC, newinterp);
>> +	}
>
> If I understand correctly, using --load-interp with 'F' calls mount(binfmt_misc) twice:
>
>   1) before chroot
>   2) after chroot() and after mount(/proc) (implies --mount-binfmt and --mount-proc too)
Yes, it's needed before the chroot to load the interpreter from the caller's filesystem. It's not needed after the chroot in this case; it's only there for consistency, so that it is also present in the chroot as requested on the command line. I think it can be removed if you prefer.
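Since the pre-chroot mount depends entirely on detecting `F` in the last field, here is a slightly more defensive sketch of that check. In the patch, `is_fixed()` passes the result of `strrchr()` straight to `strchr()`, and `strrchr()` returns NULL when the argument contains no ':' at all. The name `has_fix_binary_flag()` is hypothetical, and like the patch this sketch assumes ':' is the separator.

```c
#include <string.h>

/*
 * Hypothetical variant of is_fixed(): return 1 if the last
 * ':'-separated field of spec contains 'F', 0 otherwise. Unlike the
 * version in the patch, it returns 0 instead of dereferencing NULL
 * when spec is NULL or contains no ':'.
 */
static int has_fix_binary_flag(const char *spec)
{
	const char *flags;

	if (!spec)
		return 0;

	flags = strrchr(spec, ':');
	if (!flags)
		return 0;

	/* Skip the separator itself and look only at the flags field. */
	return strchr(flags + 1, 'F') != NULL;
}
```

With the example from the man page the flags field is `OCF`, so the check returns 1 and the interpreter would be registered before the chroot.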
> I believe it would be helpful to include this information in the man page.
I'll update the man page accordingly.

Thanks,
Laurent
> Karel