Re: [PATCH v2 1/1] mke2fs: the -d option can now handle tarball input

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Aug 12, 2023, at 9:02 AM, Johannes Schauer Marin Rodrigues <josch@xxxxxxxxxxxxxxxx> wrote:
> 
> If archive.h is available during compilation, enable mke2fs to read a
> tarball as input. Since libarchive.so.13 is opened with dlopen,
> libarchive is not a hard library dependency of the resulting binary.
> 
> This enables the creation of filesystems containing files which would
> otherwise need superuser privileges to create (like device nodes, which
> are also not allowed in unshared user namespaces). By reading from
> standard input when the filename is a dash (-), mke2fs can be used as
> part of a shell pipeline without temporary files.
> 
> A round-trip from tarball to ext4 to tarball yields bit-by-bit identical
> results.
> 
> Signed-off-by: Johannes Schauer Marin Rodrigues <josch@xxxxxxxxxxxxxxxx>

Hi Johannes,
thanks for the updated patch.  Review comments inline.

I think this looks useful, and the code is now isolated enough that it
could be easily managed in the future.  It is looking in reasonable shape,
but could use another round of updates to make the patch a bit cleaner.

The big question is whether Ted will accept it.

Cheers, Andreas


> diff --git a/MCONFIG.in b/MCONFIG.in
> index 82c75a28..cb3ec759 100644
> --- a/MCONFIG.in
> +++ b/MCONFIG.in
> @@ -141,6 +141,7 @@ LIBFUSE = @FUSE_LIB@
> LIBSUPPORT = $(LIBINTL) $(LIB)/libsupport@STATIC_LIB_EXT@
> LIBBLKID = @LIBBLKID@ @PRIVATE_LIBS_CMT@ $(LIBUUID)
> LIBINTL = @LIBINTL@
> +LIBARCHIVE = @ARCHIVE_LIB@
> SYSLIBS = @LIBS@ @PTHREAD_LIBS@
> DEPLIBSS = $(LIB)/libss@LIB_EXT@
> DEPLIBCOM_ERR = $(LIB)/libcom_err@LIB_EXT@
> diff --git a/configure b/configure
> index 72c39b4d..27b97c3b 100755
> --- a/configure
> +++ b/configure
> @@ -704,6 +704,7 @@ SEM_INIT_LIB
> FUSE_CMT
> FUSE_LIB
> CLOCK_GETTIME_LIB
> +ARCHIVE_LIB
> MAGIC_LIB
> SOCKET_LIB
> SIZEOF_TIME_T

I think this part of the configure change is correct, to add ARCHIVE_LIB.

> @@ -8678,7 +8679,7 @@ else $as_nop
>       *-*-aix*)
>         cat confdefs.h - <<_ACEOF >conftest.$ac_ext
> /* end confdefs.h.  */
> -#if defined __powerpc64__ || defined __LP64__
> +#if defined __powerpc64__ || defined _ARCH_PPC64
>                 int ok;
>                #else
>                 error fail
> @@ -8969,7 +8970,7 @@ rm -f core conftest.err conftest.$ac_objext conftest.beam conftest.$ac_ext
>            # be generating 64-bit code.
>            cat confdefs.h - <<_ACEOF >conftest.$ac_ext
> /* end confdefs.h.  */
> -#if defined __powerpc64__ || defined __LP64__
> +#if defined __powerpc64__ || defined _ARCH_PPC64
>                    int ok;
>                   #else
>                    error fail

These parts of the change is specific to your system and should probably
be removed from the patch.

[snip some similar random changes]

> @@ -13539,6 +13522,57 @@ if test "$ac_cv_func_dlopen" = yes ; then
>    MAGIC_LIB=$DLOPEN_LIB
> fi
> 
> +{ printf "%s\n" "$as_me:${as_lineno-$LINENO}: checking for archive_read_new in -larchive" >&5
> +printf %s "checking for archive_read_new in -larchive... " >&6; }
> +if test ${ac_cv_lib_archive_archive_read_new+y}
> +then :
> +  printf %s "(cached) " >&6
> +else $as_nop
> +  ac_check_lib_save_LIBS=$LIBS
> +LIBS="-larchive  $LIBS"
> +cat confdefs.h - <<_ACEOF >conftest.$ac_ext
> +/* end confdefs.h.  */
> +
> +/* Override any GCC internal prototype to avoid an error.
> +   Use char because int might match the return type of a GCC
> +   builtin and then its argument prototype would still apply.  */
> +char archive_read_new ();
> +int
> +main (void)
> +{
> +return archive_read_new ();
> +  ;
> +  return 0;
> +}
> +_ACEOF
> +if ac_fn_c_try_link "$LINENO"
> +then :
> +  ac_cv_lib_archive_archive_read_new=yes
> +else $as_nop
> +  ac_cv_lib_archive_archive_read_new=no
> +fi
> +rm -f core conftest.err conftest.$ac_objext conftest.beam \
> +    conftest$ac_exeext conftest.$ac_ext
> +LIBS=$ac_check_lib_save_LIBS
> +fi
> +{ printf "%s\n" "$as_me:${as_lineno-$LINENO}: result: $ac_cv_lib_archive_archive_read_new" >&5
> +printf "%s\n" "$ac_cv_lib_archive_archive_read_new" >&6; }
> +if test "x$ac_cv_lib_archive_archive_read_new" = xyes
> +then :
> +  ARCHIVE_LIB=-larchive
> +ac_fn_c_check_header_compile "$LINENO" "archive.h" "ac_cv_header_archive_h" "$ac_includes_default"
> +if test "x$ac_cv_header_archive_h" = xyes
> +then :
> +  printf "%s\n" "#define HAVE_ARCHIVE_H 1" >>confdefs.h
> +
> +fi
> +
> +fi
> +
> +if test "$ac_cv_func_dlopen" = yes ; then
> +   ARCHIVE_LIB=$DLOPEN_LIB
> +fi
> +
> { printf "%s\n" "$as_me:${as_lineno-$LINENO}: checking for clock_gettime in -lrt" >&5
> printf %s "checking for clock_gettime in -lrt... " >&6; }
> if test ${ac_cv_lib_rt_clock_gettime+y}

This is the part specific to the ARCHIVE_LIB check, which is fine.

> @@ -14803,11 +14837,11 @@ if test x$ac_prog_cxx_stdcxx = xno
> then :
>   { printf "%s\n" "$as_me:${as_lineno-$LINENO}: checking for $CXX option to enable C++11 features" >&5
> printf %s "checking for $CXX option to enable C++11 features... " >&6; }
> -if test ${ac_cv_prog_cxx_11+y}
> +if test ${ac_cv_prog_cxx_cxx11+y}
> then :
>   printf %s "(cached) " >&6
> else $as_nop
> -  ac_cv_prog_cxx_11=no
> +  ac_cv_prog_cxx_cxx11=no
> ac_save_CXX=$CXX
> cat confdefs.h - <<_ACEOF >conftest.$ac_ext
> /* end confdefs.h.  */
> @@ -14849,11 +14883,11 @@ if test x$ac_prog_cxx_stdcxx = xno
> then :
>   { printf "%s\n" "$as_me:${as_lineno-$LINENO}: checking for $CXX option to enable C++98 features" >&5
> printf %s "checking for $CXX option to enable C++98 features... " >&6; }
> -if test ${ac_cv_prog_cxx_98+y}
> +if test ${ac_cv_prog_cxx_cxx98+y}
> then :
>   printf %s "(cached) " >&6
> else $as_nop
> -  ac_cv_prog_cxx_98=no
> +  ac_cv_prog_cxx_cxx98=no
> ac_save_CXX=$CXX
> cat confdefs.h - <<_ACEOF >conftest.$ac_ext
> /* end confdefs.h.  */
> @@ -15193,7 +15227,7 @@ fi
> 
> 
> if test $pkg_failed = yes; then
> -   	{ printf "%s\n" "$as_me:${as_lineno-$LINENO}: result: no" >&5
> +        { printf "%s\n" "$as_me:${as_lineno-$LINENO}: result: no" >&5
> printf "%s\n" "no" >&6; }
> 
> if $PKG_CONFIG --atleast-pkgconfig-version 0.20; then

These are again specific to your system or configure version and should
be removed from the patch.

[snip similar random changes and boilerplate Makefile changes]

> diff --git a/misc/create_inode.c b/misc/create_inode.c
> index a3a34cd9..6382f17e 100644
> --- a/misc/create_inode.c
> +++ b/misc/create_inode.c
> @@ -1069,14 +1067,49 @@ errcode_t populate_fs2(ext2_filsys fs, ext2_ino_t parent_ino,
> 	file_info.path_max_len = 255;
> 	file_info.path = calloc(file_info.path_max_len, 1);
> 
> -	retval = set_inode_xattr(fs, root, source_dir);
> +	/* interpret input as tarball either if it's "-" (stdin) or if it's
> +	 * a regular file (or a symlink pointing to a regular file)
> +	 */
> +	if (strcmp(source, "-") == 0) {
> +#ifdef HAVE_ARCHIVE_H
> +		retval = __populate_fs_from_tar(fs, parent_ino, NULL, root, &hdlinks,
> +					   &file_info, fs_callbacks);
> +#else
> +		com_err(__func__, 0,
> +			_("you need to compile e2fsprogs with libarchive to "
> +			  "be able to process tarballs"));
> +		retval = 1;
> +#endif

Rather than having an inline #ifdef here, this could be structured like
the following in create_file_libarchive.c:

#ifdef HAVE_ARCHIVE_H
errcode_t __populate_fs_from_tar(...)
{
	:
	[ proper implementation ]
}
#else
errcode_t __populate_fs_from_tar(...)
{
	com_err(__func__, 0, _("you need to compile ..."));

	return 1;
}
#endif

and then the code here only calls:

		retval = __populate_fs_from_tar(fs, ...);

and the code is not complicated depending on if it is compiled in or not.

> +	if (S_ISREG(st.st_mode)) {
> +#ifdef HAVE_ARCHIVE_H
> +		retval = __populate_fs_from_tar(fs, parent_ino, source, root, &hdlinks,
> +					   &file_info, fs_callbacks);
> +#else
> +		com_err(__func__, 0,
> +			_("you need to compile e2fsprogs with libarchive to be "
> +			  "able to process tarballs"));
> +		retval = 1;
> +#endif

This would be handled similarly.

> diff --git a/misc/create_inode_libarchive.c b/misc/create_inode_libarchive.c
> new file mode 100644
> index 00000000..26c3b4dc
> --- /dev/null
> +++ b/misc/create_inode_libarchive.c
> @@ -0,0 +1,677 @@
> +/*
> + * create_inode_libarchive.c --- create an inode from libarchive input
> + *
> + * Copyright (C) 2023 Johannes Schauer Marin Rodrigues <josch@xxxxxxxxxx>
> + *
> + * %Begin-Header%
> + * This file may be redistributed under the terms of the GNU library
> + * General Public License, version 2.
> + * %End-Header%
> + */
> +
> +#define _FILE_OFFSET_BITS 64
> +#define _LARGEFILE64_SOURCE 1
> +#define _GNU_SOURCE 1
> +
> +#include "config.h"
> +
> +#ifdef HAVE_ARCHIVE_H
> +
> +#include <ext2fs/ext2_types.h>
> +
> +#include "create_inode.h"
> +#include "support/nls-enable.h"
> +
> +/* 64KiB is the minimum blksize to best minimize system call overhead. */
> +#define COPY_FILE_BUFLEN 65536

If we expect larger files, having a 1MB or 16MB buffer is not outrageous
these days, and would probably improve the performance noticeably.

[snip]

> +/* Rounds qty up to a multiple of siz. siz should be a power of 2
> + */
> +static inline unsigned int __rndup(unsigned int qty, unsigned int siz)

I'm sure we can afford a few extra "e" here and write "size", and write
round_up() to make this code more readable...

> +{
> +	return (qty + (siz - 1)) & ~(siz - 1);
> +}
> +

[snip]

> +errcode_t __populate_fs_from_tar(ext2_filsys fs, ext2_ino_t root_ino,
> +				 const char *source_tar, ext2_ino_t root,
> +				 struct hdlinks_s *hdlinks,
> +				 struct file_info *target,
> +				 struct fs_ops_callbacks *fs_callbacks)
> +{
> +	char *path2, *path3, *dir, *name;
> +	unsigned int uid, gid, mode;
> +	unsigned long ctime, mtime;
> +	struct archive *a;
> +	struct archive_entry *entry;
> +	errcode_t retval = 0;
> +	locale_t archive_locale;
> +	locale_t old_locale;
> +	ext2_ino_t dirinode, tmpino;
> +	const struct stat *st;
> +
> +	if (!libarchive_available()) {
> +		com_err(__func__, 0,
> +			_("you need libarchive to be able to process tarballs"));
> +		return 1;
> +	}

[snip]

> +out:
> +	dl_archive_read_close(a);
> +	dl_archive_read_free(a);
> +	uselocale(old_locale);
> +	freelocale(archive_locale);
> +	return retval;
> +}

The "not supported" code would go here:

#else

errcode_t __populate_fs_from_tar(...)
{
	com_err(__func__, 0, _("you need to compile ..."));

	return 1;
}

> +#endif

[snip]


Cheers, Andreas





Attachment: signature.asc
Description: Message signed with OpenPGP


[Index of Archives]     [Reiser Filesystem Development]     [Ceph FS]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux FS]     [Yosemite National Park]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Device Mapper]     [Linux Media]

  Powered by Linux