Re: [PATCH] Please make the output of cache files reproducible

Hi Akira,

> Please mention about SOURCE_DATE_EPOCH at the section of the
> Environment variables in doc/fontconfig-user.sgml

Whoops, I somehow forgot this bit. Updated patch attached:

  commit 9213848ca27ccc3587a1a60539ec7c02fb02016f
  Author: Chris Lamb <chris@xxxxxxxxxxxxxxxx>
  Date:   Tue May 15 22:11:24 2018 +0200
  
      Ensure cache checksums are deterministic
      
      Whilst working on the Reproducible Builds[0] effort, we noticed that
      fontconfig generates unreproducible cache files.
      
      This is because fc-cache uses the modification timestamps of each
      directory in the "checksum" and "checksum_nano" members of the _FcCache
      struct. This is so that it can identify which cache files are valid
      and/or require regeneration.
      
      This patch changes the behaviour of the checksum calculations to prefer
      the value of the SOURCE_DATE_EPOCH[1] environment variable over the
      directory's own mtime. This variable can then be exported by build
      systems to ensure reproducible output.
      
      If SOURCE_DATE_EPOCH is not set or is newer than the mtime of the
      directory, the existing behaviour is unchanged.
      
      This work was sponsored by Tails[2].
      
       [0] https://reproducible-builds.org/
       [1] https://reproducible-builds.org/specs/source-date-epoch/
       [2] https://tails.boum.org/
  
   doc/fontconfig-user.sgml |  6 ++++-
   src/fccache.c            | 58 +++++++++++++++++++++++++++++++++++++++++++-----
   2 files changed, 57 insertions(+), 7 deletions(-)
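
For anyone skimming the thread, the selection rule described in the commit
message above can be sketched as a standalone function. This is only an
illustration, not the code in the patch: sketch_dir_checksum and
sketch_examples are hypothetical names, and both resetting errno before
strtoull and bounding the value by INT_MAX are assumptions of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <time.h>

/*
 * Hypothetical helper, not part of fontconfig: choose the checksum for a
 * directory from its mtime and the raw SOURCE_DATE_EPOCH string (NULL when
 * the variable is unset). Invalid input silently falls back to the mtime.
 */
static int
sketch_dir_checksum (time_t mtime, const char *source_date_epoch)
{
    int ret = (int) mtime;
    char *endptr;
    unsigned long long epoch;

    if (source_date_epoch == NULL)
        return ret;             /* variable unset: keep the mtime */

    errno = 0;                  /* strtoull only sets errno on failure */
    epoch = strtoull (source_date_epoch, &endptr, 10);

    if (endptr == source_date_epoch || *endptr != '\0')
        return ret;             /* empty, non-numeric or trailing garbage */
    if (errno != 0 || epoch > INT_MAX)
        return ret;             /* does not fit the int checksum (assumed bound) */

    /* Only override when the directory is newer than the epoch. */
    if ((long long) epoch < (long long) ret)
        ret = (int) epoch;

    return ret;
}

/* A few cases showing which value wins. */
static void
sketch_examples (void)
{
    assert (sketch_dir_checksum (200, NULL) == 200);     /* unset */
    assert (sketch_dir_checksum (200, "100") == 100);    /* epoch is older */
    assert (sketch_dir_checksum (200, "300") == 200);    /* epoch is newer */
    assert (sketch_dir_checksum (200, "100abc") == 200); /* trailing garbage */
    assert (sketch_dir_checksum (200, "") == 200);       /* empty string */
}
```

In a build system this corresponds to exporting SOURCE_DATE_EPOCH before
running fc-cache, so that every directory newer than the epoch gets the same
checksum on each build.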


You can also merge from the "864082-FcConfigGetFontDirs" branch of
https://github.com/lamby/fontconfig if that is more convenient.


Best wishes,

-- 
      ,''`.
     : :'  :     Chris Lamb
     `. `'`      lamby@xxxxxxxxxx / chris-lamb.co.uk
       `-
From 9213848ca27ccc3587a1a60539ec7c02fb02016f Mon Sep 17 00:00:00 2001
From: Chris Lamb <chris@xxxxxxxxxxxxxxxx>
Date: Tue, 15 May 2018 22:11:24 +0200
Subject: [PATCH] Ensure cache checksums are deterministic

Whilst working on the Reproducible Builds[0] effort, we noticed that
fontconfig generates unreproducible cache files.

This is because fc-cache uses the modification timestamps of each
directory in the "checksum" and "checksum_nano" members of the _FcCache
struct. This is so that it can identify which cache files are valid
and/or require regeneration.

This patch changes the behaviour of the checksum calculations to prefer
the value of the SOURCE_DATE_EPOCH[1] environment variable over the
directory's own mtime. This variable can then be exported by build
systems to ensure reproducible output.

If SOURCE_DATE_EPOCH is not set or is newer than the mtime of the
directory, the existing behaviour is unchanged.

This work was sponsored by Tails[2].

 [0] https://reproducible-builds.org/
 [1] https://reproducible-builds.org/specs/source-date-epoch/
 [2] https://tails.boum.org/
---
 doc/fontconfig-user.sgml |  6 ++++-
 src/fccache.c            | 58 +++++++++++++++++++++++++++++++++++-----
 2 files changed, 57 insertions(+), 7 deletions(-)

diff --git a/doc/fontconfig-user.sgml b/doc/fontconfig-user.sgml
index 43ac957..d40d60b 100644
--- a/doc/fontconfig-user.sgml
+++ b/doc/fontconfig-user.sgml
@@ -798,10 +798,14 @@ is used to specify the default language as the weak binding in the query. if thi
 <emphasis>FONTCONFIG_USE_MMAP</emphasis>
 is used to control the use of mmap(2) for the cache files if available. this take a boolean value. fontconfig will checks if the cache files are stored on the filesystem that is safe to use mmap(2). explicitly setting this environment variable will causes skipping this check and enforce to use or not use mmap(2) anyway.
   </para>
+  <para>
+<emphasis>SOURCE_DATE_EPOCH</emphasis>
+is used to ensure <literal>fc-cache(1)</literal> generates files in a deterministic manner in order to support reproducible builds. When set to a numeric representation of a UNIX timestamp, fontconfig will prefer this value over using the modification timestamps of the input files in order to identify which cache files require regeneration. If <literal>SOURCE_DATE_EPOCH</literal> is not set (or is newer than the mtime of the directory), the existing behaviour is unchanged.
+  </para>
 </refsect1>
 <refsect1><title>See Also</title>
   <para>
-fc-cat(1), fc-cache(1), fc-list(1), fc-match(1), fc-query(1)
+fc-cat(1), fc-cache(1), fc-list(1), fc-match(1), fc-query(1), <ulink url="https://reproducible-builds.org/specs/source-date-epoch/">SOURCE_DATE_EPOCH</ulink>.
   </para>
 </refsect1>
 <refsect1><title>Version</title>
diff --git a/src/fccache.c b/src/fccache.c
index 7abb750..6318135 100644
--- a/src/fccache.c
+++ b/src/fccache.c
@@ -989,6 +989,54 @@ FcDirCacheLoadFile (const FcChar8 *cache_file, struct stat *file_stat)
     return cache;
 }
 
+static int
+FcDirChecksum (struct stat *statb)
+{
+    int			ret = (int) statb->st_mtime;
+    char		*endptr;
+    char		*source_date_epoch;
+    unsigned long long	epoch;
+
+    source_date_epoch = getenv("SOURCE_DATE_EPOCH");
+    if (source_date_epoch) {
+	epoch = strtoull(source_date_epoch, &endptr, 10);
+
+	if (endptr == source_date_epoch)
+	    fprintf (stderr,
+		     "Fontconfig: SOURCE_DATE_EPOCH invalid\n");
+	else if ((errno == ERANGE && (epoch == ULLONG_MAX || epoch == 0))
+		|| (errno != 0 && epoch == 0))
+	    fprintf (stderr,
+		     "Fontconfig: SOURCE_DATE_EPOCH: strtoull: %s: %llu\n",
+		     strerror(errno), epoch);
+	else if (*endptr != '\0')
+	    fprintf (stderr,
+		     "Fontconfig: SOURCE_DATE_EPOCH has trailing garbage\n");
+	else if (epoch > ULONG_MAX)
+	    fprintf (stderr,
+		     "Fontconfig: SOURCE_DATE_EPOCH must be <= %lu but saw: %llu\n",
+		     ULONG_MAX, epoch);
+	else if (epoch < ret)
+	    /* Only override if directory is newer */
+	    ret = (int) epoch;
+    }
+
+    return ret;
+}
+
+static int64_t
+FcDirChecksumNano (struct stat *statb)
+{
+#ifdef HAVE_STRUCT_STAT_ST_MTIM
+    /* No nanosecond component to parse */
+    if (getenv("SOURCE_DATE_EPOCH"))
+	return 0;
+    return statb->st_mtim.tv_nsec;
+#else
+    return 0;
+#endif
+}
+
 /*
  * Validate a cache file by reading the header and checking
  * the magic number and the size field
@@ -1007,10 +1055,10 @@ FcDirCacheValidateHelper (FcConfig *config, int fd, struct stat *fd_stat, struct
 	ret = FcFalse;
     else if (fd_stat->st_size != c.size)
 	ret = FcFalse;
-    else if (c.checksum != (int) dir_stat->st_mtime)
+    else if (c.checksum != FcDirChecksum (dir_stat))
 	ret = FcFalse;
 #ifdef HAVE_STRUCT_STAT_ST_MTIM
-    else if (c.checksum_nano != dir_stat->st_mtim.tv_nsec)
+    else if (c.checksum_nano != FcDirChecksumNano (dir_stat))
 	ret = FcFalse;
 #endif
     return ret;
@@ -1086,10 +1134,8 @@ FcDirCacheBuild (FcFontSet *set, const FcChar8 *dir, struct stat *dir_stat, FcSt
     cache->magic = FC_CACHE_MAGIC_ALLOC;
     cache->version = FC_CACHE_VERSION_NUMBER;
     cache->size = serialize->size;
-    cache->checksum = (int) dir_stat->st_mtime;
-#ifdef HAVE_STRUCT_STAT_ST_MTIM
-    cache->checksum_nano = dir_stat->st_mtim.tv_nsec;
-#endif
+    cache->checksum = FcDirChecksum (dir_stat);
+    cache->checksum_nano = FcDirChecksumNano (dir_stat);
 
     /*
      * Serialize directory name
-- 
2.17.0

