On 01.05.19 at 20:18, Jeff King wrote:
> On Wed, May 01, 2019 at 07:45:05PM +0200, René Scharfe wrote:
>
>>> But since the performance is still not quite on par with `gzip`, I
>>> would actually rather not, and really, just punt on that one, stating
>>> that people interested in higher performance should use `pigz`.
>>
>> Here are my performance numbers for generating .tar.gz files again:

OK, tried one more version, with pthreads (patch at the end).  Also
redid all measurements for better comparability; everything is faster
now for some reason (perhaps due to a compiler update? clang version
7.0.1-8 now):

master, using gzip(1):

  Benchmark #1: git archive --format=tgz HEAD
    Time (mean ± σ):     15.697 s ±  0.246 s    [User: 19.213 s, System: 0.386 s]
    Range (min … max):   15.405 s … 16.103 s    10 runs

using zlib sequentially:

  Benchmark #1: git archive --format=tgz HEAD
    Time (mean ± σ):     19.191 s ±  0.408 s    [User: 19.091 s, System: 0.100 s]
    Range (min … max):   18.802 s … 19.877 s    10 runs

using a gzip-lookalike:

  Benchmark #1: git archive --format=tgz HEAD
    Time (mean ± σ):     16.289 s ±  0.218 s    [User: 19.485 s, System: 0.337 s]
    Range (min … max):   16.020 s … 16.555 s    10 runs

using zlib with start_async:

  Benchmark #1: git archive --format=tgz HEAD
    Time (mean ± σ):     16.516 s ±  0.334 s    [User: 20.282 s, System: 0.383 s]
    Range (min … max):   16.166 s … 17.283 s    10 runs

using zlib in a separate thread (that's the new one):

  Benchmark #1: git archive --format=tgz HEAD
    Time (mean ± σ):     16.310 s ±  0.237 s    [User: 20.075 s, System: 0.173 s]
    Range (min … max):   15.983 s … 16.790 s    10 runs

> I think the start_async() one seems like a good option. It reclaims most
> of the (wall-clock) performance, isn't very much code, and doesn't leave
> any ugly user-visible traces.

The pthreads numbers look a bit better still.  The patch is huge though,
because it duplicates almost everything.
It was easier that way; a real patch series would first extract
functions that can be used both with static and allocated headers, and
keep everything in archive-tar.c.

> I'd be fine to see it come later, though, on top of the patches Dscho is
> sending. Even though changing to sequential zlib is technically a change
> in behavior, the existing behavior wasn't really planned. And given the
> wall-clock versus CPU time tradeoff, it's not entirely clear that one
> solution is better than the other.

The current behavior is not an accident; the synchronous method was
rejected in 2009 because it was slower [1].  I redid the measurements
with v1.6.5-rc0 and the old patch [2], but they would only compile with
gcc (Debian 8.3.0-6) for me, so the results are not directly comparable
to the numbers above:

v1.6.5-rc0:

  Benchmark #1: ../git/git-archive HEAD | gzip
    Time (mean ± σ):     16.051 s ±  0.486 s    [User: 19.514 s, System: 0.341 s]
    Range (min … max):   15.416 s … 17.001 s    10 runs

v1.6.5-rc0 + [2]:

  Benchmark #1: ../git/git-archive --format=tar.gz HEAD
    Time (mean ± σ):     19.684 s ±  0.374 s    [User: 19.601 s, System: 0.060 s]
    Range (min … max):   19.082 s … 20.177 s    10 runs

User time is still slightly higher, but the difference is in the noise.

[1] http://public-inbox.org/git/4AAAC8CE.8020302@xxxxxxxxxxxxxx/
[2] http://public-inbox.org/git/4AA97B61.6030301@xxxxxxxxxxxxxx/

>>> And who knows, maybe nobody will complain at all about the performance?
>>
>> Probably. And popular tarballs would be cached anyway, I guess.
>
> At GitHub we certainly do cache the git-archive output. We'd also be
> just fine with the sequential solution. We generally turn down
> pack.threads to 1, and keep our CPUs busy by serving multiple users
> anyway.
>
> So whatever has the lowest overall CPU time is generally preferable, but
> the times are close enough that I don't think we'd care much either way
> (and it's probably not worth having a config option or similar).
Moving back to 2009 and reducing the number of utilized cores both feel
weird, but the sequential solution *is* the most obvious, easiest and
(by a narrow margin) lightest one if gzip(1) is not an option anymore.

Anyway, the threading patch:

---
 Makefile      |   1 +
 archive-tar.c |  11 +-
 archive-tgz.c | 452 ++++++++++++++++++++++++++++++++++++++++++++++++++
 archive.h     |   4 +
 4 files changed, 465 insertions(+), 3 deletions(-)
 create mode 100644 archive-tgz.c

diff --git a/Makefile b/Makefile
index 8a7e235352..ed649ac18d 100644
--- a/Makefile
+++ b/Makefile
@@ -834,6 +834,7 @@ LIB_OBJS += alloc.o
 LIB_OBJS += apply.o
 LIB_OBJS += archive.o
 LIB_OBJS += archive-tar.o
+LIB_OBJS += archive-tgz.o
 LIB_OBJS += archive-zip.o
 LIB_OBJS += argv-array.o
 LIB_OBJS += attr.o
diff --git a/archive-tar.c b/archive-tar.c
index 3e53aac1e6..929eb58235 100644
--- a/archive-tar.c
+++ b/archive-tar.c
@@ -15,7 +15,9 @@
 static char block[BLOCKSIZE];
 static unsigned long offset;
 
-static int tar_umask = 002;
+int tar_umask = 002;
+
+static const char internal_gzip[] = "git archive gzip";
 
 static int write_tar_filter_archive(const struct archiver *ar,
 				    struct archiver_args *args);
@@ -445,6 +447,9 @@ static int write_tar_filter_archive(const struct archiver *ar,
 	if (!ar->data)
 		BUG("tar-filter archiver called with no filter defined");
 
+	if (!strcmp(ar->data, internal_gzip))
+		return write_tgz_archive(ar, args);
+
 	strbuf_addstr(&cmd, ar->data);
 	if (args->compression_level >= 0)
 		strbuf_addf(&cmd, " -%d", args->compression_level);
@@ -483,9 +488,9 @@ void init_tar_archiver(void)
 	int i;
 	register_archiver(&tar_archiver);
 
-	tar_filter_config("tar.tgz.command", "gzip -cn", NULL);
+	tar_filter_config("tar.tgz.command", internal_gzip, NULL);
 	tar_filter_config("tar.tgz.remote", "true", NULL);
-	tar_filter_config("tar.tar.gz.command", "gzip -cn", NULL);
+	tar_filter_config("tar.tar.gz.command", internal_gzip, NULL);
 	tar_filter_config("tar.tar.gz.remote", "true", NULL);
 	git_config(git_tar_config, NULL);
 	for (i = 0; i < nr_tar_filters; i++) {
diff --git a/archive-tgz.c b/archive-tgz.c
new file mode 100644
index 0000000000..ae219e1cc0
--- /dev/null
+++ b/archive-tgz.c
@@ -0,0 +1,452 @@
+#include "cache.h"
+#include "config.h"
+#include "tar.h"
+#include "archive.h"
+#include "object-store.h"
+#include "streaming.h"
+
+#define RECORDSIZE (512)
+#define BLOCKSIZE (RECORDSIZE * 20)
+
+static gzFile gzip;
+static size_t offset;
+
+/*
+ * This is the max value that a ustar size header can specify, as it is fixed
+ * at 11 octal digits. POSIX specifies that we switch to extended headers at
+ * this size.
+ *
+ * Likewise for the mtime (which happens to use a buffer of the same size).
+ */
+#if ULONG_MAX == 0xFFFFFFFF
+#define USTAR_MAX_SIZE ULONG_MAX
+#else
+#define USTAR_MAX_SIZE 077777777777UL
+#endif
+#if TIME_MAX == 0xFFFFFFFF
+#define USTAR_MAX_MTIME TIME_MAX
+#else
+#define USTAR_MAX_MTIME 077777777777ULL
+#endif
+
+static void tgz_write(const void *data, size_t size)
+{
+	const char *p = data;
+	while (size) {
+		size_t to_write = size;
+		if (to_write > UINT_MAX)
+			to_write = UINT_MAX;
+		if (gzwrite(gzip, p, to_write) != to_write)
+			die(_("gzwrite failed"));
+		p += to_write;
+		size -= to_write;
+		offset = (offset + to_write) % BLOCKSIZE;
+	}
+}
+
+static void tgz_finish_record(void)
+{
+	size_t tail = offset % RECORDSIZE;
+	if (tail) {
+		size_t to_seek = RECORDSIZE - tail;
+		if (gzseek(gzip, to_seek, SEEK_CUR) < 0)
+			die(_("gzseek failed"));
+		offset = (offset + to_seek) % BLOCKSIZE;
+	}
+}
+
+static void tgz_write_trailer(void)
+{
+	size_t to_seek = BLOCKSIZE - offset;
+	if (to_seek < 2 * RECORDSIZE)
+		to_seek += BLOCKSIZE;
+	if (gzseek(gzip, to_seek, SEEK_CUR) < 0)
+		die(_("gzseek failed"));
+	if (gzflush(gzip, Z_FINISH) != Z_OK)
+		die(_("gzflush failed"));
+}
+
+struct work_item {
+	void *buffer;
+	size_t size;
+	int finish_record;
+};
+
+#define TODO_SIZE 64
+struct work_item todo[TODO_SIZE];
+static int todo_start;
+static int todo_end;
+static int todo_done;
+static int all_work_added;
+static pthread_mutex_t tar_mutex;
+static pthread_t thread;
+
+static void tar_lock(void)
+{
+	pthread_mutex_lock(&tar_mutex);
+}
+
+static void tar_unlock(void)
+{
+	pthread_mutex_unlock(&tar_mutex);
+}
+
+static pthread_cond_t cond_add;
+static pthread_cond_t cond_write;
+static pthread_cond_t cond_result;
+
+static void add_work(void *buffer, size_t size, int finish_record)
+{
+	tar_lock();
+
+	while ((todo_end + 1) % ARRAY_SIZE(todo) == todo_done)
+		pthread_cond_wait(&cond_write, &tar_mutex);
+
+	todo[todo_end].buffer = buffer;
+	todo[todo_end].size = size;
+	todo[todo_end].finish_record = finish_record;
+
+	todo_end = (todo_end + 1) % ARRAY_SIZE(todo);
+
+	pthread_cond_signal(&cond_add);
+	tar_unlock();
+}
+
+static struct work_item *get_work(void)
+{
+	struct work_item *ret = NULL;
+
+	tar_lock();
+	while (todo_start == todo_end && !all_work_added)
+		pthread_cond_wait(&cond_add, &tar_mutex);
+
+	if (todo_start != todo_end || !all_work_added) {
+		ret = &todo[todo_start];
+		todo_start = (todo_start + 1) % ARRAY_SIZE(todo);
+	}
+	tar_unlock();
+	return ret;
+}
+
+static void work_done(void)
+{
+	tar_lock();
+	todo_done = (todo_done + 1) % ARRAY_SIZE(todo);
+	pthread_cond_signal(&cond_write);
+
+	if (all_work_added && todo_done == todo_end)
+		pthread_cond_signal(&cond_result);
+	tar_unlock();
+}
+
+static void *run(void *arg)
+{
+	for (;;) {
+		struct work_item *w = get_work();
+		if (!w)
+			break;
+		tgz_write(w->buffer, w->size);
+		free(w->buffer);
+		if (w->finish_record)
+			tgz_finish_record();
+		work_done();
+	}
+	return NULL;
+}
+
+static void start_output_thread(void)
+{
+	int err;
+
+	pthread_mutex_init(&tar_mutex, NULL);
+	pthread_cond_init(&cond_add, NULL);
+	pthread_cond_init(&cond_write, NULL);
+	pthread_cond_init(&cond_result, NULL);
+
+	memset(todo, 0, sizeof(todo));
+
+	err = pthread_create(&thread, NULL, run, NULL);
+	if (err)
+		die(_("failed to create thread: %s"), strerror(err));
+}
+
+static void wait_for_output_thread(void)
+{
+	tar_lock();
+	all_work_added = 1;
+
+	while (todo_done != todo_end)
+		pthread_cond_wait(&cond_result, &tar_mutex);
+
+	pthread_cond_broadcast(&cond_add);
+	tar_unlock();
+
+	pthread_join(thread, NULL);
+
+	pthread_mutex_destroy(&tar_mutex);
+	pthread_cond_destroy(&cond_add);
+	pthread_cond_destroy(&cond_write);
+	pthread_cond_destroy(&cond_result);
+}
+
+static int stream_blob(const struct object_id *oid)
+{
+	struct git_istream *st;
+	enum object_type type;
+	unsigned long sz;
+	ssize_t readlen;
+	size_t chunk_size = BLOCKSIZE * 10;
+
+	st = open_istream(oid, &type, &sz, NULL);
+	if (!st)
+		return error(_("cannot stream blob %s"), oid_to_hex(oid));
+	for (;;) {
+		char *buf = xmalloc(chunk_size);
+		readlen = read_istream(st, buf, chunk_size);
+		if (readlen <= 0)
+			break;
+		sz -= readlen;
+		add_work(buf, readlen, !sz);
+	}
+	close_istream(st);
+	return readlen;
+}
+
+/*
+ * pax extended header records have the format "%u %s=%s\n". %u contains
+ * the size of the whole string (including the %u), the first %s is the
+ * keyword, the second one is the value. This function constructs such a
+ * string and appends it to a struct strbuf.
+ */
+static void strbuf_append_ext_header(struct strbuf *sb, const char *keyword,
+				     const char *value, unsigned int valuelen)
+{
+	int len, tmp;
+
+	/* "%u %s=%s\n" */
+	len = 1 + 1 + strlen(keyword) + 1 + valuelen + 1;
+	for (tmp = len; tmp > 9; tmp /= 10)
+		len++;
+
+	strbuf_grow(sb, len);
+	strbuf_addf(sb, "%u %s=", len, keyword);
+	strbuf_add(sb, value, valuelen);
+	strbuf_addch(sb, '\n');
+}
+
+/*
+ * Like strbuf_append_ext_header, but for numeric values.
+ */
+static void strbuf_append_ext_header_uint(struct strbuf *sb,
+					  const char *keyword,
+					  uintmax_t value)
+{
+	char buf[40]; /* big enough for 2^128 in decimal, plus NUL */
+	int len;
+
+	len = xsnprintf(buf, sizeof(buf), "%"PRIuMAX, value);
+	strbuf_append_ext_header(sb, keyword, buf, len);
+}
+
+static unsigned int ustar_header_chksum(const struct ustar_header *header)
+{
+	const unsigned char *p = (const unsigned char *)header;
+	unsigned int chksum = 0;
+	while (p < (const unsigned char *)header->chksum)
+		chksum += *p++;
+	chksum += sizeof(header->chksum) * ' ';
+	p += sizeof(header->chksum);
+	while (p < (const unsigned char *)header + sizeof(struct ustar_header))
+		chksum += *p++;
+	return chksum;
+}
+
+static size_t get_path_prefix(const char *path, size_t pathlen, size_t maxlen)
+{
+	size_t i = pathlen;
+	if (i > 1 && path[i - 1] == '/')
+		i--;
+	if (i > maxlen)
+		i = maxlen;
+	do {
+		i--;
+	} while (i > 0 && path[i] != '/');
+	return i;
+}
+
+static void prepare_header(struct archiver_args *args,
+			   struct ustar_header *header,
+			   unsigned int mode, unsigned long size)
+{
+	xsnprintf(header->mode, sizeof(header->mode), "%07o", mode & 07777);
+	xsnprintf(header->size, sizeof(header->size), "%011"PRIoMAX,
+		  S_ISREG(mode) ? (uintmax_t)size : (uintmax_t)0);
+	xsnprintf(header->mtime, sizeof(header->mtime), "%011lo",
+		  (unsigned long) args->time);
+
+	xsnprintf(header->uid, sizeof(header->uid), "%07o", 0);
+	xsnprintf(header->gid, sizeof(header->gid), "%07o", 0);
+	strlcpy(header->uname, "root", sizeof(header->uname));
+	strlcpy(header->gname, "root", sizeof(header->gname));
+	xsnprintf(header->devmajor, sizeof(header->devmajor), "%07o", 0);
+	xsnprintf(header->devminor, sizeof(header->devminor), "%07o", 0);
+
+	memcpy(header->magic, "ustar", 6);
+	memcpy(header->version, "00", 2);
+
+	xsnprintf(header->chksum, sizeof(header->chksum), "%07o",
+		  ustar_header_chksum(header));
+}
+
+static void write_extended_header(struct archiver_args *args,
+				  const struct object_id *oid,
+				  struct strbuf *extended_header)
+{
+	size_t size;
+	char *buffer = strbuf_detach(extended_header, &size);
+	struct ustar_header *header = xcalloc(1, sizeof(*header));
+	unsigned int mode;
+	*header->typeflag = TYPEFLAG_EXT_HEADER;
+	mode = 0100666;
+	xsnprintf(header->name, sizeof(header->name), "%s.paxheader",
+		  oid_to_hex(oid));
+	prepare_header(args, header, mode, size);
+	add_work(header, sizeof(*header), 1);
+	add_work(buffer, size, 1);
+}
+
+static int write_tar_entry(struct archiver_args *args,
+			   const struct object_id *oid,
+			   const char *path, size_t pathlen,
+			   unsigned int mode)
+{
+	struct ustar_header *header = xcalloc(1, sizeof(*header));
+	struct strbuf ext_header = STRBUF_INIT;
+	unsigned int old_mode = mode;
+	unsigned long size, size_in_header;
+	void *buffer;
+	int err = 0;
+
+	if (S_ISDIR(mode) || S_ISGITLINK(mode)) {
+		*header->typeflag = TYPEFLAG_DIR;
+		mode = (mode | 0777) & ~tar_umask;
+	} else if (S_ISLNK(mode)) {
+		*header->typeflag = TYPEFLAG_LNK;
+		mode |= 0777;
+	} else if (S_ISREG(mode)) {
+		*header->typeflag = TYPEFLAG_REG;
+		mode = (mode | ((mode & 0100) ? 0777 : 0666)) & ~tar_umask;
+	} else {
+		return error(_("unsupported file mode: 0%o (SHA1: %s)"),
+			     mode, oid_to_hex(oid));
+	}
+	if (pathlen > sizeof(header->name)) {
+		size_t plen = get_path_prefix(path, pathlen,
+					      sizeof(header->prefix));
+		size_t rest = pathlen - plen - 1;
+		if (plen > 0 && rest <= sizeof(header->name)) {
+			memcpy(header->prefix, path, plen);
+			memcpy(header->name, path + plen + 1, rest);
+		} else {
+			xsnprintf(header->name, sizeof(header->name), "%s.data",
+				  oid_to_hex(oid));
+			strbuf_append_ext_header(&ext_header, "path",
+						 path, pathlen);
+		}
+	} else
+		memcpy(header->name, path, pathlen);
+
+	if (S_ISREG(mode) && !args->convert &&
+	    oid_object_info(args->repo, oid, &size) == OBJ_BLOB &&
+	    size > big_file_threshold)
+		buffer = NULL;
+	else if (S_ISLNK(mode) || S_ISREG(mode)) {
+		enum object_type type;
+		buffer = object_file_to_archive(args, path, oid, old_mode,
+						&type, &size);
+		if (!buffer)
+			return error(_("cannot read %s"), oid_to_hex(oid));
+	} else {
+		buffer = NULL;
+		size = 0;
+	}
+
+	if (S_ISLNK(mode)) {
+		if (size > sizeof(header->linkname)) {
+			xsnprintf(header->linkname, sizeof(header->linkname),
+				  "see %s.paxheader", oid_to_hex(oid));
+			strbuf_append_ext_header(&ext_header, "linkpath",
+						 buffer, size);
+		} else
+			memcpy(header->linkname, buffer, size);
+	}
+
+	size_in_header = size;
+	if (S_ISREG(mode) && size > USTAR_MAX_SIZE) {
+		size_in_header = 0;
+		strbuf_append_ext_header_uint(&ext_header, "size", size);
+	}
+
+	prepare_header(args, header, mode, size_in_header);
+
+	if (ext_header.len > 0) {
+		write_extended_header(args, oid, &ext_header);
+	}
+	add_work(header, sizeof(*header), 1);
+	if (S_ISREG(mode) && size > 0) {
+		if (buffer)
+			add_work(buffer, size, 1);
+		else
+			err = stream_blob(oid);
+	} else
+		free(buffer);
+	return err;
+}
+
+static void write_global_extended_header(struct archiver_args *args)
+{
+	const struct object_id *oid = args->commit_oid;
+	struct strbuf ext_header = STRBUF_INIT;
+	struct ustar_header *header = xcalloc(1, sizeof(*header));
+	unsigned int mode;
+	size_t size;
+	char *buffer;
+
+	if (oid)
+		strbuf_append_ext_header(&ext_header, "comment",
+					 oid_to_hex(oid),
+					 the_hash_algo->hexsz);
+	if (args->time > USTAR_MAX_MTIME) {
+		strbuf_append_ext_header_uint(&ext_header, "mtime",
+					      args->time);
+		args->time = USTAR_MAX_MTIME;
+	}
+
+	if (!ext_header.len)
+		return;
+
+	buffer = strbuf_detach(&ext_header, &size);
+	*header->typeflag = TYPEFLAG_GLOBAL_HEADER;
+	mode = 0100666;
+	xsnprintf(header->name, sizeof(header->name), "pax_global_header");
+	prepare_header(args, header, mode, size);
+	add_work(header, sizeof(*header), 1);
+	add_work(buffer, size, 1);
+}
+
+int write_tgz_archive(const struct archiver *ar, struct archiver_args *args)
+{
+	int level = args->compression_level;
+	int err = 0;
+
+	gzip = gzdopen(1, "wb");
+	if (!gzip)
+		return error(_("gzdopen failed"));
+	if (gzsetparams(gzip, level, Z_DEFAULT_STRATEGY) != Z_OK)
+		return error(_("unable to set compression level %d"), level);
+
+	start_output_thread();
+	write_global_extended_header(args);
+	err = write_archive_entries(args, write_tar_entry);
+	if (err)
+		return err;
+	wait_for_output_thread();
+	tgz_write_trailer();
+	return err;
+}
diff --git a/archive.h b/archive.h
index e60e3dd31c..b00afa1a9f 100644
--- a/archive.h
+++ b/archive.h
@@ -45,6 +45,10 @@ void init_tar_archiver(void);
 void init_zip_archiver(void);
 void init_archivers(void);
 
+int tar_umask;
+
+int write_tgz_archive(const struct archiver *ar, struct archiver_args *args);
+
 typedef int (*write_archive_entry_fn_t)(struct archiver_args *args,
 					const struct object_id *oid,
 					const char *path, size_t pathlen,
-- 
2.22.0