From: Lars Schneider <larsxschneider@xxxxxxxxx> The goal of this series is to avoid launching a new clean/smudge filter process for each file that is filtered. A short summary about v1 to v5 can be found here: https://git.github.io/rev_news/2016/08/17/edition-18/ This series is also published on web: https://github.com/larsxschneider/git/pull/11 Thanks a lot to Stefan, Junio, Jakub (Narebski and Keller), Peff, and Torsten for very helpful reviews, Lars ## Major changes since v6 * series rebased to Git v2.10.0 * simplified and improved multi-packet test * v2 filter takes precedence over v1 filter * rename "error-all" to "abort" * patch 07/13 "pack-protocol: fix maximum pkt-line size" submitted separately: http://public-inbox.org/git/20160829175509.69050-1-larsxschneider@xxxxxxxxx/ * removed "10/13 convert: generate large test files only once" * patch 13/13 "read-cache: make sure file handles are not inherited by child processes" moved to an own series: http://public-inbox.org/git/20160905211111.72956-1-larsxschneider@xxxxxxxxx/ ## All changes since v6 ### Stefan * http://public-inbox.org/git/CAGZ79kaJn-yf28+Ls2JyqSs6Jt-Oj2JW-sAXQn-Oe5xaxSyjMQ@xxxxxxxxxxxxxx/ * submit patch "pack-protocol: fix maximum pkt-line size" separately: http://public-inbox.org/git/20160829175509.69050-1-larsxschneider@xxxxxxxxx/ * http://public-inbox.org/git/CAGZ79kbEoYVM_+MWkajT2pHN1OKZXATe5d+Dv_uSJ7dyPGxUeg@xxxxxxxxxxxxxx/ * trace failures in new *_gently() pkt-line functions ### Junio * http://public-inbox.org/git/xmqq8tvkle6o.fsf@xxxxxxxxxxxxxxxxxxxxxxxxxxx/ * share code between packet_write_fmt() and packet_write_fmt_gently() * http://public-inbox.org/git/xmqq4m68ldrg.fsf@xxxxxxxxxxxxxxxxxxxxxxxxxxx/ * remove "Second, it will..." from "pkt-line: add packet_write_gently()" commit message * use separate buffer in write_packetized_from_fd() * fix write order in packet_write_gently() * http://public-inbox.org/git/xmqqzio0jxh7.fsf@xxxxxxxxxxxxxxxxxxxxxxxxxxx/ * rename read/write flush terminated packet streams functions * remove meaningless "bytes_to_write > ..." check * reuse packet_read() function in read_packetized_to_buf() * http://public-inbox.org/git/xmqqk2f3fgbd.fsf@xxxxxxxxxxxxxxxxxxxxxxxxxxx/ * revert test_config replacement in test setup * http://public-inbox.org/git/xmqq37lncvj6.fsf@xxxxxxxxxxxxxxxxxxxxxxxxxxx/ * improve multi-packet test * http://public-inbox.org/git/xmqqzinv6wtg.fsf@xxxxxxxxxxxxxxxxxxxxxxxxxxx/ * give v2 filter precedence over v1 filter * http://public-inbox.org/git/xmqqmvjt3nht.fsf@xxxxxxxxxxxxxxxxxxxxxxxxxxx/ * explain that "error" behavior mimics v1 filter * explain that Git closes the pipe on exit to allow the filter a graceful exit ### Jakub * http://public-inbox.org/git/7fbd02a1-ad62-7391-df2a-835aae8a62a7@xxxxxxxxx/ * rename "error-all" to "abort" * add a simple Perl v2 filter example in contrib ## Interdiff (v6..v7) diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt index df059e9..ac000ea 100644 --- a/Documentation/gitattributes.txt +++ b/Documentation/gitattributes.txt @@ -470,20 +470,11 @@ packet: git< status=error\n packet: git< 0000 ------------------------ -In case the filter cannot or does not want to process the content -as well as any future content for the lifetime of the Git process, -it is expected to respond with an "error-all" status. Depending on -the `filter.<driver>.required` flag Git will interpret that as error -but it will not stop or restart the filter process. ------------------------- -packet: git< status=error-all\n -packet: git< 0000 ------------------------- - If the filter experiences an error during processing, then it can -send the status "error". Depending on the `filter.<driver>.required` -flag Git will interpret that as error but it will not stop or restart -the filter process. +send the status "error" after the content was (partially or +completely) sent. Depending on the `filter.<driver>.required` flag +Git will interpret that as error but it will not stop or restart the +filter process. ------------------------ packet: git< status=success\n packet: git< 0000 @@ -495,21 +486,38 @@ packet: git< 0000 If the filter dies during the communication or does not adhere to the protocol then Git will stop the filter process and restart it -with the next file that needs to be processed. +with the next file that needs to be processed. Depending on the +`filter.<driver>.required` flag Git will interpret that as error. + +The error handling for all cases above mimic the behavior of +the `filter.<driver>.clean` / `filter.<driver>.smudge` error +handling. + +In case the filter cannot or does not want to process the content +as well as any future content for the lifetime of the Git process, +it is expected to respond with an "abort" status. Depending on +the `filter.<driver>.required` flag Git will interpret that as error +for the content as well as any future content for the lifetime of the +Git process but it will not stop or restart the filter process. +------------------------ +packet: git< status=abort\n +packet: git< 0000 +------------------------ After the filter has processed a blob it is expected to wait for -the next "key=value" list containing a command. When the Git process -terminates, it will send a kill signal to the filter in that stage. +the next "key=value" list containing a command. Git will close +the command pipe on exit. The filter is expected to detect EOF +and exit gracefully on its own. A long running filter demo implementation can be found in -`t/t0021/rot13-filter.pl` located in the Git core repository. -If you develop your own long running filter process then the -`GIT_TRACE_PACKET` environment variables can be very helpful -for debugging (see linkgit:git[1]). - -If a `filter.<driver>.clean` or `filter.<driver>.smudge` command -is configured then these commands always take precedence over -a configured `filter.<driver>.process` command. +`contrib/long-running-filter/example.pl` located in the Git +core repository. If you develop your own long running filter +process then the `GIT_TRACE_PACKET` environment variables can be +very helpful for debugging (see linkgit:git[1]). + +If a `filter.<driver>.process` command is configured then it +always takes precedence over a configured `filter.<driver>.clean` +or `filter.<driver>.smudge` command. Please note that you cannot use an existing `filter.<driver>.clean` or `filter.<driver>.smudge` command with `filter.<driver>.process` diff --git a/contrib/long-running-filter/example.pl b/contrib/long-running-filter/example.pl new file mode 100755 index 0000000..279fbfb --- /dev/null +++ b/contrib/long-running-filter/example.pl @@ -0,0 +1,111 @@ +#!/usr/bin/perl +# +# Example implementation for the Git filter protocol version 2 +# See Documentation/gitattributes.txt, section "Filter Protocol" +# + +use strict; +use warnings; + +my $MAX_PACKET_CONTENT_SIZE = 65516; + +sub packet_read { + my $buffer; + my $bytes_read = read STDIN, $buffer, 4; + if ( $bytes_read == 0 ) { + + # EOF - Git stopped talking to us! + exit(); + } + elsif ( $bytes_read != 4 ) { + die "invalid packet size '$bytes_read' field"; + } + my $pkt_size = hex($buffer); + if ( $pkt_size == 0 ) { + return ( 1, "" ); + } + elsif ( $pkt_size > 4 ) { + my $content_size = $pkt_size - 4; + $bytes_read = read STDIN, $buffer, $content_size; + if ( $bytes_read != $content_size ) { + die "invalid packet ($content_size expected; $bytes_read read)"; + } + return ( 0, $buffer ); + } + else { + die "invalid packet size"; + } +} + +sub packet_write { + my ($packet) = @_; + print STDOUT sprintf( "%04x", length($packet) + 4 ); + print STDOUT $packet; + STDOUT->flush(); +} + +sub packet_flush { + print STDOUT sprintf( "%04x", 0 ); + STDOUT->flush(); +} + +( packet_read() eq ( 0, "git-filter-client" ) ) || die "bad initialization"; +( packet_read() eq ( 0, "version=2" ) ) || die "bad version"; +( packet_read() eq ( 1, "" ) ) || die "bad version end"; + +packet_write("git-filter-server\n"); +packet_write("version=2\n"); + +( packet_read() eq ( 0, "clean=true" ) ) || die "bad capability"; +( packet_read() eq ( 0, "smudge=true" ) ) || die "bad capability"; +( packet_read() eq ( 1, "" ) ) || die "bad capability end"; + +packet_write( "clean=true\n" ); +packet_write( "smudge=true\n" ); +packet_flush(); + +while (1) { + my ($command) = packet_read() =~ /^command=([^=]+)\n$/; + my ($pathname) = packet_read() =~ /^pathname=([^=]+)\n$/; + + packet_read(); + + my $input = ""; + { + binmode(STDIN); + my $buffer; + my $done = 0; + while ( !$done ) { + ( $done, $buffer ) = packet_read(); + $input .= $buffer; + } + } + + my $output; + if ( $command eq "clean" ) { + ### Perform clean here ### + $output = $input; + } + elsif ( $command eq "smudge" ) { + ### Perform smudge here ### + $output = $input; + } + else { + die "bad command '$command'"; + } + + packet_write("status=success\n"); + packet_flush(); + while ( length($output) > 0 ) { + my $packet = substr( $output, 0, $MAX_PACKET_CONTENT_SIZE ); + packet_write($packet); + if ( length($output) > $MAX_PACKET_CONTENT_SIZE ) { + $output = substr( $output, $MAX_PACKET_CONTENT_SIZE ); + } + else { + $output = ""; + } + } + packet_flush(); # flush content! + packet_flush(); # empty list! +} diff --git a/convert.c b/convert.c index 0269a49..0ed48ed 100644 --- a/convert.c +++ b/convert.c @@ -723,9 +723,9 @@ static int apply_multi_file_filter(const char *path, const char *src, size_t len goto done; if (fd >= 0) - err = packet_write_stream_with_flush_from_fd(fd, process->in); + err = write_packetized_from_fd(fd, process->in); else - err = packet_write_stream_with_flush_from_buf(src, len, process->in); + err = write_packetized_from_buf(src, len, process->in); if (err) goto done; @@ -734,7 +734,7 @@ static int apply_multi_file_filter(const char *path, const char *src, size_t len if (err) goto done; - err = packet_read_till_flush(process->out, &nbuf) < 0; + err = read_packetized_to_buf(process->out, &nbuf) < 0; if (err) goto done; @@ -747,7 +747,7 @@ static int apply_multi_file_filter(const char *path, const char *src, size_t len if (err || errno == EPIPE) { if (!strcmp(filter_status.buf, "error")) { /* The filter signaled a problem with the file. */ - } else if (!strcmp(filter_status.buf, "error-all")) { + } else if (!strcmp(filter_status.buf, "abort")) { /* * The filter signaled a permanent problem. Don't try to filter * files with the same command for the lifetime of the current @@ -790,9 +790,9 @@ static int apply_filter(const char *path, const char *src, size_t len, if (!dst) return 1; - if ((CAP_CLEAN & wanted_capability) && drv->clean) + if (!drv->process && (CAP_CLEAN & wanted_capability) && drv->clean) cmd = drv->clean; - else if ((CAP_SMUDGE & wanted_capability) && drv->smudge) + else if (!drv->process && (CAP_SMUDGE & wanted_capability) && drv->smudge) cmd = drv->smudge; if (cmd && *cmd) diff --git a/pkt-line.c b/pkt-line.c index 3033aa3..5001a07 100644 --- a/pkt-line.c +++ b/pkt-line.c @@ -3,7 +3,6 @@ #include "run-command.h" char packet_buffer[LARGE_PACKET_MAX]; -static char packet_write_buffer[LARGE_PACKET_MAX]; static const char *packet_trace_prefix = "git"; static struct trace_key trace_packet = TRACE_KEY_INIT(PACKET); static struct trace_key trace_pack = TRACE_KEY_INIT(PACKFILE); @@ -95,7 +94,10 @@ void packet_flush(int fd) int packet_flush_gently(int fd) { packet_trace("0000", 4, 1); - return (write_in_full(fd, "0000", 4) == 4 ? 0 : -1); + if (write_in_full(fd, "0000", 4) == 4) + return 0; + error("flush packet write failed"); + return -1; } void packet_buf_flush(struct strbuf *buf) @@ -132,39 +134,70 @@ static void format_packet(struct strbuf *out, const char *fmt, va_list args) packet_trace(out->buf + orig_len + 4, n - 4, 1); } +static int packet_write_fmt_1(int fd, int gently, + const char *fmt, va_list args) +{ + struct strbuf buf = STRBUF_INIT; + size_t count; + + format_packet(&buf, fmt, args); + count = write_in_full(fd, buf.buf, buf.len); + if (count == buf.len) + return 0; + + if (!gently) { + if (errno == EPIPE) { + if (in_async()) + async_exit(141); + + signal(SIGPIPE, SIG_DFL); + raise(SIGPIPE); + /* Should never happen, but just in case... */ + exit(141); + } + die_errno("packet write error"); + } + error("packet write failed"); + return -1; +} + void packet_write_fmt(int fd, const char *fmt, ...) { - static struct strbuf buf = STRBUF_INIT; va_list args; - strbuf_reset(&buf); va_start(args, fmt); - format_packet(&buf, fmt, args); + packet_write_fmt_1(fd, 0, fmt, args); va_end(args); - write_or_die(fd, buf.buf, buf.len); } int packet_write_fmt_gently(int fd, const char *fmt, ...) { - static struct strbuf buf = STRBUF_INIT; + int status; va_list args; - strbuf_reset(&buf); va_start(args, fmt); - format_packet(&buf, fmt, args); + status = packet_write_fmt_1(fd, 1, fmt, args); va_end(args); - return (write_in_full(fd, buf.buf, buf.len) == buf.len ? 0 : -1); + return status; } int packet_write_gently(const int fd_out, const char *buf, size_t size) { - if (size > sizeof(packet_write_buffer) - 4) + static char packet_write_buffer[LARGE_PACKET_MAX]; + + if (size > sizeof(packet_write_buffer) - 4) { + error("packet write failed"); return -1; + } packet_trace(buf, size, 1); - memmove(packet_write_buffer + 4, buf, size); size += 4; set_packet_header(packet_write_buffer, size); - return (write_in_full(fd_out, packet_write_buffer, size) == size ? 0 : -1); + memcpy(packet_write_buffer + 4, buf, size - 4); + if (write_in_full(fd_out, packet_write_buffer, size) == size) + return 0; + + error("packet write failed"); + return -1; } void packet_buf_write(struct strbuf *buf, const char *fmt, ...) @@ -176,35 +209,35 @@ void packet_buf_write(struct strbuf *buf, const char *fmt, ...) va_end(args); } -int packet_write_stream_with_flush_from_fd(int fd_in, int fd_out) +int write_packetized_from_fd(int fd_in, int fd_out) { + static char buf[PKTLINE_DATA_MAXLEN]; int err = 0; ssize_t bytes_to_write; while (!err) { - bytes_to_write = xread(fd_in, packet_write_buffer, sizeof(packet_write_buffer) - 4); + bytes_to_write = xread(fd_in, buf, sizeof(buf)); if (bytes_to_write < 0) return COPY_READ_ERROR; if (bytes_to_write == 0) break; - if (bytes_to_write > sizeof(packet_write_buffer) - 4) - return COPY_WRITE_ERROR; - err = packet_write_gently(fd_out, packet_write_buffer, bytes_to_write); + err = packet_write_gently(fd_out, buf, bytes_to_write); } if (!err) err = packet_flush_gently(fd_out); return err; } -int packet_write_stream_with_flush_from_buf(const char *src_in, size_t len, int fd_out) +int write_packetized_from_buf(const char *src_in, size_t len, int fd_out) { + static char buf[PKTLINE_DATA_MAXLEN]; int err = 0; size_t bytes_written = 0; size_t bytes_to_write; while (!err) { - if ((len - bytes_written) > sizeof(packet_write_buffer) - 4) - bytes_to_write = sizeof(packet_write_buffer) - 4; + if ((len - bytes_written) > sizeof(buf)) + bytes_to_write = sizeof(buf); else bytes_to_write = len - bytes_written; if (bytes_to_write == 0) @@ -327,52 +360,29 @@ char *packet_read_line_buf(char **src, size_t *src_len, int *dst_len) return packet_read_line_generic(-1, src, src_len, dst_len); } -ssize_t packet_read_till_flush(int fd_in, struct strbuf *sb_out) +ssize_t read_packetized_to_buf(int fd_in, struct strbuf *sb_out) { - int len, ret; + int paket_len; int options = PACKET_READ_GENTLE_ON_EOF; - char linelen[4]; size_t oldlen = sb_out->len; size_t oldalloc = sb_out->alloc; for (;;) { - /* Read packet header */ - ret = get_packet_data(fd_in, NULL, NULL, linelen, 4, options); - if (ret < 0) - goto done; - len = packet_length(linelen); - if (len < 0) - die("protocol error: bad line length character: %.4s", linelen); - if (!len) { - /* Found a flush packet - Done! */ - packet_trace("0000", 4, 0); + strbuf_grow(sb_out, PKTLINE_DATA_MAXLEN+1); + paket_len = packet_read(fd_in, NULL, NULL, + sb_out->buf + sb_out->len, PKTLINE_DATA_MAXLEN+1, options); + if (paket_len <= 0) break; - } - len -= 4; - - /* Read packet content */ - strbuf_grow(sb_out, len); - ret = get_packet_data(fd_in, NULL, NULL, sb_out->buf + sb_out->len, len, options); - if (ret < 0) - goto done; - - if (ret != len) { - error("protocol error: incomplete read (expected %d, got %d)", len, ret); - goto done; - } - - packet_trace(sb_out->buf + sb_out->len, len, 0); - sb_out->len += len; + sb_out->len += paket_len; } -done: - if (ret < 0) { + if (paket_len < 0) { if (oldalloc == 0) strbuf_release(sb_out); else strbuf_setlen(sb_out, oldlen); - return ret; /* unexpected EOF */ + return paket_len; } return sb_out->len - oldlen; } diff --git a/pkt-line.h b/pkt-line.h index 6a3b7cf..3d873f3 100644 --- a/pkt-line.h +++ b/pkt-line.h @@ -25,8 +25,8 @@ void packet_buf_flush(struct strbuf *buf); void packet_buf_write(struct strbuf *buf, const char *fmt, ...) __attribute__((format (printf, 2, 3))); int packet_flush_gently(int fd); int packet_write_fmt_gently(int fd, const char *fmt, ...) __attribute__((format (printf, 2, 3))); -int packet_write_stream_with_flush_from_fd(int fd_in, int fd_out); -int packet_write_stream_with_flush_from_buf(const char *src_in, size_t len, int fd_out); +int write_packetized_from_fd(int fd_in, int fd_out); +int write_packetized_from_buf(const char *src_in, size_t len, int fd_out); /* * Read a packetized line into the buffer, which must be at least size bytes @@ -82,7 +82,7 @@ char *packet_read_line_buf(char **src_buf, size_t *src_len, int *size); /* * Reads a stream of variable sized packets until a flush packet is detected. */ -ssize_t packet_read_till_flush(int fd_in, struct strbuf *sb_out); +ssize_t read_packetized_to_buf(int fd_in, struct strbuf *sb_out); #define DEFAULT_PACKET_MAX 1000 #define LARGE_PACKET_MAX 65520 diff --git a/read-cache.c b/read-cache.c index 02f74d3..491e52d 100644 --- a/read-cache.c +++ b/read-cache.c @@ -156,7 +156,7 @@ void fill_stat_cache_info(struct cache_entry *ce, struct stat *st) static int ce_compare_data(const struct cache_entry *ce, struct stat *st) { int match = -1; - int fd = open(ce->name, O_RDONLY | O_CLOEXEC); + int fd = open(ce->name, O_RDONLY); if (fd >= 0) { unsigned char sha1[20]; diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh index 94b0bf3..1c98ac3 100755 --- a/t/t0021-conversion.sh +++ b/t/t0021-conversion.sh @@ -4,15 +4,6 @@ test_description='blob conversion via gitattributes' . ./test-lib.sh -if test_have_prereq EXPENSIVE -then - T0021_LARGE_FILE_SIZE=2048 - T0021_LARGISH_FILE_SIZE=100 -else - T0021_LARGE_FILE_SIZE=30 - T0021_LARGISH_FILE_SIZE=2 -fi - cat <<EOF >rot13.sh #!$SHELL_PATH tr \ @@ -22,8 +13,8 @@ EOF chmod +x rot13.sh test_expect_success setup ' - test_config filter.rot13.smudge ./rot13.sh && - test_config filter.rot13.clean ./rot13.sh && + git config filter.rot13.smudge ./rot13.sh && + git config filter.rot13.clean ./rot13.sh && { echo "*.t filter=rot13" @@ -43,26 +34,7 @@ test_expect_success setup ' git checkout -- test test.t test.i && echo "content-test2" >test2.o && - echo "content-test3-subdir" >test3-subdir.o && - - mkdir generated-test-data && - for i in $(test_seq 1 $T0021_LARGE_FILE_SIZE) - do - RANDOM_STRING="$(test-genrandom end $i | tr -dc "A-Za-z0-9" )" - ROT_RANDOM_STRING="$(echo $RANDOM_STRING | ./rot13.sh )" - # Generate 1MB of empty data and 100 bytes of random characters - # printf "$(test-genrandom start $i)" - printf "%1048576d" 1 >>generated-test-data/large.file && - printf "$RANDOM_STRING" >>generated-test-data/large.file && - printf "%1048576d" 1 >>generated-test-data/large.file.rot13 && - printf "$ROT_RANDOM_STRING" >>generated-test-data/large.file.rot13 && - - if test $i = $T0021_LARGISH_FILE_SIZE - then - cat generated-test-data/large.file >generated-test-data/largish.file && - cat generated-test-data/large.file.rot13 >generated-test-data/largish.file.rot13 - fi - done + echo "content-test3-subdir" >test3-subdir.o ' script='s/^\$Id: \([0-9a-f]*\) \$/\1/p' @@ -230,9 +202,9 @@ test_expect_success 'required filter clean failure' ' test_expect_success 'filtering large input to small output should use little memory' ' test_config filter.devnull.clean "cat >/dev/null" && test_config filter.devnull.required true && - cp generated-test-data/large.file large.file && - echo "large.file filter=devnull" >.gitattributes && - GIT_MMAP_LIMIT=1m GIT_ALLOC_LIMIT=1m git add large.file + for i in $(test_seq 1 30); do printf "%1048576d" 1; done >30MB && + echo "30MB filter=devnull" >.gitattributes && + GIT_MMAP_LIMIT=1m GIT_ALLOC_LIMIT=1m git add 30MB ' test_expect_success 'filter that does not read is fine' ' @@ -245,15 +217,15 @@ test_expect_success 'filter that does not read is fine' ' test_cmp expect actual ' -test_expect_success 'filter large file' ' +test_expect_success EXPENSIVE 'filter large file' ' test_config filter.largefile.smudge cat && test_config filter.largefile.clean cat && - echo "large.file filter=largefile" >.gitattributes && - cp generated-test-data/large.file large.file && - git add large.file 2>err && + for i in $(test_seq 1 2048); do printf "%1048576d" 1; done >2GB && + echo "2GB filter=largefile" >.gitattributes && + git add 2GB 2>err && test_must_be_empty err && - rm -f large.file && - git checkout -- large.file 2>err && + rm -f 2GB && + git checkout -- 2GB 2>err && test_must_be_empty err ' @@ -374,22 +346,24 @@ test_expect_success PERL 'required process filter should filter data' ' check_filter \ git add . \ <<-\EOF && - 1 IN: clean test.r 57 [OK] -- OUT: 57 [OK] - 1 IN: clean test2.r 14 [OK] -- OUT: 14 [OK] - 1 IN: clean test4-empty.r 0 [OK] -- OUT: 0 [OK] - 1 IN: clean testsubdir/test3-subdir.r 21 [OK] -- OUT: 21 [OK] - 1 start + 1 IN: clean test.r 57 [OK] -- OUT: 57 . [OK] + 1 IN: clean test2.r 14 [OK] -- OUT: 14 . [OK] + 1 IN: clean test4-empty.r 0 [OK] -- OUT: 0 [OK] + 1 IN: clean testsubdir/test3-subdir.r 21 [OK] -- OUT: 21 . [OK] + 1 START + 1 STOP 1 wrote filter header EOF check_filter_count_clean \ git commit . -m "test commit" \ <<-\EOF && - x IN: clean test.r 57 [OK] -- OUT: 57 [OK] - x IN: clean test2.r 14 [OK] -- OUT: 14 [OK] - x IN: clean test4-empty.r 0 [OK] -- OUT: 0 [OK] - x IN: clean testsubdir/test3-subdir.r 21 [OK] -- OUT: 21 [OK] - 1 start + x IN: clean test.r 57 [OK] -- OUT: 57 . [OK] + x IN: clean test2.r 14 [OK] -- OUT: 14 . [OK] + x IN: clean test4-empty.r 0 [OK] -- OUT: 0 [OK] + x IN: clean testsubdir/test3-subdir.r 21 [OK] -- OUT: 21 . [OK] + 1 START + 1 STOP 1 wrote filter header EOF @@ -398,28 +372,31 @@ test_expect_success PERL 'required process filter should filter data' ' check_filter_ignore_clean \ git checkout . \ <<-\EOF && - start + START wrote filter header - IN: smudge test2.r 14 [OK] -- OUT: 14 [OK] - IN: smudge testsubdir/test3-subdir.r 21 [OK] -- OUT: 21 [OK] + IN: smudge test2.r 14 [OK] -- OUT: 14 . [OK] + IN: smudge testsubdir/test3-subdir.r 21 [OK] -- OUT: 21 . [OK] + STOP EOF check_filter_ignore_clean \ git checkout empty \ <<-\EOF && - start + START wrote filter header + STOP EOF check_filter_ignore_clean \ git checkout master \ <<-\EOF && - start + START wrote filter header - IN: smudge test.r 57 [OK] -- OUT: 57 [OK] - IN: smudge test2.r 14 [OK] -- OUT: 14 [OK] - IN: smudge test4-empty.r 0 [OK] -- OUT: 0 [OK] - IN: smudge testsubdir/test3-subdir.r 21 [OK] -- OUT: 21 [OK] + IN: smudge test.r 57 [OK] -- OUT: 57 . [OK] + IN: smudge test2.r 14 [OK] -- OUT: 14 . [OK] + IN: smudge test4-empty.r 0 [OK] -- OUT: 0 [OK] + IN: smudge testsubdir/test3-subdir.r 21 [OK] -- OUT: 21 . [OK] + STOP EOF check_rot13 ../test.o test.r && @@ -428,57 +405,8 @@ test_expect_success PERL 'required process filter should filter data' ' ) ' -test_expect_success PERL 'required process filter should filter smudge data and one-shot filter should clean' ' +test_expect_success PERL 'required process filter should clean only and take precedence' ' test_config_global filter.protocol.clean ./../rot13.sh && - test_config_global filter.protocol.process "$TEST_DIRECTORY/t0021/rot13-filter.pl smudge" && - test_config_global filter.protocol.required true && - rm -rf repo && - mkdir repo && - ( - cd repo && - git init && - - echo "*.r filter=protocol" >.gitattributes && - git add . && - git commit . -m "test commit" && - git branch empty && - - cp ../test.o test.r && - cp ../test2.o test2.r && - - check_filter_no_call \ - git add . && - - check_filter_no_call \ - git commit . -m "test commit" && - - rm -f test?.r testsubdir/test3-subdir.r && - - check_filter_ignore_clean \ - git checkout . \ - <<-\EOF && - start - wrote filter header - IN: smudge test2.r 14 [OK] -- OUT: 14 [OK] - EOF - - git checkout empty && - - check_filter_ignore_clean \ - git checkout master\ - <<-\EOF && - start - wrote filter header - IN: smudge test.r 57 [OK] -- OUT: 57 [OK] - IN: smudge test2.r 14 [OK] -- OUT: 14 [OK] - EOF - - check_rot13 ../test.o test.r && - check_rot13 ../test2.o test2.r - ) -' - -test_expect_success PERL 'required process filter should clean only' ' test_config_global filter.protocol.process "$TEST_DIRECTORY/t0021/rot13-filter.pl clean" && test_config_global filter.protocol.required true && rm -rf repo && @@ -497,42 +425,74 @@ test_expect_success PERL 'required process filter should clean only' ' check_filter \ git add . \ <<-\EOF && - 1 IN: clean test.r 57 [OK] -- OUT: 57 [OK] - 1 start + 1 IN: clean test.r 57 [OK] -- OUT: 57 . [OK] + 1 START + 1 STOP 1 wrote filter header EOF check_filter_count_clean \ git commit . -m "test commit" \ <<-\EOF - x IN: clean test.r 57 [OK] -- OUT: 57 [OK] - 1 start + x IN: clean test.r 57 [OK] -- OUT: 57 . [OK] + 1 START + 1 STOP 1 wrote filter header EOF ) ' -test_expect_success PERL 'required process filter should process binary files larger LARGE_PACKET_MAX' ' +generate_test_data () { + LEN=$1 + NAME=$2 + test-genrandom end $LEN | + perl -pe "s/./chr((ord($&) % 26) + 97)/sge" >../$NAME.file && + cp ../$NAME.file . && + ./../rot13.sh <../$NAME.file >../$NAME.file.rot13 +} + +test_expect_success PERL 'required process filter should process multiple packets' ' test_config_global filter.protocol.process "$TEST_DIRECTORY/t0021/rot13-filter.pl clean smudge" && test_config_global filter.protocol.required true && + rm -rf repo && mkdir repo && ( cd repo && git init && - echo "*.file filter=protocol" >.gitattributes && - cat ../generated-test-data/largish.file.rot13 >large.rot13 && - cat ../generated-test-data/largish.file >large.file && - cat large.file >large.original && + # Generate data that requires 3 packets + PKTLINE_DATA_MAXLEN=65516 && + + generate_test_data $(($PKTLINE_DATA_MAXLEN )) 1pkt_1__ && + generate_test_data $(($PKTLINE_DATA_MAXLEN + 1)) 2pkt_1+1 && + generate_test_data $(($PKTLINE_DATA_MAXLEN * 2 - 1)) 2pkt_2-1 && + generate_test_data $(($PKTLINE_DATA_MAXLEN * 2 )) 2pkt_2__ && + generate_test_data $(($PKTLINE_DATA_MAXLEN * 2 + 1)) 3pkt_2+1 && - git add large.file .gitattributes && + echo "*.file filter=protocol" >.gitattributes && + check_filter \ + git add *.file .gitattributes \ + <<-\EOF && + 1 IN: clean 1pkt_1__.file 65516 [OK] -- OUT: 65516 . [OK] + 1 IN: clean 2pkt_1+1.file 65517 [OK] -- OUT: 65517 .. [OK] + 1 IN: clean 2pkt_2-1.file 131031 [OK] -- OUT: 131031 .. [OK] + 1 IN: clean 2pkt_2__.file 131032 [OK] -- OUT: 131032 .. [OK] + 1 IN: clean 3pkt_2+1.file 131033 [OK] -- OUT: 131033 ... [OK] + 1 START + 1 STOP + 1 wrote filter header + EOF git commit . -m "test commit" && - rm -f large.file && - git checkout -- large.file && - git cat-file blob :large.file >actual && - test_cmp large.rot13 actual + rm -f *.file && + git checkout -- *.file && + + for f in *.file + do + git cat-file blob :$f >actual && + test_cmp ../$f.rot13 actual + done ) ' @@ -577,13 +537,14 @@ test_expect_success PERL 'process filter should restart after unexpected write f check_filter_ignore_clean \ git checkout . \ <<-\EOF && - start + START wrote filter header IN: smudge smudge-write-fail.r 22 [OK] -- OUT: 22 [WRITE FAIL] - start + START wrote filter header - IN: smudge test.r 57 [OK] -- OUT: 57 [OK] - IN: smudge test2.r 14 [OK] -- OUT: 14 [OK] + IN: smudge test.r 57 [OK] -- OUT: 57 . [OK] + IN: smudge test2.r 14 [OK] -- OUT: 14 . [OK] + STOP EOF check_rot13 ../test.o test.r && @@ -617,11 +578,12 @@ test_expect_success PERL 'process filter should not restart in case of an error' check_filter_ignore_clean \ git checkout . \ <<-\EOF && - start + START wrote filter header IN: smudge error.r 25 [OK] -- OUT: 0 [ERROR] - IN: smudge test.r 57 [OK] -- OUT: 57 [OK] - IN: smudge test2.r 14 [OK] -- OUT: 14 [OK] + IN: smudge test.r 57 [OK] -- OUT: 57 . [OK] + IN: smudge test2.r 14 [OK] -- OUT: 14 . [OK] + STOP EOF check_rot13 ../test.o test.r && @@ -642,8 +604,8 @@ test_expect_success PERL 'process filter should be able to signal an error for a cp ../test.o test.r && cp ../test2.o test2.r && - echo "error this blob and all future blobs" >error-all.o && - cp error-all.o error-all.r && + echo "error this blob and all future blobs" >abort.o && + cp abort.o abort.r && git add . && git commit . -m "test commit" && rm -f *.r && @@ -651,14 +613,15 @@ test_expect_success PERL 'process filter should be able to signal an error for a check_filter_ignore_clean \ git checkout . \ <<-\EOF && - start + START wrote filter header - IN: smudge error-all.r 37 [OK] -- OUT: 0 [ERROR-ALL] + IN: smudge abort.r 37 [OK] -- OUT: 0 [ABORT] + STOP EOF test_cmp ../test.o test.r && test_cmp ../test2.o test2.r && - test_cmp error-all.o error-all.r + test_cmp abort.o abort.r ) ' diff --git a/t/t0021/rot13-filter.pl b/t/t0021/rot13-filter.pl index 0e1181d..8e27877 100755 --- a/t/t0021/rot13-filter.pl +++ b/t/t0021/rot13-filter.pl @@ -14,7 +14,7 @@ # (3) If data with the pathname "error.r" is processed with any # operation then the filter signals that it cannot or does not want # to process the file. -# (4) If data with the pathname "error-all.r" is processed with any +# (4) If data with the pathname "abort.r" is processed with any # operation then the filter signals that it cannot or does not want # to process the file and any file after that is processed with the # same command. @@ -26,6 +26,8 @@ use warnings; my $MAX_PACKET_CONTENT_SIZE = 65516; my @capabilities = @ARGV; +open my $debug, ">>", "rot13-filter.log"; + sub rot13 { my ($str) = @_; $str =~ y/A-Za-z/N-ZA-Mn-za-m/; @@ -38,6 +40,7 @@ sub packet_read { if ( $bytes_read == 0 ) { # EOF - Git stopped talking to us! + print $debug "STOP\n"; exit(); } elsif ( $bytes_read != 4 ) { @@ -72,8 +75,7 @@ sub packet_flush { STDOUT->flush(); } -open my $debug, ">>", "rot13-filter.log"; -print $debug "start\n"; +print $debug "START\n"; $debug->flush(); ( packet_read() eq ( 0, "git-filter-client" ) ) || die "bad initialization"; @@ -120,7 +122,7 @@ while (1) { } my $output; - if ( $pathname eq "error.r" or $pathname eq "error-all.r" ) { + if ( $pathname eq "error.r" or $pathname eq "abort.r" ) { $output = ""; } elsif ( $command eq "clean" and grep( /^clean$/, @capabilities ) ) { @@ -142,10 +144,10 @@ while (1) { packet_write("status=error\n"); packet_flush(); } - elsif ( $pathname eq "error-all.r" ) { - print $debug "[ERROR-ALL]\n"; + elsif ( $pathname eq "abort.r" ) { + print $debug "[ABORT]\n"; $debug->flush(); - packet_write("status=error-all\n"); + packet_write("status=abort\n"); packet_flush(); } else { @@ -161,6 +163,7 @@ while (1) { while ( length($output) > 0 ) { my $packet = substr( $output, 0, $MAX_PACKET_CONTENT_SIZE ); packet_write($packet); + print $debug "."; if ( length($output) > $MAX_PACKET_CONTENT_SIZE ) { $output = substr( $output, $MAX_PACKET_CONTENT_SIZE ); } @@ -169,7 +172,7 @@ while (1) { } } packet_flush(); - print $debug "[OK]\n"; + print $debug " [OK]\n"; $debug->flush(); packet_flush(); } Lars Schneider (10): pkt-line: rename packet_write() to packet_write_fmt() pkt-line: extract set_packet_header() pkt-line: add packet_write_fmt_gently() pkt-line: add packet_flush_gently() pkt-line: add packet_write_gently() pkt-line: add functions to read/write flush terminated packet streams convert: quote filter names in error messages convert: modernize tests convert: make apply_filter() adhere to standard Git error handling convert: add filter.<driver>.process option Documentation/gitattributes.txt | 154 +++++++++++- builtin/archive.c | 4 +- builtin/receive-pack.c | 4 +- builtin/remote-ext.c | 4 +- builtin/upload-archive.c | 4 +- connect.c | 2 +- contrib/long-running-filter/example.pl | 111 +++++++++ convert.c | 373 +++++++++++++++++++++++++---- daemon.c | 2 +- http-backend.c | 2 +- pkt-line.c | 160 ++++++++++++- pkt-line.h | 12 +- shallow.c | 2 +- t/t0021-conversion.sh | 423 ++++++++++++++++++++++++++++++--- t/t0021/rot13-filter.pl | 179 ++++++++++++++ unpack-trees.c | 1 + upload-pack.c | 30 +-- 17 files changed, 1356 insertions(+), 111 deletions(-) create mode 100755 contrib/long-running-filter/example.pl create mode 100755 t/t0021/rot13-filter.pl -- 2.10.0