[PATCH] do not require filters to consume stdin

A clean filter that uses %f to examine a file may not need to consume
the entire file content from stdin every time it's run, due to caching
or other optimisations to support large files.

Ignore the SIGPIPE that may result when writing to the filter
if it exits without consuming stdin, and do not check that all
content is sent to it. The filter is still required to exit
successfully, so a crash in the filter should still be handled
correctly.
---

There has been discussion before about using clean and smudge filters
with %f to handle big files in git, with the file content stored outside
git somewhere.  A simplistic clean filter for large files could look
like this:

#!/bin/sh
file="$1"
ln -f "$file" ~/.big/"$file"
echo "$file"

But trying to use this will fail on truly large files. For example:

$ ls -l sorta.huge 
-rw-r--r-- 3 joey joey 524288000 Aug 29 15:19 sorta.huge
$ git add sorta.huge 
broken pipe  git add sorta.huge
$ echo $?
141
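
For readers unfamiliar with the 141: a shell reports a process killed by
signal N as exit status 128 + N. A minimal sketch of decoding it, assuming
a POSIX-style kill builtin and the usual numbering where SIGPIPE is 13
(the variable names are only illustrative):

```shell
#!/bin/sh
# Decode an exit status above 128 into the signal that ended the process.
status=141                    # the value "echo $?" printed in the transcript
signum=$((status - 128))      # 141 - 128 = 13
signame=$(kill -l "$status")  # POSIX kill -l maps an exit status to a signal name
echo "exit status $status = 128 + $signum (SIG$signame)"
```

So git add itself was killed by the signal; it did not report an error
and exit on its own.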

The SIGPIPE occurs because git expects the clean filter to read
the full file content from stdin. (Although if the content is small
enough, a SIGPIPE may not occur.) So the clean filter needs to
look like this:

#!/bin/sh
file="$1"
cat >/dev/null
ln -f "$file" ~/.big/"$file"
echo "$file"

But this means much more work has to be done whenever the clean filter
is run, including every time git status is run. So it's currently
impractical to use clean/smudge filters like this for large files.
This patch should close that gap and allow such filters to be developed.
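
For anyone wanting to try this, the sketch below shows one way such a
filter might be wired up in a scratch repository. The driver name "big"
and the command name "big-clean" are hypothetical and chosen only for
illustration. The filter.<driver>.clean setting and the filter attribute
in .gitattributes are the standard mechanism, and %f is replaced with the
path of the file being filtered.

```shell
#!/bin/sh
# Sketch: register a clean filter that is handed the file path via %f.
# "big" and "big-clean" are made-up names, not part of the patch.
set -e
repo=$(mktemp -d)             # scratch repository, safe to delete afterwards
cd "$repo"
git init -q

# Tell git which command to run as the clean filter for driver "big".
git config filter.big.clean "big-clean %f"

# Route *.huge files through that driver.
echo '*.huge filter=big' > .gitattributes

# Show what git will use for a matching path.
git config --get filter.big.clean
git check-attr filter -- sorta.huge
```

The last command should report that the filter attribute for sorta.huge
is set to big. With the setup above, git add on a *.huge file runs
"big-clean <path>" with the file content on stdin.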

 convert.c |    6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/convert.c b/convert.c
index 416bf83..3d528c4 100644
--- a/convert.c
+++ b/convert.c
@@ -330,7 +330,7 @@ static int filter_buffer(int in, int out, void *data)
 	 */
 	struct child_process child_process;
 	struct filter_params *params = (struct filter_params *)data;
-	int write_err, status;
+	int write_err = 0, status;
 	const char *argv[] = { NULL, NULL };
 
 	/* apply % substitution to cmd */
@@ -360,9 +360,11 @@ static int filter_buffer(int in, int out, void *data)
 	if (start_command(&child_process))
 		return error("cannot fork to run external filter %s", params->cmd);
 
-	write_err = (write_in_full(child_process.in, params->src, params->size) < 0);
+	signal(SIGPIPE, SIG_IGN);
+	write_in_full(child_process.in, params->src, params->size);
 	if (close(child_process.in))
 		write_err = 1;
+	signal(SIGPIPE, SIG_DFL);
 	if (write_err)
 		error("cannot feed the input to external filter %s", params->cmd);
 
-- 
1.7.5.4