Carlos Rica <jasampler@xxxxxxxxx> writes: > diff --git a/builtin-stripspace.c b/builtin-stripspace.c > index d8358e2..949b640 100644 > --- a/builtin-stripspace.c > +++ b/builtin-stripspace.c > @@ -2,12 +2,11 @@ > #include "cache.h" > > /* > - * Remove trailing spaces from a line. > + * Returns the length of a line removing trailing spaces. This did not parse well for me; perhaps a comma before "removing" would make it easier to read? > @@ -28,52 +26,67 @@ static int cleanup(char *line, int len) > * Remove empty lines from the beginning and end > * and also trailing spaces from every line. > * > + * Note that the buffer will not be null-terminated. > + * The name of the sentinel character '\0' is NUL, not null (which is a different word, used to call a pointer that points nowhere). The buffer will not be "NUL-terminated". > -void stripspace(FILE *in, FILE *out, int skip_comments) > +size_t stripspace(char *buffer, size_t length, int skip_comments) > { > + newlen = cleanup(buffer + i, len); > > /* Not just an empty line? */ > + if (newlen) { > + if (empties != -1) > + buffer[j++] = '\n'; > if (empties > 0) > + buffer[j++] = '\n'; > empties = 0; > + memmove(buffer + j, buffer + i, newlen); > continue; > } It somehow strikes me odd that, given this: buffer j i **********************texttext \n....... you would first do this with cleanup(): buffer j i **********************texttext\n....... and then do this with this memmove(): buffer j i *************texttext\n....... Would it become simpler if cleanup() knew where the final text goes (i.e. buffer+j)? > int cmd_stripspace(int argc, const char **argv, const char *prefix) > { > - stripspace(stdin, stdout, 0); > + char *buffer; > + unsigned long size; > + > + size = 1024; > + buffer = xmalloc(size); > + if (read_pipe(0, &buffer, &size)) > + die("could not read the input"); The command used to be capable of streaming and filtering a few hundred gigabytes of text on a machine with small address space, as it operated one line at a time, but now it cannot as it has to hold everything in core before starting. I do not think we miss that loss of capability too much, but I wonder if we can be a bit more clever about it, perhaps feeding a chunk at a time. Not a very strong request, but just wondering if it is an easy change. - To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html