Re: [PATCH 3/3] grep: support newline separated pattern list

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Am 21.05.2012 00:29, schrieb Junio C Hamano:
> René Scharfe<rene.scharfe@xxxxxxxxxxxxxx>  writes:
> 
>> Currently, patterns that contain newline characters don't match anything
>> when given to git grep.  Regular grep(1) interprets patterns as lists of
>> newline separated search strings instead.
>>
>> Implement this functionality by creating and inserting extra grep_pat
>> structures for patterns consisting of multiple lines when appending to
>> the pattern lists.  For simplicity, all pattern strings are duplicated.
>> The original pattern is truncated in place to make it contain only the
>> first line.
> 
> Thanks for a fix; I am assuming that the duplication for simplicity is not
> for simplified allocation but primarily for a simpler freeing?

When we split a search string into multiple parts, we could allocate only
the new parts and remember which strings were allocated, so that we could
later free them (and only them).  Or we could allocate and leak them, as
there aren't that many and we keep them during the whole run of the program
anyway.  Or we could do without any allocations if all match backends
respected length limited strings like kwset does.  Or only allocate
non-fixed pattern.

Allocating any and all pattern strings and freeing them at the end, thus
disregarding these considerations, is simpler, cleaner and still doesn't
cause excessive memory usage.

That reminds me that we can now have this bonus patch:

-- >8 --
Subject: [PATCH 4/3] grep: stop leaking line strings with -f

When reading patterns from a file, we pass the lines as allocated string
buffers to append_grep_pat() and never free them.  That's not a problem
because they are needed until the program ends anyway.

However, now that the function duplicates the pattern string, we can
reuse the strbuf after calling that function.  This simplifies the code
a bit and plugs a minor memory leak.

Signed-off-by: Rene Scharfe <rene.scharfe@xxxxxxxxxxxxxx>
---
 builtin/grep.c |    7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 643938d..fe1726f 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -600,15 +600,12 @@ static int file_callback(const struct option *opt, const char *arg, int unset)
 	if (!patterns)
 		die_errno(_("cannot open '%s'"), arg);
 	while (strbuf_getline(&sb, patterns, '\n') == 0) {
-		char *s;
-		size_t len;
-
 		/* ignore empty line like grep does */
 		if (sb.len == 0)
 			continue;
 
-		s = strbuf_detach(&sb, &len);
-		append_grep_pat(grep_opt, s, len, arg, ++lno, GREP_PATTERN);
+		append_grep_pat(grep_opt, sb.buf, sb.len, arg, ++lno,
+				GREP_PATTERN);
 	}
 	if (!from_stdin)
 		fclose(patterns);
-- 
1.7.10.2
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]