Re: [RFC PATCH 3/3] git-grep: Learn PCRE

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 03.05.2011 16:11:06 -0700 Junio C Hamano <gitster@xxxxxxxxx> wrote:

> MichaÅ Kiedrowicz <michal.kiedrowicz@xxxxxxxxx> writes:
> 
> > This patch lacks tests.
> >
> > AFAIK grep.c is a part of libgit.a, so libpcre is linked with the
> > git itself, which bloats it. I don't like it, especially because it
> > makes 'make test' take:
> >
> > 	real    6m27.558s
> > 	user    1m34.392s
> > 	sys     2m11.029s
> >
> > instead of
> >
> > 	real    3m41.322s
> > 	user    1m44.005s
> > 	sys     2m32.403s
> >
> > on my PC.
> 
> I am not against pcre, but the performance numbers look rather yucky.
> 
> > diff --git a/grep.c b/grep.c
> > index d67baf9..4f5fcbb 100644
> > --- a/grep.c
> > +++ b/grep.c
> > @@ -70,6 +70,30 @@ static void compile_regexp(struct grep_pat *p,
> > struct grep_opt *opt) if (p->fixed)
> >  		return;
> >  
> > +	if (opt->pcre) {
> > +#ifdef NO_LIBPCRE
> > +		die("cannot use Perl-compatible regexes when
> > libpcre is not compiled in"); +#else
> > +		const char *error;
> > +	...
> > +		return;
> > +#endif
> > +	}
> > +
> 
> Please avoid these #ifdefs in the middle of otherwise generic
> function.

Yeah, I don't like them too.

> 
> Make the above into something like:
> 
> 	if (opt->pcre) {
>         	compile_pcre(p, opt);
>                 return;
> 	}
> 
> and then have an extra section entirely devoted to pcre integration
> that has bunch of ...
> 
>         #ifdef NO_LIBPCRE
>         static void compile_pcre_part(struct grep_pat *p, struct
> grep_opt *opt) {
>                 die("...");
> 	}
>         #else
>         static void compile_pcre_part(struct grep_pat *p, struct
> grep_opt *opt) {
>                 ...
> 	}
>         #endif
> 
> ... that is totally separate from the generic part of the codebase.
> They could even be in a separate file, if you need numerous helpers.
> 
> > @@ -320,7 +344,16 @@ void free_grep_patterns(struct grep_opt *opt)
> >  		case GREP_PATTERN: /* atom */
> >  		case GREP_PATTERN_HEAD:
> >  		case GREP_PATTERN_BODY:
> > +#ifndef NO_LIBPCRE
> > +			if (p->re) {
> > +				pcre_free(p->re);
> > +				pcre_free(p->extra);
> > +			} else {
> > +				regfree(&p->regexp);
> > +			}
> > +#else
> >  			regfree(&p->regexp);
> > +#endif
> >  			break;
> >  		default:
> >  			break;
> 
> > diff --git a/grep.h b/grep.h
> > index 06621fe..cd202a9 100644
> > --- a/grep.h
> > +++ b/grep.h
> > @@ -1,6 +1,9 @@
> >  #ifndef GREP_H
> >  #define GREP_H
> >  #include "color.h"
> > +#ifndef NO_LIBPCRE
> > +#include <pcre.h>
> > +#endif /* NO_LIBPCRE */
> 
> This part might want to do something like
> 
> 	#ifndef NO_LIBPCRE
>         #include <pcre.h>
>         #else
>         typedef int pcre; /* dummy */
>         typedef int pcre_extra; /* dummy */
>         #endif
> 
> if it makes it easier to keep the generic part of the code and
> structure definition clean.
> 
> For example, the "free" part of your patch above then would become
> 
> 	if (p->pcre_regexp)
>         	free_pcre_part(p);
> 	else
>         	regfree(&p->regexp);
> 
> with the conditional enclosed entirely within the implementation of
> the free_pcre_part() helper function.
> 
> Also by doing so, the struct definition below can lose #ifndef, and
> can become:
> 
> 	 struct grep_pat {
>          ...
>         +	pcre *pcre_regexp;
>         +	pcre_extra *pcre_extra_info;
> 	 ...
> 	 };

That makes sense. Will re-roll (and add Signed-off-by).

> 
> > @@ -33,6 +36,10 @@ struct grep_pat {
> >  	size_t patternlen;
> >  	enum grep_header_field field;
> >  	regex_t regexp;
> > +#ifndef NO_LIBPCRE
> > +	pcre *re;
> > +	pcre_extra *extra;
> > +#endif /* NO_LIBPCRE */
> >  	unsigned fixed:1;
> >  	unsigned ignore_case:1;
> >  	unsigned word_regexp:1;
> > @@ -83,6 +90,7 @@ struct grep_opt {
> >  #define GREP_BINARY_TEXT	2
> >  	int binary;
> >  	int extended;
> > +	int pcre;
> >  	int relative;
> >  	int pathname;
> >  	int null_following_name;
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]