Re: [PATCH 2/2] grep: don't call regexec() for fixed strings

Alex Riesen wrote:
> 2009/1/10 René Scharfe <rene.scharfe@xxxxxxxxxxxxxx>:
>> +static int isregexspecial(int c)
>> +{
>> +       return isspecial(c) || c == '$' || c == '(' || c == ')' || c == '+' ||
>> +                              c == '.' || c == '^' || c == '{' || c == '|';
>> +}
>> +
>> +static int is_fixed(const char *s)
>> +{
>> +       while (!isregexspecial(*s))
>> +               s++;
>> +       return !*s;
>> +}
> 
> strchr?

Oh, yes, that would look nicer.
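
For illustration, a strchr()-based version might look roughly like the
sketch below.  It is untested against the real tree; the function names
and the character-list literal are assumptions derived from the quoted
code, not part of the patch.  The second helper shows the same check as
a single strcspn() call.

```c
#include <string.h>

/*
 * Hypothetical list of special characters: the isspecial() set
 * (*, ?, [, \) followed by the regex-only additions from the
 * quoted patch ($, (, ), +, ., ^, {, |).
 */
#define REGEX_SPECIAL_CHARS "*?[\\$()+.^{|"

/*
 * strchr() also matches the terminating NUL of its first argument,
 * so c == '\0' is reported as special, just like with isspecial().
 */
static int isregexspecial_strchr(int c)
{
	return strchr(REGEX_SPECIAL_CHARS, c) != NULL;
}

/* Returns 1 if s contains no special character, i.e. is a fixed string. */
static int is_fixed_strchr(const char *s)
{
	/* The loop always stops at the NUL, because NUL counts as special. */
	while (!isregexspecial_strchr(*s))
		s++;
	return !*s;
}

/* Same result in one call: length of the leading run without specials. */
static int is_fixed_strcspn(const char *s)
{
	return s[strcspn(s, REGEX_SPECIAL_CHARS)] == '\0';
}
```

With these definitions, "hello" is reported as fixed and "a.b" is not.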

Another option is to extend ctype.c and implement isregexspecial() --
and while we're at it islowerxdigit() (builtin-name-rev.c::ishex()) and
iswordchar() (config.c::iskeychar(), grep.c::word_char()), too -- as
table lookups.  I.e., something like the following (untested).

Which of the mentioned functions are really worthy of this promotion?
The isregexspecial() char class has more members than isspecial(), but
it's not performance critical (unless you have a lot of patterns and
only a small amount of data to grep :).
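
To make the difference in class size concrete, here is a small
standalone sketch of the table-lookup idea.  All DEMO_* and demo_*
names are hypothetical and exist only for this example; the table uses
C99 designated initializers instead of the hand-laid-out rows in
ctype.c, and only the two "special" classes are modelled.

```c
#include <stddef.h>

#define DEMO_SPECIAL       0x08	/* \0, *, ?, [, \ */
#define DEMO_REGEX_SPECIAL 0x10	/* $, (, ), +, ., ^, {, | */

/* Entries not listed here are zero, i.e. belong to neither class. */
static const unsigned char demo_ctype[256] = {
	['\0'] = DEMO_SPECIAL, ['*'] = DEMO_SPECIAL, ['?'] = DEMO_SPECIAL,
	['['] = DEMO_SPECIAL, ['\\'] = DEMO_SPECIAL,
	['$'] = DEMO_REGEX_SPECIAL, ['('] = DEMO_REGEX_SPECIAL,
	[')'] = DEMO_REGEX_SPECIAL, ['+'] = DEMO_REGEX_SPECIAL,
	['.'] = DEMO_REGEX_SPECIAL, ['^'] = DEMO_REGEX_SPECIAL,
	['{'] = DEMO_REGEX_SPECIAL, ['|'] = DEMO_REGEX_SPECIAL,
};

/* One table load and one mask test per character. */
#define demo_istest(x, mask) \
	((demo_ctype[(unsigned char)(x)] & (mask)) != 0)
#define demo_isspecial(x) demo_istest(x, DEMO_SPECIAL)
#define demo_isregexspecial(x) \
	demo_istest(x, DEMO_SPECIAL | DEMO_REGEX_SPECIAL)

/* Count how many of the 256 byte values carry any bit of mask. */
static size_t demo_count(int mask)
{
	size_t i, n = 0;

	for (i = 0; i < 256; i++)
		if (demo_ctype[i] & mask)
			n++;
	return n;
}

/* Returns 1 if s contains no regex-special character. */
static int demo_is_fixed(const char *s)
{
	/* NUL is in DEMO_SPECIAL, so the loop stops at the terminator. */
	while (!demo_isregexspecial(*s))
		s++;
	return !*s;
}
```

Here demo_count(DEMO_SPECIAL) covers the five isspecial() members, and
adding DEMO_REGEX_SPECIAL to the mask brings in the eight regex-only
characters; either way the per-character cost of the lookup is the same.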

Are there more candidates for ctype-ification?

René


 ctype.c           |   14 ++++++++++----
 git-compat-util.h |    6 ++++++
 2 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/ctype.c b/ctype.c
index 9208d67..1a76586 100644
--- a/ctype.c
+++ b/ctype.c
@@ -10,20 +10,26 @@
 #undef AA
 #undef DD
 #undef GS
+#undef RR
+#undef US
+#undef Ah
 
 #define SS GIT_SPACE
 #define AA GIT_ALPHA
 #define DD GIT_DIGIT
 #define GS GIT_SPECIAL  /* \0, *, ?, [, \\ */
+#define RR GIT_REGEX_SPECIAL /* $, (, ), +, ., ^, {, | */
+#define US GIT_UNDERSCORE
+#define Ah (GIT_ALPHA | GIT_LOWER_XDIGIT)
 
 unsigned char sane_ctype[256] = {
 	GS,  0,  0,  0,  0,  0,  0,  0,  0, SS, SS,  0,  0, SS,  0,  0,		/* 0-15 */
 	 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,		/* 16-15 */
-	SS,  0,  0,  0,  0,  0,  0,  0,  0,  0, GS,  0,  0,  0,  0,  0,		/* 32-15 */
+	SS,  0,  0,  0, RR,  0,  0,  0, RR, RR, GS, RR,  0,  0, RR,  0,		/* 32-15 */
 	DD, DD, DD, DD, DD, DD, DD, DD, DD, DD,  0,  0,  0,  0,  0, GS,		/* 48-15 */
 	 0, AA, AA, AA, AA, AA, AA, AA, AA, AA, AA, AA, AA, AA, AA, AA,		/* 64-15 */
-	AA, AA, AA, AA, AA, AA, AA, AA, AA, AA, AA, GS, GS,  0,  0,  0,		/* 80-15 */
-	 0, AA, AA, AA, AA, AA, AA, AA, AA, AA, AA, AA, AA, AA, AA, AA,		/* 96-15 */
-	AA, AA, AA, AA, AA, AA, AA, AA, AA, AA, AA,  0,  0,  0,  0,  0,		/* 112-15 */
+	AA, AA, AA, AA, AA, AA, AA, AA, AA, AA, AA, GS, GS,  0, RR, US,		/* 80-15 */
+	 0, Ah, Ah, Ah, Ah, Ah, Ah, AA, AA, AA, AA, AA, AA, AA, AA, AA,		/* 96-15 */
+	AA, AA, AA, AA, AA, AA, AA, AA, AA, AA, AA, RR, RR,  0,  0,  0,		/* 112-15 */
 	/* Nothing in the 128.. range */
 };
diff --git a/git-compat-util.h b/git-compat-util.h
index e20b1e8..5eaa662 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -328,12 +328,18 @@ extern unsigned char sane_ctype[256];
 #define GIT_DIGIT 0x02
 #define GIT_ALPHA 0x04
 #define GIT_SPECIAL 0x08
+#define GIT_REGEX_SPECIAL 0x10
+#define GIT_UNDERSCORE 0x20
+#define GIT_LOWER_XDIGIT 0x40
 #define sane_istest(x,mask) ((sane_ctype[(unsigned char)(x)] & (mask)) != 0)
 #define isspace(x) sane_istest(x,GIT_SPACE)
 #define isdigit(x) sane_istest(x,GIT_DIGIT)
 #define isalpha(x) sane_istest(x,GIT_ALPHA)
 #define isalnum(x) sane_istest(x,GIT_ALPHA | GIT_DIGIT)
 #define isspecial(x) sane_istest(x,GIT_SPECIAL)
+#define isregexspecial(x) sane_istest(x,GIT_SPECIAL | GIT_REGEX_SPECIAL)
+#define iswordchar(x) sane_istest(x,GIT_ALPHA | GIT_DIGIT | GIT_UNDERSCORE)
+#define islowerxdigit(x) sane_istest(x,GIT_DIGIT | GIT_LOWER_XDIGIT)
 #define tolower(x) sane_case((unsigned char)(x), 0x20)
 #define toupper(x) sane_case((unsigned char)(x), 0)
 