Re: Regex in PHP

On Thu, 2008-06-05 at 00:24 -0400, Nathan Nobbe wrote:
>
> you really know how to rub it in there rob.  but i was looking at the
> implementation in the php code, looks like somebody likes my idea (this
> code found in ext/standard/string.c).  on the second line the haystack
> is converted to lower case[1], then if it passes a couple of checks, the
> needle is converted to lower case[2], and lastly the comparison is
> performed[3].  there is no logic to check both cases.
> (i have placed a star beside the statements i've referred to).
> ...
>     haystack_dup = estrndup(haystack, haystack_len);
> *[1]    php_strtolower(haystack_dup, haystack_len);
> 
>     if (Z_TYPE_P(needle) == IS_STRING) {
>         if (Z_STRLEN_P(needle) == 0 ||
>             Z_STRLEN_P(needle) > haystack_len) {
>             efree(haystack_dup);
>             RETURN_FALSE;
>         }
> 
>         needle_dup = estrndup(Z_STRVAL_P(needle), Z_STRLEN_P(needle));
> *[2]        php_strtolower(needle_dup, Z_STRLEN_P(needle));
> *[3]        found = php_memnstr(haystack_dup + offset, needle_dup,
>                 Z_STRLEN_P(needle), haystack_dup + haystack_len);
>     }

Funny, I guess they took the quick route: duplicate both strings,
lowercase the copies, then search. This code could obviously be
optimized :)
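
For illustration only, here is a rough standalone sketch of one way the
duplication could be avoided: lowercase bytes on the fly while scanning.
The name ci_memnstr is made up for this example and is not part of PHP,
and tolower() from <ctype.h> stands in for whatever PHP uses internally.
It is a naive scan, so whether it actually beats the copy-then-search
approach would have to be measured.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a PHP function): case-insensitive search for
 * needle inside haystack without allocating lowercased copies.
 * Returns a pointer to the first match, or NULL if there is none.
 * Like the quoted code, an empty or over-long needle is treated as
 * "not found".
 */
const char *ci_memnstr(const char *haystack, size_t haystack_len,
                       const char *needle, size_t needle_len)
{
    size_t i, j;

    if (needle_len == 0 || needle_len > haystack_len) {
        return NULL;
    }

    /* Try every start position where the needle still fits. */
    for (i = 0; i + needle_len <= haystack_len; i++) {
        for (j = 0; j < needle_len; j++) {
            unsigned char h = (unsigned char)haystack[i + j];
            unsigned char n = (unsigned char)needle[j];

            /* Only fold case when the raw bytes differ. */
            if (h != n && tolower(h) != tolower(n)) {
                break;
            }
        }
        if (j == needle_len) {
            return haystack + i;
        }
    }

    return NULL;
}
```

Calling ci_memnstr("Hello World", 11, "WORLD", 5) returns a pointer to
the "World" part of the haystack; no memory is allocated or freed.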

But let's go with something used more often... such as more traditional
string comparison where you're more likely to want to eke out
efficiency:

ZEND_API int zend_binary_strcasecmp
(char *s1, uint len1, char *s2, uint len2)
{
    int len;
    int c1, c2;

    len = MIN(len1, len2);

    while (len--) {
        c1 = zend_tolower((int)*(unsigned char *)s1++);
        c2 = zend_tolower((int)*(unsigned char *)s2++);
        if (c1 != c2) {
            return c1 - c2;
        }
    }

    return len1 - len2;
}

Well, looks like they do indeed do a conversion, but on a char by char
basis, lowercasing both bytes on every iteration even when they are
already identical. Strange that. It could more than likely be sped up by
comparing the raw bytes first and only falling back on the lowercase
comparison when they differ. Maybe I'll compile and test out the
following later:

ZEND_API int zend_binary_strcasecmp
(char *s1, uint len1, char *s2, uint len2)
{
    int len;
    int c1, c2;

    len = MIN(len1, len2);

    while (len--) {
        c1 = (int)*(unsigned char *)s1++;
        c2 = (int)*(unsigned char *)s2++;

        if( c1 != c2 ){
            c1 = zend_tolower( c1 );
            c2 = zend_tolower( c2 );

            if (c1 != c2) {
                return c1 - c2;
            }
        }
    }

    return len1 - len2;
}
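
For anyone who wants to try this outside the PHP source tree, here is a
minimal standalone sketch of both versions plus a small equivalence
check. The function names are made up for the example, tolower() from
<ctype.h> is assumed as a stand-in for zend_tolower, and size_t replaces
uint. It only checks that the two versions agree; it says nothing about
speed.

```c
#include <ctype.h>
#include <stddef.h>

/* Stand-in for the existing version: lowercases every byte pair. */
int strcasecmp_always_lower(const char *s1, size_t len1,
                            const char *s2, size_t len2)
{
    size_t len = len1 < len2 ? len1 : len2;

    while (len--) {
        int c1 = tolower((unsigned char)*s1++);
        int c2 = tolower((unsigned char)*s2++);
        if (c1 != c2) {
            return c1 - c2;
        }
    }
    return (int)len1 - (int)len2;
}

/* Stand-in for the proposed version: lowercases only on a raw mismatch. */
int strcasecmp_fast_path(const char *s1, size_t len1,
                         const char *s2, size_t len2)
{
    size_t len = len1 < len2 ? len1 : len2;

    while (len--) {
        int c1 = (unsigned char)*s1++;
        int c2 = (unsigned char)*s2++;
        if (c1 != c2) {
            c1 = tolower(c1);
            c2 = tolower(c2);
            if (c1 != c2) {
                return c1 - c2;
            }
        }
    }
    return (int)len1 - (int)len2;
}

/* Counts the single-byte pairs on which the two versions disagree. */
int count_disagreements(void)
{
    int a, b, bad = 0;

    for (a = 0; a < 256; a++) {
        for (b = 0; b < 256; b++) {
            char x = (char)a, y = (char)b;
            if (strcasecmp_always_lower(&x, 1, &y, 1) !=
                strcasecmp_fast_path(&x, 1, &y, 1)) {
                bad++;
            }
        }
    }
    return bad;
}
```

If count_disagreements() returns 0, the fast path gives the same result
as the original for every possible pair of bytes, so the remaining
question is purely one of timing.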

Cheers,
Rob.
-- 
http://www.interjinn.com
Application and Templating Framework for PHP


-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php

