Re: filter function vs regular expression

On Sat, Dec 20, 2008 at 9:06 AM, Richard Heyes <richard@xxxxxxx> wrote:

> > i'm reading a book about PHP and i was wondering why regular expressions
> > are so often used to check the format of variables or emails while the
> > filter functions have existed since version 5.2.
>
> That's not so long.
>
> > What are the advantages of regular expressions when checking variable format?
>
> They're more versatile. Far more in the case of PCRE.


to elaborate: in general, the filter extension should be faster than the
corresponding preg_* calls from userspace.  why?  because in many cases the
filters are essentially compiled-in calls to pcre for one specific job, such
as email validation.  check out the C for php_filter_validate_email(); it's
pretty simple:

void php_filter_validate_email(PHP_INPUT_FILTER_PARAM_DECL) /* {{{ */
{
    /* From http://cvs.php.net/co.php/pear/HTML_QuickForm/QuickForm/Rule/Email.php?r=1.4 */
    const char regexp[] =
"/^((\\\"[^\\\"\\f\\n\\r\\t\\b]+\\\")|([\\w\\!\\#\\$\\%\\&\\'\\*\\+\\-\\~\\/\\^\\`\\|\\{\\}]+(\\.[\\w\\!\\#\\$\\%\\&\\'\\*\\+\\-\\~\\/\\^\\`\\|\\{\\}]+)*))@((\\[(((25[0-5])|(2[0-4]

    pcre       *re = NULL;
    pcre_extra *pcre_extra = NULL;
    int preg_options = 0;
    int         ovector[150]; /* Needs to be a multiple of 3 */
    int         matches;

    re = pcre_get_compiled_regex((char *)regexp, &pcre_extra, &preg_options TSRMLS_CC);
    if (!re) {
        RETURN_VALIDATION_FAILED
    }
    matches = pcre_exec(re, NULL, Z_STRVAL_P(value), Z_STRLEN_P(value), 0, 0, ovector, 3);

    /* 0 means that the vector is too small to hold all the captured substring offsets */
    if (matches < 0) {
        RETURN_VALIDATION_FAILED
    }
}

basically all it does is call pcre_exec() with a hard-coded email regex
against the string you pass in from userspace.  the difference between that
and a call to preg_match() using the same regex and subject string should be
speed.  the other tradeoff, as richard mentioned, is flexibility.  since the
extension can't possibly ship a dedicated function for every pattern you
might want to match, it makes sense to expose something general like the
preg_* functions to userspace.  that being said, i'd recommend wrapping the
filter_* calls in your validation code when & where possible, which is
essentially the mantra of php programming in general anyway (stick to the
native functions as much as possible).
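
a minimal sketch of what that could look like in userspace.  the helper
names (is_valid_email, is_valid_order_code) and the order-code format are
made up for illustration; only filter_var(), FILTER_VALIDATE_EMAIL and
preg_match() are real php functions/constants:

```php
<?php
// Hypothetical helper: wraps the native email filter.
function is_valid_email($candidate)
{
    // filter_var() returns the filtered value on success and false on failure.
    return filter_var($candidate, FILTER_VALIDATE_EMAIL) !== false;
}

// Hypothetical helper: a custom format with no dedicated filter,
// so this is where a regex earns its keep.
function is_valid_order_code($candidate)
{
    // Assumed format for illustration: three uppercase letters, a dash, four digits.
    // \z anchors at the very end of the string (no trailing newline allowed).
    return preg_match('/^[A-Z]{3}-\d{4}\z/', $candidate) === 1;
}

var_dump(is_valid_email('user@example.com'));   // bool(true)
var_dump(is_valid_email('not an email'));       // bool(false)
var_dump(is_valid_order_code('ABC-1234'));      // bool(true)
```

the point of the wrappers is that calling code never needs to know which
mechanism sits underneath, so a regex can be swapped for a native filter
(or the reverse) in one place.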

-nathan

ps.
i'll probably set up a test later to see if my half-baked theory is even
accurate :O
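
one rough way such a test might be set up.  this is only a sketch: the
pattern below is a simplified stand-in, not the regex the filter extension
uses, so the two loops do not do identical work, and the iteration count is
arbitrary.  php also caches compiled patterns for preg_* calls, so the
compile cost is paid once rather than on every iteration:

```php
<?php
// Simplified stand-in pattern (assumption); NOT the regex from ext/filter.
$pattern    = '/^[^@\s]+@[^@\s]+\.[^@\s]+\z/';
$subject    = 'user@example.com';
$iterations = 100000; // arbitrary; raise it for steadier numbers

// Time the native filter.
$start = microtime(true);
for ($i = 0; $i < $iterations; $i++) {
    filter_var($subject, FILTER_VALIDATE_EMAIL);
}
$filterSeconds = microtime(true) - $start;

// Time the userspace regex call.
$start = microtime(true);
for ($i = 0; $i < $iterations; $i++) {
    preg_match($pattern, $subject);
}
$pregSeconds = microtime(true) - $start;

printf("filter_var: %.4f s\npreg_match: %.4f s\n", $filterSeconds, $pregSeconds);
```

for a fair comparison the preg_match() pattern would need to be the same
regex the filter uses, and the numbers will vary by php version and machine.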
