Re: Re: preg_match() returns false but no documentation why

> On 5/30/07, Jim Lucas <lists@xxxxxxxxx> wrote:
> > The op will need to use something other than forward slashes.

At 5/30/2007 03:26 PM, Jared Farrish wrote:
> You mean the delimiters (a la Richard's suggestion about using '|')?


Hi Jared,

If the pattern delimiter character appears in the pattern it must be escaped so that the regexp processor will correctly interpret it as a pattern character and not as the end of the pattern.

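This would produce a regexp error, because the pattern ends at the second slash and whatever follows it is read as (invalid) pattern modifiers: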

        /ldap://*/

but this is OK:

        /ldap:\/\/*/

Therefore if you choose another delimiter altogether you don't have to escape the slashes:

        #ldap://*#

Cleaner and clearer.
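
As a rough illustration of the three spellings above, here is a minimal sketch. The sample URI and the variable names are made up for the example, and the @ operator is there only to keep the expected warning quiet; error_get_last() is one way to see why preg_match() handed back false.

```php
<?php
// Hypothetical sample input, not taken from the original thread.
$uri = 'ldap://example.com';

// Unescaped slashes: the pattern is cut off at "/ldap:/" and the remaining
// "/*/" is read as modifiers, so compilation fails and preg_match()
// returns false (not 0).
$unescaped = @preg_match('/ldap://*/', $uri);
$why = error_get_last(); // the suppressed warning, if one was recorded

// Escaped slashes and an alternate delimiter both compile and match.
$escaped   = preg_match('/ldap:\/\/*/', $uri);
$alternate = preg_match('#ldap://*#', $uri);

var_dump($unescaped, $escaped, $alternate);
echo ($why['message'] ?? 'no warning recorded'), "\n";
```

Comparing the result with === false distinguishes a broken pattern from a pattern that simply did not match, which returns 0.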


> > preg_match('|^ldap(s)?://[a-zA-Z0-9-]+\.[a-zA-Z.]{2,5}$|', $this->server )
> >
> > I also recommend using single quotes instead of double quotes here.
>
> Single Quotes: Noted. Any reason why? I guess you might be a little out of
> luck putting $vars into a regex without . concatenating.

Both PHP and regexp use the backslash as an escape. Inside double quotes, PHP interprets \ as the start of an escape sequence (\n, \t, \$, \\ and so on), while inside single quotes PHP treats \ as a simple backslash character except in the two sequences \\ and \'.

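When working with regexp in PHP you're dealing with two interpreters, first PHP and then regexp. To get a backslash through PHP's double-quote processing reliably, you escape the escapes. (PHP happens to leave an unrecognized sequence such as \/ intact, but \$, \\, \n, \t and the like are consumed before the regexp engine ever sees them, so doubling is the safe habit.)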

Single quotes:          '/ldap:\/\/*/'
Double quotes:          "/ldap:\\/\\/*/"

PHP interprets "\\/" as \/
RegExp interprets \/ as /
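
A small sketch of the two-interpreter effect, assuming nothing beyond core PHP. The variable names and the 'abc' test subject are invented for the example; the dollar-sign case is added here to show where the doubling actually changes the outcome.

```php
<?php
// Both literals produce the same pattern string: /ldap:\/\/*/
$single = '/ldap:\/\/*/';
$double = "/ldap:\\/\\/*/";
var_dump($single === $double); // bool(true)

// Where doubling matters: "\$" is a PHP escape for "$", so in double
// quotes the backslash never reaches the regexp engine.
$literalDollar = '/\$/'; // regexp sees \$ : a literal dollar sign
$endOfSubject  = "/\$/"; // regexp sees $  : the end-of-subject anchor

var_dump(preg_match($literalDollar, 'abc')); // int(0)
var_dump(preg_match($endOfSubject, 'abc'));  // int(1)

// Doubling the backslash restores the intended pattern in double quotes.
$doubled = "/\\$/";
var_dump($doubled === $literalDollar);       // bool(true)
```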

There's also the additional minor argument that single-quoted strings take less processing because PHP isn't scanning them for escaped characters and variables to expand. On a practical level, though, the difference is going to be measured in microseconds and is unlikely to affect the perceptible speed of a typical PHP application.

So, for a pattern like this that contains slashes, it's best to use a non-slash delimiter AND single quotes (unless, as you say, you need to include PHP variables in the pattern):

        $pattern = '#ldap://*#';

Personally I favor heredoc syntax for such situations because I don't have to worry about the quotes (heredoc still expands variables and backslash escapes the same way double quotes do):

$regexp = <<<_
#ldap://*$var#
_;
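
One caveat when a variable goes into a pattern: its contents are read as regexp too. The sketch below is an assumption-laden illustration rather than part of the original advice; it uses preg_quote() to neutralize metacharacters first, and the value of $var, the $quoted name and the REGEX label are all made up for the example.

```php
<?php
// Hypothetical value to embed; unquoted, its dot would act as a wildcard.
$var = 'example.com';

// preg_quote() escapes regexp metacharacters. The second argument names
// the delimiter in use so that it is escaped as well.
$quoted = preg_quote($var, '#');

$regexp = <<<REGEX
#ldap://*$quoted#
REGEX;

echo $regexp, "\n";
var_dump(preg_match($regexp, 'ldap://example.com')); // int(1)
var_dump(preg_match($regexp, 'ldap://exampleXcom')); // int(0)
```

Without the preg_quote() step the second subject would match as well, because the bare dot in example.com accepts any character.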


> > why is there a period in the second pattern?
>
> The period comes from the original article on SitePoint (linked earlier). Is
> it unnecessary? I can't say I'm real sure what this means for the '.' in
> regex's:
>
> "Matches any single character except line break characters \r and \n. Most
> regex flavors have an option to make the dot match line break characters
> too."
> - http://www.regular-expressions.info/reference.html

Inside a bracketed character class, the dot means a literal period character and not a wildcard, so [a-zA-Z.] matches a letter or a period and nothing else.

"All non-alphanumeric characters other than \, -, ^ (at the start) and the terminating ] are non-special in character classes"

PHP PREG
Pattern Syntax
http://www.php.net/manual/en/reference.pcre.pattern.syntax.php
scroll down to 'Square brackets'
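
A quick way to check this for yourself, as a minimal sketch; the test subjects and variable names are invented for the example.

```php
<?php
// Outside a character class the dot is a wildcard.
$wildcard = preg_match('/^a.c$/', 'a!c');

// Inside brackets the dot only matches a literal period.
$withPeriod = preg_match('/^[a-z.]{2,5}$/', 'co.uk');
$withBang   = preg_match('/^[a-z.]{2,5}$/', 'co!uk');

var_dump($wildcard);   // int(1)
var_dump($withPeriod); // int(1)
var_dump($withBang);   // int(0)
```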


> > Also, why are you allowing for uppercase letters
> > when the RFC's don't allow them?
>
> I hadn't gotten far enough to strtolower(), but that's a good point, I
> hadn't actually considered it yet.

Perhaps it has to do with the source of the string: can you guarantee that the URIs passed to this routine conform to spec?

Another way to handle this would be to simply accept case-insensitive strings:

        |^ldap(s)?://[a-z0-9-]+\.[a-z.]{2,5}$|i
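
To compare the options, here is a minimal sketch that runs the same pattern with and without the i modifier, plus the strtolower() route mentioned earlier. The sample URI and variable names are assumptions made for the example.

```php
<?php
$strict   = '|^ldap(s)?://[a-z0-9-]+\.[a-z.]{2,5}$|';
$caseless = '|^ldap(s)?://[a-z0-9-]+\.[a-z.]{2,5}$|i';

// Hypothetical mixed-case input.
$uri = 'LDAPS://Example.COM';

$strictResult   = preg_match($strict, $uri);             // lowercase only
$caselessResult = preg_match($caseless, $uri);           // i modifier
$loweredResult  = preg_match($strict, strtolower($uri)); // normalize first

var_dump($strictResult);   // int(0)
var_dump($caselessResult); // int(1)
var_dump($loweredResult);  // int(1)
```

Which route fits best probably depends on whether the rest of the code needs the URI in normalized form anyway.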

Pattern Modifiers
http://www.php.net/manual/en/reference.pcre.pattern.modifiers.php

i (PCRE_CASELESS)
"If this modifier is set, letters in the pattern match both upper and lower case letters."

Regards,

Paul
__________________________

Paul Novitski
Juniper Webcraft Ltd.
http://juniperwebcraft.com
--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php