Re: preg_match

On 8/24/05, Richard Lynch <ceo@xxxxxxxxx> wrote:
> On Tue, August 23, 2005 1:58 am, Robin Vickery wrote:
> >
> > Both of these will fail to compile:
> > preg_match( '/\\7abc/', $myString);
> > preg_match( '/\7abc/',  $myString);
> 
> Well, duh!
> 
> You'd have to use them correctly in Regular Expressions to not get a
> "failure to compile" from the PCRE Module.

Well, duh!

My point is that doubling up the backslashes has done nothing to
change the meaning of the string.

In both of the examples you gave, if it worked before you doubled the
slashes, then it worked after. If it didn't work before, then it
didn't work after. The '\\7' is just as special when it gets to the
regexp as '\7'.

Doubling the slashes has done nothing at all except make it a little
harder to read.
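
A quick sketch of what I mean, in case it helps. The variable names are
just mine for illustration, and the @ is only there to keep the compile
warning off the screen:

```php
<?php
// Both literals are single-quoted, so '\\7' and '\7' produce the same
// two characters: a backslash followed by 7.
$doubled = '/\\7abc/';
$single  = '/\7abc/';

var_dump($doubled === $single);          // bool(true)
var_dump(strlen($single));               // int(7)

// PCRE reads \7 as a back-reference to a seventh group that does not
// exist, so compilation fails for both and preg_match() returns false.
var_dump(@preg_match($doubled, 'abc'));  // bool(false)
var_dump(@preg_match($single, 'abc'));   // bool(false)
```

Either way the regexp engine receives exactly the same seven characters.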

> Look, let's just forget the Regex, since it's confusing you.

It confuses me not at all. 

> "\\7abc" and "\7abc" are NOT the same string.
> "\\abc7" and "\abc7" *ARE* the same string.

Absolutely - if you have them in double quotes then they are NOT the
same. But we weren't discussing double-quoted strings, we were
discussing *single-quotes*.

'\\7abc' and '\7abc' ARE the same string.
'\\abc7' and '\abc7' ARE the same string.

print '\\7abc' === '\7abc' ? "same\n" : "different\n"; // same
print '\\abc7' === '\abc7' ? "same\n" : "different\n"; // same

So of the two semantically identical strings, why not go for the simpler one?
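
For contrast, here is a small sketch of the double-quoted case you were
describing, where the doubling really does matter. It assumes nothing
beyond PHP's standard double-quote escapes; the variable names are mine:

```php
<?php
// Double quotes: "\7" is an octal escape (the BEL character, chr(7)),
// so doubling the backslash changes the string.
$escaped = "\\7abc";   // backslash, 7, a, b, c
$octal   = "\7abc";    // chr(7), a, b, c

var_dump($escaped === $octal);            // bool(false)
var_dump($octal === chr(7) . 'abc');      // bool(true)

// "\a" is not a PHP escape sequence, so the backslash survives either way.
var_dump("\\abc7" === "\abc7");           // bool(true)
```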

> If you use \\ it *ALWAYS* works.
> If you use \ it *SOMETIMES* works.

> *SOMETIMES* \ means just \ and *SOMETIMES* it means something else
> based on the next character.

In a single-quoted string it means something else in only two specific
cases: before another backslash and before a single quote. It's not
exactly a bewildering array of options.
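
As a rough illustration of those two cases in a single-quoted string (a
backslash before another backslash, and a backslash before a single
quote), with made-up variable names:

```php
<?php
// Case 1: \\ collapses to a single backslash.
$one = 'a\\b';
var_dump(strlen($one));        // int(3)

// Case 2: \' is a literal single quote.
$two = 'it\'s';
var_dump($two === "it's");     // bool(true)

// Everything else is left alone: the backslash and the next character
// both stay in the string.
$rest = '\n\t\d\7';
var_dump(strlen($rest));       // int(8)
```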

Consistency is not automatically a good thing. If you make something
consistently more complicated you're *introducing* the likelihood of
bugs. Especially when the original exceptions are so few.

Occam's Razor: Do not multiply entities (or backslashes) beyond necessity.

> If you use \ to mean \ then sooner or later, you're going to end up
> changing the character after the \ to something that suddenly changes
> the meaning of \ to not be \ any more but to be some special character
> (newline, tab, etc)

In single quotes, variables and escape sequences like \n or \t do not
get expanded. That's what the single-quote syntax is for; so you don't
need to worry about escape sequences.
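
A minimal sketch of that, assuming a throwaway $name variable and a
pattern I made up for the purpose:

```php
<?php
$name = 'world';

// Neither $name nor \n is expanded inside single quotes.
$literal = 'hello $name\n';
var_dump(strlen($literal));                // int(13)

// The backslash sequences reach PCRE untouched, and PCRE is the one
// that interprets \d, \t and \w.
$pattern = '/^\d+\t\w+$/';
var_dump(preg_match($pattern, "42\tabc")); // int(1)
```

So swapping the character after the backslash changes what the regexp
means, but never what the PHP string contains.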

> This is especially important when you start getting your Regex strings
> from external sources such as User Input or outside data.

What?

Who in their right mind would use user input in a regexp without
passing it through preg_quote()?
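
Something along these lines, as a sketch only. The $userInput value and
the subject string are invented for the example:

```php
<?php
// Hypothetical user-supplied text containing regexp metacharacters.
$userInput = '1+1=2 (approx.)';
$subject   = 'we know 1+1=2 (approx.) holds';

// preg_quote() escapes the metacharacters; the second argument also
// escapes the delimiter, should the input contain one.
$safe = '/' . preg_quote($userInput, '/') . '/';
var_dump(preg_match($safe, $subject));    // int(1)

// Unquoted, the +, parentheses and dot are treated as regexp syntax,
// so the same text no longer matches itself.
$unsafe = '/' . $userInput . '/';
var_dump(preg_match($unsafe, $subject));  // int(0)
```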

> Yes, the manual definition specifically allows you to use \ and any
> non-special character afterwards.  That still doesn't make it a Good
> Practice.

> The manual no longer specifies that you can indent your code badly,
> AFAICS, which is probably just as well, but you *can* do it, and it's
> syntactically correct.  And it's a Bad Practice.

Says you!

I'd say that the indiscriminate doubling up of backslashes is worse,
but that's my opinion.

> If you still disagree, I give up.

Yeah, it doesn't seem to be getting anywhere.

You say using \\ everywhere is consistent and that that makes code
more maintainable.

I say there are only two well-documented "inconsistencies", and
doubling up every slash for the sake of that is more likely to
introduce problems than solve them.

We disagree. It's not the end of the world.

-robin

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php


