Re: Re: Space in regex

On 17/11/06, Paul Novitski <paul@xxxxxxxxxxxxxxxxxxx> wrote:
At 11/16/2006 03:19 PM, Dotan Cohen wrote:
>However, this function:
>$text=preg_replace_callback('/\[([A-Za-z0-9\|\'.-:underscore:]+)\]/i'
>, "findLinks", $text);
>Does what I want it to when there is no space, regardless of whether
>or not there is a pipe. It does not replace anything if there is a
>space.


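I see your problem -- :underscore: isn't a character class.  Inside
the brackets it's read as the literal characters : u n d e r s c o r e,
and there is no POSIX class named [:underscore:] either (the
whitespace one is [:space:]).  Nothing in your class matches a space
or an underscore, so a link containing one is never replaced.  Add
them as literals:

         /\[([A-Za-z0-9\|\'. _\-]+)\]/i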

I'm not sure why you need those metacharacters, however; I've never
had trouble matching literal space and underscore characters, e.g. [ _]

Also:
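- You don't need to escape the vertical pipe inside a character class.
- You don't need to escape the apostrophe for the regexp; the backslash
is only needed because the pattern sits in a single-quoted PHP string.
- You do need to escape the hyphen (or put it first or last in the
class) unless you mean it to indicate a range, which I'm sure you
don't here.  In your pattern, .-: is read as the range from period
to colon.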
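
A quick way to check the points above is to run both classes against a few bracketed strings. This is only a minimal sketch: the sample strings and the variable names are made up for illustration, and the second pattern is one possible fix, not necessarily the one used on the site.

```php
<?php
// Pattern from the original post: ":underscore:" is ten literal letters
// plus colons, and ".-:" is a range from period to colon.
$original = '/\[([A-Za-z0-9\|\'.-:underscore:]+)\]/i';

// Assumed fix: literal space and underscore, hyphen escaped.
// The \' is for the PHP string, not for the regexp.
$fixed = '/\[([A-Za-z0-9|\'. _\-]+)\]/i';

// Hypothetical sample links.
$samples = ['[Yesterday]', '[Let It Be|Beatles]', '[Let It Be]', '[let_it_be]'];

foreach ($samples as $s) {
    printf(
        "%-22s original: %d  fixed: %d\n",
        $s,
        preg_match($original, $s),
        preg_match($fixed, $s)
    );
}
```

Only the first sample matches the original pattern; the other three contain a space or an underscore, which that class does not include.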


On other regexp points:

>Thanks, Paul. I've been refining my methods, and I think it's better
>(for me) to just match everything between [ and ], including spaces,
>underscores, apostrophes, and pipes. I'll explode on the pipe inside
>the function.
>
>So I thought that a simple "/\[([.]+)\]/i" should do it.

Oops:  "[.]+" will look for one or more periods.  ".+" means one or
more characters of any kind (other than a newline, by default).  So
you'd want:

         /\[(.+)\]/i

In a case like this where you're not using any alphabetic letters in
the pattern, the i pattern modifier is irrelevant, so I'd drop it:

         /\[(.+)\]/
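
The difference between "[.]+" and ".+" is easy to see on a short string. A minimal sketch, with a made-up sample sentence and illustrative variable names:

```php
<?php
// Hypothetical sample text with one link and a run of periods.
$text = 'See [Let It Be] for details...';

// "[.]+" is a class holding only a period, so it finds the run of dots.
preg_match('/[.]+/', $text, $dots);

// ".+" matches one or more of any character (except newline by default).
preg_match('/\[(.+)\]/', $text, $link);

// Same pattern with the i modifier: no letters in it, so nothing changes.
preg_match('/\[(.+)\]/i', $text, $linkI);

var_dump($dots[0]);          // the three trailing periods
var_dump($link[1]);          // the text between the brackets
var_dump($link === $linkI);  // identical results with and without i
```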

Then your problem is that .+ is 'greedy' and will grab as long a
matching string as it can.  If there's more than one of your link
structures in your text, the regexp above will grab everything from
the beginning of the first link to the end of the last.  That's why I
excluded the close-bracket in my pattern:

         /\[([^]]+)]/

I know [^]] looks funny but the close-bracket doesn't need to be
escaped if it's in the first position, which includes the first
position after the negating circumflex.  I've also omitted the
backslash before the final literal close-bracket which doesn't need
one because there's no open bracket context for it to be confused with.
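
To see the greedy match and the negated class side by side, here is a minimal sketch. findLinks is the callback name from the thread, but its body is an assumption: the "target|label" reading of the pipe, the URL layout and the sample text are all invented for illustration.

```php
<?php
// Hypothetical text containing two links.
$text = 'Try [Let It Be|Beatles] and then [Yesterday].';

// Greedy ".+" runs from the first "[" to the last "]".
preg_match_all('/\[(.+)\]/', $text, $greedy);

// "[^]]+" cannot cross a "]", so each link is matched on its own.
preg_match_all('/\[([^]]+)]/', $text, $each);

// Assumed callback: split on the pipe, treat the first part as the
// link target and the second (if present) as the visible label.
function findLinks(array $m): string
{
    $parts  = explode('|', $m[1], 2);
    $target = $parts[0];
    $label  = $parts[1] ?? $parts[0];
    return '<a href="/' . rawurlencode($target) . '">'
        . htmlspecialchars($label) . '</a>';
}

$html = preg_replace_callback('/\[([^]]+)]/', 'findLinks', $text);

print_r($greedy[1]);  // one element spanning both links
print_r($each[1]);    // two elements, one per link
echo $html, "\n";
```

Matching everything up to the first close-bracket and splitting on the pipe inside the callback keeps the pattern short, at the cost of accepting any character between the brackets.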

Regards,
Paul


Thank you Paul. The greedy bit caught me off guard a few hours ago,
but I was able to tame it. I don't remember with exactly what code
(as it's changed about fifty times since then), but I'm now starting
to really get a handle on things. Thank you very much for your
detailed explanation. This is more fun than calculus!

Dotan Cohen

http://lyricslist.com/
http://what-is-what.com/

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php

