Re: PCRE regex result is different between Linux & Windows.

ClapClap schreef:
> Hi,
> 
> [I precede you, sorry for language mistakes...]

php or english? :-)

> 
> I have done a pretty regex which can normally strip all the empties HTML

regexps are never pretty IMHO ;-)



I'm just going to start asking a whole stack of questions in the
hope something sticks ...

> tags. With PHP v5.2.4, under Windows XP, preg_replace() gives me 90
> matches (which is correct) while under linux it stagnate to 54.

okay, are you using the same PHP version on both machines?
anything in the php.ini's that differs?
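something like the following, run on both machines and diffed, would answer the first two questions. it is only a sketch: the function name is made up, and the list of settings is my guess at what might matter (the two pcre.* limits exist from PHP 5.2.0 on).

```php
<?php
// Collect the version and ini values that seem most likely to influence
// PCRE behaviour. Which settings to include is an assumption, not a
// definitive list.
function describe_environment()
{
    return array(
        'php_version'          => PHP_VERSION,
        'os'                   => PHP_OS,
        'pcre_version'         => defined('PCRE_VERSION') ? PCRE_VERSION : 'unknown',
        'pcre.backtrack_limit' => ini_get('pcre.backtrack_limit'),
        'pcre.recursion_limit' => ini_get('pcre.recursion_limit'),
        'default_charset'      => ini_get('default_charset'),
    );
}

foreach (describe_environment() as $name => $value) {
    echo $name, ' = ', var_export($value, true), "\n";
}
```

if the PCRE versions or the limits differ, that would be the first place I'd look.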

are you possibly looking at an input/file character-set encoding related
issue? (i.e. encoding is different between the two servers)?
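a quick way to test the encoding theory: with the 'u' modifier PCRE expects the subject to be valid UTF-8, so an empty pattern with 'u' only matches when the input really is UTF-8. a minimal sketch (the function name is hypothetical, and the two sample strings are made up):

```php
<?php
// Returns true when $text is valid UTF-8. An empty pattern with the 'u'
// modifier matches any valid UTF-8 string and fails on invalid bytes.
function is_valid_utf8($text)
{
    return preg_match('//u', $text) === 1;
}

var_dump(is_valid_utf8("caf\xC3\xA9")); // e-acute encoded as UTF-8
var_dump(is_valid_utf8("caf\xE9"));     // e-acute encoded as ISO-8859-1
```

running this against the actual input on both servers should show whether one of them is feeding the pattern non-UTF-8 data.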

are you looking at an issue related to line-endings? (you're not using
the 'multi-line' modifier)
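to rule line-endings in or out, you could count them on each machine and then normalise before matching. a sketch, with made-up function names:

```php
<?php
// Count each style of line ending in $text.
function count_line_endings($text)
{
    $crlf = substr_count($text, "\r\n");
    return array(
        'crlf' => $crlf,
        'cr'   => substr_count($text, "\r") - $crlf, // lone \r only
        'lf'   => substr_count($text, "\n") - $crlf, // lone \n only
    );
}

// Convert Windows (\r\n) and lone \r line endings to \n.
// The array is processed in order, so \r\n is handled before lone \r.
function normalize_line_endings($text)
{
    return str_replace(array("\r\n", "\r"), "\n", $text);
}

print_r(count_line_endings("a\r\nb\rc\n"));
var_dump(normalize_line_endings("a\r\nb\rc\n"));
```

if the counts differ between the two copies of the file, the transfer (FTP in ASCII mode, for instance) changed the data.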

I don't think the 'S' modifier is doing anything for you (no
performance benefit if I read the manual correctly) ... try removing it.

can you post a short complete script to see if others can reproduce the
error?
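by "short complete script" I mean something along these lines. the pattern is copied from your post; the sample input is made up, so swap in a fragment of the real data that shows the difference. preg_last_error() needs PHP 5.2.0 or later.

```php
<?php
// Minimal reproduction sketch. The pattern is taken from the quoted post;
// the input string is an invented example, not the real data.
$pattern = '
@
<([a-z0-9]+)[^>]*>     # open tag
  (?:[\s|(?:&nbsp;)]*| # white characters (ugly ?)
    (?R))*             # recursion (parent tag is empty to ?)
</\s*\1\s*>            # close tag
@Sux';

$input = '<div><p> </p><b></b></div><p>text</p>';

$count  = 0;
$output = preg_replace($pattern, '', $input, -1, $count);

echo 'PHP ', PHP_VERSION, ' on ', PHP_OS, "\n";
echo 'matches: ', $count, "\n";
// A non-zero value here means the engine stopped with an error
// (for example a backtrack or recursion limit) instead of finishing.
echo 'preg_last_error: ', preg_last_error(), "\n";
var_dump($output);
```

a NULL result or a non-zero preg_last_error() on one machine only would point at limits or input encoding rather than at the pattern.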

can you search the archives for similar posts regarding regexp
problems ... I recall a thread this year (which I participated in
to try and help the OP) that dealt with something rather similar.

> Let me introduce this regex :
> $pattern = '
> @
> <([a-z0-9]+)[^>]*>     # open tag
>   (?:[\s|(?:&nbsp;)]*| # white characters (ugly ?)
>     (?R))*             # recursion (parent tag is empty to ?)
> </\s*\1\s*>            # close tag
> @Sux'; // I love those pattern's options !! :-D)

nice comment! have you tried to condense the pattern (remove its
comments and the 'x' modifier) ... just to see whether that makes a
difference?
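for reference, this is what I'd expect the condensed form to look like, with the comments, the layout whitespace and the 'x' and 'S' modifiers gone. the sample input is made up. one thing worth knowing while you're in there: [\s|(?:&nbsp;)] is a character class, so it matches single characters from the set (whitespace, '|', '(', '?', ':', '&', 'n', 'b', 's', 'p', ';', ')') rather than the literal string "&nbsp;".

```php
<?php
// The quoted pattern on one line, without 'x' and 'S'. Behaviour should be
// the same as the commented version; comparing the two is the point.
$condensed = '@<([a-z0-9]+)[^>]*>(?:[\s|(?:&nbsp;)]*|(?R))*</\s*\1\s*>@u';

// Invented sample input for illustration.
$input = '<div><p> </p><b></b></div><p>text</p>';

$count  = 0;
$output = preg_replace($condensed, '', $input, -1, $count);

echo 'matches: ', $count, "\n";
var_dump($output);
```

if this version gives the same count on both machines and the commented one does not, the 'x' handling is the suspect.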

> So, is it normal that I have this big difference for matches (Windows vs
> Linux) ?

no, kind of defeats the object of the game ... you may have found a bug,
but I wouldn't bet on it, the recursion wotsit is marked experimental and
there is a high chance the problem is at your end ... but then the
only thing that really matters is figuring out how to solve the issue and
move onto the next problem.

have you tried using the Tidy extension to clean up the input string?
it has all sorts of wonderful settings for making (x)HTML nice and shiny.
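a rough idea of what that could look like. it needs the tidy extension, the function name is made up, and the option choices are my assumptions about what "clean" should mean here; note that 'drop-empty-paras' only deals with empty paragraphs, not every empty tag.

```php
<?php
// Sketch only: requires the tidy extension. Returns null when the
// extension is not available so the caller can fall back to something else.
function clean_with_tidy($html)
{
    if (!function_exists('tidy_repair_string')) {
        return null; // tidy extension not loaded
    }
    $config = array(
        'output-xhtml'     => true, // emit XHTML
        'show-body-only'   => true, // no <html>/<head> wrapper in the result
        'drop-empty-paras' => true, // discard empty <p> elements
    );
    return tidy_repair_string($html, $config, 'utf8');
}

var_dump(clean_with_tidy('<p></p><p>text</p>'));
```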

> I thought it was a recursion trouble so I include the preg_replace() in
> a do/while structure. It did not change anything...

did you change the regexp by removing the "|(?R)" when testing that?
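what I have in mind is roughly this: no (?R) at all, strip the innermost empty tags and repeat until nothing matches any more. a sketch with a made-up function name; note that I have swapped your character class for (?:\s|&nbsp;)*, which is an assumption about what you meant.

```php
<?php
// Non-recursive variant: each pass removes the innermost empty tags,
// and the loop repeats until a pass removes nothing.
// Assumption: "(?:\s|&nbsp;)*" replaces the original character class.
function strip_empty_tags($html)
{
    $pattern = '@<([a-z0-9]+)[^>]*>(?:\s|&nbsp;)*</\s*\1\s*>@u';
    $total   = 0;
    do {
        $count  = 0;
        $result = preg_replace($pattern, '', $html, -1, $count);
        if ($result === null) {
            // The engine reported an error (see preg_last_error()); stop.
            break;
        }
        $html   = $result;
        $total += $count;
    } while ($count > 0);

    return array($html, $total);
}

list($clean, $removed) = strip_empty_tags('<div><p> </p><b></b></div><p>text</p>');
echo 'removed: ', $removed, "\n";
var_dump($clean);
```

be aware that the match counts are not comparable with the recursive version: a nested empty structure counts as one match with (?R) but as one match per tag here.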

> If anyone can tell me about it, it will be a pleasure !
> 
> 


-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php

