Jochem Maas wrote:
>>
>> [Fair warning: sorry for any language mistakes...]
>
> php or english? :-)
>
Ohhh... sh..! I think I speak PHP better than English (silly, isn't it?).
>
> okay, are you using the same PHP version on both machines?
> anything in the php.ini's that differs?
>
Exactly the same version is not possible (Windows vs. Linux).
The php.ini files are nearly identical (only some directory paths differ).
On Windows: PHP 5.1.6 on 2000 SP4 and PHP 5.2.4 on XP SP2 (the official builds).
On Linux (Ubuntu 8.04): PHP 5.2.4-2ubuntu5.3.
> are you possibly looking at an input/file character-set encoding related
> issue? (i.e. encoding is different between the two servers)?
>
All PHP source is written in UTF-8.
I take the HTML code and convert it to UTF-8 using iconv() / mbstring...
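As a minimal sketch of that conversion step (not the exact code I run): the helper name to_utf8() is hypothetical, and Windows-1252 as the source encoding is an assumption about what Word 2003 exports, so adjust it to whatever charset the document declares.

```php
<?php
// Hypothetical helper: normalise an HTML string to UTF-8 with mbstring.
// Assumption: the Word 2003 export is Windows-1252 unless stated otherwise.
function to_utf8($html, $from = 'Windows-1252')
{
    // Heuristic: input that is already valid UTF-8 is returned untouched,
    // so it is not converted a second time.
    if (mb_check_encoding($html, 'UTF-8')) {
        return $html;
    }
    return mb_convert_encoding($html, 'UTF-8', $from);
}

$sample = "caf\xE9";               // "café" encoded as Windows-1252
echo bin2hex(to_utf8($sample)), "\n";
```

The validity check is only a heuristic: a Windows-1252 string that happens to be valid UTF-8 as well would be passed through unchanged.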
> can you post a short complete script to see if others can reproduce the
> error?
>
See the following link for the failing test (matches found: Windows = 90,
Linux = 54): http://pastebin.com/m1c43cc10
I get the same results when:
- comments are removed
- the 'm' or 's' PCRE modifiers are used
- recursion is removed (multiple passes in a while loop; matches per
pass: 55, 26, 5, 2)
This snippet is part of some code whose goal is to convert HTML
from Word 2003 into valid XHTML, but that is beside the point...
As for the PCRE version, I really cannot tell you which one I am using...
Where can I see that?
So it may be a bug? Too bad...
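One possible way to look this up, as a sketch under assumptions rather than a confirmed diagnosis: the PCRE_VERSION constant should exist from PHP 5.2.4 on, and pcre.backtrack_limit, pcre.recursion_limit and preg_last_error() from PHP 5.2.0 on, so the 5.1.6 machine would have to fall back to the PCRE section of phpinfo(). The helper count_matches() is hypothetical. Whether these limits have anything to do with the 90 vs. 54 difference is only a guess.

```php
<?php
// PCRE_VERSION is only defined on newer builds (PHP 5.2.4+); otherwise
// read the "PCRE Library Version" row in the output of phpinfo().
$pcre = defined('PCRE_VERSION') ? PCRE_VERSION : 'unknown (see phpinfo())';
echo "PHP ", PHP_VERSION, ", PCRE ", $pcre, "\n";

// ini_get() returns false when the running PHP does not know a setting.
echo "pcre.backtrack_limit = ", var_export(ini_get('pcre.backtrack_limit'), true), "\n";
echo "pcre.recursion_limit = ", var_export(ini_get('pcre.recursion_limit'), true), "\n";

// Hypothetical helper: count the matches, but return false when PCRE
// reports an error (for example a backtrack or recursion limit being hit)
// instead of silently returning a partial result.
function count_matches($pattern, $subject)
{
    $n = preg_match_all($pattern, $subject, $matches);
    if ($n === false) {
        return false;
    }
    if (function_exists('preg_last_error') && preg_last_error() !== PREG_NO_ERROR) {
        return false;
    }
    return $n;
}
```

Running this on both machines and comparing the output would at least show whether the two PCRE libraries or their limits differ.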
>
> have you tried to use the Tidy extension to clean up the input string?
> it has all sorts of wonderful settings for making (x)HTML nice and shiny.
>
As you might guess, I have already tried it. ;-)
Tidy is too aggressive for parsing HTML from MS Office...
I hope this gets sorted out... :-/
--
Julien
--
PHP General Mailing List (http://www.php.net/)