Re: Regular expressions (regex) question for parsing

2008/12/22 Jim Lucas <lists@xxxxxxxxx>:
> Rene Fournier wrote:
>> Hi, I'm looking for some ideas on the best way to parse blocks of text
>> that are formatted like this:
>>
>> $sometext %\r\n                        -- good data
>> $otherstring %\r\n                        -- good data
>> $andyetmoretext %\r\n                    -- good data
>> $finaltext                                 -- bad data (missing ending)
>>
>> Each line should start with a $dollar sign, then some arbitrary text,
>> ends with a percent sign, followed by carriage-return and line-feed.
>> Sometimes though, the final line is not complete. In that case, I want
>> to save those lines too.
>>
>> ....so that I end up with an array like:
>>
>> $result = array (    "matches" =>
>>         array (    0 => "$sometext %\r\n",
>>                 1 => "$otherstring %\r\n",
>>                 2 => "$andyetmoretext %\r\n"
>>                 ),
>>     "non_matches" =>
>>         array (    3 => "$finaltext"
>>                 )
>>             );
>>
>> The key thing here is that the line numbers are preserved and the
>> non-matched lines are saved...
>>
>> Any ideas on the best way to go about this? preg_match, preg_split
>> or something incorporating explode?
>>
>> ....Rene
>>
>
> Something along the lines of this?
>
> <pre><?php
>
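> // Double quotes so that \r\n become real CR and LF characters.
> $block_of_text = "\$sometext %\r\n"
>                . "\$otherstring %\r\n"
>                . "\$andyetmoretext %\r\n"
>                . "\$finaltext";
>
> // Split after each LF so that complete lines keep their \r\n ending.
> $lines = preg_split('/(?<=\n)/', $block_of_text, -1, PREG_SPLIT_NO_EMPTY);
>
> $results = array();
>
> foreach ( $lines AS $i => $line ) {
>        // "$", any text, " %", then CR LF at the very end of the line.
>        if ( preg_match('/^\$.* %\r\n\z/', $line ) ) {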
>                $results['matches'][$i] = $line;
>        } else {
>                $results['non_matches'][$i] = $line;
>        }
> }
>
> print_r($results);
>
> ?>

I know I'm arguing against premature optimization in another thread at
the moment, but using regexps for this is overkill from the get-go:

    if ($line[0] === '$' && substr($line, -3) === "%\r\n") {

This is both faster and easier to read. Regexps are great for more
complex stuff but something this simple doesn't demand their power (or
overhead).
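
To make that concrete, here is a rough sketch of the whole thing done with
plain string functions. It assumes the input uses real CR LF line endings.
The function name classify_lines() and its signature are made up for
illustration and are not from anyone's code in this thread. Note that the
pieces returned by explode("\n", ...) no longer carry their "\n", so the
sketch puts it back on every piece except the last before comparing.

```php
<?php
// Hypothetical helper: the name and signature are illustrative only.
function classify_lines(string $text): array
{
    $results = array('matches' => array(), 'non_matches' => array());

    $pieces = explode("\n", $text);
    $last   = count($pieces) - 1;

    foreach ($pieces as $i => $piece) {
        // explode() removed the "\n"; restore it on all but the last piece.
        $line = ($i < $last) ? $piece . "\n" : $piece;

        // An empty final piece just means the text ended with a newline.
        if ($line === '') {
            continue;
        }

        // strncmp() is safe on short strings, unlike indexing $line[0].
        $starts_ok = strncmp($line, '$', 1) === 0;
        $ends_ok   = substr($line, -3) === "%\r\n";

        if ($starts_ok && $ends_ok) {
            $results['matches'][$i] = $line;
        } else {
            $results['non_matches'][$i] = $line;
        }
    }

    return $results;
}

$text = "\$sometext %\r\n\$otherstring %\r\n\$andyetmoretext %\r\n\$finaltext";
print_r(classify_lines($text));
```

The array keys are the original line positions, so the incomplete last line
ends up under "non_matches" with key 3, as in the array Rene asked for.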


Torben

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php

