Seemingly weird regex problem

Hello!  I'm having an odd regex problem.  Here's a summary of what I'm
trying to accomplish:

I've got a report file generated from our business management system
(Progress 4GL), one fixed-width record per line.  I've got a PHP script
that reads the raw file one line at a time and strips out any unwanted
lines (mostly repeated column headings).

I'm stripping out unwanted lines by looking at the beginning of each
line and doing the following:
1. If the line begins with a non-word character (\W+), discard it;
2. If the line begins with the word "Vendor", discard it;
3. If the line begins with "Loc", discard it;
4. If the line begins with a dash, discard it;
5. Else keep the line and write it to an output file.
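
For reference, the five rules above can be sketched as a single function.
This is only an illustration: the name discard_rule() and its integer
return value are hypothetical and do not appear in my script, and the type
declarations assume PHP 7 or later.

```php
<?php
// Hypothetical helper: returns the number of the first rule a line
// violates (1-4), or 0 if the line should be kept (rule 5).
function discard_rule(string $line): int
{
    // preg_match() returns 1 on a match and 0 on no match,
    // so each result is compared strictly against 1.
    if (preg_match('/^\W/', $line) === 1) {
        return 1; // Rule #1: begins with a non-word character
    }
    if (preg_match('/^Vendor/i', $line) === 1) {
        return 2; // Rule #2: begins with "Vendor"
    }
    if (preg_match('/^Loc/i', $line) === 1) {
        return 3; // Rule #3: begins with "Loc"
    }
    if (preg_match('/^-/', $line) === 1) {
        // Rule #4: begins with a dash. A dash is itself a non-word
        // character, so rule #1 already catches these lines and this
        // branch is never reached; it is kept only to mirror the list.
        return 4;
    }
    return 0; // Rule #5: keep the line
}
```

With these rules, discard_rule("AKRN 1001") gives 0 (keep) and
discard_rule("Vendor Name") gives 2.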

I've implemented this with the code snippet below.  The problem I'm
encountering is that any line that begins with a word, such as "AKRN",
matches rule #1 and is discarded.  This is not what I want, but I'm
having difficulty spotting my mistake.

To help spot the issue, I added the if (preg_match("/^\W+/", $line))
check ahead of the switch.  The weird thing is that this check does not
print lines beginning with things like "AKRN", yet the same lines are
caught by rule #1 in the switch statement and discarded.

Any suggestions?

 while (!feof($input_handle))
 {
     $line = fgets($input_handle);

     if (preg_match("/^\W+/", $line))
     {
         echo "$line\n";
     }

     switch ($line)
     {
         case ($total_counter <= 5):
             fwrite($output_handle, $line);
             $counter++;
             $total_counter++;
             break;
         // Rule #1: non-word character
         case preg_match("/^\W+/", $line):
             array_push($tossed_lines, $line);
             echo "Rule #1 violation\n";
             $tossed_counter++;
             $total_counter++;
             break;
         // Rule #2: "Vendor" at beginning of line
         case preg_match("/^Vendor/i", $line):
             array_push($tossed_lines, $line);
             echo "Rule #2 violation\n";
             $tossed_counter++;
             $total_counter++;
             break;
         // Rule #3: "Loc" at beginning of line
         case preg_match("/^Loc/i", $line):
             array_push($tossed_lines, $line);
             echo "Rule #3 violation\n";
             $tossed_counter++;
             $total_counter++;
             break;
         // Rule #4: dash character at beginning of line
         case preg_match("/^\-/", $line):
             array_push($tossed_lines, $line);
             echo "Rule #4 violation\n";
             $tossed_counter++;
             $total_counter++;
             break;
         default:
             fwrite($output_handle, $line);
             $counter++;
             $total_counter++;
             break;
     }
 }
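
A likely explanation, offered as a sketch rather than a confirmed
diagnosis: switch ($line) loosely compares $line (==) against the value
of each case expression.  preg_match() returns 1 on a match and 0 on no
match, so the rule #1 case effectively asks whether $line == 0 when the
pattern does NOT match.  In PHP versions before 8, a non-numeric string
such as "AKRN" is loosely equal to 0, so the case fires for exactly the
lines it should skip, while the standalone if() behaves correctly.  One
common alternative is switch (true) with cases that each yield a boolean.
In the version below, filter_lines(), its array-based interface, and the
$header_lines parameter are hypothetical stand-ins for the
fgets()/fwrite() loop, and keeping the first few lines unconditionally is
an assumption about what the $total_counter case is meant to do.

```php
<?php
// Sketch only: filter_lines() is a hypothetical stand-in for the
// file-reading loop; it takes lines as an array instead of a handle.
function filter_lines(array $lines, int $header_lines = 5): array
{
    $kept = [];
    $tossed = [];
    $total = 0;

    foreach ($lines as $line) {
        $total++;

        // switch (true) runs the first case whose expression is loosely
        // equal to true, so every case below produces a real boolean.
        switch (true) {
            // Assumption: the first $header_lines lines are always kept.
            case $total <= $header_lines:
                $kept[] = $line;
                break;

            // Cases without a body fall through to the next one, so
            // these three act as "rule #1 OR rule #2 OR rule #3".
            // Rule #1 also covers a leading dash (rule #4).
            case preg_match('/^\W/', $line) === 1:
            case preg_match('/^Vendor/i', $line) === 1:
            case preg_match('/^Loc/i', $line) === 1:
                $tossed[] = $line;
                break;

            default:
                $kept[] = $line;
        }
    }

    return ['kept' => $kept, 'tossed' => $tossed];
}
```

A plain if/elseif chain over the same conditions would work equally well;
the point is only that each condition is evaluated as a boolean instead
of being compared against the line itself.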

-- 
Tim Boring
IT Department, Automotive Distributors
Toll Free: 800-421-5556 x3007
Direct: 614-532-4240
E-mail: tboring@xxxxxxxx
