Re: Parsing pageranges

Luis Moreira <luis.moreira@xxxxxxxxxxxxxxx> · Tue, 29 Jun 2004 13:18:46 +0100

George Pitcher wrote:

Hi,

I have several systems running which gather, store and process bibliographic
data. I have treated pageranges on the basis of two fields per range - start
and end, with supplementary ranges available as well.

I've never had to deal with more than 3 ranges in a reference: chapter,
references and notes. Now, a client has asked for 6 ranges. I could
futureproof this by putting 10 ranges in but I got to thinking about how
PageMaker used to handle printing (and how Micro$oft do now) where I can note
a range as (example) 1-4,6,8-10.

I need to be able to parse this type of string so that I can identify the
number of pages being referenced.

I also need to ensure that the user hasn't entered a mixed range such as
xiii-5 (I know that the second part of that is 1-5 but I don't know what the
highest Roman numeral was). I do know how to handle the Roman calculations,
so that's a side issue.

I'm guessing that regex is the way to go, but whenever I'm confronted with
it, I look for a Chinese interpreter.

Any suggestions?

George in Oxford

I am not used to working with regex either, so here are my two pence...
Just writing as I think, I would say something like:

- Explode the string looking for the comma
- If the resulting "pieces" are numerical, fine.
- If not, explode them individually looking for the hyphen
- If the results are numeric, fine
- If in between something "non-numeric" is found, you have an error

This, obviously, ignores the Roman numerals...
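
The steps above could be sketched roughly as follows. This is only an illustration, written in Python rather than PHP, with split() standing in for explode(). The function name count_pages is made up for the example, and two rules in it are assumptions not stated in the thread: an end page lower than the start page is treated as an error, and overlapping ranges are not merged, so a page listed twice is counted twice.

```python
def count_pages(page_ranges):
    """Count the pages in a string such as "1-4,6,8-10".

    Raises ValueError if any piece is not a plain arabic number or a
    start-end pair of them, so a mixed range like "xiii-5" is rejected.
    """
    total = 0
    # Step 1: split on the comma (the "explode looking for the comma" step).
    for piece in page_ranges.split(","):
        piece = piece.strip()
        # Step 2: split each piece on the hyphen; a single page gives one part.
        bounds = [b.strip() for b in piece.split("-")]
        # Step 3: anything non-numeric (or more than two parts) is an error.
        if len(bounds) > 2 or not all(b.isascii() and b.isdigit() for b in bounds):
            raise ValueError(f"not a numeric page or range: {piece!r}")
        start = int(bounds[0])
        end = int(bounds[-1])  # same as start when the piece is a single page
        # Assumption: a reversed range such as "10-8" is a data-entry error.
        if end < start:
            raise ValueError(f"range ends before it starts: {piece!r}")
        total += end - start + 1
    return total
```

With this sketch, count_pages("1-4,6,8-10") gives 8 (4 + 1 + 3 pages), while "xiii-5" raises an error because "xiii" is not numeric. Roman ranges would need their own conversion before the same counting could apply.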

-- 
PHP Windows Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php