Matijn Woudt <tijnema@xxxxxxxxx> wrote:

>On Wed, May 29, 2013 at 10:51 PM, Sebastian Krebs <krebs.seb@xxxxxxxxx> wrote:
>
>> 2013/5/29 Matijn Woudt <tijnema@xxxxxxxxx>
>>
>>> On Wed, May 29, 2013 at 6:08 PM, Sean Greenslade <zootboysean@xxxxxxxxx> wrote:
>>>
>>> > On Wed, May 29, 2013 at 9:57 AM, Jonesy <gmane@xxxxxxxx> wrote:
>>> > > On Tue, 28 May 2013 14:17:06 -0700, Daevid Vincent wrote:
>>> > >> I'm adding some minification to our cache.class.php and am running
>>> > >> into an edge case that is causing me grief.
>>> > >>
>>> > >> I want to remove all comments of the // variety, HOWEVER I don't
>>> > >> want to remove URLs...
>>> > >
>>> > > KISS.
>>> > >
>>> > > To make it simple, straight-forward, and understandable next year
>>> > > when I have to re-read what I've written:
>>> > >
>>> > > I'd change all "://" to "QqQ" -- or any unlikely text string.
>>> > >
>>> > > Then I'd do whatever needs to be done to the "//" occurrences.
>>> > >
>>> > > Finally, I'd change all "QqQ" back to "://".
>>> > >
>>> > > Jonesy
>>> >
>>> > Wow. This is just a spectacularly bad suggestion.
>>> >
>>> > First off, this task is probably a bit beyond the capabilities of a
>>> > regex. Yes, you may be able to come up with something that works 99%
>>> > of the time, but this is really a job for a parser of some sort. I'm
>>> > sorry I don't have any suggestions on exactly where to go with that,
>>> > however I'm sure Google can be of assistance. The main problem is that
>>> > regex doesn't understand context. It just blindly finds patterns. A
>>> > parser understands context, and can figure out which //'s are comments
>>> > and which are something else. As a bonus, it can probably understand
>>> > other forms of comments like /* */, which regex would completely die
>>> > on.
>>> >
>>>
>>> It is possible to write a whole parser as a single regex, albeit a
>>> terribly long and complex one.
>>>
>>
>> No, it isn't.
>>
>
>It's better if you throw some smart words on the screen if you want to
>convince someone. Thinking about it, it makes sense: a true regular
>expression can only describe a regular language, and I think programming
>languages are generally not regular languages.
>But we have PHP PCRE with extensions like recursive patterns[1] and
>back references[2], which can describe much more than just a regular
>language. And I do believe it would be able to handle it.
>Too bad it would probably take months to complete a regular expression
>like this.
>
>- Matijn
>
>[1] http://php.net/manual/en/regexp.reference.recursive.php
>[2] http://php.net/manual/en/regexp.reference.back-references.php

Sometimes when all you know is regex, everything looks like a nail...

Thanks,
Ash

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
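
For what it's worth, the "a parser understands context" point from earlier in the thread can be shown with a very small hand-written scanner. This is only a sketch, not anyone's actual cache.class.php code: the function name strip_comments is made up, it is written in Python purely for brevity (the same loop ports directly to PHP), and it assumes JavaScript-like input in which URLs sit inside quoted string literals.

```python
def strip_comments(src: str) -> str:
    """Remove // and /* */ comments, leaving quoted string literals untouched.

    Hypothetical helper for illustration only. It tracks one piece of
    context: whether the scanner is currently inside a string literal.
    """
    out = []
    i, n = 0, len(src)
    quote = None  # quote character of the string we are inside, or None

    while i < n:
        ch = src[i]
        if quote is not None:
            # Inside a string: copy everything verbatim, including any "//".
            out.append(ch)
            if ch == "\\" and i + 1 < n:
                # Copy the escaped character so \" does not end the string.
                out.append(src[i + 1])
                i += 2
                continue
            if ch == quote:
                quote = None
            i += 1
        elif ch in "\"'":
            # Opening quote: start of a string literal.
            quote = ch
            out.append(ch)
            i += 1
        elif src.startswith("//", i):
            # Line comment: skip to the end of the line, keep the newline.
            end = src.find("\n", i)
            i = n if end == -1 else end
        elif src.startswith("/*", i):
            # Block comment: skip past the closing */ (or to end of input).
            end = src.find("*/", i + 2)
            i = n if end == -1 else end + 2
        else:
            out.append(ch)
            i += 1

    return "".join(out)


if __name__ == "__main__":
    sample = 'var u = "http://example.com"; // home page\nvar x = 1; /* tmp */\n'
    print(strip_comments(sample))
```

The URL survives because the scanner knows it is inside a string, not because "://" was special-cased, so a string such as "a//b" is left alone as well. The sketch deliberately ignores regex literals, template literals, and URLs that appear outside quotes (for example in HTML text), so it is nowhere near a complete minifier.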