On Wed, May 29, 2013 at 10:51 PM, Sebastian Krebs <krebs.seb@xxxxxxxxx> wrote:

> 2013/5/29 Matijn Woudt <tijnema@xxxxxxxxx>
>
>> On Wed, May 29, 2013 at 6:08 PM, Sean Greenslade <zootboysean@xxxxxxxxx> wrote:
>>
>> > On Wed, May 29, 2013 at 9:57 AM, Jonesy <gmane@xxxxxxxx> wrote:
>> > > On Tue, 28 May 2013 14:17:06 -0700, Daevid Vincent wrote:
>> > >> I'm adding some minification to our cache.class.php and am running into an
>> > >> edge case that is causing me grief.
>> > >>
>> > >> I want to remove all comments of the // variety, HOWEVER I don't want to
>> > >> remove URLs...
>> > >
>> > > KISS.
>> > >
>> > > To make it simple, straight-forward, and understandable next year when I
>> > > have to re-read what I've written:
>> > >
>> > > I'd change all "://" to "QqQ" -- or any unlikely text string.
>> > >
>> > > Then I'd do whatever needs to be done to the "//" occurances.
>> > >
>> > > Finally, I'd change all "QqQ" back to "://".
>> > >
>> > > Jonesy
>> >
>> > Wow. This is just a spectacularly bad suggestion.
>> >
>> > First off, this task is probably a bit beyond the capabilities of a
>> > regex. Yes, you may be able to come up with something that works 99%
>> > of the time, but this is really a job for a parser of some sort. I'm
>> > sorry I don't have any suggestions on exactly where to go with that,
>> > however I'm sure Google can be of assistance. The main problem is that
>> > regex doesn't understand context. It just blindly finds patterns. A
>> > parser understands context, and can figure out which //'s are comments
>> > and which are something else. As a bonus, it can probably understand
>> > other forms of comments like /* */, which regex would completely die
>> > on.
>> >
>>
>> It is possible to write a whole parser as a single regex, being it
>> terribly long and complex.
>>
>
> No, it isn't.
>

It's better if you throw some smart words on the screen if you want to
convince someone.
Just thinking about it, it makes sense: a true regular expression can only
describe a regular language, and I don't think any programming language is a
regular language.
But we have PHP's PCRE, with extensions like recursive patterns [1] and back
references [2], which can describe much more than just a regular language. And
I do believe it would be able to handle this. Too bad it would probably take
months to complete a regular expression like that.

- Matijn

[1] http://php.net/manual/en/regexp.reference.recursive.php
[2] http://php.net/manual/en/regexp.reference.back-references.php
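
To make the "regex needs context" point from the thread a bit more concrete,
here is a minimal sketch of one common workaround. It is written in Python
rather than PHP and uses only the standard `re` module, so it involves no
recursive patterns. The idea is to let a single alternation match quoted
strings first and // comments second, then drop only the comments. The helper
name `strip_line_comments` and the sample input are made up for illustration,
and it assumes the URLs to be protected sit inside quoted strings. It knows
nothing about /* */ blocks, regex literals, or unquoted URLs, so it is nowhere
near a full parser.

```python
import re

# One alternation: string literals are tried first, so a "//" inside a
# quoted URL is consumed as part of the string and never seen as a comment.
TOKEN = re.compile(
    r'''
    ("(?:\\.|[^"\\])*"      # group 1: double-quoted string
    |'(?:\\.|[^'\\])*')     #          or single-quoted string
    |//[^\n]*               # a // comment up to the end of the line
    ''',
    re.VERBOSE,
)


def strip_line_comments(source):
    """Remove // comments but keep quoted strings (hypothetical helper)."""
    # Strings land in group 1 and are put back unchanged; a comment match
    # leaves group 1 empty (None), so it is replaced with nothing.
    return TOKEN.sub(lambda m: m.group(1) or "", source)


sample = 'var u = "http://example.com"; // home page\nfoo(); // call\n'
print(strip_line_comments(sample))
```

Whitespace in front of a removed comment is left in place. A URL that is not
inside quotes would still be cut at its "//", which is the kind of case where
a real tokenizer or parser is the safer choice.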