2013/5/29 Matijn Woudt <tijnema@xxxxxxxxx>

> On Wed, May 29, 2013 at 10:51 PM, Sebastian Krebs <krebs.seb@xxxxxxxxx> wrote:
>
>> 2013/5/29 Matijn Woudt <tijnema@xxxxxxxxx>
>>
>>> On Wed, May 29, 2013 at 6:08 PM, Sean Greenslade <zootboysean@xxxxxxxxx> wrote:
>>>
>>> > On Wed, May 29, 2013 at 9:57 AM, Jonesy <gmane@xxxxxxxx> wrote:
>>> > > On Tue, 28 May 2013 14:17:06 -0700, Daevid Vincent wrote:
>>> > >> I'm adding some minification to our cache.class.php and am running
>>> > >> into an edge case that is causing me grief.
>>> > >>
>>> > >> I want to remove all comments of the // variety, HOWEVER I don't
>>> > >> want to remove URLs...
>>> > >
>>> > > KISS.
>>> > >
>>> > > To make it simple, straight-forward, and understandable next year
>>> > > when I have to re-read what I've written:
>>> > >
>>> > > I'd change all "://" to "QqQ" -- or any unlikely text string.
>>> > >
>>> > > Then I'd do whatever needs to be done to the "//" occurrences.
>>> > >
>>> > > Finally, I'd change all "QqQ" back to "://".
>>> > >
>>> > > Jonesy
>>> >
>>> > Wow. This is just a spectacularly bad suggestion.
>>> >
>>> > First off, this task is probably a bit beyond the capabilities of a
>>> > regex. Yes, you may be able to come up with something that works 99%
>>> > of the time, but this is really a job for a parser of some sort. I'm
>>> > sorry I don't have any suggestions on exactly where to go with that,
>>> > however I'm sure Google can be of assistance. The main problem is that
>>> > regex doesn't understand context. It just blindly finds patterns. A
>>> > parser understands context, and can figure out which //'s are comments
>>> > and which are something else. As a bonus, it can probably understand
>>> > other forms of comments like /* */, which regex would completely die
>>> > on.
>>>
>>> It is possible to write a whole parser as a single regex, though it
>>> would be terribly long and complex.
>>
>> No, it isn't.
>
> It's better if you throw some smart words on the screen if you want to
> convince someone. Just thinking about it, it makes sense, as a true regular
> expression can only describe a regular language, and I think programming
> languages are not regular languages.
> But we have PHP PCRE with extensions like recursive patterns[1] and back
> references[2], which can describe much more than just a regular language.
> And I do believe it would be able to handle it.
> Too bad it probably takes months to complete a regular expression like
> this.

Then you should start as soon as possible, so that you do not realize this
is wrong only when it is too late. I am not going to start explaining this
again, because it becomes a waste of time. You call it "smart words on the
screen", I call it "advice".

> - Matijn
>
> [1] http://php.net/manual/en/regexp.reference.recursive.php
> [2] http://php.net/manual/en/regexp.reference.back-references.php

--
github.com/KingCrunch
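
For anyone reading the thread later, here is a minimal sketch of the "track the context" approach that the parser argument above is about. It is not anything posted by the participants. It assumes the text being minified is JavaScript-like, that URLs appear inside quoted strings, and that regex literals, template literals, and unquoted URLs (such as CSS url(...)) do not occur. The function name strip_comments is made up for illustration, and Python is used only to keep the example short.

```python
def strip_comments(source: str) -> str:
    """Remove // and /* */ comments that occur outside quoted strings.

    Assumption: only '...' and "..." strings exist; regex literals,
    template literals, and unquoted URLs are not handled.
    """
    out = []
    i = 0
    n = len(source)
    quote = None  # quote character of the string we are inside, if any
    while i < n:
        ch = source[i]
        if quote is not None:
            out.append(ch)
            if ch == "\\" and i + 1 < n:
                # Copy the escaped character too, so \" does not end the string.
                out.append(source[i + 1])
                i += 2
                continue
            if ch == quote:
                quote = None
            i += 1
        elif ch in "\"'":
            # Entering a string: everything up to the closing quote is kept.
            quote = ch
            out.append(ch)
            i += 1
        elif source.startswith("//", i):
            # Line comment: skip to the end of the line, keep the newline.
            end = source.find("\n", i)
            i = n if end == -1 else end
        elif source.startswith("/*", i):
            # Block comment: skip past the closing */ (or to EOF if unterminated).
            end = source.find("*/", i + 2)
            i = n if end == -1 else end + 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


if __name__ == "__main__":
    sample = 'var u = "http://example.com/a"; // trailing comment\n'
    print(strip_comments(sample), end="")
```

The one variable that records whether the scanner is inside a string is the "context" a plain search-and-replace lacks. A "//" inside quotes is copied through, and a "//" outside them starts a comment. Each construct left out above would need a state of its own, which is why a real tokenizer is usually the safer choice.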