Re: Re: need some regex help to strip out // comments but not http:// urls

2013/5/29 Matijn Woudt <tijnema@xxxxxxxxx>

>
>
> On Wed, May 29, 2013 at 10:51 PM, Sebastian Krebs <krebs.seb@xxxxxxxxx> wrote:
>
>>
>>
>>
>> 2013/5/29 Matijn Woudt <tijnema@xxxxxxxxx>
>>
>>> On Wed, May 29, 2013 at 6:08 PM, Sean Greenslade <zootboysean@xxxxxxxxx> wrote:
>>>
>>> > On Wed, May 29, 2013 at 9:57 AM, Jonesy <gmane@xxxxxxxx> wrote:
>>> > > On Tue, 28 May 2013 14:17:06 -0700, Daevid Vincent wrote:
>>> > >> I'm adding some minification to our cache.class.php and am running
>>> > >> into an edge case that is causing me grief.
>>> > >>
>>> > >> I want to remove all comments of the // variety, HOWEVER I don't
>>> > >> want to remove URLs...
>>> > >
>>> > > KISS.
>>> > >
>>> > > To make it simple, straight-forward, and understandable next year
>>> > > when I have to re-read what I've written:
>>> > >
>>> > > I'd change all "://" to "QqQ"  -- or any unlikely text string.
>>> > >
>>> > > Then I'd do whatever needs to be done to the "//" occurrences.
>>> > >
>>> > > Finally, I'd change all "QqQ" back to "://".
>>> > >
>>> > > Jonesy
>>> >
>>> > Wow. This is just a spectacularly bad suggestion.
>>> >
>>> > First off, this task is probably a bit beyond the capabilities of a
>>> > regex. Yes, you may be able to come up with something that works 99%
>>> > of the time, but this is really a job for a parser of some sort. I'm
>>> > sorry I don't have any suggestions on exactly where to go with that,
>>> > however I'm sure Google can be of assistance. The main problem is that
>>> > regex doesn't understand context. It just blindly finds patterns. A
>>> > parser understands context, and can figure out which //'s are comments
>>> > and which are something else. As a bonus, it can probably understand
>>> > other forms of comments like /* */, which regex would completely die
>>> > on.
>>> >
>>> >
>>> It is possible to write a whole parser as a single regex, albeit a
>>> terribly long and complex one.
>>>
>>
>> No, it isn't.
>>
>
>
> It's better if you throw some smart words on the screen if you want to
> convince someone. Thinking about it, your point makes sense: a true regular
> expression can only describe a regular language, and I don't think any
> programming language is regular.
> But we have PHP PCRE with extensions like recursive patterns[1] and back
> references[2], which can describe much more than just a regular language.
> And I do believe it would be able to handle it.
> Too bad it would probably take months to complete a regular expression like
> this.
>

Then you should start as soon as possible, so that you don't find out that
this is wrong only when it is too late. I am not going to start explaining
this again, because it becomes a waste of time. You call it "smart words on
the screen", I call it "advice".
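
For illustration only, here is a rough sketch of the "understands context"
idea raised earlier in the thread: a tiny hand-written scanner that tracks
whether it is inside a quoted string and drops "//" comments only when it is
not. It is written in Python purely to keep it short; the function name
strip_line_comments is made up for this example, and it assumes the input is
JavaScript-like code in which URLs appear inside quoted strings. It does not
handle /* */ comments, regex literals, or unquoted URLs such as
url(http://...) in CSS, so it is nowhere near a complete minifier.

```python
def strip_line_comments(source: str) -> str:
    """Remove // comments that occur outside of quoted strings.

    Hypothetical helper: assumes JavaScript-like input where string
    literals are delimited by ' or " and use backslash escapes.
    """
    out = []
    quote = None  # the quote character we are currently inside, or None
    i = 0
    n = len(source)
    while i < n:
        ch = source[i]
        if quote is not None:
            # Inside a string: copy everything verbatim.
            out.append(ch)
            if ch == "\\" and i + 1 < n:
                # Copy the escaped character too, so \" or \' cannot
                # be mistaken for the end of the string.
                out.append(source[i + 1])
                i += 2
                continue
            if ch == quote:
                quote = None  # closing quote reached
            i += 1
        elif ch in ('"', "'"):
            # Opening quote: remember which one so we can find its match.
            quote = ch
            out.append(ch)
            i += 1
        elif source.startswith("//", i):
            # A comment outside any string: skip up to (not including)
            # the newline, which is kept by the next loop iteration.
            while i < n and source[i] != "\n":
                i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


if __name__ == "__main__":
    sample = 'var u = "http://example.com"; // homepage\nfoo();'
    print(strip_line_comments(sample))
```

The point is only that a single flag of state ("am I inside a string?")
already covers the quoted-URL case, which a blind search for "//" cannot
distinguish from a comment.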


> - Matijn
>
> [1] http://php.net/manual/en/regexp.reference.recursive.php
> [2] http://php.net/manual/en/regexp.reference.back-references.php
>



-- 
github.com/KingCrunch
