Re: Regular Expression - highlighting

Hi Michael,

Thanks very much for the assistance. I'll have to investigate further!

Kind Regards,
Aidan Lister


"Michael Sims" <michaels@xxxxxxxxxxxxxx> wrote in message 
news:EOEIIEMPJGBOGHFJBKAFEEDICBAA.michaels@xxxxxxxxxxxxxxxxx
> Aidan Lister wrote:
>> Hello list,
>>
>> I'm pretty terrible with regular expressions, I was wondering if
>> someone would be able to help me with this
>> http://paste.phpfi.com/31964
>>
>> The problem is detailed in the above link. Basically I need to match
>> the contents of any HTML tag, except a link. I'm pretty sure a
>> lookbehind set is needed in the center (%s) bit.
>>
>> Any suggestions would be appreciated, but it's not quite as simple as
>> it sounds - if possible please make sure you run the above script and
>> see if it "PASSED".
>
> So basically, you want to put a link around "foo", only if it doesn't
> already have one, right?
>
> The problem with look-behind assertions is that they have to be
> fixed-width. If you're certain of what kind of data you're going to be
> dealing with, then a fixed-width look-behind may be sufficient. For
> example, I came up with a regex that will PASS your script, but I seriously
> doubt that it'll be very useful to you, as it would be easy to break it by
> coming up with various test cases. For your single test case, however,
> this works:
>
> /(?<!<a href="foo">)(?<!<a href=")(foo)/
>
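To make the behaviour of the pattern above concrete, here is a minimal sketch. It uses Python's `re` module rather than PHP, on the assumption that this particular pattern behaves the same way there (Python has the same fixed-width look-behind rule). The `sample` string and the `link_foo` helper are hypothetical; the actual test script from the paste link is not reproduced here.

```python
import re

# The pattern from the message, unchanged: two fixed-width negative look-behinds.
PATTERN = re.compile(r'(?<!<a href="foo">)(?<!<a href=")(foo)')


def link_foo(html):
    """Wrap each "foo" that the pattern accepts in a link (hypothetical helper)."""
    # \1 is the captured word, so the replacement re-emits it inside the link.
    return PATTERN.sub(r'<a href="foo">\1</a>', html)


# Hypothetical input: one bare "foo" and one that is already linked.
sample = 'foo and <a href="foo">foo</a>'
result = link_foo(sample)
print(result)
```

Only the bare "foo" is wrapped: the one inside the `href` value is rejected by the second look-behind, and the link text is rejected by the first.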
> The problem is that HTML tags can be split across lines, they can have any
> variable amount of whitespace within the tag, and they can have other
> attributes (class, id, onClick), etc. Since look-behind assertions have to
> be fixed-width, it'd be impossible (IMHO) to come up with a single regex
> that would match all cases unless the input data was uniform. For example,
> stuff like
>
> <a   href = "foo" ID="id1" class="redlink"
> onClick="javascript:someFunction();">foo</a>
>
> and its infinite variants could not be trapped with a single regex, since
> you cannot have an infinite number of fixed-width look-behind assertions.
> If quantifiers such as '*', '+', and '?' were allowed in look-behind
> assertions it would be possible, but they aren't (see "man perlre").
>
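Both points in the paragraph above can be checked with a short sketch, again in Python's `re` module on the assumption that it mirrors the restriction described for Perl and PHP. The `compiles` helper and the variable names are hypothetical.

```python
import re

# The fixed-width pattern from earlier in the message.
FIXED = re.compile(r'(?<!<a href="foo">)(?<!<a href=")(foo)')

# The variant markup quoted above: extra whitespace, extra attributes, a line break.
variant = ('<a   href = "foo" ID="id1" class="redlink"\n'
           'onClick="javascript:someFunction();">foo</a>')

# Both occurrences of "foo" already belong to the link, yet both still match,
# because neither is preceded by the exact strings the look-behinds expect.
false_hits = FIXED.findall(variant)


def compiles(pattern):
    """Return True if the regex engine accepts the pattern (hypothetical helper)."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# A look-behind containing a quantifier (\s+) is rejected at compile time.
variable_ok = compiles(r'(?<!<a\s+href="foo">)foo')
print(len(false_hits), variable_ok)
```

So the fixed-width pattern would wrap text that is already linked, and the obvious repair (allowing variable whitespace in the look-behind) is not accepted by the engine.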
> If your data is coming from unknown sources you'll probably have to use a
> full-fledged HTML parser to pull out text that isn't already part of an <a>
> tag. I know there are several of these available for Perl and I'm sure
> there are for PHP too, but I'm unaware of them.
>
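As a rough illustration of the parser-based approach, the sketch below uses `html.parser.HTMLParser` from Python's standard library; it is not one of the Perl or PHP parsers mentioned above. The `Linker` class and `add_links` function are hypothetical names. It re-emits the markup as it goes and links the word only in text that is outside any `<a>` element. Comments, doctypes, and processing instructions are dropped, so treat it as a starting point only.

```python
from html import escape
from html.parser import HTMLParser


class Linker(HTMLParser):
    """Re-emit HTML, linking a word only in text outside <a> elements."""

    def __init__(self, word, url):
        # Keep entities as written so they can be passed through untouched.
        super().__init__(convert_charrefs=False)
        self.word = word
        self.url = url
        self.depth = 0   # greater than 0 while inside an <a> element
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.depth += 1
        # get_starttag_text() returns the tag exactly as it appeared in the input.
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are copied through unchanged.
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "a" and self.depth:
            self.depth -= 1
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.depth == 0:
            link = f'<a href="{escape(self.url)}">{self.word}</a>'
            data = data.replace(self.word, link)
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")


def add_links(html, word, url):
    """Return html with each unlinked occurrence of word wrapped in a link."""
    parser = Linker(word, url)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


print(add_links('foo <a   href = "foo" class="redlink">foo</a>', "foo", "foo"))
```

Because the parser tracks whether it is inside an `<a>` element, whitespace, extra attributes, and line breaks within the tag no longer matter.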
> Sorry if this isn't terribly helpful. Maybe I'm overlooking something and
> someone else will point out a simple way to accomplish what you're trying
> to do...

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php

