Re: Regex pattern for preg_match_all

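As far as I can tell, your problem lies in [^href]*. That is a character class: it matches any character other than h, r, e or f, not anything other than the string href. That is why only the "local" link matched: its class value is the only one that contains none of those letters. Consider replacing it with [^>]*?. The ? makes the quantifier non-greedy, so it stops as soon as it can (at the first href inside the tag) rather than running as far as it can (up to the closing >) and then backtracking.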
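
A minimal sketch of the suggestion above, not a tested drop-in fix. The sample markup is a shortened, hypothetical version of the data quoted below, and the variable names ($original, $fixed, $before, $after) are made up for illustration. The only change to the pattern is [^href]* becoming [^>]*?.

```php
<?php
// Shortened, hypothetical sample: "images" contains an e and "shopping"
// contains an h, while "local" contains none of h, r, e, f.
// The link text is kept on one line because . does not match a newline
// unless the s modifier is set.
$html = <<<HTML
<li><a class="y-mast-link images"
href="http://images.search.yahoo.com/images">Images</a></li>
<li><a class="y-mast-link local"
href="http://local.yahoo.com/results">Local</a></li>
<li><a class="y-mast-link shopping"
href="http://shopping.yahoo.com/search">Shopping</a></li>
HTML;

// Original pattern: [^href]* is a character class, so it cannot cross any
// h, r, e or f that sits between "<a " and "href".
$original = '%<a\s[^href]*href\s*=\s*[\'|"]?([^\'|"|#]+)[\'|"]?\s*[^>]*>(.*)?</a>%im';

// Suggested change: [^>]*? lazily skips anything inside the tag up to href.
$fixed = '%<a\s[^>]*?href\s*=\s*[\'|"]?([^\'|"|#]+)[\'|"]?\s*[^>]*>(.*)?</a>%im';

$originalCount = preg_match_all($original, $html, $before);
$fixedCount    = preg_match_all($fixed, $html, $after);

echo "original pattern: $originalCount match(es)\n"; // 1 (the "local" link)
print_r($before[1]);

echo "fixed pattern: $fixedCount match(es)\n";       // 3 (all links)
print_r($after[1]);
```

On this sample the original pattern finds only the "local" link, which mirrors the result reported below, while the adjusted pattern captures all three href values in group 1.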
---
Simon Welsh
Sent from my phone, excuse the brevity

On 19/02/2011, at 10:36, Tommy Pham <tommyhp2@xxxxxxxxx> wrote:

> Hi folks,
> 
> This is not directly related to PHP but it's Friday so I'm gonna give
> it a shot :).  Would someone please help me figure out why my regex
> pattern doesn't work.  Below is the code and sample data:
> 
> $html = <<<HTML
> <li class="small  tab "><a class="y-mast-link images"
> href="http://images.search.yahoo.com/images"
> data-b="http://www.yahoo.com"><span class="tab-cover y-mast-bg-hide"
> style="padding-left:0em;padding-right:0em;">Images</span></a></li>
> <li class="small  tab "><a class="y-mast-link video"
> href="http://video.search.yahoo.com/video"
> data-b="http://www.yahoo.com"><span class="tab-cover y-mast-bg-hide"
> style="padding-left:0em;padding-right:0em;">Video</span></a></li>
> <li class="small  tab "><a class="y-mast-link local"
> href="http://local.yahoo.com/results"
> data-b="http://www.yahoo.com"><span class="tab-cover y-mast-bg-hide"
> style="padding-left:0em;padding-right:0em;">Local</span></a></li>
> <li class="small  tab "><a class="y-mast-link shopping"
> href="http://shopping.yahoo.com/search"
> data-b="http://www.yahoo.com"><span class="tab-cover y-mast-bg-hide"
> style="padding-left:0em;padding-right:0em;">Shopping</span></a></li>
> <li class="small lasttab more-tab "><a class="y-mast-link more"
> href="http://tools.search.yahoo.com/about/forsearchers.html" ><span
> class="tab-cover y-mast-bg-hide">More</span><span
> class="y-fp-pg-controls arrow"></span></a></li>
> HTML;
> 
> $pattern = '%<a\s[^href]*href\s*=\s*[\'|"]?([^\'|"|#]+)[\'|"]?\s*[^>]*>(.*)?</a>%im';
> preg_match_all($pattern, $html, $matches);
> 
> The only match I got is:
> 
> Match 1 of 1:    <a class="y-mast-link local"
> href="http://local.yahoo.com/results"
> data-b="http://www.yahoo.com"><span class="tab-cover y-mast-bg-hide"
> style="padding-left:0em;padding-right:0em;">Local</span></a>
> 
> Group 1:    http://local.yahoo.com/results
> 
> Group 2:    <span class="tab-cover y-mast-bg-hide"
> style="padding-left:0em;padding-right:0em;">Local</span>
> 
> I wrote the pattern so that it would still work when the page does
> not comply with any W3C standard.
> 
> Thanks,
> Tommy
> 
> -- 
> PHP General Mailing List (http://www.php.net/)
> To unsubscribe, visit: http://www.php.net/unsub.php
> 
