Re: Unexplained Issue Using Regex

On Fri, Mar 6, 2009 at 4:01 PM, Nitsan Bin-Nun <nitsan@xxxxxxxxxxxx> wrote:
> On Fri, Mar 6, 2009 at 11:53 PM, haliphax <haliphax@xxxxxxxxx> wrote:
>>
>> On Fri, Mar 6, 2009 at 3:44 PM, Nitsan Bin-Nun <nitsan@xxxxxxxxxxxx>
>> wrote:
>> > I'm not looking for other ideas. The main thing here is that I have
>> > about 30-100 regexes in the database, and the script fetches them and
>> > applies them to the string. I can't rebuild the engine and I'm not
>> > going to do that. I'm trying to solve my problem ;) If you have any
>> > ideas regarding my issue that don't involve going another way, they
>> > would be very appreciated.
>>
>> Nitsan,
>>
>> I think it's because you're referencing the capture group with index 1
>> instead of index 2. Also, I don't understand why you have the pipe
>> ("|") character in your regex string... is that part of your engine?
>>
>> This code:
>>
>> $orig = 'http://www.zshare.net/video/541070871c7a8d9c';
>> $matches = array();
>> preg_match('#http://(www\.)zshare\.net/video/([^/]+)#', $orig, $matches);
>> echo $matches[2];
>>
>> Grabs the correct match:
>>
>> 541070871c7a8d9c
>>
>> The regex pattern works with the pipe char, but it is unnecessary and
>> may lead to some strange behavior.
>
> Thank you Todd, I also want to capture when I don't have the www in the
> beginning of the URL.
> For instance, try to execute your code with
> $orig = 'http://zshare.net/video/541070871c7a8d9c';
>
> That's why I used (www\.|), but I'm not a regex expert and I'm sure there
> are way better solutions to this problem.

http://www.regular-expressions.info is your best friend. Spend an
afternoon playing around on it... that's really the only advantage I
have over someone who hasn't.

Anyway, you can make that entire group optional with the ? character like so:

#http://(www\.)?zshare\.net/video/([^/]+)#
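
Here is a minimal sketch of that optional-group pattern in use. The helper name video_id() is made up for illustration and is not part of your engine. The www group is still group 1 even when it matches nothing, so the video ID stays at index 2 for both URL forms:

```php
<?php
// Hypothetical helper for illustration only; the name is an assumption.
function video_id($url)
{
    // (www\.)? makes the whole "www." group optional. It is still
    // capture group 1, so the video ID remains capture group 2.
    $pattern = '#http://(www\.)?zshare\.net/video/([^/]+)#';
    $matches = array();
    if (preg_match($pattern, $url, $matches) === 1) {
        return $matches[2];
    }
    return null; // the URL did not match the pattern
}

echo video_id('http://www.zshare.net/video/541070871c7a8d9c'), "\n";
echo video_id('http://zshare.net/video/541070871c7a8d9c'), "\n";
```

Both calls should print the same ID, with or without the www prefix.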

And if you don't want the www group to be captured, so that the video ID
ends up at index 1 instead of index 2, do this:

#http://(?:www\.)?zshare\.net/video/([^/]+)#

Any group that begins with "?:" is non-capturing: it still groups the
pattern, but it gets no index in the match array. To recap:

$pattern = '#http://(?:www\.)?zshare\.net/video/([^/]+)#';
$orig = 'http://zshare.net/video/541070871c7a8d9c';
$matches = array();
preg_match($pattern, $orig, $matches);
echo $matches[1] . "\n";
$orig = 'http://www.zshare.net/video/541070871c7a8d9c';
preg_match($pattern, $orig, $matches);
echo $matches[1] . "\n";

Produces this output:

541070871c7a8d9c
541070871c7a8d9c
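
If it helps to see where the ID lands for each way of writing the prefix, here is a rough sketch that runs the three variants from this thread against the same URL. The array labels and variable names are my own, chosen only for the example:

```php
<?php
$url = 'http://zshare.net/video/541070871c7a8d9c';

// Same URL, three ways of writing the optional "www." prefix.
$patterns = array(
    'empty alternative' => '#http://(www\.|)zshare\.net/video/([^/]+)#',
    'optional group'    => '#http://(www\.)?zshare\.net/video/([^/]+)#',
    'non-capturing'     => '#http://(?:www\.)?zshare\.net/video/([^/]+)#',
);

$id_index = array();
foreach ($patterns as $label => $pattern) {
    $matches = array();
    preg_match($pattern, $url, $matches);
    // In these three patterns the video ID is the last capture group,
    // so it is the last entry in the match array.
    $id_index[$label] = count($matches) - 1;
    echo $label, ': ID at index ', $id_index[$label], "\n";
}
```

The first two variants report index 2, because the www group still takes index 1 (as an empty string here). Only the non-capturing variant reports index 1.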

Hope this helps,


-- 
// Todd

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php

