Thanks Peter for the explanation, but what I need is to fix my problem.
I'm getting this notice:

Notice: iconv(): Detected an illegal character in input string in /var/www/html/rssfeed/sahafah.php on line 35

It appears when I convert from UTF-8 to CP1256:

$rss2 = iconv("UTF-8", 'CP1256//TRANSLIT', $rss);

> -----Original Message-----
> From: Peter West [mailto:lists@xxxxxxxxx]
> Sent: Wednesday, February 25, 2015 2:56 PM
> To: Maciek Sokolewicz; PHP General
> Subject: Re: Parsing
>
> > So, what does (.*?) mean? Well, simply said "any character, occurring 0 or
> > more times" occurring 0 or 1 times.
>
> I don't think so. ((.*)?) would mean that, but in (.*?), the '?' means "make
> the preceding pattern non-greedy; that is, make it match the minimum
> number of times". And as the minimum number of matches of (.*) is zero, it
> ends up meaning 'match no character at all'. So it will always be true,
> wherever it occurs in a match string.
>
> For instance,
>
> $ php -a
> Interactive shell
>
> php > $test = "aabbcc";
> php > $re = '/.+?(bb?).*/';
> php > preg_match($re, $test, $match);
> php > print_r($match);
> Array
> (
>     [0] => aabbcc
>     [1] => bb
> )
> Note here that the initial pattern piece '.+?' is limited to the minimum
> match. The minimum match is a single character, but that is overruled by the
> attempt to match the capturing sub-expression '(bb?)', so it in fact matches
> 'aa'. Note that in this regexp, (bb?) means "...a 'b' char followed by zero
> or one 'b' chars."
>
> Now change that initial sub-expression.
> php > $re = '/.*?(bb?).*/';
> php > preg_match($re, $test, $match);
> php > print_r($match);
> Array
> (
>     [0] => aabbcc
>     [1] => bb
> )
> The minimum match here is no characters, constrained to the minimum.
> But again, the minimum match must be extended to accommodate '(bb?)'.
>
> Now remove the minimising constraint.
> php > $re = '/.*(bb?).*/';
> php > preg_match($re, $test, $match);
> php > print_r($match);
> Array
> (
>     [0] => aabbcc
>     [1] => b
> )
> Only one 'b'!
> Which 'b' is matched? It's the second 'b'. The minimum match
> for (bb?) is a single 'b' followed by zero 'b's; so the second 'b' satisfies
> the capturing expression, and the now-greedy initial sub-expression can
> gobble up all of the characters up to that second 'b'.
>
> Don't believe me?
> php > $re = '/(.*)(bb?).*/';
> php > preg_match($re, $test, $match);
> php > print_r($match);
> Array
> (
>     [0] => aabbcc
>     [1] => aab
>     [2] => b
> )
> Let's back up.
> php > $re = '/(.*?)(bb?).*/';
> php > preg_match($re, $test, $match);
> php > print_r($match);
> Array
> (
>     [0] => aabbcc
>     [1] => aa
>     [2] => bb
> )
> As before, with a non-greedy initial sub-expression,
> except that we now capture that initial sub-expression.
> (bb?) means "...a 'b' followed by zero or one 'b's", greedily.
> Can we force that to be non-greedy?
> php > $re = '/(.*?)(bb??)(.*)/';
> php > preg_match($re, $test, $match);
> php > print_r($match);
> Array
> (
>     [0] => aabbcc
>     [1] => aa
>     [2] => b
>     [3] => bcc
> )
> Yes we can, by appending a moderating '?' which curbs the
> appetite of the capturing sub-expression: (bb??)
>
> Peter West
> "...and behold, something greater than Jonah is here."
>
> > On 23 Feb 2015, at 10:48 pm, Maciek Sokolewicz
> > <maciek.sokolewicz@xxxxxxxxx> wrote:
> >
> > Secondly, the above two regexp rules are slightly bloated. What they
> > actually mean is:
> > ( = start new catchable pattern
> > . = any character
> > * = 0 or more of the previous pattern
> > ? = 0 or 1 of the previous pattern
> > ) = end catchable pattern
> > \s = any whitespace character
> >
> > So, what does (.*?) mean? Well, simply said "any character, occurring 0 or
> > more times" occurring 0 or 1 times. But since the any character pattern
> > already occurs 0 or more times, the pattern as a whole will either be
> > matched (1 time) or not (0 times). Making the ? metacharacter useless. Now
> > if it were (.+?) then it would state "one or more of any character", with
> > the entire pattern optional.
> > So in practice, the following patterns are equal in what they
> > represent: (.*?), (.*), (.+?)
> >
>
> --
> PHP General Mailing List (http://www.php.net/)
> To unsubscribe, visit: http://www.php.net/unsub.php
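
For the iconv() notice at the top of this message, here is a minimal sketch of one way to check the input before converting. It assumes the notice is raised because $rss contains bytes that are not valid UTF-8, which may not be the actual cause. The helper name to_cp1256 is hypothetical and is not part of sahafah.php, and the "abc" value only stands in for the real feed contents.

```php
<?php
// Hypothetical helper, not part of the original sahafah.php.
// Returns the CP1256 bytes, or null when the conversion cannot be done.
function to_cp1256($text)
{
    // preg_match() with the /u modifier returns false (not 1) when the
    // subject is not valid UTF-8, so this rejects malformed input early.
    if (preg_match('//u', $text) !== 1) {
        return null;
    }

    // The @ suppresses the notice; the return value is checked instead.
    $converted = @iconv('UTF-8', 'CP1256//TRANSLIT', $text);

    return $converted === false ? null : $converted;
}

$rss  = "abc";            // assumption: stand-in for the real feed contents
$rss2 = to_cp1256($rss);

if ($rss2 === null) {
    echo "conversion failed\n";
}
```

If the helper returns null for the real feed, the next step would be to find out what encoding the feed actually uses, rather than assuming UTF-8.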