Re: strange character

That doesn't make sense to me.  Send me the test file.

Peter West
"...and behold, something greater than Jonah is here."

> On 26 Feb 2015, at 1:49 am, hadi <almarzuki2011@xxxxxxxxxxx> wrote:
> 
> I did what you asked, and here is the result.
> First cmp: cmp test.txt test.utf8 gives no output.
> Second cmp: cmp test.txt test.1256 gives this result: cmp: EOF on test.txt.
> But I don't understand how this relates to the PHP error I'm getting.
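
For context on the second result: cmp reports "EOF on FILE" when FILE ends first and every byte up to that point matched, so test.1256 would be longer than test.txt. A minimal sketch that reproduces the message, assuming a POSIX shell with iconv and cmp. The temporary directory, the sample text, and the double write with >> are assumptions for illustration, not something the thread confirms happened:

```shell
# Hypothetical scratch directory and sample text.
work=$(mktemp -d)
printf 'plain ascii text\n' > "$work/test.txt"

# Assumed mistake: the converted text lands in test.1256 twice,
# for example because the redirection used >> on a second run.
iconv -f ASCII -t CP1256 "$work/test.txt" >  "$work/test.1256"
iconv -f ASCII -t CP1256 "$work/test.txt" >> "$work/test.1256"

# test.txt is now a strict prefix of test.1256, so cmp hits
# end-of-file on test.txt before finding any differing byte.
msg=$(cmp "$work/test.txt" "$work/test.1256" 2>&1) && status=0 || status=$?

echo "cmp exit status: $status"
echo "cmp message: $msg"

rm -rf "$work"
```

If that is what happened, deleting test.1256 and rerunning the conversion with a single > should make cmp silent.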
> 
> And thanks, Peter, for standing with me on this issue.
> What I'm trying to achieve is to remove the strange character by converting UTF-8 to CP1256, but unfortunately I get an error from PHP (Notice: iconv(): Detected an illegal character in input string in )
> 
> I tested the strange character on its own and it works perfectly, but when it comes to the downloaded RSS feed, it gives the error.
> 
> <?php
> 
>     $str = '(–)';
> 
>     echo iconv("UTF-8", 'CP1256//TRANSLIT', $str); 
> ?>
> 
>> -----Original Message-----
>> From: Peter West [mailto:lists@xxxxxxxxx]
>> Sent: Wednesday, February 25, 2015 3:49 PM
>> To: hadi
>> Cc: PHP General
>> Subject: Re:  strange character
>> 
>> If the file is really us-ascii, then there are no characters in that file with the
>> 8th bit set, and you can do this:
>> 
>> iconv -f ASCII -t UTF8 test.txt >test.utf8
>> iconv -f ASCII -t CP1256 test.txt >test.1256
>> cmp test.txt test.utf8
>>  (should be no messages about differences.)
>> cmp test.txt test.1256
>>  (should be no messages about differences.)
>> 
>> If that all works, then the test text you have is not the same as the text you
>> have the problem with.  If you have any problems with this sequence, tell us
>> what they are.
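
The sequence above works because ASCII is a subset of both UTF-8 and CP1256, so converting a pure-ASCII file to either encoding should leave every byte unchanged. A minimal sketch of the whole check, assuming a POSIX shell with iconv and cmp; the scratch directory and the sample text are hypothetical stand-ins for the real test.txt:

```shell
# Hypothetical scratch directory and a pure-ASCII sample file.
work=$(mktemp -d)
printf 'Hello, RSS feed\n' > "$work/test.txt"

# Convert to both target encodings, overwriting (>) rather than appending.
iconv -f ASCII -t UTF-8  "$work/test.txt" > "$work/test.utf8"
iconv -f ASCII -t CP1256 "$work/test.txt" > "$work/test.1256"

# cmp -s prints nothing and reports the result through its exit status.
if cmp -s "$work/test.txt" "$work/test.utf8"; then same_utf8=yes; else same_utf8=no; fi
if cmp -s "$work/test.txt" "$work/test.1256"; then same_1256=yes; else same_1256=no; fi

echo "test.txt vs test.utf8 identical: $same_utf8"
echo "test.txt vs test.1256 identical: $same_1256"

rm -rf "$work"
```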
>> 
>> Peter West
>> "...and behold, something greater than Jonah is here."
>> 
>>> On 25 Feb 2015, at 10:35 pm, hadi <almarzuki2011@xxxxxxxxxxx> wrote:
>>> 
>>> I'm using Linux (CentOS).  I have access to a terminal.
>>> 
>>> I did what you asked, converting test.txt to CP1256. To verify
>>> that, I ran
>>> 
>>> file -bi test.txt
>>> 
>>> text/plain; charset=us-ascii
>>> 
>>> which does not show CP1256 encoding. But when I run
>>> 
>>> iconv -l | grep CP1256
>>> CP1256//
>>> 
>>> I do get CP1256 listed.
>>> 
>>>> -----Original Message-----
>>>> From: Peter West [mailto:lists@xxxxxxxxx]
>>>> Sent: Wednesday, February 25, 2015 3:11 PM
>>>> To: hadi; PHP General
>>>> Subject: Re:  strange character
>>>> 
>>>>> On 25 Feb 2015, at 9:06 pm, hadi <almarzuki2011@xxxxxxxxxxx> wrote:
>>>>> 
>>>>> Hi Peter,
>>>>> 
>>>>> I want to convert from utf-8 to CP1256 so the strange character can
>>>>> be fixed.
>>>> 
>>>> I think it's either a) already in CP1256, or b) in some other
>>>> character set altogether.  The error message is telling you that it's not
>>>> recognised as UTF-8.
>>>> 
>>>> Do you have access to the iconv program in a terminal?  If you are on
>>>> a linux or OS X system, just open a terminal and type
>>>> 
>>>> iconv --help
>>>> 
>>>> If you have iconv installed you will get a help message.
>>>> 
>>>> Get the text you are trying to convert into a text file, and just try
>>>> various conversions using iconv, until it looks right.  Let's say
>>>> your text is in the file unknown.txt.
>>>> 
>>>> iconv -l
>>>> 
>>>> will list all of the character sets that iconv knows about.  Find the
>>>> likely candidates and just try
>>>> 
>>>> iconv -t UTF8 -f CP1252 unknown.txt
>>>> iconv -t UTF8 -f CP1256 unknown.txt
>>>> iconv -t UTF8 -f CP1254 unknown.txt
>>>> 
>>>> etc, until it looks right.
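
The trial-and-error approach above can be sketched as a loop, assuming a POSIX shell and the iconv command-line tool. The sample bytes written to unknown.txt are a hypothetical stand-in for the real text, and the candidate list is the one from this message:

```shell
# Hypothetical scratch directory; "a", byte 0x96, "b" stand in for the real text.
work=$(mktemp -d)
printf 'a\226b\n' > "$work/unknown.txt"

tried=0
for enc in CP1252 CP1256 CP1254; do
    # Decode the file as $enc and re-encode as UTF-8 for display.
    if out=$(iconv -f "$enc" -t UTF-8 "$work/unknown.txt" 2>/dev/null); then
        printf '%s: %s\n' "$enc" "$out"
    else
        printf '%s: conversion failed\n' "$enc"
    fi
    tried=$((tried + 1))
done

rm -rf "$work"
```

A conversion that succeeds is not proof of the right guess: single-byte code pages accept most byte values, so the output still has to be read and judged by eye.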
>>>> 
>>>> Peter West
>>>> "...and behold, something greater than Jonah is here."
>>>> 
>>>>> 
>>>>>> -----Original Message-----
>>>>>> From: Peter West [mailto:lists@xxxxxxxxx]
>>>>>> Sent: Wednesday, February 25, 2015 1:46 PM
>>>>>> To: hadi
>>>>>> Subject: Re:  strange character
>>>>>> 
>>>>>> Aren't you going the wrong way?  It looks as though the text you
>>>>>> are trying to convert is in one of the 8-bit character sets.  With
>>>>>> these sets you get bad characters because a byte with the
>>>>>> MSBit set will be interpreted by a UTF-8 system as part of a
>>>>>> multi-byte character.
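
A minimal sketch of that point, assuming a POSIX shell with iconv and od. Byte 0x96 is the en dash in CP1256; using it here is an assumption for illustration, since the thread never establishes which bytes the feed actually contains:

```shell
# Byte 0x96 (octal 226) has its most significant bit set. On its own it is
# a UTF-8 continuation byte with no lead byte, so strict decoding rejects it.
if printf '\226' | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1; then
    as_utf8=valid
else
    as_utf8=invalid
fi

# Read as CP1256, the same byte is one complete character (en dash, U+2013),
# which UTF-8 encodes as the three bytes e2 80 93.
as_cp1256=$(printf '\226' | iconv -f CP1256 -t UTF-8 | od -An -tx1 | tr -d ' \n')

echo "0x96 read as UTF-8: $as_utf8"
echo "0x96 read as CP1256, re-encoded as UTF-8: $as_cp1256"
```

This is the kind of input that would make a conversion declared as coming from UTF-8 complain about an illegal character.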
>>>>>> 
>>>>>> Which way do you want to go: from CP1256 to UTF-8 or vice versa?
>>>>>> 
>>>>>> Peter West
>>>>>> "...and behold, something greater than Jonah is here."
>>>>>> 
>>>>>>> On 25 Feb 2015, at 7:15 pm, hadi <almarzuki2011@xxxxxxxxxxx> wrote:
>>>>>>> 
>>>>>>> Hi,
>>>>>>> 
>>>>>>> 
>>>>>>> I'm trying to use (iconv("UTF-8", 'CP1256//TRANSLIT', $rss);) to
>>>>>>> convert a strange character like (–) to the proper character, but I'm
>>>>>>> getting an error.
>>>>>>> 
>>>>>>> I googled it but found nothing.
>>>>>>> 
>>>>>>> Here is the error:
>>>>>>> 
>>>>>>> Notice: iconv(): Detected an illegal character in input string in
>>>>>>> /var/www/html/rssfeed/sahafah.php on line 35
>>>>>>> 
>>>>>>> 
>>>>>>> --
>>>>>>> PHP General Mailing List (http://www.php.net/) To unsubscribe, visit:
>>>>>>> http://www.php.net/unsub.php
>>>>>>> 

