On Thu, 2010-05-27 at 14:06 -0400, Bob McConnell wrote:

> From: Ashley Sheridan
>
> >On Thu, 2010-05-27 at 12:08 -0400, Adam Richardson wrote:
> >
> >> On Thu, May 27, 2010 at 9:45 AM, Guus Ellenkamp
> >> <Ellenkamp_Guus@xxxxxxxxxxx> wrote:
> >>
> >> > Thanks, but are you sure of that? I did some research a while ago and found
> >> > that officially PHP files should be ascii and not have any specific
> >> > character encoding. I believe it will work anyhow (did not try this one),
> >> > but would like to stick with the standards.
> >> >
> >> > "Ashley Sheridan" <ash@xxxxxxxxxxxxxxxxxxxx> wrote in message
> >> > news:1274883714.2202.228.camel@xxxxxxxxxxxx
> >> > > On Wed, 2010-05-26 at 22:20 +0800, Guus Ellenkamp wrote:
> >> > >
> >> > >> We use PHP defines for defining text in different languages. As far as I
> >> > >> know PHP files are supposed to be ASCII, not UTF-8 or something like
> >> > >> that.
> >> > >> What I want to make is a conversion program that would convert a given
> >> > >> UTF-8 file with the format
> >> > >>
> >> > >> definetext1=this is a text in random UTF-8, probably arabic or similar text
> >> > >> definetext2=this is another text in random UTF-8, probably arabic or similar text
> >> > >>
> >> > >> into a file with the following defines
> >> > >>
> >> > >> define('definetext1',chr(<t_value>).chr(<h_value>).chr(<i_value>)...chr(<x_value>).chr(<t_value>));
> >> > >> define('definetext2',chr(<t_value>).chr(<h_value>).chr(<i_value>)...chr(<x_value>).chr(<t_value>));
> >> > >>
> >> > >> Not sure if I'm using the correct chr/ord function, but I hope the above
> >> > >> is clear enough to make clear what I'm looking for. Basically the output
> >> > >> file should be ascii and not contain any utf-8.
> >> > >>
> >> > >> Any advice? The html_special_chars did not seem to work for Vietnamese
> >> > >> text I tried to convert, so something seems to go wrong with just reading an
> >> > >> array of strings and converting the strings and putting them in defines.
> >> > >
> >> > > PHP files can contain utf-8, and in fact that is the preference of most
> >> > > developers I know of.
> >> >
> >>
> >> Because the lower range of UTF-8 matches the ascii character set
> >> (intentionally by design), you'll be able to use UTF-8 for PHP files without
> >> problem (i.e., ascii 7-bit chars have same encoding in UTF-8.)
> >> http://www.cl.cam.ac.uk/~mgk25/unicode.html
> >>
> >> However, if you were to use any of the multibyte characters of UTF-8 in a
> >> PHP file, you could run into some trouble. I use UTF-8 for most of my PHP
> >> files, but I've been sticking to the ASCII subset exclusively.
> >
> > I don't use the higher range of characters often, but I do sometimes use
> > them for things like the graphical glyphs (½✉✆, etc). I know I could do
> > those with regular text and the Wingdings font, but that's not available
> > on every computer, and breaks the semantic meaning behind the glyphs.
>
> What higher range? ASCII only defined 128 values, the bottom 32 being
> control characters that don't print. Anything outside of that is not
> ASCII, but a proprietary extension. In particular, the glyphs usually
> associated with 0-32 and 128-255 are IBM specific and not guaranteed to
> be present outside of their original video ROM. So only the first 128
> characters map directly into UTF-8.
>
> Bob McConnell
>
> Ref: pp 25-29 The Programmer's PC Sourcebook, 1988, Thom Hogan,
> Microsoft Press

The higher range of utf8 characters that don't map to ascii values.

Thanks,
Ash
http://www.ashleysheridan.co.uk
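
For anyone who still wants the ASCII-only output file Guus asked for, here is a minimal sketch of the conversion. It is an illustration only, not something posted in the thread: the function name asciiDefine() is made up, and instead of the chr().chr()... chain it writes each non-ASCII byte as a \xNN escape inside a double-quoted string, which PHP turns back into the same bytes at run time. It assumes each input line has the name=text form shown above.

```php
<?php
// Hypothetical helper (not from the thread): turn one "name=text" line
// into a define() statement whose source contains only ASCII bytes.
function asciiDefine(string $line): string
{
    // Split on the first '=' only, so the text itself may contain '='.
    $parts = explode('=', rtrim($line, "\r\n"), 2);
    if (count($parts) !== 2) {
        throw new InvalidArgumentException('Expected a line of the form name=text');
    }
    [$name, $text] = $parts;

    $out = '';
    for ($i = 0, $n = strlen($text); $i < $n; $i++) {
        $byte = $text[$i];
        $code = ord($byte);
        // Printable ASCII is copied through, except the characters that are
        // special inside a double-quoted PHP string: " \ and $.
        if ($code >= 0x20 && $code < 0x7F && strpos('"\\$', $byte) === false) {
            $out .= $byte;
        } else {
            // Every other byte (including each byte of a multibyte UTF-8
            // sequence) is written as a two-digit \xNN escape.
            $out .= sprintf('\\x%02X', $code);
        }
    }

    return sprintf('define(%s, "%s");', var_export($name, true), $out);
}

// "\xC2\xBD" is the UTF-8 encoding of the ½ glyph mentioned above.
echo asciiDefine("definetext1=\xC2\xBD cup"), "\n";
// prints: define('definetext1', "\xC2\xBD cup");
```

The generated file is plain 7-bit ASCII, but the constant still holds the original UTF-8 bytes, so the page that echoes it must still be served as UTF-8.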