eddie, you were quite right and i was wrong.

after discovering that the longest utf8 varchar column that mysql will allow
is varchar(333), i did some tests. on a varchar(255) column, mysql allows
strings with up to 255 utf8 characters to be inserted, and it truncates at
255 characters, respecting utf8 character sequences.

now i have to go back to the script that convinced me otherwise and figure
out what i misunderstood in it.

section 10.4.1 of the manual could be more explicit about this. when i read
it i got the clear impression that the parameter referred to octets.


On 6/12/09 1:22 PM, "Tom Worster" <fsb@xxxxxxxxxx> wrote:

> On 6/12/09 11:52 AM, "Eddie Drapkin" <oorza2k5@xxxxxxxxx> wrote:
>
>> Correct me if I'm wrong, but should varchar 255 with a utf8 character set
>> mean 255 unicode characters, not octets?
>
> in mysql, the length refers to the storage space of the string, not the
> decoded character count. i don't know about other dbms.
>
>
>> On Fri, Jun 12, 2009 at 11:50 AM, Tom Worster <fsb@xxxxxxxxxx> wrote:
>>
>>> say a table in the db has a varchar(255) column, 255 being the max number of
>>> octets of strings that can go in the column. now say the php script very
>>> occasionally has to deal with utf8 input strings with octet length > 255 --
>>> it needs to select rows matching the input string or insert the input
>>> string.
>>>
>>> so what i think i need is a function to truncate a utf8 string to the
>>> longest valid utf8 string that has octet length <= 255.
>>>
>>> is this what mb_strcut() is for? i'm having a hard time understanding the
>>> man page for that function.

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
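
For the archive, here is a minimal sketch of the two truncation strategies discussed in this thread. It assumes the mbstring extension is loaded and that the input is already valid UTF-8. The helper names truncate_chars() and truncate_bytes() are hypothetical; only mb_substr(), mb_strcut(), mb_strlen() and strlen() are real PHP functions. Per the tests described in this thread, the character-based version is the one that matches what a utf8 varchar(255) column accepts; the octet-based version is what the original question asked mb_strcut() to do.

```php
<?php
// Hypothetical helpers for illustration; not taken from the thread.

// Truncate to at most $maxChars characters. This matches the behaviour
// reported in this thread for a utf8 varchar(255) column, which counts
// characters rather than octets.
function truncate_chars(string $s, int $maxChars = 255): string
{
    return mb_substr($s, 0, $maxChars, 'UTF-8');
}

// Truncate to at most $maxBytes octets. mb_strcut() works on byte
// offsets but will not cut through the middle of a multibyte sequence.
function truncate_bytes(string $s, int $maxBytes = 255): string
{
    return mb_strcut($s, 0, $maxBytes, 'UTF-8');
}

// 300 copies of U+00E9 (two octets each in UTF-8): 300 characters, 600 octets.
$input = str_repeat("\xC3\xA9", 300);

$byChars = truncate_chars($input);
$byBytes = truncate_bytes($input);

// Report character count and octet count for each result.
echo mb_strlen($byChars, 'UTF-8'), ' chars, ', strlen($byChars), " octets\n";
echo mb_strlen($byBytes, 'UTF-8'), ' chars, ', strlen($byBytes), " octets\n";
```

The byte-limited result never exceeds 255 octets and remains valid UTF-8, while the character-limited result may be longer than 255 octets, which is fine if the column length is counted in characters.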