Hello experts,

I want to compare integer arrays using methods based on string similarity (e.g., Levenshtein, trigrams, etc.). To do that, I hacked together a custom function that converts each integer array to a string, where each integer is converted to a character by CHR(my_array1[i]+64) (so that 1 -> A, 2 -> B, etc.).

Of course, for large integers (I have integers up to 300,000) this hack probably creates invalid UTF-8 characters. levenshtein (from the fuzzystrmatch module) does not seem to have a problem with that and works perfectly, since it just compares UTF-8 codes. On the other hand, when I try the similarity function array1<->array1 on the converted strings, it works in some cases (I think it works for all integers up to 4096), but for some larger values I get 'invalid byte sequence for encoding "UTF8"' errors.

Example integer sequence:

"{8527,63586,8526,63585,63584,63583,63582,8525,8760,63820,63821,63822,860,57610,861,57611,862,57612,57613,863,57614,57615,57616,39850,39851,39852,39853,39854,39855,95275,39856,39857,95276,95277,39858,95278,95279,39859,95280,39860,95281,95282,39861,39862,39863,95283,95284,27095,27096,82406,82407,27097,27098,27099,27100,82408,27101,27102,27103,25702,80837,25703,25704,80838,25705,25706,25707,25708,30011,85343,30012,85344,30013,30014,51019,48260,48261,56809,56810,56811,56812,113829,31762,87568,31763,45925,41778,41779,41780,31778,31779,87571}"

Error message:

invalid byte sequence for encoding "UTF8": 0xed 0xb8 0xa9

Is there a way to suppress these errors, similar to levenshtein, which does not care about the validity of UTF-8 characters?
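For illustration, here is a minimal Python sketch of the integer-to-character mapping described in the post. Python only stands in for the SQL expression CHR(n + 64), and the helper names (code_point, is_surrogate, naive_utf8) are hypothetical, not part of PostgreSQL or pg_trgm. It shows that the bytes in the error message, 0xed 0xb8 0xa9, are the 3-byte UTF-8 bit layout of U+DE29, which is what 56809 + 64 gives. U+DE29 lies in the surrogate range U+D800..U+DFFF, which UTF-8 does not permit, so under this mapping any integer from 55232 to 57279 would yield such a sequence. Whether this is the only source of the error is an assumption; the sketch only reproduces the byte arithmetic.

```python
# Sketch only: Python stands in for the SQL expression CHR(n + 64).
# All names below are hypothetical helpers, not PostgreSQL functions.
OFFSET = 64
SURROGATE_LO, SURROGATE_HI = 0xD800, 0xDFFF  # code points not allowed in UTF-8


def code_point(n: int) -> int:
    """Code point that the post's mapping assigns to integer n."""
    return n + OFFSET


def is_surrogate(cp: int) -> bool:
    """True if cp falls in the UTF-16 surrogate range."""
    return SURROGATE_LO <= cp <= SURROGATE_HI


def naive_utf8(cp: int) -> bytes:
    """Apply the 3-byte UTF-8 bit layout to cp without any validity check."""
    assert 0x800 <= cp <= 0xFFFF, "3-byte layout only covers U+0800..U+FFFF"
    return bytes([
        0xE0 | (cp >> 12),          # leading byte: 1110xxxx
        0x80 | ((cp >> 6) & 0x3F),  # continuation byte: 10xxxxxx
        0x80 | (cp & 0x3F),         # continuation byte: 10xxxxxx
    ])


# A few values taken from the example sequence in the post.
sample = [8527, 63586, 56809, 56810, 113829]
bad = [n for n in sample if is_surrogate(code_point(n))]

print(bad)  # [56809, 56810]
print(" ".join(f"{b:02x}" for b in naive_utf8(code_point(56809))))  # ed b8 a9
```

If this is indeed the cause, one possible workaround (again an assumption, not something tested against pg_trgm) would be to change the mapping so that it skips the surrogate range, for example by adding a further 2048 to any code point at or above 0xD800.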