On Fri, 2020-04-24 at 14:53 +0200, Amaury Bouchard wrote:
> I have a really strange behaviour with a C function, which gets a text as parameter.
> Everything works fine when I call the function directly, giving a text string as parameter. But a problem occurs when I try to read data from a table.
>
> To illustrate the problem, I stripped the function down to the minimum. The source code is below, but first, here is the behaviour:
>
> Direct call
> -----------
>
> select passthru('hello world!'), passthru('utf8 çhàràtérs'), passthru(' h3110 123 456 ');
> INFO: INPUT STRING: 'hello world!' (12)
> INFO: INPUT STRING: 'utf8 çhàràtérs' (18)
> INFO: INPUT STRING: ' h3110 123 456 ' (15)
>
> (as you can see, the log messages show the correct input, with the number of bytes between parentheses)
>
> Reading a table data
> --------------------
>
> create table mytable ( str text);
>
> insert into mytable (str) values ('hello world!'), ('utf8 çhàràtérs'), (' h3110 123 456 ');
>
> select passthru(str) from mytable;
> INFO: INPUT STRING: 'lo world!' (12)
> INFO: INPUT STRING: '8 çhàràtérs' (18)
> INFO: INPUT STRING: '110 123 456 �
> ' (15)
> INFO: INPUT STRING: '��' (5)
> INFO: INPUT STRING: '' (3)
>
> There, you can see that the pointer seems to be shifted 3 bytes farther.
>
> Do you have any clue for this strange behaviour?
>
>
> The source code
> ---------------
>
> #include "postgres.h"
> #include "fmgr.h"
> #include "funcapi.h"
>
> // PG module init
> #ifdef PG_MODULE_MAGIC
> PG_MODULE_MAGIC;
> #endif
> void _PG_init(void);
> Datum passthru(PG_FUNCTION_ARGS);
> PG_FUNCTION_INFO_V1(passthru);
>
> void _PG_init() {
> }
>
> Datum passthru(PG_FUNCTION_ARGS) {
>     // get the input string
>     text *input = PG_GETARG_TEXT_PP(0);
>     char *input_pt = (char*)VARDATA(input);
>     int32 input_len = VARSIZE_ANY_EXHDR(input);
>     // create a null terminated copy of the input string
>     char *str_copy = calloc(1, input_len + 1);
>     memcpy(str_copy, input_pt, input_len);
>     // log message
>     elog(INFO, "INPUT STRING: '%s' (%d)", str_copy, input_len);
>     free(str_copy);
>     PG_RETURN_NULL();
> }

You find this in "postgres.h":

 * In consumers oblivious to data alignment, call PG_DETOAST_DATUM_PACKED(),
 * VARDATA_ANY(), VARSIZE_ANY() and VARSIZE_ANY_EXHDR(). Elsewhere, call
 * PG_DETOAST_DATUM(), VARDATA() and VARSIZE(). Directly fetching an int16,
 * int32 or wider field in the struct representing the datum layout requires
 * aligned data. memcpy() is alignment-oblivious, as are most operations on
 * datatypes, such as text, whose layout struct contains only char fields.

So you should use VARDATA_ANY.

What happens is that these short text columns have a 1-byte TOAST header,
but you skip the first 4 bytes unconditionally, assuming they were detoasted.
That is why the data pointer ends up 3 bytes too far, while the length from
VARSIZE_ANY_EXHDR is still correct.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com
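
To make the 3-byte shift concrete, here is a minimal standalone sketch. It does not use the PostgreSQL headers: every toy_* name is hypothetical, and the header encoding is a simplified assumption loosely modeled on the 1-byte and 4-byte headers described above. The real macros in "postgres.h" also deal with endianness, compression and external values, which this toy ignores.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Toy model only: all toy_* names are hypothetical, not PostgreSQL API. */
#define TOY_HDRSZ_LONG  4   /* full-size header */
#define TOY_HDRSZ_SHORT 1   /* packed header used for short values */

/* Store s behind a 1-byte header: low bit set, upper 7 bits = total size. */
static void toy_make_short(unsigned char *buf, const char *s)
{
    size_t len = strlen(s);
    assert(len + TOY_HDRSZ_SHORT <= 127);
    buf[0] = (unsigned char)(((len + TOY_HDRSZ_SHORT) << 1) | 0x01);
    memcpy(buf + TOY_HDRSZ_SHORT, s, len);
}

/* Store s behind a 4-byte header: low two bits clear, rest = total size. */
static void toy_make_long(unsigned char *buf, const char *s)
{
    size_t len = strlen(s);
    uint32_t hdr = (uint32_t)(len + TOY_HDRSZ_LONG) << 2;
    for (int i = 0; i < 4; i++)
        buf[i] = (unsigned char)(hdr >> (8 * i));   /* least significant byte first */
    memcpy(buf + TOY_HDRSZ_LONG, s, len);
}

static int toy_is_short(const unsigned char *v)
{
    return (v[0] & 0x01) != 0;
}

/* Payload size for either header kind (the role of VARSIZE_ANY_EXHDR). */
static size_t toy_size_any_exhdr(const unsigned char *v)
{
    if (toy_is_short(v))
        return (size_t)(v[0] >> 1) - TOY_HDRSZ_SHORT;
    uint32_t hdr = 0;
    for (int i = 0; i < 4; i++)
        hdr |= (uint32_t)v[i] << (8 * i);
    return (size_t)(hdr >> 2) - TOY_HDRSZ_LONG;
}

/* Always skips 4 bytes (the role of VARDATA): wrong for a 1-byte header. */
static const unsigned char *toy_data_fixed(const unsigned char *v)
{
    return v + TOY_HDRSZ_LONG;
}

/* Skips 1 or 4 bytes depending on the header (the role of VARDATA_ANY). */
static const unsigned char *toy_data_any(const unsigned char *v)
{
    return v + (toy_is_short(v) ? TOY_HDRSZ_SHORT : TOY_HDRSZ_LONG);
}

/* NUL-terminated copy of len payload bytes. */
static void toy_copy(char *dst, const unsigned char *src, size_t len)
{
    memcpy(dst, src, len);
    dst[len] = '\0';
}

static void toy_demo(void)
{
    unsigned char buf[64] = {0};   /* zero padding keeps the over-read harmless here */
    char out[64];

    toy_make_short(buf, "hello world!");
    size_t len = toy_size_any_exhdr(buf);

    toy_copy(out, toy_data_fixed(buf), len);
    printf("fixed skip:   '%s' (%zu)\n", out, len);   /* 'lo world!' (12) */

    toy_copy(out, toy_data_any(buf), len);
    printf("header-aware: '%s' (%zu)\n", out, len);   /* 'hello world!' (12) */
}
```

Calling toy_demo() from a main() reproduces the symptom from the table query: the reported length is right, but the text starts 3 bytes late. With toy_make_long() both accessors return the same pointer, which matches the direct call working. In the real function, the corresponding change is VARDATA_ANY(input) in place of VARDATA(input).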