Erik Wienhold <ewie@xxxxxxxxx> writes: >> On 27/02/2023 13:13 CET Laurenz Albe <laurenz.albe@xxxxxxxxxxx> wrote: >> I'd be curious to know how the customer managed to do that. >> Perhaps there is a loophole in PostgreSQL that needs to be fixed. > Another reason to prefer jsonb over json to reject such inputs right away. > The documentation states that json does not validate inputs in constrast to > jsonb. It's not that it doesn't validate, it's that the validation rules are different. Per the manual [1]: RFC 7159 permits JSON strings to contain Unicode escape sequences denoted by \uXXXX. In the input function for the json type, Unicode escapes are allowed regardless of the database encoding, and are checked only for syntactic correctness (that is, that four hex digits follow \u). However, the input function for jsonb is stricter: it disallows Unicode escapes for characters that cannot be represented in the database encoding. The jsonb type also rejects \u0000 (because that cannot be represented in PostgreSQL's text type), and it insists that any use of Unicode surrogate pairs to designate characters outside the Unicode Basic Multilingual Plane be correct. You can certainly quibble with our decisions here, but I think they are reasonably consistent. json is for data that you'd like a syntax check on (else you might as well store it as "text"), but no more than a syntax check, because you're going to do the actual JSON processing elsewhere and you don't want Postgres opining on what semi-standard JSON constructs mean. If you're actually going to process the data inside the database, jsonb is a better choice. The extra restrictions in jsonb are to ensure that a string value represented in JSON can be extracted into a valid string of our text datatype. Storing data in json and then casting to jsonb on-the-fly seems like about the worst possible combination of choices. regards, tom lane [1] https://www.postgresql.org/docs/current/datatype-json.html