On Fri, Oct 18, 2019, at 12:37, Ariadne Conill wrote: > Hello, > > I am one of the primary maintainers of Pleroma, a federated social > networking application written in Elixir, which uses PostgreSQL in > ways that may be considered outside the typical usage scenarios for > PostgreSQL. > > Namely, we leverage JSONB heavily as a backing store for JSON-LD > documents[1]. We also use JSONB in combination with Ecto's "embedded > structs" to store things like user preferences. > > The fact that we can use JSONB to achieve our design goals is a > testament to the flexibility PostgreSQL has. > > However, in the process of doing so, we have discovered a serious flaw > in the way jsonb_set() functions, but upon reading through this > mailing list, we have discovered that this flaw appears to be an > intentional design.[2] > > A few times now, we have written migrations that do things like copy > keys in a JSONB object to a new key, to rename them. These migrations > look like so: > > update users set info=jsonb_set(info, '{bar}', info->'foo'); > > Typically, this works nicely, except for cases where evaluating > info->'foo' results in an SQL null being returned. When that happens, > jsonb_set() returns an SQL null, which then results in data loss.[3] > > This is not acceptable. PostgreSQL is a database that is renowned for > data integrity, but here it is wiping out data when it encounters a > failure case. The way jsonb_set() should fail in this case is to > simply return the original input: it should NEVER return SQL null. > > But hey, we've been burned by this so many times now that we'd like to > donate a useful function to the commons, consider it a mollyguard for > the real jsonb_set() function. > > create or replace function safe_jsonb_set(target jsonb, path > text[], new_value jsonb, create_missing boolean default true) returns > jsonb as $$ > declare > result jsonb; > begin > result := jsonb_set(target, path, coalesce(new_value, > 'null'::jsonb), create_missing); > if result is NULL then > return target; > else > return result; > end if; > end; > $$ language plpgsql; > > This safe_jsonb_set() wrapper should not be necessary. PostgreSQL's > own jsonb_set() should have this safety feature built in. Without it, > using jsonb_set() is like playing russian roulette with your data, > which is not a reasonable expectation for a database renowned for its > commitment to data integrity. > > Please fix this bug so that we do not have to hack around this bug. > It has probably ruined countless people's days so far. I don't want > to hear about how the function is strict, I'm aware it is strict, and > that strictness is harmful. Please fix the function so that it is > actually safe to use. > > [1]: JSON-LD stands for JSON Linked Data. Pleroma has an "internal > representation" that shares similar qualities to JSON-LD, so I use > JSON-LD here as a simplification. > > [2]: https://www.postgresql.org/message-id/flat/qfkua9$2q0e$1@xxxxxxxxxxxxxxxx > > [3]: https://git.pleroma.social/pleroma/pleroma/issues/1324 is an > example of data loss induced by this issue. > > Ariadne > This should be directed towards the hackers list, too. What will it take to change the semantics of jsonb_set()? MySQL implements safe behavior here. It's a real shame Postgres does not. I'll offer a $200 bounty to whoever fixes it. I'm sure it's destroyed more than $200 worth of data and people's time by now, but it's something. Kind regards, -- Mark Felder ports-secteam & portmgr alumni feld@xxxxxxxxxxx