Search Postgresql Archives

Re: jsonb_set() strictness considered harmful to data

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 10/18/19 3:10 PM, Mark Felder wrote:
>
> On Fri, Oct 18, 2019, at 12:37, Ariadne Conill wrote:
>> Hello,
>>
>> I am one of the primary maintainers of Pleroma, a federated social
>> networking application written in Elixir, which uses PostgreSQL in
>> ways that may be considered outside the typical usage scenarios for
>> PostgreSQL.
>>
>> Namely, we leverage JSONB heavily as a backing store for JSON-LD
>> documents[1].  We also use JSONB in combination with Ecto's "embedded
>> structs" to store things like user preferences.
>>
>> The fact that we can use JSONB to achieve our design goals is a
>> testament to the flexibility PostgreSQL has.
>>
>> However, in the process of doing so, we have discovered a serious flaw
>> in the way jsonb_set() functions, but upon reading through this
>> mailing list, we have discovered that this flaw appears to be an
>> intentional design.[2]
>>
>> A few times now, we have written migrations that do things like copy
>> keys in a JSONB object to a new key, to rename them.  These migrations
>> look like so:
>>
>>    update users set info=jsonb_set(info, '{bar}', info->'foo');
>>
>> Typically, this works nicely, except for cases where evaluating
>> info->'foo' results in an SQL null being returned.  When that happens,
>> jsonb_set() returns an SQL null, which then results in data loss.[3]
>>
>> This is not acceptable.  PostgreSQL is a database that is renowned for
>> data integrity, but here it is wiping out data when it encounters a
>> failure case.  The way jsonb_set() should fail in this case is to
>> simply return the original input: it should NEVER return SQL null.
>>
>> But hey, we've been burned by this so many times now that we'd like to
>> donate a useful function to the commons, consider it a mollyguard for
>> the real jsonb_set() function.
>>
>>     create or replace function safe_jsonb_set(target jsonb, path
>> text[], new_value jsonb, create_missing boolean default true) returns
>> jsonb as $$
>>     declare
>>       result jsonb;
>>     begin
>>       result := jsonb_set(target, path, coalesce(new_value,
>> 'null'::jsonb), create_missing);
>>       if result is NULL then
>>         return target;
>>       else
>>         return result;
>>       end if;
>>     end;
>>     $$ language plpgsql;
>>
>> This safe_jsonb_set() wrapper should not be necessary.  PostgreSQL's
>> own jsonb_set() should have this safety feature built in.  Without it,
>> using jsonb_set() is like playing russian roulette with your data,
>> which is not a reasonable expectation for a database renowned for its
>> commitment to data integrity.
>>
>> Please fix this bug so that we do not have to hack around this bug.
>> It has probably ruined countless people's days so far.  I don't want
>> to hear about how the function is strict, I'm aware it is strict, and
>> that strictness is harmful.  Please fix the function so that it is
>> actually safe to use.
>>
>> [1]: JSON-LD stands for JSON Linked Data.  Pleroma has an "internal
>> representation" that shares similar qualities to JSON-LD, so I use
>> JSON-LD here as a simplification.
>>
>> [2]: https://www.postgresql.org/message-id/flat/qfkua9$2q0e$1@xxxxxxxxxxxxxxxx
>>
>> [3]: https://git.pleroma.social/pleroma/pleroma/issues/1324 is an
>> example of data loss induced by this issue.
>>
>> Ariadne
>>
> This should be directed towards the hackers list, too.
>
> What will it take to change the semantics of jsonb_set()? MySQL implements safe behavior here. It's a real shame Postgres does not. I'll offer a $200 bounty to whoever fixes it. I'm sure it's destroyed more than $200 worth of data and people's time by now, but it's something.
>
>


The hyperbole here is misplaced. There is a difference between a bug and
a POLA violation. This might be the latter, but it isn't the former. So
please tone it down a bit. It's not the function that's unsafe, but the
ill-informed use of it.


We invented jsonb_set() (credit to Dmitry Dolgov). And we've had it
since 9.5. That's five releases ago.  So it's a bit late to be coming to
us telling us it's not safe (according to your preconceptions of what it
should be doing).


We could change it prospectively (i.e. from release 13 on) if we choose.
But absent an actual bug (i.e. acting contrary to documented behaviour)
we do not normally backpatch such changes, especially when there is a
simple workaround for the perceived problem. And it's that policy that
is in large measure responsible for Postgres' deserved reputation for
stability.


Incidentally, why is your function written in plpgsql? Wouldn't a simple
SQL wrapper be better?


    create or replace function safe_jsonb_set
        (target jsonb, path text[], new_value jsonb, create_missing
    boolean default true)
    returns jsonb as
    $func$
        select case when new_value is null then target else
    jsonb_set(target, path, new_value, create_missing) end
    $func$ language sql;


And if we were to change it I'm not at all sure that we should do it the
way that's suggested here, which strikes me as no more intuitive than
the current behaviour. Rather I think we should possibly fill in a json
null in the indicated place.


cheers


andrew



-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services






[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]

  Powered by Linux