Hello, On Fri, Oct 18, 2019 at 6:52 PM Stephen Frost <sfrost@xxxxxxxxxxx> wrote: > > Greetings, > > * Ariadne Conill (ariadne@xxxxxxxxxxxxxxxx) wrote: > > On Fri, Oct 18, 2019 at 6:01 PM Adrian Klaver <adrian.klaver@xxxxxxxxxxx> wrote: > > > https://www.postgresql.org/docs/11/functions-json.html > > > " The field/element/path extraction operators return NULL, rather than > > > failing, if the JSON input does not have the right structure to match > > > the request; for example if no such element exists" > > > > It is known that the extraction operators return NULL. The problem > > here is jsonb_set() returning NULL when it encounters SQL NULL. > > > > > Just trying to figure why one is worse then the other. > > > > Any time a user loses data, it is worse. The preference for not > > having data loss is why Pleroma uses PostgreSQL as it's database of > > choice, as PostgreSQL has traditionally valued durability. If we > > should not use PostgreSQL, just say so. > > Your contention that the documented, clear, and easily addressed > behavior of a particular strict function equates to "the database system > loses data and isn't durable" is really hurting your arguments here, not > helping it. > > The argument about how it's unintuitive and can cause application > developers to misuse the function (which is clearly an application bug, > but perhaps an understandable one if the function interface isn't > intuitive or is confusing) is a reasonable one and might be convincing > enough to result in a change here. > > I'd suggest sticking to the latter argument when making this case. > > > > > I believe that anything that can be catastrophically broken by users > > > > not following upgrade instructions precisely is a serious problem, and > > > > can lead to serious problems. I am sure that this is not the only > > > > project using JSONB which have had users destroy their own data in > > > > such a completely preventable fashion. > > Let's be clear here that the issue with the upgrade instructions was > that the user didn't follow your *application's* upgrade instructions, > and your later code wasn't written to use the function, as documented, > properly- this isn't a case of PG destroying your data. It's fine to > contend that the interface sucks and that we should change it, but the > argument that PG is eating data because the application sent a query to > the database telling it, based on our documentation, to eat the data, > isn't appropriate. Again, let's have a reasonable discussion here about > if it makes sense to make a change here because the interface isn't > intuitive and doesn't match what other systems do (I'm guessing it isn't > in the SQL standard either, so we unfortunately can't look to that for > help; though I'd hardly be surprised if they supported what PG does > today anyway). Okay, I will admit that saying PG is eating data is perhaps hyperbolic, but I will also say that the behaviour of jsonb_set() under this type of edge case is unintuitive and frequently results in unintended data loss. So, while PostgreSQL is not actually eating the data, it is putting the user in a position where they may suffer data loss if they are not extremely careful. Here is how other implementations handle this case: MySQL/MariaDB: select json_set('{"a":1,"b":2,"c":3}', '$.a', NULL) results in: {"a":null,"b":2,"c":3} Microsoft SQL Server: select json_modify('{"a":1,"b":2,"c":3}', '$.a', NULL) results in: {"b":2,"c":3} Both of these outcomes make sense, given the nature of JSON objects. I am actually more in favor of what MSSQL does however, I think that makes the most sense of all. I did not compare to other database systems, because using them I found that there is a JSON_TABLE type function and then you do stuff with that to rewrite the object and dump it back out as JSON, and it's quite a mess. But MySQL and MSSQL have an equivalent jsonb inline modification function, as seen above. > As a practical response to the issue you've raised- have you considered > using a trigger to check the validity of the new jsonb? Or, maybe, just > made the jsonb column not nullable? With a trigger you could disallow > non-null->null transistions, for example, or if it just shouldn't ever > be null then making the column 'not null' would suffice. We have already mitigated the issue in a way we find appropriate to do. The suggestion of having a non-null constraint does seem useful as well and I will look into that. > I'll echo Christoph's comments up thread too, though in my own language- > these are risks you've explicitly accepted by using JSONB and writing > your own validation and checks (or, not, apparently) rather than using > what the database system provides. That doesn't mean I'm against > making the change you suggest, but it certainly should become a lesson > to anyone who is considering using primairly jsonb for their storage > that it's risky to do so, because you're removing the database system's > knowledge and understanding of the data, and further you tend to end up > not having the necessary constraints in place to ensure that the data > doesn't end up being garbage- thus letting your application destroy all > the data easily due to an application bug. Ariadne