> On Wed, Apr 14, 2021 at 10:26:23AM -0400, Tom Lane wrote:
> Dmitry Dolgov <9erthalion6@xxxxxxxxx> writes:
> >> On Tue, Apr 13, 2021 at 11:38:04PM -0700, Mitar wrote:
> >> ... Namely, it looks like writing into a jsonb typed
> >> column is 30% faster than writing into a json typed column. Why is
> >> that? Does not jsonb require parsing of JSON and conversion? That
> >> should be slower than just storing a blob as-is?
>
> > * Parsing is happening in both cases; for json it's done for
> >   validation purposes.
>
> Right.
>
> > * Compression of json data actually dominates the runtime load for
> >   large json objects, making other factors less visible and reducing
> >   the difference in size (jsonb should also be a bit bigger, which is
> >   why I would have expected it to be slower).
>
> > * At the same time, perf shows that json compression spends a bit
> >   more time in pglz_find_match (mostly around accessing and checking
> >   history entries), which is a compression optimization via a history
> >   table. So probably, due to differences in data layout, this
> >   optimization works slightly worse for plain json than for jsonb?
>
> Interesting. I recall that we made some last-minute changes in the
> JSONB physical representation after finding that the original choices
> resulted in sucky compression behavior. But I think we were only
> looking at the resultant compressed size, not time-to-compress.
>
> My own guess about this, without having tried to reproduce it, is that
> JSONB might end up physically smaller than JSON, resulting in less work
> to push the toasted datum out to disk. This'd depend a lot on your
> formatting habits for JSON, of course. But in any case, it'd be worth
> comparing pg_column_size() results to see what's up with that.

Oh, of course: I had missed that the input I was using was indeed
formatted. Without formatting, both cases perform equally well and I
can't reproduce the issue. Although, if I understand correctly, the
original code in question doesn't actually do any formatting.
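
For anyone who wants to see the whitespace effect directly, a quick
check in psql (a minimal sketch with a made-up document; only the stock
pg_column_size() and jsonb_pretty() functions are used):

    -- compact input: json keeps the input text verbatim, jsonb is
    -- parsed into its binary form, so the sizes are already close
    SELECT pg_column_size('{"a": 1, "b": [1, 2, 3]}'::json)  AS json_size,
           pg_column_size('{"a": 1, "b": [1, 2, 3]}'::jsonb) AS jsonb_size;

    -- pretty-printed input: json retains every byte of formatting
    -- whitespace, jsonb discards it during parsing
    SELECT pg_column_size(jsonb_pretty(doc)::json)  AS pretty_json_size,
           pg_column_size(jsonb_pretty(doc)::jsonb) AS pretty_jsonb_size
    FROM (SELECT '{"a": 1, "b": [1, 2, 3]}'::jsonb AS doc) s;

With a formatted input the json column simply has more bytes to
compress and toast, which lines up with the difference disappearing
once the formatting is gone.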