Chris Mair <chris@xxxxxxxx> writes:
> ...
> Interestingly, if you combine these, it quickly blows up! The following
> query with a limit 1000 already has a RES of well over 1GB. With larger
> limits it quickly thrashes my machine.
>
> enrico=# explain analyze
> SELECT substring((field_id ->'comment')::text,1,1),
>        jsonb_array_elements(field_id ->'comment')->>'id'
> FROM stage.fbcomment limit 1000;

Yeah.  The key point here is that jsonb_array_elements() returns a set,
that is, one tuple per array element.  We've semi-deprecated use of
set-returning functions in SELECT output lists, and one of the reasons is
that memory management in the presence of set-returning functions is a
mess: we can't clear the SELECT's expression-evaluation context until the
SRF is done returning rows.

Some of the rows in this dataset contain 'comment' arrays with over 1000
elements; that means the "substring()" expression gets executed over 1000
times without any opportunity to reclaim memory.  And that expression will
leak a pretty fair amount of memory when dealing with a large field_id
value.  Multiply that by the fact that the original query does this half a
dozen times, and you have a fairly large peak memory consumption.  The
space does get reclaimed after each original table row, but that's no
comfort if you ran out of memory before that.

It would probably be possible to go through the JSONB code and reduce
(though not eliminate) its appetite for memory in this sort of situation
--- for instance, freeing detoasted input arguments would help a lot.
But that's not likely to happen overnight.

In any case, you're right that this is a very inefficiently expressed
query; refactoring things so that the jsonb_array_elements() call is done
just once in a subquery would probably help a great deal.

			regards, tom lane
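
[One possible way to express the refactoring suggested above is a LATERAL
subquery, shown below as a sketch only: the c(elem) alias and the output
column names are made up for illustration, and it assumes the two output
expressions from the quoted query are what's wanted.]

-- Expand the 'comment' array once per table row via LATERAL, instead of
-- calling the set-returning function in the SELECT output list.
SELECT substring((f.field_id -> 'comment')::text, 1, 1) AS first_char,
       c.elem ->> 'id'                                  AS comment_id
FROM stage.fbcomment AS f,
     LATERAL jsonb_array_elements(f.field_id -> 'comment') AS c(elem)
LIMIT 1000;

With the SRF moved out of the output list, the remaining output expressions
should be evaluated in a per-row context that can be reset for each joined
row, giving the executor a chance to reclaim the detoasted copies that the
original formulation holds onto.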