Yes, of course; that's all verified and taken into account during code initialization.
From: Vitalii Tymchyshyn [mailto:vit@xxxxxx]
Sent: Wednesday, February 22, 2017 8:14 PM
To: John Gorman; pgsql-performance@xxxxxxxxxxxxxx
Subject: Re: [PERFORM] Correct use of cursors for very large result sets in Postgres
For JDBC there are certain prerequisites for `setFetchSize` to work, e.g. using a forward-only result set and running inside a transaction (autocommit off).
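As a reference point, here is a minimal sketch of those prerequisites with the PostgreSQL JDBC driver. The helper names (`isStreamingConfigured`, `configureStreaming`) are illustrative, not part of any API; the conditions themselves (autocommit off, forward-only result set, positive fetch size) are what the driver requires before it will fetch in batches instead of buffering everything:

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class Streaming {

    // The PgJDBC driver only uses batched, cursor-based fetching when all of
    // these hold; otherwise it buffers the entire result set in client memory.
    public static boolean isStreamingConfigured(boolean autoCommit,
                                                int resultSetType,
                                                int fetchSize) {
        return !autoCommit                                      // must run inside a transaction
                && resultSetType == ResultSet.TYPE_FORWARD_ONLY // forward-only result set
                && fetchSize > 0;                               // positive fetch size (0 = fetch all)
    }

    // Illustrative helper: set up a statement that satisfies the prerequisites.
    public static Statement configureStreaming(Connection conn, int fetchSize)
            throws SQLException {
        conn.setAutoCommit(false); // prerequisite: no autocommit
        Statement st = conn.createStatement(ResultSet.TYPE_FORWARD_ONLY,
                                            ResultSet.CONCUR_READ_ONLY);
        st.setFetchSize(fetchSize); // rows requested per round trip
        return st;
    }
}
```

If any one of the three conditions is missed, the driver silently falls back to reading the whole result set, which looks exactly like the behaviour described below.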
My experience with cursors in PostgreSQL with Java has been to stay away from them. We support
two databases with our product: PostgreSQL (the default) and SQL Server. While re-encrypting data in a database, the application used cursors with a fetch size of 1,000.
It worked perfectly on SQL Server, and on PostgreSQL until we got to a PostgreSQL table with
more than 11 million rows. After spending weeks trying to figure out what was happening, I realized that once a table exceeded roughly 10 million rows, for some reason the cursor functionality silently stopped working and the driver was reading the entire
table. I asked another very senior architect to look at it, and he came to the same conclusion. Because of limited time, I ended up working around it using LIMIT/OFFSET.
Again, we are using Java, so the problem could just be in the PostgreSQL JDBC driver. Also,
we were on 9.1 at the time.
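For what it's worth, the LIMIT/OFFSET workaround mentioned above can be sketched as follows. The class and table/column names are hypothetical; each page is an independent query, so no server-side cursor state is involved:

```java
import java.util.ArrayList;
import java.util.List;

public class Paging {

    // Build one page query. Ordering by a unique key keeps page boundaries
    // stable between queries (table and column names are hypothetical).
    public static String pageQuery(String table, String orderCol,
                                   long offset, int pageSize) {
        return "SELECT * FROM " + table
                + " ORDER BY " + orderCol
                + " LIMIT " + pageSize
                + " OFFSET " + offset;
    }

    // The sequence of offsets needed to cover rowCount rows.
    public static List<Long> offsets(long rowCount, int pageSize) {
        List<Long> result = new ArrayList<>();
        for (long off = 0; off < rowCount; off += pageSize) {
            result.add(off);
        }
        return result;
    }
}
```

Note that OFFSET still makes the server scan and discard the skipped rows, so later pages get progressively slower on very large tables; it is a workaround, not a substitute for working batched fetches.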
Regards
John
From: pgsql-performance-owner@xxxxxxxxxxxxxx [mailto:pgsql-performance-owner@xxxxxxxxxxxxxx] On Behalf Of Mike Beaton
Sent: Tuesday, February 21, 2017 6:49 AM
To: pgsql-performance@xxxxxxxxxxxxxx
Subject: Re: [PERFORM] Correct use of cursors for very large result sets in Postgres
Thanks, Tom.
Wouldn't this mean that cursors are noticeably non-optimal even for normal data sizes, since the entire data set to be streamed from the table is always duplicated into another buffer and only then streamed?
> if you want the whole query result at once, why are you bothering with a cursor?
So, following that advice, it's not unreasonable that I would actually have a cursor over a large dataset.
I would then ideally want to be able to fetch the data from that cursor without the entire dataset getting duplicated as I go (even if
only a bit at a time instead of all at once, which seems to be the best-case behaviour).
Additionally, I thought that if I have a streaming use case (which I do) and a streaming data-access layer (which I do), then
since `SELECT * FROM large` is absolutely fine, end to end, in that situation, by symmetry and the principle of least astonishment `FETCH ALL FROM cursor` might be fine too.
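The batched alternative to `FETCH ALL` can be sketched as follows. The cursor name and batch size are illustrative; the helpers just build the SQL a client would send, inside a transaction, to pull rows in bounded chunks instead of all at once:

```java
public class CursorFetch {

    // SQL to open a server-side cursor (must run inside a transaction).
    public static String declare(String cursor, String query) {
        return "DECLARE " + cursor + " NO SCROLL CURSOR FOR " + query;
    }

    // SQL to pull the next batch; the server sends only batchSize rows per
    // FETCH, so the client never needs to buffer the whole result.
    public static String fetchBatch(String cursor, int batchSize) {
        return "FETCH FORWARD " + batchSize + " FROM " + cursor;
    }
}
```

Looping on `FETCH FORWARD n` until zero rows come back gives the incremental behaviour being asked for here, at the cost of one round trip per batch.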