Hi,
thanks for your answer.
Let me give some background. I have a Postgres instance that serves as the data storage for a web-based data analytics application. For some queries, memory use grows so large that the Linux kernel's OOM killer kills the postgres backend process. Afterwards I observe the same kind of log messages and data loss. I'm now trying to reproduce this behaviour in a more deterministic way to understand the root cause and resolve the issue.
The bulk import does not run inside a transaction, for performance reasons. My understanding is that, in case of a crash, I might end up with partial data (which I handle in the application). However, I would not expect rollback-like behaviour a few minutes after the bulk import completed successfully.
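In simplified form, the import path looks roughly like this (table and file names are placeholders for the real ones):

    import psycopg2

    # Simplified sketch of the bulk import path; "measurements" and
    # "chunk.csv" are placeholders. With autocommit enabled, each COPY
    # runs in its own implicit transaction -- the application issues no
    # explicit BEGIN/COMMIT around the import.
    conn = psycopg2.connect("dbname=analytics")
    conn.autocommit = True

    with conn.cursor() as cur, open("chunk.csv") as fh:
        cur.copy_expert("COPY measurements FROM STDIN WITH (FORMAT csv)", fh)

    conn.close()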
FWIW, the analytics application also allows users to annotate the data, and these annotations are written to the database in transactions.
So to answer your questions:
> What kind of transaction did you use?
No transaction for the bulk import. Also, the bulk import completed minutes before the kill. After the bulk import, a number of transactions touching different tables were performed.
> Did you commit the transaction?
The bulk import was not done in a transaction. The other transactions were committed through the database access framework I'm using in my (Python/Django) application.
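The annotation writes boil down to something like this (a simplified sketch; app, model and field names are placeholders for the real ones):

    from django.db import transaction

    from myapp.models import Annotation  # placeholder app/model names

    # Simplified sketch of an annotation write. transaction.atomic() opens
    # a transaction and commits it when the block exits without an exception.
    def save_annotation(record, text):
        with transaction.atomic():
            Annotation.objects.create(record=record, text=text)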
> Why?
To reproduce the problematic behaviour that I'm seeing in my application.
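In essence, the reproduction amounts to sending SIGKILL to a backend process, which is what the OOM killer does (simplified sketch; it has to run as a user allowed to signal the postgres processes):

    import os
    import signal

    import psycopg2

    # Simplified sketch of the reproduction: find the PID of this
    # connection's backend process and SIGKILL it, mimicking the OOM
    # killer. The postmaster then restarts all backends and performs
    # crash recovery.
    conn = psycopg2.connect("dbname=analytics")
    with conn.cursor() as cur:
        cur.execute("SELECT pg_backend_pid()")
        backend_pid = cur.fetchone()[0]

    os.kill(backend_pid, signal.SIGKILL)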
Does this help? Where could I look to understand this better?
Thanks,
Manuel

From: Ron <ronljohnsonjr@xxxxxxxxx>
Sent: Tuesday, June 15, 2021 13:07
To: pgsql-general@xxxxxxxxxxxxxxxxxxxx
Subject: [ext] Re: Losing data because of problematic configuration?

On 6/15/21 5:42 AM, Holtgrewe, Manuel wrote:
What kind of transaction did you use? Did you commit the transaction?
Why? Did you CHECKPOINT beforehand? (I'm hypothesizing that data didn't get flushed to disk, and so Pg "cleaned itself up" after the crash.)