Well, so far no problems with commit_delay=0. I will of course report back if something happens, but I believe the problem may indeed be solved (or masked) by that setting.
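For anyone wanting to try the same workaround, here is a minimal sketch of one way to apply it. It assumes a 9.4 server (ALTER SYSTEM does not exist before 9.4) and a superuser session; editing postgresql.conf and reloading should work just as well, and this is not necessarily how the setting was changed here.

```sql
-- Check the current values. commit_siblings only matters while commit_delay > 0.
SHOW commit_delay;
SHOW commit_siblings;

-- Assumption: 9.4 or later, run as superuser.
-- Disable the group-commit delay cluster-wide and reload the configuration.
ALTER SYSTEM SET commit_delay = 0;
SELECT pg_reload_conf();
```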
Rough description of our setup, or how to reproduce:
* Timeseries data in a table, say "measurements"; size: 3-4TB, about 1000 inserts/second
* The measurements table also has an insert trigger that copies each new row into measurements_a (for daily export purposes)
Just the above would cause a stuck query after a few days.
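A minimal sketch of that setup, for anyone trying to reproduce it. The column layout, the function name and the trigger name are hypothetical (only the table names come from the description), and whether the real trigger fires BEFORE or AFTER the insert is also an assumption.

```sql
-- Hypothetical columns; the real schema is not shown here.
CREATE TABLE measurements (
    ts     timestamptz      NOT NULL,
    sensor integer          NOT NULL,
    value  double precision
);

-- Staging tables for the daily export, same layout as measurements.
CREATE TABLE measurements_a (LIKE measurements);
CREATE TABLE measurements_b (LIKE measurements);

-- Copy every newly inserted row into measurements_a.
CREATE FUNCTION copy_to_measurements_a() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
    INSERT INTO measurements_a VALUES (NEW.*);
    RETURN NEW;  -- return value is ignored for AFTER ROW triggers
END;
$$;

CREATE TRIGGER measurements_copy
AFTER INSERT ON measurements
FOR EACH ROW EXECUTE PROCEDURE copy_to_measurements_a();
```

Driving this with a steady stream of single-row inserts (about 1000 per second, each in its own transaction) would approximate the load described above.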
Now, for exporting, we run the following CTE query (measurements_b is an empty table, measurements_a holds about 5GB):
* WITH d_rows AS (DELETE FROM measurement_events_a RETURNING * ) INSERT INTO measurement_events_b SELECT * FROM d_rows;
The above caused the problem to appear every time, after 10-20 minutes.
Regards,
-Spiros
On 20 July 2015 at 17:02, Andres Freund <andres@xxxxxxxxxxx> wrote:
On 2015-07-20 17:00:52 +0300, Spiros Ioannou wrote:
> FYI we have a 9.3.5 with commit_delay = 4000 and commit_siblings = 5 with
> an 8TB dataset which seems fine. (Runs on different - faster hardware
> though).
9.4 has a different xlog insertion algorithm (scaling much better), so
that unfortunately doesn't say very much...