Sorry it took me a while to respond, but I've re-factored all of this process based on the suggestions.
On Wed, Jan 7, 2015 at 7:49 PM, Maxim Boguk <maxim.boguk@xxxxxxxxx> wrote:
On Wed, Jan 7, 2015 at 8:49 PM, Pawel Veselov <pawel.veselov@xxxxxxxxx> wrote:

PS: your setup looks pretty complicated and hard to analyze without seeing all the involved table structures, the transaction/query flow, and (especially) the source code of the involved procedures.

Sure :) At this point, I've put together the "bulk merge" code as well. I can't quite see much of a difference, actually, but it's hard to trust the execution times, as on the same amount of data they vary from, say, 0.5s to 2s, and the sample data is not stepping on any other locks. In general, I'm afraid all those left joins and multiple scans, even over a small amount of data, are nullifying any positive effect.

Now some ideas to check.

High CPU usage usually isn't related to locking, but to seq scans, wrong plans, or simply inefficient pl/pgsql code; locked processes usually don't use much CPU.

1) On the test database, perform select pg_stat_reset(); then perform a full round of merges, then check
select * from pg_stat_user_tables where seq_scan>0 order by seq_tup_read;
and if you find a lot of seq_scan and seq_tup_read on a particular table, try to find where they are coming from (it could be the reason for the high CPU usage).

2) Enable track_functions in postgresql.conf and perform the same sequence (select pg_stat_reset() + a full round of merges), then check
select * FROM pg_stat_user_functions order by self_time desc;
and see which function uses the most time.
These are good pointers. If the new process ends up having the same sort of problems, this will come in handy for figuring out where they are coming from, thank you.
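For my own notes, the full check would go roughly like this (this assumes track_functions = 'pl' is already set in postgresql.conf and the config reloaded; the merge calls themselves are ours, so they are just a placeholder here):

  -- reset the statistics counters right before the test run
  select pg_stat_reset();

  -- ... run a full round of merges here ...

  -- tables that got sequentially scanned, heaviest readers first
  select relname, seq_scan, seq_tup_read
    from pg_stat_user_tables
   where seq_scan > 0
   order by seq_tup_read desc;

  -- time spent inside each function itself (needs track_functions)
  select funcname, calls, self_time
    from pg_stat_user_functions
   order by self_time desc;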
3) Old/lost prepared transactions can have a deadly effect on database performance as a whole. So check select * from pg_prepared_xacts; and verify that you don't have hours-old (or weeks-old) prepared xacts lying around.
If there are "lost" prepared transactions, they will lock up a particular instance from being able to write into its table data, so it will just stall the node. But does happen, and we have an application mechanism to find and delete those.
PS: btw, I still don't fully understand the relation between:
"- merges data into its own node tables (using merge_xxx PL/pgSQL functions)"
and the provided code for public."merge_all02-9A-46-8B-C1-DD" and PUBLIC.merge_agrio. As I see it, public."merge_all02-9A-46-8B-C1-DD" calls PUBLIC.merge_agrio, and PUBLIC.merge_agrio updates a global table R_AGRIO (but not the "own node table").
It's a bit irrelevant at this point, but: the merge_all02-9A-46-8B-C1-DD() function takes all data for the 02-9A-46-8B-C1-DD node and moves it into the master table. There is an analogous merge_02-9A-46-8B-C1-DD() function that takes data from the application and writes it into the tables for the 02-9A-46-8B-C1-DD node. The process of moving data node tables -> main tables and application -> node tables is nearly identical, hence I only provided the body once. The big difference is that when merging into the master there is a lot more data to look through, as the node tables only contain data that has not yet been merged into the master.
I think the best implementation of such a task is asynchronous processing of these changes via a background process. The application only inserts events into a queue table (a lockless operation), and some background process reads the data from the queue table, merges it into the main table (again lockless, because it's a single thread, so there are no concurrent writes), and then deletes the merged data from the queue table.
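A minimal sketch of that flow, with illustrative table and column names (not the real schema):

  -- queue table: the application only ever INSERTs into it
  create table r_agrio_queue (
      id       bigserial    primary key,
      node     text         not null,
      payload  json         not null,
      queued   timestamptz  not null default now()
  );

  -- single background worker, one batch per iteration
  begin;
  -- grab a batch; nothing else reads or deletes from the queue
  select id, node, payload
    from r_agrio_queue
   order by id
   limit 1000;
  -- ... merge the selected rows into the main tables ...
  -- then delete exactly the ids that were merged
  delete from r_agrio_queue where id = any (:merged_ids);
  commit;

Since there is only one consumer, the worker never competes with another writer on the main tables, and the application's inserts into the queue never block.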
Well, that was a really good suggestion, thank you. Some weeks later, I've put it together. This hasn't hit production yet, so I've yet to see the overall improvement effect. Along with turning it into a queue, I've added provisions to try to combine as much data as possible before writing it out into the database tables, and merged all of the satellite tables with the main data.
Before, I had:
r_agrio
r_brk_xxx (multiple entries reference rows in r_agrio)
Now, I have:
r_agrio_daily
r_agrio_total
r_agrio_hourly
All the data that was in the r_brk_xxx tables is now in columns of the r_agrio* tables. To get around the fact that there are potentially multiple BRK records for each AGR record, the data is now stored as a JSON object. The primary key used for the BRK tables is turned into a string that serves as a key in a top-level JSON object. This should help me tremendously on the side that needs to read that data, as previously I had to join or left join the BRK tables.
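To illustrate the difference on the read side (column names here are made up, not the actual ones):

  -- before: the BRK data had to be joined in
  select a.*, b.*
    from r_agrio a
    left join r_brk_xxx b on b.agr_id = a.id;

  -- now: one row, and a BRK entry is pulled out of the JSON column
  -- by its stringified primary key
  select a.*, a.brk_data -> '12345|7|3' as brk_entry
    from r_agrio_daily a;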
Splitting this into 3 tables may come back and bite me, since it means two more inserts and the corresponding lookups, but it seriously helps me on the reading side of things.
The code that aggregates the JSON data is still done in PL/pgSQL, which is probably a bad idea, considering that PL/pgSQL doesn't have good ways of manipulating JSON data in-place (I can't run PL/V8, or untrusted languages). I should probably move this logic to Java and use updatable result sets to modify the data in place. The reason I left it in PL/pgSQL is that I didn't want to do a select and then an update, making it two lookups per update. Another concern with moving it to the application side is the network round trips.
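For comparison, the application-side version I want to avoid is roughly this (illustrative column names); it's a select plus an update per record, unless an updatable result set folds the write-back into the same cursor:

  begin;
  -- first lookup: fetch and lock the current JSON
  select brk_data from r_agrio_daily where agr_key = '...' for update;
  -- ... merge the incoming data into that JSON in Java ...
  -- second lookup: write the merged value back
  update r_agrio_daily set brk_data = '...' where agr_key = '...';
  commit;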