Tobias Brox wrote:
[Madison Kelly - Thu at 10:25:07AM -0500]
Will the priority of the script pass down to the pgsql queries it calls?
I figured (likely incorrectly) that because the queries were executed by
the psql server the queries ran with the server's priority.
I think you are right, and in any case, I don't think the niceness
value won't help much if the bottleneck is iowait.
In our application, I've made a special function for doing
low-priority transactions which I believe is quite smart - though maybe
not always. Before introducing this logic, we observed we had a tipping
point, too many queries, and the database wouldn't swallow them fast
enough, and the database server just jammed up, trying to work at too
many queries at once, yielding the results far too slow.
In the config file, I now have those two flags set:
stats_start_collector = on
stats_command_string = on
This will unfortunately cause some CPU-load, but the benefit is great
- one can actually check what the server is working with at any time:
select * from pg_stat_activity
with those, it is possible to check a special view pg_stat_activity -
it will contain all the queries the database is working on right now.
My idea is to peek into this table - if there is no active queries,
the database is idle, and it's safe to start our low-priority
transaction. If this view is full of stuff, one should certainly not
run any low-priority transactions, rather sleep a bit and try again
later.
select count(*) from pg_stat_activity where not current_query like
'<IDLE>%' and query_start+?<now()
The algorithm takes four parameters, the time value to put in above,
the maximum number of queries allowed to run, the sleep time between
each attempt, and the amount of attempts to try before giving up.
So here are the cons and drawbacks:
con: Given small queries and small transactions, one can tune this in
such a way that the low priority queries (almost) never causes
significant delay for the higher priority queries.
con: can be used to block users of an interactive query
application to cause disturbances on the production database.
con: can be used for pausing low-priority batch jobs to execute only
when the server is idle.
drawback: unsuitable for long-running queries and transactions
drawback: with fixed values in the parameters above, one risks that
the queries never gets run if the server is sufficiently stressed.
drawback: the stats collection requires some CPU
drawback: the "select * from pg_stats_activity" query requires some CPU
drawback: the pg_stats_activity-view is constant within the
transaction, so one has to roll back if there is activity
(this is however not a really bad thing, because one
certainly shouldn't live an idle transaction around if the
database is stressed).
I can see how this would be very useful (and may make use of it later!).
For the current job at hand though, at full tilt it can take a few hours
to run, which puts it into your "drawback" section. The server in
question is also almost under load of some sort, too.
A great tip and one I am sure to make use of later, thanks!
Madi