Search Postgresql Archives

Re: background triggers?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



My understanding of Listen/Notify is that it is a completely disconnected process running on the database server. It should be run as a dameon (or service in Windows) application. The trigger on the table calls the notify SQL command and then the trigger, and thereby the statement, ends. The notify then tells this daemon application to wake up and start processing. To make sure that a second process does not start while the first process is running, you should have a running processes table which gets inserted when it starts and updated when it ends. That way your process can check if one is currently running or not.

Another way of doing it is to have a cron job check every X minutes for records in the table. When they are there, it should run the process. It can also have a flag that says don't run another process until this one is finished.

I may not have understood exactly what you are trying to do, but from what I understood, this will solve your problem.

Sim

Rafal Pietrak wrote:
On Tue, 2006-05-23 at 13:01 -0400, Kenneth Downs wrote:
Rafal Pietrak wrote:
some other INSERT, return imediately if so, but turn into background for
a long-lasting job if not".

Rafal, I'm wondering why you want to do this. You may be "fighting the framework".

Yes, most probably. I'm afraid of that :(

Still...

If you are trying to do something that is totally unsupported, it is probably for a pretty good reason, usually dealing with security or data loss. You can probably get what you want by supported methods, but it may require looking at the problem in a different way.

What is it you are trying to accomplish? Is it just performance?


OK. here is a 'real life' example. It works more like a post office. Now
and then, there arrive a 'delivery man' with a bunch of post to deliver.
Post office takes the batch, checks and stamps each and every item, and
hands over a receipt. But the actual bin-ing (into delivery channels)
and routing (dispatch) is handled without the delivery man standing and
waiting for the process to end.

In my *real*life* case I have a file with hundreds of tousends of
'transactions' uploaded by operator (several times times a day, and in
fact, by 10-30 operators) - those are checked at INSERT time. But after
that check and INSERT, I'd like to say to the operator: "OK, jour job is
done, don't warry about the rest". But there is more work to do with the batch.
I have to 'route the transactions' to their relevant accounts, and see
how those change the 'status' of those accounts, consequently, store the
updated status within the account itself. This is tedious and time
consuming. But it have to be done, because 'customers' query account
status for those 'agregate status information' and it would be *very*
haevy for the database if those queries required browsing of the entire
'transaction log'. Number of 'Transactions' to number of accounts is
like milions to thousends.

A plain INSERT of batch takes 5-10minutes on desktop postgresql (800MHz
machine, ATA disks). When I attach trigger (*Very* simple funciton) to
update the accounts, the INSERT take hours (2-4). But when I make just
one single update of all accounts at the end of the batch insert, it
takes 20-30min.

So my solution was 1) to have an additional table "TABLE dirty_log(tiem
timestamp, who text)", which gets inserted a row *after* a batch of
INSERTS. 2) and a 'server side process', run every 5-10 minutes, which
makes accounts update, and which: A) does NOT launche when another such
process is currently running; B) purges DIRTY_LOG table after it's done.
This is quite obvoisly a 'user space implementation' of the 'background
triggers'. I needed that.

Natuaraly, having this 'bacground trigger' I loose acuracy of the
account information. But I gain on system efficiency - this is
engineering decision. When we have to take those (like the
implementation above), it's good to have 'system tools' (like
'background triggers') that support us.

But of cource I may be wrong all togather. I'd really like to know the
techniq, which is 'along the line' of RDBM systems design, which serves
that same purpose.
Is there a better solution?

NB: the 'batch INSERT' I mentioned above is done by www server. It's
quite vital to have the the server process terminated (meaning: not keep
it waiting for LISTEN/NOPTIFY event - meaning not use LISTEN/NOTIFY), as
apache will keep the connection opened until the process ends.

In 'real life', this scenario is applicable also to 'service network':
1. Say, you have 100_000 - 1000_000 vending machines (VM) network.
2. each is is loaded with c.a. 100 item types (300 types in the entire
network).
3. each VM dispatches an item every 2-3 minutes. which make overall
'transaction traffic' at the level of over hundreds per second.
4. assume, that for 'customer management', you need to store quite a bit
of data with each item-dispense 'transaction'. Meaning: transaction are
not very light, and their details have to be kept for long time.
5. obviously, you need to manage your stock (each of the 300
item-types): you keep VM loaded and keep some stock at central store.

(ATMs are a good example of such netowrk)

So:
1. 'transaction traffic' is so signifficant, that you really have to
'optimise for that'
2. you don't really have to know *exactly* when you run out of stock,
because each VM has signifficant local item store, so if you get
notified, that a particular VM gets close to the bottom with particular
item, you may dispatch a reload in 10min, but it's also OK to dispatch
that in 2hours - meaning, the 'acocunt information' does not have to be
'immediately acurate'. Far more important is 'dispatch transaction'
performance.
3. normally, you 'keep an eye' on you VM network - meaning, you issue a
'statistics' query quite frequently. If that was a 'haevy query' it
would degrade your database performance quite signifficantly - we really
need the 'agregate information' stored within 'item-accounts'.

Is there a clean, 'along the framework' design that serves this reality?

-R


---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend



[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux