
Re: Database slowness -- my design, hardware, or both?


 



Hi, Webb Sprague.  You wrote:
> ... but I see two seq scans in your explain in a loop -- this is
> probably not good.  If you can find a way to rewrite the IN clause
> (either de-normalizing through triggers to save whatever you need on
> an insert and not have to deal with a set, or by using EXCEPT in the
> query, or something else more creative)...
I would normally agree that an IN clause is a place to worry -- except that I'm using IN on a very small table, with about 4-5 rows. It might still affect things, and I could certainly pull those values out into a Perl variable that I then interpolate literally into the SQL. But I have to assume that the difference would be small.
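For what it's worth, a plain join can sometimes replace an IN and give the planner more options. A sketch only, with hypothetical table and column names, since the actual schema isn't shown here:

```sql
-- Original shape: IN against a small lookup table.
SELECT t.id, t.value
  FROM transactions t
 WHERE t.transaction_type IN (SELECT name FROM transaction_types);

-- Equivalent join, assuming transaction_types.name is unique
-- (a duplicate name would multiply rows in the join form).
SELECT t.id, t.value
  FROM transactions t
  JOIN transaction_types tt ON tt.name = t.transaction_type;
```

With only 4-5 rows in the lookup table, though, either form should be cheap; the seq scans in the loop are the likelier culprit.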
> Also -- there is a good book on temporal databases by Snodgrass that
> might give some interesting ideas; maybe you have already seen it, but
> still.
Heh -- that's one of the first references that I looked at when I started this project. Perhaps I'll look at it again; the design of my data warehouse took some of those suggestions into account, but I wouldn't be surprised if there's something more I could be doing.
> I am thinking you could increment a sequence variable to give
> you a "tick" integer with each action, rather than trying to use
> timestamps with all their overhead and inaccuracy (1 second is a long
> time, really).  Lamport also did work on clocks that might apply.
I agree that 1 second is a ridiculously long time. The logfiles were created by people outside of my research group, and are a wonderful lesson in how to use XML poorly and inefficiently. The designers of the logfile weren't expecting to get dozens or hundreds of values per second. You can be sure that my research group, which is now looking to create logfiles of our own, will place a strong emphasis on high-resolution timestamps.

Practically speaking, I've managed to get around this problem by using the "id" column, which comes from a sequence and is thus guaranteed to be increasing. We can thus be assured that a row with an ID of x was inserted before a row whose ID is x + 10, so long as their transaction types are the same. So while we can't guarantee that a node entry comes before a variable setting via the ID, we can order two variable settings based on the ID.
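Concretely, the ordering trick above might look something like this, with hypothetical table and column names:

```sql
-- Rows of the same transaction type can be ordered reliably by id,
-- since id comes from a sequence and is monotonically increasing,
-- even when many rows share the same 1-second timestamp.
SELECT id, var_name, var_value
  FROM transactions
 WHERE transaction_type = 'variable_setting'
 ORDER BY id;
```

This is essentially the "tick" idea already: the sequence behind the id column is a logical clock, in the Lamport sense, for rows of a given type.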
> Also have you tried dropping all your fk and checks just to see if you
> get a difference in speed on an update?  It would be interesting,
> perhaps.
I sorta figured that because our UPDATE takes place on a column without a foreign key, this wouldn't matter much. Am I wrong?
> If you could get rid of the sorted limit 1 clause in your function,
> there would be less variability and it would be easier to understand; you
> probably need to denormalize somehow, perhaps using ticks; I am not
> sure....
Yeah, that's a good point. I'm not sure how possible that'll be, though, given that at *some* point, I need to find the "most recent" setting of the same variable. There might be dozens or hundreds of rows separating two assignments to a particular variable.
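If the per-row sorted LIMIT 1 lookup ever becomes the bottleneck, PostgreSQL's DISTINCT ON can fetch the most recent setting of every variable in a single pass instead. A sketch with hypothetical names:

```sql
-- Latest setting of each variable in one scan, rather than
-- one "ORDER BY ... LIMIT 1" lookup per row: DISTINCT ON keeps
-- the first row of each var_name group under the given ordering.
SELECT DISTINCT ON (var_name)
       var_name, var_value, id
  FROM transactions
 WHERE transaction_type = 'variable_setting'
 ORDER BY var_name, id DESC;
```

Joining a set like this back against the rows being updated can turn many small sorted lookups into one set-oriented operation.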

> Could a trigger set your previous_value on insert or update, rather
> than querying for it later?
Well, what I've been doing is INSERTing each row into the "transactions" table without previous_value. Then I run the mass UPDATE for previous_value, doing nothing else at the same time. Then, after previous_value is set, I do the query.
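A trigger along the lines Webb suggests might look like this -- a sketch only, with hypothetical table and column names, and assuming previous_value should come from the latest earlier row for the same variable:

```sql
-- Fill in previous_value at INSERT time, replacing the later mass UPDATE.
CREATE OR REPLACE FUNCTION set_previous_value() RETURNS trigger AS $$
BEGIN
    -- Look up the most recent earlier setting of this variable;
    -- previous_value stays NULL if there is none.
    SELECT var_value INTO NEW.previous_value
      FROM transactions
     WHERE var_name = NEW.var_name
       AND transaction_type = 'variable_setting'
     ORDER BY id DESC
     LIMIT 1;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER transactions_previous_value
    BEFORE INSERT ON transactions
    FOR EACH ROW
    EXECUTE PROCEDURE set_previous_value();
```

The trade-off is that each INSERT now pays for the lookup, but the lookup happens while the relevant rows are likely still cached, and the separate mass UPDATE (with all its dead-tuple churn) disappears.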

Reuven


