Search Postgresql Archives

Re: Processor usage/tuning question

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



-----------------------------------------------
Israel Brewster
Systems Analyst II
Ravn Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7293
-----------------------------------------------


BEGIN:VCARD
VERSION:3.0
N:Brewster;Israel;;;
FN:Israel Brewster
ORG:Frontier Flying Service;MIS
TITLE:PC Support Tech II
EMAIL;type=INTERNET;type=WORK;type=pref:israel@xxxxxxxxxxxxxxxxxx
TEL;type=WORK;type=pref:907-450-7293
item1.ADR;type=WORK;type=pref:;;5245 Airport Industrial Wy;Fairbanks;AK;99701;
item1.X-ABADR:us
CATEGORIES:General
X-ABUID:36305438-95EA-4410-91AB-45D16CABCDDC\:ABPerson
END:VCARD
 

On Oct 10, 2014, at 1:04 PM, Jim Nasby <Jim.Nasby@xxxxxxxxxxxxxx> wrote:

> On 10/8/14, 3:17 PM, Israel Brewster wrote:
>> Except that the last data point received is still valid as the aircraft's current location, even if it came in several hours ago - which we may well care about. That is, some users may need to see where a given aircraft (or the entire fleet) is, even if an aircraft hasn't updated in a while. That said, I did discuss this with my higher-ups, and got the ok to take it down to four hours.
> 
> Note that in your explain output nothing is filtering by time at all; are you sure you posted the right explain?
> 
>>> I don't think PostgreSQL is going to be able to reason very effectively about a ROW_NUMBER() in a inner table and then a row<=5 in the outer one being equivalent to a LIMIT query for which it could walk an index and then stopping once it finds 5 of them.
>>> 
>>> Does this need to issued as a single query?  Why not issue 55 different queries?  It seems like the client is likely going to need to pick the returned list back out by tail number anyway, so both the client and the server might be happier with separate queries.
>> 
>> Good point. Doing that brought the entire execution time down to around 60ms. Just ever so slightly better than the ~1200ms I was getting before. :-) I just have an unreasonable aversion to making multiple queries. I guess this is a prime example of why I need to overcome that :-)
> 
> Do you actually need the last 5 points? If you could get away with just the most recent point, SELECT DISTINCT ON might do a better job of this in a single query.

At the moment, unfortunately yes - I have to do some calculations based on the past few data points. At some point I should be able to re-work the system such that said calculations are done when the points are saved, rather than when they are retrieved, which would be beneficial for a number of reasons. However, until I can get that done I need multiple points here.

> 
> As for the concern about issuing multiple queries, if you code this into a database function it should still be quite fast because there won't be any round-trip between your application and the database.

I've had phenomenally bad luck with coding queries into database functions. I had a number of functions written at one point that allowed me to do things like select <table>.function, <other_column> FROM table - until I noticed that said queries ran significantly slower than just doing the query I had encoded in the function as a sub-query instead. I was doing these same sub-queries in a bunch of different places, so I figured it would clarify things if I could just code them into a DB function that I called just like a column. It's been a while since I looked at those, however, so I can't say why they were slow. This usage may not suffer from the same problem.

> 
> Something else to consider is having a second table that only keeps the last X aircraft positions. I would do this by duplicating every insert into that table via a trigger, and then have a separate process that ran once a minute to delete any records other than the newest X. Because that table would always be tiny queries against it should be blazing fast. Do note that you'll want to vacuum that table frequently, like right after each delete.

Indeed. I think I'm happy with the performance of the multiple queries, but this would doubtless be the "best" option (from a performance standpoint), as the table would be small and my select would be essentially SELECT * FROM TABLE, with a potential WHERE ... IN... clause.

Thanks for all the help!

> -- 
> Jim Nasby, Data Architect, Blue Treble Consulting
> Data in Trouble? Get it in Treble! http://BlueTreble.com
> 
> 
> -- 
> Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general

-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux