Re: Air-traffic benchmark

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Lefteris wrote:
So we all agree that the problem is on the scans:)

So the next question is why changing shared memory buffers will fix
that? i only have one session with one connection, do I have like many
reader workers or something?

I wouldn't expect it to. Large sequential scans like this one are optimized in PostgreSQL to only use up a small portion of the shared_buffers cache. Allocating more RAM to the database won't improve the fact that you're spending the whole time waiting for physical I/O to happen very much.

What might help is increasing effective_cache_size a lot though, because there you might discover the database switching to all new sorts of plans for some of these queries. But, again, that doesn't impact the situation where a sequential scan is the only approach.

I have this whole data set on my PC already and have been trying to find time to get it loaded and start my own tests here, it is a quite interesting set of information. Can you tell me what you had to do in order to get it running in PostgreSQL? If you made any customizations there, I'd like to get a copy of them. Would save me some time and help me get to where I could give suggestions out if I had a "pgdumpall --schema-only" dump from your database for example, or however you got the schema into there, and the set of PostgreSQL-compatible queries you're using.

By the way: if anybody else wants to join in, here's a script that generates a script to download the whole data set:

#!/usr/bin/env python
for y in range(1988,2010):
   for m in range(1,13):
print "wget --limit-rate=100k http://www.transtats.bts.gov/Download/On_Time_On_Time_Performance_%s_%s.zip"; % (y,m)

It's 3.8GB of download that uncompresses into 46GB of CSV data, which is why I put the rate limiter on there--kept it from clogging my entire Internet connection.

--
Greg Smith    2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
greg@xxxxxxxxxxxxxxx  www.2ndQuadrant.com


--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux