On Thu, Jul 22, 2010 at 4:56 PM, Andy Colson <andy@xxxxxxxxxxxxxxx> wrote: > On 7/22/2010 9:41 AM, P Kishor wrote: >> >> I have been struggling with this for a while now, have even gone down >> a few paths but struck out, so I turn now to the community for ideas. >> First, the problem: Store six daily variables for ~ 25 years for cells >> in a grid. >> >> * Number of vars = 6 >> * Number of cells ~ 13 million >> * Number of days ~ 9125 (25 * 365) >> >> Optimize the store for two different kinds of queries: >> >> Query one: Retrieve the value of a single var for all or a portion of >> the cells for a single day. This is analogous to an image where every >> pixel is the value of a single var. > >> SELECT<var> FROM d WHERE yr = ? AND yday = ?; >> SELECT<var> FROM d WHERE yr = ? AND yday = ? AND cell_id IN >> (?,?,?...); > > >> >> Query two: Retrieve values for all the days or a duration of days for >> a single var for a single cell. This is like grabbing a column out of >> a table in which each row holds all the vars for a single day. > >> SELECT<var> FROM d WHERE cell_id = ?; >> SELECT<var> FROM d WHERE cell_id IN (?,?,?...); > > > > First, I must admit to not reading your entire email. I am not sure how to respond to your feedback give that you haven't read the entire email. Nevertheless, thanks for writing... > > Second, Query 1 should be fast, regardless of how you layout the tables. It is not fast. Right now I have data for about 250,000 cells loaded. That comes to circa 92 million rows per year. Performance is pretty sucky. > > Third, Query 2 will return 13M rows? I dont think it matters how you layout > the tables, returning 13M rows is always going to be slow. > Yes, I understand that. In reality I will never get 13 M rows. For display purposes, I will probably get around 10,000 rows to 50,000 rows. When more rows are needed, it will be to feed a model, so that can be offline (without an impatient human being waiting on the other end). Right now, my main problem is that I have either too many rows (~4 B rows) in a manageable number of tables (25 tables) or manageable number of rows (~13 M rows) in too many tables (~9000 tables). > > -Andy > -- Puneet Kishor http://www.punkish.org Carbon Model http://carbonmodel.org Charter Member, Open Source Geospatial Foundation http://www.osgeo.org Science Commons Fellow, http://sciencecommons.org/about/whoweare/kishor Nelson Institute, UW-Madison http://www.nelson.wisc.edu ----------------------------------------------------------------------- Assertions are politics; backing up assertions with evidence is science ======================================================================= -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general