Re: Growth planning

Israel Brewster <ijbrewster@xxxxxxxxxx> · Mon, 4 Oct 2021 13:37:49 -0800

On Oct 4, 2021, at 1:21 PM, Rob Sargent <robjsargent@xxxxxxxxx> wrote:

    On 10/4/21 3:09 PM, Israel Brewster
      wrote:

          On Oct 4, 2021, at 12:46 PM, Ron <ronljohnsonjr@xxxxxxxxx>
            wrote:

             On 10/4/21 12:36 PM, Israel Brewster wrote:

              [snip]

                  Indeed. Table per station as opposed to
                    partitioning? The *most* I can reasonably envision
                    needing is to query two stations, i.e. I could see
                    potentially wanting to compare station a to some
                    “baseline” station b. In general, though, the
                    stations are independent, and it seems unlikely that
                    we will need any multi-station queries. Perhaps
                    query one station, then a second query for a second
                    to display graphs for both side-by-side to look for
                    correlations or something, but nothing like that has
                    been suggested at the moment.

              Postgresql partitions are tables.  What
              if you partition by station (or range of stations)?

        Yeah, that’s what I thought, but Rob had said “Table per
          station”, so I wasn’t sure if he was referring to *not* using
          partitioning, but just making “plain” tables.

        Regardless, I intend to try portioning by station sometime
          this week, to see how performance compares to the “one big
          table” I currently have. Also to figure out how to get it set
          up, which from what I’ve seen appears to be a bit of a pain
          point.
        ---

    My "strict" table per station suggestion was meant as an option to
    avoid the partitioning pain point entirely if it wasn't going to buy
    you anything. Namely querying more than one station's data.

Ah, so in theory making “strict” tables for each would be easier than creating partitions for each? Something to consider for sure if so.

    In a write-once scenario such as this,  would a "clustered index" on
    datetime be stable, performant?  Seems a read-for-export could put
    the head down at time point A and just go? 

That’s beyond my level of DB admin knowledge, unfortunately :) I can certainly read up on it and give it a try though!

---
Israel Brewster
Software Engineer
Alaska Volcano Observatory 
Geophysical Institute - UAF 
2156 Koyukuk Drive 
Fairbanks AK 99775-7320
Work: 907-474-5172
cell:  907-328-9145