On Wed, 14 Jan 2015 11:42:45 +1100
Tobias Fielitz <tobias@xxxxxxxxxxxxxx> wrote:

> OPTION 1 - PARTITIONING:
> For each query only a few columns are interesting, and I could
> partition the table (as was suggested on SO) by *created* and by
> *code*.
> There are roughly 10 different codes, and I would keep data for the
> last two months (partitioned by day), so I would end up with
> 10 * 60 = 600 partitions.
> For every partition I could create only the indexes that are relevant
> to it (examples: *created*, *created_on_server*, or *latitude* and
> *longitude*).
>
> OPTION 2 - MULTIPLE TABLES:
> I could create the tables myself: one for location log lines, one for
> comment log lines, etc., and store rows in the correct table via
> Python (depending on *code*). Each of these tables would have only
> the columns and indexes it needs.
>
> OUTCOME:
> I expect partitioning to be faster because Postgres selects the
> correct partition for me automatically, and I can easily get rid of
> old data by dropping the corresponding partition. The downside of the
> partitioning approach is that all partitions inherit all columns of
> the master table, which is unnecessary (and consumes disk space?).

I gather from the comments on this list that null fields have very low
overhead; see:

http://www.postgresql.org/message-id/87prx92lj9.fsf@xxxxxxxxxxxxxxxxxx

I would worry a lot more about the maintenance problems option 2 will
induce: if a code value changes or is added or deleted, your Python
script needs updating. SQL queries will also be a lot more complicated
(a UNION of SELECTs over the various tables) and harder to optimize.
Keeping the script and the tables coherent will get harder and harder.

--
Salutations, Vincent Veyron

https://libremen.com/
Legal case, contract and insurance claim management software
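
For anyone following along, a minimal sketch of what Option 1 could
look like on a 2015-era server, where partitioning is done with table
inheritance and CHECK constraints (declarative partitioning did not
exist yet). The table and column names (log, code, created, latitude,
longitude) are assumptions taken from the description above, not the
actual schema:

    -- Hypothetical parent table; children inherit all of its columns.
    CREATE TABLE log (
        id                bigserial,
        code              integer          NOT NULL,
        created           timestamptz      NOT NULL,
        created_on_server timestamptz,
        latitude          double precision,
        longitude         double precision,
        message           text
    );

    -- One child per (code, day); the CHECK constraints let the planner
    -- (constraint_exclusion = partition, the default) skip children
    -- that cannot match the WHERE clause.
    CREATE TABLE log_code1_20150114 (
        CHECK (code = 1
               AND created >= '2015-01-14'
               AND created <  '2015-01-15')
    ) INHERITS (log);

    -- Indexes only where they help this particular partition.
    CREATE INDEX ON log_code1_20150114 (created);
    CREATE INDEX ON log_code1_20150114 (latitude, longitude);

    -- Rows must be routed to the right child, either by the
    -- application inserting directly into the child table, or by a
    -- BEFORE INSERT trigger on log that redirects each row.

    -- Expiring old data is then just:
    -- DROP TABLE log_code1_20141114;

With that in place, a query such as
SELECT ... FROM log WHERE code = 1 AND created >= '2015-01-14'
should only touch the matching children, which is the automatic
partition selection Tobias is hoping for.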