Hi, I am quite new to handle a lot of data in Postgres and I would be happy to get as much advice from experienced data jongleurs as possible. THE SCENARIO: I have a big table `client_log`. It is the "main gate" for clients (and there is a lot of them) to talk to the backend. Each client sends more or less log lines, resulting in the table being written lots of times every second producing a lot of data. Here is the table definition: http://stackoverflow.com/questions/25542010/postgres-huge-table-with-delayed-read-and-write-access *code* determines the nature of the log line. Example: code=20 determines a location log line (the important information will be in *latitude* and *longitude*), code=8999 carries a comment that just needs to get stored (in the *comment* field). Just storing the data is useless, I need to look at it. At some log lines quite regularly, at some just once, at others every now and then. I need indexes on nearly every single column for quick access. So what is the `best` way of storing the data? OPTION 1 - PARTITIONING: For each query only a few columns are interesting and I could partition the table (as it was suggested on SO) by *created* and by *code*. There is roughly 10 different codes and I would keep data for the last two months (partitioned by day). So I would end up having 10 * 60 = 600 partitions. For every partition I could create indexes that are relevant to that partition (examples: *created*, *created_on_server* or *latitude* and *longitude*). OPTION 2 - MULTIPLE TABLES: I could create the tables myself: one for location log lines, one for comment log lines etc. and store them via python in the correct table (depending on *code*). Each of these tables would only have the columns and indexes needed. OUTCOME: I expect partitioning to be faster because Postgres selects the correct partition for me automatically. I can easily get rid of old data by dropping the corresponding partition. The downside of the partition approach is that all partitions inherit all columns of the master table which is unnecessary (and consumes disc space?). Which option should I go for? Is there anything else that I haven't considered? Any help and comments appreciated. PS.: Does partitioning work trouble-free with master-slave replication?
begin:vcard fn:Tobias Fielitz n:Fielitz;Tobias adr:;;90 Pitt St;Sydney;NSW;2000;Australia email;internet:tobias@xxxxxxxxxxxxxx title:StreetHawk tel;cell:0404267511 note:Skype: tobias.fielitz url:streethawk.com version:2.1 end:vcard
-- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general