Citus is also now just an extension. Sounds pretty awesome, I'll certainly consider your system if/when we decide to make changes. On 22 April 2017 at 08:41, Andrew Staller <andrew@xxxxxxxxxxxxx> wrote: > Samuel, > > Short answer to your questions: (1) TimescaleDB and CitusDB are focusing on > solving different problems, and (2) TimescaleDB is an Apache 2-licensed > extension to run in your Postgres database, not a fork or different system. > > Longer answer to your first question: > > From what we've read and the conversations we've had with Citus (awesome > folks, btw), they're primarily solving a different problem -- OLAP queries > and distributed transactions -- while we are focused on time-series data > analysis. As such, we haven't benchmarked against Citus and if we were to, > it would require some unrealistic finagling in order to make it an > apples-to-apples comparison. > > As an example, their partitioning is based on a single primary key, while > all writes in their clustered version go to a cluster master (although their > hosted version of Citus MX changes this a bit). We perform two-dimensional > partitioning using your primary key and time, with the partitions > automatically created/closed on time intervals and data volume in order to > keep the most recent partitioned chunks of data (and their B-trees) in > memory to support fast ingest for time-series data. (***) We have only > released our single-node version, but the clustering will allow you to write > directly to any node. Citus has also decided only to support the SQL > queries they are optimized for, while we've made the decision to support > "full SQL" (even if some queries might not be fast). > > *** We blogged yesterday about TimescaleDB's partitioning design choices in > more depth, if you are interested: > https://blog.timescale.com/time-series-data-why-and-how-to-use-a-relational-database-instead-of-nosql-d0cd6975e87c > > > On Thu, Apr 20, 2017 at 5:43 PM, Samuel Williams > <space.ship.traveller@xxxxxxxxx> wrote: >> >> Andrew, how would timescaledb compare to citus - and is timescaledb an >> extension to postgres or is it an entirely separate system? >> >> On 21 April 2017 at 02:44, Andrew Staller <andrew@xxxxxxxxxxxxx> wrote: >> > Awesome thread. >> > >> > Samuel, >> > >> > Just wanted you to be aware of the work we're doing at TimescaleDB >> > (http://www.timescale.com/), a time-series database extension for >> > PostgreSQL. >> > >> > Some of how we might help you: >> > - automatic partitioning by space (primary key - like country_id, for >> > instance) and time. This creates "chunks" of your data, right-sized by >> > volume and time constraints (which makes inserts fast at much greater >> > scale >> > than Vanilla PostgreSQL - not sure if this is a problem for you) >> > - this will also help if your queries are selective on time and >> > country_id >> > (per this example) >> > - the partitioning by time allows you to DROP old chunks without the >> > need >> > for vacuums >> > >> > On Thu, Apr 20, 2017 at 8:30 AM, Vick Khera <vivek@xxxxxxxxx> wrote: >> >> >> >> I'm curious why you have so many partial indexes. Are you trying to >> >> make >> >> custom indexes per query? It seems to me you might want to consider >> >> making >> >> the indexes general, and remove the redundant ones (that have the same >> >> prefix list of indexed fields). >> >> >> >> Secondly your table is 102Gb. Clearly there's a lot of data here. How >> >> many >> >> rows does that take? I would further suggest that you partition this >> >> table >> >> such that there are no more than about 10 million rows per partition >> >> (I've >> >> done this by using a id % 100 computation). Maybe in your case it makes >> >> sense to partition it based on the "what" field, because it appears you >> >> are >> >> trying to do that with your partial indexes already. >> >> >> >> On Wed, Apr 19, 2017 at 10:11 PM, Samuel Williams >> >> <space.ship.traveller@xxxxxxxxx> wrote: >> >>> >> >>> Okay, so after changing longitude/latitude to float4, and >> >>> re-organizing the table a bit, I got the query down from about 8 >> >>> minutes to 40 seconds. >> >>> >> >>> The details are in the gist comments: >> >>> https://gist.github.com/ioquatix/bddda36d9e4ffaceb7a62d7b62259121 >> >>> >> >>> Now, just need to get performance another 2 orders of magnitude >> >>> better. Can we make the index a bit more coarse grained, perhaps >> >>> convert long/lat to integers or something, use a hilbert index, use a >> >>> postgis index... ideas? >> >>> >> >>> >> >>> -- >> >>> Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) >> >>> To make changes to your subscription: >> >>> http://www.postgresql.org/mailpref/pgsql-general >> >> >> >> >> > >> > >> > >> > -- >> > TimescaleDB | Growth & Developer Evangelism >> > c: 908.581.9509 >> > >> > 335 Madison Ave. >> > New York, NY 10017 >> > www.timescale.com >> > https://github.com/timescale/timescaledb >> >> >> -- >> Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) >> To make changes to your subscription: >> http://www.postgresql.org/mailpref/pgsql-general > > > > > -- > TimescaleDB | Growth & Developer Evangelism > c: 908.581.9509 > > 335 Madison Ave. > New York, NY 10017 > www.timescale.com > https://github.com/timescale/timescaledb -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general