Thanks Mike,
That approach seems to work very well!!!
There are something on the order of 40 billion readings in the table. This returns an arbitrary one minutes data, as required, in under 30ms:
SELECT distinct date_trunc('minute',_timer) as timer,
key::int,
first_value(value)
OVER (partition by key order by _timer desc)
FROM (SELECT timer as _timer,
(each(v.values)).*
FROM v_sec v
WHERE timer between '2020-04-15 08:10:00'
and '2020-04-15 08:10:59' ) as x
order by key::int;
key::int,
first_value(value)
OVER (partition by key order by _timer desc)
FROM (SELECT timer as _timer,
(each(v.values)).*
FROM v_sec v
WHERE timer between '2020-04-15 08:10:00'
and '2020-04-15 08:10:59' ) as x
order by key::int;
v_sec is a view on a table with 30+ years of 1 second readings from dozens of sensors, columns are timer (timestamp) & values (hstore).
I used "between timestamps" rather than "= date_trunc()" in the where clause as this uses the index on timestamps, so is MUCH faster.
FYI, we are using Timescaledb to manage the underlying data and data structures.
Appreciated!!!
Brent Wood
On Friday, April 16, 2021, 12:28:41 AM GMT+12, Mike Rylander <mrylander@xxxxxxxxx> wrote:
On Wed, Apr 14, 2021 at 5:33 PM Brent Wood <pcreso@xxxxxxxxx> wrote:
>
> Thanks for your reply,
>
> The table is essentially:
> create table readings (timer timestamp primary key,
> readings hstore);
>
> the hstore comprises (<sensor_id> <reading>) key/value pairs for readings taken at the time specified in the timestamp.
>
> eg: "67" "-45.67436", "68" "176.5424" could be key/value pairs representing latitude & longitude, with a timestamp in the timer column.
>
> There would be several lat/lon hstore pairs in a given minute, the query I want would return the last one in the timeseries for that minute (for each key).
>
> I don't think your examples will give me the separate hstore key-value pairs, extracted as the last in the interval for each key & reassembled as an hstore list in the result. The sensor id is the hstore key, as described above, not a separate attribute. That said, the keys can be extracted from the hstore much like a column, but I'm not sure that is the best approach.
>
> Treating each set of hstores in an interval as an array & extracting the last elements may be viable. But I['m not sure how...
>
Hi Brent,
With the table structure like that, you'll need to project all the
sensor data onto each timestamp and then collapse it. Something like
this:
SELECT date_trunc('minute',timer), key, first(value) over (partition
by key order by timer desc) FROM (SELECT timer, (each(readings)).*
FROM table WHERE date_trunc('minute',timer) = '2021-04-15
08:10:00-04')x;
If your queries will always target a specific timestamp truncated to
the minute, you'll find an index on date_trunc('minute',timer) will be
useful, I think. If your intervals are more complex than date_trunc()
can handle then you'll have to get more creative, but that's the basic
structure -- normalize the primary key, project the sensor data to
make it table-ish, then use regular SQL to pull out what you want.
--
Mike Rylander
| Research and Development Manager
| Equinox Open Library Initiative
| phone: 1-877-OPEN-ILS (673-6457)
| email: miker@xxxxxxxxxxxxxx
| web: https://equinoxOLI.org
>
>
> On Thursday, April 15, 2021, 6:33:08 AM GMT+12, Michael Lewis <mlewis@xxxxxxxxxxx> wrote:
>
>
>
>
>
> If you share example schema and desired output (like a dummy table or even pseudo code SQL), then I'm sure many people could help you. Right now, the description of your desired result seems a bit unclear, at least to me.
>
>
> If you wanted to run this hourly for the last 1 hour, it sounds a bit like want this-
>
> select sensor_id, date_trunc( 'minute', timestamptz_field_name_here ), last( value_from_hstore ) over ( partition by sensor_id, date_trunc( 'minute', timestamptz_field_name_here ) ) as last_value_recorded
> from data_table_here
> where timestamptz_field_name_here BETWEEN NOW() - interval '1 hour' and NOW()
> group by sensor_id, date_trunc( 'minute', timestamptz_field_name_here )
>
>
> You could also use the left join "where is null" pattern to check that a prior record in the minute period of time does not exist for that same key. Something like this-
>
> select d1.sensor_id, date_trunc( 'minute', d1.timestamptz_field_name_here ), d1.timestamptz_field_name_here as last_value_recorded
> from data_table_here as d1
> left join data_table_here as prior_d1 ON prior_d1.sensor_id = d1.sensor_id AND prior_d1.timestamptz_field_name_here < d1.timestamptz_field_name_here and prior_d1.timestamptz_field_name_here >= date_trunc( 'minute', d1.timestamptz_field_name_here )
> where d1.timestamptz_field_name_here BETWEEN NOW() - interval '1 hour' and NOW()
>
>
> >
>
>
>
> Thanks for your reply,
>
> The table is essentially:
> create table readings (timer timestamp primary key,
> readings hstore);
>
> the hstore comprises (<sensor_id> <reading>) key/value pairs for readings taken at the time specified in the timestamp.
>
> eg: "67" "-45.67436", "68" "176.5424" could be key/value pairs representing latitude & longitude, with a timestamp in the timer column.
>
> There would be several lat/lon hstore pairs in a given minute, the query I want would return the last one in the timeseries for that minute (for each key).
>
> I don't think your examples will give me the separate hstore key-value pairs, extracted as the last in the interval for each key & reassembled as an hstore list in the result. The sensor id is the hstore key, as described above, not a separate attribute. That said, the keys can be extracted from the hstore much like a column, but I'm not sure that is the best approach.
>
> Treating each set of hstores in an interval as an array & extracting the last elements may be viable. But I['m not sure how...
>
Hi Brent,
With the table structure like that, you'll need to project all the
sensor data onto each timestamp and then collapse it. Something like
this:
SELECT date_trunc('minute',timer), key, first(value) over (partition
by key order by timer desc) FROM (SELECT timer, (each(readings)).*
FROM table WHERE date_trunc('minute',timer) = '2021-04-15
08:10:00-04')x;
If your queries will always target a specific timestamp truncated to
the minute, you'll find an index on date_trunc('minute',timer) will be
useful, I think. If your intervals are more complex than date_trunc()
can handle then you'll have to get more creative, but that's the basic
structure -- normalize the primary key, project the sensor data to
make it table-ish, then use regular SQL to pull out what you want.
--
Mike Rylander
| Research and Development Manager
| Equinox Open Library Initiative
| phone: 1-877-OPEN-ILS (673-6457)
| email: miker@xxxxxxxxxxxxxx
| web: https://equinoxOLI.org
>
>
> On Thursday, April 15, 2021, 6:33:08 AM GMT+12, Michael Lewis <mlewis@xxxxxxxxxxx> wrote:
>
>
>
>
>
> If you share example schema and desired output (like a dummy table or even pseudo code SQL), then I'm sure many people could help you. Right now, the description of your desired result seems a bit unclear, at least to me.
>
>
> If you wanted to run this hourly for the last 1 hour, it sounds a bit like want this-
>
> select sensor_id, date_trunc( 'minute', timestamptz_field_name_here ), last( value_from_hstore ) over ( partition by sensor_id, date_trunc( 'minute', timestamptz_field_name_here ) ) as last_value_recorded
> from data_table_here
> where timestamptz_field_name_here BETWEEN NOW() - interval '1 hour' and NOW()
> group by sensor_id, date_trunc( 'minute', timestamptz_field_name_here )
>
>
> You could also use the left join "where is null" pattern to check that a prior record in the minute period of time does not exist for that same key. Something like this-
>
> select d1.sensor_id, date_trunc( 'minute', d1.timestamptz_field_name_here ), d1.timestamptz_field_name_here as last_value_recorded
> from data_table_here as d1
> left join data_table_here as prior_d1 ON prior_d1.sensor_id = d1.sensor_id AND prior_d1.timestamptz_field_name_here < d1.timestamptz_field_name_here and prior_d1.timestamptz_field_name_here >= date_trunc( 'minute', d1.timestamptz_field_name_here )
> where d1.timestamptz_field_name_here BETWEEN NOW() - interval '1 hour' and NOW()
>
>
> >
>
>