Is partitioning the table by month an option? If your reports are month-based, partitioning can improve performance.
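A minimal sketch of what that could look like, assuming the table and column names from the query further down in this thread (the types of the value columns are a guess, and existing data would still need to be copied over):

-- Partitioned variant of the table; one partition per month.
CREATE TABLE smartmeter_part (
    ts       timestamptz NOT NULL,
    stromL1  double precision,
    stromL2  double precision,
    stromL3  double precision
) PARTITION BY RANGE (ts);

-- Queries constrained on ts then only touch the matching month(s)
-- thanks to partition pruning.
CREATE TABLE smartmeter_2023_03 PARTITION OF smartmeter_part
    FOR VALUES FROM ('2023-03-01') TO ('2023-04-01');
CREATE TABLE smartmeter_2023_04 PARTITION OF smartmeter_part
    FOR VALUES FROM ('2023-04-01') TO ('2023-05-01');

-- Keep the BRIN index; creating it on the partitioned table
-- creates it on every partition.
CREATE INDEX ON smartmeter_part USING brin (ts);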
Felipph
On Sun, Apr 16, 2023 at 19:10, Andres Freund <andres@xxxxxxxxxxx> wrote:
Hi,
On 2023-04-16 19:00:33 +0200, Clemens Eisserer wrote:
> I am currently trying to migrate an influxdb 1.7 smarthome database to
> postgresql (13.9) running on my raspberry 3.
> It works quite well, but for the queries executed by grafana I get
> somewhat higher execution times than I'd hoped for.
>
> Example:
> table smartmeter with a non-null column ts (timestamp with time zone)
> and a BRIN index on ts; no PK, to avoid a btree index.
> Sensor values are stored every 5s, so for 1 month there are about 370k
> rows - and in total the table currently holds about 3M rows.
> The query to display the values for 1 month takes ~3s; the bitmap
> heap scan and the aggregation take up most of the time, with the
> sort in between.
>
> Is there anything that could be improved?
> With influxdb I was able to view 3- and 6-month graphs; with
> postgresql it simply takes too long.
>
> I am currently running the 32-bit ARMv6 build; would it be a big
> improvement to run ARMv8/64-bit?
Yes, I suspect so. On a 64-bit system most of the datatypes you're dealing
with are pass-by-value, i.e. they don't incur memory allocation overhead,
whereas on a 32-bit system timestamps, doubles, etc. all require allocations.
> smarthomedb=> explain analyze SELECT floor(extract(epoch from ts)/10800)*10800 AS "time",
>     AVG(stromL1) as l1, AVG(stromL2) as l2, AVG(stroml3) as l3
>     FROM smartmeter
>     WHERE ts BETWEEN '2023-03-16T09:51:28.397Z' AND '2023-04-16T08:51:28.397Z'
>     GROUP BY time order by time;
>
>                                                 QUERY PLAN
> --------------------------------------------------------------------------------------------------------------
>  GroupAggregate  (cost=117490.70..132536.10 rows=376135 width=32) (actual time=2061.253..2974.336 rows=236 loops=1)
>    Group Key: ((floor((date_part('epoch'::text, ts) / '10800'::double precision)) * '10800'::double precision))
>    ->  Sort  (cost=117490.70..118431.04 rows=376135 width=20) (actual time=2058.407..2410.467 rows=371810 loops=1)
>          Sort Key: ((floor((date_part('epoch'::text, ts) / '10800'::double precision)) * '10800'::double precision))
Given the number of rows you're sorting on a somewhat slow platform, the
complexity of the expression here might be a relevant factor. Particularly on
a 32bit system (see above), due to the memory allocations we'll end up doing.
I don't know how much control over the query generation you have. Consider
rewriting
floor(extract(epoch from ts)/10800)*10800 AS "time"
to something like
date_bin('3h', ts, '2001-01-01 00:00')
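For reference, the whole query rewritten that way might look roughly like this (untested; note that date_bin() is only available from PostgreSQL 14 on, and it returns a timestamptz rather than epoch seconds, so the Grafana side may need a small adjustment):

SELECT date_bin('3h', ts, '2001-01-01 00:00') AS "time",
       AVG(stromL1) AS l1, AVG(stromL2) AS l2, AVG(stromL3) AS l3
FROM smartmeter
WHERE ts BETWEEN '2023-03-16T09:51:28.397Z' AND '2023-04-16T08:51:28.397Z'
GROUP BY 1
ORDER BY 1;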
>          Sort Method: external merge  Disk: 10960kB
>          ->  Bitmap Heap Scan on smartmeter  (cost=112.09..74944.93 rows=376135 width=20) (actual time=88.336..1377.862 rows=371810 loops=1)
Given the time spent in the bitmap heap scan, it might be beneficial to
increase effective_io_concurrency some.
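If you want to experiment with that, something along these lines should work; the value 32 is only a guess and needs testing on your storage:

-- Try it for the current session first:
SET effective_io_concurrency = 32;

-- If it helps, make it permanent:
ALTER SYSTEM SET effective_io_concurrency = 32;
SELECT pg_reload_conf();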
>                Recheck Cond: ((ts >= '2023-03-16 10:51:28.397+01'::timestamp with time zone) AND (ts <= '2023-04-16 10:51:28.397+02'::timestamp with time zone))
>                Rows Removed by Index Recheck: 2131
>                Heap Blocks: lossy=4742
The lossiness might also incur some overhead, so increasing work_mem a bit
will help some.
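For example, just for the session running the query (the value is a guess; with the sort currently spilling ~11MB to disk, something in this range may also avoid the external merge):

-- Larger work_mem allows an exact (non-lossy) bitmap and an in-memory sort.
SET work_mem = '32MB';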
Greetings,
Andres Freund