Re: speeding up grafana sensor-data query on raspberry pi 3

Clemens Eisserer <linuxhippy@xxxxxxxxx> · Thu, 20 Apr 2023 18:39:35 +0200

Hi,

I was too lazy to switch to a ARMv8 rootfs, instead I used an ARMv7
postgresql 15.2 docker image running via podman to try the
suggestions.
The performance improvements offered by the new postgresql features is
really impressive!

5318.382 ms:  original query
2372.618 ms:  with date_bin
2154.530 ms:  date_bin + work_mem=64mb (quicksort instead of disk-based sorting)
0826.196 ms:  date_bin + work-mem + create-statistics
0589.445 ms: date_bin + work-mem + create-statistics + max_workers=2
(instead of 1)

So evaluating the complex/old expression indeed was really slow, using
date_bin already reduced query time to 50%.
Hash based aggregation further more than halfed the remaining 2,1s -
down to ~825ms!
826ms for one month, and ~8s for a  whole year is actually great - as
far as I can remember even influxdb, which is actually optimized for
this kind of data, didn't perform nearly as well as postgresql -
awesome!

Thanks again for all the suggestions and for such a great dbms!

- Clemens

Am Di., 18. Apr. 2023 um 14:14 Uhr schrieb Clemens Eisserer
<linuxhippy@xxxxxxxxx>:
>
> Hi again,
>
> Thanks for the suggestions.
>
> - I  increased work_mem to 64M, which caused disk-based sorting to be
> replaced with quicksort and resulted in a modest speedup. However I
> have to admit I didn't understand why more work_mem speeds up the heap
> scan.
> - the suggestion regarding "create statistics on group by" is awesome,
> to get rid of sorting is probably the best that could happen to the
> query.
> - ARMv8 instead of ARMv6 could have a positive impact
>
> I'll mirate to a containerized postgresql-version running on raspbian
> os 64-bit as I find some time to spare and report back.
>
> Thanks again, Clemens
>
> Am So., 16. Apr. 2023 um 22:50 Uhr schrieb David Rowley <dgrowleyml@xxxxxxxxx>:
> >
> > On Mon, 17 Apr 2023 at 05:00, Clemens Eisserer <linuxhippy@xxxxxxxxx> wrote:
> > > Example:
> > > table smartmeter with non-null column ts (timestamp with time zone)
> > > and brinc index on ts, no pk to avoid a btree index.
> > > Sensor values are stored every 5s, so for 1 month there are about 370k
> > > rows - and in total the table currently holds about 3M rows.
> > > The query to display the values for 1 month takes ~3s, with the bitmap
> > > heap scan as well as aggregation taking up most of the time, with
> > > sorting in between.
> >
> > I know you likely don't have much RAM to spare here, but more work_mem
> > might help, even just 16MBs might be enough. This would help the Sort
> > and to a lesser extent the Bitmap Heap Scan too.
> >
> > Also, if you'd opted to use PostgreSQL 14 or 15, then you could have
> > performed CREATE STATISTICS on your GROUP BY clause expression and
> > then run ANALYZE.  That might cause the planner to flip to a Hash
> > Aggregate which would eliminate the Sort before aggregation.  You'd
> > only need to sort 236 rows after the Hash Aggregate for the ORDER BY.
> >
> > Plus, what Justin said.
> >
> > David