On 07/12/2022 2:13 PM Pierson Patricia L (Contractor) <patricia.l.pierson@xxxxxxx> wrote:
Hello,
Do a count on the primary key. That will force index access, and you don’t access the entire row, which may be very long.
Like: select count(ID) from my_table;
From: Mladen Gogala <gogala.mladen@xxxxxxxxx>
Sent: Tuesday, July 12, 2022 11:58 AM
To: MichaelDBA Vitale <michaeldba@xxxxxxxxxxx>
Cc: pgsql-admin@xxxxxxxxxxxxxxxxxxxx
Subject: [EXT] Re: Improve "select count(*)" query - takes more than 30 mins for some large tables
What's wrong with parallelism? That's why it was invented. If you really need an accurate count at a moment's notice, create a trigger to maintain it.
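A minimal sketch of such a trigger, assuming the table is called my_table as elsewhere in this thread. The side table my_table_rowcount, the function my_table_count_trig and the trigger name are hypothetical names made up for illustration. EXECUTE FUNCTION needs PostgreSQL 11 or later; older versions spell it EXECUTE PROCEDURE.

```sql
BEGIN;

-- Block concurrent writes so the seeded count cannot drift
-- between the initial count(*) and the trigger going live.
LOCK TABLE my_table IN SHARE ROW EXCLUSIVE MODE;

-- Hypothetical one-row side table holding the running count.
CREATE TABLE my_table_rowcount (n bigint NOT NULL);
INSERT INTO my_table_rowcount SELECT count(*) FROM my_table;

-- Hypothetical trigger function: adjust the counter per row.
CREATE FUNCTION my_table_count_trig() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        UPDATE my_table_rowcount SET n = n + 1;
    ELSIF TG_OP = 'DELETE' THEN
        UPDATE my_table_rowcount SET n = n - 1;
    END IF;
    RETURN NULL;  -- the return value is ignored for AFTER row triggers
END;
$$;

CREATE TRIGGER my_table_count
AFTER INSERT OR DELETE ON my_table
FOR EACH ROW EXECUTE FUNCTION my_table_count_trig();

COMMIT;

-- Reading the count is then a single-row lookup.
SELECT n FROM my_table_rowcount;
```

Two caveats with this sketch: TRUNCATE does not fire row-level triggers, so it would need separate handling, and every writer updates the same counter row, so concurrent inserts and deletes serialize on it.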
Regards
On Tue, Jul 12, 2022, 10:31 AM MichaelDBA Vitale <michaeldba@xxxxxxxxxxx> wrote:
Perhaps do an analyze on the table and then select reltuples from pg_class for that table. Might be faster than the select count(*).
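Something along these lines, assuming the table is my_table as in the rest of the thread (the column alias estimated_rows is just for illustration):

```sql
-- Refresh the planner statistics for the table.
ANALYZE my_table;

-- reltuples is the planner's row estimate, not an exact count.
SELECT reltuples::bigint AS estimated_rows
FROM pg_class
WHERE oid = 'my_table'::regclass;
```

Keep in mind the result is an estimate, so it only fits if an approximate number is acceptable.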
Regards,
Michael Vitale
On 07/12/2022 8:51 AM Mladen Gogala <gogala.mladen@xxxxxxxxx> wrote:
On 7/11/22 03:23, Florents Tselai wrote:
psql "select id from my_table" | sort -u | wc -l

That will be a lot slower than just "select count(*) from my_table". You are delivering the data to the user program (psql), then shipping it through a pipe, and then processing the output with "wc". Depending on the version, PostgreSQL has very reliable parallelism and can do counting rather quickly. The speed of "select count(*) from my_table" depends on the speed of I/O. Since the table is big, it cannot be cached in the file system cache, so all that you have at your disposal is the raw disk speed. For smaller machines, NVMe is king. For larger rigs, you should consider something like Pure, XtremIO or NetApp SolidFire. People frequently expect a database to do miracles with subpar hardware.
--
Mladen Gogala
Database Consultant
Tel: (347) 321-1217
https://dbwhisperer.wordpress.com
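To see whether the count is actually running in parallel, something like the following can be tried in a psql session. The value 4 is an arbitrary example, not a recommendation, and the plan you get depends on the version and the rest of the configuration.

```sql
-- Current per-query limit on parallel workers.
SHOW max_parallel_workers_per_gather;

-- Session-level change only; 4 is an arbitrary example value.
SET max_parallel_workers_per_gather = 4;

-- A parallel plan shows a Gather node above a Parallel Seq Scan
-- (or a parallel index scan) with a Partial Aggregate below it.
EXPLAIN SELECT count(*) FROM my_table;
```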
That is not true: doing the select on the primary key will still result in a table scan, not an index scan. The heap always gets accessed for select counts.
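Rather than arguing it out, the plan can be checked on the server in question. A sketch, assuming the primary key column is named id; note that EXPLAIN ANALYZE really executes the query, so it takes as long as the count itself:

```sql
-- Shows the plan chosen for each form and what was actually read.
EXPLAIN (ANALYZE, BUFFERS) SELECT count(id) FROM my_table;
EXPLAIN (ANALYZE, BUFFERS) SELECT count(*) FROM my_table;
```

The top scan node in the output tells you whether the table was read with a Seq Scan or through an index. If an Index Only Scan shows up, its "Heap Fetches" line reports how many rows still required a visit to the heap.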
Regards,
Michael Vitale