Re: statistic target and sample rate

Tom Lane <tgl@xxxxxxxxxxxxx> · Wed, 14 Jul 2021 10:30:29 -0400

Luca Ferrari <fluca1978@xxxxxxxxx> writes:
> Therefore my question is about how the statistics collector decides
> about the number of tuples to be sampled.

It's basically 300 times the largest statistics target among the
columns being analyzed:

https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/commands/analyze.c;h=0c9591415e4b97dd5c5e693af1860294284a1575;hb=HEAD#l1919

Per that comment, there is good math backing this choice for the task
of making a histogram.  It's a little shakier for other sorts of
statistics --- notably, for n_distinct estimation, the error can still
be really bad.
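
As a rough sketch of that arithmetic (Python, purely illustrative; the
helper name analyze_sample_rows is made up here and this is not the code
in analyze.c), assuming the default statistics target of 100 unless a
column has been raised:

```python
# Illustrative only: a hypothetical helper mirroring the "300 times the
# largest statistics target" rule of thumb, not PostgreSQL's actual code.
ROWS_PER_TARGET = 300  # multiplier quoted in the message above


def analyze_sample_rows(column_targets):
    """Return the approximate number of rows to sample, given the
    per-column statistics targets of the columns being analyzed."""
    if not column_targets:
        raise ValueError("need at least one column statistics target")
    # The sample is sized for the most demanding column.
    return ROWS_PER_TARGET * max(column_targets)


# Every column at the assumed default target of 100.
print(analyze_sample_rows([100, 100, 100]))  # 30000

# One column raised to 1000 drives the sample size for the whole table.
print(analyze_sample_rows([100, 1000]))  # 300000
```

Note that the result depends only on the statistics targets, so raising
the target on a single column enlarges the sample for the whole table.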

			regards, tom lane