Hi Tom,
Thanks for your reply.
Am I correct in understanding that any row that has been modified (i.e. UPDATEd) is in state HEAPTUPLE_INSERT_IN_PROGRESS, and so will not be included in the sample?
I'm going to rework the application so there is less time between the DELETE and the COMMIT; then I will only see the problem if ANALYZE runs during that smaller window. It looks like your patch will help if that happens.
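Roughly what I have in mind (just a sketch; the staging table is made up):

    -- current flow: ANALYZE can run at any point during the long load
    BEGIN;
    DELETE FROM common_student;
    -- ... long-running load of all the other data ...
    INSERT INTO common_student SELECT ...;
    COMMIT;

    -- reworked flow: do the slow loading first, into a staging table,
    -- so the DELETE-to-COMMIT window is short
    CREATE TEMP TABLE common_student_stage AS SELECT ...;  -- the slow part
    BEGIN;
    DELETE FROM common_student;
    INSERT INTO common_student SELECT * FROM common_student_stage;
    COMMIT;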
Then again, it also seems that no one has had a problem with ANALYZE's current behaviour (for 11 years!).
Thanks,
Mark
On Wed, 2 Jan 2019 at 16:11 Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
Mark <mwchambers@xxxxxxxxx> writes:
> I have a long-running job which deletes all of the common_student table and
> then repopulates it. It takes a long time to load all the other data and
> commit the transaction. I didn't think the delete inside the transaction
> would have any effect until it was committed or rolled back.
> I will have to rewrite the application so it updates the existing rows
> rather than deleting all and then inserting.
Hmm ... I'm not sure if that will actually make things better. The root
of the issue is what analyze.c does with DELETE_IN_PROGRESS tuples:
    * We count delete-in-progress rows as still live, using
    * the same reasoning given above; but we don't bother to
    * include them in the sample.
The "reasoning given above" is a reference to what happens for
INSERT_IN_PROGRESS tuples:
    * Insert-in-progress rows are not counted. We assume
    * that when the inserting transaction commits or aborts,
    * it will send a stats message to increment the proper
    * count. This works right only if that transaction ends
    * after we finish analyzing the table; if things happen
    * in the other order, its stats update will be
    * overwritten by ours. However, the error will be large
    * only if the other transaction runs long enough to
    * insert many tuples, so assuming it will finish after us
    * is the safer option.
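To make that ordering assumption concrete, here's a rough two-session timeline (the table name is just for illustration):

    -- Session A (inserter)           -- Session B (ANALYZE)
    BEGIN;
    INSERT INTO t SELECT ...;         -- many rows
                                      ANALYZE t;  -- A's rows not sampled or
                                      -- counted; ANALYZE then overwrites
                                      -- the stats counters
    COMMIT;                           -- A's commit-time stats message adds
                                      -- its rows back in, so the counts
                                      -- come out right

    -- In the other order (A commits before ANALYZE finishes), A's
    -- increment lands first and is then clobbered by ANALYZE's
    -- overwrite, so those rows go missing from the counts.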
Now the problem with this, from your perspective, is that *neither*
INSERT_IN_PROGRESS nor DELETE_IN_PROGRESS tuples are included in
ANALYZE's data sample. So a long-running update transaction will
cause all the rows it changes to be excluded from the sample until
commit. This seems like a bad thing, and it definitely means that
adjusting your app as you're contemplating won't help.
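For instance, while a bulk update is held open in one session (the column name here is invented for the example):

    -- Session 1: bulk update, transaction held open
    BEGIN;
    UPDATE common_student SET school_id = school_id;  -- rewrites every row

    -- Session 2: meanwhile (ANALYZE doesn't block on session 1's locks)
    ANALYZE common_student;
    SELECT n_distinct, most_common_vals
    FROM pg_stats
    WHERE tablename = 'common_student';
    -- each updated row's old version is DELETE_IN_PROGRESS and its new
    -- version INSERT_IN_PROGRESS, so the sample contains neither, and the
    -- computed stats behave as though the table were nearly empty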
In order to handle the bulk-update scenario sanely, it seems like we
ought to allow one of the two kinds of in-progress tuples,
INSERT_IN_PROGRESS or DELETE_IN_PROGRESS, to be included in the sample.
And it looks like, for consistency with the row-counting logic, the
kind that's included needs to be DELETE_IN_PROGRESS. This
is a slightly annoying conclusion, because it means we're accumulating
stats that we know are likely to soon be obsolete, but I think sampling
INSERT_IN_PROGRESS tuples instead would lead to very strange results
in some cases.
In short, I think we want to do this to the DELETE_IN_PROGRESS logic:
             if (TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetUpdateXid(targtuple.t_data)))
                 deadrows += 1;
             else
+            {
+                sample_it = true;
                 liverows += 1;
+            }
with suitable adjustment of the adjacent comment.
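Spelled out, the patched HEAPTUPLE_DELETE_IN_PROGRESS case in
acquire_sample_rows() might end up reading about like this (the comment
wording is only a first draft):

    case HEAPTUPLE_DELETE_IN_PROGRESS:

        /*
         * We count delete-in-progress rows as still live, using the
         * same reasoning given above; and since the deleting
         * transaction might yet abort, we now include them in the
         * sample as well.
         *
         * If the delete was done by our own transaction, however, we
         * must count the row as dead to make pgstat_report_analyze's
         * stats adjustments come out right.
         */
        if (TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetUpdateXid(targtuple.t_data)))
            deadrows += 1;
        else
        {
            sample_it = true;
            liverows += 1;
        }
        break;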
Thoughts?
regards, tom lane