> In PostgreSQL SELECT count(*) must scan the table to see which rows
> are visible to the executing database transaction. Without that, it
> can't give a completely accurate count from a transactional
> perspective. If you can settle for a non-transactional
> approximation, select the reltuples value from the pg_class row for
> the table.
I agree with you somewhat. I would assume that "select count(*)" is special in the sense that it is table schema independent. I would actually hope that this is an order 1 operation since the total table length should be stored somewhere as it's a reasonably useful source of information. Because it is schema independent, even an alter table shouldn't block it as why should it? The transaction comes in when you are adding more data to the end of the table so the select count(*) needs a transaction to guarantee a finish only. This should not block or be blocked on anything. The where clause, a group by or a distinct clause etc. should block on an Alter table. Is this just an edge case which is not worth looking at?
Thank you so much for your help.
~Ben
On Fri, Jun 15, 2012 at 3:32 PM, Kevin Grittner <Kevin.Grittner@xxxxxxxxxxxx> wrote:
Benedict Holland <benedict.m.holland@xxxxxxxxx> wrote:I was trying to suggest techniques which would prevent that bloat
> Yes. I needed to do a full vacuum. Again, the database is very
> large. I batch inserted quite a lot of data and then modified that
> data. The vacuum isn't blocking anything. It was blocking other
> tables (as expected) but continues to run and clean. My tables in
> general are around 10GB, each update seems to nearly double the
> size of it so I required a full vacuum.
and make the VACUUM FULL unnecessary. But, now that I've had a
chance to format the attachment into a readable format, I agree that
it isn't part of the problem. The iceberg in this case is the ALTER
TABLE, which is competing with two other queries.
OK.
> The blocked statements are the select count(*) and the alter
> table.
Not directly. The lock held by the UPDATE would *not* block the
> Both are blocked on the update table command.
SELECT; but it *does* block the ALTER TABLE command, which can't
share the table while it changes the structure of the table. The
SELECT is blocked behind the ALTER TABLE.
I'm glad we're on the same page there.
> The alter table command SHOULD be blocked and that is fine.
In PostgreSQL SELECT count(*) must scan the table to see which rows
> The select count(*) should never be blocked as that is the whole
> point of running an MVCC operation at least to my understanding. I
> can even accept the use case that the select should block with an
> Alter Table operation if data is retrieved from the table, but a
> select count(*) only returns the number of rows and should be
> table space independent.
are visible to the executing database transaction. Without that, it
can't give a completely accurate count from a transactional
perspective. If you can settle for a non-transactional
approximation, select the reltuples value from the pg_class row for
the table.
So that the table isn't dropped or truncated while the count is
> I also don't understand why a select count(*) requires an
> AccessShareLock. I don't understand why a select should lock
> anything at all.
scanning the table.
-Kevin