Re: Slow count(*) again...

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 10/10/2010 6:29 PM, Neil Whelchel wrote:
On the other hand, I copied a table out of one of my production servers that
has about 60,000 rows with 6 columns (numeric, numeric, bool, bool, timestamp,
text). The first numeric column has numbers evenly spread between 0 and 100
and it is indexed. I put the table in a pair of database servers both running
on the same physical hardware. One server is Postgres, the other is a popular
server (I am not mentioning names here).

Please do. Your comment is pretty meaningless otherwise.

If you're talking about MySQL: Were you using InnoDB or MyISAM table storage? Of course it's fast with MyISAM, it relies on locks to do updates and has bugger all capability for write concurrency, or to permit readers while writing is going on.

If you're using InnoDB, then I'd like to know how they've managed that.

If you're talking about some *other* database, please name it and provide any useful details, because the hand waving is not helpful.

> I don't think that any amount of settings
tweaking will bring them even in the same ball park.

If you are, in fact, comparing MySQL+MyISAM and PostgreSQL, then you're quite right. Pg will never have such a fast count() as MyISAM does or the same insanely fast read performance, and MyISAM will never be as reliable, robust or concurrency-friendly as Pg is. Take your pick, you can't have both.

There has been discussion
about the other server returning an incorrect count because all of the indexed
rows may not be live at the time. This is not a problem for the intended use,
that is why I suggested another function like estimate(*). It's name suggests
that the result will be close, not 100% correct, which is plenty good enough
for generating a list of results pages in most cases.

Do you have any practical suggestions for generating such an estimate, though? I find it hard to think of any way the server can do that doesn't involve executing the query. The table stats are WAY too general and a bit hit-and-miss, and there isn't really any other way to do it.

If all you want is a way to retrieve both a subset of results AND a count of how many results would've been generated, it sounds like all you really need is a way to get the total number of results returned by a cursor query, which isn't a big engineering challenge. I expect that in current Pg versions a trivial PL/PgSQL function could be used to slurp and discard unwanted results, but a better in-server option to count the results from a cursor query would certainly be nice.

--
Craig Ringer

Tech-related writing at http://soapyfrogs.blogspot.com/

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux