Re: Performance of count(*)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On Mar 22, 2007, at 11:26 AM, Guido Neitzer wrote:

On 22.03.2007, at 11:53, Steve Atkins wrote:

As long as you're ordering by some row in the table then you can do that in
straight SQL.

select a, b, ts from foo where (stuff) and foo > X order by foo limit 10

Then, record the last value of foo you read, and plug it in as X the next
time around.

This does only work if you have unique values in foo. You might have "batch breaks" inside a list of rows with equal values for foo.

If I don't have unique values in foo, I certainly have unique values in (foo, pk).


But: a web application that needs state and doesn't maintain it by itself (or inside the dev toolkit) is imho broken by design. How should the database store a "state" for a web app? It's only possible on the web app part, because the app is either stateless and so are the queries to the database - they have to be re- evaluated for every request as the request might come from totally different sources (users, ...) or it is stateful and has to maintain the state because only the app developer knows, what information is needed for the "current state".

This is why all web application toolkits have a "session" concept.

Yes. HTTP is not very stateful. Web applications are stateful. There are some really obvious approaches to maintaining state cleanly that work well with databases and let you do some quite complex stuff (tying a persistent database connection to a single user, for instance). But they don't scale at all well.

What Craig was suggesting is, basically, to assign a persistent database connection to each user. But rather than maintain that connection as a running process, to serialise all the state out of the database connection and store that in the webapp, then when the next action from that user comes in take a database connection and stuff all that state into it again.

It's a lovely idea, but strikes me as completely infeasible in the general case. There's just too much state there. Doing it in the specific case is certainly possible, but rapidly devolves to the standard approach of "On the first page of results, run the query and record the first 5000 results. Store those in a scratch table, indexed by session-id, or in external storage. On displaying later pages of results to the same user, pull directly from the already calculated results."


I think the problem is more that most web developers aren't very good
at using the database, and tend to fall back on simplistic, wrong, approaches
to displaying the data. There's a lot of monkey-see, monkey-do in web
UI design too, which doesn't help.

Sure. That is the other problem ... ;-) But, and I think this is much more important: most toolkits today free you from using the database directly and writing lots and lots of lines of sql code which instantly breaks when you switch the storage backend. It's just the thing from where you look at something.

The real problem is the user-interface is designed around what is easy to implement in elderly cgi scripts, rather than what's appropriate to the data being displayed or useful to the user. Displaying tables of results, ten at a time, is just one of the more egregious examples of that.

Cheers,
  Steve



[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux