Tino Wildenhain wrote:
Hi,
Tim Tassonis wrote:
Hi all
I assume this is not an uncommon problem, but so far, I haven't been
able to find a good answer to it.
I've got a table that holds log entries and fills up very fast during
the day, it gets approx. 25 million rows per day. I'm now building a
web application using apache/mod_php where you can query the database
and then should be able to page through the results.
you should be aware that PHP isnt the only scripting language with an
apache module and not neccessary the best choice among them.
There's no need to become insulting. I am aware of the truly astonishing
fact that there are other scripting languages apart from php and that
not everybody loves php.
Apart from the sad fact that I quite like php, the problem is not the
choice of scripting language, but the nature of apache mpm processing,
making the postgres connection stuck to an apache process.
My idea was that whenever a user constructs a query, I create a
temporary table holding the results and then page through this table,
which should work very well in principle.
That means you are more or less constructing materialized views :-)
No, I want the data to remain fixed after the query is executed.
But if you hold the session anyway, then see below.
I don't hold the session, see above.
But from what I've been able to find out, temporary tables live only
in the Postgres Session they have been created in and are destroyed
upon session descructuion.
Now, with apache/php in a mpm environment, I have no guarantee that a
user will get the same postgresql session for a subsequent request,
thus he will not see the temporary table.
Thats the problem and if you have failover/loadbalancing situations,
even more so.
Is there a way to create temporary tables in another way, so they are
visible between sessions, or do I need to create real tables for my
purpose? And is the perfomance penalty big for real tables, as they
have been written to disk/read from disk?
To start with, you should avoid reconnecting to the database for every
request. Not only because of loosing the session context but also
to avoid connection overhead.
I don't reconnect after every request, but I'm not guaranteed by mpm
that I get the same session/process. I might, but that's hardly what I'd
call a stable application, even as a php programmer.
Usually this is done by connection pooling. You can then try to trac
user:connection relationship as much as possible thru the connection pool.
As far as I can see, there is no implementation of a multi client
process connection pool in mod_php.
I admit that my interprocess communication know-how is not very deep,
but that would mean the client postgres/tcpip connection part would have
to be held somewhere in shared memory between the different apache
processes. From reading the documentation, php does not do that.
If you have that, there is actually no need for the temp tables. Instead
you can just use a regular cursor and scroll it as neccessary.
My problem ist that I don't have that.
Almost all frameworks should give you reasonable pool implementations,
some additional memory caching on top of it and there are also a lot
of other methods to help you with that, for example pgpool and
pgbouncer.
I'm afraid you somehow missed the point, but thanks for your response.
Bye
Tim