Re: MySQL HandlerSocket - Is this possible in PG?

"Pierre C" <lists@xxxxxxxxxx> · Wed, 22 Dec 2010 22:50:16 +0100

On Wed, 22 Dec 2010 14:17:21 +0100, Michael Ben-Nes <michael@xxxxxxxxxxx>
wrote:

Thanks, it is most interesting

--------------------------------------------------
Michael Ben-Nes - Internet Consultant and Director.
http://www.epoch.co.il - weaving the Net.
Cellular: 054-4848113
--------------------------------------------------

In fact, it would be possible to implement something like MySQL
HandlerSocket, using the following Very Ugly Hack :

This would only work for ultra simple "SELECT 1 row WHERE primary key =
constant" queries.

- a pooler (separate process) waits for connections
- clients connect to the pooler and send queries
- pooler accumulates enough queries to justify the overhead of what's
going to come next
- pooler takes a bunch of queries and encodes them in some custom ad-hoc
format (not SQL)
- pooler says to postgres "SELECT do_my_queries( serialized data )"
- do_my_queries() is a user function (in C) which uses postgres access
methods directly (like index access method on primary key), processes
queries, and sends results back as binary data
- repeat for next batch

Nested Loop Index Scan processes about 400.000 rows/s which is 2.5
us/query, maybe you could get into that ballpark (per core).

Of course it's a rather extremely ugly hack.

-------------------

Note that you could very possibly have almost the same benefits with
"almost" none of the ugliness by doing the following :

same as above :
- a pooler (separate process) waits for connections
- clients connect to the pooler and send queries in the format query +
parameters (which libpq uses if you ask)
- pooler accumulates enough queries to justify the overhead of what's
going to come next

different :
- pooler looks at each query, and if it has not seen it yet on this
particular pg connection, issues a "PREPARE" on the query
- pooler sends, in one TCP block, a begin, then a bunch of "execute named
prepared statement with parameters" commands, then a rollback
- postgres executes all of those and returns all replies in one TCP block
(this would need a modification)
- pooler distributes results back to clients

This would need a very minor change to postgres (coalescing output
blocks). It would make the pooler pay TCP overhead instead of postgres,
and greatly improve cache locality in postgres.

Since error handling would be "problematic" (to say the least...) and
expensive it would only work on simple selects.

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance