On Tue, Feb 23, 2010 at 1:48 AM, Scott Marlowe <scott.marlowe@xxxxxxxxx> wrote:
> On Mon, Feb 22, 2010 at 10:51 PM, Yang Zhang <yanghatespam@xxxxxxxxx> wrote:
>> On Mon, Feb 22, 2010 at 3:45 PM, Scott Marlowe
>> <scott.marlowe@xxxxxxxxx> wrote:
>>>
>>> What do things like vmstat 10 say while the query is running on each
>>> db? First time, second time, things like that.
>>
>> Awesome -- this actually led me to discover the problem.
>>
>> vmstat showed no swapping-out for a while, and then suddenly it
>> started spilling a lot. Checking psql's memory stats showed that it
>> was huge -- apparently, it's trying to store its full result set in
>> memory. As soon as I added a LIMIT 10000, everything worked
>> beautifully and finished in 4m (I verified that the planner was still
>> issuing a Sort).
>>
>> I'm relieved that Postgresql itself does not, in fact, suck, but
>> slightly disappointed in the behavior of psql. I suppose it needs to
>> buffer everything in memory to properly format its tabular output,
>> among other possible reasons I could imagine.
>
> It's best when working with big sets to do so with a cursor and fetch
> a few thousand rows at a time. It's how we handle really big sets at
> work and it works like a charm in keeping the client from bogging down
> with a huge memory footprint.
>

Thing is, this is how I got here:

- ran complex query that does SELECT INTO.
- that never terminated, so killed it and tried a simpler SELECT (the
  subject of this thread) from psql to see how long that would take.

I.e., my original application doesn't receive the entire dataset.
--
Yang Zhang
http://www.mit.edu/~y_z/
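
The "cursor and fetch a few thousand rows at a time" advice quoted above can be sketched roughly as follows. This is only an illustration, not anyone's actual setup from this thread: it uses Python's standard DB-API `fetchmany` call, with an in-memory `sqlite3` database standing in for PostgreSQL so that it runs anywhere. The table `events`, the helper `fetch_in_batches`, and the batch size of 5000 are all made up for the example. Against PostgreSQL the same loop would presumably sit on top of a server-side cursor (SQL `DECLARE ... CURSOR` plus `FETCH`, or a driver feature such as a psycopg2 named cursor).

```python
import sqlite3


def fetch_in_batches(conn, query, batch_size=5000):
    """Yield lists of at most `batch_size` rows instead of the whole result set.

    Hypothetical helper: only one batch is held in client memory at a time.
    """
    cur = conn.cursor()
    cur.execute(query)
    try:
        while True:
            rows = cur.fetchmany(batch_size)  # DB-API call: next chunk of rows
            if not rows:
                break
            yield rows
    finally:
        cur.close()


# Stand-in data set: sqlite3 in memory, NOT PostgreSQL (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO events (payload) VALUES (?)",
    (("row %d" % i,) for i in range(12000)),
)

# Process the result in chunks; here we only record how large each chunk was.
batch_sizes = [
    len(batch)
    for batch in fetch_in_batches(conn, "SELECT id, payload FROM events ORDER BY id")
]
print(batch_sizes)  # [5000, 5000, 2000]
```

In psql itself, setting the `FETCH_COUNT` variable (for example `\set FETCH_COUNT 10000`) should have a similar effect, since psql then retrieves and prints the result in groups of that many rows rather than buffering all of it.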