Tom, here's the "explain" result. Does this help explain what went wrong? (And yes, I think there will be a *lot* of groups.)

explain select count(*) as count, fips_st_cd, fips_cnty_cd, farm_nbr,
    tract_nbr, field_nbr, crop_cd, crop_status_cd, practice_cd, seq_nbr
from compliance_2006
group by fips_st_cd, fips_cnty_cd, farm_nbr, tract_nbr, field_nbr,
    crop_cd, crop_status_cd, practice_cd, seq_nbr
order by 1 desc;

                              QUERY PLAN
--------------------------------------------------------
 Sort  (cost=15119390.46..15123902.54 rows=1804832 width=160)
   Sort Key: count(*)
   ->  GroupAggregate  (cost=13782933.29..14301822.43 rows=1804832 width=160)
         ->  Sort  (cost=13782933.29..13828054.08 rows=18048318 width=160)
               Sort Key: fips_st_cd, fips_cnty_cd, farm_nbr, tract_nbr, field_nbr, crop_cd, crop_status_cd, practice_cd, seq_nbr
               ->  Seq Scan on compliance_2006  (cost=0.00..1039927.18 rows=18048318 width=160)
(6 rows)

On Sun, Aug 19, 2007 at 01:19:51PM -0400, Tom Lane wrote:
> Bill Thoen <bthoen@xxxxxxxxxx> writes:
> > I knew this would take some time, but what I didn't expect was that about
> > an hour into the select, my mouse and keyboard locked up, and I also
> > couldn't log in from another computer via SSH. This is a Linux machine
> > running Fedora Core 6, and PostgreSQL is 8.1.4. There's about 50 GB free
> > on the disk, too.
> >
> > I finally had to shut the power off and reboot to regain control of my
> > computer (that wasn't a good idea, either, but eventually I got everything
> > working again).
>
> I've seen Fedora go nuts like that when it ran out of memory. Once it
> starts to swap heavily, performance goes into the tank; and once the
> kernel realizes it's in memory trouble, it starts to kill processes
> more or less at random. That might explain why ssh stopped working.
>
> One thing to do to make it more robust is to disable memory overcommit.
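(For anyone following along: disabling memory overcommit on Linux is done through the `vm.overcommit_memory` sysctl. A minimal sketch, assuming a 2.6-era kernel like FC6's and root access; mode 2 is the strict-accounting setting, and the persistent-file location may differ on your distribution.)

```
# Strict overcommit accounting: allocations fail up front instead of the
# OOM killer shooting more or less random processes later.
sysctl -w vm.overcommit_memory=2

# Make the setting survive a reboot:
echo "vm.overcommit_memory = 2" >> /etc/sysctl.conf
```

With strict accounting in effect, a runaway backend gets a malloc failure (which PostgreSQL reports as an out-of-memory error and recovers from) rather than taking down sshd and the desktop with it.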
> I suspect also that configuring it with lots of swap space is
> counterproductive, because that just encourages the kernel to allow lots
> of swapping. I haven't actually experimented with that part, though.
>
> As for why PG ran the system out of memory, I suspect that the planner
> drastically underestimated the number of groups to be created by your
> GROUP BY, and thought it could get away with a hash aggregation. We
> don't currently have any provision for spilling hash aggregation to
> disk, so if there's a very large number of groups the hash table just
> gets very big :-(. The planner is not supposed to choose hash agg if the
> estimated table size exceeds work_mem ... but if it had out-of-date
> statistics to work with, it might have gotten the wrong answer. Have
> you ANALYZEd this table recently? What does EXPLAIN show as the
> estimated number of result rows?
>
> 			regards, tom lane
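Tom's two suggestions translate into a couple of one-liners in psql. A sketch, using the table name from the thread; the work_mem value is illustrative, not a recommendation:

```
-- Refresh the optimizer statistics so the group-count estimate is current.
ANALYZE compliance_2006;

-- work_mem caps the memory a hash aggregate may assume it can use; since
-- 8.1 cannot spill a hash aggregate to disk, a modest setting steers the
-- planner toward the sort + GroupAggregate plan shown above instead.
SHOW work_mem;          -- current per-operation budget
SET work_mem = '32MB';  -- session-local change, takes effect immediately
```

The EXPLAIN output Bill posted already shows a GroupAggregate with rows=1804832 on the aggregate node, so after the ANALYZE the planner is no longer attempting the in-memory hash aggregation that Tom suspected.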