Tom Lane wrote:
Dirk Riehle <dirk@xxxxxxxxxx> writes:
For one analysis, I created a table of about 15 columns and then ran an
insert with a subselect on that table, where the select drew from
three other tables, joining on ids. One of the other tables has about
60M rows.
The insert ran for about 18h before running out of disk space. Before
the query, there had been about 1TB free on the disk.
After the insert query failed, the disk space was not made available
again; PostgreSQL did not free it up (or would not free it up quickly).
What that part sounds like is you mistyped the insert such that it
was inserting a huge number of rows. It's not too hard to do if
you get the join condition wrong --- what you meant to be a sane
join can easily turn into a Cartesian product. In theory vacuum
could reclaim the space eventually, but it'd take a while.
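(As a rough sketch of both points, using hypothetical tables a and b and a
target table t rather than the real schema:)

-- the intended join produces one output row per matching pair
insert into t select a.id, b.val from a, b where a.id = b.a_id;

-- with the join predicate dropped or mistyped, every row of a pairs with
-- every row of b; against tables of 46M and 60M rows that is on the
-- order of 10^15 output rows
insert into t select a.id, b.val from a, b;

-- after a failed insert the dead rows still occupy the disk space;
-- plain VACUUM only marks it as reusable within the table, whereas
-- TRUNCATE (if the table contents are disposable) returns it to the
-- operating system immediately
vacuum t;
truncate table t;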
Hmm... here's the insert:
insert into t select ... from commits c, diffs d, sloc_metrics sm where
d.commit_id = c.id and sm.diff_id = d.id;
sm has about 46M rows, d has about 60M. I don't know enough about database
engines to understand why it couldn't do the matching incrementally
rather than running into a Cartesian product?
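One way to sanity-check a query like this before letting it run is to look
at the planner's row estimate; a sketch, selecting a single column just to
get the estimate:

explain
select c.id
from commits c, diffs d, sloc_metrics sm
where d.commit_id = c.id and sm.diff_id = d.id;

-- the "rows=" figure on the top plan node is the planner's estimate of how
-- many rows the insert would produce; anything wildly beyond the 46M/60M
-- source tables suggests the join has degenerated into a Cartesian product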
In any case, I pared down the tables and it ran through quickly without
further problems. So maybe I had a typo in there.
I rebooted soon thereafter.
During boot, fsck (it must have been fsck) printed tons of "freeing up
inode" messages.
And this part is a filesystem bug; it cannot possibly be Postgres'
fault that the filesystem got corrupted.
One would think so? But the reboot should have stopped PostgreSQL properly.
Thanks for the help!
Dirk
--
Phone: +1 (650) 215 3459
Blog: http://dirkriehle.com
http://twitter.com/dirkriehle