At 18.44 21/03/2006, Scott Marlowe wrote:
Here's what's happening. On the "fast" machine, you are almost
certainly using IDE drives.
Oh yes, the fast machine has IDE drives, you got it ;)
Meanwhile, back in the jungle... The machine with IDE drives operates
differently. Most, if not all, IDE drives, when told by the OS to
fsync() tell the OS immediately that the fsync() call has completed, and
the data is written to the drive. Shortly thereafter, the drive
actually commences to write the data out. When it gets a chance.
I really didn't know this behaviour of IDE drives.
I was stracing the postmaster while investigating the problem and noticed
many fsync syscalls (one after each INSERT).
I was investigating on it but I didn't explain me why SCSI was slower.
You helped me a lot ;) tnx
For PostgreSQL, the way IDE drives operate is dangerous. Write data
out, call fsync(), get an immediate return, mark the data as committed,
move on the next operation, operator trips over power cord / power
conditioner explodes, power supply dies, brown out causes the machine to
reboot, et. al., and when the machine comes up, PostgreSQL politely
informs you that your database is corrupt, and you come to the
pgsql-general group asking how to get your database back online. Very
bad.
Yes, it sounds very bad... what about SATA drives ?
I heard about command queueing in SATA but I don't know if the kernel
handles it properly
Try wrapping the inserts in the sql file in begin; / commit; statements,
like so:
begin;
insert into table ...
(100,000 inserts here)
insert into table ...
commit;
and it should fly.
Oh, yes with the insert wrapped in a transaction the import time is as follows:
- SCSI: 35 secs
- IDE: 50 secs
When a good friend of mine first started using PostgreSQL, he was a
total MySQL bigot. He was importing a 10,000 row dataset, and made a
smartassed remark after 10 minutes how it would have imported in minutes
on MySQL. It was a test database, so I had him stop the import, delete
all the imported rows, and wrap the whole import inside begin; and
commit;
The import took about 20 seconds or so.
;)
Now, for the interesting test. Run the import on both machines, with
the begin; commit; pairs around it. Halfway through the import, pull
the power cord, and see which one comes back up. Don't do this to
servers with data you like, only test machines, obviously. For an even
more interesting test, do this with MySQL, Oracle, DB2, etc...
I will surely run a test like this ;)
Tnx a lot again for help
Regards
Edoardo Serra