Search Postgresql Archives

Re: COPY losing information

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Just bringing back to life a message I sent last July.

The problem I was having was that when importing very large data sets, COPY seemed to drop some data. I built a script to use INSERTs, and same problem. My server runs 8.1.3 on Linux. Several people investigated, Reece Hart was unable to reproduce the problem using my same data file, and Tom Lane suggested a buggy client installation dropping info, or hardware problems in the server.

I've come back to this problem recently, and have found a couple of interesting things.
I'm using a 418MB data file. wc -l gives me 6,802,367 lines

On the server (Linux running Postgres 2.1.3). 4GB RAM, and the disk Postgres lives in is a network drive.
After copying the data file to the server and running psql locally.
- Import try 1: count(*) gives 6,801,664
- Import try 2: count(*) gives 6,802,241
- Every import gave a different count
The log file doesn't say anything other than
LOG: checkpoints are occurring too frequently (27 seconds apart), but that's should not be a problem, right?

I also tried it in my local-disk Windows installation, which runs Postgres 8.1.0
- Import try 1: count(*) gives 6,824,366
- All imports since then give, correctly, count(*) = 6,802,367

This time, I FTP'd the file to my mac laptop, which is running 8.2.3. The import gave me the correct count every time. Remembering the suggestion of hardware problems, I made an empty install of 8.2.3 on my server. On it, the data file was fully imported every single time. I decided to copy all my data to 8.2.3 to test dependency of the bug on a) size of the database (30GB of data) or b) some problem with data or some stored procedure. I've tried the import, and again, it's correct every single time.

During the import of the database data to 8.2.3, I got mostly a clean set, except for the 3 error messages listed below which suggest a buggy kernel.

A bunch of questions:
1) Has anybody seen this type of behavior in 8.1.*, is there a known bug that might explain the problems in both the Linux server and the Windows box? 2) Those errors on the database import are troubling, but the affected only 3 tables. Wouldn't a buggy kernel give more trouble? Is it possible that this was a bug in 8.1.3's pg_dump? Also, since the database is so big, the output of pg_dump is split'd and bzip2'd, so there's room for error there too. 3) On Friday I'm going to upgrade the production database from 8.1.3 to 8.2.3. Any caveats or words of advice?

Thank you,
Jaime


ERROR:  unexpected data beyond EOF in block 23662 of relation "portfolio"
HINT: This has been seen to occur with buggy kernels; consider updating your system.
CONTEXT:  COPY portfolio, line 3426949: "210395 1974263 1       723"
STATEMENT: COPY portfolio (deal_id, security_id, amount, portfolio_version) FROM stdin; ERROR: unexpected data beyond EOF in block 4028 of relation "coverage_test_val" HINT: This has been seen to occur with buggy kernels; consider updating your system. CONTEXT: COPY coverage_test_val, line 338035: "340676 588 2006-08-25 STEPUP_TRIGGER2 6 0.0 0.0 \
 7"
STATEMENT: COPY coverage_test_val (coverage_test_val_id, deal_id, observation_date, coverage_test_name, coverage_test\ _priority, coverage_test_value, coverage_test_trigger, coverage_test_type_id) FROM stdin; ERROR: unexpected data beyond EOF in block 4049 of relation "deal_current_val" HINT: This has been seen to occur with buggy kernels; consider updating your system. CONTEXT: COPY deal_current_val, line 511050: "612884 7008 2005-09-21 23 1.957871" STATEMENT: COPY deal_current_val (deal_current_val_id, deal_id, observation_date, type_id, deal_current_val) FROM std\
in;



***********************************************************************
Bear Stearns is not responsible for any recommendation, solicitation,
offer or agreement or any information about any transaction, customer
account or account activity contained in this communication.

Bear Stearns does not provide tax, legal or accounting advice.  You
should consult your own tax, legal and accounting advisors before
engaging in any transaction. In order for Bear Stearns to comply with
Internal Revenue Service Circular 230 (if applicable), you are notified
that any discussion of U.S. federal tax issues contained or referred to
herein is not intended or written to be used, and cannot be used, for
the purpose of:  (A) avoiding penalties that may be imposed under the
Internal Revenue Code; nor (B) promoting, marketing or recommending to
another party any transaction or matter addressed herein.
***********************************************************************


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux