I asked about my problem on the OpenSolaris ZFS list, and this is what they said:
...
Most likely this is an application corruption problem. With ZFS
checksums, it's not possible to have undetected corruption of this type.
Of course, it's possible there is a ZFS bug somewhere, but none of the
information provided so far indicates that. Did you have an unexpected
panic shortly before this happened? It shouldn't matter, but if there
was a bug then it would indicate it might be present in the ZIL replay
code, or that the database isn't using sync/O_DSYNC correctly.
> -bash-3.00# fmdump -e
> TIME CLASS
> May 29 19:43:40.8328 ereport.fs.zfs.io
any user-visible errors.
As far as I can tell, you had a random unassociated error, and the
problem is in some higher-level piece of software (i.e. postgres). It
would be worth dumping the contents of that file and determining if it's
something completely crazy (like the contents of another file) or
somehow self-inconsistent.
...
So, what do you think about that? And what is the strategy for clearing the error: find and drop the faulty record?
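If finding and dropping the faulty record is the way to go, one possible approach is to bisect over the primary keys until the unreadable rows are isolated. Below is a minimal sketch of that idea, not a tested recovery procedure. The function name find_bad_ids and the probe callback are hypothetical. In practice the probe would presumably run something like SELECT * FROM public.candidates WHERE id = ANY(...) through a database driver and return False when the server raises "compressed data is corrupt". The sketch also assumes the list of ids can still be read with SELECT id FROM public.candidates, on the guess that the damage is confined to a compressed column such as resume_text.

```python
from typing import Callable, List, Sequence


def find_bad_ids(ids: Sequence[int],
                 probe: Callable[[Sequence[int]], bool]) -> List[int]:
    """Return the ids whose rows cannot be read.

    probe(chunk) is a caller-supplied (hypothetical) function that must
    return True when every row in chunk reads cleanly and False when
    reading any of them fails.
    """
    if not ids:
        return []
    if probe(ids):
        # The whole chunk reads fine, so nothing in it is corrupt.
        return []
    if len(ids) == 1:
        # A single row that fails to read is a faulty record.
        return [ids[0]]
    # Otherwise split the chunk in half and check each half separately.
    mid = len(ids) // 2
    return find_bad_ids(ids[:mid], probe) + find_bad_ids(ids[mid:], probe)
```

With the faulty ids in hand, the rows could be inspected column by column, and then deleted or restored from an older dump. Bisection keeps the number of probe queries small when only a few rows are damaged.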
On 5/30/07, Roman Chervotkin <roman.chervotkin@xxxxxxxxx> wrote:
I ran my usual pg_dump today and got an error.
---------------------
pg_dump: SQL command failed
pg_dump: Error message from server: ERROR: compressed data is corrupt
pg_dump: The command was: COPY public.candidates (id, name, surname, mid_name, compensation, created, birthday, updated, creator_id, updater_id, home_region_id, home_city, home_street, other_languages, home_region_other, edu_conformity, sex, deleted, resume_text, resume_text_index, mark, kids, driving_license, marital_status, software_knowledge, business_contacts, type_writing_rus, type_writing_lat, desired_duties, desired_other, updater_name, citizenship, compensation_currency, folder_id, plus, random_cat, temp_reg, "temp") TO stdout;
----------------------
I was running PostgreSQL 8.2.0 on Solaris 10 (SunOS server3 5.10 Generic_125101-05), with pg_data_dir on a mirrored ZFS pool. I checked the status of the pool and got:
----------------------
-bash-3.00# zpool status tank
  pool: tank
 state: ONLINE
status: One or more devices has experienced an unrecoverable error. An
        attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: scrub completed with 0 errors on Tue May 29 19:46:12 2007
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c1t0d0  ONLINE       1     0     0
            c1t4d0  ONLINE       0     0     0

errors: No known data errors
--------------------------------------------------------
So it seems a drive should be replaced, but the pool is still functioning. I can still copy pg_data_dir to another filesystem without any errors, and the database seems to be working as usual. So I would expect pg_dump to work, but it does not. I upgraded to 8.2.4 but still get the same error. I also copied pg_data_dir to another filesystem, started postgres with that data directory, and tried pg_dump again, but the error persists.
So the question is: what should I do to get pg_dump working again?
Thanks
Roman.