Le 2014-07-04 à 14:42, François Beausoleil <francois@xxxxxxxxxxx> a écrit : > Hi, > > I ran a query on the slave and got the error message in the subject. I rebuilt the slave two days ago on ZFS (with compression). I’m wondering how that unexpected zero page happened in the first place. > > Master: > Linux kalanchoe 3.2.0-58-generic #88-Ubuntu SMP Tue Dec 3 17:37:58 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux > PostgreSQL 9.1.11 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3, 64-bit > > Slave: > Linux munn 3.2.0-65-generic #98-Ubuntu SMP Wed Jun 11 20:27:07 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux > PostgreSQL 9.1.13 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3, 64-bit > > The slave is connected to the master through streaming replication. I know I’m not perfectly in sync, and I’ll switch the roles in a week, after a burn-in period for the slave. > > I was running the following query on the slave: > > select persona_followers_count, count(*) > from show_interaction_bindings > where market_id = 'e8639780-db2d-012e-ebb6-40400fe46aa7’ > and interaction_created_at >= '2014-06-18’ > and service_name = ‘Facebook' > and show_id = 'ed7e4800-c370-012e-eb90-40400fe46aa7’ > group by 1 order by 2 desc; > > and I received this error message: > > ERROR: index "canada_2014_06_23_pkey" contains unexpected zero page at block 135764 > HINT: Please REINDEX it. > > show_interaction_bindings is a master table with partitions by market_id and week, such that I have 500+ child tables. bindings.2014_06_23’s schema looks like: > > > Table "bindings.canada_2014_06_23" > Column | Type | Modifiers > -------------------------------------+-----------------------------+---------------------- > interaction_id | uuid | not null > show_id | uuid | not null > service_name | character varying(8) | not null > interaction_created_at | timestamp without time zone | not null > interaction_service_id | text | not null > interaction_text | text | > interaction_source | text | > interaction_platform_name | character varying(256) | > interaction_app_name | character varying(256) | > interaction_in_reply_to_screen_name | character varying(256) | > interaction_in_reply_to_status_id | character varying(64) | > interaction_in_reply_to_user_id | character varying(64) | > interaction_language | character varying(12) | > interaction_retweet_count | integer | > interaction_retweet_id | character varying(64) | > persona_service_id | character varying(64) | not null > persona_screen_name | character varying(256) | not null > persona_name | character varying(256) | not null > persona_description | text | > persona_location | text | > persona_lang | character varying(16) | > persona_profile_image_url | text | > persona_created_at | timestamp without time zone | > persona_followers_count | integer | not null default 130 > persona_friends_count | integer | not null default 130 > persona_time_zone | text | > persona_utc_offset | integer | > persona_gender | character varying(12) | > persona_age_group | character varying(12) | > persona_country | character varying(40) | > persona_region | character varying(80) | > persona_city | character varying(80) | > market_id | uuid | > score | real | > positive_sentiment | real | not null default 0.0 > negative_sentiment | real | not null default 0.0 > like_or_favorite | boolean | > binder_log | text | > Indexes: > "canada_2014_06_23_pkey" PRIMARY KEY, btree (show_id, interaction_service_id, service_name) > Check constraints: > "market_and_created_at_in_partition" CHECK (market_id = 'e8639780-db2d-012e-ebb6-40400fe46aa7'::uuid AND '2014-06-23 07:00:00'::timestamp without time zone <= interaction_created_at AND interaction_created_at < '2014-06-30 07:00:00'::timestamp without time zone) > Inherits: show_interaction_bindings > > I just rebuilt the slave two days ago on ZFS, from XFS. I configured ZFS to compress PG’s data directory. PG’s logs are silent, except for the mentioned error: > > 2014-07-04 17:09:46.286 UTC - @ 42408 (00000) 2014-07-02 14:21:40 UTC - CONTEXT: xlog redo delete: index 1663/18332/27715584; iblk 2042654, heap 1663/18332/27715411; > 2014-07-04 17:10:02.323 UTC - @ 42408 (00000) 2014-07-02 14:21:40 UTC - LOG: file "pg_clog/0636" doesn't exist, reading as zeroes > 2014-07-04 17:10:02.323 UTC - @ 42408 (00000) 2014-07-02 14:21:40 UTC - CONTEXT: xlog redo delete: index 1663/18332/27715584; iblk 2041497, heap 1663/18332/27715411; > 2014-07-04 17:10:32.257 UTC - @ 42408 (00000) 2014-07-02 14:21:40 UTC - LOG: file "pg_clog/08A1" doesn't exist, reading as zeroes > 2014-07-04 17:10:32.257 UTC - @ 42408 (00000) 2014-07-02 14:21:40 UTC - CONTEXT: xlog redo delete: index 1663/18332/27715584; iblk 2042678, heap 1663/18332/27715411; > 2014-07-04 17:12:45.201 UTC - @ 42453 (00000) 2014-07-02 14:21:41 UTC - LOG: restartpoint complete: wrote 84024 buffers (2.0%); 0 transaction log file(s) added, 32 removed, 64 recycled; write=540.052 s, sync=0.020 s, total=540.098 s; sync files=103, longest=0.009 s, average=0.000 s > 2014-07-04 17:12:45.201 UTC - @ 42453 (00000) 2014-07-02 14:21:41 UTC - LOG: recovery restart point at 30AB/D5E94FE8 > 2014-07-04 17:12:45.201 UTC - @ 42453 (00000) 2014-07-02 14:21:41 UTC - DETAIL: last completed transaction was at log time 2014-07-04 17:12:36.607511+00 > 2014-07-04 17:13:02.470 UTC - @ 42453 (00000) 2014-07-02 14:21:41 UTC - LOG: restartpoint starting: xlog > 2014-07-04 17:22:02.090 UTC - @ 42453 (00000) 2014-07-02 14:21:41 UTC - LOG: restartpoint complete: wrote 84292 buffers (2.0%); 0 transaction log file(s) added, 8 removed, 34 recycled; write=539.555 s, sync=0.048 s, total=539.620 s; sync files=94, longest=0.006 s, average=0.000 s > … > 2014-07-04 17:32:02.092 UTC - @ 42453 (00000) 2014-07-02 14:21:41 UTC - LOG: restartpoint complete: wrote 64598 buffers (1.5%); 0 transaction log file(s) added, 0 removed, 41 recycled; write=540.000 s, sync=0.005 s, total=540.014 s; sync files=85, longest=0.001 s, average=0.000 s > 2014-07-04 17:32:02.092 UTC - @ 42453 (00000) 2014-07-02 14:21:41 UTC - LOG: recovery restart point at 30AC/375129E0 > 2014-07-04 17:32:02.092 UTC - @ 42453 (00000) 2014-07-02 14:21:41 UTC - DETAIL: last completed transaction was at log time 2014-07-04 17:31:53.506748+00 > 2014-07-04 17:33:02.079 UTC - @ 42453 (00000) 2014-07-02 14:21:41 UTC - LOG: restartpoint starting: time > 2014-07-04 17:42:02.152 UTC - @ 42453 (00000) 2014-07-02 14:21:41 UTC - LOG: restartpoint complete: wrote 78599 buffers (1.9%); 0 transaction log file(s) added, 25 removed, 31 recycled; write=540.034 s, sync=0.013 s, total=540.073 s; sync files=86, longest=0.002 s, average=0.000 s > 2014-07-04 17:42:02.152 UTC - @ 42453 (00000) 2014-07-02 14:21:41 UTC - LOG: recovery restart point at 30AC/5C0514F8 > 2014-07-04 17:42:02.152 UTC - @ 42453 (00000) 2014-07-02 14:21:41 UTC - DETAIL: last completed transaction was at log time 2014-07-04 17:42:01.954993+00 > 2014-07-04 17:44:47.178 UTC - @ 42453 (00000) 2014-07-02 14:21:41 UTC - LOG: restartpoint starting: time > … > 2014-07-04 17:53:47.067 UTC - @ 42453 (00000) 2014-07-02 14:21:41 UTC - LOG: restartpoint complete: wrote 60429 buffers (1.4%); 0 transaction log file(s) added, 0 removed, 37 recycled; write=539.855 s, sync=0.027 s, total=539.889 s; sync files=67, longest=0.010 s, average=0.000 s > 2014-07-04 17:53:47.067 UTC - @ 42453 (00000) 2014-07-02 14:21:41 UTC - LOG: recovery restart point at 30AC/89B87FB8 > 2014-07-04 17:53:47.067 UTC - @ 42453 (00000) 2014-07-02 14:21:41 UTC - DETAIL: last completed transaction was at log time 2014-07-04 17:53:38.213828+00 > 2014-07-04 17:54:47.046 UTC - @ 42453 (00000) 2014-07-02 14:21:41 UTC - LOG: restartpoint starting: time > 2014-07-04 17:58:03.242 UTC - svanreporting@svanalytics_production 58863 (XX002) 2014-07-04 17:57:20 UTC - ERROR: index "canada_2014_06_23_pkey" contains unexpected zero page at block 135764 > 2014-07-04 17:58:03.242 UTC - svanreporting@svanalytics_production 58863 (XX002) 2014-07-04 17:57:20 UTC - HINT: Please REINDEX it. > 2014-07-04 17:58:03.242 UTC - svanreporting@svanalytics_production 58863 (XX002) 2014-07-04 17:57:20 UTC - STATEMENT: select persona_followers_count, count(*) from show_interaction_bindings where market_id = 'e8639780-db2d-012e-ebb6-40400fe46aa7' and interaction_created_at >= '2014-06-18' and service_name = 'Facebook 'and show_id = 'ed7e4800-c370-012e-eb90-40400fe46aa7' group by 1 order by 2 desc; > 2014-07-04 18:03:47.184 UTC - @ 42453 (00000) 2014-07-02 14:21:41 UTC - LOG: restartpoint complete: wrote 86440 buffers (2.1%); 0 transaction log file(s) added, 16 removed, 29 recycled; write=540.078 s, sync=0.038 s, total=540.137 s; sync files=88, longest=0.010 s, average=0.000 s > 2014-07-04 18:03:47.184 UTC - @ 42453 (00000) 2014-07-02 14:21:41 UTC - LOG: recovery restart point at 30AC/A5FF4D70 > 2014-07-04 18:03:47.184 UTC - @ 42453 (00000) 2014-07-02 14:21:41 UTC - DETAIL: last completed transaction was at log time 2014-07-04 18:03:42.458954+00 > 2014-07-04 18:05:02.004 UTC - @ 42453 (00000) 2014-07-02 14:21:41 UTC - LOG: restartpoint starting: time > 2014-07-04 18:14:02.090 UTC - @ 42453 (00000) 2014-07-02 14:21:41 UTC - LOG: restartpoint complete: wrote 57764 buffers (1.4%); 0 transaction log file(s) added, 0 removed, 28 recycled; write=540.064 s, sync=0.015 s, total=540.085 s; sync files=64, longest=0.003 s, average=0.000 s > > (skipped a few syntax errors in the logs, otherwise, this is it) > > I found http://www.postgresql.org/message-id/9b0dbfb1d330b895475232e70cff6e02@xxxxxxxxxxxxxxxxxxxxx, but this is about 9.3. The response does mention upgrading to the latest 9.3 version. > > This seems to imply a data corruption problem caused by an incorrect fsync: http://postgresql.1045698.n5.nabble.com/FATAL-index-contains-unexpected-zero-page-at-block-td5732056.html > > This is my first experience with ZFS. I configured two disks as mirrors. zpool status does not report any problems: > > $ sudo zpool status > pool: pgdata > state: ONLINE > scan: scrub canceled on Wed Jul 2 15:41:23 2014 > config: > > NAME STATE READ WRITE CKSUM > pgdata ONLINE 0 0 0 > mirror-0 ONLINE 0 0 0 > sdb ONLINE 0 0 0 > sdc ONLINE 0 0 0 > > errors: No known data errors > > $ sudo zfs list > NAME USED AVAIL REFER MOUNTPOINT > pgdata 989G 1.71T 31K /pgdata > pgdata/data 988G 1.71T 31K /pgdata/data > pgdata/data/dbanalytics 982G 1.71T 808G /var/lib/postgresql > pgdata/data/walinbox 6.66G 1.71T 6.66G /var/backups/seevibes/wal > > I have a zpool scrub starting tonight, and I’ll see if there are disk errors from there. > > Any ideas on how that data error could slip through? To put out more data: * The zpool scrub found no issues; * I successfully ran a pg_dumpall > /dev/null from the master, confirming all data files are ok. I restored from a fresh base backup, knowing that this backup was taken while the master had not been restarted. I’m running pg_dumpall > /dev/null on the slave, to determine if the data errors can be reproduced. I’ll know more in 24h. Bye, François Beausoleil