Sage, Piotr,

Sorry for taking up your time, and thanks for your help. Memtest reported errors (red results). I will be digging for the bad memory chips.

Thanks again.

Сахинов Константин
tel.: +7 (909) 945-89-42

2015-08-10 16:58 GMT+03:00 Dałek, Piotr <Piotr.Dalek@xxxxxxxxxxxxxx>:
>> -----Original Message-----
>> From: ceph-devel-owner@xxxxxxxxxxxxxxx [mailto:ceph-devel-owner@xxxxxxxxxxxxxxx] On Behalf Of Sage Weil
>> Sent: Monday, August 10, 2015 3:52 PM
>> To: Константин Сахинов
>>
>> On Mon, 10 Aug 2015, Константин Сахинов wrote:
>> > Uploaded another corrupted piece.
>> >
>> > 2015-08-10 16:18:40.027726 7f7979697700 -1 log_channel(cluster) log [ERR] : be_compare_scrubmaps: 3.fd shard 6: soid f2e832fd/rbd_data.ab7174b0dc51.0000000000000249/head//3 data_digest 0x64e94460 != known data_digest 0xaec3bea8 from auth shard 10
>> >
>> > # ceph-post-file ceph-6-rbd_data.ab7174b0dc51.0000000000000249
>> > ceph-post-file: e96e5828-b97c-45f1-8e3f-23abbf700865
>> >
>> > # ceph-post-file ceph-10-rbd_data.ab7174b0dc51.0000000000000249
>> > ceph-post-file: e1277a33-74a5-4d46-93c8-266bd81867db
>> >
>> > I don't think it's a bad disk:
>> > - SMART is clean on all OSDs,
>> > - I tried ceph pg repair 3.d8 on other OSDs in the past with the same result. Then I rebalanced the cluster so that pg 3.d8 moved to [7,1].
>>
>> Again, it's all single bit changes:
>>
>> [..]
>>
>> It looks like it's also always the least significant bit in a 4-byte word.
>>
>> Can you see if there is any pattern to which OSDs are used for the inconsistent PGs?
>
> I would check the machines that host those PGs for memory errors, especially if they don't have ECC RAM sticks or the ECC feature is disabled.
> http://www.memtest.org/ is a good tool for this purpose.
>
> With best regards / Pozdrawiam
> Piotr Dałek
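
For anyone who wants to check for the same pattern Sage describes (single-bit changes, always the least significant bit of a 4-byte word), here is a minimal Python sketch. It assumes the two copies of an object have already been extracted to local files of equal length. The function names bit_flips and summarize are hypothetical, and the bit position is computed under a little-endian word layout, which is an assumption and not something stated in the thread.

```python
from collections import Counter
from typing import Iterator, Tuple


def bit_flips(a: bytes, b: bytes) -> Iterator[Tuple[int, int]]:
    """Yield (byte_offset, bit_in_32bit_word) for every bit that differs.

    Assumption: 4-byte words are little-endian, so bit 0 of the byte at
    offset % 4 == 0 is the least significant bit of the word.
    Only the common prefix of the two buffers is compared.
    """
    for offset, (x, y) in enumerate(zip(a, b)):
        diff = x ^ y
        for bit in range(8):
            if diff & (1 << bit):
                yield offset, (offset % 4) * 8 + bit


def summarize(path_a: str, path_b: str) -> Counter:
    """Count how often each bit position within a 4-byte word differs."""
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        flips = bit_flips(fa.read(), fb.read())
        return Counter(word_bit for _, word_bit in flips)
```

If summarize() on the copies from two shards returns a counter dominated by a single key such as 0, that would be consistent with the flipped-bit pattern above and points at memory before disks.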