Re: 1 pg inconsistent and does not recover

Hi Niklas,
You may have a hardware error, but it is hard to say without more information.
Can you post the entire Ceph status output (e.g. on Pastebin)? Sometimes
list-inconsistent-obj throws that error if a scrub job is still running.
Also, please try to find more information in the logs by running:

grep -Hn 'ERR' /var/log/ceph/ceph-osd.33.log
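
If that prints a lot, a small wrapper can narrow the output to the
affected PG. This is only a sketch: pg_scrub_errors is a made-up helper
name, not a Ceph command, and it assumes the relevant log lines mention
the PG id (2.87) literally.

```shell
# Sketch only: pg_scrub_errors is a hypothetical helper, not part of Ceph.
# Usage: pg_scrub_errors PGID LOGFILE...
pg_scrub_errors() {
    pgid="$1"
    shift
    # Keep lines containing ERR (with file name and line number),
    # then keep only those that contain the PG id as a literal substring.
    # Note: a substring match on 2.87 would also match e.g. 12.87.
    grep -Hn 'ERR' "$@" | grep -F -- "$pgid"
}
```

For example, on the host of osd.33:

pg_scrub_errors 2.87 /var/log/ceph/ceph-osd.33.log

The other two OSDs in the acting set (2 and 20) presumably log to files
named the same way on their own hosts, but that is an assumption.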

Cheers.

On Tue, Jun 27, 2023 at 4:47 PM Niklas Hambüchen <mail@xxxxxx> wrote:

> Hi,
>
> I have a 3x-replicated pool with Ceph 12.2.7.
>
> One HDD broke, its OSD "2" was automatically marked as "out", the disk was
> physically replaced by a new one, and that was added back in.
>
> Now `ceph health detail` continues to permanently show:
>
>      [ERR] OSD_SCRUB_ERRORS: 1 scrub errors
>      [ERR] PG_DAMAGED: Possible data damage: 1 pg inconsistent
>          pg 2.87 is active+clean+inconsistent, acting [33,2,20]
>
> What exactly is wrong here?
>
> Why can Ceph not fix the issue?
> With BlueStore I have checksums, on two unbroken disks, so what remaining
> inconsistency can there be?
>
> The suggested command in
> https://docs.ceph.com/en/pacific/rados/operations/pg-repair/#commands-for-diagnosing-pg-problems
> does not work:
>
>      # rados list-inconsistent-obj 2.87
>      No scrub information available for pg 2.87
>      error 2: (2) No such file or directory
>
> Further, I find the documentation in
> https://docs.ceph.com/en/pacific/rados/operations/pg-repair/#more-information-on-pg-repair
> extremely unclear.
> It says
>
> > In the case of replicated pools, recovery is beyond the scope of pg
> repair.
>
> while many people on the Internet suggest that `ceph pg repair` might fix
> the issue.
> Yet again others claim that Ceph will fix the issue itself.
> I am hesitant to run "ceph pg repair" without understanding what the
> problem is and what exactly this will do.
>
> I have already reported the "error 2" and the documentation in issue
> https://tracker.ceph.com/issues/61739 but have not received a reply yet,
> and my cluster stays "inconsistent".
>
> How can this be fixed?
>
> I would appreciate any help!
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>


-- 

Alvaro Soto

*Note: My work hours may not be your work hours. Please do not feel the
need to respond during a time that is not convenient for you.*
----------------------------------------------------------
Great people talk about ideas,
ordinary people talk about things,
small people talk... about other people.