Konstantin,
Thanks for your answer, I will run a "ceph pg repair".
Could you maybe elaborate a bit on how this repair process works? Does it just try to re-read the object from the OSD that reported the read_error?
IIRC there was a time when "ceph pg repair" wasn't considered 'safe' because it simply copied the primary OSD's shard contents to the other OSDs.
Since when did this change?
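For context, this is roughly how I've been looking at the two PGs and what I intend to run next (the <pgid> below is just a placeholder for our actual PG ids):

  ceph health detail                                         # shows which PGs are inconsistent
  rados list-inconsistent-obj <pgid> --format=json-pretty    # shows which shard reported the read_error
  ceph pg repair <pgid>                                      # the repair itself, per your advice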
Btw, I woke up this morning with only one active+clean+inconsistent PG left, so the other one apparently already triggered a new (deep) scrub, re-read the primary OSD and found it good.
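If the remaining PG doesn't clear up by itself, I assume I can also kick off a deep scrub manually before deciding on the repair:

  ceph pg deep-scrub <pgid>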
I noticed these read_errors tend to occur on this installation when available RAM gets low (we still have to reboot the cluster nodes once in a while to free up RAM).
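Next time it happens I'll try to check memory pressure on the node first, e.g. with free -h on the host and, if I read the docs correctly, the per-OSD mempool stats from the admin socket (osd.0 is just an example id):

  ceph daemon osd.0 dump_mempools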
Furthermore, we will upgrade to 12.2.12 soon.
Caspar Smit
Systemengineer
SuperNAS
Dorsvlegelstraat 13
1445 PA Purmerend
t: (+31) 299 410 414
e: casparsmit@xxxxxxxxxxx
w: www.supernas.eu
On Thu, 5 Dec 2019 at 07:26, Konstantin Shalygin <k0ste@xxxxxxxx> wrote:
Yes, you should call pg repair. Also it's better to upgrade to 12.2.12.

> I tried to dig in the mailing list archives but couldn't find a clear answer to the following situation:
> Ceph encountered a scrub error resulting in HEALTH_ERR. Two PGs are active+clean+inconsistent.
> When investigating the PG I see a "read_error" on the primary OSD. Both PGs are replicated PGs with 3 copies.
> I'm on Luminous 12.2.5 on this installation, is it safe to just run "ceph pg repair" on those PGs or will it then overwrite the two good copies with the bad one from the primary?
> If the latter is true, what is the correct way to resolve this?
k