We really don't know what the cluster state was. I assume by removing
the fifth host you removed the fullest OSD(s). But again, we only get
a few bits of information, so it's still basically guessing.
What exactly do you mean by "so I thought to run pool repair"? What
exactly did you do?
Quoting Devender Singh <devender@xxxxxxxxxx>:
Hello Eugen
Thanks for your reply.
ceph osd set nodeep-scrub does not stop repairs that are already
running. The repair kicked off another round of deep-scrub+repair
operations, which this flag does not control.
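As far as I can tell, the flag only prevents new deep-scrubs from
being scheduled; it does not abort in-flight ones. Roughly what I ran,
plus a check for PGs still scrubbing or repairing (a sketch; the
noscrub flag is an extra precaution I'm noting for completeness):

  # prevent NEW (deep-)scrubs from being scheduled; running ones continue
  ceph osd set noscrub
  ceph osd set nodeep-scrub

  # list PGs still in a scrubbing/repair state
  ceph pg dump pgs_brief | grep -E 'scrub|repair'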
When I started, my cluster utilization was 74%; now that it has
finished, the cluster is showing 43% (a surprising figure; can a
repair shuffle that much data? My OSDs were between 60% and 83%
utilized).
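To compare the before/after numbers I looked at the usual views
(a sketch; actual output omitted):

  # overall raw usage and per-pool usage
  ceph df detail

  # per-OSD utilization, to see the 60-83% spread
  ceph osd df tree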
Before starting the repair I manually increased the pool's pg_num
from 1024 to 2048, but the pool only came down to 84%, so I thought
to run pool repair.
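For reference, the split itself was just this (pool name is a
placeholder; on Nautilus and later, pgp_num follows pg_num
automatically):

  # double the placement groups for the pool
  ceph osd pool set <pool> pg_num 2048

  # watch the resulting splits/backfill
  ceph -s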
During the repair, while utilization was high, the cluster kept
complaining with "slow OSD communication from osd.x to osd.y"
warnings, many times over.
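To see which OSD pairs were affected I used (a sketch; osd.x/osd.y
above are placeholders):

  # full health warning text, including the affected OSD pairs
  ceph health detail

  # per-OSD commit/apply latency as a quick sanity check
  ceph osd perf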
I'm not sure what happened or how utilization came down that much. If
a repair can change things this much, there should be a command to
pause/unpause it too.
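The closest I could find are the cluster-wide flags that pause data
movement rather than the repair itself (a sketch; remember to unset
them again afterwards):

  # pause backfill/recovery/rebalance traffic (does not cancel a repair)
  ceph osd set nobackfill
  ceph osd set norecover
  ceph osd set norebalance

  # resume normal operation
  ceph osd unset nobackfill
  ceph osd unset norecover
  ceph osd unset norebalance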
Regards
Dev
On Jan 25, 2025, at 1:15 AM, Eugen Block <eblock@xxxxxx> wrote:
But they would only reveal inconsistent PGs during deep-scrub.