Hi again,

A while ago my Ceph cluster started warning: 3 pgs not deep-scrubbed in time. Searching suggested increasing osd_scrub_begin_hour and osd_scrub_end_hour, but that did not seem to help. There was a discussion on the Proxmox forum about a similar situation; the poster ran "ceph osd repair all" and got it fixed, but a day after I executed it nothing had changed. Continuing to search, I came across a blog and ran:

  ceph tell osd.* injectargs --osd_max_scrubs=100
  ceph tell mon.* injectargs --osd_max_scrubs=100

That was the wrong move, and the madness began: PGs went active+clean+scrubbing+deep+repair. I lowered the setting again immediately, but it was too late. Now:

  cluster:
    id:     48ff8b6e-1203-4dc8-b16e-d1e89f66e28f
    health: HEALTH_ERR
            110 scrub errors
            Too many repaired reads on 1 OSDs
            Possible data damage: 12 pgs inconsistent
            16 pgs not deep-scrubbed in time
            23 slow ops, oldest one blocked for 183 sec, daemons
            [osd.1,osd.13,osd.14,osd.15,osd.16,osd.17,osd.18,osd.19,osd.2,osd.22]... have slow ops.

  services:
    mon: 3 daemons, quorum ceph-node-1,ceph-node-2,ceph-node-3 (age 5M)
    mgr: ceph-node-2(active, since 7M), standbys: ceph-node-1, ceph-node-3
    osd: 32 osds: 32 up (since 21h), 32 in (since 4M)

  data:
    pools:   2 pools, 1025 pgs
    objects: 6.78M objects, 25 TiB
    usage:   76 TiB used, 41 TiB / 118 TiB avail
    pgs:     624 active+clean
             389 active+clean+scrubbing+deep+repair
             12  active+clean+scrubbing+deep+inconsistent

  io:
    client: 6.9 MiB/s rd, 18 MiB/s wr, 648 op/s rd, 1.21k op/s wr

ceph osd perf:

  osd  commit_latency(ms)  apply_latency(ms)
   31                  11                 11
   28                  17                 17
   25                   1                  1
   24                   5                  5
   21                   1                  1
   17                   6                  6
    7                   0                  0
   30                  16                 16
   29                  13                 13
   26                  37                 37
   19                   6                  6
    3                  12                 12
    2                   4                  4
    1                   2                  2
    0                  15                 15
   13                  27                 27
   15                  33                 33
   12                  21                 21
   14                  36                 36
   18                  15                 15
    9                  26                 26
    8                   5                  5
    6                   1                  1
    5                   1                  1
    4                   6                  6
   27                   1                  1
   23                   5                  5
   10                  11                 11
   11                  17                 17
   20                   6                  6
   16                   6                  6
   22                   0                  0

In the past 30+ hours, apart from the number of inconsistent PGs growing, the count of active+clean+scrubbing+deep+repair PGs has not changed.

The current Ceph scrub configuration was set with:

  ceph tell osd.* injectargs '--osd_scrub_begin_hour 0'
  ceph tell osd.* injectargs '--osd_scrub_end_hour 0'
  ceph tell mon.* injectargs '--osd_scrub_begin_hour 0'
  ceph tell mon.* injectargs '--osd_scrub_end_hour 0'
  ceph tell osd.* injectargs '--osd_max_scrubs 10'
  ceph tell osd.* injectargs '--osd_scrub_chunk_min 5'
  ceph tell osd.* injectargs '--osd_scrub_chunk_max 25'
  ceph tell osd.* injectargs '--osd_deep_scrub_stride 196608'
  ceph tell osd.* injectargs '--osd_scrub_priority 5'
  ceph tell osd.* injectargs '--osd_scrub_load_threshold 10'

What should I do? Do I just have to wait for Ceph to finish? Or is there a way to stop the repair? I have heard that restarting the OSDs can help, but I am afraid to do that now, because it may make the damage worse.

Thanks for any suggestions!
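P.S. Unless someone advises otherwise, my tentative plan is to throttle scrubbing back down and look at the damage one PG at a time, roughly like this (the pool name and PG id below are placeholders; corrections welcome):

  # throttle concurrent scrubs back to 1 (the default, as far as I know),
  # using the same injectargs mechanism as above
  ceph tell osd.* injectargs '--osd_max_scrubs 1'

  # see exactly which PGs are inconsistent
  ceph health detail

  # list the inconsistent PGs per pool, then inspect one PG's objects
  rados list-inconsistent-pg <pool-name>
  rados list-inconsistent-obj <pg-id> --format=json-pretty

Does that look sane, or will lowering osd_max_scrubs in the middle of all these repairs cause more trouble?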