Re: "ceph pg scrub" does not start

Dear All,

Sorry to bump the thread, but I still can't manually repair inconsistent
PGs on our Mimic cluster (13.2.0, upgraded from 12.2.5).

There are many similarities to an unresolved bug:

http://tracker.ceph.com/issues/15781

To give more examples of the problem:

The following commands appear to run OK, but *nothing* appears in the
OSD log to indicate that the commands are running. The OSDs are
otherwise working & logging OK.

# ceph pg scrub 4.e19
instructing pg 4.e19s0 on osd.246 to scrub

# ceph pg repair 4.e19
instructing pg 4.e19s0 on osd.246 to repair

# ceph osd scrub 246
instructed osd(s) 246 to scrub

# ceph osd repair 246
instructed osd(s) 246 to repair

It does not matter which osd or pg the repair is initiated on.

This command also fails:
# rados list-inconsistent-obj 4.e19
No scrub information available for pg 4.e19
error 2: (2) No such file or directory

From the OSD logs and 'ceph -s', I can see that the OSDs are still
doing automatic background PG scrubs, just not the ones I have asked
them to do. The OSDs are not busy with another scrub at the time of
my request.
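
One way to check whether a requested scrub ever ran is to compare the
PG's scrub timestamps before and after issuing the command. The sketch
below is only that: it assumes the 'ceph pg <pgid> query' output is
pretty-printed JSON with one key per line and contains
last_scrub_stamp and last_deep_scrub_stamp fields, and scrub_stamps is
a made-up helper name, not a Ceph command.

```shell
#!/bin/sh
# Hypothetical helper: read 'ceph pg <pgid> query' JSON on stdin and
# print only the scrub timestamp lines.
# Assumption: the JSON is pretty-printed, one key per line, and uses the
# field names last_scrub_stamp and last_deep_scrub_stamp.
scrub_stamps() {
    grep -E '"last_(deep_)?scrub_stamp"' |
        sed -e 's/^[[:space:]]*//' -e 's/,$//'   # trim indent and trailing comma
}

# Usage (run before and after 'ceph pg scrub 4.e19' and compare):
#   ceph pg 4.e19 query | scrub_stamps
```

If the stamps are unchanged some time after the request, the scrub
most likely never started on the OSD.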

Could it be that my commands are not being sent to the OSDs?

Any idea on how to debug this?

...

Further info:

Output of 'ceph pg 4.e19 query' is here:
http://p.ip.fi/9x5v

Output of 'ceph daemon osd.246 config show' is here:
http://p.ip.fi/RAuk

Cluster has 10 nodes, 128GB RAM, dual Xeon
450 Bluestore SATA OSDs, EC 8:2
4 NVMe OSDs, replicated
Used for CephFS (2.3PB), daily snapshots only

# ceph health detail
HEALTH_ERR 9500031/5149746146 objects misplaced (0.184%); 80 scrub errors; Possible data damage: 7 pgs inconsistent
OBJECT_MISPLACED 9500031/5149746146 objects misplaced (0.184%)
OSD_SCRUB_ERRORS 80 scrub errors
PG_DAMAGED Possible data damage: 7 pgs inconsistent
    pg 4.ff is active+clean+inconsistent, acting [318,403,150,13,225,261,382,175,282,324]
    pg 4.2e2 is active+clean+inconsistent, acting [352,59,328,451,195,119,42,66,158,150]
    pg 4.551 is active+clean+inconsistent, acting [391,105,124,150,205,22,269,184,293,91]
    pg 4.61c is active+clean+inconsistent, acting [382,131,84,35,282,214,236,366,309,150]
    pg 4.8cd is active+clean+inconsistent, acting [353,58,5,252,187,183,323,150,387,32]
    pg 4.a20 is active+clean+inconsistent, acting [346,104,398,282,225,133,150,70,165,17]
    pg 4.e19 is active+clean+inconsistent, acting [246,447,245,98,170,348,111,155,150,295]
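
With this many inconsistent PGs, it may be worth checking whether any
single OSD appears in every acting set. A minimal sketch, assuming the
'ceph health detail' text is piped in on stdin and that 'grep -o' is
available (GNU and BSD grep both have it); common_acting_osds is a
hypothetical helper name, not part of Ceph.

```shell
#!/bin/sh
# Hypothetical helper: read 'ceph health detail' text on stdin and print
# the OSD ids that occur in every acting set found in it.
common_acting_osds() {
    tr -d '\n' |                      # rejoin acting sets wrapped onto a new line
        grep -o '\[[0-9][0-9,]*\]' |  # one "[id,id,...]" list per output line
        tr -d '[]' |                  # drop the brackets, keep "id,id,..."
        awk -F, '
            { sets++; for (i = 1; i <= NF; i++) count[$i]++ }
            END { for (id in count) if (count[id] == sets) print id }
        ' |
        sort -n
}

# Usage:
#   ceph health detail | common_acting_osds
```

Applied to the seven acting sets listed above, this prints 150, the
only OSD common to all of them. That is just an observation from the
posted output, not a diagnosis.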

Again, thanks for any advice,

Jake