What does `ceph pg 6.263 query` show you?
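If the full output is long, dumping it to a file and attaching it is fine too; something along these lines (the file name is just an example):

# ceph pg 6.263 query > /tmp/pg-6.263-query.txt

The recovery_state and peer info sections are usually the interesting bits for an inconsistency.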
On Thu, Jun 30, 2016 at 12:02 PM, Goncalo Borges <goncalo.borges@xxxxxxxxxxxxx> wrote:
Dear Cephers...
Today our ceph cluster gave us a couple of scrub errors regarding inconsistent pgs. We just upgraded from 9.2.0 to 10.2.2 two days ago.
# ceph health detail
HEALTH_ERR 2 pgs inconsistent; 2 scrub errors; crush map has legacy tunables (require bobtail, min is firefly)
pg 6.39c is active+clean+inconsistent, acting [2,60,32]
pg 6.263 is active+clean+inconsistent, acting [56,39,6]
2 scrub errors
crush map has legacy tunables (require bobtail, min is firefly); see http://ceph.com/docs/master/rados/operations/crush-map/#tunables
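Since we are now on Jewel, I understand the scrub findings can also be listed directly from the cluster instead of grepping the OSD logs (I have not cross-checked this output yet, so take it as a pointer only):

# rados list-inconsistent-obj 6.263 --format=json-pretty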
We started by looking at pg 6.263. The errors were appearing only in the osd.56 logs, not in the others.
# cat ceph-osd.56.log-20160629 | grep -Hn 'ERR'
(standard input):8569:2016-06-29 08:09:50.952397 7fd023322700 -1 log_channel(cluster) log [ERR] : scrub 6.263 6:c645f18e:::100002a343d.00000000:head on disk size (1836) does not match object info size (41242) adjusted for ondisk to (41242)
(standard input):8602:2016-06-29 08:11:11.227865 7fd023322700 -1 log_channel(cluster) log [ERR] : 6.263 scrub 1 errors
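For completeness, the equivalent check on the hosts carrying osd.39 and osd.6 showed no ERR lines (the log file names follow the same pattern; paths assume the default /var/log/ceph location):

# grep -Hn 'ERR' /var/log/ceph/ceph-osd.39.log-20160629
# grep -Hn 'ERR' /var/log/ceph/ceph-osd.6.log-20160629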
So, we did a 'ceph pg repair 6.263'.
Eventually, that pg went back to 'active+clean'.
# ceph pg dump | grep ^6.263
dumped all in format plain
6.263 10845 0 0 0 0 39592671010 3037 3037 active+clean 2016-06-30 02:13:00.455293 1005'2126237 1005:2795768 [56,39,6] 56 [56,39,6] 56 1005'2076134 2016-06-30 02:13:00.455256 1005'2076134 2016-06-30 02:13:00.455256
However, in the logs I found:
2016-06-30 02:03:03.992240 osd.56 192.231.127.226:6801/21569 278 : cluster [INF] 6.263 repair starts
2016-06-30 02:13:00.455237 osd.56 192.231.127.226:6801/21569 279 : cluster [INF] 6.263 repair ok, 0 fixed
I did not like the '0 fixed'.
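Before going further I may simply re-trigger a deep scrub on that pg and watch whether the error comes back; as far as I understand, this just asks the primary to schedule it:

# ceph pg deep-scrub 6.263
# ceph -w | grep 6.263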
Inspecting a bit more, I found that the object inside the pg is changing size on all of the involved OSDs. For example, on osd.56 (but the same is true on 39 and 6), consecutive 'ls -l' commands show:
# ls -l /var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
-rw-r--r-- 1 ceph ceph 8602 Jun 30 02:53 /var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
[root@rccephosd8 ceph]# ls -l /var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
-rw-r--r-- 1 ceph ceph 170 Jun 30 02:53 /var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
# ls -l /var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
-rw-r--r-- 1 ceph ceph 15436 Jun 30 02:53 /var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
# ls -l /var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
-rw-r--r-- 1 ceph ceph 26044 Jun 30 02:53 /var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
# ls -l /var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
-rw-r--r-- 1 ceph ceph 0 Jun 30 02:53 /var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
# ls -l /var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
-rw-r--r-- 1 ceph ceph 14076 Jun 30 02:53 /var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
# ls -l /var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
-rw-r--r-- 1 ceph ceph 31110 Jun 30 02:53 /var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
# ls -l /var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
-rw-r--r-- 1 ceph ceph 0 Jun 30 02:53 /var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
# ls -l /var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
-rw-r--r-- 1 ceph ceph 20230 Jun 30 02:53 /var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
# ls -l /var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
-rw-r--r-- 1 ceph ceph 23392 Jun 30 02:53 /var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
# ls -l /var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
-rw-r--r-- 1 ceph ceph 0 Jun 30 02:53 /var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
# ls -l /var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
-rw-r--r-- 1 ceph ceph 0 Jun 30 02:53 /var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
# ls -l /var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
-rw-r--r-- 1 ceph ceph 41412 Jun 30 02:53 /var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
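For what it is worth, the same behaviour can be watched with a small loop instead of re-running ls by hand (the -c format string is GNU coreutils stat), and we see the equivalent on osd.39 and osd.6:

# while sleep 1; do stat -c '%s %y' /var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6; done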
From the size checks I did before applying the repair, I know that the size of the object should be 41412. The initial error also says that.
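For reference, the size as the cluster sees it can also be checked with rados stat (the pool name below is just a placeholder for whatever pool 6 maps to in our setup; ceph osd lspools shows the mapping):

# ceph osd lspools
# rados -p <pool-name> stat 100002a343d.00000000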
So what is actually going on here?
Cheers
G.
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com