I have noticed something odd with the ceph-objectstore-tool command: it always reports "PG X not found", even on healthy OSDs/PGs. The 'list' op works on both healthy and unhealthy PGs.

________________________________________
From: ceph-users [ceph-users-bounces@xxxxxxxxxxxxxx] on behalf of george.vasilakakos at stfc.ac.uk [george.vasilakakos at stfc.ac.uk]
Sent: 21 February 2017 10:17
To: wido at 42on.com; ceph-users at lists.ceph.com; bhubbard at redhat.com
Subject: Re: PG stuck peering after host reboot

> Can you for the sake of redundancy post your sequence of commands you executed and their output?

[root@ceph-sn852 ~]# systemctl stop ceph-osd@307
[root@ceph-sn852 ~]# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-307 --op info --pgid 1.323
PG '1.323' not found
[root@ceph-sn852 ~]# systemctl start ceph-osd@307

I did the same thing for 307 (new up but not acting primary) and all the OSDs in the original set (including 595). The output was exactly the same. I don't have the whole session log handy from all those sessions, but here's a sample from one that's easy to pick out:

[root@ceph-sn832 ~]# systemctl stop ceph-osd@7
[root@ceph-sn832 ~]# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-7 --op info --pgid 1.323
PG '1.323' not found
[root@ceph-sn832 ~]# systemctl start ceph-osd@7
[root@ceph-sn832 ~]# ll /var/lib/ceph/osd/ceph-7/current/
0.18_head/       11.1c8s5_TEMP/  13.3b_head/    1.74s1_TEMP/    2.256s6_head/   2.c3s10_TEMP/  3.b9s4_head/
0.18_TEMP/       1.16s1_head/    13.3b_TEMP/    1.8bs9_head/    2.256s6_TEMP/   2.c4s3_head/   3.b9s4_TEMP/
1.106s10_head/   1.16s1_TEMP/    1.3a6s0_head/  1.8bs9_TEMP/    2.2d5s2_head/   2.c4s3_TEMP/   4.34s10_head/
1.106s10_TEMP/   1.274s5_head/   1.3a6s0_TEMP/  2.174s10_head/  2.2d5s2_TEMP/   2.dbs7_head/   4.34s10_TEMP/
11.12as10_head/  1.274s5_TEMP/   1.3e4s9_head/  2.174s10_TEMP/  2.340s8_head/   2.dbs7_TEMP/   commit_op_seq
11.12as10_TEMP/  1.2ds8_head/    1.3e4s9_TEMP/  2.1c1s10_head/  2.340s8_TEMP/   3.159s3_head/  meta/
11.148s2_head/   1.2ds8_TEMP/    14.1a_head/    2.1c1s10_TEMP/  2.36es10_head/  3.159s3_TEMP/  nosnap
11.148s2_TEMP/   1.323s8_head/   14.1a_TEMP/    2.1d0s6_head/   2.36es10_TEMP/  3.170s1_head/  omap/
11.165s6_head/   1.323s8_TEMP/   1.6fs9_head/   2.1d0s6_TEMP/   2.3d3s10_head/  3.170s1_TEMP/
11.165s6_TEMP/   13.32_head/     1.6fs9_TEMP/   2.1efs2_head/   2.3d3s10_TEMP/  3.1aas5_head/
11.1c8s5_head/   13.32_TEMP/     1.74s1_head/   2.1efs2_TEMP/   2.c3s10_head/   3.1aas5_TEMP/
[root@ceph-sn832 ~]# ll /var/lib/ceph/osd/ceph-7/current/1.323s8_
1.323s8_head/  1.323s8_TEMP/
[root@ceph-sn832 ~]# ll /var/lib/ceph/osd/ceph-7/current/1.323s8_head/DIR_3/DIR_2/DIR_
DIR_3/  DIR_7/  DIR_B/  DIR_F/
[root@ceph-sn832 ~]# ll /var/lib/ceph/osd/ceph-7/current/1.323s8_head/DIR_3/DIR_2/DIR_3/DIR_
DIR_0/  DIR_1/  DIR_2/  DIR_3/  DIR_4/  DIR_5/  DIR_6/  DIR_7/  DIR_8/  DIR_9/  DIR_A/  DIR_B/  DIR_C/  DIR_D/  DIR_E/  DIR_F/
[root@ceph-sn832 ~]# ll /var/lib/ceph/osd/ceph-7/current/1.323s8_head/DIR_3/DIR_2/DIR_3/DIR_1/
total 271276
-rw-r--r--. 1 ceph ceph 8388608 Feb 3 22:07 datadisk\srucio\sdata16\u13TeV\s11\sad\sDAOD\uTOPQ4.09383728.\u000436.pool.root.1.0000000000000001__head_2BA91323__1_ffffffffffffffff_8

> If you run a find in the data directory of the OSD, does that PG show up?

OSDs 595 (used to be 0), 1391 (1), 240 (2) and 7 (7, the one that started this) have a 1.323sX_head directory. OSD 307 does not. I have not checked the other OSDs in the PG yet.

Wido

> > Best regards,
> >
> > George

_______________________________________________
ceph-users mailing list
ceph-users at lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
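
One thing the directory listing suggests, offered as an assumption rather than something confirmed in this thread: the on-disk directories carry a shard suffix (1.323s8_head), which is how erasure-coded PG shards are named, so ceph-objectstore-tool may need the shard-qualified ID (for example --pgid 1.323s8 on osd.7) rather than the bare 1.323. Below is a minimal shell sketch for deriving those IDs from a FileStore current/ directory. The function name pg_shard_ids is hypothetical, and it only looks at directory names; it does not query Ceph.

```shell
# Print the PG IDs (with shard suffix, if any) that exist on disk for a
# given base PG ID. Hypothetical helper; assumes the FileStore layout
# shown in the listing, i.e. <pgid>_head or <pgid>s<shard>_head.
pg_shard_ids() {
    current_dir=$1   # e.g. /var/lib/ceph/osd/ceph-7/current
    pgid=$2          # e.g. 1.323
    for d in "$current_dir"/"$pgid"_head "$current_dir"/"$pgid"s*_head; do
        # An unmatched glob stays literal, so skip anything that is not a directory.
        [ -d "$d" ] || continue
        name=${d##*/}                    # strip the leading path
        printf '%s\n' "${name%_head}"    # strip the _head suffix
    done
}
```

With the listing shown for osd.7, `pg_shard_ids /var/lib/ceph/osd/ceph-7/current 1.323` would print `1.323s8`, which could then be tried as the `--pgid` argument while the OSD is stopped.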