confusion when killing 3 OSDs that store the same PG

Comments inline.

On Thu, Sep 18, 2014 at 8:33 PM, yuelongguang <fastsync at 163.com> wrote:

>
> 1.
> [root@cephosd5-gw current]# ceph pg 2.30 query
> Error ENOENT: i don't have pgid 2.30
>
> Why can't I query information about this PG? How do I dump this PG?
>


I haven't actually tried this, but I expect something like the following: the
primary OSD holds all the data about the PG.  In your next question, you show
the acting set as [4,1].  But you shut down all of the OSDs that actually had
pg 2.30 before osd.4 or osd.1 could backfill, so osd.4 doesn't know anything
about pg 2.30.

If you bring up one of the other OSDs, osd.4 and osd.1 can backfill, and
then osd.4 will be able to answer your query.
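
I haven't tried this against your exact situation, but if it were my cluster I
would first check where the PG is stuck and then bring one of the original OSDs
back up, roughly like this (osd.2 is just a placeholder for whichever OSD you
shut down, and the sysvinit service invocation may differ on your install):

# show stale/down PGs and which OSDs the cluster is waiting on
ceph health detail
ceph pg dump_stuck stale

# confirm the CRUSH-calculated mapping for the PG
ceph pg map 2.30

# bring one of the original OSDs back, then watch recovery/backfill
service ceph start osd.2
ceph -w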

If this were a real 3-disk failure, you would have lost this PG and all the
data in it.
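
In that case the only way forward (untested by me, so double-check the
documentation before running any of this; osd.2 is a placeholder id) would be
to tell the cluster that those OSDs and their data are permanently gone,
something along these lines:

# declare a dead OSD permanently lost so peering stops waiting for it
ceph osd lost 2 --yes-i-really-mean-it

# then remove it from CRUSH, auth, and the OSD map
ceph osd crush remove osd.2
ceph auth del osd.2
ceph osd rm 2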



>
> 2.
> #ceph osd map rbd rbd_data.19d92ae8944a.0000000000000000
> osdmap e1451 pool 'rbd' (2) object
> 'rbd_data.19d92ae8944a.0000000000000000' -> pg 2.c59a45b0 (2.30) -> up
> ([4,1], p4) acting ([4,1], p4)
>
> Does the 'ceph osd map' command just calculate the mapping, without
> checking the real PG state?  I can't find 2.30 on osd.1 or osd.4.
> Now that the client gets the new map, why does the client hang?
>

I know less about RBD.  I have seen Ceph block reads because the current
primary OSD doesn't have the latest data about the PG.  Once the current
primary gets the history it is missing from the previous primary, it can
start to return data.
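
If you want to see what the client is actually blocked on, you could ask the
acting primary (osd.4 in your map output) over its admin socket; I'm assuming
the default socket path here, adjust it for your install:

# cluster-wide slow/blocked request warnings
ceph health detail

# on the node running osd.4: list the ops it currently has in flight
ceph --admin-daemon /var/run/ceph/ceph-osd.4.asok dump_ops_in_flight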