Hi!
Thanks!
I have some comments on the 1st method:
>You could get the name prefix for each RBD from rbd info,
Yes, I already do that in steps 1 and 2. I forgot to mention that I take the RBD prefix from the 'rbd info' command.
>then list all objects (run find on the osds?) and then you just need to grep the OSDs for each prefix.
So, you advise running find over ssh on all OSD hosts to traverse the OSD filesystems and find the files (objects)
whose names contain the RBD prefix? Am I right? If so, I have two thoughts: (1) it may not be very fast either, because
even when find is limited by RBD prefix and pool index, it has to recursively walk the whole OSD filesystem
hierarchy. And (2) find will put additional load on the OSD drives.
The second method is more attractive and I will try it soon. Since we have the object name
and can get the crushmap in some usable form, to inspect ourselves or indirectly through a
library/API call, finding the object-to-PG-to-OSDs chain is a local computational
task, and it can be done without remote calls (accessing OSD hosts, running find, etc.).
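
To illustrate the local part of that chain, here is a minimal Python sketch of how a raw object hash is folded into a PG number, modelled on Ceph's "stable mod" rule. It deliberately does not reproduce the object-name hash or the CRUSH step from PG to OSDs; the function names are my own, and the sample raw hash and pg_num values are purely illustrative.

```python
def pg_mask(pg_num):
    """Smallest (2^n - 1) that covers pg_num - 1."""
    return (1 << (pg_num - 1).bit_length()) - 1


def stable_mod(x, b, bmask):
    """Fold x into the range [0, b) the way Ceph's ceph_stable_mod does."""
    if (x & bmask) < b:
        return x & bmask
    return x & (bmask >> 1)


def raw_hash_to_pg(raw_hash, pg_num):
    """Map a raw 32-bit object hash to a PG sequence number within one pool."""
    return stable_mod(raw_hash, pg_num, pg_mask(pg_num))


if __name__ == "__main__":
    # Illustrative values only: raw hash 0x7fc1f406 in a pool with 8 PGs.
    print(hex(raw_hash_to_pg(0x7FC1F406, 8)))
```

When pg_num is a power of two this is just a bit mask; the second branch only matters for other pg_num values.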
Also, the slowness of looping through 'ceph osd map <pool> <object>' can be explained:
for every object we have to spawn a process, connect to the cluster (with auth), receive
the maps on the client, calculate the placement, and ... finally throw it all away when the process
exits. I think this overhead is the main reason for the slowness.
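
One way to avoid the per-object process and auth overhead might be to keep a single cluster connection open and send the 'osd map' command through it for each object. The sketch below assumes the python-rados binding is installed, that the monitor accepts "osd map" with "pool" and "object" arguments, and that the JSON reply has "pgid" and "acting" fields; I have not verified this on Firefly, and the helper names and the config path are only examples. It still makes one monitor round trip per object, so it removes the startup cost, not the network cost.

```python
import json


def map_objects(mon_command, pool, object_names):
    """Yield (object_name, pgid, acting_osds) for each object name.

    mon_command is any callable with the shape of rados.Rados.mon_command:
    it takes (json_command_string, input_bytes) and returns
    (return_code, output_bytes, status_string).
    """
    for name in object_names:
        cmd = json.dumps({
            "prefix": "osd map",
            "pool": pool,
            "object": name,
            "format": "json",
        })
        ret, outbuf, outs = mon_command(cmd, b"")
        if ret != 0:
            raise RuntimeError("osd map failed for %s: %s" % (name, outs))
        info = json.loads(outbuf)
        # Field names are an assumption about the JSON reply format.
        yield name, info["pgid"], info["acting"]


def main(pool, object_names, conffile="/etc/ceph/ceph.conf"):
    # Imported here so map_objects() can be exercised without a cluster.
    import rados

    cluster = rados.Rados(conffile=conffile)
    cluster.connect()  # connect and authenticate once, not once per object
    try:
        for name, pgid, acting in map_objects(
                cluster.mon_command, pool, object_names):
            print(name, pgid, acting)
    finally:
        cluster.shutdown()
```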
Megov Igor
CIO, Yuterra
From: ceph-users <ceph-users-bounces@xxxxxxxxxxxxxx> on behalf of David Burley <david@xxxxxxxxxxxxxxxxx>
Sent: 25 September 2015, 17:15
To: Jan Schermer
Cc: ceph-users; Межов Игорь Александрович
Subject: Re: How to get RBD volume to PG mapping?
So I had two ideas here:
1. Use find as Jan suggested. You probably can bound it by the expected object naming and limit it to the OSDs that were impacted. This is probably the best way.
2. Use the osdmaptool against a copy of the osdmap that you pre-grab from the cluster, ala: https://www.hastexo.com/resources/hints-and-kinks/which-osd-stores-specific-rados-object
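
A rough Python sketch of idea 2, in case it helps: grab the osdmap once, then ask osdmaptool for placements offline. The flags follow my reading of the linked article ('ceph osd getmap -o' and 'osdmaptool --test-map-object ... --pool'), so please check them against 'osdmaptool --help' on your version. The helper names, the file path, and the object name are made up for illustration.

```python
import subprocess


def getmap_cmd(osdmap_path):
    """Command that saves the cluster's current osdmap to a local file."""
    return ["ceph", "osd", "getmap", "-o", osdmap_path]


def map_object_cmd(osdmap_path, object_name, pool_id):
    """Command that computes an object's placement from the saved osdmap."""
    return ["osdmaptool", osdmap_path,
            "--test-map-object", object_name,
            "--pool", str(pool_id)]


def map_object_offline(osdmap_path, object_name, pool_id):
    """Run osdmaptool against the local copy; no cluster connection needed."""
    return subprocess.check_output(
        map_object_cmd(osdmap_path, object_name, pool_id),
        universal_newlines=True)


if __name__ == "__main__":
    path = "/tmp/osdmap"  # example path
    subprocess.check_call(getmap_cmd(path))  # one cluster round trip in total
    # Hypothetical object name and pool id, for illustration only.
    print(map_object_offline(path, "example-object", 2))
```

This still starts one process per object, but each one only reads a local file instead of connecting and authenticating to the cluster.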
--David
On Fri, Sep 25, 2015 at 10:11 AM, Jan Schermer
<jan@xxxxxxxxxxx> wrote:
Ouch
1) I should have read it completely
2) I should have tested it :)
Sorry about that...
You could get the name prefix for each RBD from rbd info, then list all objects (run find on the osds?) and then you just need to grep the OSDs for each prefix... Should be much faster?
Jan
> On 25 Sep 2015, at 15:07, Межов Игорь Александрович <megov@xxxxxxxxxx> wrote:
>
> Hi!
>
> Last week I wrote that one PG in our Firefly cluster is stuck in a degraded state with 2 replicas instead of 3
> and does not try to backfill or recover. We are trying to find out which RBD volumes are affected.
>
> The working plan is inspired by Sébastien Han's snippet
> (http://www.sebastien-han.fr/blog/2013/11/19/ceph-rbd-objects-placement/)
> and consists of the following steps:
>
> 1. rbd -p <pool> ls - to list all RBD volumes in the pool
> 2. Get the RBD prefix corresponding to each volume
> 3. Get the list of objects that belong to our RBD volume
> 4. Issue 'ceph osd map <pool> <objectname>' to get the PG and OSD placement for each object
>
> After writing some scripts we faced a difficulty: running 'ceph osd map ...' and getting the object
> placement takes about 0.5 seconds, so iterating over all 15 million objects will take forever.
>
> Is there any other way to find which PGs a specified RBD volume is mapped to,
> or maybe there is a much faster way to do our step 4 instead of calling 'ceph osd map'
> in a loop for every object?
>
>
> Thanks!
>
> Megov Igor
> CIO, Yuterra
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com