Re: About the data movement in Ceph

Thanks, Sage, for helping me understand Ceph much more deeply!

Recently I have come up with some more questions, as follows:

1. As we know, ceph -s gives a summary of the system's state. Is there any tool to monitor the details of data movement when the CRUSH map is changed?

2. In my understanding, the mapping between an object and its PG is stable; only the mapping between PGs and OSDs needs to change when the CRUSH map changes. Is that right?

3. Suppose there are two pools in Ceph, each with its own CRUSH root and its own OSDs as leaves. If I move an OSD from one root to the other, will the PGs on that OSD be migrated within the original pool, or rebalanced into the target pool?

4. crushtool is a very cool tool for understanding CRUSH, but I don't know how to use --show-utilization (show OSD usage). What arguments or actions do I need to add on the command line?
    Is there any CLI command that can query each OSD's usage and statistics?

5. I see that librados offers the API rados_ioctx_pool_stat(rados_ioctx_t io, struct rados_pool_stat_t *stats). If I want to query statistics for several pools, do I need to create a separate rados_ioctx_t (or cluster handle) for each pool? I hit a segmentation fault on the call to rados_ioctx_pool_stat.
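
For context, here is a minimal sketch of the pattern I am trying (one shared cluster handle, one ioctx per pool); the pool names and conf path are just placeholders:

#include <stdio.h>
#include <string.h>
#include <rados/librados.h>

int main(void)
{
    /* Pool names and conf path are placeholders for illustration. */
    const char *pools[] = { "pool_a", "pool_b" };
    rados_t cluster;
    int i, ret;

    ret = rados_create(&cluster, NULL);       /* connect as client.admin */
    if (ret < 0) { fprintf(stderr, "rados_create: %d\n", ret); return 1; }

    rados_conf_read_file(cluster, "/etc/ceph/ceph.conf");
    ret = rados_connect(cluster);
    if (ret < 0) { fprintf(stderr, "rados_connect: %d\n", ret); return 1; }

    for (i = 0; i < 2; i++) {
        rados_ioctx_t io;
        struct rados_pool_stat_t st;

        /* One ioctx per pool; the rados_t cluster handle is shared. */
        ret = rados_ioctx_create(cluster, pools[i], &io);
        if (ret < 0) {
            fprintf(stderr, "ioctx_create(%s): %d\n", pools[i], ret);
            continue;
        }

        memset(&st, 0, sizeof(st));
        ret = rados_ioctx_pool_stat(io, &st);
        if (ret == 0)
            printf("%s: %llu objects, %llu KB\n", pools[i],
                   (unsigned long long)st.num_objects,
                   (unsigned long long)st.num_kb);
        else
            fprintf(stderr, "pool_stat(%s): %d\n", pools[i], ret);

        rados_ioctx_destroy(io);
    }

    rados_shutdown(cluster);
    return 0;
}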

Looking forward to your kind reply!




2013/9/11 Sage Weil <sage@xxxxxxxxxxx>
On Tue, 10 Sep 2013, atrmat wrote:
> Hi all,
> recently I read the source code and the paper, and I have some questions
> about the data movement:
> 1. When OSDs are added or removed, how does Ceph do the data migration and
> rebalancing against the CRUSH map? Does RADOS modify the CRUSH map or the
> cluster map, and does the primary OSD move the data according to the
> cluster map? Where can I find the data-migration code in the source?

The OSDMap changes when an OSD is added or removed (or some other event
or administrator action happens).  In response, the OSDs recalculate where
the PGs should be stored and move data accordingly.
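
(A toy illustration of that recalculation, not Ceph's actual CRUSH code:
placement is a pure function of the PG id and the map, so each OSD can
recompute it from the new map and only the PGs whose result changed get
moved.)

#include <stdio.h>
#include <stdint.h>

/* Toy placement, NOT CRUSH: pick the OSD with the highest hash(pg, osd)
 * among the OSDs present in the "map" (rendezvous hashing). */
static uint32_t mix(uint32_t a, uint32_t b)
{
    uint32_t h = a * 2654435761u ^ b * 2246822519u;
    h ^= h >> 15; h *= 2654435761u; h ^= h >> 13;
    return h;
}

static int place(int pg, const int *osds, int n)
{
    int best = osds[0], i;
    uint32_t best_h = mix(pg, osds[0]);
    for (i = 1; i < n; i++) {
        uint32_t h = mix(pg, osds[i]);
        if (h > best_h) { best_h = h; best = osds[i]; }
    }
    return best;
}

int main(void)
{
    int before[] = { 0, 1, 2 };       /* old "map"                 */
    int after[]  = { 0, 1, 2, 3 };    /* new "map": osd.3 added    */
    int pg, moved = 0, npg = 32;

    for (pg = 0; pg < npg; pg++) {
        int a = place(pg, before, 3);
        int b = place(pg, after, 4);
        if (a != b) {
            printf("pg %2d: osd.%d -> osd.%d\n", pg, a, b);
            moved++;
        }
    }
    printf("%d of %d PGs remapped\n", moved, npg);
    return 0;
}

In this toy, as with CRUSH's aim of minimal movement, adding osd.3 only
remaps the PGs that now land on osd.3; everything else stays put.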

> 2. When an OSD is down or has failed, how does Ceph recover the data onto
> other OSDs? Does the primary OSD copy the PG to the newly chosen OSD?

The (new) primary figures out where data is/was (peering) and then
coordinates any data migration (recovery) to where the data should now be
(according to the latest OSDMap and its embedded CRUSH map).

> 3. The OSD has 4 status bits: up, down, in, out. But I can't find a defined
> status CEPH_OSD_DOWN; does the OSD call the function mark_osd_down() to
> modify the OSD status in the OSDMap?

See OSDMap.h: is_up() and is_down().  For in/out, it is either binary
(is_in() and is_out()) or can be somewhere in between; see get_weight().

Hope that helps!

sage

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
