Hi all,

This email describes the design of two components:

1) ReplicaDaemon: the daemon running on a host with DCPMM & RNIC (RDMA-NIC), and what info it reports to Ceph/Monitor.
2) ReplicaMonitor: a new PaxosService in Ceph/Monitor that manages the ReplicaDaemons' info and handles librbd's requests by selecting the appropriate ReplicaDaemons and returning their info to librbd.

This email does not cover how librbd communicates with a ReplicaDaemon after it has received the ReplicaDaemons' info, or how the replication itself is completed.

RFC PR: [WIP] aggregate client state and route info
https://github.com/ceph/ceph/pull/37931

Detail:

+-----------------------------------+ +-----------------------------------------------+
|+---------------------------------+| |                         +--------------------+|
|| ReplicaDaemonInfo:              || |                         |PaxosServiceMessage ||
||                                 || |+---------------------------------------------+|
|| daemon_id;                      || ||MReplicaDaemonBlink(MSG_REPLICADAEMON_BLINK):||
|| rnic_bind_port;                 || ||                                             ||
|| rnic_addr;                      || ||ReplicaDaemonInfo;                           ||
|| free_size;                      || |+---------------------------------------------+|
|+---------------------------------+| |                         +--------------------+|
|+---------------------------------+| |                         |PaxosServiceMessage ||
|| ReqReplicaDaemonInfo:           || |+---------------------------------------------+|
||                                 || ||MMonGetReplicaDaemonMap(CEPH_MSG_MON_GET_REPL||
|| replicas;                       || ||ICADAEMONMAP):                               ||
|| replica_size;                   || ||                                             ||
|+---------------------------------+| ||ReqReplicaDaemonInfo;                        ||
|+---------------------------------+| |+---------------------------------------------+|
|| ReplicaDaemonMap:               || |                                      +-------+|
||                                 || |                                      |Message||
|| std::vector<ReplicaDaemonInfo>; || |+---------------------------------------------+|
|+---------------------------------+| ||MReplicaDaemonMap(CEPH_MSG_REPLICADAEMON_MAP)||
| MetaData(need encode/decode)      | ||                                             ||
|                                   | ||                                             ||
|                                   | ||ReplicaDaemonMap;                            ||
|                                   | |+---------------------------------------------+|
|                                   | |                                               |
|                                   | | Three messages defined for the MetaData       |
+-----------------------------------+ +-----------------------------------------------+

             +--------+                                                 +------------+
             |Dispatch|                                                 |PaxosService|
+---------------------+     Update ReplicaDaemonInfo     +---------------------------+
| ReplicaDaemon:      |             through              | ReplicaMonitor:           |
|                     |       MReplicaDaemonBlink        |                           |
| ReplicaDaemonInfo;  -----------------------------------> ReplicaDaemonMap;         |
|                     |                                  |                           |
| ms_dispatch;        |                                  | //Need implement some APIs|
+---------------------+                                  +------^-------------|------+
                                      Request ReplicaDaemonMap  |             | Feedback ReplicaDaemonMap
                                              through           |             | through
                                      MMonGetReplicaDaemonMap   |             | MReplicaDaemonMap
                                                         +------|-------------v------+
                                                         |          librbd           |
                                                         +---------------------------+

ReplicaDaemon reports its ReplicaDaemonInfo to ReplicaMonitor in an MReplicaDaemonBlink message. ReplicaMonitor stores all the ReplicaDaemonInfo entries in ReplicaDaemonMap after they go through Paxos.

The client (librbd) sends MMonGetReplicaDaemonMap to ReplicaMonitor. ReplicaMonitor chooses the appropriate ReplicaDaemons, packs their info into a new ReplicaDaemonMap, and sends it back to the client in an MReplicaDaemonMap message.

B.R.
Changcheng
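
For illustration only, the metadata and the selection step described above could be sketched in C++ roughly as follows. The field names are taken from the diagram. The field types, the meaning assumed for replicas and replica_size, the member name daemons, and the select_replicas() policy (keep daemons with enough free_size, prefer the ones with the most free space) are all assumptions made for this sketch, not what the PR implements. Encode/decode and the Paxos path are left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Field names follow the diagram; every field type here is an assumption.
struct ReplicaDaemonInfo {
  uint64_t daemon_id;
  uint32_t rnic_bind_port;
  std::string rnic_addr;
  uint64_t free_size;  // assumed to be free space in bytes
};

struct ReqReplicaDaemonInfo {
  uint32_t replicas;      // assumed: number of replicas the client wants
  uint64_t replica_size;  // assumed: space needed on each replica
};

struct ReplicaDaemonMap {
  std::vector<ReplicaDaemonInfo> daemons;  // member name is hypothetical
};

// Hypothetical selection policy: keep daemons that can hold replica_size,
// order them by free space (largest first), and return at most
// req.replicas of them in a new map.
ReplicaDaemonMap select_replicas(const ReplicaDaemonMap& all,
                                 const ReqReplicaDaemonInfo& req) {
  ReplicaDaemonMap out;
  for (const auto& d : all.daemons) {
    if (d.free_size >= req.replica_size) {
      out.daemons.push_back(d);
    }
  }
  std::sort(out.daemons.begin(), out.daemons.end(),
            [](const ReplicaDaemonInfo& a, const ReplicaDaemonInfo& b) {
              return a.free_size > b.free_size;
            });
  if (out.daemons.size() > req.replicas) {
    out.daemons.resize(req.replicas);
  }
  assert(out.daemons.size() <= req.replicas);
  return out;
}
```

In this sketch the returned map can hold fewer entries than requested when not enough daemons have the space; how the real ReplicaMonitor should react in that case is an open design question.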