Thanks for your help! ^^ I really appreciate it.

2011/4/28 Sage Weil <sage@xxxxxxxxxxxx>:
> On Wed, 27 Apr 2011, doki74216@xxxxxxxxx wrote:
>> Dear developers,
>>
>> I am testing the reliability of the MDS in the Ceph file system.
>> As we know, with the default settings there is one active MDS and one standby MDS.
>> I want to test the reliability of the MDS.
>> Here is my testing scenario:
>> To keep it simple, I assume the active MDS is mds0 and the
>> standby one is mds1.
>> I write data and, at the same time, stop the mds0 daemon with "ceph mds stop 0".
>> Will the standby mds1 change its status to active?
>> I think the system should keep working and no data should be lost even
>> though there is just one MDS.
>> But I ran into many problems.... ><
>>
>> (1) I want to know which MDS is active and which is standby.
>> But I get different answers when I type different commands.
>>
>> 1. When I type "ceph mds stat",
>> it shows: (0=up: active), 1 up: standby
>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Does it mean mds0
>
> Right, the 0 means mds0.
>
>> is active and mds1 is standby?
>> 2. When I type "ceph mds dump -o -",
>> it shows: mds0 is standby and mds1 is active.
>>
>> My question is: why are mds0 and mds1 reported with different statuses?
>> Which one is correct?
>
> It sounds like you named the MDS's with numeric identifiers. You should
> use non-numeric names to avoid this confusion, like mds.a and mds.b. The
> numeric role/rank (mds0, mds1) is assigned to cmds instances dynamically
> based on who is up and needed in which role at the time.
>
>> (2) I want to ask about stopping the active MDS.
>> If "ceph mds stat" shows the active MDS correctly,
>> the active MDS must be mds0 and the other one, mds1, is standby.
>> So I type "ceph mds stop 0".
>> It shows: "telling mds0 192.168.200.185:6800/14819 to stop"(0)
>> ^^^^^^^^^^^^^^^^^^ mds1's IP.
>> 1. Why does it show the system stopping mds0, but with mds1's IP?
>
> The 'stop' command takes the dynamic role, not the name.
>
> Hope this clears it up!
> sage
>
>
>>
>> 2. When I type "ceph -w", here is the log:
>> ======================================
>> mds e43: 1/1/1 up {0=up:active}, 1 up standby
>> mds e44: 1/1/1 up {0=up:stoping}, 1 up standby
>> mds e45: 1/1/1 up {0=up:replay}
>> mds e46: 1/1/1 up {0=up:reconnect}
>> mds e47: 1/1/1 up {0=up:rejoin}
>> mds e44: 1/1/1 up {0=up:active}
>> =====================================
>> From the log, which MDS did the system stop?
>> My command was "ceph mds stop 0", so shouldn't the log show {1=up:active}?
>> It really confuses me...
>>
>> 3. When I type "ceph mds dump -o -", here is the log:
>> 4920: 192.168.200.184:6800/30000 '0' mds0.12 up:active seq 40828
>> ^^^^^^^^^^^^^^^^^^^ mds0's IP
>> Why does the system leave mds0?
>>
>> Please help me solve these problems, I am very confused...
>> Thanks a lot~~~ ^^
>>
>>
>> By the way, this is my testing environment:
>> ==================================================================================
>> I set up 7 servers: 3 MONs (host1 host2 host3), 2 MDSs (host4
>> host5) and 2 OSDs (host6 host7).
>> My system runs Ceph version 0.26.
>> ==================================================================================
>> --
>> Best Regards,
>> Stefanie Chen
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>
>>
>
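
To make the name/rank distinction from Sage's reply concrete, here is a minimal Python sketch that splits one "ceph mds dump" entry into the daemon name and the dynamically assigned rank. It assumes the entry looks like the one quoted in the message, with the daemon name in single quotes (the quote characters were garbled in the post, so that is an assumption) and the rank following "mds". The helper parse_mds_entry and the MdsEntry type are hypothetical, not part of Ceph, and reading the number after the dot as an incarnation counter is also an assumption.

```python
import re
from typing import NamedTuple


class MdsEntry(NamedTuple):
    gid: int          # leading numeric id of the entry
    addr: str         # ip:port/nonce of the daemon
    name: str         # configured daemon name (what you chose in the config)
    rank: int         # dynamic role/rank currently held (what "stop" takes)
    incarnation: int  # number after the dot; meaning assumed, see lead-in
    state: str        # e.g. "active", "standby", "replay"
    seq: int


# Assumed layout: <gid>: <addr> '<name>' mds<rank>.<n> up:<state> seq <seq>
ENTRY_RE = re.compile(
    r"^(?P<gid>\d+):\s+"
    r"(?P<addr>\S+)\s+"
    r"'(?P<name>[^']*)'\s+"
    r"mds(?P<rank>-?\d+)\.(?P<inc>\d+)\s+"
    r"up:(?P<state>\w+)\s+"
    r"seq\s+(?P<seq>\d+)$"
)


def parse_mds_entry(line: str) -> MdsEntry:
    """Parse one dump entry; raise ValueError if it does not match."""
    m = ENTRY_RE.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognized mds dump entry: {line!r}")
    return MdsEntry(
        gid=int(m.group("gid")),
        addr=m.group("addr"),
        name=m.group("name"),
        rank=int(m.group("rank")),
        incarnation=int(m.group("inc")),
        state=m.group("state"),
        seq=int(m.group("seq")),
    )


if __name__ == "__main__":
    entry = parse_mds_entry(
        "4920: 192.168.200.184:6800/30000 '0' mds0.12 up:active seq 40828"
    )
    # The quoted field is the daemon's name; the digit after "mds" is the rank.
    print(f"daemon named {entry.name!r} at {entry.addr} holds rank {entry.rank}")
```

With numeric daemon names the two fields are easy to mix up; with names such as a and b, as Sage suggests, the same entry would show the name and the rank as visibly different values.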