Question about ceph paxos implementation

Hi,

I have been reading the Ceph Paxos code recently and have a question about a scenario that, in my opinion, may violate consistency.

Assume we have five monitor nodes m1, m2, m3, m4, m5, where each node has a larger rank than the ones after it.

Consider the situation as below:

1, m1 is the leader and all nodes start with the same last_committed. m1 then proposes a new value ‘2’, which is accepted only by m1 and m3:
m1:   1 2
m2:   1
m3:   1 2
m4:   1
m5:   1	

2, Unfortunately, both m1 and m3 go down. m2 becomes leader without any knowledge of that proposal and proposes a new value ‘3’:
m1:   1 2  down 
m2:   1 3
m3:   1 2  down
m4:   1
m5:   1	

3, m2 then goes down before sending anything to the others. m1 and m3 recover and commit value ‘2’ with the quorum m1, m3, m4:
m1:   1 2
m2:   1 3  down
m3:   1 2
m4:   1 2
m5:   1	
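To make the quorum reasoning in steps 1 and 3 concrete, here is a minimal Python sketch of a strict-majority check. The function name is_majority and the node lists are illustrative assumptions, not identifiers from the Ceph source.

```python
def is_majority(acceptors, cluster):
    """Return True if the distinct acceptors form a strict majority of the cluster."""
    members = set(cluster)
    # Count only acceptors that are actually cluster members.
    return len(set(acceptors) & members) > len(members) // 2


cluster = ["m1", "m2", "m3", "m4", "m5"]

# Step 1: only m1 and m3 accepted '2', so it cannot be committed yet.
step1 = is_majority(["m1", "m3"], cluster)
# Step 3: m1, m3 and m4 accept '2', which is enough to commit.
step3 = is_majority(["m1", "m3", "m4"], cluster)
print(step1, step3)  # prints: False True
```

With five monitors, any two majorities share at least one node, which is why m4 matters in the later steps: it is the only live node that has seen value ‘2’.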

4, Before the commit message is sent to the others, m1 and m3 go down again, so value ‘2’ is committed only on m1. Then m2 becomes leader once more:
m1:   1 2  down
m2:   1 3
m3:   1 2  down
m4:   1 2
m5:   1

5, Leader m2 sees the uncommitted value ‘2’ but discards it after comparing uncommitted_pn in the function handle_last, so it commits value ‘3’ with the quorum m2, m4, m5:
m1:   1 2  down
m2:   1 3
m3:   1 2  down
m4:   1 3
m5:   1 3
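The comparison described in step 5 can be sketched roughly as follows. This is a simplified illustration of "keep the uncommitted value with the highest proposal number", not the actual handle_last code. The names Uncommitted and pick_uncommitted and the proposal numbers 100 and 200 are hypothetical, and the sketch assumes, as the scenario does, that the value ‘2’ stored on m4 carries an older proposal number than m2's own value ‘3’.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Uncommitted:
    """An accepted-but-uncommitted value reported during leader recovery."""
    node: str   # which monitor reported it
    pn: int     # proposal number under which the value was accepted
    value: str


def pick_uncommitted(reports: Iterable[Uncommitted]) -> Optional[Uncommitted]:
    """Keep the report with the highest proposal number; drop the rest."""
    best = None
    for report in reports:
        if best is None or report.pn > best.pn:
            best = report
    return best


# Hypothetical proposal numbers: m2's own proposal of '3' (pn=200) is assumed
# to be newer than the pn stored with '2' on m4 (pn=100).
reports = [
    Uncommitted(node="m2", pn=200, value="3"),
    Uncommitted(node="m4", pn=100, value="2"),
]
chosen = pick_uncommitted(reports)
print(chosen.value)  # prints: 3
```

Under that assumption the new leader keeps ‘3’ and drops ‘2’. If m4 had instead recorded ‘2’ under a proposal number higher than 200, the same function would return ‘2’.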

Now we see that value ‘2’ was committed but then lost. Am I right about this?
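The property at stake can be written as a small check: once a value is committed for a given version, no node may commit a different value for that version. The sketch below is only an illustration. The function name find_conflicts, the event format, and numbering the contested slot as version 2 are all assumptions made for this example.

```python
def find_conflicts(commit_log):
    """Return the versions for which two different values were committed.

    commit_log is a list of (node, version, value) commit events.
    """
    committed = {}     # version -> first committed value seen
    conflicts = set()
    for _node, version, value in commit_log:
        first = committed.setdefault(version, value)
        if first != value:
            conflicts.add(version)
    return sorted(conflicts)


# Commit events from the scenario, with the contested slot numbered as
# version 2 (an assumed numbering): m1 commits '2' (steps 3 and 4), then
# m2, m4 and m5 commit '3' (step 5).
log = [
    ("m1", 2, "2"),
    ("m2", 2, "3"),
    ("m4", 2, "3"),
    ("m5", 2, "3"),
]
print(find_conflicts(log))  # prints: [2]
```

A non-empty result means the sequence of events, if it can really occur, breaks consistency for that version.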


Thanks
WANG KANG
