Re: Another question about PG::do_peer

On Thu, 10 Mar 2011, Henry Chang wrote:
> Hi,
> 
> Another question is about PG::do_peer. I wonder if we should modify
> the condition of inferring no missing as follows:
> 
> diff --git a/src/osd/PG.cc b/src/osd/PG.cc
> index e634296..4ce3336 100644
> --- a/src/osd/PG.cc
> +++ b/src/osd/PG.cc
> @@ -1632,7 +1632,7 @@ void PG::do_peer(ObjectStore::Transaction& t, list<Context*>& tfin,
>      if (pi.is_empty())
>        continue;
>      if (peer_missing.find(peer) == peer_missing.end()) {
> -      if (pi.last_update == pi.last_complete) {
> +      if (pi.last_update == pi.last_complete && pi.last_update >= log.tail) {
>         dout(10) << " infering no missing (last_update==last_complete) for osd" << peer << dendl;
>         peer_missing[peer].num_missing();  // just create the entry.
>         search_for_missing(peer_info[peer], &peer_missing[peer], peer);
>
> If pi.last_update < log.tail, we cannot know if the peer has any
> missing. Shouldn't we try to pull the peer's missing+backlog first
> before inferring no missing and go ahead activating the pg?

Yeah, this is definitely broken, but I think it's broken even beyond that 
specific case.  It looks like it's just an ill-conceived optimization.  
The problem is that peer_missing needs to have missing in terms of the 
master log (on the primary), not the peer's log, and in order to 
calculate that we need to pull the log+missing.  (In the
last_update==last_complete case I was trying to optimize, the missing set
will just be empty.)  The current code misses the case where the replica
has a divergent log or an older last_update (even one > log.tail).

I think this shortcut _only_ makes sense when the replica is perfectly up 
to date, i.e.

      if (pi.last_update == pi.last_complete &&
          pi.last_update == info.last_update) {

Does that sound right?
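
To make the difference concrete, here is a rough standalone sketch of the
old and the proposed check.  Version and Info are simplified stand-ins I
made up for illustration (not the real eversion_t or PG::Info), and the
helper names are hypothetical; it only models the comparison itself, not
the rest of do_peer.

```cpp
#include <cassert>
#include <cstdint>
#include <tuple>

// Simplified stand-in for an (epoch, version) pair; an assumption, not Ceph's type.
struct Version {
  uint32_t epoch;
  uint64_t version;
};

inline bool operator==(const Version& a, const Version& b) {
  return std::tie(a.epoch, a.version) == std::tie(b.epoch, b.version);
}

// Hypothetical reduction of a PG info record to the two fields compared here.
struct Info {
  Version last_update;
  Version last_complete;
};

// Old check: only asks whether the peer is complete relative to its OWN log.
inline bool old_can_infer_no_missing(const Info& peer) {
  return peer.last_update == peer.last_complete;
}

// Proposed check: the peer must also be at the primary's last_update,
// so "nothing missing" holds in terms of the primary's (master) log.
inline bool can_infer_no_missing(const Info& primary, const Info& peer) {
  return peer.last_update == peer.last_complete &&
         peer.last_update == primary.last_update;
}

// A few illustrative cases with made-up version numbers.
inline void check_examples() {
  Info primary{{5, 100}, {5, 100}};
  Info up_to_date{{5, 100}, {5, 100}};  // identical to the primary
  Info behind{{5, 90}, {5, 90}};        // self-consistent but older
  Info incomplete{{5, 100}, {5, 95}};   // still recovering its own log

  assert(can_infer_no_missing(primary, up_to_date));
  assert(old_can_infer_no_missing(behind));        // old check accepts this
  assert(!can_infer_no_missing(primary, behind));  // new check rejects it
  assert(!can_infer_no_missing(primary, incomplete));
}
```

The "behind" peer is the interesting one: it looks complete on its own
terms, so the old check skips pulling its log+missing, while the stricter
check falls through to the normal path.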

> Sorry for separating my questions into multiple emails.

No problem, it's easier that way.  :)

Thanks!
sage