Re: OSD not coming back up again

On 11-8-2016 13:44, Wido den Hollander wrote:
> Could be, but the log line which is interesting:
> 
> 2016-08-11 12:06:16.473351 b69c480 10 osd.0 178 handle_osd_ping osd.2
> 127.0.0.1:6811/13222 says i am down in 178

So I've augmented the code a bit:

  case MOSDPing::YOU_DIED:
    dout(10) << "handle_osd_ping " << m->get_source_inst()
             << " says i am down in " << m->map_epoch << dendl;
    dout(10) << "handle_osd_ping subscribing to: " << curmap->get_epoch()+1
             << dendl;
    osdmap_subscribe(curmap->get_epoch()+1, false);
    break;

And then we go into osdmap_subscribe, which I augmented too:
void OSD::osdmap_subscribe(version_t epoch, bool force_request)
{
  OSDMapRef osdmap = service.get_osdmap();
  if (osdmap->get_epoch() >= epoch) {
    dout(10) << __func__ << " subscribing to: " << epoch
             << " but osdmap already has: " << osdmap->get_epoch()
             << dendl;
    return;
  }

  if (monc->sub_want_increment("osdmap", epoch, CEPH_SUBSCRIBE_ONETIME) ||
      force_request) {
    dout(10) << __func__ << " renewing subscription" << dendl;
    monc->renew_subs();
  }
}

If I run with force_request = false, this call does absolutely
nothing: neither of the two if statements triggers.
If I set force_request = true (expecting it to fetch the whole map),
that does not do much either.
It reports:
	monclient: renew_subs - empty


So either this path is never/rarely used, or it is not the solution to
the problem.
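
One possible reading of the "renew_subs - empty" message, sketched as a
toy model and not the actual MonClient code: assume the client tracks
pending and already-sent subscriptions separately, and that
sub_want_increment only returns true when the requested epoch is newer
than anything already tracked. Under that assumption, asking again for an
epoch that was already requested leaves nothing pending, so even a forced
renew finds nothing to send. ToySubClient and its members are hypothetical
names used only for illustration.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Toy stand-in for a subscription client. This is an assumed model for
// illustration only; it is NOT Ceph's MonClient implementation.
struct ToySubClient {
  std::map<std::string, uint64_t> sub_new;   // wanted, not yet sent
  std::map<std::string, uint64_t> sub_sent;  // already sent to the monitor

  // Returns true only if the request adds something new to send.
  bool sub_want_increment(const std::string& what, uint64_t start) {
    auto pending = sub_new.find(what);
    if (pending != sub_new.end()) {
      if (pending->second >= start) return false;  // already pending
      pending->second = start;
      return true;
    }
    auto sent = sub_sent.find(what);
    if (sent == sub_sent.end() || sent->second < start) {
      sub_new[what] = start;                       // queue a newer request
      return true;
    }
    return false;                                  // already sent
  }

  // Returns "empty" when nothing is pending, otherwise "sent".
  std::string renew_subs() {
    if (sub_new.empty()) return "empty";
    for (const auto& kv : sub_new) sub_sent[kv.first] = kv.second;
    sub_new.clear();
    return "sent";
  }
};
```

In this model, a second request for the same epoch returns false, and a
forced renew_subs() afterwards reports "empty". If the real client behaves
similarly, that would be consistent with both observations above, but I
have not verified that against the MonClient source.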

--WjW
