Re: Questions about Accepter::stop()

On Fri, 12 Aug 2016, Willem Jan Withagen wrote:
> Hi,
> 
> Still working on finding out why my OSD is not coming back up.
> Looking at the OSD, it seems to recover, but this is not reported back to
> the other OSDs and mons.
> 
> Below some of the code from
> 	./src/msg/simple/Accepter.cc
> 
> Turns out that the thread freezes on the join, and the complicating
> factor is that shutdown always reports that
>   accepter.stop shutdown failed:  errno 57 (57) Socket is not connected
> 
> Then the code goes into the join, and gets stuck in there.
> 
> So I've excluded that part of the code, and the close section.
> 
> That seems to work, but I would very much like some more opinions on this.
> Original code was done by Sage, but John Spray added a bit of exclusion
> on the join()
> 
> And even with this change I cannot complete
> 	cephtool-test-mon.sh
> But I'm getting a lot further down the test.

This is the thread we need to wake up in Accepter::entry():

    ldout(msgr->cct,20) << "accepter calling poll" << dendl;
    int r = poll(&pfd, 1, -1);
    if (r < 0)
      break;
    ldout(msgr->cct,20) << "accepter poll got " << r << dendl;

    if (pfd.revents & (POLLERR | POLLNVAL | POLLHUP))
      break;

    ldout(msgr->cct,10) << "pfd.revents=" << pfd.revents << dendl;
    if (done) break;

If shutdown(2) isn't the "right" (portable) way to kick the thread blocked 
on poll(2) on an accept socket, maybe there is some other socket call that 
is more appropriate?  It just needs to wake up poll so that we either see 
an error event queued or done == true.
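
One portable candidate is the classic self-pipe trick: poll on the listen 
socket and on the read end of a pipe, and have stop() write a byte to the 
pipe. Below is a minimal sketch of that idea, not the Ceph code. The names 
wait_for_event, request_wakeup, and WakeReason are hypothetical, and it 
assumes the pipe is created before the accepter thread starts.

```cpp
#include <cassert>
#include <cerrno>
#include <poll.h>
#include <unistd.h>

// Hypothetical names throughout; this is not the Ceph Accepter class.
enum class WakeReason { Connection, Shutdown, Error };

// Blocks the way Accepter::entry() does, but also watches a second
// descriptor (the read end of a pipe) so another thread can interrupt
// the wait without relying on shutdown(2).
WakeReason wait_for_event(int listen_fd, int wake_rd_fd) {
  assert(wake_rd_fd >= 0);  // the wake-up pipe must already exist

  struct pollfd pfd[2] = {};
  pfd[0].fd = listen_fd;
  pfd[0].events = POLLIN;
  pfd[1].fd = wake_rd_fd;
  pfd[1].events = POLLIN;

  for (;;) {
    int r = ::poll(pfd, 2, -1);
    if (r < 0) {
      if (errno == EINTR)
        continue;  // interrupted by a signal, retry
      return WakeReason::Error;
    }
    // Check the wake-up pipe first so a stop request always wins.
    if (pfd[1].revents & (POLLIN | POLLHUP | POLLERR | POLLNVAL))
      return WakeReason::Shutdown;
    if (pfd[0].revents & (POLLERR | POLLNVAL | POLLHUP))
      return WakeReason::Error;
    if (pfd[0].revents & POLLIN)
      return WakeReason::Connection;
  }
}

// Called from the stopping thread. One byte makes the pipe readable,
// which wakes poll() however shutdown(2) behaves on a listening socket.
bool request_wakeup(int wake_wr_fd) {
  char c = 0;
  ssize_t n;
  do {
    n = ::write(wake_wr_fd, &c, 1);
  } while (n < 0 && errno == EINTR);
  return n == 1;
}
```

With something like this, stop() would set done, call request_wakeup(), 
join() the thread, and only then close the listen socket and both pipe 
ends, which keeps the existing protection against fd re-use.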

sage


> 
> --WjW
> 
> 
> void Accepter::stop()
> {
>   done = true;
>   ldout(msgr->cct,10) << __func__ << " accepter on: " << listen_sd << dendl;
> 
>   if (listen_sd >= 0) {
>     if ( ::shutdown(listen_sd, SHUT_RDWR) < 0 ) {
>       ldout(msgr->cct,0) << __func__ << " shutdown failed: "
>               << " errno " << errno << " " << cpp_strerror(errno) << dendl;
>     }
>   }
>   if (errno != ENOTCONN) {
>     // wait for thread to stop before closing the socket, to avoid
>     // racing against fd re-use.
>     if (is_started()) {
>       ldout(msgr->cct,0) << __func__ << " wait for thread to join." << dendl;
>       join();
>     }
>   } else {
>     listen_sd = -1;
>   }
> 
>   if (listen_sd >= 0) {
>     if ( ::close(listen_sd) < 0 ) {
>       ldout(msgr->cct,0) << __func__ << "close failed: "
>               << " errno " << errno << " " << cpp_strerror(errno) << dendl;
>     }
>     listen_sd = -1;
>   }
>   done = false;
> }
> 
> 	
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 