答复: 答复: 答复: Pipe "deadlock" in Hammer, 0.94.5

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Thanks for your help:-)

I checked the source code again, and in read_message, it does hold the Connection::lock:

                     while (left > 0) {
			// wait for data
			if (tcp_read_wait() < 0)
				goto out_dethrottle;

			// get a buffer
			connection_state->lock.Lock();
			map<ceph_tid_t, pair<bufferlist, int> >::iterator p =
					connection_state->rx_buffers.find(header.tid);
			if (p != connection_state->rx_buffers.end()) {
				if (rxbuf.length() == 0 || p->second.second != rxbuf_version) {
					ldout(msgr->cct,10)
								<< "reader seleting rx buffer v "
										<< p->second.second << " at offset "
										<< offset << " len "
										<< p->second.first.length() << dendl;
					rxbuf = p->second.first;
					rxbuf_version = p->second.second;
					// make sure it's big enough
					if (rxbuf.length() < data_len)
						rxbuf.push_back(
								buffer::create(data_len - rxbuf.length()));
					blp = p->second.first.begin();
					blp.advance(offset);
				}
			} else {
				if (!newbuf.length()) {
					ldout(msgr->cct,20)
								<< "reader allocating new rx buffer at offset "
										<< offset << dendl;
					alloc_aligned_buffer(newbuf, data_len, data_off);
					blp = newbuf.begin();
					blp.advance(offset);
				}
			}
			bufferptr bp = blp.get_current_ptr();
			int read = MIN(bp.length(), left);
			ldout(msgr->cct,20)
						<< "reader reading nonblocking into "
								<< (void*) bp.c_str() << " len " << bp.length()
								<< dendl;
			int got = tcp_read_nonblocking(bp.c_str(), read);
			ldout(msgr->cct,30)
						<< "reader read " << got << " of " << read << dendl;
			connection_state->lock.Unlock();
			if (got < 0)
				goto out_dethrottle;
			if (got > 0) {
				blp.advance(got);
				data.append(bp, 0, got);
				offset += got;
				left -= got;
			} // else we got a signal or something; just loop.
		}

As shown in the above code, in the reading loop, it first lock connection_state->lock and then do tcp_read_nonblocking. connection_state is of type PipeConnectionRef, connection_state->lock is Connection::lock.

On the other hand, I'll check that whether there are a lot of message to send as you suggested. Thanks:-)



发件人: Gregory Farnum [gfarnum@xxxxxxxxxx]

发送时间: 2017年1月14日 9:39

收件人: 许雪寒

Cc: jiajia zhong; ceph-users@xxxxxxxxxxxxxx

主题: Re:  答复: 答复: Pipe "deadlock" in Hammer, 0.94.5









On Thu, Jan 12, 2017 at 7:58 PM, 许雪寒 <xuxuehan@xxxxxx> wrote:










Thank you for your continuous helpJ.
 
We are using hammer 0.94.5 version, and what I read is the version of the source code.

However, on the other hand, if Pipe::do_recv do act as blocked, is it reasonable for the Pipe::reader_thread to block threads calling SimpleMessenger::submit_message
 by holding Connection::lock?
 
I think maybe a different mutex should be used in Pipe::read_message rather than Connection::lock.







I don't think it does use that lock. Pipe::read_message() is generally called while the pipe_lock is held, but not Connection::lock. (They are separate.) 

I haven't dug into the relevant OSD code in a while, but I think it's a lot more likely your OSD is just overloaded and is taking a while to send a lot of different messages, and that the loop it's in doesn't update the HeartbeatMap or something. Did you check
 that?

-Greg




_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




[Index of Archives]     [Information on CEPH]     [Linux Filesystem Development]     [Ceph Development]     [Ceph Large]     [Linux USB Development]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [xfs]


  Powered by Linux