Re: radosgw hang in curl_multi_wait with libcurl 7.37.0

Hi Casey,

Update: the issue reproduced in a different situation today.
RGWDataSyncShardCR is blocked by RGWReadRemoteDataLogShardInfoCR, which
leaves data sync incomplete. Using gdb, I found that its http_op is
still in the reqs queue of the related RGWHTTPManager, whose
reqs_thread_entry cannot complete that request, even after I shut down
the network link to the remote IP.
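
To illustrate the symptom above: a wait built on poll() only returns
when the descriptor reports an event or the timeout expires. If a link
goes down without the peer sending FIN or RST, the local socket reports
nothing, so only a deadline can release the waiter. Whether that is what
happens inside libcurl here is not confirmed in this thread. The
following is a minimal sketch of a deadline-bounded wait, assuming plain
POSIX poll(); wait_for_event_with_deadline and silent_peer_times_out are
hypothetical helpers, not radosgw or libcurl functions, and a pipe with
a silent writer stands in for the unresponsive remote.

```cpp
#include <poll.h>
#include <unistd.h>

#include <cassert>
#include <cerrno>
#include <chrono>

// Hypothetical helper (not part of radosgw or libcurl).
// Polls fd in short slices and rechecks a deadline after each slice.
// Returns true if poll() reported any event on fd, false if the deadline
// passed with no event or poll() failed. May overshoot by up to slice_ms.
bool wait_for_event_with_deadline(int fd, std::chrono::milliseconds deadline,
                                  int slice_ms = 50) {
  assert(slice_ms > 0);
  using clock = std::chrono::steady_clock;
  const auto give_up_at = clock::now() + deadline;

  pollfd pfd{};
  pfd.fd = fd;
  pfd.events = POLLIN;

  while (clock::now() < give_up_at) {
    const int rc = poll(&pfd, 1, slice_ms);
    if (rc > 0) {
      return true;  // readable, hang-up, or error: the caller inspects it
    }
    if (rc < 0 && errno != EINTR) {
      return false;  // poll() itself failed
    }
    // rc == 0 (slice expired) or EINTR: loop and recheck the deadline.
  }
  return false;  // deadline reached without any event
}

// Demo: a pipe whose writer never writes stands in for a silent peer.
// Returns true when the wait gave up at the deadline instead of blocking.
bool silent_peer_times_out() {
  int fds[2];
  if (pipe(fds) != 0) {
    return false;
  }
  const bool got_event =
      wait_for_event_with_deadline(fds[0], std::chrono::milliseconds(200));
  close(fds[0]);
  close(fds[1]);
  return !got_event;
}
```

Calling silent_peer_times_out() returns once the 200 ms deadline has
passed, while the same wait returns early as soon as a byte is written
to the pipe.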

Have you been able to correlate this hang with the radosgw on the receiving end?
A: It mostly occurs on the HTTP sender side. I have seen three kinds of
requests blocked in libcurl: datalog-list, data-log-change-notify and
fetch-remote-obj.

Are you sure it's getting this request and sending a reply?
A: I am not sure.

Does this happen on every sync request, or just some?
A: Just some, occasionally. But I think this can happen with any kind
of HTTP request.

thanks
ivan from eisoo


On Thu, Aug 3, 2017 at 10:05 PM, Casey Bodley <cbodley@xxxxxxxxxx> wrote:
> Hi,
>
> Have you been able to correlate this hang with the radosgw on the receiving
> end? Are you sure it's getting this request and sending a reply? Does this
> happen on every sync request, or just some?
>
> Thanks,
> Casey
>
>
>
> On 08/02/2017 11:50 PM, yuxiang fang wrote:
>>
>> Resending because the mail server rejected the first attempt.
>>
>> Because of the radosgw hang in curl_multi_wait, my team upgraded
>> libcurl from 7.29.0 to 7.37.0, but the issue still appears.
>> Is this definitely a libcurl bug, or is something wrong inside
>> radosgw? I'm lost and don't know which version of libcurl to
>> select. We also tried curl 7.54.0, but it was not stable and
>> crashed frequently.
>>
>> back trace:
>>
>> Thread 54 (Thread 0x7f32826fb700 (LWP 43459)):
>> #0  0x00007f32f3c54c3d in poll () from /lib64/libc.so.6
>> #1  0x00007f32f4caf039 in Curl_poll () from /lib64/libcurl.so.4
>> #2  0x00007f32f4ca7f04 in curl_multi_wait () from /lib64/libcurl.so.4
>> #3  0x00007f32f529753d in do_curl_wait(CephContext*, void*, int) ()
>> from /lib64/librgw.so.2
>> #4  0x00007f32f52999d9 in RGWHTTPManager::process_requests(bool,
>> bool*) () from /lib64/librgw.so.2
>> #5  0x00007f32f5299ec4 in RGWHTTPManager::complete_requests() () from
>> /lib64/librgw.so.2
>> #6  0x00007f32f5416ac4 in
>> RGWRESTStreamRWRequest::get_resource(RGWAccessKey&,
>> std::map<std::string, std::string, std::less<std::string>,
>> std::allocator<std::pair<std::string const, std::string> > >&,
>> std::string const&, RGWHTTPManager*) () from /lib64/librgw.so.2
>> #7  0x00007f32f5416d2a in
>> RGWRESTStreamRWRequest::get_obj(RGWAccessKey&, std::map<std::string,
>> std::string, std::less<std::string>,
>> std::allocator<std::pair<std::string const, std::string> > >&,
>> rgw_obj&) () from /lib64/librgw.so.2
>> #8  0x00007f32f541c0fc in RGWRESTConn::get_obj(rgw_user
>> const&,req_info*, rgw_obj&,
>> std::chrono::time_point<ceph::time_detail::real_clock,
>> std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> > >
>> const*, std::chrono::time_point<ceph::time_detail::real_clock,
>> std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> > >
>> const*, unsigned int, unsigned long, bool, bool, bool, bool,
>> RGWGetDataCB*, RGWRESTStreamRWRequest**) () from /lib64/librgw.so.2
>> #9  0x00007f32f53d88d7 in RGWRados::fetch_remote_obj(RGWObjectCtx&,
>> rgw_user const&, std::string const&, std::string const&, bool,
>> req_info*, std::string const&, rgw_obj&, rgw_obj&, RGWBucketInfo&,
>> RGWBucketInfo&, std::chrono::time_point<ceph::time_detail::real_clock,
>> std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> > >*,
>> std::chrono::time_point<ceph::time_detail::real_clock,
>> std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> > >*,
>> std::chrono::time_point<ceph::time_detail::real_clock,
>> std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> > >
>> const*, std::chrono::time_point<ceph::time_detail::real_clock,
>> std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> > >
>> const*, bool, char const*, char const*, RGWRados::AttrsMod, bool,
>> std::map<std::string, ceph::buffer::list, std::less<std::string>,
>> std::allocator<std::pair<std::string const, ceph::buffer::list> > >&,
>> RGWObjCategory, unsigned long,
>> std::chrono::time_point<ceph::time_detail::real_clock,
>> std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> > >,
>> std::string*, std::string*, ceph::buffer::list*, rgw_err*, void
>> (*)(long, void*), void*) () from /lib64/librgw.so.2
>> #10 0x00007f32f5249b43 in RGWAsyncFetchRemoteObj::_send_request() ()
>> from /lib64/librgw.so.2
>> #11 0x00007f32f5246d52 in
>> RGWAsyncRadosProcessor::handle_request(RGWAsyncRadosRequest*) () from
>> /lib64/librgw.so.2
>> #12 0x00007f32f5246e1d in
>> RGWAsyncRadosProcessor::RGWWQ::_process(RGWAsyncRadosRequest*,
>> ThreadPool::TPHandle&) () from /lib64/librgw.so.2
>> #13 0x00007f32f57283be in ThreadPool::worker(ThreadPool::WorkThread*)
>> () from /lib64/librgw.so.2
>> #14 0x00007f32f57292a0 in ThreadPool::WorkThread::entry() () from
>> /lib64/librgw.so.2
>> #15 0x00007f32f485bdc5 in start_thread () from /lib64/libpthread.so.0
>> #16 0x00007f32f3c5f28d in clone () from /lib64/libc.so.6
>>
>> thanks
>> ivan from eisoo
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
>