On Aug 22, 2019, at 9:12 PM, Eric Ivancich <ivancich@xxxxxxxxxx> wrote:
Thank you for providing the profiling data, Vladimir. There are 5078
threads and most of them are waiting. Here is a list of the deepest
call of each thread with duplicates removed.
+ 100.00% epoll_wait
+ 100.00% get_obj_data::flush(rgw::OwningList<rgw::AioResultEntry>&&)
+ 100.00% poll
+ 100.00% poll
+ 100.00% poll
+ 100.00% pthread_cond_timedwait@@GLIBC_2.3.2
+ 100.00% pthread_cond_timedwait@@GLIBC_2.3.2
+ 100.00% pthread_cond_wait@@GLIBC_2.3.2
+ 100.00% pthread_cond_wait@@GLIBC_2.3.2
+ 100.00% read
+ 100.00% _ZN5boost9intrusive9list_implINS0_8bhtraitsIN3rgw14AioResultEntryENS0_16list_node_traitsIPvEELNS0_14link_mode_typeE1ENS0_7dft_tagELj1EEEmLb1EvE4sortIZN12get_obj_data5flushEONS3_10OwningListIS4_JEEEEUlRKT_RKT0_E_EEvSH_
The only interesting ones are the second and last:
* get_obj_data::flush(rgw::OwningList<rgw::AioResultEntry>&&)
* _ZN5boost9intrusive9list_implINS0_8bhtraitsIN3rgw14AioResultEntryENS0_16list_node_traitsIPvEELNS0_14link_mode_typeE1ENS0_7dft_tagELj1EEEmLb1EvE4sortIZN12get_obj_data5flushEONS3_10OwningListIS4_JEEEEUlRKT_RKT0_E_EEvSH_
The latter demangles to boost::intrusive::list<rgw::AioResultEntry>::sort() instantiated with a lambda comparator defined inside get_obj_data::flush(). They are essentially part of the same call stack that results from processing a GetObj request, and five threads are in this call stack (the only difference is whether or not they include the call into the boost intrusive list). Here’s the full call stack of those threads, with a short sketch of that boost frame after it:
+ 100.00% clone
  + 100.00% start_thread
    + 100.00% worker_thread
      + 100.00% process_new_connection
        + 100.00% handle_request
          + 100.00% RGWCivetWebFrontend::process(mg_connection*)
            + 100.00% process_request(RGWRados*, RGWREST*, RGWRequest*, std::string const&, rgw::auth::StrategyRegistry const&, RGWRestfulIO*, OpsLogSocket*, optional_yield, rgw::dmclock::Scheduler*, int*)
              + 100.00% rgw_process_authenticated(RGWHandler_REST*, RGWOp*&, RGWRequest*, req_state*, bool)
                + 100.00% RGWGetObj::execute()
                  + 100.00% RGWRados::Object::Read::iterate(long, long, RGWGetDataCB*)
                    + 100.00% RGWRados::iterate_obj(RGWObjectCtx&, RGWBucketInfo const&, rgw_obj const&, long, long, unsigned long, int (*)(rgw_raw_obj const&, long, long, long, bool, RGWObjState*, void*), void*)
                      + 100.00% _get_obj_iterate_cb(rgw_raw_obj const&, long, long, long, bool, RGWObjState*, void*)
                        + 100.00% RGWRados::get_obj_iterate_cb(rgw_raw_obj const&, long, long, long, bool, RGWObjState*, void*)
                          + 100.00% get_obj_data::flush(rgw::OwningList<rgw::AioResultEntry>&&)
                            + 100.00% _ZN5boost9intrusive9list_implINS0_8bhtraitsIN3rgw14AioResultEntryENS0_16list_node_traitsIPvEELNS0_14link_mode_typeE1ENS0_7dft_tagELj1EEEmLb1EvE4sortIZN12get_obj_data5flushEONS3_10OwningListIS4_JEEEEUlRKT_RKT0_E_EEvSH_
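For readability, here is a minimal sketch of what that bottom frame corresponds to: a boost::intrusive::list of rgw::AioResultEntry nodes being sorted with a lambda comparator. This is not the actual radosgw code; the node layout and the comparison key below are assumptions for illustration only. The point is that boost::intrusive::list::sort() is an in-memory merge sort, so a thread sampled in this frame is doing CPU work rather than waiting on RADOS I/O.

// Minimal sketch; not the actual radosgw code. The "id" field and the
// comparator key are assumptions standing in for whatever flush() sorts by.
#include <boost/intrusive/list.hpp>
#include <cstdint>
#include <iostream>

struct AioResultEntry : boost::intrusive::list_base_hook<> {
  uint64_t id;   // stand-in ordering key
  explicit AioResultEntry(uint64_t i) : id(i) {}
};

using ResultList = boost::intrusive::list<AioResultEntry>;

int main() {
  AioResultEntry a{3}, b{1}, c{2};

  ResultList completed;        // intrusive list: the links live inside the nodes
  completed.push_back(a);
  completed.push_back(b);
  completed.push_back(c);

  // Shape of the mangled frame in the profile: list<...>::sort() instantiated
  // with a lambda defined in the caller (here main(); in radosgw, get_obj_data::flush()).
  completed.sort([](const AioResultEntry& lhs, const AioResultEntry& rhs) {
    return lhs.id < rhs.id;
  });

  for (const auto& e : completed)
    std::cout << e.id << ' ';  // prints: 1 2 3
  std::cout << '\n';

  completed.clear();           // unlink the nodes before they go out of scope
}

If flush() ends up re-sorting such a list on every AIO completion, the cost would grow with the number of pending entries, which would be consistent with sustained CPU use while almost no data moves over the network.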
So this isn’t background processing but request processing. I’m not
clear why these requests are consuming so much CPU for so long.
From your initial message:
I am running a Ceph 14.2.1 cluster with 3 rados gateways. Periodically, the radosgw process on those machines starts consuming 100% of 5 CPU cores for days at a time, even though the machine is not being used for data transfers (nothing in the radosgw logs, a couple of KB/s of network traffic).
This situation can affect any number of our rados gateways, lasts from a few hours to a few days, and stops either on its own or when the radosgw process is restarted.
I’m going to check with others who’re more familiar with this code path.
Begin forwarded message:
*From:* Vladimir Brik <vladimir.brik@xxxxxxxxxxxxxxxx>
*Subject:* Re: radosgw pegging down 5 CPU cores when no data is being transferred
*Date:* August 21, 2019 at 4:47:01 PM EDT
*To:* "J. Eric Ivancich" <ivancich@xxxxxxxxxx>, Mark Nelson <mnelson@xxxxxxxxxx>, ceph-users@xxxxxxxxxxxxxx
> Are you running multisite?
No
> Do you have dynamic bucket resharding turned on?
Yes. "radosgw-admin reshard list" prints "[]"
> Are you using lifecycle?
I am not sure. How can I check? "radosgw-admin lc list" says "[]"
> And just to be clear -- sometimes all 3 of your rados gateways are
> simultaneously in this state?
Multiple, but I have not seen all 3 in this state simultaneously. Currently one gateway has 1 thread using 100% of a CPU core, and another has 5 threads each using 100% of a core.
Here are the fruits of my attempts to capture the call graph using
perf and gdbpmp:
https://icecube.wisc.edu/~vbrik/perf.data
https://icecube.wisc.edu/~vbrik/gdbpmp.data
These are the commands that I ran and their outputs (note that I could not get perf to run without producing the warning below):
rgw-3 gdbpmp # ./gdbpmp.py -n 100 -p 73688 -o gdbpmp.data
Attaching to process 73688...Done.
Gathering
Samples....................................................................................................
Profiling complete with 100 samples.
rgw-3 ~ # perf record --call-graph fp -p 73688 -- sleep 10
[ perf record: Woken up 54 times to write data ]
Warning:
Processed 574207 events and lost 4 chunks!
Check IO/CPU overload!
[ perf record: Captured and wrote 58.866 MB perf.data (233750 samples) ]
Eric