Fwd: radosgw pegging down 5 CPU cores when no data is being transferred

Thank you for providing the profiling data, Vladimir. There are 5078 threads, and most of them are waiting. Here is a list of the deepest call of each thread, with duplicates removed:

            + 100.00% epoll_wait
                          + 100.00% get_obj_data::flush(rgw::OwningList<rgw::AioResultEntry>&&)
            + 100.00% poll
        + 100.00% poll
      + 100.00% poll
        + 100.00% pthread_cond_timedwait@@GLIBC_2.3.2
      + 100.00% pthread_cond_timedwait@@GLIBC_2.3.2
        + 100.00% pthread_cond_wait@@GLIBC_2.3.2
      + 100.00% pthread_cond_wait@@GLIBC_2.3.2
      + 100.00% read
                            + 100.00% _ZN5boost9intrusive9list_implINS0_8bhtraitsIN3rgw14AioResultEntryENS0_16list_node_traitsIPvEELNS0_14link_mode_typeE1ENS0_7dft_tagELj1EEEmLb1EvE4sortIZN12get_obj_data5flushEONS3_10OwningListIS4_JEEEEUlRKT_RKT0_E_EEvSH_
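
For anyone who wants to reproduce that reduction, a quick sketch along these lines works on the text form of the profile (it assumes the indented "+ NN.NN% frame" lines shown above, with each thread's tree introduced by a line containing "Thread"; treat those details as guesses about the dump format):

#!/usr/bin/env python3
# Sketch: reduce a gdbpmp-style call-tree dump to the deepest frame of each
# thread, with duplicates removed. Reads the text on stdin, prints unique frames.
import re
import sys

unique = {}                               # frame -> True, used only for de-duplication
deepest_frame, deepest_indent = None, -1  # deepest line seen in the current thread

def end_of_thread():
    if deepest_frame is not None:
        unique[deepest_frame] = True

for line in sys.stdin:
    if "Thread" in line:                  # assumed marker for the start of a thread's tree
        end_of_thread()
        deepest_frame, deepest_indent = None, -1
        continue
    m = re.match(r"(\s*)\+\s+[\d.]+%\s+(\S.*)", line)
    if m and len(m.group(1)) > deepest_indent:
        deepest_frame, deepest_indent = m.group(2).rstrip(), len(m.group(1))
end_of_thread()

for frame in sorted(unique):
    print(frame)

Piping the tree that gdbpmp.py prints for gdbpmp.data through it should give a list like the one above.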

The only interesting ones are the second and last:

* get_obj_data::flush(rgw::OwningList<rgw::AioResultEntry>&&)
* _ZN5boost9intrusive9list_implINS0_8bhtraitsIN3rgw14AioResultEntryENS0_16list_node_traitsIPvEELNS0_14link_mode_typeE1ENS0_7dft_tagELj1EEEmLb1EvE4sortIZN12get_obj_data5flushEONS3_10OwningListIS4_JEEEEUlRKT_RKT0_E_EEvSH_
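
That last mangled name becomes readable after a pass through c++filt; for example (assuming binutils' c++filt is on the PATH):

# Sketch: demangle the Boost symbol above with c++filt.
import subprocess

sym = "_ZN5boost9intrusive9list_implINS0_8bhtraitsIN3rgw14AioResultEntryENS0_16list_node_traitsIPvEELNS0_14link_mode_typeE1ENS0_7dft_tagELj1EEEmLb1EvE4sortIZN12get_obj_data5flushEONS3_10OwningListIS4_JEEEEUlRKT_RKT0_E_EEvSH_"
print(subprocess.run(["c++filt", sym], capture_output=True, text=True).stdout.strip())

It demangles to the sort() of the boost::intrusive list of rgw::AioResultEntry, instantiated with a comparison lambda defined inside get_obj_data::flush.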

They are essentially part of the same call stack that results from processing a GetObj request, and five threads are in this call stack (the only difference is whether or not they include the call into the boost intrusive list). Here’s the full call stack of those threads:

+ 100.00% clone
  + 100.00% start_thread
    + 100.00% worker_thread
      + 100.00% process_new_connection
        + 100.00% handle_request
          + 100.00% RGWCivetWebFrontend::process(mg_connection*)
            + 100.00% process_request(RGWRados*, RGWREST*, RGWRequest*, std::string const&, rgw::auth::StrategyRegistry const&, RGWRestfulIO*, OpsLogSocket*, optional_yield, rgw::dmclock::Scheduler*, int*)
              + 100.00% rgw_process_authenticated(RGWHandler_REST*, RGWOp*&, RGWRequest*, req_state*, bool)
                + 100.00% RGWGetObj::execute()
                  + 100.00% RGWRados::Object::Read::iterate(long, long, RGWGetDataCB*)
                    + 100.00% RGWRados::iterate_obj(RGWObjectCtx&, RGWBucketInfo const&, rgw_obj const&, long, long, unsigned long, int (*)(rgw_raw_obj const&, long, long, long, bool, RGWObjState*, void*), void*)
                      + 100.00% _get_obj_iterate_cb(rgw_raw_obj const&, long, long, long, bool, RGWObjState*, void*)
                        + 100.00% RGWRados::get_obj_iterate_cb(rgw_raw_obj const&, long, long, long, bool, RGWObjState*, void*)
                          + 100.00% get_obj_data::flush(rgw::OwningList<rgw::AioResultEntry>&&)
                            + 100.00% _ZN5boost9intrusive9list_implINS0_8bhtraitsIN3rgw14AioResultEntryENS0_16list_node_traitsIPvEELNS0_14link_mode_typeE1ENS0_7dft_tagELj1EEEmLb1EvE4sortIZN12get_obj_data5flushEONS3_10OwningListIS4_JEEEEUlRKT_RKT0_E_EEvSH_

So this isn’t background processing but request processing. I’m not clear why these requests are consuming so much CPU for so long.

From your initial message:
> I am running a Ceph 14.2.1 cluster with 3 rados gateways. Periodically, the radosgw process on those machines starts consuming 100% of 5 CPU cores for days at a time, even though the machine is not being used for data transfers (nothing in the radosgw logs, a couple of KB/s of network).
>
> This situation can affect any number of our rados gateways, lasts from a few hours to a few days, and stops either when the radosgw process is restarted or on its own.
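
One thing that might help narrow it down the next time a gateway is in this state is confirming whether it is the same few threads that stay pegged for the whole episode, so their LWP ids can be matched against the profiles. A rough sketch (psutil here is my assumption; top -H -p <pid> shows the same thing):

#!/usr/bin/env python3
# Sketch: list the threads of a radosgw process that burned the most CPU over
# a short window, so they can be matched against the LWP ids in the profiles.
# Assumes the psutil module is installed; pass the radosgw pid as argv[1].
import sys
import time

import psutil

proc = psutil.Process(int(sys.argv[1]))
before = {t.id: t.user_time + t.system_time for t in proc.threads()}
time.sleep(5)

for t in proc.threads():
    busy = t.user_time + t.system_time - before.get(t.id, 0.0)
    if busy >= 4.0:                       # roughly >= 80% of one core over 5 seconds
        print("LWP %d used %.1fs of CPU in the last 5s" % (t.id, busy))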

I’m going to check with others who’re more familiar with this code path.

Begin forwarded message:

From: Vladimir Brik <vladimir.brik@xxxxxxxxxxxxxxxx>
Subject: Re: radosgw pegging down 5 CPU cores when no data is being transferred
Date: August 21, 2019 at 4:47:01 PM EDT
To: "J. Eric Ivancich" <ivancich@xxxxxxxxxx>, Mark Nelson <mnelson@xxxxxxxxxx>, ceph-users@xxxxxxxxxxxxxx

> Are you running multisite?
No

> Do you have dynamic bucket resharding turned on?
Yes. "radosgw-admin reshard list" prints "[]"

> Are you using lifecycle?
I am not sure. How can I check? "radosgw-admin lc list" says "[]"

> And just to be clear -- sometimes all 3 of your rados gateways are
> simultaneously in this state?
Multiple, but I have not seen all 3 being in this state simultaneously. Currently one gateway has 1 thread using 100% of CPU, and another has 5 threads each using 100% CPU.

Here are the fruits of my attempts to capture the call graph using perf and gdbpmp:
https://icecube.wisc.edu/~vbrik/perf.data
https://icecube.wisc.edu/~vbrik/gdbpmp.data

These are the commands that I ran and their outputs (note that I couldn't get perf to run without generating the warning):
rgw-3 gdbpmp # ./gdbpmp.py -n 100 -p 73688 -o gdbpmp.data
Attaching to process 73688...Done.
Gathering Samples....................................................................................................
Profiling complete with 100 samples.

rgw-3 ~ # perf record --call-graph fp -p 73688 -- sleep 10
[ perf record: Woken up 54 times to write data ]
Warning:
Processed 574207 events and lost 4 chunks!
Check IO/CPU overload!
[ perf record: Captured and wrote 58.866 MB perf.data (233750 samples) ]

Eric

--
J. Eric Ivancich
he/him/his
Red Hat Storage
Ann Arbor, Michigan, USA
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
