Re: radosgw pegging down 5 CPU cores when no data is being transferred

I created a ticket: https://tracker.ceph.com/issues/41511

Note that I think I was mistaken when I said that sometimes the problem goes away on its own. I've looked back through our monitoring and it looks like when the problem did go away, it was because either the machine was rebooted or the radosgw service was restarted.


Vlad



On 8/23/19 10:17 AM, Eric Ivancich wrote:
Good morning, Vladimir,

Please create a tracker for this (https://tracker.ceph.com/projects/rgw/issues/new) and include the link to it in an email reply. And if you can include any more potentially relevant details, please do so. I’ll add my initial analysis to it.

But the threads do seem to be stuck, at least for a while, in get_obj_data::flush despite a lack of traffic. And sometimes it self-resolves, so it’s not a true “infinite loop”.

Thank you,

Eric

On Aug 22, 2019, at 9:12 PM, Eric Ivancich <ivancich@xxxxxxxxxx <mailto:ivancich@xxxxxxxxxx>> wrote:

Thank you for providing the profiling data, Vladimir. There are 5078 threads and most of them are waiting. Here is a list of the deepest call of each thread with duplicates removed.

            + 100.00% epoll_wait
                          + 100.00% get_obj_data::flush(rgw::OwningList<rgw::AioResultEntry>&&)
            + 100.00% poll
        + 100.00% poll
      + 100.00% poll
        + 100.00% pthread_cond_timedwait@@GLIBC_2.3.2
      + 100.00% pthread_cond_timedwait@@GLIBC_2.3.2
        + 100.00% pthread_cond_wait@@GLIBC_2.3.2
      + 100.00% pthread_cond_wait@@GLIBC_2.3.2
      + 100.00% read
                            + 100.00% _ZN5boost9intrusive9list_implINS0_8bhtraitsIN3rgw14AioResultEntryENS0_16list_node_traitsIPvEELNS0_14link_mode_typeE1ENS0_7dft_tagELj1EEEmLb1EvE4sortIZN12get_obj_data5flushEONS3_10OwningListIS4_JEEEEUlRKT_RKT0_E_EEvSH_

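As an aside, a summary like this can be reproduced from the raw dump with a short script. The sketch below is only an illustration, not part of the original analysis; it assumes the gdbpmp profile has already been printed as the indented "+ NN.NN% frame" text shown above, keeps only the frames with no deeper child, and drops duplicates:

#!/usr/bin/env python3
# Sketch only: reduce an indented call-tree dump (lines like
# "  + 100.00% frame") to its deepest (leaf) frames, duplicates removed.
# Assumes the tree has already been printed to a text file, e.g. by gdbpmp.py.
import re
import sys

FRAME = re.compile(r"^(\s*)\+\s+[\d.]+%\s+(.*)$")

def leaf_frames(lines):
    """Yield frames that are not followed by a deeper frame (i.e. the deepest calls)."""
    parsed = []
    for line in lines:
        m = FRAME.match(line.rstrip("\n"))
        if m:
            parsed.append((len(m.group(1)), m.group(2)))
    for i, (depth, frame) in enumerate(parsed):
        next_depth = parsed[i + 1][0] if i + 1 < len(parsed) else -1
        if next_depth <= depth:   # no deeper child follows, so this is a leaf
            yield frame

if __name__ == "__main__":
    seen = set()
    with open(sys.argv[1]) as f:
        for frame in leaf_frames(f):
            if frame not in seen:
                seen.add(frame)
                print(frame)
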
The only interesting ones are the second and last:

* get_obj_data::flush(rgw::OwningList<rgw::AioResultEntry>&&)
* _ZN5boost9intrusive9list_implINS0_8bhtraitsIN3rgw14AioResultEntryENS0_16list_node_traitsIPvEELNS0_14link_mode_typeE1ENS0_7dft_tagELj1EEEmLb1EvE4sortIZN12get_obj_data5flushEONS3_10OwningListIS4_JEEEEUlRKT_RKT0_E_EEvSH_
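
The long mangled name is simply the boost::intrusive list sort instantiated with the comparison lambda defined inside get_obj_data::flush; it can be decoded with c++filt. A minimal sketch (an illustration only, assuming binutils' c++filt is installed):

#!/usr/bin/env python3
# Sketch only: pipe a mangled C++ symbol, such as the boost one above,
# through c++filt from binutils and print the demangled form.
# Usage: ./demangle.py '_ZN5boost9intrusive9list_impl...EEvSH_'
import subprocess
import sys

result = subprocess.run(["c++filt", sys.argv[1]],
                        capture_output=True, text=True, check=True)
print(result.stdout.strip())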

They are essentially part of the same call stack that results from processing a GetObj request, and five threads are in this call stack (the only difference is whether or not they include the call into the boost intrusive list). Here’s the full call stack of those threads:

+ 100.00% clone
  + 100.00% start_thread
    + 100.00% worker_thread
      + 100.00% process_new_connection
        + 100.00% handle_request
          + 100.00% RGWCivetWebFrontend::process(mg_connection*)
            + 100.00% process_request(RGWRados*, RGWREST*, RGWRequest*, std::string const&, rgw::auth::StrategyRegistry const&, RGWRestfulIO*, OpsLogSocket*, optional_yield, rgw::dmclock::Scheduler*, int*)
              + 100.00% rgw_process_authenticated(RGWHandler_REST*, RGWOp*&, RGWRequest*, req_state*, bool)
                + 100.00% RGWGetObj::execute()
                  + 100.00% RGWRados::Object::Read::iterate(long, long, RGWGetDataCB*)
                    + 100.00% RGWRados::iterate_obj(RGWObjectCtx&, RGWBucketInfo const&, rgw_obj const&, long, long, unsigned long, int (*)(rgw_raw_obj const&, long, long, long, bool, RGWObjState*, void*), void*)
                      + 100.00% _get_obj_iterate_cb(rgw_raw_obj const&, long, long, long, bool, RGWObjState*, void*)
                        + 100.00% RGWRados::get_obj_iterate_cb(rgw_raw_obj const&, long, long, long, bool, RGWObjState*, void*)
                          + 100.00% get_obj_data::flush(rgw::OwningList<rgw::AioResultEntry>&&)
                            + 100.00% _ZN5boost9intrusive9list_implINS0_8bhtraitsIN3rgw14AioResultEntryENS0_16list_node_traitsIPvEELNS0_14link_mode_typeE1ENS0_7dft_tagELj1EEEmLb1EvE4sortIZN12get_obj_data5flushEONS3_10OwningListIS4_JEEEEUlRKT_RKT0_E_EEvSH_

So this isn’t background processing but request processing. I’m not clear why these requests are consuming so much CPU for so long.

From your initial message:
I am running a Ceph 14.2.1 cluster with 3 rados gateways. Periodically, the radosgw process on those machines starts consuming 100% of 5 CPU cores for days at a time, even though the machine is not being used for data transfers (nothing in the radosgw logs, a couple of KB/s of network traffic).

This situation can affect any number of our rados gateways, lasts from a few hours to a few days, and stops either when the radosgw process is restarted or on its own.

I’m going to check with others who’re more familiar with this code path.

Begin forwarded message:

From: Vladimir Brik <vladimir.brik@xxxxxxxxxxxxxxxx <mailto:vladimir.brik@xxxxxxxxxxxxxxxx>>
Subject: Re: radosgw pegging down 5 CPU cores when no data is being transferred
Date: August 21, 2019 at 4:47:01 PM EDT
To: "J. Eric Ivancich" <ivancich@xxxxxxxxxx <mailto:ivancich@xxxxxxxxxx>>, Mark Nelson <mnelson@xxxxxxxxxx <mailto:mnelson@xxxxxxxxxx>>, ceph-users@xxxxxxxxxxxxxx <mailto:ceph-users@xxxxxxxxxxxxxx>

> Are you running multisite?
No

> Do you have dynamic bucket resharding turned on?
Yes. "radosgw-admin reshard list" prints "[]"

> Are you using lifecycle?
I am not sure. How can I check? "radosgw-admin lc list" says "[]"

> And just to be clear -- sometimes all 3 of your rados gateways are
> simultaneously in this state?
Multiple, but I have not seen all 3 being in this state simultaneously. Currently one gateway has 1 thread using 100% of CPU, and another has 5 threads each using 100% CPU.
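
(In case it is useful for reproducing this: the hot threads can be spotted with "top -H -p <pid>", or with a short /proc scan like the sketch below. The sketch is an illustration only; it assumes Linux, samples per-thread CPU over a few seconds, and prints the busiest TIDs so gdb or perf can be pointed at them.)

#!/usr/bin/env python3
# Sketch only: sample per-thread CPU for a given PID twice and report the
# threads that burned the most CPU time in between.
# Usage: ./hot_threads.py <radosgw-pid>
import os
import sys
import time

def thread_cpu(pid):
    """Return {tid: (comm, utime+stime in clock ticks)} for all threads of pid."""
    usage = {}
    for tid in os.listdir(f"/proc/{pid}/task"):
        try:
            with open(f"/proc/{pid}/task/{tid}/stat") as f:
                data = f.read()
        except OSError:
            continue  # thread exited between listdir() and open()
        # comm sits in parentheses and may contain spaces; split after the last ')'
        comm = data[data.index("(") + 1:data.rindex(")")]
        fields = data[data.rindex(")") + 2:].split()
        utime, stime = int(fields[11]), int(fields[12])  # stat fields 14 and 15
        usage[int(tid)] = (comm, utime + stime)
    return usage

if __name__ == "__main__":
    pid, interval = int(sys.argv[1]), 5.0
    before = thread_cpu(pid)
    time.sleep(interval)
    after = thread_cpu(pid)
    deltas = []
    for tid, (comm, ticks) in after.items():
        prev = before.get(tid, (comm, ticks))[1]
        deltas.append((ticks - prev, tid, comm))
    for delta, tid, comm in sorted(deltas, reverse=True)[:10]:
        pct = 100.0 * delta / (interval * os.sysconf("SC_CLK_TCK"))
        print(f"{pct:6.1f}%  tid={tid}  {comm}")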

Here are the fruits of my attempts to capture the call graph using perf and gdbpmp:
https://icecube.wisc.edu/~vbrik/perf.data
https://icecube.wisc.edu/~vbrik/gdbpmp.data

These are the commands that I ran and their outputs (note that I could not get perf to run without producing the warning below):
rgw-3 gdbpmp # ./gdbpmp.py -n 100 -p 73688 -o gdbpmp.data
Attaching to process 73688...Done.
Gathering Samples....................................................................................................
Profiling complete with 100 samples.

rgw-3 ~ # perf record --call-graph fp -p 73688 -- sleep 10
[ perf record: Woken up 54 times to write data ]
Warning:
Processed 574207 events and lost 4 chunks!
Check IO/CPU overload!
[ perf record: Captured and wrote 58.866 MB perf.data (233750 samples) ]

Eric
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



