Hi Vladimir,
On 8/21/19 8:54 AM, Vladimir Brik wrote:
Hello
I am running a Ceph 14.2.1 cluster with 3 rados gateways.
Periodically, radosgw process on those machines starts consuming 100%
of 5 CPU cores for days at a time, even though the machine is not
being used for data transfers (nothing in radosgw logs, couple of KB/s
of network).
This situation can affect any number of our rados gateways, lasts from a
few hours to a few days, and stops either on its own or when the radosgw
process is restarted.
Does anybody have an idea what might be going on or how to debug it? I
don't see anything obvious in the logs. Perf top is saying that CPU is
consumed by radosgw shared object in symbol get_obj_data::flush,
which, if I interpret things correctly, is called from a symbol with a
long name that contains the substring "boost9intrusive9list_impl"
I don't normally look at the RGW code so maybe Matt/Casey/Eric can chime
in. That code is in src/rgw/rgw_rados.cc in the get_obj_data struct.
The flush method does some sorting/merging and then walks through a
list of completed IOs and appears to copy a bufferlist out of each
one, then deletes it from the list and passes the BL off to
client_cb->handle_data. Looks like it could be pretty CPU intensive, but
if you are seeing that much CPU for that long it sounds like something
is rather off.
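Roughly, the pattern is something like the sketch below. This is an
illustrative reconstruction, not the actual Ceph code: the real
get_obj_data::flush in src/rgw/rgw_rados.cc uses boost::intrusive::list
and ceph::bufferlist, while this sketch substitutes std::list and
std::string to stay self-contained.

```cpp
#include <cstdint>
#include <list>
#include <string>
#include <utility>
#include <vector>

// A completed read IO: a chunk of object data at a byte offset.
struct CompletedIO {
    uint64_t offset;   // offset of this chunk within the object
    std::string data;  // stand-in for ceph::bufferlist
};

struct ObjData {
    std::list<CompletedIO> completed;   // completions may arrive out of order
    uint64_t next_offset = 0;           // next offset the client expects
    std::vector<std::string> delivered; // stand-in for client_cb->handle_data

    void flush() {
        // Sort completions by offset so in-order chunks can be delivered.
        completed.sort([](const CompletedIO& a, const CompletedIO& b) {
            return a.offset < b.offset;
        });
        // Walk the list: copy each in-order buffer out, delete the entry,
        // and pass the buffer to the client callback. The copy and the
        // repeated sorting are where the CPU time would go.
        while (!completed.empty() && completed.front().offset == next_offset) {
            CompletedIO io = completed.front();
            completed.pop_front();
            next_offset = io.offset + io.data.size();
            delivered.push_back(std::move(io.data));
        }
    }
};
```

If flush were being called very frequently with a long completion list,
the repeated sort-and-copy over that list could plausibly account for
sustained CPU use even with little client traffic.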
You might want to try grabbing a call graph from perf instead of just
running perf top, or using my wallclock profiler, to see if you can drill
down and find out where in that method it's spending the most time.
My wallclock profiler is here:
https://github.com/markhpc/gdbpmp
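A sketch of the commands involved (the perf flags are standard; the
gdbpmp invocation below is an assumption on my part, so check the
repo's README for the exact options):

```shell
# Record a 60-second call graph of the busy radosgw process with perf.
# Useful stacks require debug symbols for the radosgw binary.
perf record -g -p "$(pidof radosgw)" -- sleep 60
perf report -g

# Alternatively, sample with the gdbpmp wallclock profiler.
# NOTE: flag names here are assumed, not verified against the README.
./gdbpmp.py -p "$(pidof radosgw)" -o radosgw.gdbpmp   # collect samples
./gdbpmp.py -i radosgw.gdbpmp -t                      # print the profile
```

A wallclock profiler is handy here because it also catches time spent
blocked, not just on-CPU samples.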
Mark
This is our configuration:
rgw_frontends = civetweb num_threads=5000 port=443s
ssl_certificate=/etc/ceph/rgw.crt
error_log_file=/var/log/ceph/civetweb.error.log
(error log file doesn't exist)
Thanks,
Vlad
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com