Hi,
I read that civetweb and radosgw have a locking issue in combination
with SSL [1]. This is just a thought, based on this line from your log:
failed to acquire lock on obj_delete_at_hint.0000000079
Since Nautilus, the default rgw frontend is beast. Have you thought
about switching?
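If you want to try it, a minimal ceph.conf sketch for the switch could
look like the following. The section name client.rgw.gateway1, the port
and the certificate path are placeholders I made up, not values from
your cluster; adjust them to your setup and restart the radosgw daemons
afterwards.

```
# Hypothetical instance name; use the name of your own rgw daemon.
[client.rgw.gateway1]
# Plain HTTP on port 7480:
rgw_frontends = beast port=7480
# Or with SSL (placeholder path; here the PEM file is assumed to hold
# both certificate and private key):
# rgw_frontends = beast ssl_port=443 ssl_certificate=/etc/ceph/private/keyandcert.pem
```

I can't say whether this fixes the segfault, but it would take civetweb
out of the picture.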
Regards,
Eugen
[1] https://tracker.ceph.com/issues/22951
Quoting Brent Kennedy <bkennedy@xxxxxxxxxx>:
We are performing file maintenance (essentially deletes), and when the
process gets to a certain point, all four rados gateways crash with the
following:
Log output:
-5> 2020-10-20 06:09:53.996 7f15f1543700 2 req 7 0.000s s3:delete_obj verifying op params
-4> 2020-10-20 06:09:53.996 7f15f1543700 2 req 7 0.000s s3:delete_obj pre-executing
-3> 2020-10-20 06:09:53.996 7f15f1543700 2 req 7 0.000s s3:delete_obj executing
-2> 2020-10-20 06:09:53.997 7f161758f700 10 monclient: get_auth_request con 0x55d2c02ff800 auth_method 0
-1> 2020-10-20 06:09:54.009 7f1609d74700 5 process_single_shard(): failed to acquire lock on obj_delete_at_hint.0000000079
0> 2020-10-20 06:09:54.035 7f15f1543700 -1 *** Caught signal (Segmentation fault) **
in thread 7f15f1543700 thread_name:civetweb-worker
ceph version 14.2.11 (f7fdb2f52131f54b891a2ec99d8205561242cdaf) nautilus (stable)
1: (()+0xf5d0) [0x7f161d3405d0]
2: (()+0x2bec80) [0x55d2bcd1fc80]
3: (std::string::assign(std::string const&)+0x2e) [0x55d2bcd2870e]
4: (rgw_bucket::operator=(rgw_bucket const&)+0x11) [0x55d2bce3e551]
5: (RGWObjManifest::obj_iterator::update_location()+0x184) [0x55d2bced7114]
6: (RGWObjManifest::obj_iterator::operator++()+0x263) [0x55d2bd092793]
7: (RGWRados::update_gc_chain(rgw_obj&, RGWObjManifest&, cls_rgw_obj_chain*)+0x51a) [0x55d2bd0939ea]
8: (RGWRados::Object::complete_atomic_modification()+0x83) [0x55d2bd093c63]
9: (RGWRados::Object::Delete::delete_obj()+0x74d) [0x55d2bd0a87ad]
10: (RGWDeleteObj::execute()+0x915) [0x55d2bd04b6d5]
11: (rgw_process_authenticated(RGWHandler_REST*, RGWOp*&, RGWRequest*, req_state*, bool)+0x915) [0x55d2bcdfbb35]
12: (process_request(RGWRados*, RGWREST*, RGWRequest*, std::string const&, rgw::auth::StrategyRegistry const&, RGWRestfulIO*, OpsLogSocket*, optional_yield, rgw::dmclock::Scheduler*, int*)+0x1cd8) [0x55d2bcdfdea8]
13: (RGWCivetWebFrontend::process(mg_connection*)+0x38e) [0x55d2bcd41a1e]
14: (()+0x36bace) [0x55d2bcdccace]
15: (()+0x36d76f) [0x55d2bcdce76f]
16: (()+0x36dc18) [0x55d2bcdcec18]
17: (()+0x7dd5) [0x7f161d338dd5]
18: (clone()+0x6d) [0x7f161c84302d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
My guess is that we need to add more resources to the gateways? They are
virtual machines running CentOS 7.6 with 2 CPUs and 12 GB of memory. Any
thoughts?
-Brent
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx