Re: One object in .rgw.buckets.index causes systemic instability


 



We had a similar issue in Firefly, where we had a very large number (about 1,500,000) of buckets for a single RGW user. We observed a number of slow requests in day-to-day use, but did not think much of it at the time.


At one point the primary OSD managing the list of buckets for that user crashed and could not restart, because processing the tremendous number of buckets on startup - which also seemed to be single-threaded, judging by the 100% CPU usage we could see - took longer than the suicide timeout. That led to the OSD crashing again and again. Eventually it would be marked out, and the secondary tried to process the list with the same result, leading to a cascading failure.


While I am quite certain it is a different code path in your case (you speak about a handful of buckets), it certainly sounds like a very similar issue. Do you have lots of objects in those few buckets, or are they few but large in size, reaching the 30 TB? Worst case, you might be in for a procedure similar to the one we had to take: take load off the cluster, increase the timeouts to ridiculous levels, and copy the data over into a more evenly distributed set of buckets (users, in our case). Fortunately, as long as we did not try to write to the problematic buckets, we could still read from them.
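For reference, the "ridiculous" timeout bump looked roughly like the following ceph.conf fragment. The option names existed in Firefly/Hammer, but the values here are arbitrary placeholders - large enough that the startup scan could finish - and should be reverted once recovery completes:

```ini
[osd]
# Let op threads survive a long startup scan instead of
# triggering the internal heartbeat/suicide checks.
osd_op_thread_timeout = 600
osd_op_thread_suicide_timeout = 2400
filestore_op_thread_timeout = 1200
filestore_op_thread_suicide_timeout = 2400
```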


Please note that this is only a guess - I could be completely wrong.


Daniel


On 2015-11-03 13:33:19 +0000, Gerd Jakobovitsch said:


Dear all,


I have a cluster running hammer (0.94.5), with 5 nodes. The main usage is for S3-compatible object storage.

I have run into a very troublesome problem on a Ceph cluster. A single object in .rgw.buckets.index is not responding to requests and takes a very long time to recover after an OSD restart. During this time, the OSDs to which this object is mapped get heavily loaded, with high CPU as well as memory usage. At the same time, the directory /var/lib/ceph/osd/ceph-XX/current/omap accumulates a large number of entries (> 10000) that won't decrease.
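For what it's worth, the size of that index object's omap, and which OSDs hold it, can be checked directly (pool and object names taken from this thread; these commands need to run against the live cluster, so adjust them to your setup):

    # Count omap keys on the suspect bucket index object
    rados -p .rgw.buckets.index listomapkeys .dir.default.198764998.1 | wc -l

    # Show which PG and which acting OSDs the object maps to
    ceph osd map .rgw.buckets.index .dir.default.198764998.1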


I frequently get more than 100 blocked requests for this object, and the main OSD that stores it ends up accepting no other requests. The OSD often crashes due to the filestore timeout, and getting it up again is very troublesome - it usually has to run alone on the node for a long time until the object somehow gets recovered.


The OSD logs contain several entries like these:

 -7051> 2015-11-03 10:46:08.339283 7f776974f700 10 log_client  logged 2015-11-03 10:46:02.942023 osd.63 10.17.0.9:6857/2002 41 : cluster [WRN] slow request 120.003081 seconds old, received at 2015-11-03 10:43:56.472825: osd_repop(osd.53.236531:7 34.7 8a7482ff/.dir.default.198764998.1/head//34 v 236984'22) currently commit_sent

2015-11-03 10:28:32.405265 7f0035982700  0 log_channel(cluster) log [WRN] : 97 slow requests, 1 included below; oldest blocked for > 2046.502848 secs
2015-11-03 10:28:32.405269 7f0035982700  0 log_channel(cluster) log [WRN] : slow request 1920.676998 seconds old, received at 2015-11-03 09:56:31.728224: osd_op(client.210508702.0:14696798 .dir.default.198764998.1 [call rgw.bucket_prepare_op] 15.8a7482ff ondisk+write+known_if_redirected e236956) currently waiting for blocked object


Is there any way to dig deeper into this problem, or to rebuild the .rgw index without losing data? I currently have 30 TB of data in the cluster - most of it concentrated in a handful of buckets - that I can't lose.


Regards.

-- 


--

[Translated from Portuguese:] The information contained in this message is CONFIDENTIAL, protected by legal privilege and by copyright. Disclosure, distribution, reproduction, or any other use of the contents of this document requires the sender's authorization; violators are subject to legal sanctions. If you have received this communication in error, please notify us immediately by replying to this message.


_______________________________________________

ceph-users mailing list

ceph-users@xxxxxxxxxxxxxx

http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 

Daniel Schneller

Principal Cloud Engineer

 

CenterDevice GmbH

https://www.centerdevice.de

