Re: Civet RadosGW S3 not storing complete objects; civetweb logs stop after rotation

Hey there,

Sorry for the delay. I have been moving apartments, ugh. Our dev team found a way to quickly identify the files that download at a smaller size than reported::

Iterate through all of the objects in a bucket, take key.size for each item, and compare it to conn.get_bucket().get_key().size for the same key. Wherever the two sizes differ, those keys correspond exactly to the objects that appear to be missing data in Ceph.
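
Roughly, that check looks like this (untested sketch; the credentials, gateway host, and bucket name below are placeholders)::

----
import boto
import boto.s3.connection

# Placeholder credentials / endpoint -- fill in for the real gateway.
conn = boto.connect_s3(
    aws_access_key_id='XXXXXXXX',
    aws_secret_access_key='XXXXXXXX',
    host='rgw.example.org',
    calling_format=boto.s3.connection.OrdinaryCallingFormat(),
)

bucket = conn.get_bucket('some-bucket')        # placeholder bucket name
for listed in bucket.list():                   # size as reported by the bucket listing
    fetched = bucket.get_key(listed.name)      # size as reported by a HEAD on the key
    if fetched is not None and fetched.size != listed.size:
        print('%s listed=%d head=%d diff=%d' % (
            listed.name, listed.size, fetched.size,
            abs(listed.size - fetched.size)))
----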

The size differences always seem to be multiples of 512k as well, which is really odd.

==================
http://pastebin.com/R34wF7PB
==================

My main question is: why are these sizes different at all? Shouldn't they be exactly the same? And why are they off by multiples of 512k? Finally, I need a way to rule out that this is a Ceph issue, and the only way I can think of is to grab a list of all of the underlying data objects and concatenate them together in order, in the hope that the manifest is wrong and I end up with the whole file (a rough sketch of what I have in mind is below the example).

For example::

implicit size 7745820218, explicit size 7744771642, absolute difference 1048576; name = 86b6fad8-3c53-465f-8758-2009d6df01e9/TCGA-A2-A0T7-01A-21D-A099-09_IlluminaGA-DNASeq_exome.bam
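
For the rados-level check mentioned above, something like this is what I have in mind (untested sketch using the python-rados bindings; the pool name, the prefix, and the way I pick and order the pieces are guesses -- the authoritative layout would have to come from the object's manifest)::

----
import rados

PREFIX = 'PUT-THE-OBJECTS-RADOS-PREFIX-HERE'    # taken from the object's manifest / piece names

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
ioctx = cluster.open_ioctx('.rgw.buckets')      # guessed RGW data pool name

# Collect every rados object whose name contains the prefix and dump them back
# to back. A plain lexical sort is almost certainly NOT the real part order for
# a multipart upload, and listing a 1.5PB pool is slow, so this is only a crude
# "is the data actually there" test.
pieces = sorted(obj.key for obj in ioctx.list_objects() if PREFIX in obj.key)
with open('reassembled.bam', 'wb') as out:
    for oid in pieces:
        size, _mtime = ioctx.stat(oid)
        offset = 0
        while offset < size:
            chunk = ioctx.read(oid, 4 * 1024 * 1024, offset)
            out.write(chunk)
            offset += len(chunk)

ioctx.close()
cluster.shutdown()
----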

I explicitly called one of the gateways and then piped the output to a text file while downloading this bam:

https://drive.google.com/file/d/0B16pfLB7yY6GcTZXalBQM3RHT0U/view?usp=sharing (25 Mb of text)

As we can see above, Ceph is reporting the size as 7745820218 bytes somewhere, but when I download the object I get a 7744771642-byte file. And if I do a range request for all of the bytes from 7744771642 to the end, I get a "cannot complete request" error::


http://pastebin.com/CVvmex4m -- traceback of the python range request.
http://pastebin.com/4sd1Jc0G -- the radoslog of the range request

If I request the file with a slightly shorter range start (7744771642 minus 2 bytes, i.e. 7744771640) I am left with just a 2-byte file::

http://pastebin.com/Sn7Y0t9G -- range request of file - 2 bytes to end of file.
lacadmin@kh10-9:~$ ls -lhab 7gtest-range.bam
-rw-r--r-- 1 lacadmin lacadmin 2 Feb 24 01:00 7gtest-range.bam
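
For reference, that range request is essentially the following (same placeholder connection details as in the first sketch; the bucket name is a guess)::

----
import boto
import boto.s3.connection

conn = boto.connect_s3(
    aws_access_key_id='XXXXXXXX',
    aws_secret_access_key='XXXXXXXX',
    host='rgw.example.org',                    # placeholder gateway host
    calling_format=boto.s3.connection.OrdinaryCallingFormat(),
)

key = conn.get_bucket('some-bucket').get_key(  # placeholder bucket name
    '86b6fad8-3c53-465f-8758-2009d6df01e9/'
    'TCGA-A2-A0T7-01A-21D-A099-09_IlluminaGA-DNASeq_exome.bam')

# Start two bytes before the size the gateway actually serves (7744771642).
# Against the advertised size of 7745820218 this range should return a bit
# over 1 MB, but all that comes back is the 2 bytes up to the truncation point.
key.get_contents_to_filename('7gtest-range.bam',
                             headers={'Range': 'bytes=7744771640-'})
----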


I think that radosgw may not be keeping track of errors in the multipart chunks, possibly? How did RGW record the original (and presumably correct) file size, and why does it come up short when it returns the actual chunks? Finally, why are the corrupt/missing chunks always a multiple of 512K? I do not see anything obvious that is set to 512K on the configuration/user side.
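
If this radosgw-admin build has the "object stat" subcommand, dumping what the gateway has recorded for the key (including its manifest) might show where the 7745820218 figure comes from. Roughly (bucket name is a placeholder, and the exact JSON fields may differ by release)::

----
import json
import subprocess

BUCKET = 'some-bucket'   # placeholder bucket name
KEY = ('86b6fad8-3c53-465f-8758-2009d6df01e9/'
       'TCGA-A2-A0T7-01A-21D-A099-09_IlluminaGA-DNASeq_exome.bam')

# Assumes 'radosgw-admin object stat' is available and prints JSON; the
# interesting bits are the recorded object size and the manifest layout.
out = subprocess.check_output(
    ['radosgw-admin', 'object', 'stat',
     '--bucket=%s' % BUCKET, '--object=%s' % KEY])
print(json.dumps(json.loads(out), indent=2))
----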


Sorry for the questions and babbling, but I am at a loss as to how to address this.





On 04/28/2015 05:03 PM, Yehuda Sadeh-Weinraub wrote:

----- Original Message -----
From: "Sean" <seapasulli@xxxxxxxxxxxx>
To: ceph-users@xxxxxxxxxxxxxx
Sent: Tuesday, April 28, 2015 2:52:35 PM
Subject:  Civet RadosGW S3 not storing complete objects; civetweb logs stop after rotation

Hey yall!

I have a weird issue and I am not sure where to look, so any help would
be appreciated. I have a large Ceph Giant cluster that has been stable
and healthy almost entirely since its inception. We have stored over
1.5PB in the cluster so far through RGW and everything seems to be
functioning great. We have downloaded smaller objects without issue, but
last night we did a test on our largest file (almost 1 terabyte) and it
continuously times out at almost exactly the same place. Investigating
further, it looks like Civetweb/RGW reported that the uploads
completed even though the objects are truncated. At least when we
download the objects, they seem to be truncated.

I have tried searching through the mailing list archives to see what may
be going on, but it looks like the mailing list DB may be going through
some maintenance:

----
Unable to read word database file
'/dh/mailman/dap/archives/private/ceph-users-ceph.com/htdig/db.words.db'
----

After checking through the gzipped logs I see that civetweb just stops
logging after a rotation for some reason as well, and my last log is from
the 28th of March. I tried manually running /etc/init.d/radosgw reload,
but this didn't seem to work. Since running the download again could take
all day to error out, we instead used a range request to try and pull
the missing bytes.

https://gist.github.com/MurphyMarkW/8e356823cfe00de86a48 -- here is the
code we are using to download via S3 / boto, as well as the returned size
report and an overview of our issue.
http://pastebin.com/cVLdQBMF -- here is some of the log from the civetweb
server they are hitting.

Here is our current config ::
http://pastebin.com/2SGfSDYG

Current output of ceph health::
http://pastebin.com/3f6iJEbu

I am thinking that this must be a civetweb/radosgw bug of some kind. My
questions are: 1.) Is there a way to try and download the object via rados
directly? I am guessing I will need to find the prefix and then just cat
all of the pieces together and hope I get the order right. 2.) Why would
Ceph say the upload went fine but then return a smaller object?



Note that the returned HTTP response is 206 (Partial Content):
/var/log/radosgw/client.radosgw.log:2015-04-28 16:08:26.525268 7f6e93fff700  2 req 0:1.067030:s3:GET /tcga_cghub_protected/ff9b730c-d303-4d49-b28f-e0bf9d8f1c84/759366461d2bf8bb0583d5b9566ce947.bam:get_obj:http status=206

It'll only return that if partial content is requested (through the HTTP Range header). It's really hard to tell from these logs whether there's any actual problem. I suggest bumping up the log level (debug ms = 1, debug rgw = 20) and taking a look at an entire request (one that includes all the request HTTP headers).

Yehuda




_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com





