Re: Civet RadosGW S3 not storing complete objects; civetweb logs stop after rotation

Yehuda Sadeh-Weinraub <yehuda@xxxxxxxxxx> · Thu, 7 May 2015 01:09:45 -0400 (EDT)

----- Original Message -----
> From: "Sean" <seapasulli@xxxxxxxxxxxx>
> To: "Yehuda Sadeh-Weinraub" <yehuda@xxxxxxxxxx>
> Cc: ceph-users@xxxxxxxxxxxxxx
> Sent: Tuesday, May 5, 2015 12:14:19 PM
> Subject: Re:  Civet RadosGW S3 not storing complete objects; civetweb logs stop after rotation
> 
> 
> 
> Hello Yehuda and the rest of the mailing list.
> 
> 
> My main question currently is why are the bucket index and the object
> manifest ever different? Based on how we are uploading data I do not think
> that the rados gateway should ever know the full file size without having
> all of the objects within ceph at one point in time. So after the multipart
> is marked as completed Rados gateway should cat through all of the objects
> and make a complete part, correct?

That's what *should* happen, but obviously there's some bug there.

> 
> 
> 
> Secondly,
> 
> I think I am not understanding the process to grab all of the parts
> correctly. To continue to use my example file
> "86b6fad8-3c53-465f-8758-2009d6df01e9/TCGA-A2-A0T7-01A-21D-A099-09_IlluminaGA-DNASeq_exome.bam"
> in bucket tcga_cghub_protected. I would be using the following to grab the
> prefix:
> 
> 
> prefix=$(radosgw-admin object stat --bucket=tcga_cghub_protected \
>   --object=86b6fad8-3c53-465f-8758-2009d6df01e9/TCGA-A2-A0T7-01A-21D-A099-09_IlluminaGA-DNASeq_exome.bam \
>   | grep -iE '"prefix"' | awk -F"\"" '{print $4}')
> 
> 
> Which should take everything between quotes for the prefix key and give me
> the value.
> 
> 
> In this case:
> 
> "prefix":
> "86b6fad8-3c53-465f-8758-2009d6df01e9\/TCGA-A2-A0T7-01A-21D-A099-09_IlluminaGA-DNASeq_exome.bam.2\/YAROhWaAm9LPwCHeP55cD4CKlLC0B4S",
> 
> 
> So
> 
> lacadmin@kh10-9:~$ echo ${prefix}
> 
> 86b6fad8-3c53-465f-8758-2009d6df01e9\/TCGA-A2-A0T7-01A-21D-A099-09_IlluminaGA-DNASeq_exome.bam.2\/YAROhWaAm9LPwCHeP55cD4CKlLC0B4S
> 
> 
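Note that the value echoed above still carries the JSON escaping (`\/`). A regex grep may happen to tolerate that, but a fixed-string match (`grep -F`) or any direct name comparison would not. A minimal sketch of cleaning it up, assuming `\/` is the only escape present in the value; `unescape_json_slashes` is a hypothetical helper name, not part of any Ceph tool:

```shell
# Hypothetical helper: turn JSON-escaped "\/" back into "/".
# Assumes no other JSON escapes occur in the prefix value.
unescape_json_slashes() {
    sed 's#\\/#/#g'
}

# Example using the escaped prefix shown above.
escaped='86b6fad8-3c53-465f-8758-2009d6df01e9\/TCGA-A2-A0T7-01A-21D-A099-09_IlluminaGA-DNASeq_exome.bam.2\/YAROhWaAm9LPwCHeP55cD4CKlLC0B4S'
prefix=$(printf '%s\n' "$escaped" | unescape_json_slashes)
printf '%s\n' "$prefix"
```

The cleaned value can then be matched literally, for example with `rados -p .rgw.buckets ls | grep -F -- "$prefix"`.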
> From here I list all of the objects in the .rgw.buckets pool and grep for
> that prefix, which yields 1335 objects. From here, if I cat all of these
> objects together I only end up with a 5468160 byte file which is 2G short of
> what the object manifest says it should be. If I grab the file and tail the
> Rados gateway log I end up with 1849 objects and when I sum them all up I

How are these objects named?

> end up with 7744771642 which is the same size that the manifest reports. I
> understand that this does nothing other than verify the manifest's accuracy
> but I still find it interesting. The missing chunks may still exist in ceph
> outside of the object manifest and tagged with the same prefix, correct? Or
> am I misunderstanding something?

Either it's missing a chunk, or one of the objects is truncated. Can you stat all the parts? I expect most of the objects to have one of two sizes (e.g., 4MB, 1MB), but it is likely that the last part is smaller, and there may be another object that is missing 512k.
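One way to spot such an odd-sized object is to tally the sizes once the `rados stat` output for every part has been saved, one object per line. This is only a sketch: it assumes each line ends with the word `size` followed by a byte count (roughly `<pool>/<name> mtime <timestamp>, size <bytes>`), and `extract_sizes`, `size_histogram`, and `total_bytes` are hypothetical helper names.

```shell
# Hypothetical helper: print the byte count from each stat line on stdin.
# Assumes the last two fields of a line are "size" and the byte count.
extract_sizes() {
    awk 'NF >= 2 && $(NF - 1) == "size" { print $NF }'
}

# Print each distinct size with the number of objects that have it,
# smallest size first, so the rare sizes stand out.
size_histogram() {
    extract_sizes | sort -n | uniq -c | awk '{ print $2 " bytes x " $1 }'
}

# Print the sum of all sizes, for comparison with the manifest size.
total_bytes() {
    extract_sizes | awk '{ total += $1 } END { printf "%.0f\n", total }'
}
```

With the stat output in a file (say `parts.txt`, a made-up name), `size_histogram < parts.txt` should show two dominant sizes plus any stragglers, and `total_bytes < parts.txt` gives the number to compare against what the manifest reports.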

> 
> 
> We have over 40384 files in the tcga_cghub_protected bucket and only 66 of
> these files are suffering from this truncation issue. What I need to know
> is: is this happening on the gateway side or on the client side? Next I need
> to know what possible actions can occur where the bucket index and the
> object manifest would be mismatched like this as 40318 out of 40384 are
> working without issue.
> 
> 
> The truncated files are of all different sizes (5 megabytes - 980 gigabytes)
> and the truncation seems to be all over. By "all over" I mean some files are
> missing the first few bytes that should read "bam" and some are missing
> parts in the middle.

Can you give an example of an object manifest for a broken object, along with all the rados objects that build it (e.g., the output of 'rados stat' on these objects)? A smaller object might be easier.
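Something along these lines could collect that output. It is a rough sketch, assuming the parts live in a single pool and that their names contain the (unescaped) manifest prefix as a literal substring; `stat_matching_objects` is a hypothetical helper name, and listing a large pool this way can take a long time.

```shell
# Hypothetical helper: run `rados stat` on every object in a pool whose
# name contains the given literal string.
# Usage: stat_matching_objects POOL PREFIX
stat_matching_objects() {
    _pool=$1
    _prefix=$2
    # grep -F treats the prefix as a fixed string, not a regex.
    rados -p "$_pool" ls | grep -F -- "$_prefix" | while IFS= read -r _obj; do
        # Redirect stdin so the loop's input is never consumed by rados.
        rados -p "$_pool" stat "$_obj" </dev/null
    done
}
```

For example, `stat_matching_objects .rgw.buckets "$prefix" > parts.txt`, sent together with the `radosgw-admin object stat` output for the same object.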

> 
> 
> So our upload code is using mmap to stream chunks of the file to the Rados
> gateway via a multipart upload, but nowhere on the client side do we have a
> direct reference to the files we are using, nor do we specify the size in
> any way. So where is the gateway getting the correct complete file size from
> and how is the bucket index showing the intended file size?
> 
> 
> This implies that, at some point in time, ceph was able to see all of the
> parts of the file and calculate the correct total size. This to me seems
> like a rados gateway bug regardless of how the file is being uploaded. I
> think that the RGW should be able to be fuzzed and still store the data
> correctly.
> 
> 
> Why is the bucket list not matching the bucket index and how can I verify
> that the data is not being corrupted by the RGW or worse, after it is
> committed to ceph?

That's what we're trying to find out.

Thanks,
Yehuda