Yehuda Sadeh writes:

> On Wed, Sep 17, 2014 at 7:39 AM, Abhishek L
> <abhishek.lekshmanan@xxxxxxxxx> wrote:
>>
>> Hi,
>>
>> I'm trying to understand the internals of RadosGW, specifically how
>> buckets/containers and objects are mapped back to rados objects. I couldn't
>> find any docs, but a previous mailing list discussion[1] explained
>> how S3/Swift objects are cut into rados objects, and how manifests work. I was
>> able to reconstruct a file uploaded to RadosGW by using the manifest to
>> figure out the rados object names and then fetching those rados objects.
>> For example:
>> ```
>> # random.txt is an 8 MB text file
>> [r@ra:~/ceph/src]$ s3 -us put my-first-bucket/random filename=random.txt
>> [r@ra:~/ceph/src]$ ./radosgw-admin object stat --bucket=my-first-bucket --object=random | grep prefix
>>     "prefix": "._op2xmptte2DD7z3_9EjQKgmmRcWRWL_",
>>
>> ```
>>
>> And then getting the objects via rados and joining them back together:
>>
>> ```
>> [r@ra:~/ceph/src]$ ./rados --pool .rgw.buckets ls | grep _op2xm
>> default.4124.1__shadow_._op2xmptte2DD7z3_9EjQKgmmRcWRWL_2
>> default.4124.1__shadow_._op2xmptte2DD7z3_9EjQKgmmRcWRWL_1
>> [r@ra:~/ceph/src]$ ./rados get default.4124.1_random random.part0 --pool .rgw.buckets
>> [r@ra:~/ceph/src]$ ./rados get default.4124.1__shadow_._op2xmptte2DD7z3_9EjQKgmmRcWRWL_1 random.part1 --pool .rgw.buckets
>> [r@ra:~/ceph/src]$ ./rados get default.4124.1__shadow_._op2xmptte2DD7z3_9EjQKgmmRcWRWL_2 random.part2 --pool .rgw.buckets
>>
>> # Now join the objects back
>> [r@ra:~/ceph/src]$ cat random.part0 random.part1 random.part2 > random.rados.txt
>> [r@ra:~/ceph/src]$ diff random.txt random.rados.txt
>> ```
>>
>> I'm trying to find similar information on how radosgw ends up storing
>> the buckets & metadata in rados objects, what information is
>> contained within them, and how they are updated when, say, an object is
>> added. I was able to find the bucket name & bucket metadata being
>> stored in the .rgw pool, but I'm not sure how the bucket knows which objects it
>> has, or which buckets are owned by a user, etc.
>>
>
> The bucket doesn't know who owns each object; this info is stored in
> the object's info. The bucket index is stored as omap information in
> the bucket instance object.

Ah thanks, I was able to list the objects for a bucket by getting the
omap keys from the .rgw.buckets.index pool:

```
[r@ra:~/ceph/src](⎇ master)$ ./rados -p .rgw.buckets.index ls
.dir.default.4124.2
.dir.default.4124.1
[r@ra:~/ceph/src](⎇ master)$ ./rados -p .rgw.buckets.index listomapkeys .dir.default.4124.1
big-object
file-1
object-8
random
```

> The list of buckets per user is kept in
> the user metadata object (also as omap information). There's a rados
> command that lets you list the omap keys for each rados object.

I was able to get this as well, by inspecting the <uid>.buckets objects in
the .users.uid pool:

```
./rados -p .users.uid listomapkeys testid.buckets
another-bucket
my-first-bucket
```

Thanks for the info. I'll try to combine these mailing list discussions
into something of a starting point for developer docs on storage in radosgw.

Cheers
--
Abhishek
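
For reference, the naming pattern seen in the example above can be sketched as a small shell helper. This is only an illustration based on the object names shown in this thread, not a description of how radosgw itself builds names: `rgw_rados_names` is a hypothetical function, the part count is assumed to be known in advance, and other layouts (multipart uploads, for instance) may be named differently.

```shell
# Hypothetical helper: print the rados object names that make up one
# RadosGW object, following the naming observed in this thread.
rgw_rados_names() {
    marker=$1   # bucket marker, e.g. default.4124.1
    object=$2   # S3/Swift object name, e.g. random
    prefix=$3   # "prefix" value from `radosgw-admin object stat`
    nparts=$4   # number of shadow objects (assumed to be known)

    # Head object: <marker>_<object>
    printf '%s_%s\n' "$marker" "$object"

    # Shadow objects: <marker>__shadow_<prefix><n>, with n starting at 1
    i=1
    while [ "$i" -le "$nparts" ]; do
        printf '%s__shadow_%s%s\n' "$marker" "$prefix" "$i"
        i=$((i + 1))
    done
}
```

Calling `rgw_rados_names default.4124.1 random ._op2xmptte2DD7z3_9EjQKgmmRcWRWL_ 2` prints the head object followed by the two shadow objects in order, which is the same order the parts were concatenated in above. Each name could then be passed to `./rados get <name> <outfile> --pool .rgw.buckets`.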