Looks like I need to upgrade to Firefly to get ceph-kvstore-tool before
I can proceed. I am getting some hits just from grepping the LevelDB
store, but so far nothing has panned out.

Thanks for the help!

On Tue, Aug 19, 2014 at 10:27 AM, Gregory Farnum <greg at inktank.com> wrote:

> It's been a while since I worked on this, but let's see what I remember...
>
> On Thu, Aug 14, 2014 at 11:34 AM, Craig Lewis <clewis at centraldesktop.com>
> wrote:
> > In my effort to learn more of the details of Ceph, I'm trying to
> > figure out how to get from an object name in RadosGW, through the
> > layers, down to the files on disk.
> >
> > clewis@clewis-mac ~ $ s3cmd ls s3://cpltest/
> > 2014-08-13 23:02 14M 28dde9db15fdcb5a342493bc81f91151 s3://cpltest/vmware-freebsd-tools.tar.gz
> >
> > Looking at the .rgw pool's contents tells me that the cpltest bucket
> > is default.73886.55:
> > root@dev-ceph0:/var/lib/ceph/osd/ceph-0/current# rados -p .rgw ls | grep cpltest
> > cpltest
> > .bucket.meta.cpltest:default.73886.55
>
> Okay, what you're seeing here are two different types, whose names I'm
> not going to get right:
> 1) The bucket link "cpltest", which maps from the name "cpltest" to a
> "bucket instance". The contents of cpltest, or one of its xattrs, are
> pointing at ".bucket.meta.cpltest:default.73886.55"
> 2) The "bucket instance" .bucket.meta.cpltest:default.73886.55. I
> think this contains the bucket index (list of all objects), etc.
>
> > The rados objects that belong to that bucket are:
> > root@dev-ceph0:~# rados -p .rgw.buckets ls | grep default.73886.55
> > default.73886.55__shadow__RpwwfOt2X-mhwU65Qa1OHDi--4OMGvQ_1
> > default.73886.55__shadow__RpwwfOt2X-mhwU65Qa1OHDi--4OMGvQ_3
> > default.73886.55_vmware-freebsd-tools.tar.gz
> > default.73886.55__shadow__RpwwfOt2X-mhwU65Qa1OHDi--4OMGvQ_2
> > default.73886.55__shadow__RpwwfOt2X-mhwU65Qa1OHDi--4OMGvQ_4
>
> Okay, so when you ask RGW for the object vmware-freebsd-tools.tar.gz
> from the cpltest bucket, it will look up (or, if we're lucky, have
> cached) the cpltest link, and find out that the "bucket prefix" is
> default.73886.55. It will then try and access the object
> "default.73886.55_vmware-freebsd-tools.tar.gz" (whose construction I
> hope is obvious: bucket instance ID as a prefix, _ as a separator,
> then the object name). This RADOS object is called the "head" for the
> RGW object. In addition to (usually) the beginning bit of data, it
> will also contain some xattrs with things like a "tag" for any extra
> RADOS objects which include data for this RGW object. In this case,
> that tag is "RpwwfOt2X-mhwU65Qa1OHDi--4OMGvQ". (This construction is
> how we do atomic overwrites of RGW objects which are larger than a
> single RADOS object, in addition to a few other things.)
>
> I don't think there's any way of mapping from a shadow (tail) object
> name back to its RGW name, but if you look at the rados object xattrs,
> there might (or might not) be an attr which contains the parent
> object in one form or another. Check that out.
>
> (Or, if you want to check out the source, I think all the relevant
> bits for this are somewhere in the
> -Greg
> Software Engineer #42 @ http://inktank.com | http://ceph.com
>
> > I know those shadow__RpwwfOt2X-mhwU65Qa1OHDi--4OMGvQ_ files are the
> > rest of vmware-freebsd-tools.tar.gz. I can infer that because this
> > bucket only has a single file (and the sum of the sizes matches).
> > With many files, I can't infer the link anymore.
> >
> > How do I look up that link?
> >
> > I tried reading the src/rgw/rgw_rados.cc, but I'm getting lost.
> >
> >
> >
> > My real goal is the reverse. I recently repaired an inconsistent PG.
> > The primary replica had the bad data, so I want to verify that the
> > repaired object is correct. I have a database that stores the SHA256
> > of every object. If I can get from the filename on disk back to an S3
> > object, I can verify the file. If it's bad, I can restore from the
> > replicated zone.
> >
> >
> > Aside from today's task, I think it's really handy to understand these
> > low level details. I know it's been handy in the past, when I had
> > disk corruption under my PostgreSQL database. Knowing (and
> > practicing) ahead of time really saved me a lot of downtime then.
> >
> >
> > Thanks for any pointers.
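
For anyone following along, the naming scheme Greg describes (bucket instance ID, then "_", then the object name for the head, and "__shadow__<tag>_<N>" for the tail pieces) can be sketched in a few lines of Python. This is only an illustration based on the `rados ls` output quoted in this thread. The function names `classify` and `group_by_tag` are made up for this example and are not part of Ceph, and real RGW naming has more cases (multipart uploads, escaped object names) than this handles. It also does not solve the tag-to-head lookup: that mapping lives in the head object's xattrs, which the sketch never reads.

```python
from collections import defaultdict

# Marker seen in the tail object names in the listing above.
# Assumption: taken from this thread's output, not from the RGW source.
SHADOW_MARKER = "__shadow__"


def classify(rados_name, bucket_id):
    """Split a RADOS object name from the .rgw.buckets pool.

    Returns (kind, key, part):
      kind  "head" or "shadow"
      key   the RGW object name for a head, the tag for a shadow object
      part  the trailing number of a shadow object, None for a head
    """
    if not rados_name.startswith(bucket_id + "_"):
        raise ValueError("%r is not in bucket %r" % (rados_name, bucket_id))
    rest = rados_name[len(bucket_id):]
    if rest.startswith(SHADOW_MARKER):
        # The tag itself may contain "-" but the part number follows the
        # last "_", so split from the right.
        tag, _, part = rest[len(SHADOW_MARKER):].rpartition("_")
        return ("shadow", tag, int(part))
    # Head object: drop the single "_" separator after the bucket ID.
    return ("head", rest[1:], None)


def group_by_tag(names, bucket_id):
    """Collect head object names and the shadow part numbers per tag."""
    heads = []
    tails = defaultdict(list)
    for name in names:
        kind, key, part = classify(name, bucket_id)
        if kind == "head":
            heads.append(key)
        else:
            tails[key].append(part)
    return heads, {tag: sorted(parts) for tag, parts in tails.items()}


if __name__ == "__main__":
    listing = [
        "default.73886.55__shadow__RpwwfOt2X-mhwU65Qa1OHDi--4OMGvQ_1",
        "default.73886.55__shadow__RpwwfOt2X-mhwU65Qa1OHDi--4OMGvQ_3",
        "default.73886.55_vmware-freebsd-tools.tar.gz",
        "default.73886.55__shadow__RpwwfOt2X-mhwU65Qa1OHDi--4OMGvQ_2",
        "default.73886.55__shadow__RpwwfOt2X-mhwU65Qa1OHDi--4OMGvQ_4",
    ]
    heads, tails = group_by_tag(listing, "default.73886.55")
    print(heads)
    print(tails)
```

With more than one file in the bucket this would at least group the tail pieces by tag. Matching each tag to its head object would still need the xattr lookup Greg suggests.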