On Tue, Aug 9, 2016 at 7:39 AM, George Mihaiescu <lmihaiescu@xxxxxxxxx> wrote:
> Look in the cinder db, in the volumes table, to find the UUID of the deleted volume.

You could also look through the logs from the time of the delete; I suspect you should be able to see how the rbd image was prefixed/named when it was deleted.

HTH,
Brad

> If you go through your OSDs and look for the directories for PG index 20, you might find some fragments from the deleted volume, but it's a long shot...
>
>> On Aug 8, 2016, at 4:39 PM, Georgios Dimitrakakis <giorgis@xxxxxxxxxxxx> wrote:
>>
>> Dear David (and all),
>>
>> the data are considered very critical, hence all this effort to recover them.
>>
>> Although the cluster hasn't been fully stopped, all user actions have. I mean services are running, but users are not able to read/write/delete.
>>
>> The deleted image was the exact same size as the example (500 GB), but it wasn't the only one deleted today. Our user was trying to do a "massive" cleanup by deleting 11 volumes, and unfortunately one of them was very important.
>>
>> Let's assume that I "dd" all the drives; what further actions should I take to recover the files? Could you please elaborate a bit more on the phrase "If you've never deleted any other rbd images and assuming you can recover data with names, you may be able to find the rbd objects"?
>>
>> Do you mean that if I know the file names I can go through and check for them? How?
>> Do I have to know *all* file names, or can I find all the data that exist by searching for a few of them?
>>
>> Thanks a lot for taking the time to answer my questions!
>>
>> All the best,
>>
>> G.
>>
>>> I don't think there's a way of getting the prefix from the cluster at this point.
>>>
>>> If the deleted image was a similar size to the example you've given, you will likely have had objects on every OSD.
>>> If this data is absolutely critical you need to stop your cluster immediately, or make copies of all the drives with something like dd.
>>> If you've never deleted any other rbd images, and assuming you can recover data with names, you may be able to find the rbd objects.
>>>
>>> On Mon, Aug 8, 2016 at 7:28 PM, Georgios Dimitrakakis wrote:
>>>
>>>>>> Hi,
>>>>>>
>>>>>> On 08.08.2016 10:50, Georgios Dimitrakakis wrote:
>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>>> On 08.08.2016 09:58, Georgios Dimitrakakis wrote:
>>>>>>>>>
>>>>>>>>> Dear all,
>>>>>>>>>
>>>>>>>>> I would like your help with an emergency issue, but first let me describe our environment.
>>>>>>>>>
>>>>>>>>> Our environment consists of 2 OSD nodes with 10x 2TB HDDs each and 3 MON nodes (2 of them are the OSD nodes as well), all with ceph version 0.80.9 (b5a67f0e1d15385bc0d60a6da6e7fc810bde6047).
>>>>>>>>>
>>>>>>>>> This environment provides RBD volumes to an OpenStack Icehouse installation.
>>>>>>>>>
>>>>>>>>> Although not a state-of-the-art environment, it is working well and within our expectations.
>>>>>>>>>
>>>>>>>>> The issue now is that one of our users accidentally deleted one of the volumes without keeping its data first!
>>>>>>>>>
>>>>>>>>> Is there any way (since the data are considered critical and very important) to recover them from CEPH?
>>>>>>>>
>>>>>>>> Short answer: no
>>>>>>>>
>>>>>>>> Long answer: no, but...
>>>>>>>>
>>>>>>>> Consider the way Ceph stores data: each RBD is striped into chunks (RADOS objects, 4 MB in size by default); the chunks are distributed among the OSDs with the configured number of replicas (probably two in your case, since you use 2 OSD hosts). RBD uses thin provisioning, so chunks are allocated upon first write access.
>>>>>>>> If an RBD is deleted, all of its chunks are deleted on the corresponding OSDs. If you want to recover a deleted RBD, you need to recover all individual chunks. Whether this is possible depends on your filesystem and whether the space of a former chunk has already been assigned to other RADOS objects. The RADOS object names are composed of the RBD name and the offset position of the chunk, so if an undelete mechanism exists for the OSD's filesystem, you have to be able to recover files by their filename, otherwise you might end up mixing the content of various deleted RBDs. Due to the thin provisioning there might be some chunks missing (e.g. never allocated before).
>>>>>>>>
>>>>>>>> Given the facts that
>>>>>>>> - you probably use XFS on the OSDs, since it is the preferred filesystem for OSDs (there is RDR-XFS, but I've never had to use it)
>>>>>>>> - you would need to stop the complete ceph cluster (recovery tools do not work on mounted filesystems)
>>>>>>>> - your cluster has been in use after the RBD was deleted, and thus parts of its former space might already have been overwritten (replication might help you here, since there are two OSDs to try)
>>>>>>>> - XFS undelete does not work well on fragmented files (and OSDs tend to introduce fragmentation...)
>>>>>>>>
>>>>>>>> the answer is no, since it might not be feasible and the chances of success are way too low.
>>>>>>>>
>>>>>>>> If you want to spend time on it, I would propose to stop the ceph cluster as soon as possible, create copies of all involved OSDs, start the cluster again and attempt the recovery on the copies.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Burkhard
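For reference, a minimal sketch of the "copy first, recover on the copy" step David and Burkhard describe above, assuming each OSD data disk is a whole device such as /dev/sdb and that /mnt/backup is separate storage with enough room; the device name, OSD id and target path are placeholders, not values taken from this cluster:

  # stop the OSD so its filesystem is quiescent (firefly/sysvinit style):
  service ceph stop osd.0
  # raw, error-tolerant copy of the OSD data disk onto separate storage:
  dd if=/dev/sdb of=/mnt/backup/osd-0.img bs=4M conv=noerror,sync

Any undelete or carving attempt would then be pointed at the image files (or read-only loop devices backed by them) rather than at the live disks.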
>>>>>>>
>>>>>>> Hi! Thanks for the info... I understand that this is a very difficult and probably not feasible task, but in case I need to try a recovery, what other info would I need?
>>>>>>> Can I somehow find out on which OSDs the specific data were stored and minimize my search there?
>>>>>>> Any ideas on how I should proceed?
>>>>>>
>>>>>> First of all you need to know the exact object names for the RADOS objects. As mentioned before, the name is composed of the RBD name and an offset.
>>>>>>
>>>>>> In case of OpenStack, there are three different patterns for RBD names:
>>>>>>
>>>>>> - the image UUID, e.g. 50f2a0bd-15b1-4dbb-8d1f-fc43ce535f13, for glance images
>>>>>> - the instance UUID plus "_disk", e.g. 9aec1f45-9053-461e-b176-c65c25a48794_disk, for nova images
>>>>>> - "volume-" plus the volume UUID, e.g. volume-0ca52f58-7e75-4b21-8b0f-39cbcd431c42, for cinder volumes
>>>>>>
>>>>>> (not considering snapshots etc., which might use different patterns)
>>>>>>
>>>>>> The RBD chunks are created using a certain prefix (using examples from our openstack setup):
>>>>>>
>>>>>> # rbd -p os-images info 8fa3d9eb-91ed-4c60-9550-a62f34aed014
>>>>>> rbd image 8fa3d9eb-91ed-4c60-9550-a62f34aed014:
>>>>>>     size 446 MB in 56 objects
>>>>>>     order 23 (8192 kB objects)
>>>>>>     block_name_prefix: rbd_data.30e57d54dea573
>>>>>>     format: 2
>>>>>>     features: layering, striping
>>>>>>     flags:
>>>>>>     stripe unit: 8192 kB
>>>>>>     stripe count: 1
>>>>>>
>>>>>> # rados -p os-images ls | grep rbd_data.30e57d54dea573
>>>>>> rbd_data.30e57d54dea573.0000000000000015
>>>>>> rbd_data.30e57d54dea573.0000000000000008
>>>>>> rbd_data.30e57d54dea573.000000000000000a
>>>>>> rbd_data.30e57d54dea573.000000000000002d
>>>>>> rbd_data.30e57d54dea573.0000000000000032
>>>>>>
>>>>>> I don't know whether the prefix is derived from some other information, but to recover the RBD you definitely need it.
>>>>>>
>>>>>> _If_ you are able to recover the prefix, you can use ceph osd map to find the OSDs for each chunk:
>>>>>>
>>>>>> # ceph osd map os-images rbd_data.30e57d54dea573.000000000000001a
>>>>>> osdmap e418590 pool os-images (38) object rbd_data.30e57d54dea573.000000000000001a -> pg 38.d5d81d65 (38.65) -> up ([45,17,108], p45) acting ([45,17,108], p45)
>>>>>>
>>>>>> With 20 OSDs in your case you will likely have to process all of them if the RBD has a size of several GBs.
>>>>>>
>>>>>> Regards,
>>>>>> Burkhard
>>>>>
>>>>> Is it possible to get the prefix if the RBD has been deleted already?? Is this info stored somewhere? Can I retrieve it some other way besides "rbd info"? Because when I try to get it using the "rbd info" command, unfortunately I am getting the following error:
>>>>>
>>>>> "librbd::ImageCtx: error finding header: (2) No such file or directory"
>>>>>
>>>>> Any ideas?
>>>>>
>>>>> Best regards,
>>>>>
>>>>> G.
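A note on the "ceph osd map" step: it only hashes the object name through CRUSH, so it works whether or not the object still exists. If the prefix of the deleted volume could be recovered from somewhere outside the cluster (for example the logs Brad mentions at the top of the thread), a loop like the one below would at least show which OSDs held each chunk of the cinder pool (named "volumes" further down in the thread) and therefore which disks are worth imaging first, assuming the OSD layout has not changed since the deletion. This is only a sketch: the prefix value is a placeholder, and a 500 GB image with the default 4 MB objects has 128000 chunks (indices 0 through 127999), so the loop is slow (one call per chunk).

  # placeholder prefix; substitute the real one if it can be recovered
  PREFIX=rbd_data.0123456789abcd
  for i in $(seq 0 127999); do
      obj=$(printf '%s.%016x' "$PREFIX" "$i")
      ceph osd map volumes "$obj"
  done > /tmp/deleted-volume-chunk-map.txt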
>>>>
>>>> Here is some more info from the cluster:
>>>>
>>>> $ ceph df
>>>> GLOBAL:
>>>>     SIZE       AVAIL      RAW USED     %RAW USED
>>>>     74373G     72011G     2362G        3.18
>>>> POOLS:
>>>>     NAME                   ID     USED      %USED     MAX AVAIL     OBJECTS
>>>>     data                   3      0         0         35849G        0
>>>>     metadata               4      1884      0         35849G        20
>>>>     rbd                    5      0         0         35849G        0
>>>>     .rgw                   6      1374      0         35849G        8
>>>>     .rgw.control           7      0         0         35849G        8
>>>>     .rgw.gc                8      0         0         35849G        32
>>>>     .log                   9      0         0         35849G        0
>>>>     .intent-log            10     0         0         35849G        0
>>>>     .usage                 11     0         0         35849G        3
>>>>     .users                 12     33        0         35849G        3
>>>>     .users.email           13     22        0         35849G        2
>>>>     .users.swift           14     22        0         35849G        2
>>>>     .users.uid             15     985       0         35849G        4
>>>>     .rgw.root              16     840       0         35849G        3
>>>>     .rgw.buckets.index     17     0         0         35849G        4
>>>>     .rgw.buckets           18     170G      0.23      35849G        810128
>>>>     .rgw.buckets.extra     19     0         0         35849G        1
>>>>     volumes                20     1004G     1.35      35849G        262613
>>>>
>>>> Obviously the RBD volumes provided to OpenStack are stored in the "volumes" pool, so trying to figure out the prefix for the volume in question, "volume-a490aa0c-6957-4ea2-bb5b-e4054d3765ad", produces the following:
>>>>
>>>> $ rbd -p volumes info volume-a490aa0c-6957-4ea2-bb5b-e4054d3765ad
>>>> rbd: error opening image volume-a490aa0c-6957-4ea2-bb5b-e4054d3765ad: (2) No such file or directory
>>>> 2016-08-09 03:04:56.250977 7fa9ba1ca760 -1 librbd::ImageCtx: error finding header: (2) No such file or directory
>>>>
>>>> On the other hand, for a volume that already exists and is working normally, I get the following:
>>>>
>>>> $ rbd -p volumes info volume-2383fc3a-2b6f-49b4-a3f5-f840569edb73
>>>> rbd image volume-2383fc3a-2b6f-49b4-a3f5-f840569edb73:
>>>>     size 500 GB in 128000 objects
>>>>     order 22 (4096 kB objects)
>>>>     block_name_prefix: rbd_data.fb1bb3136c3ec
>>>>     format: 2
>>>>     features: layering
>>>>
>>>> and can also get the OSD mapping etc.
>>>>
>>>> Does that mean that there is no way to find out on which OSDs the deleted volume was placed?
>>>> If that's the case then it's not possible to recover the data... Am I right???
>>>>
>>>> Any other ideas, people???
>>>>
>>>> Looking forward to your comments... please...
>>>>
>>>> Best regards,
>>>>
>>>> G.
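To make George's PG-directory suggestion from the top of the thread concrete: pool 20 is the "volumes" pool in the ceph df output above, and on a FileStore OSD its objects live as files under per-PG directories, so any leftover chunk files can be searched for by prefix. A rough sketch, assuming the default /var/lib/ceph/osd/ceph-<id> data path and, again, a placeholder prefix; on disk the underscores in object names are escaped (rbd\udata....), which is why the pattern only matches the prefix string itself:

  # run on each OSD host; PREFIX is a placeholder, not the lost volume's real prefix
  PREFIX=0123456789abcd
  find /var/lib/ceph/osd/ceph-*/current/20.*_head -type f -name "*${PREFIX}*" 2>/dev/null

Properly deleted objects will not show up here at all, so an empty result is the expected (if unwelcome) outcome; anything that does show up should be copied off immediately.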