Hi Maxime,

I have 3 of the original disks, but I don't know which OSD each one
corresponds to. Besides, I don't think I have enough technical skill to
do that, and I don't want to make things worse... I'm trying to write a
script that copies files from the damaged CephFS to a new location.

I would be very grateful for any help.

José
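
A minimal sketch of what such a rescue copy could look like, purely as a
starting point. Every name in it is an assumption to adapt: the damaged
CephFS mounted at /mnt/cephfs, the destination at /mnt/rescue, GNU timeout
available, and a 120-second per-file limit. Note that a kernel client stuck
in uninterruptible I/O cannot be killed by any signal, so a ceph-fuse mount
is usually easier to interrupt than the kernel mount.

    #!/bin/bash
    # Rescue-copy sketch: walk the damaged CephFS and copy each file with a
    # per-file time limit, logging whatever could not be copied in time.
    # All paths and the 120 s limit are illustrative assumptions.
    SRC=/mnt/cephfs
    DST=/mnt/rescue
    LOG=/root/cephfs-rescue-skipped.log

    cd "$SRC" || exit 1
    find . -type f -print0 | while IFS= read -r -d '' f; do
        mkdir -p "$DST/$(dirname "$f")"
        if ! timeout -k 10 120 cp -p "$f" "$DST/$f"; then
            # Copy failed or timed out: record the path and move on.
            echo "$f" >> "$LOG"
        fi
    done

The skip log then doubles as a list of the damaged paths to retry once the
missing PGs are (hopefully) recovered.
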
On 01/02/17 at 07:56, Maxime Guyot wrote:
> Hi José,
>
> If you have some of the original OSDs (not zapped or erased) then you might be able to just re-add them to your cluster and have a happy cluster.
> If you attempt the ceph_objectstore_tool --op export & import, make sure to do it on a temporary OSD of weight 0, as recommended in the link provided.
>
> Either way, and from what I can see in the pg dump you provided, if you restore osd.0, osd.3, osd.20, osd.21 and osd.22 it should be enough to bring back the PGs that are down.
>
> Cheers,
>
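
For reference, the export & import path mentioned above looks roughly like
the sketch below. It is only a sketch under assumptions: the binary is named
ceph-objectstore-tool in recent releases, /dev/sdX1 and /mnt/old-osd are
placeholders, osd.44 stands for the temporary weight-0 OSD, the PG id 1.2f is
illustrative, the journal is assumed to live in the data directory (as it did
in this cluster), and no ceph-osd daemon may be using the data path while the
tool runs.

    # Identify which OSD an old disk belonged to: mount its data partition
    # and read the 'whoami' file (read-only is enough for this step).
    mount -o ro /dev/sdX1 /mnt/old-osd
    cat /mnt/old-osd/whoami                 # e.g. prints "20" for osd.20
    umount /mnt/old-osd

    # With the disk mounted read-write and the daemon stopped, list the PGs
    # it holds and export the ones the cluster reports as incomplete/down.
    mount /dev/sdX1 /mnt/old-osd
    ceph_objectstore_tool --data-path /mnt/old-osd \
        --journal-path /mnt/old-osd/journal --op list-pgs
    ceph_objectstore_tool --data-path /mnt/old-osd \
        --journal-path /mnt/old-osd/journal \
        --op export --pgid 1.2f --file /root/1.2f.export

    # Import into the stopped temporary OSD (added with CRUSH weight 0),
    # then start it and let recovery copy the objects where they belong.
    ceph_objectstore_tool --data-path /var/lib/ceph/osd/ceph-44 \
        --journal-path /var/lib/ceph/osd/ceph-44/journal \
        --op import --file /root/1.2f.export
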
> On 31/01/17 11:48, "ceph-users on behalf of José M. Martín" <ceph-users-bounces@xxxxxxxxxxxxxx on behalf of jmartin@xxxxxxxxxxxxxx> wrote:
>
> Any idea how I could recover files from the filesystem mount?
> Doing a cp, it hangs when it finds a damaged file/folder. I would be
> happy just getting the undamaged files.
>
> Thanks
>
> On 31/01/17 at 11:19, José M. Martín wrote:
> > Thanks.
> > I just realized I still have some of the original OSDs. If they contain
> > some of the incomplete PGs, would it be possible to add them back
> > alongside the new disks?
> > Maybe following these steps? http://ceph.com/community/incomplete-pgs-oh-my/
> >
> > On 31/01/17 at 10:44, Maxime Guyot wrote:
> >> Hi José,
> >>
> >> Too late now, but you could have updated the CRUSH map *before* moving the disks. Something like “ceph osd crush set osd.0 0.90329 root=default rack=sala2.2 host=loki05” would move osd.0 to loki05 and trigger the appropriate PG movements before any physical move. Then the physical move is done as usual: set noout, stop the OSD, physically move it, start the OSD again, unset noout.
> >>
> >> It’s a way to trigger the data movement overnight (maybe with a cron job) and do the physical move at your own convenience in the morning.
> >>
> >> Cheers,
> >> Maxime
> >>
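
Spelled out as commands, the procedure described above is roughly the
following sketch. The OSD number, weight and bucket names are taken from the
example in the quoted message; the systemctl unit name is an assumption and
depends on how the OSDs were deployed.

    # The evening before (or from cron): point the OSD at its future CRUSH
    # location so the PG movements start long before the physical move.
    ceph osd crush set osd.0 0.90329 root=default rack=sala2.2 host=loki05

    # The day of the physical move:
    ceph osd set noout                # keep the OSD from being marked out
    systemctl stop ceph-osd@0         # or the init script your release uses
    # ... physically move the disk to loki05 and bring it back up ...
    systemctl start ceph-osd@0
    ceph osd unset noout
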
> >> On 31/01/17 10:35, "ceph-users on behalf of José M. Martín" <ceph-users-bounces@xxxxxxxxxxxxxx on behalf of jmartin@xxxxxxxxxxxxxx> wrote:
> >>
> >> Already min_size = 1
> >>
> >> Thanks,
> >> Jose M. Martín
> >>
> >> On 31/01/17 at 09:44, Henrik Korkuc wrote:
> >> > I am not sure about the "incomplete" part off the top of my head, but
> >> > you can try setting min_size to 1 for the pools to reactivate some
> >> > PGs, if they are down/inactive due to missing replicas.
> >> >
> >> > On 17-01-31 10:24, José M. Martín wrote:
> >> >> # ceph -s
> >> >>     cluster 29a91870-2ed2-40dc-969e-07b22f37928b
> >> >>      health HEALTH_ERR
> >> >>             clock skew detected on mon.loki04
> >> >>             155 pgs are stuck inactive for more than 300 seconds
> >> >>             7 pgs backfill_toofull
> >> >>             1028 pgs backfill_wait
> >> >>             48 pgs backfilling
> >> >>             892 pgs degraded
> >> >>             20 pgs down
> >> >>             153 pgs incomplete
> >> >>             2 pgs peering
> >> >>             155 pgs stuck inactive
> >> >>             1077 pgs stuck unclean
> >> >>             892 pgs undersized
> >> >>             1471 requests are blocked > 32 sec
> >> >>             recovery 3195781/36460868 objects degraded (8.765%)
> >> >>             recovery 5079026/36460868 objects misplaced (13.930%)
> >> >>             mds0: Behind on trimming (175/30)
> >> >>             noscrub,nodeep-scrub flag(s) set
> >> >>             Monitor clock skew detected
> >> >>      monmap e5: 5 mons at {loki01=192.168.3.151:6789/0,loki02=192.168.3.152:6789/0,loki03=192.168.3.153:6789/0,loki04=192.168.3.154:6789/0,loki05=192.168.3.155:6789/0}
> >> >>             election epoch 4028, quorum 0,1,2,3,4 loki01,loki02,loki03,loki04,loki05
> >> >>       fsmap e95494: 1/1/1 up {0=zeus2=up:active}, 1 up:standby
> >> >>      osdmap e275373: 42 osds: 42 up, 42 in; 1077 remapped pgs
> >> >>             flags noscrub,nodeep-scrub
> >> >>       pgmap v36642778: 4872 pgs, 4 pools, 24801 GB data, 17087 kobjects
> >> >>             45892 GB used, 34024 GB / 79916 GB avail
> >> >>             3195781/36460868 objects degraded (8.765%)
> >> >>             5079026/36460868 objects misplaced (13.930%)
> >> >>                 3640 active+clean
> >> >>                  838 active+undersized+degraded+remapped+wait_backfill
> >> >>                  184 active+remapped+wait_backfill
> >> >>                  134 incomplete
> >> >>                   48 active+undersized+degraded+remapped+backfilling
> >> >>                   19 down+incomplete
> >> >>                    6 active+undersized+degraded+remapped+wait_backfill+backfill_toofull
> >> >>                    1 active+remapped+backfill_toofull
> >> >>                    1 peering
> >> >>                    1 down+peering
> >> >>   recovery io 93909 kB/s, 10 keys/s, 67 objects/s
> >> >>
> >> >>
> >> >> # ceph osd tree
> >> >> ID  WEIGHT   TYPE NAME           UP/DOWN REWEIGHT PRIMARY-AFFINITY
> >> >>  -1 77.22777 root default
> >> >>  -9 27.14778     rack sala1
> >> >>  -2  5.41974         host loki01
> >> >>  14  0.90329             osd.14       up  1.00000          1.00000
> >> >>  15  0.90329             osd.15       up  1.00000          1.00000
> >> >>  16  0.90329             osd.16       up  1.00000          1.00000
> >> >>  17  0.90329             osd.17       up  1.00000          1.00000
> >> >>  18  0.90329             osd.18       up  1.00000          1.00000
> >> >>  25  0.90329             osd.25       up  1.00000          1.00000
> >> >>  -4  3.61316         host loki03
> >> >>   0  0.90329             osd.0        up  1.00000          1.00000
> >> >>   2  0.90329             osd.2        up  1.00000          1.00000
> >> >>  20  0.90329             osd.20       up  1.00000          1.00000
> >> >>  24  0.90329             osd.24       up  1.00000          1.00000
> >> >>  -3  9.05714         host loki02
> >> >>   1  0.90300             osd.1        up  0.90002          1.00000
> >> >>  31  2.72198             osd.31       up  1.00000          1.00000
> >> >>  29  0.90329             osd.29       up  1.00000          1.00000
> >> >>  30  0.90329             osd.30       up  1.00000          1.00000
> >> >>  33  0.90329             osd.33       up  1.00000          1.00000
> >> >>  32  2.72229             osd.32       up  1.00000          1.00000
> >> >>  -5  9.05774         host loki04
> >> >>   3  0.90329             osd.3        up  1.00000          1.00000
> >> >>  19  0.90329             osd.19       up  1.00000          1.00000
> >> >>  21  0.90329             osd.21       up  1.00000          1.00000
> >> >>  22  0.90329             osd.22       up  1.00000          1.00000
> >> >>  23  2.72229             osd.23       up  1.00000          1.00000
> >> >>  28  2.72229             osd.28       up  1.00000          1.00000
> >> >> -10 24.61000     rack sala2.2
> >> >>  -6 24.61000         host loki05
> >> >>   5  2.73000             osd.5        up  1.00000          1.00000
> >> >>   6  2.73000             osd.6        up  1.00000          1.00000
> >> >>   9  2.73000             osd.9        up  1.00000          1.00000
> >> >>  10  2.73000             osd.10       up  1.00000          1.00000
> >> >>  11  2.73000             osd.11       up  1.00000          1.00000
> >> >>  12  2.73000             osd.12       up  1.00000          1.00000
> >> >>  13  2.73000             osd.13       up  1.00000          1.00000
> >> >>   4  2.73000             osd.4        up  1.00000          1.00000
> >> >>   8  2.73000             osd.8        up  1.00000          1.00000
> >> >>   7  0.03999             osd.7        up  1.00000          1.00000
> >> >> -12 25.46999     rack sala2.1
> >> >> -11 25.46999         host loki06
> >> >>  34  2.73000             osd.34       up  1.00000          1.00000
> >> >>  35  2.73000             osd.35       up  1.00000          1.00000
> >> >>  36  2.73000             osd.36       up  1.00000          1.00000
> >> >>  37  2.73000             osd.37       up  1.00000          1.00000
> >> >>  38  2.73000             osd.38       up  1.00000          1.00000
> >> >>  39  2.73000             osd.39       up  1.00000          1.00000
> >> >>  40  2.73000             osd.40       up  1.00000          1.00000
> >> >>  43  2.73000             osd.43       up  1.00000          1.00000
> >> >>  42  0.90999             osd.42       up  1.00000          1.00000
> >> >>  41  2.71999             osd.41       up  1.00000          1.00000
> >> >>
> >> >>
> >> >> # ceph pg dump
> >> >> You can find it in this link:
> >> >> http://ergodic.ugr.es/pgdumpoutput.txt
> >> >>
> >> >>
> >> >> What I did:
> >> >> My cluster is heterogeneous, with old OSD nodes holding 1 TB disks and
> >> >> newer ones holding 3 TB disks. I was having balance problems: some 1 TB
> >> >> OSDs got nearly full while there was plenty of space on others. My plan
> >> >> was to replace some disks with bigger ones. I started the process with
> >> >> no problems, changing one disk: reweight it to 0.0, wait for the
> >> >> rebalance, and remove it.
> >> >> After that, while researching my problem, I read about straw2. I then
> >> >> changed the algorithm by editing the crush map, and some data movement
> >> >> happened.
> >> >> My setup was not optimal: I had the journal on the XFS filesystem, so I
> >> >> decided to change that as well. At first I did it slowly, disk by disk,
> >> >> but as the rebalance was taking a long time and my group was pushing me
> >> >> to finish quickly, I did
> >> >> ceph osd out osd.id
> >> >> ceph osd crush remove osd.id
> >> >> ceph auth del osd.id
> >> >> ceph osd rm id
> >> >>
> >> >> Then I unmounted the disks and, using ceph-deploy, added them again:
> >> >> ceph-deploy disk zap loki01:/dev/sda
> >> >> ceph-deploy osd create loki01:/dev/sda
> >> >>
> >> >> I did this for every disk in rack "sala1". First I finished loki02,
> >> >> then I did these steps on loki04, loki01 and loki03 at the same time.
> >> >>
> >> >> Thanks,
> >> >> --
> >> >> José M. Martín
> >> >>
> >> >>
> >> >> On 31/01/17 at 00:43, Shinobu Kinjo wrote:
> >> >>> First off, please send the following:
> >> >>>
> >> >>> * ceph -s
> >> >>> * ceph osd tree
> >> >>> * ceph pg dump
> >> >>>
> >> >>> and
> >> >>>
> >> >>> * what you actually did, with the exact commands.
> >> >>>
> >> >>> Regards,
> >> >>>
> >> >>> On Tue, Jan 31, 2017 at 6:10 AM, José M. Martín
> >> >>> <jmartin@xxxxxxxxxxxxxx> wrote:
> >> >>>> Dear list,
> >> >>>>
> >> >>>> I'm having some big problems with my setup.
> >> >>>>
> >> >>>> I was trying to increase the global capacity by replacing some OSDs
> >> >>>> with bigger ones. I replaced them without waiting for the rebalance
> >> >>>> process to finish, thinking the replicas were stored in other buckets,
> >> >>>> but I found a lot of incomplete PGs, because replicas of the same PG
> >> >>>> had been placed in the same bucket. I assume I have lost that data,
> >> >>>> because I zapped the disks and used them for other tasks.
> >> >>>>
> >> >>>> My question is: what should I do to recover as much data as possible?
> >> >>>> I'm using the filesystem and RBD.
> >> >>>>
> >> >>>> Thank you so much,
> >> >>>>
> >> >>>> --
> >> >>>> Jose M. Martín

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com