Thanks. I just realized I still have some of the original OSDs. If they contain some of the incomplete PGs, would it be possible to add them back into the new disks? Maybe following these steps?
http://ceph.com/community/incomplete-pgs-oh-my/

On 31/01/17 at 10:44, Maxime Guyot wrote:
> Hi José,
>
> Too late now, but you could have updated the CRUSH map *before* moving the
> disks. Something like "ceph osd crush set osd.0 0.90329 root=default
> rack=sala2.2 host=loki05" would move osd.0 to loki05 and trigger the
> appropriate PG movements before any physical move. Then the physical move
> is done as usual: set noout, stop the OSD, physically move it, start the
> OSD again, unset noout.
>
> It's a way to trigger the data movement overnight (maybe with a cron job)
> and do the physical move at your convenience in the morning.
>
> Cheers,
> Maxime
>
> On 31/01/17 10:35, "ceph-users on behalf of José M. Martín"
> <ceph-users-bounces@xxxxxxxxxxxxxx on behalf of jmartin@xxxxxxxxxxxxxx> wrote:
>
> Already min_size = 1
>
> Thanks,
> Jose M. Martín
>
> On 31/01/17 at 09:44, Henrik Korkuc wrote:
> > I am not sure about the "incomplete" part off the top of my head, but
> > you can try setting min_size to 1 for the pools to reactivate some PGs,
> > if they are down/inactive due to missing replicas.
> >
> > On 17-01-31 10:24, José M.
Martín wrote:
> >> # ceph -s
> >>     cluster 29a91870-2ed2-40dc-969e-07b22f37928b
> >>      health HEALTH_ERR
> >>             clock skew detected on mon.loki04
> >>             155 pgs are stuck inactive for more than 300 seconds
> >>             7 pgs backfill_toofull
> >>             1028 pgs backfill_wait
> >>             48 pgs backfilling
> >>             892 pgs degraded
> >>             20 pgs down
> >>             153 pgs incomplete
> >>             2 pgs peering
> >>             155 pgs stuck inactive
> >>             1077 pgs stuck unclean
> >>             892 pgs undersized
> >>             1471 requests are blocked > 32 sec
> >>             recovery 3195781/36460868 objects degraded (8.765%)
> >>             recovery 5079026/36460868 objects misplaced (13.930%)
> >>             mds0: Behind on trimming (175/30)
> >>             noscrub,nodeep-scrub flag(s) set
> >>             Monitor clock skew detected
> >>      monmap e5: 5 mons at
> >> {loki01=192.168.3.151:6789/0,loki02=192.168.3.152:6789/0,loki03=192.168.3.153:6789/0,loki04=192.168.3.154:6789/0,loki05=192.168.3.155:6789/0}
> >>             election epoch 4028, quorum 0,1,2,3,4
> >>             loki01,loki02,loki03,loki04,loki05
> >>       fsmap e95494: 1/1/1 up {0=zeus2=up:active}, 1 up:standby
> >>      osdmap e275373: 42 osds: 42 up, 42 in; 1077 remapped pgs
> >>             flags noscrub,nodeep-scrub
> >>       pgmap v36642778: 4872 pgs, 4 pools, 24801 GB data, 17087 kobjects
> >>             45892 GB used, 34024 GB / 79916 GB avail
> >>             3195781/36460868 objects degraded (8.765%)
> >>             5079026/36460868 objects misplaced (13.930%)
> >>                 3640 active+clean
> >>                  838 active+undersized+degraded+remapped+wait_backfill
> >>                  184 active+remapped+wait_backfill
> >>                  134 incomplete
> >>                   48 active+undersized+degraded+remapped+backfilling
> >>                   19 down+incomplete
> >>                    6 active+undersized+degraded+remapped+wait_backfill+backfill_toofull
> >>                    1 active+remapped+backfill_toofull
> >>                    1 peering
> >>                    1 down+peering
> >>   recovery io 93909 kB/s, 10 keys/s, 67 objects/s
> >>
> >>
> >> # ceph osd tree
> >> ID  WEIGHT   TYPE NAME           UP/DOWN REWEIGHT PRIMARY-AFFINITY
> >>  -1 77.22777 root default
> >>  -9 27.14778     rack sala1
> >>  -2  5.41974         host loki01
> >>  14  0.90329             osd.14       up  1.00000          1.00000
> >>  15  0.90329             osd.15       up  1.00000          1.00000
> >>  16  0.90329             osd.16       up  1.00000          1.00000
> >>  17  0.90329             osd.17       up  1.00000          1.00000
> >>  18  0.90329             osd.18       up  1.00000          1.00000
> >>  25  0.90329             osd.25       up  1.00000          1.00000
> >>  -4  3.61316         host loki03
> >>   0  0.90329             osd.0        up  1.00000          1.00000
> >>   2  0.90329             osd.2        up  1.00000          1.00000
> >>  20  0.90329             osd.20       up  1.00000          1.00000
> >>  24  0.90329             osd.24       up  1.00000          1.00000
> >>  -3  9.05714         host loki02
> >>   1  0.90300             osd.1        up  0.90002          1.00000
> >>  31  2.72198             osd.31       up  1.00000          1.00000
> >>  29  0.90329             osd.29       up  1.00000          1.00000
> >>  30  0.90329             osd.30       up  1.00000          1.00000
> >>  33  0.90329             osd.33       up  1.00000          1.00000
> >>  32  2.72229             osd.32       up  1.00000          1.00000
> >>  -5  9.05774         host loki04
> >>   3  0.90329             osd.3        up  1.00000          1.00000
> >>  19  0.90329             osd.19       up  1.00000          1.00000
> >>  21  0.90329             osd.21       up  1.00000          1.00000
> >>  22  0.90329             osd.22       up  1.00000          1.00000
> >>  23  2.72229             osd.23       up  1.00000          1.00000
> >>  28  2.72229             osd.28       up  1.00000          1.00000
> >> -10 24.61000     rack sala2.2
> >>  -6 24.61000         host loki05
> >>   5  2.73000             osd.5        up  1.00000          1.00000
> >>   6  2.73000             osd.6        up  1.00000          1.00000
> >>   9  2.73000             osd.9        up  1.00000          1.00000
> >>  10  2.73000             osd.10       up  1.00000          1.00000
> >>  11  2.73000             osd.11       up  1.00000          1.00000
> >>  12  2.73000             osd.12       up  1.00000          1.00000
> >>  13  2.73000             osd.13       up  1.00000          1.00000
> >>   4  2.73000             osd.4        up  1.00000          1.00000
> >>   8  2.73000             osd.8        up  1.00000          1.00000
> >>   7  0.03999             osd.7        up  1.00000          1.00000
> >> -12 25.46999     rack sala2.1
> >> -11 25.46999         host loki06
> >>  34  2.73000             osd.34       up  1.00000          1.00000
> >>  35  2.73000             osd.35       up  1.00000          1.00000
> >>  36  2.73000             osd.36       up  1.00000          1.00000
> >>  37  2.73000             osd.37       up  1.00000          1.00000
> >>  38  2.73000             osd.38       up  1.00000          1.00000
> >>  39  2.73000             osd.39       up  1.00000          1.00000
> >>  40  2.73000             osd.40       up  1.00000          1.00000
> >>  43  2.73000             osd.43       up  1.00000          1.00000
> >>  42  0.90999             osd.42       up  1.00000          1.00000
> >>  41  2.71999             osd.41       up  1.00000          1.00000
> >>
> >>
> >> # ceph pg dump
> >> You can find it in this link:
> >> http://ergodic.ugr.es/pgdumpoutput.txt
> >>
> >>
> >> What I did:
> >> My cluster is heterogeneous, with old OSD nodes using 1 TB disks and
> >> new ones using 3 TB. I was having balance problems: some 1 TB OSDs got
> >> nearly full while there was plenty of space on others. My plan was to
> >> swap some disks for bigger ones. I started the process with no
> >> problems, changing one disk: reweight to 0.0, wait for the rebalance,
> >> and remove it.
> >> After that, while researching my problem, I read about straw2. I then
> >> changed the algorithm by editing the CRUSH map, and some data movement
> >> took place.
> >> My setup was not optimal (I had the journal on the XFS filesystem), so
> >> I decided to change that as well. At first I did it slowly, disk by
> >> disk, but since the rebalance takes a long time and my group was
> >> pushing me to finish quickly, I did:
> >> ceph osd out osd.<id>
> >> ceph osd crush remove osd.<id>
> >> ceph auth del osd.<id>
> >> ceph osd rm <id>
> >>
> >> Then I unmounted the disks and, using ceph-deploy, added them again:
> >> ceph-deploy disk zap loki01:/dev/sda
> >> ceph-deploy osd create loki01:/dev/sda
> >>
> >> ...for every disk in rack "sala1". I finished loki02 first. Then I did
> >> these steps on loki04, loki01 and loki03 at the same time.
> >>
> >> Thanks,
> >> --
> >> José M. Martín
> >>
> >>
> >> On 31/01/17 at 00:43, Shinobu Kinjo wrote:
> >>> First off, the following, please:
> >>>
> >>> * ceph -s
> >>> * ceph osd tree
> >>> * ceph pg dump
> >>>
> >>> and
> >>>
> >>> * what you actually did, with exact commands.
> >>>
> >>> Regards,
> >>>
> >>> On Tue, Jan 31, 2017 at 6:10 AM, José M. Martín
> >>> <jmartin@xxxxxxxxxxxxxx> wrote:
> >>>> Dear list,
> >>>>
> >>>> I'm having some big problems with my setup.
> >>>>
> >>>> I was trying to increase the global capacity by replacing some OSDs
> >>>> with bigger ones.
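[The removal sequence quoted above can be made much safer by draining one OSD at a time and refusing to destroy anything while recovery is still running. A sketch only: the OSD id, host and device come from this thread's examples, the health check is a simplification, and DRY_RUN=yes (the default) only prints the commands instead of executing them.]

```shell
#!/bin/sh
# Sketch: replace one OSD at a time, waiting for the cluster to recover
# between steps. Skipping that wait is what left PGs incomplete here.
run() {
    if [ "${DRY_RUN:-yes}" = "yes" ]; then
        echo "would run: $*"      # dry run: print instead of execute
    else
        "$@"
    fi
}

replace_osd() {
    id="$1"; host="$2"; dev="$3"
    run ceph osd out "osd.$id"
    # Wait until the cluster is healthy before destroying the old OSD.
    [ "${DRY_RUN:-yes}" = "yes" ] || \
        while ! ceph health | grep -q HEALTH_OK; do sleep 60; done
    run ceph osd crush remove "osd.$id"
    run ceph auth del "osd.$id"
    run ceph osd rm "$id"
    run ceph-deploy disk zap "$host:$dev"
    run ceph-deploy osd create "$host:$dev"
}

replace_osd 14 loki01 /dev/sda
```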
I changed them without waiting for the
> >>>> rebalance to finish, thinking the replicas were stored in other
> >>>> buckets, but I found a lot of incomplete PGs, so replicas of a PG
> >>>> had been placed in the same bucket. I have assumed that data is
> >>>> lost, because I zapped the disks and used them for other tasks.
> >>>>
> >>>> My question is: what should I do to recover as much data as possible?
> >>>> I'm using the filesystem (CephFS) and RBD.
> >>>>
> >>>> Thank you so much,
> >>>>
> >>>> --
> >>>>
> >>>> Jose M. Martín
> >>>>
> >>>>
> >>>> _______________________________________________
> >>>> ceph-users mailing list
> >>>> ceph-users@xxxxxxxxxxxxxx
> >>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
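[The recovery route asked about at the top of the thread (the linked "incomplete PGs" article) hinges on ceph-objectstore-tool's export/import: export the PG from the retired but intact disk, then import it into a stopped in-cluster OSD. A hedged sketch: the PG id and both data paths are placeholders, the --journal-path flag assumes filestore-era OSDs, and DRY_RUN=yes (the default) only prints the commands.]

```shell
#!/bin/sh
# Sketch: salvage an incomplete PG from a kept original disk.
# Both OSD daemons involved must be stopped while the tool runs.
run() {
    if [ "${DRY_RUN:-yes}" = "yes" ]; then
        echo "would run: $*"      # dry run: print instead of execute
    else
        "$@"
    fi
}

PGID="1.23"                           # placeholder: an incomplete PG id
OLD_OSD=/var/lib/ceph/osd/ceph-99     # placeholder: mount of the kept disk
NEW_OSD=/var/lib/ceph/osd/ceph-14     # placeholder: a stopped cluster OSD

# Export the PG from the retired disk:
run ceph-objectstore-tool --data-path "$OLD_OSD" \
    --journal-path "$OLD_OSD/journal" \
    --pgid "$PGID" --op export --file "/tmp/$PGID.export"

# Import it into the stopped target OSD, then restart that OSD and
# let peering pick the PG up again:
run ceph-objectstore-tool --data-path "$NEW_OSD" \
    --journal-path "$NEW_OSD/journal" \
    --op import --file "/tmp/$PGID.export"
```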