I set up my test cluster many years ago with only 3 OSDs and never increased the PGs :-) I plan on doing so after it's healthy again... it's long overdue... maybe 512 :-)
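If I remember right it's just the two pool set commands per pool (<poolname> is a placeholder here, and on Mimic pgp_num still has to be bumped separately), something like:

    ceph osd pool set <poolname> pg_num 512
    ceph osd pool set <poolname> pgp_num 512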
And yes, that's what I thought too... it should have more than enough space to move data... hmm...
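I guess I could double-check the thresholds with something like:

    ceph osd dump | grep ratio

and compare the full_ratio / backfillfull_ratio / nearfull_ratio values against the %USE column in ceph osd df.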
I wouldn't be surprised if it fixes itself after recovery... but it would still be nice to know what's going on...
And the initial degraded state still confuses me...
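If it happens again I'll try to capture which PGs are actually degraded while it's happening, probably something like:

    ceph health detail
    ceph pg dump_stuck degraded

to see which OSDs those PGs map to.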
By the way, I'm on Mimic :-) the latest version as of today, 13.2.1.
Sebastian
On Sat, Jul 28, 2018 at 12:03 PM Sinan Polat <sinan@xxxxxxxx> wrote:
Ceph has tried to (re)balance your data. backfill_toofull means there is no available space to move data, but you have plenty of space. Why do you have so few PGs? I would increase the number of PGs, but before doing so let's see what others say.

Sinan

Hi,

I added 4 more OSDs on my 4 node test cluster and now I'm in HEALTH_ERR state. Right now it's still recovering, but still, should this happen? None of my OSDs are full. Maybe I need more PGs? But since my %USE is < 40% it should still be OK to recover without HEALTH_ERR?

  data:
    pools:   7 pools, 484 pgs
    objects: 2.70 M objects, 10 TiB
    usage:   31 TiB used, 114 TiB / 146 TiB avail
    pgs:     2422839/8095065 objects misplaced (29.930%)
             343 active+clean
             101 active+remapped+backfill_wait
              39 active+remapped+backfilling
               1 active+remapped+backfill_wait+backfill_toofull

  io:
    recovery: 315 MiB/s, 78 objects/s

ceph osd df

ID CLASS WEIGHT  REWEIGHT SIZE    USE     AVAIL   %USE  VAR  PGS
 0   hdd 2.72890  1.00000 2.7 TiB 975 GiB 1.8 TiB 34.89 1.62  31
 1   hdd 2.72899  1.00000 2.7 TiB 643 GiB 2.1 TiB 23.00 1.07  36
 8   hdd 7.27739  1.00000 7.3 TiB 1.7 TiB 5.5 TiB 23.85 1.11  83
12   hdd 7.27730  1.00000 7.3 TiB 1.1 TiB 6.2 TiB 14.85 0.69  81
16   hdd 7.27730  1.00000 7.3 TiB 2.0 TiB 5.3 TiB 27.68 1.29  74
20   hdd 9.09569  1.00000 9.1 TiB 108 GiB 9.0 TiB  1.16 0.05  43
 2   hdd 2.72899  1.00000 2.7 TiB 878 GiB 1.9 TiB 31.40 1.46  36
 3   hdd 2.72899  1.00000 2.7 TiB 783 GiB 2.0 TiB 28.02 1.30  39
 9   hdd 7.27739  1.00000 7.3 TiB 2.0 TiB 5.3 TiB 27.58 1.28  85
13   hdd 7.27730  1.00000 7.3 TiB 2.2 TiB 5.1 TiB 30.10 1.40  78
17   hdd 7.27730  1.00000 7.3 TiB 2.1 TiB 5.2 TiB 28.23 1.31  84
21   hdd 9.09569  1.00000 9.1 TiB 192 GiB 8.9 TiB  2.06 0.10  41
 4   hdd 2.72899  1.00000 2.7 TiB 927 GiB 1.8 TiB 33.18 1.54  34
 5   hdd 2.72899  1.00000 2.7 TiB 1.0 TiB 1.7 TiB 37.57 1.75  28
10   hdd 7.27739  1.00000 7.3 TiB 2.2 TiB 5.0 TiB 30.66 1.43  87
14   hdd 7.27730  1.00000 7.3 TiB 1.8 TiB 5.5 TiB 24.23 1.13  89
18   hdd 7.27730  1.00000 7.3 TiB 2.5 TiB 4.8 TiB 33.83 1.57  93
22   hdd 9.09569  1.00000 9.1 TiB 210 GiB 8.9 TiB  2.26 0.10  44
 6   hdd 2.72899  1.00000 2.7 TiB 350 GiB 2.4 TiB 12.51 0.58  21
 7   hdd 2.72899  1.00000 2.7 TiB 980 GiB 1.8 TiB 35.07 1.63  35
11   hdd 7.27739  1.00000 7.3 TiB 2.8 TiB 4.4 TiB 39.14 1.82  99
15   hdd 7.27730  1.00000 7.3 TiB 1.6 TiB 5.6 TiB 22.49 1.05  82
19   hdd 7.27730  1.00000 7.3 TiB 2.1 TiB 5.2 TiB 28.49 1.32  77
23   hdd 9.09569  1.00000 9.1 TiB 285 GiB 8.8 TiB  3.06 0.14  52
                   TOTAL  146 TiB  31 TiB 114 TiB 21.51
MIN/MAX VAR: 0.05/1.82  STDDEV: 11.78

Right after adding the OSDs it showed degraded for a few minutes. Since all my pools have a redundancy of 3 and I'm only adding OSDs, I'm a bit confused why this happens. I get why it's misplaced, but undersized and degraded?

  pgs: 4611/8095032 objects degraded (0.057%)
       2626460/8095032 objects misplaced (32.445%)
       215 active+clean
       192 active+remapped+backfill_wait
        26 active+recovering+undersized+remapped
        17 active+recovery_wait+undersized+degraded+remapped
        16 active+recovering
        11 active+recovery_wait+degraded
         6 active+remapped+backfilling
         1 active+remapped+backfill_toofull

Maybe someone can give me some pointers on what I'm missing to understand what's happening here?

Thanks!
Sebastian
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com