Shall host weight auto-reduce on HDD failure?

When an HDD fails, the number of placement groups on the remaining OSDs on the
same host goes up. I would expect the failed placement groups to be distributed
evenly across the whole cluster, not just within the troubled host. Shall the
host weight auto-reduce whenever an OSD goes out?
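
As far as I understand, marking an OSD out leaves the host's CRUSH weight
untouched, so the host keeps attracting the same share of data and the
sibling OSDs on the same host absorb the failed OSD's placement groups.
Zeroing the failed OSD's CRUSH weight instead should shrink the host bucket
and let the placement groups remap cluster-wide; a rough, untested sketch
using osd.408 from the exhibit below:

> ceph osd crush reweight osd.408 0
> ceph osd df tree name osd051   # host weight should drop by ~9.2 and PGs remap across hosts

Once the drive is replaced, the new OSD should get its CRUSH weight set again
when it is created.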

Exhibit 1: attached osd-df-tree file. The number of placement groups per OSD
on healthy nodes across the cluster is around 160; see osd050 and osd056. On
nodes with HDD failures the number of placement groups per OSD goes up
noticeably, and more so as more HDDs fail on the same node; see osd051 and
osd053.
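
(The per-OSD numbers above come straight from the PGS column of ceph osd df
tree; something like the line below should list just the name and PG count
for the spinning disks, assuming the JSON output exposes the name,
device_class and pgs fields -- untested.)

> ceph osd df tree -f json | jq -r '.nodes[] | select(.device_class=="hdd") | "\(.name) \(.pgs)"'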

This cluster can handle this case at the moment, as it has plenty of free
space. I wonder how this is going to play out once the whole cluster reaches
90% usage. A single backplane failure in a node takes four drives out at
once; that is roughly 30% of the storage space on that node. The whole
cluster would have enough space to host the failed placement groups, but one
node would not.
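
Back-of-envelope numbers for that scenario, assuming 12 x 9.2 TiB HDDs per
node as in the exhibit:

awk 'BEGIN {
  node = 12 * 9.2         # HDD capacity per node, TiB
  lost =  4 * 9.2         # one backplane failure takes out 4 drives
  data = 0.90 * node      # this node's share of data at 90% usage
  printf "%.0f%% of the node gone; %.1f TiB of data vs %.1f TiB of space left\n",
         100 * lost / node, data, node - lost
}'

That prints "33% of the node gone; 99.4 TiB of data vs 73.6 TiB of space
left", which is the point above in numbers: the cluster as a whole could
absorb the failed placement groups, but the eight surviving drives on that
node could not.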

This cluster is running Nautilus 14.2.0 with default settings, deployed using
ceph-ansible.


Milan


-- 
Milan Kupcevic
Senior Cyberinfrastructure Engineer at Project NESE
Harvard University
FAS Research Computing


> ceph osd df tree name osd050
ID   CLASS WEIGHT    REWEIGHT SIZE    RAW USE DATA     OMAP    META    AVAIL   %USE  VAR  PGS STATUS TYPE NAME       
-130       110.88315        - 111 TiB 6.0 TiB  4.7 TiB 563 MiB  21 GiB 105 TiB  5.39 1.00   -            host osd050 
 517   hdd   9.20389  1.00000 9.2 TiB 442 GiB  329 GiB  16 KiB 1.7 GiB 8.8 TiB  4.69 0.87 157     up         osd.517 
 532   hdd   9.20389  1.00000 9.2 TiB 465 GiB  352 GiB  32 KiB 1.8 GiB 8.7 TiB  4.94 0.92 170     up         osd.532 
 544   hdd   9.20389  1.00000 9.2 TiB 447 GiB  334 GiB  32 KiB 1.8 GiB 8.8 TiB  4.74 0.88 153     up         osd.544 
 562   hdd   9.20389  1.00000 9.2 TiB 440 GiB  328 GiB  64 KiB 1.5 GiB 8.8 TiB  4.67 0.87 159     up         osd.562 
 575   hdd   9.20389  1.00000 9.2 TiB 479 GiB  366 GiB  88 KiB 1.9 GiB 8.7 TiB  5.08 0.94 175     up         osd.575 
 592   hdd   9.20389  1.00000 9.2 TiB 434 GiB  321 GiB  24 KiB 1.4 GiB 8.8 TiB  4.60 0.85 153     up         osd.592 
 605   hdd   9.20389  1.00000 9.2 TiB 456 GiB  343 GiB     0 B 1.5 GiB 8.8 TiB  4.84 0.90 170     up         osd.605 
 618   hdd   9.20389  1.00000 9.2 TiB 473 GiB  360 GiB  16 KiB 1.6 GiB 8.7 TiB  5.01 0.93 172     up         osd.618 
 631   hdd   9.20389  1.00000 9.2 TiB 461 GiB  348 GiB  44 KiB 1.5 GiB 8.8 TiB  4.89 0.91 165     up         osd.631 
 644   hdd   9.20389  1.00000 9.2 TiB 459 GiB  346 GiB  92 KiB 1.7 GiB 8.8 TiB  4.87 0.90 163     up         osd.644 
 656   hdd   9.20389  1.00000 9.2 TiB 433 GiB  320 GiB  68 KiB 1.4 GiB 8.8 TiB  4.59 0.85 156     up         osd.656 
 669   hdd   9.20389  1.00000 9.2 TiB 1.1 TiB 1019 GiB  36 KiB 2.6 GiB 8.1 TiB 12.01 2.23 169     up         osd.669 
 682   ssd   0.43649  1.00000 447 GiB 3.1 GiB  2.1 GiB 562 MiB 462 MiB 444 GiB  0.69 0.13 168     up         osd.682 
                        TOTAL 111 TiB 6.0 TiB  4.7 TiB 563 MiB  21 GiB 105 TiB  5.39                                 
MIN/MAX VAR: 0.13/2.23  STDDEV: 2.32

> ceph osd df tree name osd051
ID   CLASS WEIGHT    REWEIGHT SIZE    RAW USE DATA    OMAP    META    AVAIL   %USE VAR  PGS STATUS TYPE NAME       
-148       110.88315        -  83 TiB 4.9 TiB 4.0 TiB 573 MiB  20 GiB  78 TiB 5.94 1.00   -            host osd051 
 408   hdd   9.20389        0     0 B     0 B     0 B     0 B     0 B     0 B    0    0   0   down         osd.408 
 538   hdd   9.20389  1.00000 9.2 TiB 542 GiB 429 GiB  24 KiB 2.4 GiB 8.7 TiB 5.75 0.97 212     up         osd.538 
 552   hdd   9.20389        0     0 B     0 B     0 B     0 B     0 B     0 B    0    0   0   down         osd.552 
 565   hdd   9.20389        0     0 B     0 B     0 B     0 B     0 B     0 B    0    0   0   down         osd.565 
 578   hdd   9.20389  1.00000 9.2 TiB 557 GiB 444 GiB  56 KiB 2.0 GiB 8.7 TiB 5.91 0.99 213     up         osd.578 
 590   hdd   9.20389  1.00000 9.2 TiB 533 GiB 420 GiB  34 KiB 2.4 GiB 8.7 TiB 5.66 0.95 212     up         osd.590 
 603   hdd   9.20389  1.00000 9.2 TiB 562 GiB 449 GiB  76 KiB 2.2 GiB 8.7 TiB 5.96 1.00 218     up         osd.603 
 616   hdd   9.20389  1.00000 9.2 TiB 553 GiB 440 GiB  16 KiB 2.2 GiB 8.7 TiB 5.86 0.99 217     up         osd.616 
 629   hdd   9.20389  1.00000 9.2 TiB 579 GiB 466 GiB  40 KiB 2.0 GiB 8.6 TiB 6.14 1.03 228     up         osd.629 
 642   hdd   9.20389  1.00000 9.2 TiB 588 GiB 475 GiB  40 KiB 2.6 GiB 8.6 TiB 6.23 1.05 228     up         osd.642 
 655   hdd   9.20389  1.00000 9.2 TiB 583 GiB 470 GiB  32 KiB 2.3 GiB 8.6 TiB 6.18 1.04 223     up         osd.655 
 668   hdd   9.20389  1.00000 9.2 TiB 570 GiB 457 GiB  32 KiB 1.9 GiB 8.6 TiB 6.05 1.02 229     up         osd.668 
 681   ssd   0.43649  1.00000 447 GiB 3.1 GiB 2.1 GiB 573 MiB 451 MiB 444 GiB 0.69 0.12 167     up         osd.681 
                        TOTAL  83 TiB 4.9 TiB 4.0 TiB 573 MiB  20 GiB  78 TiB 5.94                                 
MIN/MAX VAR: 0.12/1.05  STDDEV: 1.67

> ceph osd df tree name osd053
ID   CLASS WEIGHT    REWEIGHT SIZE    RAW USE DATA    OMAP    META    AVAIL   %USE VAR  PGS STATUS TYPE NAME       
-136       110.88315        -  74 TiB 4.8 TiB 4.0 TiB 447 MiB  18 GiB  69 TiB 6.53 1.00   -            host osd053 
 519   hdd   9.20389  1.00000 9.2 TiB 665 GiB 552 GiB  52 KiB 2.2 GiB 8.6 TiB 7.05 1.08 256     up         osd.519 
 534   hdd   9.20389  1.00000 9.2 TiB 654 GiB 541 GiB  44 KiB 2.2 GiB 8.6 TiB 6.94 1.06 261     up         osd.534 
 546   hdd   9.20389  1.00000 9.2 TiB 641 GiB 528 GiB  46 KiB 2.2 GiB 8.6 TiB 6.80 1.04 251     up         osd.546 
 558   hdd   9.20389  1.00000 9.2 TiB 581 GiB 468 GiB  20 KiB 2.0 GiB 8.6 TiB 6.17 0.95 232     up         osd.558 
 571   hdd   9.20389  1.00000 9.2 TiB 594 GiB 481 GiB  68 KiB 2.1 GiB 8.6 TiB 6.30 0.97 240     up         osd.571 
 583   hdd   9.20389  1.00000 9.2 TiB 664 GiB 551 GiB  68 KiB 2.5 GiB 8.6 TiB 7.05 1.08 268     up         osd.583 
 596   hdd   9.20389  1.00000 9.2 TiB 569 GiB 456 GiB   8 KiB 2.4 GiB 8.6 TiB 6.04 0.93 218     up         osd.596 
 609   hdd   9.20389  1.00000 9.2 TiB 580 GiB 467 GiB   8 KiB 2.2 GiB 8.6 TiB 6.15 0.94 231     up         osd.609 
 622   hdd   9.20389        0     0 B     0 B     0 B     0 B     0 B     0 B    0    0   0   down         osd.622 
 635   hdd   9.20389        0     0 B     0 B     0 B     0 B     0 B     0 B    0    0   0   down         osd.635 
 648   hdd   9.20389        0     0 B     0 B     0 B     0 B     0 B     0 B    0    0   0   down         osd.648 
 661   hdd   9.20389        0     0 B     0 B     0 B     0 B     0 B     0 B    0    0   0   down         osd.661 
 674   ssd   0.43649  1.00000 447 GiB 3.1 GiB 2.1 GiB 447 MiB 577 MiB 444 GiB 0.70 0.11 143     up         osd.674 
                        TOTAL  74 TiB 4.8 TiB 4.0 TiB 447 MiB  18 GiB  69 TiB 6.53                                 
MIN/MAX VAR: 0.11/1.08  STDDEV: 1.98

> ceph osd df tree name osd056
ID   CLASS WEIGHT    REWEIGHT SIZE    RAW USE DATA    OMAP    META    AVAIL   %USE VAR  PGS STATUS TYPE NAME       
-160       110.88315        - 111 TiB 5.2 TiB 3.9 TiB 478 MiB  17 GiB 106 TiB 4.70 1.00   -            host osd056 
 528   hdd   9.20389  1.00000 9.2 TiB 450 GiB 337 GiB     0 B 1.5 GiB 8.8 TiB 4.78 1.02 163     up         osd.528 
 542   hdd   9.20389  1.00000 9.2 TiB 422 GiB 309 GiB  28 KiB 1.2 GiB 8.8 TiB 4.48 0.95 145     up         osd.542 
 555   hdd   9.20389  1.00000 9.2 TiB 397 GiB 284 GiB     0 B 1.1 GiB 8.8 TiB 4.21 0.90 147     up         osd.555 
 568   hdd   9.20389  1.00000 9.2 TiB 400 GiB 287 GiB  32 KiB 1.1 GiB 8.8 TiB 4.25 0.90 141     up         osd.568 
 579   hdd   9.20389  1.00000 9.2 TiB 532 GiB 419 GiB     0 B 1.5 GiB 8.7 TiB 5.64 1.20 196     up         osd.579 
 591   hdd   9.20389  1.00000 9.2 TiB 451 GiB 339 GiB 112 KiB 1.5 GiB 8.8 TiB 4.79 1.02 163     up         osd.591 
 604   hdd   9.20389  1.00000 9.2 TiB 463 GiB 350 GiB  64 KiB 1.2 GiB 8.8 TiB 4.92 1.05 168     up         osd.604 
 617   hdd   9.20389  1.00000 9.2 TiB 455 GiB 343 GiB  12 KiB 1.3 GiB 8.8 TiB 4.83 1.03 168     up         osd.617 
 630   hdd   9.20389  1.00000 9.2 TiB 407 GiB 294 GiB  32 KiB 1.6 GiB 8.8 TiB 4.32 0.92 151     up         osd.630 
 643   hdd   9.20389  1.00000 9.2 TiB 447 GiB 335 GiB  16 KiB 1.3 GiB 8.8 TiB 4.75 1.01 152     up         osd.643 
 659   hdd   9.20389  1.00000 9.2 TiB 464 GiB 351 GiB  20 KiB 1.3 GiB 8.8 TiB 4.92 1.05 167     up         osd.659 
 672   hdd   9.20389  1.00000 9.2 TiB 441 GiB 328 GiB  44 KiB 1.4 GiB 8.8 TiB 4.68 1.00 158     up         osd.672 
 685   ssd   0.43649  1.00000 447 GiB 3.1 GiB 2.1 GiB 478 MiB 546 MiB 444 GiB 0.70 0.15 156     up         osd.685 
                        TOTAL 111 TiB 5.2 TiB 3.9 TiB 478 MiB  17 GiB 106 TiB 4.70                                 
MIN/MAX VAR: 0.15/1.20  STDDEV: 1.17