Re: Adding multiple OSD

Thank you for the detailed explanation!

I have another doubt.

This is the total space available in the cluster:

TOTAL: 23490G
Used:  10170G
Avail: 13320G


But ecpool1 shows MAX AVAIL as just ~3.5 TB (3546G). What am I missing?
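
My rough expectation, assuming usable space for a k=5, m=3 pool is simply the raw available space times k/(k+m), would be:

    13320G * 5 / 8 ≈ 8325G

but ceph df below reports only 3546G for ecpool1, so I am guessing MAX AVAIL is derived from the fullest OSD in the crush rule rather than from the cluster-wide free space?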

==========


$ ceph df
GLOBAL:
    SIZE       AVAIL      RAW USED     %RAW USED
    23490G     13338G       10151G         43.22
POOLS:
    NAME            ID     USED      %USED     MAX AVAIL     OBJECTS
    ostemplates     1       162G      2.79         1134G       42084
    imagepool       34      122G      2.11         1891G       34196
    cvm1            54      8058         0         1891G         950
    ecpool1         55     4246G     42.77         3546G     1232590


$ ceph osd df
ID CLASS WEIGHT  REWEIGHT SIZE   USE    AVAIL  %USE  VAR  PGS
 0   ssd 1.86469  1.00000  1909G   625G  1284G 32.76 0.76 201
 1   ssd 1.86469  1.00000  1909G   691G  1217G 36.23 0.84 208
 2   ssd 0.87320  1.00000   894G   587G   306G 65.67 1.52 156
11   ssd 0.87320  1.00000   894G   631G   262G 70.68 1.63 186
 3   ssd 0.87320  1.00000   894G   605G   288G 67.73 1.56 165
14   ssd 0.87320  1.00000   894G   635G   258G 71.07 1.64 177
 4   ssd 0.87320  1.00000   894G   419G   474G 46.93 1.08 127
15   ssd 0.87320  1.00000   894G   373G   521G 41.73 0.96 114
16   ssd 0.87320  1.00000   894G   492G   401G 55.10 1.27 149
 5   ssd 0.87320  1.00000   894G   288G   605G 32.25 0.74  87
 6   ssd 0.87320  1.00000   894G   342G   551G 38.28 0.88 102
 7   ssd 0.87320  1.00000   894G   300G   593G 33.61 0.78  93
22   ssd 0.87320  1.00000   894G   343G   550G 38.43 0.89 104
 8   ssd 0.87320  1.00000   894G   267G   626G 29.90 0.69  77
 9   ssd 0.87320  1.00000   894G   376G   518G 42.06 0.97 118
10   ssd 0.87320  1.00000   894G   322G   571G 36.12 0.83 102
19   ssd 0.87320  1.00000   894G   339G   554G 37.95 0.88 109
12   ssd 0.87320  1.00000   894G   360G   534G 40.26 0.93 112
13   ssd 0.87320  1.00000   894G   404G   489G 45.21 1.04 120
20   ssd 0.87320  1.00000   894G   342G   551G 38.29 0.88 103
23   ssd 0.87320  1.00000   894G   148G   745G 16.65 0.38  61
17   ssd 0.87320  1.00000   894G   423G   470G 47.34 1.09 117
18   ssd 0.87320  1.00000   894G   403G   490G 45.18 1.04 120
21   ssd 0.87320  1.00000   894G   444G   450G 49.67 1.15 130
                    TOTAL 23490G 10170G 13320G 43.30



Karun Josy

On Tue, Dec 5, 2017 at 4:42 AM, Karun Josy <karunjosy1@xxxxxxxxx> wrote:
Thank you for the detailed explanation!

I have another doubt.

This is the total space available in the cluster:

TOTAL: 23490G
Used:  10170G
Avail: 13320G


But ecpool1 shows MAX AVAIL as just about 3 TB.



Karun Josy

On Tue, Dec 5, 2017 at 1:06 AM, David Turner <drakonstein@xxxxxxxxx> wrote:
No, I would only add disks to 1 failure domain at a time.  So in your situation where you're adding 2 more disks to each node, I would recommend adding the 2 disks into 1 node at a time.  Your failure domain is the crush-failure-domain=host.  So you can lose a host and only lose 1 copy of the data.  If all of your pools are using the k=5 m=3 profile, then I would say it's fine to add the disks into 2 nodes at a time.  If you have any replica pools for RGW metadata or anything, then I would stick with the 1 host at a time.
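
Roughly something like this, one node at a time (the device names are just placeholders, and the exact ceph-volume syntax depends on your release):

# on the first node only
$ ceph osd set norebalance                  # optional: hold off data movement until both new OSDs exist
$ ceph-volume lvm create --data /dev/sdX    # first new disk (placeholder device)
$ ceph-volume lvm create --data /dev/sdY    # second new disk (placeholder device)
$ ceph osd unset norebalance
$ ceph -s                                   # wait for backfill to finish before moving to the next node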

On Mon, Dec 4, 2017 at 2:29 PM Karun Josy <karunjosy1@xxxxxxxxx> wrote:
Thanks for your reply!

I am using an erasure-coded profile with k=5, m=3:

$ ceph osd erasure-code-profile get profile5by3
crush-device-class=
crush-failure-domain=host
crush-root=default
jerasure-per-chunk-alignment=false
k=5
m=3
plugin=jerasure
technique=reed_sol_van
w=8
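
(If I read this right, each object is split into 5 data chunks plus 3 coding chunks, one chunk per host across the 8 hosts, so the raw overhead is (5+3)/5 = 1.6x, i.e. usable capacity is roughly 5/8 of raw, and the pool can tolerate losing any 3 chunks.)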


The cluster has 8 nodes with 3 disks each. We are planning to add 2 more to each node.

If I understand correctly, I can add 3 disks at once, right, assuming 3 disks can fail at a time as per the EC profile?

Karun Josy

On Tue, Dec 5, 2017 at 12:06 AM, David Turner <drakonstein@xxxxxxxxx> wrote:
Depending on how well you burn in/test your new disks, I like to add only 1 failure domain of disks at a time, in case some of the disks you're adding are bad.  If you are confident that your disks aren't likely to fail during the backfilling, then you can go with more.  I just added 8 servers (16 OSDs each) to a cluster with 15 servers (16 OSDs each) all at the same time, but we spent 2 weeks testing the hardware before adding the new nodes to the cluster.

If you add 1 failure domain at a time, then any DoA disks in the new nodes will only be able to fail with 1 copy of your data instead of across multiple nodes.
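
For the burn-in itself, a minimal sketch with standard tools (destructive, so only on disks that hold no data yet; the device name is a placeholder):

$ smartctl -t long /dev/sdX     # extended SMART self-test
$ badblocks -wsv /dev/sdX       # destructive write/read pattern test
$ smartctl -a /dev/sdX          # afterwards, check reallocated/pending sector counts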

On Mon, Dec 4, 2017 at 12:54 PM Karun Josy <karunjosy1@xxxxxxxxx> wrote:
Hi,

Is it recommended to add OSD disks one by one, or can I add a couple of disks at a time?

Current cluster size is about 4 TB.



Karun 
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
