Re: add hard drives to 3 CEPH servers (3 server cluster)


 



Thanks David.
Thanks again Cary.

If I have
682 GB used, 12998 GB / 13680 GB avail,
then I still need to divide 13680 by 3 (my replication setting) to get my real usable capacity, right?
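Right — and that arithmetic is easy to sanity-check in shell (ignoring filesystem and Ceph overhead, so treat it as a rough estimate):

```shell
# Raw capacity and replication setting from the numbers above.
RAW_TOTAL_GB=13680
REPLICATION=3
# With 3x replication every object is stored three times, so usable
# space is roughly the raw total divided by 3.
echo "$((RAW_TOTAL_GB / REPLICATION))G usable (roughly)"
```

That works out to about 4560G of usable space on this cluster.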

Thanks!


James Okken
Lab Manager
Dialogic Research Inc.
4 Gatehall Drive
Parsippany
NJ 07054
USA

Tel:       973 967 5179
Email:   james.okken@xxxxxxxxxxxx
Web:    www.dialogic.com – The Network Fuel Company

This e-mail is intended only for the named recipient(s) and may contain information that is privileged, confidential and/or exempt from disclosure under applicable law. No waiver of privilege, confidence or otherwise is intended by virtue of communication via the internet. Any unauthorized use, dissemination or copying is strictly prohibited. If you have received this e-mail in error, or are not named as a recipient, please immediately notify the sender and destroy all copies of this e-mail.


-----Original Message-----
From: Cary [mailto:dynamic.cary@xxxxxxxxx] 
Sent: Friday, December 15, 2017 5:56 PM
To: David Turner
Cc: James Okken; ceph-users@xxxxxxxxxxxxxx
Subject: Re:  add hard drives to 3 CEPH servers (3 server cluster)

James,

You can set these values in ceph.conf.

[global]
...
osd pool default size = 3
osd pool default min size = 2
...

New pools that are created will use those values.

If you run "ceph -s" and look at the "usage" line, it shows how much space is used, how much is available, and the total, e.g.:

usage:   19465 GB used, 60113 GB / 79578 GB avail
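If you ever want to script against that line, a small awk sketch can pull the numbers out (the field positions assume exactly this output format, which can vary between Ceph releases):

```shell
# Sample "usage" line from "ceph -s"; whitespace-split fields put the
# used total in $1, available in $4, and raw total in $7.
usage="19465 GB used, 60113 GB / 79578 GB avail"
echo "$usage" | awk '{printf "used=%s avail=%s total=%s\n", $1, $4, $7}'
```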

We choose to use Openstack with Ceph in this decade and do the other things, not because they are easy, but because they are hard...;-p


Cary
-Dynamic

On Fri, Dec 15, 2017 at 10:12 PM, David Turner <drakonstein@xxxxxxxxx> wrote:
> In conjunction with increasing the pool size to 3, also increase the 
> pool min_size to 2.  `ceph df` and `ceph osd df` will eventually show 
> the full size in use in your cluster.  In particular, the available size per
> pool in the output of `ceph df` takes into account the pool's replication size.
> Continue watching ceph -s or ceph -w to see when the backfilling for 
> your change to replication size finishes.
>
> On Fri, Dec 15, 2017 at 5:06 PM James Okken <James.Okken@xxxxxxxxxxxx>
> wrote:
>>
>> This whole effort went extremely well, thanks to Cary, and I'm not 
>> used to that with Ceph so far. (And Openstack, ever....) Thank you 
>> Cary.
>>
>> I've upped the replication factor and now I see "replicated size 3" in 
>> each of my pools. Is this the only place to check replication level? 
>> Is there a Global setting or only a setting per Pool?
>>
>> ceph osd pool ls detail
>> pool 0 'rbd' replicated size 3......
>> pool 1 'images' replicated size 3...
>> ...
>>
>> One last question!
>> At this replication level how can I tell how much total space I 
>> actually have now?
>> Do I just 1/3 the Global size?
>>
>> ceph df
>> GLOBAL:
>>     SIZE       AVAIL      RAW USED     %RAW USED
>>     13680G     12998G         682G          4.99
>> POOLS:
>>     NAME        ID     USED     %USED     MAX AVAIL     OBJECTS
>>     rbd         0         0         0         6448G           0
>>     images      1      216G      3.24         6448G       27745
>>     backups     2         0         0         6448G           0
>>     volumes     3      117G      1.79         6448G       30441
>>     compute     4         0         0         6448G           0
>>
>> ceph osd df
>> ID WEIGHT  REWEIGHT SIZE   USE    AVAIL  %USE VAR  PGS
>>  0 0.81689  1.00000   836G 36549M   800G 4.27 0.86  67
>>  4 3.70000  1.00000  3723G   170G  3553G 4.58 0.92 270
>>  1 0.81689  1.00000   836G 49612M   788G 5.79 1.16  56
>>  5 3.70000  1.00000  3723G   192G  3531G 5.17 1.04 282
>>  2 0.81689  1.00000   836G 33639M   803G 3.93 0.79  58
>>  3 3.70000  1.00000  3723G   202G  3521G 5.43 1.09 291
>>               TOTAL 13680G   682G 12998G 4.99
>> MIN/MAX VAR: 0.79/1.16  STDDEV: 0.67
>>
>> Thanks!
>>
>> -----Original Message-----
>> From: Cary [mailto:dynamic.cary@xxxxxxxxx]
>> Sent: Friday, December 15, 2017 4:05 PM
>> To: James Okken
>> Cc: ceph-users@xxxxxxxxxxxxxx
>> Subject: Re:  add hard drives to 3 CEPH servers (3 server
>> cluster)
>>
>> James,
>>
>>  Those errors are normal. Ceph creates the missing files. You can 
>> check "/var/lib/ceph/osd/ceph-6", before and after you run those 
>> commands to see what files are added there.
>>
>>  Make sure you get the replication factor set.
>>
>>
>> Cary
>> -Dynamic
>>
>> On Fri, Dec 15, 2017 at 6:11 PM, James Okken 
>> <James.Okken@xxxxxxxxxxxx>
>> wrote:
>> > Thanks again Cary,
>> >
>> > Yes, once all the backfilling was done I was back to a Healthy cluster.
>> > I moved on to the same steps for the next server in the cluster; it 
>> > is backfilling now.
>> > Once that is done I will do the last server in the cluster, and 
>> > then I think I am done!
>> >
>> > Just checking on one thing. I get these messages when running this 
>> > command. I assume this is OK, right?
>> > root@node-54:~# ceph-osd -i 4 --mkfs --mkkey --osd-uuid
>> > 25c21708-f756-4593-bc9e-c5506622cf07
>> > 2017-12-15 17:28:22.849534 7fd2f9e928c0 -1 journal FileJournal::_open:
>> > disabling aio for non-block journal.  Use journal_force_aio to 
>> > force use of aio anyway
>> > 2017-12-15 17:28:22.855838 7fd2f9e928c0 -1 journal FileJournal::_open:
>> > disabling aio for non-block journal.  Use journal_force_aio to 
>> > force use of aio anyway
>> > 2017-12-15 17:28:22.856444 7fd2f9e928c0 -1
>> > filestore(/var/lib/ceph/osd/ceph-4) could not find 
>> > #-1:7b3f43c4:::osd_superblock:0# in index: (2) No such file or 
>> > directory
>> > 2017-12-15 17:28:22.893443 7fd2f9e928c0 -1 created object store
>> > /var/lib/ceph/osd/ceph-4 for osd.4 fsid 
>> > 2b9f7957-d0db-481e-923e-89972f6c594f
>> > 2017-12-15 17:28:22.893484 7fd2f9e928c0 -1 auth: error reading file:
>> > /var/lib/ceph/osd/ceph-4/keyring: can't open
>> > /var/lib/ceph/osd/ceph-4/keyring: (2) No such file or directory
>> > 2017-12-15 17:28:22.893662 7fd2f9e928c0 -1 created new key in 
>> > keyring /var/lib/ceph/osd/ceph-4/keyring
>> >
>> > thanks
>> >
>> > -----Original Message-----
>> > From: Cary [mailto:dynamic.cary@xxxxxxxxx]
>> > Sent: Thursday, December 14, 2017 7:13 PM
>> > To: James Okken
>> > Cc: ceph-users@xxxxxxxxxxxxxx
>> > Subject: Re:  add hard drives to 3 CEPH servers (3 
>> > server
>> > cluster)
>> >
>> > James,
>> >
>> >  Usually once the misplaced data has balanced out the cluster 
>> > should reach a healthy state. If you run a "ceph health detail" 
>> > Ceph will show you some more detail about what is happening.  Is 
>> > Ceph still recovering, or has it stalled? has the "objects misplaced (62.511%"
>> > changed to a lower %?
>> >
>> > Cary
>> > -Dynamic
>> >
>> > On Thu, Dec 14, 2017 at 10:52 PM, James Okken 
>> > <James.Okken@xxxxxxxxxxxx>
>> > wrote:
>> >> Thanks Cary!
>> >>
>> >> Your directions worked on my first server. (Once I found the 
>> >> missing carriage return in your list of commands; the email must have messed it up.)
>> >>
>> >> For anyone else, this:
>> >> chown -R ceph:ceph /var/lib/ceph/osd/ceph-4 ceph auth add osd.4 osd 'allow *' mon 'allow profile osd' -i /etc/ceph/ceph.osd.4.keyring
>> >> is really 2 commands:
>> >> chown -R ceph:ceph /var/lib/ceph/osd/ceph-4
>> >> ceph auth add osd.4 osd 'allow *' mon 'allow profile osd' -i /etc/ceph/ceph.osd.4.keyring
>> >>
>> >> Cary, what am I looking for in ceph -w and ceph -s to show the 
>> >> status of the data moving?
>> >> Seems like the data is moving and that I have some issue...
>> >>
>> >> root@node-53:~# ceph -w
>> >>     cluster 2b9f7957-d0db-481e-923e-89972f6c594f
>> >>      health HEALTH_WARN
>> >>             176 pgs backfill_wait
>> >>             1 pgs backfilling
>> >>             27 pgs degraded
>> >>             1 pgs recovering
>> >>             26 pgs recovery_wait
>> >>             27 pgs stuck degraded
>> >>             204 pgs stuck unclean
>> >>             recovery 10322/84644 objects degraded (12.195%)
>> >>             recovery 52912/84644 objects misplaced (62.511%)
>> >>      monmap e3: 3 mons at
>> >> {node-43=192.168.1.7:6789/0,node-44=192.168.1.5:6789/0,node-45=192.168.1.3:6789/0}
>> >>             election epoch 138, quorum 0,1,2 node-45,node-44,node-43
>> >>      osdmap e206: 4 osds: 4 up, 4 in; 177 remapped pgs
>> >>             flags sortbitwise,require_jewel_osds
>> >>       pgmap v3936175: 512 pgs, 5 pools, 333 GB data, 58184 objects
>> >>             370 GB used, 5862 GB / 6233 GB avail
>> >>             10322/84644 objects degraded (12.195%)
>> >>             52912/84644 objects misplaced (62.511%)
>> >>                  308 active+clean
>> >>                  176 active+remapped+wait_backfill
>> >>                   26 active+recovery_wait+degraded
>> >>                    1 active+remapped+backfilling
>> >>                    1 active+recovering+degraded recovery io 100605 
>> >> kB/s, 14 objects/s
>> >>   client io 0 B/s rd, 92788 B/s wr, 50 op/s rd, 11 op/s wr
>> >>
>> >> 2017-12-14 22:45:57.459846 mon.0 [INF] pgmap v3936174: 512 pgs: 1 
>> >> activating, 1 active+recovering+degraded, 26
>> >> active+recovery_wait+degraded, 1 active+remapped+backfilling, 307 
>> >> active+clean, 176 active+remapped+wait_backfill; 333 GB data, 369 GB 
>> >> used, 5863 GB / 6233 GB avail; 0 B/s rd, 101107 B/s wr, 19 op/s;
>> >> 10354/84644 objects degraded (12.232%); 52912/84644 objects 
>> >> misplaced (62.511%); 12224 kB/s, 2 objects/s recovering
>> >> 2017-12-14 22:45:58.466736 mon.0 [INF] pgmap v3936175: 512 pgs: 1
>> >> active+recovering+degraded, 26 active+recovery_wait+degraded, 1
>> >> active+remapped+backfilling, 308 active+clean, 176 active+remapped+wait_backfill; 
>> >> 333 GB data, 370 GB used, 5862 GB /
>> >> 6233 GB avail; 0 B/s rd, 92788 B/s wr, 61 op/s; 10322/84644 
>> >> objects degraded (12.195%); 52912/84644 objects misplaced 
>> >> (62.511%); 100605 kB/s, 14 objects/s recovering
>> >> 2017-12-14 22:46:00.474335 mon.0 [INF] pgmap v3936176: 512 pgs: 1
>> >> active+recovering+degraded, 26 active+recovery_wait+degraded, 1
>> >> active+remapped+backfilling, 308 active+clean, 176 active+remapped+wait_backfill; 
>> >> 333 GB data, 370 GB used, 5862 GB /
>> >> 6233 GB avail; 0 B/s rd, 434 kB/s wr, 45 op/s; 10322/84644 objects 
>> >> degraded (12.195%); 52912/84644 objects misplaced (62.511%); 84234 
>> >> kB/s, 10 objects/s recovering
>> >> 2017-12-14 22:46:02.482228 mon.0 [INF] pgmap v3936177: 512 pgs: 1
>> >> active+recovering+degraded, 26 active+recovery_wait+degraded, 1
>> >> active+remapped+backfilling, 308 active+clean, 176 active+remapped+wait_backfill; 
>> >> 333 GB data, 370 GB used, 5862 GB /
>> >> 6233 GB avail; 0 B/s rd, 334 kB/s wr
>> >>
>> >>
>> >> -----Original Message-----
>> >> From: Cary [mailto:dynamic.cary@xxxxxxxxx]
>> >> Sent: Thursday, December 14, 2017 4:21 PM
>> >> To: James Okken
>> >> Cc: ceph-users@xxxxxxxxxxxxxx
>> >> Subject: Re:  add hard drives to 3 CEPH servers (3 
>> >> server
>> >> cluster)
>> >>
>> >> Jim,
>> >>
>> >> I am not an expert, but I believe I can assist.
>> >>
>> >>  Normally you will have only one OSD per drive, though I have heard 
>> >> discussions about using multiple OSDs per disk when using SSDs.
>> >>
>> >>  Once your drives have been installed you will have to format 
>> >> them, unless you are using Bluestore. My steps for formatting are below.
>> >> Replace the sXX with your drive name.
>> >>
>> >> parted -a optimal /dev/sXX
>> >> # Then, at the (parted) prompt:
>> >> print
>> >> mklabel gpt
>> >> unit mib
>> >> mkpart OSD4sdd1 1 -1
>> >> quit
>> >> mkfs.xfs -f /dev/sXX1
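The same partitioning can be done non-interactively; a hedged sketch of the equivalent one-liner follows (double-check the device name, and note the "--" so that the -1 end value is not parsed as an option):

```shell
DEV=/dev/sXX   # placeholder device name, same as above
# Script-mode parted: GPT label, then one partition spanning the disk.
parted -s -a optimal "$DEV" -- mklabel gpt unit mib mkpart OSD4sdd1 1 -1
mkfs.xfs -f "${DEV}1"
```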
>> >>
>> >> # Run blkid, and copy the UUID for the newly formatted drive.
>> >> blkid
>> >> # Add the mount point/UUID to fstab. The mount point will be 
>> >> created later.
>> >> vi /etc/fstab
>> >> # For example
>> >> UUID=6386bac4-7fef-3cd2-7d64-13db51d83b12 /var/lib/ceph/osd/ceph-4 xfs rw,noatime,inode64,logbufs=8 0 0
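Building that line can also be scripted so the UUID is never retyped by hand (a sketch; the UUID here is the sample one from above, not a real device):

```shell
UUID=6386bac4-7fef-3cd2-7d64-13db51d83b12   # from blkid
# Print the fstab entry; redirect with >> /etc/fstab once you are sure.
printf 'UUID=%s /var/lib/ceph/osd/ceph-4 xfs rw,noatime,inode64,logbufs=8 0 0\n' "$UUID"
```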
>> >>
>> >>
>> >> # You can then add the OSD to the cluster.
>> >>
>> >> uuidgen
>> >> # Replace the UUID below with the UUID that was created with uuidgen.
>> >> ceph osd create 23e734d7-96d8-4327-a2b9-0fbdc72ed8f1
>> >>
>> >> # Notice which OSD number it creates; it is usually the lowest 
>> >> # number available.
>> >>
>> >> # Add osd.4 to ceph.conf on all Ceph nodes.
>> >> vi /etc/ceph/ceph.conf
>> >> ...
>> >> [osd.4]
>> >> public addr = 172.1.3.1
>> >> cluster addr = 10.1.3.1
>> >> ...
>> >>
>> >> # Now add the mount point.
>> >> mkdir -p /var/lib/ceph/osd/ceph-4
>> >> chown -R ceph:ceph /var/lib/ceph/osd/ceph-4
>> >>
>> >> # The command below mounts everything in fstab.
>> >> mount -a
>> >> # The number after -i below needs to be changed to the correct OSD ID, 
>> >> # and the osd-uuid needs to be changed to the UUID created with uuidgen above.
>> >> # Your keyring location may be different and may need changing as well.
>> >> ceph-osd -i 4 --mkfs --mkkey --osd-uuid
>> >> 23e734d7-96d8-4327-a2b9-0fbdc72ed8f1
>> >> chown -R ceph:ceph /var/lib/ceph/osd/ceph-4
>> >> ceph auth add osd.4 osd 'allow *' mon 'allow profile osd' -i /etc/ceph/ceph.osd.4.keyring
>> >>
>> >> # Add the new OSD to its host in the crush map.
>> >> ceph osd crush add osd.4 .0 host=YOURhostNAME
>> >>
>> >> # Since the weight used in the previous step was .0, you will need 
>> >> # to increase it. I use 1 for a 1TB drive and 5 for a 5TB drive. The 
>> >> # command below will reweight osd.4 to 1. You may need to slowly ramp 
>> >> # up this number, e.g. .10, then .20, etc.
>> >> ceph osd crush reweight osd.4 1
>> >>
>> >> You should now be able to start the drive. You can watch the data 
>> >> move to the drive with a ceph -w. Once data has migrated to the 
>> >> drive, start the next.
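The gradual ramp-up described above can be sketched as a loop. The step size and poll interval here are assumptions to tune for your cluster, and it obviously only makes sense against a live cluster, so read it as illustration rather than a recipe:

```shell
# Bump osd.4's crush weight in steps (for a ~1TB drive whose final
# weight is 1), waiting for the cluster to settle between increments.
for w in 0.25 0.50 0.75 1.00; do
    ceph osd crush reweight osd.4 "$w"
    # Crude wait: poll until no backfill/recovery states are reported.
    while ceph -s | grep -Eq 'backfill|recover'; do
        sleep 60
    done
done
```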
>> >>
>> >> Cary
>> >> -Dynamic
>> >>
>> >> On Thu, Dec 14, 2017 at 5:34 PM, James Okken 
>> >> <James.Okken@xxxxxxxxxxxx>
>> >> wrote:
>> >>> Hi all,
>> >>>
>> >>> Please let me know if I am missing steps or using the wrong steps
>> >>>
>> >>> I'm hoping to expand my small CEPH cluster by adding 4TB hard 
>> >>> drives to each of the 3 servers in the cluster.
>> >>>
>> >>> I also need to change my replication factor from 1 to 3.
>> >>> This is part of an Openstack environment deployed by Fuel and I 
>> >>> had foolishly set my replication factor to 1 in the Fuel settings before deploy.
>> >>> I know this would have been done better at the beginning. I do 
>> >>> want to keep the current cluster and not start over. I know this 
>> >>> is going to thrash my cluster for a while replicating, but there isn't too much data on it yet.
>> >>>
>> >>>
>> >>> To start I need to safely turn off each CEPH server and add in 
>> >>> the 4TB
>> >>> drive:
>> >>> To do that I am going to run:
>> >>> ceph osd set noout
>> >>> systemctl stop ceph-osd@1   (or 2 or 3 on the other servers)
>> >>> ceph osd tree   (to verify it is down)
>> >>> poweroff, install the 4TB drive, boot up again
>> >>> ceph osd unset noout
>> >>>
>> >>>
>> >>>
>> >>> Next step would be to get CEPH to use the 4TB drives. Each CEPH 
>> >>> server already has a 836GB OSD.
>> >>>
>> >>> ceph> osd df
>> >>> ID WEIGHT  REWEIGHT SIZE  USE  AVAIL %USE  VAR  PGS
>> >>>  0 0.81689  1.00000  836G 101G  734G 12.16 0.90 167
>> >>>  1 0.81689  1.00000  836G 115G  721G 13.76 1.02 166
>> >>>  2 0.81689  1.00000  836G 121G  715G 14.49 1.08 179
>> >>>               TOTAL 2509G 338G 2171G 13.47
>> >>> MIN/MAX VAR: 0.90/1.08  STDDEV: 0.97
>> >>>
>> >>> ceph> df
>> >>> GLOBAL:
>> >>>     SIZE      AVAIL     RAW USED     %RAW USED
>> >>>     2509G     2171G         338G         13.47
>> >>> POOLS:
>> >>>     NAME        ID     USED     %USED     MAX AVAIL     OBJECTS
>> >>>     rbd         0         0         0         2145G           0
>> >>>     images      1      216G      9.15         2145G       27745
>> >>>     backups     2         0         0         2145G           0
>> >>>     volumes     3      114G      5.07         2145G       29717
>> >>>     compute     4         0         0         2145G           0
>> >>>
>> >>>
>> >>> Once I get the 4TB drive into each CEPH server should I look to 
>> >>> increasing the current OSD (ie: to 4836GB)?
>> >>> Or create a second 4000GB OSD on each CEPH server?
>> >>> If I am going to create a second OSD on each CEPH server I hope 
>> >>> to use this doc:
>> >>> http://docs.ceph.com/docs/master/rados/operations/add-or-rm-osds/
>> >>>
>> >>>
>> >>>
>> >>> As far as changing the replication factor from 1 to 3:
>> >>> Here are my pools now:
>> >>>
>> >>> ceph osd pool ls detail
>> >>> pool 0 'rbd' replicated size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1 flags hashpspool stripe_width 0
>> >>> pool 1 'images' replicated size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 116 flags hashpspool stripe_width 0
>> >>>         removed_snaps [1~3,b~6,12~8,20~2,24~6,2b~8,34~2,37~20]
>> >>> pool 2 'backups' replicated size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 7 flags hashpspool stripe_width 0
>> >>> pool 3 'volumes' replicated size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 256 pgp_num 256 last_change 73 flags hashpspool stripe_width 0
>> >>>         removed_snaps [1~3]
>> >>> pool 4 'compute' replicated size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 34 flags hashpspool stripe_width 0
>> >>>
>> >>> I plan on using these steps I saw online:
>> >>> ceph osd pool set rbd size 3
>> >>> ceph -s   (verify that replication completes successfully)
>> >>> ceph osd pool set images size 3
>> >>> ceph -s
>> >>> ceph osd pool set backups size 3
>> >>> ceph -s
>> >>> ceph osd pool set volumes size 3
>> >>> ceph -s
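Those steps can be wrapped in a loop. This sketch also covers the compute pool, which the list above omits, and raises min_size to 2 only after the extra replicas exist, since setting min_size=2 while PGs still have a single copy can block client IO (live-cluster only, pool names assumed from the listing above):

```shell
# Raise replication on each existing pool first.
for pool in rbd images backups volumes compute; do
    ceph osd pool set "$pool" size 3
done
# ...wait for backfilling to finish (watch "ceph -s"), then:
for pool in rbd images backups volumes compute; do
    ceph osd pool set "$pool" min_size 2
done
```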
>> >>>
>> >>>
>> >>> please let me know any advice or better methods...
>> >>>
>> >>> thanks
>> >>>
>> >>> --Jim
>> >>>
>> >>> _______________________________________________
>> >>> ceph-users mailing list
>> >>> ceph-users@xxxxxxxxxxxxxx
>> >>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



