James,

You can set these values in ceph.conf:

[global]
...
osd pool default size = 3
osd pool default min size = 2
...

New pools that are created will use those values.

If you run a "ceph -s" and look at the "usage" line, it shows how much space is (1) used, (2) available, and (3) total, e.g.:

usage: 19465 GB used, 60113 GB / 79578 GB avail

We choose to use Openstack with Ceph in this decade and do the other things, not because they are easy, but because they are hard... ;-p

Cary
-Dynamic
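Those [global] defaults only apply to pools created after the change; pools that already exist keep their current values and have to be set per pool. A minimal sketch, using the pool names from the "ceph osd pool ls detail" output quoted below (run from a node with an admin keyring):

for pool in rbd images backups volumes compute; do
    ceph osd pool set $pool size 3
    ceph osd pool set $pool min_size 2
done

# verify a pool afterwards
ceph osd pool get images size
ceph osd pool get images min_size

Setting size on a pool that already holds data triggers the backfill David mentions below, so doing one pool at a time and watching ceph -s in between (as in the plan further down the thread) keeps the load manageable.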
On Fri, Dec 15, 2017 at 10:12 PM, David Turner <drakonstein@xxxxxxxxx> wrote:
> In conjunction with increasing the pool size to 3, also increase the pool
> min_size to 2. `ceph df` and `ceph osd df` will eventually show the full
> size in use in your cluster. In particular, the output of `ceph df` with
> available size in a pool takes into account the pool's replication size.
> Continue watching ceph -s or ceph -w to see when the backfilling for your
> change to replication size finishes.
>
> On Fri, Dec 15, 2017 at 5:06 PM James Okken <James.Okken@xxxxxxxxxxxx> wrote:
>>
>> This whole effort went extremely well, thanks to Cary, and I'm not used to
>> that with CEPH so far. (And OpenStack, ever...)
>> Thank you Cary.
>>
>> I've upped the replication factor and now I see "replicated size 3" in each
>> of my pools. Is this the only place to check the replication level? Is there
>> a global setting, or only a setting per pool?
>>
>> ceph osd pool ls detail
>> pool 0 'rbd' replicated size 3......
>> pool 1 'images' replicated size 3...
>> ...
>>
>> One last question!
>> At this replication level how can I tell how much total space I actually
>> have now?
>> Do I just 1/3 the Global size?
>>
>> ceph df
>> GLOBAL:
>>     SIZE       AVAIL      RAW USED     %RAW USED
>>     13680G     12998G         682G          4.99
>> POOLS:
>>     NAME        ID     USED     %USED     MAX AVAIL     OBJECTS
>>     rbd         0         0         0         6448G           0
>>     images      1      216G      3.24         6448G       27745
>>     backups     2         0         0         6448G           0
>>     volumes     3      117G      1.79         6448G       30441
>>     compute     4         0         0         6448G           0
>>
>> ceph osd df
>> ID WEIGHT  REWEIGHT SIZE   USE     AVAIL  %USE VAR  PGS
>>  0 0.81689 1.00000   836G  36549M   800G  4.27 0.86  67
>>  4 3.70000 1.00000  3723G    170G  3553G  4.58 0.92 270
>>  1 0.81689 1.00000   836G  49612M   788G  5.79 1.16  56
>>  5 3.70000 1.00000  3723G    192G  3531G  5.17 1.04 282
>>  2 0.81689 1.00000   836G  33639M   803G  3.93 0.79  58
>>  3 3.70000 1.00000  3723G    202G  3521G  5.43 1.09 291
>>               TOTAL 13680G    682G 12998G  4.99
>> MIN/MAX VAR: 0.79/1.16  STDDEV: 0.67
>>
>> Thanks!
>>
>> -----Original Message-----
>> From: Cary [mailto:dynamic.cary@xxxxxxxxx]
>> Sent: Friday, December 15, 2017 4:05 PM
>> To: James Okken
>> Cc: ceph-users@xxxxxxxxxxxxxx
>> Subject: Re: add hard drives to 3 CEPH servers (3 server cluster)
>>
>> James,
>>
>> Those errors are normal. Ceph creates the missing files. You can check
>> "/var/lib/ceph/osd/ceph-6" before and after you run those commands to see
>> what files are added there.
>>
>> Make sure you get the replication factor set.
>>
>> Cary
>> -Dynamic
>>
>> On Fri, Dec 15, 2017 at 6:11 PM, James Okken <James.Okken@xxxxxxxxxxxx> wrote:
>> > Thanks again Cary,
>> >
>> > Yes, once all the backfilling was done I was back to a healthy cluster.
>> > I moved on to the same steps for the next server in the cluster; it is
>> > backfilling now.
>> > Once that is done I will do the last server in the cluster, and then I
>> > think I am done!
>> >
>> > Just checking on one thing. I get these messages when running this
>> > command. I assume this is OK, right?
>> >
>> > root@node-54:~# ceph-osd -i 4 --mkfs --mkkey --osd-uuid 25c21708-f756-4593-bc9e-c5506622cf07
>> > 2017-12-15 17:28:22.849534 7fd2f9e928c0 -1 journal FileJournal::_open:
>> > disabling aio for non-block journal. Use journal_force_aio to force
>> > use of aio anyway
>> > 2017-12-15 17:28:22.855838 7fd2f9e928c0 -1 journal FileJournal::_open:
>> > disabling aio for non-block journal. Use journal_force_aio to force
>> > use of aio anyway
>> > 2017-12-15 17:28:22.856444 7fd2f9e928c0 -1
>> > filestore(/var/lib/ceph/osd/ceph-4) could not find
>> > #-1:7b3f43c4:::osd_superblock:0# in index: (2) No such file or directory
>> > 2017-12-15 17:28:22.893443 7fd2f9e928c0 -1 created object store
>> > /var/lib/ceph/osd/ceph-4 for osd.4 fsid 2b9f7957-d0db-481e-923e-89972f6c594f
>> > 2017-12-15 17:28:22.893484 7fd2f9e928c0 -1 auth: error reading file:
>> > /var/lib/ceph/osd/ceph-4/keyring: can't open
>> > /var/lib/ceph/osd/ceph-4/keyring: (2) No such file or directory
>> > 2017-12-15 17:28:22.893662 7fd2f9e928c0 -1 created new key in keyring
>> > /var/lib/ceph/osd/ceph-4/keyring
>> >
>> > thanks
>> >
>> > -----Original Message-----
>> > From: Cary [mailto:dynamic.cary@xxxxxxxxx]
>> > Sent: Thursday, December 14, 2017 7:13 PM
>> > To: James Okken
>> > Cc: ceph-users@xxxxxxxxxxxxxx
>> > Subject: Re: add hard drives to 3 CEPH servers (3 server cluster)
>> >
>> > James,
>> >
>> > Usually once the misplaced data has balanced out, the cluster should
>> > reach a healthy state. If you run a "ceph health detail" Ceph will show you
>> > some more detail about what is happening. Is Ceph still recovering, or has
>> > it stalled? Has the "objects misplaced (62.511%)" figure changed to a lower %?
>> >
>> > Cary
>> > -Dynamic
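A quick sketch of the checks Cary describes here, in command form (all read-only; the grep patterns are just one convenient way of pulling out the interesting lines):

ceph health detail | head -n 20
ceph -s | grep -E 'degraded|misplaced'
# or leave it on screen and watch the percentages fall
watch -n 30 "ceph -s | grep -E 'degraded|misplaced|backfill'"

If the percentages stop falling for a long time, "ceph health detail" will list the specific PGs that are stuck.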
>> > On Thu, Dec 14, 2017 at 10:52 PM, James Okken <James.Okken@xxxxxxxxxxxx> wrote:
>> >> Thanks Cary!
>> >>
>> >> Your directions worked on my first server. (Once I found the missing
>> >> carriage return in your list of commands; the email must have messed it up.)
>> >>
>> >> For anyone else:
>> >> chown -R ceph:ceph /var/lib/ceph/osd/ceph-4 ceph auth add osd.4 osd 'allow *' mon 'allow profile osd' -i /etc/ceph/ceph.osd.4.keyring
>> >> really is 2 commands:
>> >> chown -R ceph:ceph /var/lib/ceph/osd/ceph-4
>> >> and
>> >> ceph auth add osd.4 osd 'allow *' mon 'allow profile osd' -i /etc/ceph/ceph.osd.4.keyring
>> >>
>> >> Cary, what am I looking for in ceph -w and ceph -s to show the status
>> >> of the data moving?
>> >> Seems like the data is moving and that I have some issue...
>> >>
>> >> root@node-53:~# ceph -w
>> >>     cluster 2b9f7957-d0db-481e-923e-89972f6c594f
>> >>      health HEALTH_WARN
>> >>             176 pgs backfill_wait
>> >>             1 pgs backfilling
>> >>             27 pgs degraded
>> >>             1 pgs recovering
>> >>             26 pgs recovery_wait
>> >>             27 pgs stuck degraded
>> >>             204 pgs stuck unclean
>> >>             recovery 10322/84644 objects degraded (12.195%)
>> >>             recovery 52912/84644 objects misplaced (62.511%)
>> >>      monmap e3: 3 mons at {node-43=192.168.1.7:6789/0,node-44=192.168.1.5:6789/0,node-45=192.168.1.3:6789/0}
>> >>             election epoch 138, quorum 0,1,2 node-45,node-44,node-43
>> >>      osdmap e206: 4 osds: 4 up, 4 in; 177 remapped pgs
>> >>             flags sortbitwise,require_jewel_osds
>> >>       pgmap v3936175: 512 pgs, 5 pools, 333 GB data, 58184 objects
>> >>             370 GB used, 5862 GB / 6233 GB avail
>> >>             10322/84644 objects degraded (12.195%)
>> >>             52912/84644 objects misplaced (62.511%)
>> >>                  308 active+clean
>> >>                  176 active+remapped+wait_backfill
>> >>                   26 active+recovery_wait+degraded
>> >>                    1 active+remapped+backfilling
>> >>                    1 active+recovering+degraded
>> >> recovery io 100605 kB/s, 14 objects/s
>> >>   client io 0 B/s rd, 92788 B/s wr, 50 op/s rd, 11 op/s wr
>> >>
>> >> 2017-12-14 22:45:57.459846 mon.0 [INF] pgmap v3936174: 512 pgs: 1 activating, 1 active+recovering+degraded, 26 active+recovery_wait+degraded, 1 active+remapped+backfilling, 307 active+clean, 176 active+remapped+wait_backfill; 333 GB data, 369 GB used, 5863 GB / 6233 GB avail; 0 B/s rd, 101107 B/s wr, 19 op/s; 10354/84644 objects degraded (12.232%); 52912/84644 objects misplaced (62.511%); 12224 kB/s, 2 objects/s recovering
>> >> 2017-12-14 22:45:58.466736 mon.0 [INF] pgmap v3936175: 512 pgs: 1 active+recovering+degraded, 26 active+recovery_wait+degraded, 1 active+remapped+backfilling, 308 active+clean, 176 active+remapped+wait_backfill; 333 GB data, 370 GB used, 5862 GB / 6233 GB avail; 0 B/s rd, 92788 B/s wr, 61 op/s; 10322/84644 objects degraded (12.195%); 52912/84644 objects misplaced (62.511%); 100605 kB/s, 14 objects/s recovering
>> >> 2017-12-14 22:46:00.474335 mon.0 [INF] pgmap v3936176: 512 pgs: 1 active+recovering+degraded, 26 active+recovery_wait+degraded, 1 active+remapped+backfilling, 308 active+clean, 176 active+remapped+wait_backfill; 333 GB data, 370 GB used, 5862 GB / 6233 GB avail; 0 B/s rd, 434 kB/s wr, 45 op/s; 10322/84644 objects degraded (12.195%); 52912/84644 objects misplaced (62.511%); 84234 kB/s, 10 objects/s recovering
>> >> 2017-12-14 22:46:02.482228 mon.0 [INF] pgmap v3936177: 512 pgs: 1 active+recovering+degraded, 26 active+recovery_wait+degraded, 1 active+remapped+backfilling, 308 active+clean, 176 active+remapped+wait_backfill; 333 GB data, 370 GB used, 5862 GB / 6233 GB avail; 0 B/s rd, 334 kB/s wr
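As for what to look for in that output: the "objects degraded" and "objects misplaced" percentages should keep dropping, and the wait_backfill/backfilling PG counts should shrink until everything is active+clean. Two more read-only commands give a more granular view (a sketch; column layouts vary slightly between releases):

ceph pg dump_stuck unclean | head -n 20    # PGs still remapped or waiting to backfill
ceph osd df                                # %USE on the newly added OSD should keep climbing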
>> >> -----Original Message-----
>> >> From: Cary [mailto:dynamic.cary@xxxxxxxxx]
>> >> Sent: Thursday, December 14, 2017 4:21 PM
>> >> To: James Okken
>> >> Cc: ceph-users@xxxxxxxxxxxxxx
>> >> Subject: Re: add hard drives to 3 CEPH servers (3 server cluster)
>> >>
>> >> Jim,
>> >>
>> >> I am not an expert, but I believe I can assist.
>> >>
>> >> Normally you will only have 1 OSD per drive. I have heard discussions
>> >> about using multiple OSDs per disk when using SSDs, though.
>> >>
>> >> Once your drives have been installed you will have to format them,
>> >> unless you are using Bluestore. My steps for formatting are below.
>> >> Replace the sXX with your drive name.
>> >>
>> >> parted -a optimal /dev/sXX
>> >> print
>> >> mklabel gpt
>> >> unit mib
>> >> mkpart OSD4sdd1 1 -1
>> >> quit
>> >> mkfs.xfs -f /dev/sXX1
>> >>
>> >> # Run blkid, and copy the UUID for the newly formatted drive.
>> >> blkid
>> >> # Add the mount point/UUID to fstab. The mount point will be created later.
>> >> vi /etc/fstab
>> >> # For example:
>> >> UUID=6386bac4-7fef-3cd2-7d64-13db51d83b12 /var/lib/ceph/osd/ceph-4 xfs rw,noatime,inode64,logbufs=8 0 0
>> >>
>> >> # You can then add the OSD to the cluster.
>> >> uuidgen
>> >> # Replace the UUID below with the UUID that was created with uuidgen.
>> >> ceph osd create 23e734d7-96d8-4327-a2b9-0fbdc72ed8f1
>> >> # Note which OSD number it creates; it is usually the lowest available OSD number.
>> >>
>> >> # Add osd.4 to ceph.conf on all Ceph nodes.
>> >> vi /etc/ceph/ceph.conf
>> >> ...
>> >> [osd.4]
>> >> public addr = 172.1.3.1
>> >> cluster addr = 10.1.3.1
>> >> ...
>> >>
>> >> # Now add the mount point.
>> >> mkdir -p /var/lib/ceph/osd/ceph-4
>> >> chown -R ceph:ceph /var/lib/ceph/osd/ceph-4
>> >>
>> >> # The command below mounts everything in fstab.
>> >> mount -a
>> >> # The number after -i below needs to be changed to the correct OSD ID, and
>> >> # the osd-uuid needs to be changed to the UUID created with uuidgen above.
>> >> # Your keyring location may be different and may need to be changed as well.
>> >> ceph-osd -i 4 --mkfs --mkkey --osd-uuid 23e734d7-96d8-4327-a2b9-0fbdc72ed8f1
>> >> chown -R ceph:ceph /var/lib/ceph/osd/ceph-4
>> >> ceph auth add osd.4 osd 'allow *' mon 'allow profile osd' -i /etc/ceph/ceph.osd.4.keyring
>> >>
>> >> # Add the new OSD to its host in the crush map.
>> >> ceph osd crush add osd.4 .0 host=YOURhostNAME
>> >>
>> >> # Since the weight used in the previous step was .0, you will need to
>> >> # increase it. I use 1 for a 1TB drive and 5 for a 5TB drive. The command
>> >> # below will reweight osd.4 to 1. You may need to slowly ramp up this number,
>> >> # i.e. .10, then .20, etc.
>> >> ceph osd crush reweight osd.4 1
>> >>
>> >> You should now be able to start the drive. You can watch the data move
>> >> to the drive with a ceph -w. Once data has migrated to the drive, start
>> >> the next.
>> >>
>> >> Cary
>> >> -Dynamic
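The "slowly ramp up" step lends itself to a small loop. A rough sketch only, assuming osd.4, a target weight of 1 for a ~1TB drive, and the systemd unit naming James uses elsewhere in this thread; the step sizes and the 60-second poll are arbitrary choices:

systemctl start ceph-osd@4
for w in 0.1 0.25 0.5 0.75 1.0; do
    ceph osd crush reweight osd.4 $w
    # let backfill settle before the next bump
    until ceph health | grep -q HEALTH_OK; do sleep 60; done
done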
>> >> On Thu, Dec 14, 2017 at 5:34 PM, James Okken <James.Okken@xxxxxxxxxxxx> wrote:
>> >>> Hi all,
>> >>>
>> >>> Please let me know if I am missing steps or using the wrong steps.
>> >>>
>> >>> I'm hoping to expand my small CEPH cluster by adding 4TB hard drives
>> >>> to each of the 3 servers in the cluster.
>> >>>
>> >>> I also need to change my replication factor from 1 to 3.
>> >>> This is part of an Openstack environment deployed by Fuel, and I had
>> >>> foolishly set my replication factor to 1 in the Fuel settings before deploy.
>> >>> I know this would have been done better at the beginning. I do want to keep
>> >>> the current cluster and not start over. I know this is going to thrash my
>> >>> cluster for a while replicating, but there isn't too much data on it yet.
>> >>>
>> >>> To start I need to safely turn off each CEPH server and add in the 4TB drive.
>> >>> To do that I am going to run:
>> >>> ceph osd set noout
>> >>> systemctl stop ceph-osd@1   (or 2 or 3 on the other servers)
>> >>> ceph osd tree   (to verify it is down)
>> >>> poweroff, install the 4TB drive, boot up again
>> >>> ceph osd unset noout
>> >>>
>> >>> Next step would be to get CEPH to use the 4TB drives. Each CEPH
>> >>> server already has an 836GB OSD.
>> >>>
>> >>> ceph> osd df
>> >>> ID WEIGHT  REWEIGHT SIZE  USE  AVAIL %USE  VAR  PGS
>> >>>  0 0.81689 1.00000  836G 101G  734G 12.16 0.90 167
>> >>>  1 0.81689 1.00000  836G 115G  721G 13.76 1.02 166
>> >>>  2 0.81689 1.00000  836G 121G  715G 14.49 1.08 179
>> >>>              TOTAL 2509G 338G 2171G 13.47
>> >>> MIN/MAX VAR: 0.90/1.08  STDDEV: 0.97
>> >>>
>> >>> ceph> df
>> >>> GLOBAL:
>> >>>     SIZE      AVAIL     RAW USED     %RAW USED
>> >>>     2509G     2171G         338G         13.47
>> >>> POOLS:
>> >>>     NAME        ID     USED     %USED     MAX AVAIL     OBJECTS
>> >>>     rbd         0         0         0         2145G           0
>> >>>     images      1      216G      9.15         2145G       27745
>> >>>     backups     2         0         0         2145G           0
>> >>>     volumes     3      114G      5.07         2145G       29717
>> >>>     compute     4         0         0         2145G           0
>> >>>
>> >>> Once I get the 4TB drive into each CEPH server, should I look at
>> >>> increasing the current OSD (ie: to 4836GB)?
>> >>> Or create a second 4000GB OSD on each CEPH server?
>> >>> If I am going to create a second OSD on each CEPH server I hope to use
>> >>> this doc:
>> >>> http://docs.ceph.com/docs/master/rados/operations/add-or-rm-osds/
>> >>>
>> >>> As far as changing the replication factor from 1 to 3:
>> >>> Here are my pools now:
>> >>>
>> >>> ceph osd pool ls detail
>> >>> pool 0 'rbd' replicated size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1 flags hashpspool stripe_width 0
>> >>> pool 1 'images' replicated size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 116 flags hashpspool stripe_width 0
>> >>>         removed_snaps [1~3,b~6,12~8,20~2,24~6,2b~8,34~2,37~20]
>> >>> pool 2 'backups' replicated size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 7 flags hashpspool stripe_width 0
>> >>> pool 3 'volumes' replicated size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 256 pgp_num 256 last_change 73 flags hashpspool stripe_width 0
>> >>>         removed_snaps [1~3]
>> >>> pool 4 'compute' replicated size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 34 flags hashpspool stripe_width 0
>> >>>
>> >>> I plan on using these steps I saw online:
>> >>> ceph osd pool set rbd size 3
>> >>> ceph -s   (verify that replication completes successfully)
>> >>> ceph osd pool set images size 3
>> >>> ceph -s
>> >>> ceph osd pool set backups size 3
>> >>> ceph -s
>> >>> ceph osd pool set volumes size 3
>> >>> ceph -s
>> >>>
>> >>> Please let me know any advice or better methods...
>> >>>
>> >>> thanks
>> >>>
>> >>> --Jim
>> >>>
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com