Re: Production System Evaluation / Problems


 



Hi,

 

1.       It is possible to do that with the primary affinity setting. The documentation gives an example with an SSD as the primary OSD and an HDD as the secondary. I think it would work for an Active/Passive DC scenario; it might be tricky for Active/Active. If you run Ceph across 2 DCs you may also have quorum problems; a third location with 1 MON can help break ties. (Rough commands after point 3 below.)

2.       Zap & re-create? (See the command sketch below.)

3.       It is common to use 2 VLANs on a LACP bond instead of 1 NIC on each VLAN. You just need to size the pipes accordingly to avoid bottlenecks.
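
In case it helps, a rough command sketch for 1. and 2. (untested, assuming jewel-era ceph-disk tooling; device names are placeholders, and Proxmox also wraps most of this in pveceph if you prefer):

# 1. steer reads towards DC A by taking the DC B OSDs out of primary duty
#    (needs mon_osd_allow_primary_affinity = true)
ceph osd primary-affinity osd.3 0
ceph osd primary-affinity osd.4 0
ceph osd primary-affinity osd.5 0

# 2. zap & re-create a broken OSD from scratch (this destroys its data)
ceph osd out 0
ceph osd crush remove osd.0
ceph auth del osd.0
ceph osd rm 0
ceph-disk zap /dev/sdX          # the real device behind osd.0
ceph-disk prepare /dev/sdX
ceph-disk activate /dev/sdX1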

 

Cheers,

 

 

From: ceph-users <ceph-users-bounces@xxxxxxxxxxxxxx> on behalf of Stefan Lissmats <stefan@xxxxxxxxxx>
Date: Monday 28 November 2016 11:12
To: "Strankowski, Florian" <FStrankowski@xxxxxxxxxxxxxxxxxxxxxxxxx>, "'ceph-users@xxxxxxxxxxxxxx'" <ceph-users@xxxxxxxxxxxxxx>
Subject: Re: [ceph-users] Production System Evaluation / Problems

 

Hey!

 

I have been using Ceph for a while but am not a real expert; still, I will give you some pointers so that everyone can help you further.

 

1. The CRUSH map is divided into two parts: the topology description (which you provided us with) and the CRUSH rules that define how data is placed in that topology. Have you made any changes to the rules? If so, it would be great if you showed how the rules are defined. I think you can get the data placed the way you want with some more advanced CRUSH rules, but I don't think there is any way to have a read-only copy. I guess you have seen this? http://docs.ceph.com/docs/jewel/rados/operations/crush-map/
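
To illustrate (just an untested sketch using the datacenter1/datacenter2 bucket names and the blade type from your tree, and assuming ruleset id 1 is free), a rule that stores two copies in one DC and the remaining copy in the other could look roughly like this:

rule two_in_dc1_one_in_dc2 {
        ruleset 1
        type replicated
        min_size 2
        max_size 3
        # first two copies (including the primary) in datacenter1
        step take datacenter1
        step chooseleaf firstn 2 type blade
        step emit
        # the rest (pool size minus 2) in datacenter2
        step take datacenter2
        step chooseleaf firstn -2 type blade
        step emit
}

You would decompile the CRUSH map with crushtool, add the rule, recompile and inject it, then point the pool at it with "ceph osd pool set <pool> crush_ruleset 1". Reads always go to the primary OSD, and the primary will normally be the first OSD the rule emits (i.e. one in datacenter1), so there is no separate read-only copy to configure.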

 

 

2. Have you looked at the OSD log on the server that osd.0 resides on? That could give some information about why osd.0 never comes up. It should normally be in /var/log/ceph/ceph-osd.0.log.
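
Assuming a systemd-based node and default paths, something like this on the host of osd.0 should show why the daemon refuses to start:

tail -n 100 /var/log/ceph/ceph-osd.0.log
systemctl status ceph-osd@0
systemctl restart ceph-osd@0    # then watch the log again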

 

Other notes: 

You have 6 mons, but you normally want an odd number and rarely need more than 5 (even 3 is usually enough).
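
If you decide to drop back to 5 (or 3), removing a monitor is a single command (stop the mon daemon on that host first and clean up its ceph.conf entry afterwards), e.g. for the mon named "5":

ceph mon remove 5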

 


From: ceph-users [ceph-users-bounces@xxxxxxxxxxxxxx] on behalf of Strankowski, Florian [FStrankowski@xxxxxxxxxxxxxxxxxxxxxxxxx]
Sent: 28 November 2016 10:29
To: 'ceph-users@xxxxxxxxxxxxxx'
Subject: [ceph-users] Production System Evaluation / Problems

Hey guys,

 

we're evaluating Ceph at the moment for a bigger production-ready implementation. So far we've had some success and
some problems with Ceph. In combination with Proxmox, Ceph works quite well out of the box. I've tried to cover my questions
with existing answers and solutions, but I still find some things unclear. Here are the things I'm having problems with:

 

1.       The first question is just for my understanding: how does Ceph handle failure domains? From what I've read so far, I create a new CRUSH map with, for example, 2 datacenters; each DC has a rack, and in that rack there is a chassis with nodes.
By using my own CRUSH map, Ceph will "see" it and place the data accordingly, automatically. What I am missing here is some finer-grained control.
For example, with a replica count of 3 I want Ceph to store the data 2 times in datacenter A and one time in datacenter B. Furthermore,
I want read access exclusively within 1 datacenter (if possible and the data is available) to keep RTT low. Is this possible?

2.       I've built my own CRUSH map and tried to get it working. No success at all. I'm literally "done with this s…" :) and that's why I'm here right now. Here is the state
of the cluster:

 

    cluster 42f04e55-0a3f-4644-8543-516cd46cd4e9
     health HEALTH_WARN
            79 pgs degraded
            262 pgs stale
            79 pgs stuck degraded
            262 pgs stuck stale
            512 pgs stuck unclean
            79 pgs stuck undersized
            79 pgs undersized
     monmap e8: 6 mons at {0=192.168.40.20:6789/0,1=192.168.40.21:6789/0,2=192.168.40.22:6789/0,3=192.168.40.23:6789/0,4=192.168.40.24:6789/0,5=192.168.40.25:6789/0}
            election epoch 86, quorum 0,1,2,3,4,5 0,1,2,3,4,5
     mdsmap e2: 0/0/1 up
     osdmap e212: 6 osds: 5 up, 5 in; 250 remapped pgs
      pgmap v366013: 512 pgs, 2 pools, 0 bytes data, 0 objects
            278 MB used, 900 GB / 901 GB avail
                 250 active+remapped
                 183 stale+active+remapped
                  79 stale+active+undersized+degraded+remapped

 

Here is the config (output of ceph osd tree):

 

 

ID  WEIGHT  TYPE NAME                  UP/DOWN REWEIGHT PRIMARY-AFFINITY
-27 1.07997 root default
-25 0.53998     datacenter datacenter1
-23 0.53998         chassis chassis1
 -1 0.17999             blade blade3
  0 0.17999                 osd.0         down        0          1.00000
 -2 0.17999             blade blade4
  1 0.17999                 osd.1           up  1.00000          1.00000
 -3 0.17999             blade blade5
  2 0.17999                 osd.2           up  1.00000          1.00000
-26 0.53999     datacenter datacenter2
-24 0.53999         chassis chassis2
-17 0.17999             blade blade17
  3 0.17999                 osd.3           up  0.95001          1.00000
-18 0.17999             blade blade18
  4 0.17999                 osd.4           up  1.00000          1.00000
-19 0.17999             blade blade19
  5 0.17999                 osd.5           up  1.00000          1.00000

 

I simply can't get osd.0 back up. I took it offline, marked it out, reinserted it, set it up again, deleted the OSD configs, remade them: no success
whatsoever. IMHO the documentation on this part is a bit "lousy", so I'm missing some points of information here, sorry folks.

 

3.       Last but not least, I would like to know whether it is a good idea to put the data and config networks on 2 dedicated VLANs instead of on 2 dedicated NICs. Our
hardware is redundant, and we have 10 Gbit fibre optics in-house and 80 Gbit between the two datacenters. The data VLAN uses jumbo frames while the others don't.

 

4.       Do you guys have some kind of "best practice" book available for large-scale deployments? 20+ servers, up to 100+ and 1000+?

 

Regards

 

Florian

 

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
