Re: Adding new CEPH monitor keep SYNCHRONIZING

On 05/18/2015 10:33 AM, Ali Hussein wrote:
> The two old monitors use Ceph version 0.87.1, while the newly added
> monitor uses 0.87.2.
> P.S.: ntp is installed and working fine.

This is not related to clocks (or, at least, it should not be).

State 'synchronizing' means the monitor is synchronizing its store from
the other monitors, so that it reaches a state consistent with theirs
and can join the quorum.

If the monitor stores on the already in-quorum monitors are too big
(1 GB+), synchronization may take a while.  When the stores are several
GBs in size, or even dozens (which is your case), this becomes a hard
task that may need some fine-tuning.
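
If you do end up needing to tune the sync itself, the relevant options
go in the [mon] section of ceph.conf on the monitors.  A minimal sketch,
assuming giant still uses these option names (the values below are only
illustrative, not recommendations):

  [mon]
  # allow larger chunks per sync message, so fewer round-trips are needed
  mon sync max payload size = 4194304
  # compact the leveldb store on startup so it does not balloon during sync
  mon compact on start = true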

The flip side is that when a store is big enough to cause this sort of
problem during synchronization, it usually also means the cluster is in
bad shape, typically due to lots of osdmaps accumulated in the monitor
stores, which is a symptom of an unclean, unhealthy cluster.
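
You can get a rough idea of how bad it is by looking at the store size
on the in-quorum monitors and at how many osdmap epochs they are holding
on to.  Something along these lines (paths assume the default mon data
dir, adjust the mon ids to yours):

  # size of the monitor's leveldb store, on monitor01 and monitor02
  du -sh /var/lib/ceph/mon/ceph-monitor01/store.db
  # first/last committed osdmap epochs the monitors are keeping
  ceph report 2>/dev/null | grep -E 'osdmap_(first|last)_committed'

A large gap between the first and last committed epochs usually means
the monitors are not trimming osdmaps.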

If you have an unhealthy cluster, you should first try to stabilize it
and get it back to HEALTH_OK.  Once the cluster is healthy, the monitors
will trim the no-longer-necessary maps and synchronization will be
faster.  At that point, restart the monitor you are trying to bring into
the cluster and synchronization should run just fine.
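
In your case that probably means unsetting the noscrub/nodeep-scrub
flags (once whatever maintenance required them is done), waiting for
recovery to finish, and only then dealing with the new monitor.
Roughly, adjusting the mon ids and the init system to your distro:

  ceph osd unset noscrub
  ceph osd unset nodeep-scrub
  ceph -s                              # wait until HEALTH_OK
  # optionally compact the in-quorum monitors once they have trimmed
  ceph tell mon.monitor01 compact
  ceph tell mon.monitor02 compact
  # restart the synchronizing monitor so it starts a fresh sync
  service ceph restart mon.monitor03   # or: restart ceph-mon id=monitor03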

  -Joao

> 
> On 18/05/2015 12:11 PM, Mohamed Pakkeer wrote:
>> Hi Ali,
>>
>> Which version of Ceph are you using?  Are there any re-spawning OSDs?
>>
>> Regards
>> K.Mohamed Pakkeer
>>
>> On Mon, May 18, 2015 at 2:23 PM, Ali Hussein
>> <ali.alkhazraji@xxxxxxxxxxxxxxxxx
>> <mailto:ali.alkhazraji@xxxxxxxxxxxxxxxxx>> wrote:
>>
>>     Hi all,
>>
>>     I have two Ceph monitors that have been working fine since I added
>>     them a while ago.  I have now added a new Ceph monitor, and it keeps
>>     showing the following in its log file:
>>
>>     2015-05-18 10:54:42.585123 7f4a9609d700  0
>>     mon.monitor03@2(synchronizing).data_health(0) update_stats avail
>>     44% total 51175 MB, used 28280 MB, avail 22894 MB
>>     2015-05-18 10:55:44.418861 7f4a9609d700  0
>>     mon.monitor03@2(synchronizing).data_health(0) update_stats avail
>>     44% total 51175 MB, used 28437 MB, avail 22737 MB
>>     2015-05-18 10:56:45.442884 7f4a9609d700  0
>>     mon.monitor03@2(synchronizing).data_health(0) update_stats avail
>>     44% total 51175 MB, used 28287 MB, avail 22887 MB
>>     2015-05-18 10:57:47.710088 7f4a9609d700  0
>>     mon.monitor03@2(synchronizing).data_health(0) update_stats avail
>>     44% total 51175 MB, used 28449 MB, avail 22725 MB
>>     2015-05-18 10:59:13.436988 7f4a9609d700  0
>>     mon.monitor03@2(synchronizing).data_health(0) update_stats avail
>>     44% total 51175 MB, used 28266 MB, avail 22908 MB
>>     2015-05-18 11:00:15.069245 7f4a9609d700  0
>>     mon.monitor03@2(synchronizing).data_health(0) update_stats avail
>>     44% total 51175 MB, used 28511 MB, avail 22663 MB
>>     2015-05-18 11:01:46.333054 7f4a9609d700  0
>>     mon.monitor03@2(synchronizing).data_health(0) update_stats avail
>>     44% total 51175 MB, used 28285 MB, avail 22889 MB
>>     2015-05-18 11:02:48.268613 7f4a9609d700  0
>>     mon.monitor03@2(synchronizing).data_health(0) update_stats avail
>>     44% total 51175 MB, used 28521 MB, avail 22653 MB
>>     2015-05-18 11:04:21.107442 7f4a9609d700  0
>>     mon.monitor03@2(synchronizing).data_health(0) update_stats avail
>>     44% total 51175 MB, used 28287 MB, avail 22887 MB
>>     2015-05-18 11:05:24.336678 7f4a9609d700  0
>>     mon.monitor03@2(synchronizing).data_health(0) update_stats avail
>>     44% total 51175 MB, used 28552 MB, avail 22622 MB
>>     2015-05-18 11:07:02.355146 7f4a9609d700  0
>>     mon.monitor03@2(synchronizing).data_health(0) update_stats avail
>>     44% total 51175 MB, used 28266 MB, avail 22908 MB
>>     2015-05-18 11:08:04.168761 7f4a9609d700  0
>>     mon.monitor03@2(synchronizing).data_health(0) update_stats avail
>>     44% total 51175 MB, used 28527 MB, avail 22647 MB
>>     2015-05-18 11:09:25.942629 7f4a9609d700  0
>>     mon.monitor03@2(synchronizing).data_health(0) update_stats avail
>>     44% total 51175 MB, used 28296 MB, avail 22878 MB
>>     2015-05-18 11:10:28.410838 7f4a9609d700  0
>>     mon.monitor03@2(synchronizing).data_health(0) update_stats avail
>>     44% total 51175 MB, used 28555 MB, avail 22619 MB
>>     2015-05-18 11:12:06.534287 7f4a9609d700  0
>>     mon.monitor03@2(synchronizing).data_health(0) update_stats avail
>>     44% total 51175 MB, used 28284 MB, avail 22890 MB
>>     2015-05-18 11:13:09.433899 7f4a9609d700  0
>>     mon.monitor03@2(synchronizing).data_health(0) update_stats avail
>>     44% total 51175 MB, used 28337 MB, avail 22837 MB
>>     2015-05-18 11:14:09.485415 7f4a9609d700  0
>>     mon.monitor03@2(synchronizing).data_health(0) update_stats avail
>>     44% total 51175 MB, used 28297 MB, avail 22877 MB
>>     2015-05-18 11:15:13.061472 7f4a9609d700  0
>>     mon.monitor03@2(synchronizing).data_health(0) update_stats avail
>>     44% total 51175 MB, used 28520 MB, avail 22654 MB
>>     2015-05-18 11:16:47.296862 7f4a9609d700  0
>>     mon.monitor03@2(synchronizing).data_health(0) update_stats avail
>>     44% total 51175 MB, used 28288 MB, avail 22886 MB
>>     2015-05-18 11:17:48.454379 7f4a9609d700  0
>>     mon.monitor03@2(synchronizing).data_health(0) update_stats avail
>>     44% total 51175 MB, used 28480 MB, avail 22694 MB
>>     2015-05-18 11:19:24.178109 7f4a9609d700  0
>>     mon.monitor03@2(synchronizing).data_health(0) update_stats avail
>>     44% total 51175 MB, used 28288 MB, avail 22886 MB
>>     2015-05-18 11:20:25.370240 7f4a9609d700  0
>>     mon.monitor03@2(synchronizing).data_health(0) update_stats avail
>>     44% total 51175 MB, used 28536 MB, avail 22638 MB
>>     2015-05-18 11:21:25.424047 7f4a9609d700  0
>>     mon.monitor03@2(synchronizing).data_health(0) update_stats avail
>>     43% total 51175 MB, used 28712 MB, avail 22462 MB
>>     2015-05-18 11:22:27.585467 7f4a9609d700  0
>>     mon.monitor03@2(synchronizing).data_health(0) update_stats avail
>>     43% total 51175 MB, used 28978 MB, avail 22196 MB
>>     2015-05-18 11:23:27.666819 7f4a9609d700  0
>>     mon.monitor03@2(synchronizing).data_health(0) update_stats avail
>>     42% total 51175 MB, used 29197 MB, avail 21977 MB
>>     2015-05-18 11:24:30.853669 7f4a9609d700  0
>>     mon.monitor03@2(synchronizing).data_health(0) update_stats avail
>>     42% total 51175 MB, used 29429 MB, avail 21745 MB
>>     2015-05-18 11:25:32.661380 7f4a9609d700  0
>>     mon.monitor03@2(synchronizing).data_health(0) update_stats avail
>>     42% total 51175 MB, used 29672 MB, avail 21502 MB
>>     2015-05-18 11:26:32.979662 7f4a9609d700  0
>>     mon.monitor03@2(synchronizing).data_health(0) update_stats avail
>>     41% total 51175 MB, used 29716 MB, avail 21458 MB
>>     2015-05-18 11:27:33.285880 7f4a9609d700  0
>>     mon.monitor03@2(synchronizing).data_health(0) update_stats avail
>>     41% total 51175 MB, used 29871 MB, avail 21303 MB
>>     2015-05-18 11:28:36.325777 7f4a9609d700  0
>>     mon.monitor03@2(synchronizing).data_health(0) update_stats avail
>>     41% total 51175 MB, used 30034 MB, avail 21140 MB
>>     2015-05-18 11:29:36.385018 7f4a9609d700  0
>>     mon.monitor03@2(synchronizing).data_health(0) update_stats avail
>>     40% total 51175 MB, used 30203 MB, avail 20971 MB
>>     2015-05-18 11:30:42.881925 7f4a9609d700  0
>>     mon.monitor03@2(synchronizing).data_health(0) update_stats avail
>>     40% total 51175 MB, used 30205 MB, avail 20969 MB
>>     2015-05-18 11:31:44.213153 7f4a9609d700  0
>>     mon.monitor03@2(synchronizing).data_health(0) update_stats avail
>>     40% total 51175 MB, used 30457 MB, avail 20717 MB
>>     2015-05-18 11:33:05.049889 7f4a9609d700  0
>>     mon.monitor03@2(synchronizing).data_health(0) update_stats avail
>>     40% total 51175 MB, used 30301 MB, avail 20873 MB
>>     2015-05-18 11:34:06.791687 7f4a9609d700  0
>>     mon.monitor03@2(synchronizing).data_health(0) update_stats avail
>>     40% total 51175 MB, used 30514 MB, avail 20660 MB
>>     2015-05-18 11:35:35.469873 7f4a9609d700  0
>>     mon.monitor03@2(synchronizing).data_health(0) update_stats avail
>>     40% total 51175 MB, used 30293 MB, avail 20881 MB
>>     2015-05-18 11:36:37.022678 7f4a9609d700  0
>>     mon.monitor03@2(synchronizing).data_health(0) update_stats avail
>>     40% total 51175 MB, used 30534 MB, avail 20640 MB
>>     2015-05-18 11:38:03.430840 7f4a9609d700  0
>>     mon.monitor03@2(synchronizing).data_health(0) update_stats avail
>>     40% total 51175 MB, used 30293 MB, avail 20881 MB
>>     2015-05-18 11:39:04.691948 7f4a9609d700  0
>>     mon.monitor03@2(synchronizing).data_health(0) update_stats avail
>>     40% total 51175 MB, used 30503 MB, avail 20671 MB
>>     2015-05-18 11:40:28.083978 7f4a9609d700  0
>>     mon.monitor03@2(synchronizing).data_health(0) update_stats avail
>>     40% total 51175 MB, used 30293 MB, avail 20881 MB
>>
>>     And my Ceph dashboard is showing me these warnings:
>>
>>     Cluster Status: HEALTH_WARN
>>   # noscrub,nodeep-scrub flag(s) set
>>   # 1 mons down, quorum 0,1 monitor01,monitor02
>>   # mon.monitor01 store is getting too big! 17355 MB >= 15360 MB
>>   # mon.monitor02 store is getting too big! 23785 MB >= 15360 MB
>>     It tells me "1 mons down" because the new Ceph monitor I added never
>>     comes up in there.  Any help would be appreciated.
>>
>>     Thanks in advance
>>
>>     _______________________________________________
>>     ceph-users mailing list
>>     ceph-users@xxxxxxxxxxxxxx <mailto:ceph-users@xxxxxxxxxxxxxx>
>>     http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>>
>>
>>
>> -- 
>> Thanks & Regards   
>> K.Mohamed Pakkeer
>> Mobile- 0091-8754410114
>>
> 
> 
> 
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




