One Mon log huge and this Mon down often

onlydebian@xxxxxxxxx (debian Only) · Fri, 22 Aug 2014 21:31:36 +0700

This time ceph01-vm went down, and no huge log was produced; the other 2
are OK. I do not know what the reason is. This is not my first time
installing Ceph, but it is the first time I have seen a mon go down again
and again.

ceph.conf on every OSD and MON:
[global]
fsid = 075f1aae-48de-412e-b024-b0f014dbc8cf
mon_initial_members = ceph01-vm, ceph02-vm, ceph04-vm
mon_host = 192.168.123.251,192.168.123.252,192.168.123.250
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
filestore_xattr_use_omap = true

rgw print continue = false
rgw dns name = ceph-radosgw
osd pool default pg num = 128
osd pool default pgp num = 128

[client.radosgw.gateway]
host = ceph-radosgw
keyring = /etc/ceph/ceph.client.radosgw.keyring
rgw socket path = /var/run/ceph/ceph.radosgw.gateway.fastcgi.sock
log file = /var/log/ceph/client.radosgw.gateway.log
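
Joao's reply below suggests removing any 'debug <something> = X' lines from the monitor's ceph.conf. The file above contains none, but a quick scan can confirm that on each host. The following is only a minimal sketch: the function name and the [mon.ceph02-vm] override in the sample are hypothetical and appear nowhere in the thread, and it assumes that Ceph accepts spaces, underscores and hyphens interchangeably in option names.

```python
import re

# Matches option names such as "debug mon", "debug_ms" or "debug-paxos".
# Assumption: Ceph treats spaces, underscores and hyphens in names alike.
DEBUG_OPTION = re.compile(r"^debug[ _-]+\S", re.IGNORECASE)


def find_debug_overrides(conf_text):
    """Return (section, line) pairs for every 'debug ...' setting found."""
    section = None
    hits = []
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip()
            continue
        name = line.split("=", 1)[0].strip()
        if DEBUG_OPTION.match(name):
            hits.append((section, line))
    return hits


# Abbreviated copy of the [global] section above, plus one hypothetical
# [mon.ceph02-vm] override added purely to show what a hit looks like.
SAMPLE = """\
[global]
fsid = 075f1aae-48de-412e-b024-b0f014dbc8cf
auth_cluster_required = cephx
osd pool default pg num = 128

[mon.ceph02-vm]
debug mon = 20
debug_paxos = 20/20
"""

for section, line in find_debug_overrides(SAMPLE):
    print(f"[{section}] {line}")
```

To check a real host, read /etc/ceph/ceph.conf into a string and pass it to find_debug_overrides(); an empty result means no debug levels are set in the file, although they could still have been injected at runtime.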

2014-08-22 18:15 GMT+07:00 Joao Eduardo Luis <joao.luis at inktank.com>:

> On 08/22/2014 10:21 AM, debian Only wrote:
>
>> I have 3 mons in Ceph 0.80.5 on Wheezy, plus one RadosGW.
>>
>> When this happened the first time, I increased the size of the mon log device.
>> This time mon.ceph02-vm is down; only this mon is down, the other 2 are OK.
>>
>> Could someone please give me some guidance?
>>
>>   27M Aug 22 02:11 ceph-mon.ceph04-vm.log
>>   43G Aug 22 02:11 ceph-mon.ceph02-vm.log
>>   2G Aug 22 02:11 ceph-mon.ceph01-vm.log
>>
>
> Depending on the debug level you set, and depending on which subsystems
> you set a higher debug level, the monitor can spit out A LOT of information
> in a short period of time.  43GB is nothing compared to some 100+ GB logs
> I've had to churn through in the past.
>
> However, I'm not grasping what kind of help you need.  According to your
> 'ceph -s' below the monitors seem okay -- all are in, health is OK.
>
> If your issue is with having that one monitor spitting out humongous
> amounts of debug info, here's what you need to do:
>
> - If you added one or more 'debug <something> = X' to that monitor's
> ceph.conf, you will want to remove them so that in a future restart the
> monitor doesn't start with non-default debug levels.
>
> - You will want to inject default debug levels into that one monitor.
>
> Depending on what debug levels you have increased, you will want to run a
> version of:
>
>   ceph tell mon.ceph02-vm injectargs '--debug-mon 1/5 --debug-ms 0/5 --debug-paxos 1/5'
>
>   -Joao
>
>
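
The injectargs advice above can be scripted when several subsystems need resetting. This is a rough sketch, not an official tool: build_injectargs and DEFAULT_LEVELS are hypothetical names, and the three levels are simply the ones quoted in the reply. It only prints the command, so it can be tried on a machine without a cluster.

```python
import shlex

# Levels quoted from the advice above (log-file level / in-memory level).
# Adjust the set of subsystems to whichever ones were actually raised.
DEFAULT_LEVELS = {"mon": "1/5", "ms": "0/5", "paxos": "1/5"}


def build_injectargs(mon_id, levels):
    """Build the argv list for 'ceph tell mon.<id> injectargs ...'."""
    args = " ".join(f"--debug-{name} {value}" for name, value in levels.items())
    return ["ceph", "tell", f"mon.{mon_id}", "injectargs", args]


cmd = build_injectargs("ceph02-vm", DEFAULT_LEVELS)
# Print a copy-pasteable shell command instead of running it.
print(shlex.join(cmd))
```

On an admin node the same list could be handed to subprocess.run(cmd, check=True). Injected values last only until the monitor restarts, which is why the reply also asks for the ceph.conf entries to be removed.
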
>> # ceph -s
>>      cluster 075f1aae-48de-412e-b024-b0f014dbc8cf
>>       health HEALTH_OK
>>       monmap e2: 3 mons at {ceph01-vm=192.168.123.251:6789/0,ceph02-vm=192.168.123.252:6789/0,ceph04-vm=192.168.123.250:6789/0},
>>             election epoch 44, quorum 0,1,2 ceph04-vm,ceph01-vm,ceph02-vm
>>       mdsmap e10: 1/1/1 up {0=ceph06-vm=up:active}
>>       osdmap e145: 10 osds: 10 up, 10 in
>>        pgmap v4394: 2392 pgs, 21 pools, 4503 MB data, 1250 objects
>>              13657 MB used, 4908 GB / 4930 GB avail
>>                  2392 active+clean
>>
>>
>> 2014-08-22 02:06:34.738828 7ff2b9557700  1 mon.ceph02-vm@2(peon).paxos(paxos active c 9037..9756) is_readable now=2014-08-22 02:06:34.738830 lease_expire=2014-08-22 02:06:39.701305 has v0 lc 9756
>> 2014-08-22 02:06:36.618805 7ff2b9557700  1 mon.ceph02-vm@2(peon).paxos(paxos active c 9037..9756) is_readable now=2014-08-22 02:06:36.618807 lease_expire=2014-08-22 02:06:39.701305 has v0 lc 9756
>> 2014-08-22 02:06:36.620019 7ff2b9557700  1 mon.ceph02-vm@2(peon).paxos(paxos active c 9037..9756) is_readable now=2014-08-22 02:06:36.620021 lease_expire=2014-08-22 02:06:39.701305 has v0 lc 9756
>> 2014-08-22 02:06:36.620975 7ff2b9557700  1 mon.ceph02-vm@2(peon).paxos(paxos active c 9037..9756) is_readable now=2014-08-22 02:06:36.620977 lease_expire=2014-08-22 02:06:39.701305 has v0 lc 9756
>> 2014-08-22 02:06:36.629362 7ff2b9557700  0 mon.ceph02-vm@2(peon) e2 handle_command mon_command({"prefix": "mon_status", "format": "json"} v 0) v1
>> 2014-08-22 02:06:36.633007 7ff2b9557700  0 mon.ceph02-vm@2(peon) e2 handle_command mon_command({"prefix": "status", "format": "json"} v 0) v1
>> 2014-08-22 02:06:36.637002 7ff2b9557700  0 mon.ceph02-vm@2(peon) e2 handle_command mon_command({"prefix": "health", "detail": "", "format": "json"} v 0) v1
>> 2014-08-22 02:06:36.640971 7ff2b9557700  0 mon.ceph02-vm@2(peon) e2 handle_command mon_command({"dumpcontents": ["pgs_brief"], "prefix": "pg dump", "format": "json"} v 0) v1
>> 2014-08-22 02:06:36.641014 7ff2b9557700  1 mon.ceph02-vm@2(peon).paxos(paxos active c 9037..9756) is_readable now=2014-08-22 02:06:36.641016 lease_expire=2014-08-22 02:06:39.701305 has v0 lc 9756
>> 2014-08-22 02:06:37.520387 7ff2b9557700  1 mon.ceph02-vm@2(peon).paxos(paxos active c 9037..9757) is_readable now=2014-08-22 02:06:37.520388 lease_expire=2014-08-22 02:06:42.501572 has v0 lc 9757
>>
>>
>>
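
To see which subsystem dominates a large monitor log, the lines can be tallied by component and debug level. The sketch below makes several assumptions: each log entry sits on one line in the form "timestamp thread-id level message", the monitor name is written as mon.ceph02-vm@2, and the component is whatever name follows ")." in the message. That last rule is a guess made for this example, not a documented Ceph log format, and tally is a hypothetical helper.

```python
import re
from collections import Counter

# timestamp, thread id, debug level, then the message itself.
LOG_LINE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>[\d:.]+) "
    r"(?P<thread>[0-9a-f]+)\s+(?P<level>-?\d+) (?P<msg>.*)$"
)
# Hypothetical heuristic: a name right after ")." is taken as the
# component, e.g. "mon.ceph02-vm@2(peon).paxos(" gives "paxos".
COMPONENT = re.compile(r"\)\.(\w+)\(")


def tally(lines):
    """Count parsed log lines by (component, debug level)."""
    counts = Counter()
    for line in lines:
        m = LOG_LINE.match(line.strip())
        if not m:
            continue  # ignore anything that is not a log entry
        comp = COMPONENT.search(m.group("msg"))
        key = (comp.group(1) if comp else "mon", int(m.group("level")))
        counts[key] += 1
    return counts


# Three entries taken from the excerpt above, joined onto single lines.
SAMPLE = [
    "2014-08-22 02:06:34.738828 7ff2b9557700  1 mon.ceph02-vm@2(peon)"
    ".paxos(paxos active c 9037..9756) is_readable now=2014-08-22 "
    "02:06:34.738830 lease_expire=2014-08-22 02:06:39.701305 has v0 lc 9756",
    "2014-08-22 02:06:36.633007 7ff2b9557700  0 mon.ceph02-vm@2(peon) e2 "
    'handle_command mon_command({"prefix": "status", "format": "json"} v 0) v1',
    "2014-08-22 02:06:37.520387 7ff2b9557700  1 mon.ceph02-vm@2(peon)"
    ".paxos(paxos active c 9037..9757) is_readable now=2014-08-22 "
    "02:06:37.520388 lease_expire=2014-08-22 02:06:42.501572 has v0 lc 9757",
]

counts = tally(SAMPLE)
for (component, level), n in counts.most_common():
    print(f"{component:8s} level {level}: {n}")
```

For a real log, pass an open file object to tally(); it reads line by line, so a multi-gigabyte file such as ceph-mon.ceph02-vm.log does not have to fit in memory.
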
>>
>
> --
> Joao Eduardo Luis
> Software Engineer | http://inktank.com | http://ceph.com
>