Hi,

I can reproduce it. When running the MDS in the foreground with
"ceph-mds -i <id> -f -d --setuser ceph --setgroup ceph", the respawn
fails with this error:

global_init: error reading config file.

I found the problem: the ceph.conf file was not readable by the ceph
user. This is related to the way Proxmox handles config files. I will
follow up with them.

Thank you

On 03/09/2016 02:11 PM, John Spray wrote:
> On Wed, Mar 9, 2016 at 11:37 AM, Florent B <florent@xxxxxxxxxxx> wrote:
>> Hi John and thank you for your explanations :)
>>
>> It could be a network issue.
>>
>> MDS should respawn, but the "ceph-mds" process was no longer running
>> after the last log message, so I deduced it crashed...
> Hmm, that's worth investigating. You can induce the MDS to respawn
> itself by simply doing "ceph mds fail <id>", or "ceph tell mds.<id>
> respawn"
>
> Can you play around and see if it's consistently failing to respawn,
> and if you can see any extra evidence, maybe try running the MDS in
> the foreground to make it easier to see any output ("ceph-mds -i <id>
> -f -d")
>
> John
>
>> On 03/09/2016 12:26 PM, John Spray wrote:
>>> The MDS restarted because it received an MDSMap from the monitors in
>>> which its own entry had been removed.
>>>
>>> This is usually a sign that the MDS was failing to communicate with
>>> the mons for some period of time, and as a result the mons have given
>>> up and caused another MDS to take over. However, in this instance we
>>> can see the mds and mon exchanging beacons regularly.
>>>
>>> The last acknowledged beacon was at 2016-03-09 04:53:38.824983
>>>
>>> The updated mdsmap came at 04:53:56. 18 seconds shouldn't have been
>>> long enough for anything to time out, unless you've changed the
>>> defaults.
>>>
>>> I notice that the new MDSMap (epoch 573) also indicates that peer MDS
>>> daemons have been failed, and that shortly before receiving the new
>>> map, there are a bunch of log messages indicating various client
>>> connections resetting.
>>>
>>> So from this log I would guess some kind of network issue?
>>>
>>> You say that the MDS crashed, why? From the log it looks like it's
>>> respawning itself, which shouldn't immediately be noticeable, you
>>> should just see another MDS daemon take over, and a few seconds later
>>> this guy would come back as a standby.
>>>
>>> John
>>>
>>> On Wed, Mar 9, 2016 at 9:55 AM, Florent B <florent@xxxxxxxxxxx> wrote:
>>>> Hi everyone,
>>>>
>>>> Last night one of my MDS daemons crashed.
>>>>
>>>> It was running the latest packaged Infernalis version for Jessie.
>>>>
>>>> Here is the log from the last few minutes: http://paste.ubuntu.com/15333772/
>>>>
>>>> Does anyone have an idea of what caused the crash?
>>>>
>>>> Thank you.
>>>>
>>>> Florent
>>>> _______________________________________________
>>>> ceph-users mailing list
>>>> ceph-users@xxxxxxxxxxxxxx
>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
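
PS: For anyone who hits the same "error reading config file" on respawn, here is a rough Python sketch of the check that found the problem in this thread, namely whether the permission bits on ceph.conf let the ceph user read it. The path /etc/ceph/ceph.conf, the user name "ceph" and the helper names are assumptions for illustration. The check only looks at owner/group/other mode bits, so ACLs and parent directory permissions are not covered.

```python
import grp
import os
import pwd
import stat
from typing import Iterable


def mode_allows_read(st: os.stat_result, uid: int, gids: Iterable[int]) -> bool:
    """Apply the classic Unix owner/group/other rule to a stat result."""
    if st.st_uid == uid:
        # Owner bits take precedence, even if group/other would allow it.
        return bool(st.st_mode & stat.S_IRUSR)
    if st.st_gid in set(gids):
        return bool(st.st_mode & stat.S_IRGRP)
    return bool(st.st_mode & stat.S_IROTH)


def readable_by(path: str, username: str) -> bool:
    """Return True if the mode bits on `path` let `username` read it."""
    user = pwd.getpwnam(username)  # KeyError if the user does not exist
    # Primary group plus any supplementary groups that list the user.
    gids = {user.pw_gid}
    gids.update(g.gr_gid for g in grp.getgrall() if username in g.gr_mem)
    return mode_allows_read(os.stat(path), user.pw_uid, gids)


if __name__ == "__main__":
    # Assumed path and user name; adjust for your deployment.
    conf, user = "/etc/ceph/ceph.conf", "ceph"
    try:
        ok = readable_by(conf, user)
    except (KeyError, OSError) as exc:
        print(f"cannot check: {exc}")
    else:
        print(f"{conf} readable by {user}: {ok}")
```

If this reports False, fixing the ownership or mode of the config file (or of whatever it links to) should be the first thing to try before looking at the network.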