On Thu, Jul 2, 2015 at 11:38 AM, Matteo Dacrema <mdacrema@xxxxxxxx> wrote:
> Hi all,
>
> I'm using CephFS on Hammer and I have 1.5 million files, 2 metadata
> servers in active/standby configuration with 8 GB of RAM, 20 clients
> with 2 GB of RAM each, and 2 OSD nodes with 4 x 80 GB OSDs and 4 GB of RAM.
> I've noticed that if I kill the active metadata server, the second one
> takes about 10 to 30 minutes to switch from the rejoin to the active
> state. While that server is in the rejoin state I can see ceph
> allocating RAM.

Do you have example "ceph -s" or "ceph -w" output? There are a few
things that could be making it take a while to finish restarting, but
I don't think it should be stuck in the rejoin state.

Also, how are your clients mounting CephFS?

>
> Here's my configuration:
>
> [global]
> fsid = 2de7b17f-0a3e-4109-b878-c035dd2f7735
> mon_initial_members = cephmds01
> mon_host = 10.29.81.161
> auth_cluster_required = cephx
> auth_service_required = cephx
> auth_client_required = cephx
> public network = 10.29.81.0/24
> tcp nodelay = true
> tcp rcvbuf = 0
> ms tcp read timeout = 600
>
> #Capacity
> mon osd full ratio = .95
> mon osd nearfull ratio = .85
>
>
> [osd]
> osd journal size = 1024
> journal dio = true
> journal aio = true
>
> osd op threads = 2
> osd op thread timeout = 60
> osd disk threads = 2
> osd recovery threads = 1
> osd recovery max active = 1
> osd max backfills = 2
>
>
> # Pool
> osd pool default size = 2
>
> #XFS
> osd mkfs type = xfs
> osd mkfs options xfs = "-f -i size=2048"
> osd mount options xfs = "rw,noatime,inode64,logbsize=256k,delaylog"
>
> #FileStore Settings
> filestore xattr use omap = false
> filestore max inline xattr size = 512
> filestore max sync interval = 10
> filestore merge threshold = 40
> filestore split multiple = 8
> filestore flusher = false
> filestore queue max ops = 2000
> filestore queue max bytes = 536870912
> filestore queue committing max ops = 500
> filestore queue committing max bytes = 268435456
> filestore op threads = 2
>
> [mds]
> max mds = 1
> mds cache size = 250000
> client cache size = 1024

This particular value is only interpreted by userspace clients; it
doesn't do anything inside of the [mds] section.

> mds dir commit ratio = 0.5
>
> Best regards,
> Matteo
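For gathering the output mentioned above, something like the following
should capture the MDS map transitions while you fail over the active
MDS; the mount point and secret-file path in the mount examples are
only placeholders, and the two mount commands just illustrate the two
usual client types (kernel client vs. ceph-fuse):

    ceph -s          # overall cluster status, including the mds map summary
    ceph -w          # watch state changes live while the standby goes replay -> rejoin -> active
    ceph mds stat    # compact view of the MDS map (active/standby/rejoin)

    # kernel client (placeholder mount point and secret file)
    mount -t ceph 10.29.81.161:6789:/ /mnt/cephfs -o name=admin,secretfile=/etc/ceph/admin.secret

    # userspace client
    ceph-fuse -m 10.29.81.161:6789 /mnt/cephfs

And as a rough sketch of how the cache options could be laid out so
that the client setting actually takes effect: keep "mds cache size"
under [mds] and move "client cache size" into a [client] section,
which userspace clients (ceph-fuse/libcephfs) will read:

    [mds]
    max mds = 1
    mds cache size = 250000
    mds dir commit ratio = 0.5

    [client]
    client cache size = 1024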