Re: metadata server rejoin time

Hi Gregory, Matteo, list users

I have the exact same problem.

While the MDS is in the "rejoin" state, its memory usage grows
slowly. It can take up to 20 minutes to reach the "active" state.

Is there something I can do to help?
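For reference, a minimal way to watch the state transition and the cache
growth, assuming the admin socket is available on the MDS host and using
mds.a as a placeholder daemon name, would be something like:

        # cluster-wide view of MDS state changes during the failover
        ceph -w
        ceph mds stat

        # on the MDS host: MDS counters (inodes, caps, etc.) from the admin socket
        ceph daemon mds.a perf dump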

-- 
Thomas Lemarchand
Cloud Solutions SAS - Information Systems Manager


On Tue, 2015-07-07 at 14:56 +0100, Gregory Farnum wrote:
> On Thu, Jul 2, 2015 at 11:38 AM, Matteo Dacrema <mdacrema@xxxxxxxx> 
> wrote:
> > Hi all,
> > 
> > I'm using CephFS on Hammer with 1.5 million files, 2 metadata
> > servers in an active/standby configuration with 8 GB of RAM, 20
> > clients with 2 GB of RAM each, and 2 OSD nodes with 4 x 80 GB OSDs
> > and 4 GB of RAM.
> > I've noticed that if I kill the active metadata server, the second
> > one takes about 10 to 30 minutes to switch from the rejoin state to
> > the active state. While it is in the rejoin state, I can see Ceph
> > allocating RAM on that server.
> 
> Do you have example "ceph -s" or "ceph -w" output? There are a few
> things that could be making it take a while to finish restarting, but
> I don't think it should be stuck in the rejoin state.
> 
> Also, how are your clients mounting CephFS?
> 
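For what it's worth, a quick generic check on a client node to tell the
kernel client from ceph-fuse (not specific to this setup):

        # the kernel client shows up with filesystem type "ceph",
        # ceph-fuse with type "fuse.ceph-fuse"
        mount | grep ceph
        ps aux | grep [c]eph-fuse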
> > 
> > 
> > Here is my configuration:
> > 
> > [global]
> >         fsid = 2de7b17f-0a3e-4109-b878-c035dd2f7735
> >         mon_initial_members = cephmds01
> >         mon_host = 10.29.81.161
> >         auth_cluster_required = cephx
> >         auth_service_required = cephx
> >         auth_client_required = cephx
> >         public network = 10.29.81.0/24
> >         tcp nodelay = true
> >         tcp rcvbuf = 0
> >         ms tcp read timeout = 600
> > 
> >         #Capacity
> >         mon osd full ratio = .95
> >         mon osd nearfull ratio = .85
> > 
> > 
> > [osd]
> >         osd journal size = 1024
> >         journal dio = true
> >         journal aio = true
> > 
> >         osd op threads = 2
> >         osd op thread timeout = 60
> >         osd disk threads = 2
> >         osd recovery threads = 1
> >         osd recovery max active = 1
> >         osd max backfills = 2
> > 
> > 
> >         # Pool
> >         osd pool default size = 2
> > 
> >         #XFS
> >         osd mkfs type = xfs
> >         osd mkfs options xfs = "-f -i size=2048"
> >         osd mount options xfs = "rw,noatime,inode64,logbsize=256k,delaylog"
> > 
> >         #FileStore Settings
> >         filestore xattr use omap = false
> >         filestore max inline xattr size = 512
> >         filestore max sync interval = 10
> >         filestore merge threshold = 40
> >         filestore split multiple = 8
> >         filestore flusher = false
> >         filestore queue max ops = 2000
> >         filestore queue max bytes = 536870912
> >         filestore queue committing max ops = 500
> >         filestore queue committing max bytes = 268435456
> >         filestore op threads = 2
> > 
> > [mds]
> >         max mds = 1
> >         mds cache size = 250000
> >         client cache size = 1024
> 
> This particular value is only interpreted by userspace clients; it
> doesn't do anything inside of the [mds] section.
> 
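If the intent was to tune the userspace clients, the usual place for that
option would be a [client] section in the clients' own ceph.conf, along
these lines (16384 is just the upstream default, not a recommendation):

        [client]
                # read by ceph-fuse / libcephfs only; the kernel client ignores it
                client cache size = 16384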
> >         mds dir commit ratio = 0.5
> > 
> > Best regards,
> > Matteo
> > 


_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



