> On Jan 15, 2016, at 08:01, Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
>
> On Thu, Jan 14, 2016 at 3:46 PM, Mike Carlson <mike@xxxxxxxxxxxx> wrote:
>> Hey Zheng,
>>
>> I've been in the #ceph irc channel all day about this.
>>
>> We did that, we set max_mds back to 1, but, instead of stopping mds 1, we
>> did a "ceph mds rmfailed 1". Running ceph mds stop 1 produces:
>>
>> # ceph mds stop 1
>> Error EEXIST: mds.1 not active (???)
>>
>> Our mds is in a state of resolve, and will not come back.
>>
>> We then tried to roll back the mds map to the epoch just before we set
>> max_mds to 2, but that command crashes all but one of our monitors and never
>> completes
>>
>> We do not know what to do at this point, if there was a way to get the mds
>> back up just so we could back it up, we're okay with rebuilding. We just
>> need the data back.
>
> It's not clear to me how much you've screwed up your monitor cluster.
> If that's still alive, you should just need to set max mds to 2, turn
> on an mds daemon, and let it resolve. Then you can follow the steps
> Zheng outlined for reducing the number of nodes cleanly.
> (That assumes that your MDS state is healthy and that the reason for
> your mounts hanging was a problem elsewhere, like with directory
> fragmentation confusing NFS.)
>
> If your monitor cluster is actually in trouble (ie, the crashing
> problem made it to disk), that's a whole other thing now. But I
> suspect/hope it didn't and you just need to shut down the client
> trying to do the setmap and then turn the monitors all back on.
> Meanwhile, please post a bug at tracker.ceph.com with the actual
> monitor commands you ran and as much of the backtrace/log as you can;
> we don't want to have commands which break the system!
;)
> -Greg
>
>>
>> Mike C
>>
>> On Thu, Jan 14, 2016 at 3:33 PM, Yan, Zheng <ukernel@xxxxxxxxx> wrote:
>>>
>>> On Fri, Jan 15, 2016 at 3:28 AM, Mike Carlson <mike@xxxxxxxxxxxx> wrote:
>>>> Thank you for the reply Zheng
>>>>
>>>> We tried setting mds bal frag to true, but the end result was less than
>>>> desirable. All nfs and smb clients could no longer browse the share, they
>>>> would hang on a directory with anything more than a few hundred files.
>>>>
>>>> We then tried to back out the active/active mds change, no luck, stopping
>>>> one of the mds's (mds 1) prevented us from mounting the cephfs filesystem
>>>>
>>>> So we failed and removed the secondary MDS, and now our primary mds is
>>>> stuck in a "resolve" state:
>>>>
>>>> # ceph -s
>>>>     cluster cabd1728-2eca-4e18-a581-b4885364e5a4
>>>>      health HEALTH_WARN
>>>>             clock skew detected on mon.lts-mon
>>>>             mds cluster is degraded
>>>>             Monitor clock skew detected
>>>>      monmap e1: 4 mons at {lts-mon=10.5.68.236:6789/0,lts-osd1=10.5.68.229:6789/0,lts-osd2=10.5.68.230:6789/0,lts-osd3=10.5.68.203:6789/0}
>>>>             election epoch 1282, quorum 0,1,2,3 lts-osd3,lts-osd1,lts-osd2,lts-mon
>>>>      mdsmap e7892: 1/2/1 up {0=lts-mon=up:resolve}
>>>>      osdmap e10183: 102 osds: 101 up, 101 in
>>>>       pgmap v6714309: 4192 pgs, 7 pools, 31748 GB data, 23494 kobjects
>>>>             96188 GB used, 273 TB / 367 TB avail
>>>>                 4188 active+clean
>>>>                    4 active+clean+scrubbing+deep
>>>>
>>>> Now we are really down for the count. We cannot get our MDS back up in an
>>>> active state and none of our data is accessible.
>>>
>>> you can't remove an active mds this way, you need to:
>>>
>>> 1. make sure all active mds are running
>>> 2. run 'ceph mds set max_mds 1'
>>> 3. run 'ceph mds stop 1'
>>>
>>> step 3 changes the second mds's state to stopping. Wait a while, the
>>> second mds will go to standby state. Occasionally, the second MDS can
>>> get stuck in the stopping state. If that happens, restart all MDS, then
>>> repeat step 3.
>>>
>>> Regards
>>> Yan, Zheng
>>>
>>>>
>>>> On Wed, Jan 13, 2016 at 7:05 PM, Yan, Zheng <ukernel@xxxxxxxxx> wrote:
>>>>>
>>>>> On Thu, Jan 14, 2016 at 3:37 AM, Mike Carlson <mike@xxxxxxxxxxxx> wrote:
>>>>>> Hey Greg,
>>>>>>
>>>>>> The inconsistent view is only over nfs/smb on top of our /ceph mount.
>>>>>>
>>>>>> When I look directly on the /ceph mount (which is using the cephfs
>>>>>> kernel module), everything looks fine
>>>>>>
>>>>>> It is possible that this issue just went unnoticed, and it only being
>>>>>> an infernalis problem is just a red herring. With that, it is oddly
>>>>>> coincidental that we just started seeing issues.
>>>>>
>>>>> This looks like a seekdir bug in the kernel client, could you try a 4.0+
>>>>> kernel?
>>>>>
>>>>> Also, have you enabled "mds bal frag" for ceph-mds?
>>>>>
>>>>> Regards
>>>>> Yan, Zheng
>>>>>
>>>>>>
>>>>>> On Wed, Jan 13, 2016 at 11:30 AM, Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
>>>>>>>
>>>>>>> On Wed, Jan 13, 2016 at 11:24 AM, Mike Carlson <mike@xxxxxxxxxxxx> wrote:
>>>>>>>> Hello.
>>>>>>>>
>>>>>>>> Since we upgraded to Infernalis last, we have noticed a severe problem
>>>>>>>> with cephfs when we have it shared over Samba and NFS
>>>>>>>>
>>>>>>>> Directory listings are showing an inconsistent view of the files:
>>>>>>>>
>>>>>>>> $ ls /lts-mon/BD/xmlExport/ | wc -l
>>>>>>>> 100
>>>>>>>> $ sudo umount /lts-mon
>>>>>>>> $ sudo mount /lts-mon
>>>>>>>> $ ls /lts-mon/BD/xmlExport/ | wc -l
>>>>>>>> 3507
>>>>>>>>
>>>>>>>> The only workaround I have found is un-mounting and re-mounting the
>>>>>>>> nfs share, that seems to clear it up
>>>>>>>> Same with samba, I'd post it here but it's thousands of lines. I can
>>>>>>>> add additional details on request.
>>>>>>>>
>>>>>>>> This happened after our upgrade to infernalis.
Is it possible the MDS is in
>>>>>>>> an inconsistent state?
>>>>>>>
>>>>>>> So this didn't happen to you until after you upgraded? Are you seeing
>>>>>>> missing files when looking at cephfs directly, or only over the
>>>>>>> NFS/Samba re-exports? Are you also sharing Samba by re-exporting the
>>>>>>> kernel cephfs mount?
>>>>>>>
>>>>>>> Zheng, any ideas about kernel issues which might cause this or be more
>>>>>>> visible under infernalis?
>>>>>>> -Greg
>>>>>>>
>>>>>>>>
>>>>>>>> We have cephfs mounted on a server using the built in cephfs kernel
>>>>>>>> module:
>>>>>>>>
>>>>>>>> lts-mon:6789:/ /ceph ceph name=admin,secretfile=/etc/ceph/admin.secret,noauto,_netdev
>>>>>>>>
>>>>>>>> We are running all of our ceph nodes on ubuntu 14.04 LTS. Samba is up to
>>>>>>>> date, 4.1.6, and we export nfsv3 to linux and freebsd systems. All seem to
>>>>>>>> exhibit the same behavior.
>>>>>>>>
>>>>>>>> system info:
>>>>>>>>
>>>>>>>> # uname -a
>>>>>>>> Linux lts-osd1 3.13.0-63-generic #103-Ubuntu SMP Fri Aug 14 21:42:59 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
>>>>>>>> root@lts-osd1:~# lsb
>>>>>>>> lsblk lsb_release
>>>>>>>> root@lts-osd1:~# lsb_release -a
>>>>>>>> No LSB modules are available.
>>>>>>>> Distributor ID: Ubuntu
>>>>>>>> Description:    Ubuntu 14.04.3 LTS
>>>>>>>> Release:        14.04
>>>>>>>> Codename:       trusty
>>>>>>>>
>>>>>>>> package info:
>>>>>>>>
>>>>>>>> # dpkg -l|grep ceph
>>>>>>>> ii  ceph            9.2.0-1trusty  amd64  distributed storage and file system
>>>>>>>> ii  ceph-common     9.2.0-1trusty  amd64  common utilities to mount and interact with a ceph storage cluster
>>>>>>>> ii  ceph-fs-common  9.2.0-1trusty  amd64  common utilities to mount and interact with a ceph file system
>>>>>>>> ii  ceph-mds        9.2.0-1trusty  amd64  metadata server for the ceph distributed file system
>>>>>>>> ii  libcephfs1      9.2.0-1trusty  amd64  Ceph distributed file system client library
>>>>>>>> ii  python-ceph     9.2.0-1trusty  amd64  Meta-package for python libraries for the Ceph libraries
>>>>>>>> ii  python-cephfs   9.2.0-1trusty  amd64  Python libraries for the Ceph libcephfs library
>>>>>>>>
>>>>>>>> What is interesting, is a directory or file will not show up in a
>>>>>>>> listing, however, if we directly access the file, it shows up in that
>>>>>>>> instance:
>>>>>>>>
>>>>>>>> # ls -al |grep SCHOOL
>>>>>>>> # ls -alnd SCHOOL667055
>>>>>>>> drwxrwsr-x 1 21695 21183 2962751438 Jan 13 09:33 SCHOOL667055
>>>>>>>>
>>>>>>>> Any tips are appreciated!
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Mike C
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> ceph-users mailing list
>>>>>>>> ceph-users@xxxxxxxxxxxxxx
>>>>>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
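
A side note on Zheng's step 1 ("make sure all active mds are running"): one way to check it is to read the mdsmap summary line that `ceph -s` prints, as in the output quoted earlier in the thread. The snippet below is only a rough sketch of that check. It assumes the summary keeps the `{rank=name=state}` layout seen in that output (other Ceph releases may print it differently), and the helper names `mds_rank_states` and `ranks_not_active` are made up for illustration; they are not part of any Ceph tooling.

```python
import re


def mds_rank_states(mdsmap_line):
    """Return {rank: (daemon_name, state)} parsed from an mdsmap summary line.

    Assumption: the line carries a '{rank=name=state,...}' block, as in the
    `ceph -s` output quoted in this thread.
    """
    match = re.search(r"\{([^}]*)\}", mdsmap_line)
    if not match or not match.group(1).strip():
        return {}
    states = {}
    for entry in match.group(1).split(","):
        # Split on the first two '=' only, so the state keeps its 'up:' prefix.
        rank, name, state = entry.strip().split("=", 2)
        states[int(rank)] = (name, state)
    return states


def ranks_not_active(mdsmap_line):
    """List the ranks whose reported state is anything other than 'up:active'."""
    states = mds_rank_states(mdsmap_line)
    return sorted(rank for rank, (_, state) in states.items()
                  if state != "up:active")


# The mdsmap line from the quoted `ceph -s` output.
line = "mdsmap e7892: 1/2/1 up {0=lts-mon=up:resolve}"
print(mds_rank_states(line))
print(ranks_not_active(line))
```

Note that this only reports on ranks that appear inside the braces; a rank that is missing from the summary altogether would have to be detected separately.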