Thanks for this thread. We just made the same mistake (rmfailed) on our
hammer cluster, which broke it in the same way. The addfailed patch worked
for us too.
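For anyone who hits this later, here is a minimal sketch of the failure
and the fix as described in this thread (rank 1 being the extra rank;
"addfailed" only exists once Zheng's patched ceph-mon is installed):

# ceph mds set max_mds 1   <- step one of shrinking the mds cluster
# ceph mds rmfailed 1      <- the mistake: removes the rank, with no built-in undo
# ceph mds addfailed 1     <- the fix, after building/installing the patched ceph-mon

The command that should have been run instead of rmfailed is "ceph mds
stop 1"; Zheng spells out the full procedure further down.

On Fri, Jan 15, 2016 at 6:30 AM, Mike Carlson <mike@xxxxxxxxxxxx> wrote:
> Hey ceph-users,
>
> I wanted to follow up: Zheng's patch did the trick. We re-added the
> removed mds, and it all came back. We're syncing our data off to a
> backup server.
>
> Thanks for all of the help. Ceph has a great community to work with!
>
> Mike C
>
> On Thu, Jan 14, 2016 at 4:46 PM, Yan, Zheng <zyan@xxxxxxxxxx> wrote:
>>
>> Here is a patch for v9.2.0. After installing the modified version of
>> ceph-mon, run "ceph mds addfailed 1".
>>
>> > On Jan 15, 2016, at 08:20, Mike Carlson <mike@xxxxxxxxxxxx> wrote:
>> >
>> > Okay, that sounds really good.
>> >
>> > Would it help if you had access to our cluster?
>> >
>> > On Thu, Jan 14, 2016 at 4:19 PM, Yan, Zheng <zyan@xxxxxxxxxx> wrote:
>> >
>> > > On Jan 15, 2016, at 08:16, Mike Carlson <mike@xxxxxxxxxxxx> wrote:
>> > >
>> > > Did I just lose all of my data?
>> > >
>> > > If we were able to export the journal, could we create a brand new
>> > > mds out of that and retrieve our data?
>> >
>> > No, it's easy to fix, but you need to re-compile ceph-mon from source
>> > code. I'm writing the patch.
>> >
>> > >
>> > > On Thu, Jan 14, 2016 at 4:15 PM, Yan, Zheng <zyan@xxxxxxxxxx> wrote:
>> > >
>> > > > On Jan 15, 2016, at 08:01, Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
>> > > >
>> > > > On Thu, Jan 14, 2016 at 3:46 PM, Mike Carlson <mike@xxxxxxxxxxxx> wrote:
>> > > >> Hey Zheng,
>> > > >>
>> > > >> I've been in the #ceph irc channel all day about this.
>> > > >>
>> > > >> We did that: we set max_mds back to 1, but instead of stopping
>> > > >> mds 1, we did a "ceph mds rmfailed 1". Running "ceph mds stop 1"
>> > > >> produces:
>> > > >>
>> > > >> # ceph mds stop 1
>> > > >> Error EEXIST: mds.1 not active (???)
>> > > >>
>> > > >> Our mds is in a state of resolve, and will not come back.
>> > > >>
>> > > >> We then tried to roll back the mds map to the epoch just before
>> > > >> we set max_mds to 2, but that command crashes all but one of our
>> > > >> monitors and never completes.
>> > > >>
>> > > >> We do not know what to do at this point. If there were a way to
>> > > >> get the mds back up just so we could back it up, we would be okay
>> > > >> with rebuilding. We just need the data back.
>> > > >
>> > > > It's not clear to me how much you've screwed up your monitor
>> > > > cluster. If that's still alive, you should just need to set max
>> > > > mds to 2, turn on an mds daemon, and let it resolve. Then you can
>> > > > follow the steps Zheng outlined for reducing the number of nodes
>> > > > cleanly. (That assumes that your MDS state is healthy and that
>> > > > the reason for your mounts hanging was a problem elsewhere, like
>> > > > directory fragmentation confusing NFS.)
>> > > >
>> > > > If your monitor cluster is actually in trouble (i.e., the
>> > > > crashing problem made it to disk), that's a whole other thing.
>> > > > But I suspect/hope it didn't, and you just need to shut down the
>> > > > client trying to do the setmap and then turn the monitors all
>> > > > back on. Meanwhile, please post a bug at tracker.ceph.com with
>> > > > the actual monitor commands you ran and as much of the
>> > > > backtrace/log as you can; we don't want to have commands which
>> > > > break the system!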
>> > > > ;)
>> > > > -Greg
>> > >
>> > > The problem is that he ran "ceph mds rmfailed 1" and there is no
>> > > command to undo this. I think we need a command "ceph mds addfailed
>> > > <rank>".
>> > >
>> > > Regards,
>> > > Yan, Zheng
>> > >
>> > > >
>> > > >> Mike C
>> > > >>
>> > > >> On Thu, Jan 14, 2016 at 3:33 PM, Yan, Zheng <ukernel@xxxxxxxxx> wrote:
>> > > >>>
>> > > >>> On Fri, Jan 15, 2016 at 3:28 AM, Mike Carlson <mike@xxxxxxxxxxxx> wrote:
>> > > >>>> Thank you for the reply, Zheng.
>> > > >>>>
>> > > >>>> We tried setting mds bal frag to true, but the end result was
>> > > >>>> less than desirable. All nfs and smb clients could no longer
>> > > >>>> browse the share; they would hang on any directory with more
>> > > >>>> than a few hundred files.
>> > > >>>>
>> > > >>>> We then tried to back out the active/active mds change. No
>> > > >>>> luck: stopping one of the mds's (mds 1) prevented us from
>> > > >>>> mounting the cephfs filesystem.
>> > > >>>>
>> > > >>>> So we failed and removed the secondary MDS, and now our
>> > > >>>> primary mds is stuck in a "resolve" state:
>> > > >>>>
>> > > >>>> # ceph -s
>> > > >>>>     cluster cabd1728-2eca-4e18-a581-b4885364e5a4
>> > > >>>>      health HEALTH_WARN
>> > > >>>>             clock skew detected on mon.lts-mon
>> > > >>>>             mds cluster is degraded
>> > > >>>>             Monitor clock skew detected
>> > > >>>>      monmap e1: 4 mons at
>> > > >>>> {lts-mon=10.5.68.236:6789/0,lts-osd1=10.5.68.229:6789/0,lts-osd2=10.5.68.230:6789/0,lts-osd3=10.5.68.203:6789/0}
>> > > >>>>             election epoch 1282, quorum 0,1,2,3 lts-osd3,lts-osd1,lts-osd2,lts-mon
>> > > >>>>      mdsmap e7892: 1/2/1 up {0=lts-mon=up:resolve}
>> > > >>>>      osdmap e10183: 102 osds: 101 up, 101 in
>> > > >>>>       pgmap v6714309: 4192 pgs, 7 pools, 31748 GB data, 23494 kobjects
>> > > >>>>             96188 GB used, 273 TB / 367 TB avail
>> > > >>>>                 4188 active+clean
>> > > >>>>                    4 active+clean+scrubbing+deep
>> > > >>>>
>> > > >>>> Now we are really down for the count. We cannot get our MDS
>> > > >>>> back up into an active state, and none of our data is
>> > > >>>> accessible.
>> > > >>>
>> > > >>> You can't remove an active mds this way. You need to:
>> > > >>>
>> > > >>> 1. make sure all active mds are running
>> > > >>> 2. run 'ceph mds set max_mds 1'
>> > > >>> 3. run 'ceph mds stop 1'
>> > > >>>
>> > > >>> Step 3 changes the second mds's state to stopping. Wait a
>> > > >>> while, and the second mds will go to the standby state.
>> > > >>> Occasionally the second MDS can get stuck in the stopping
>> > > >>> state; if that happens, restart all MDS, then repeat step 3.
>> > > >>>
>> > > >>> Regards,
>> > > >>> Yan, Zheng
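This is exactly the part we got wrong, so for the archives: the same
sequence with the status checks we used to confirm each step (the checks
are our own habit, not part of Zheng's instructions):

# ceph mds set max_mds 1
# ceph mds stop 1     <- the second rank moves to up:stopping
# ceph mds stat       <- re-run until the stopped daemon shows as standby
# ceph -s             <- mdsmap line should read 1/1/1 up once the shrink completes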
>> > > >>>>
>> > > >>>> On Wed, Jan 13, 2016 at 7:05 PM, Yan, Zheng <ukernel@xxxxxxxxx> wrote:
>> > > >>>>>
>> > > >>>>> On Thu, Jan 14, 2016 at 3:37 AM, Mike Carlson <mike@xxxxxxxxxxxx> wrote:
>> > > >>>>>> Hey Greg,
>> > > >>>>>>
>> > > >>>>>> The inconsistent view is only over nfs/smb on top of our
>> > > >>>>>> /ceph mount. When I look directly at the /ceph mount (which
>> > > >>>>>> is using the cephfs kernel module), everything looks fine.
>> > > >>>>>>
>> > > >>>>>> It is possible that this issue just went unnoticed before,
>> > > >>>>>> and its only being an infernalis problem is a red herring.
>> > > >>>>>> That said, it is oddly coincidental that we just started
>> > > >>>>>> seeing issues.
>> > > >>>>>
>> > > >>>>> This looks like a seekdir bug in the kernel client. Could you
>> > > >>>>> try a 4.0+ kernel?
>> > > >>>>>
>> > > >>>>> Also, do you have "mds bal frag" enabled for ceph-mds?
>> > > >>>>>
>> > > >>>>> Regards,
>> > > >>>>> Yan, Zheng
>> > > >>>>>
>> > > >>>>>>
>> > > >>>>>> On Wed, Jan 13, 2016 at 11:30 AM, Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
>> > > >>>>>>>
>> > > >>>>>>> On Wed, Jan 13, 2016 at 11:24 AM, Mike Carlson <mike@xxxxxxxxxxxx> wrote:
>> > > >>>>>>>> Hello.
>> > > >>>>>>>>
>> > > >>>>>>>> Since we upgraded to Infernalis, we have noticed a severe
>> > > >>>>>>>> problem with cephfs when we have it shared over Samba and
>> > > >>>>>>>> NFS.
>> > > >>>>>>>>
>> > > >>>>>>>> Directory listings are showing an inconsistent view of the
>> > > >>>>>>>> files:
>> > > >>>>>>>>
>> > > >>>>>>>> $ ls /lts-mon/BD/xmlExport/ | wc -l
>> > > >>>>>>>> 100
>> > > >>>>>>>> $ sudo umount /lts-mon
>> > > >>>>>>>> $ sudo mount /lts-mon
>> > > >>>>>>>> $ ls /lts-mon/BD/xmlExport/ | wc -l
>> > > >>>>>>>> 3507
>> > > >>>>>>>>
>> > > >>>>>>>> The only workaround I have found is un-mounting and
>> > > >>>>>>>> re-mounting the nfs share; that seems to clear it up. Same
>> > > >>>>>>>> with samba. I'd post it here, but it's thousands of lines.
>> > > >>>>>>>> I can add additional details on request.
>> > > >>>>>>>>
>> > > >>>>>>>> This happened after our upgrade to infernalis. Is it
>> > > >>>>>>>> possible the MDS is in an inconsistent state?
>> > > >>>>>>>
>> > > >>>>>>> So this didn't happen to you until after you upgraded? Are
>> > > >>>>>>> you seeing missing files when looking at cephfs directly,
>> > > >>>>>>> or only over the NFS/Samba re-exports? Are you also sharing
>> > > >>>>>>> Samba by re-exporting the kernel cephfs mount?
>> > > >>>>>>>
>> > > >>>>>>> Zheng, any ideas about kernel issues which might cause this
>> > > >>>>>>> or be more visible under infernalis?
>> > > >>>>>>> -Greg
>> > > >>>>>>>
>> > > >>>>>>>> We have cephfs mounted on a server using the built-in
>> > > >>>>>>>> cephfs kernel module:
>> > > >>>>>>>>
>> > > >>>>>>>> lts-mon:6789:/ /ceph ceph name=admin,secretfile=/etc/ceph/admin.secret,noauto,_netdev
>> > > >>>>>>>>
>> > > >>>>>>>> We are running all of our ceph nodes on ubuntu 14.04 LTS.
>> > > >>>>>>>> Samba is up to date, 4.1.6, and we export nfsv3 to linux
>> > > >>>>>>>> and freebsd systems. All seem to exhibit the same
>> > > >>>>>>>> behavior.
>> > > >>>>>>>>
>> > > >>>>>>>> system info:
>> > > >>>>>>>>
>> > > >>>>>>>> # uname -a
>> > > >>>>>>>> Linux lts-osd1 3.13.0-63-generic #103-Ubuntu SMP Fri Aug 14 21:42:59 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
>> > > >>>>>>>> root@lts-osd1:~# lsb_release -a
>> > > >>>>>>>> No LSB modules are available.
>> > > >>>>>>>> Distributor ID: Ubuntu
>> > > >>>>>>>> Description:    Ubuntu 14.04.3 LTS
>> > > >>>>>>>> Release:        14.04
>> > > >>>>>>>> Codename:       trusty
>> > > >>>>>>>>
>> > > >>>>>>>> package info:
>> > > >>>>>>>>
>> > > >>>>>>>> # dpkg -l | grep ceph
>> > > >>>>>>>> ii  ceph            9.2.0-1trusty  amd64  distributed storage and file system
>> > > >>>>>>>> ii  ceph-common     9.2.0-1trusty  amd64  common utilities to mount and interact with a ceph storage cluster
>> > > >>>>>>>> ii  ceph-fs-common  9.2.0-1trusty  amd64  common utilities to mount and interact with a ceph file system
>> > > >>>>>>>> ii  ceph-mds        9.2.0-1trusty  amd64  metadata server for the ceph distributed file system
>> > > >>>>>>>> ii  libcephfs1      9.2.0-1trusty  amd64  Ceph distributed file system client library
>> > > >>>>>>>> ii  python-ceph     9.2.0-1trusty  amd64  Meta-package for python libraries for the Ceph libraries
>> > > >>>>>>>> ii  python-cephfs   9.2.0-1trusty  amd64  Python libraries for the Ceph libcephfs library
>> > > >>>>>>>>
>> > > >>>>>>>> What is interesting is that a directory or file will not
>> > > >>>>>>>> show up in a listing; however, if we access it directly,
>> > > >>>>>>>> it shows up in that instance:
>> > > >>>>>>>>
>> > > >>>>>>>> # ls -al | grep SCHOOL
>> > > >>>>>>>> # ls -alnd SCHOOL667055
>> > > >>>>>>>> drwxrwsr-x 1 21695 21183 2962751438 Jan 13 09:33 SCHOOL667055
>> > > >>>>>>>>
>> > > >>>>>>>> Any tips are appreciated!
>> > > >>>>>>>>
>> > > >>>>>>>> Thanks,
>> > > >>>>>>>> Mike C
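For reference, the one-off equivalent of the fstab entry quoted above
(assuming the admin key really is in /etc/ceph/admin.secret, as that line
says) is roughly:

# mount -t ceph lts-mon:6789:/ /ceph -o name=admin,secretfile=/etc/ceph/admin.secret

Listings looking right on /ceph itself but wrong over nfs/smb points at
the kernel client on the re-exporting box, which is why Zheng suggests a
4.0+ kernel up-thread.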
-- Dan
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com