On Fri, Jan 15, 2016 at 3:28 AM, Mike Carlson <mike@xxxxxxxxxxxx> wrote:
> Thank you for the reply, Zheng.
>
> We tried setting mds bal frag to true, but the end result was less than desirable. All NFS and SMB clients could no longer browse the share; they would hang on any directory with more than a few hundred files.
>
> We then tried to back out the active/active MDS change, with no luck: stopping one of the MDSs (mds 1) prevented us from mounting the cephfs filesystem.
>
> So we failed and removed the secondary MDS, and now our primary MDS is stuck in a "resolve" state:
>
> # ceph -s
>     cluster cabd1728-2eca-4e18-a581-b4885364e5a4
>      health HEALTH_WARN
>             clock skew detected on mon.lts-mon
>             mds cluster is degraded
>             Monitor clock skew detected
>      monmap e1: 4 mons at {lts-mon=10.5.68.236:6789/0,lts-osd1=10.5.68.229:6789/0,lts-osd2=10.5.68.230:6789/0,lts-osd3=10.5.68.203:6789/0}
>             election epoch 1282, quorum 0,1,2,3 lts-osd3,lts-osd1,lts-osd2,lts-mon
>      mdsmap e7892: 1/2/1 up {0=lts-mon=up:resolve}
>      osdmap e10183: 102 osds: 101 up, 101 in
>       pgmap v6714309: 4192 pgs, 7 pools, 31748 GB data, 23494 kobjects
>             96188 GB used, 273 TB / 367 TB avail
>                 4188 active+clean
>                    4 active+clean+scrubbing+deep
>
> Now we are really down for the count. We cannot get our MDS back into an active state, and none of our data is accessible.

You can't remove an active MDS this way; you need to:

1. make sure all active MDSs are running
2. run 'ceph mds set max_mds 1'
3. run 'ceph mds stop 1'

Step 3 changes the second MDS's state to stopping. Wait a while and the second MDS will go to the standby state. Occasionally the second MDS can get stuck in the stopping state; if that happens, restart all MDSs, then repeat step 3.

Regards
Yan, Zheng

>
> On Wed, Jan 13, 2016 at 7:05 PM, Yan, Zheng <ukernel@xxxxxxxxx> wrote:
>>
>> On Thu, Jan 14, 2016 at 3:37 AM, Mike Carlson <mike@xxxxxxxxxxxx> wrote:
>> > Hey Greg,
>> >
>> > The inconsistent view is only over NFS/SMB on top of our /ceph mount.
>> >
>> > When I look directly at the /ceph mount (which is using the cephfs kernel module), everything looks fine.
>> >
>> > It is possible that this issue just went unnoticed, and its being an Infernalis problem is a red herring. That said, it is oddly coincidental that we just started seeing issues.
>>
>> This seems like a seekdir bug in the kernel client; could you try a 4.0+ kernel?
>>
>> Besides, did you enable "mds bal frag" for ceph-mds?
>>
>> Regards
>> Yan, Zheng
>>
>> >
>> > On Wed, Jan 13, 2016 at 11:30 AM, Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
>> >>
>> >> On Wed, Jan 13, 2016 at 11:24 AM, Mike Carlson <mike@xxxxxxxxxxxx> wrote:
>> >> > Hello.
>> >> >
>> >> > Since we upgraded to Infernalis, we have noticed a severe problem with cephfs when we share it over Samba and NFS.
>> >> >
>> >> > Directory listings are showing an inconsistent view of the files:
>> >> >
>> >> > $ ls /lts-mon/BD/xmlExport/ | wc -l
>> >> > 100
>> >> > $ sudo umount /lts-mon
>> >> > $ sudo mount /lts-mon
>> >> > $ ls /lts-mon/BD/xmlExport/ | wc -l
>> >> > 3507
>> >> >
>> >> > The only workaround I have found is unmounting and remounting the NFS share; that seems to clear it up. The same goes for Samba. I'd post it here, but it's thousands of lines; I can add additional details on request.
>> >> >
>> >> > This happened after our upgrade to Infernalis. Is it possible the MDS is in an inconsistent state?
>> >>
>> >> So this didn't happen to you until after you upgraded? Are you seeing missing files when looking at cephfs directly, or only over the NFS/Samba re-exports? Are you also sharing Samba by re-exporting the kernel cephfs mount?
>> >>
>> >> Zheng, any ideas about kernel issues which might cause this or be more visible under infernalis?
>> >> -Greg
>> >>
>> >> >
>> >> > We have cephfs mounted on a server using the built-in cephfs kernel module:
>> >> >
>> >> > lts-mon:6789:/ /ceph ceph name=admin,secretfile=/etc/ceph/admin.secret,noauto,_netdev
>> >> >
>> >> > We are running all of our Ceph nodes on Ubuntu 14.04 LTS. Samba is up to date (4.1.6), and we export NFSv3 to Linux and FreeBSD systems. All seem to exhibit the same behavior.
>> >> >
>> >> > system info:
>> >> >
>> >> > # uname -a
>> >> > Linux lts-osd1 3.13.0-63-generic #103-Ubuntu SMP Fri Aug 14 21:42:59 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
>> >> > root@lts-osd1:~# lsb_release -a
>> >> > No LSB modules are available.
>> >> > Distributor ID: Ubuntu
>> >> > Description:    Ubuntu 14.04.3 LTS
>> >> > Release:        14.04
>> >> > Codename:       trusty
>> >> >
>> >> > package info:
>> >> >
>> >> > # dpkg -l | grep ceph
>> >> > ii  ceph            9.2.0-1trusty  amd64  distributed storage and file system
>> >> > ii  ceph-common     9.2.0-1trusty  amd64  common utilities to mount and interact with a ceph storage cluster
>> >> > ii  ceph-fs-common  9.2.0-1trusty  amd64  common utilities to mount and interact with a ceph file system
>> >> > ii  ceph-mds        9.2.0-1trusty  amd64  metadata server for the ceph distributed file system
>> >> > ii  libcephfs1      9.2.0-1trusty  amd64  Ceph distributed file system client library
>> >> > ii  python-ceph     9.2.0-1trusty  amd64  Meta-package for python libraries for the Ceph libraries
>> >> > ii  python-cephfs   9.2.0-1trusty  amd64  Python libraries for the Ceph libcephfs library
>> >> >
>> >> > What is interesting is that a directory or file will not show up in a listing, yet if we access it directly, it is there:
>> >> >
>> >> > # ls -al | grep SCHOOL
>> >> > # ls -alnd SCHOOL667055
>> >> > drwxrwsr-x 1 21695 21183 2962751438 Jan 13 09:33 SCHOOL667055
>> >> >
>> >> > Any tips are appreciated!
>> >> >
>> >> > Thanks,
>> >> > Mike C
>> >> >
>> >> > _______________________________________________
>> >> > ceph-users mailing list
>> >> > ceph-users@xxxxxxxxxxxxxx
>> >> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >
>> > _______________________________________________
>> > ceph-users mailing list
>> > ceph-users@xxxxxxxxxxxxxx
>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
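
For reference, a minimal command sketch of the shrink-to-one-MDS procedure Zheng outlines above, assuming a pre-Jewel release such as Infernalis with two active MDS ranks; the "ceph mds stat" output and the daemon restart command are illustrative only, not taken from this thread:

# ceph mds set max_mds 1    # cap the filesystem at a single active rank
# ceph mds stop 1           # ask rank 1 to stop; it passes through up:stopping
# ceph mds stat             # rank 1 should eventually report as a standby
e7894: 1/1/1 up {0=lts-mon=up:active}, 1 up:standby

If rank 1 stays in up:stopping, restart the MDS daemons (on Ubuntu 14.04 that would be something like "sudo restart ceph-mds id=<name>" under upstart, with <name> being your MDS instance) and run "ceph mds stop 1" again.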