Thank you for the reply, Zheng.
We tried setting mds bal frag to true, but the end result was less than desirable: all NFS and SMB clients could no longer browse the share; they would hang on any directory with more than a few hundred files. We then tried to back out the active/active MDS change, with no luck; stopping one of the MDSs (mds 1) prevented us from mounting the cephfs filesystem.
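For reference, the changes involved were roughly the following; the exact command syntax here is from memory, so treat it as approximate:

# added to ceph.conf on the MDS hosts, followed by a ceph-mds restart
[mds]
mds bal frag = true

# backing out active/active (this is the step where clients hung)
ceph mds set max_mds 1
ceph mds stop 1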
So we failed and removed the secondary MDS, and now our primary MDS is stuck in a "resolve" state:
# ceph -s
cluster cabd1728-2eca-4e18-a581-b4885364e5a4
health HEALTH_WARN
clock skew detected on mon.lts-mon
mds cluster is degraded
Monitor clock skew detected
monmap e1: 4 mons at {lts-mon=10.5.68.236:6789/0,lts-osd1=10.5.68.229:6789/0,lts-osd2=10.5.68.230:6789/0,lts-osd3=10.5.68.203:6789/0}
election epoch 1282, quorum 0,1,2,3 lts-osd3,lts-osd1,lts-osd2,lts-mon
mdsmap e7892: 1/2/1 up {0=lts-mon=up:resolve}
osdmap e10183: 102 osds: 101 up, 101 in
pgmap v6714309: 4192 pgs, 7 pools, 31748 GB data, 23494 kobjects
96188 GB used, 273 TB / 367 TB avail
4188 active+clean
4 active+clean+scrubbing+deep
Now we are really down for the count. We cannot get our MDS back up in an active state and none of our data is accessible.
On Wed, Jan 13, 2016 at 7:05 PM, Yan, Zheng <ukernel@xxxxxxxxx> wrote:
On Thu, Jan 14, 2016 at 3:37 AM, Mike Carlson <mike@xxxxxxxxxxxx> wrote:
> Hey Greg,
>
> The inconsistent view is only over nfs/smb on top of our /ceph mount.
>
> When I look directly on the /ceph mount (which is using the cephfs kernel
> module), everything looks fine
>
> It is possible that this issue just went unnoticed before, and its being an
> Infernalis problem is a red herring. That said, it is oddly coincidental
> that we only just started seeing issues.
This seems like the seekdir bugs in the kernel client; could you try a 4.0+ kernel?
Besides, do you have "mds bal frag" enabled for ceph-mds?
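You can check this on the MDS host via the admin socket, e.g. something along the lines of

ceph daemon mds.lts-mon config get mds_bal_frag

assuming the daemon id matches the name shown in your ceph -s output.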
Regards
Yan, Zheng
>
> On Wed, Jan 13, 2016 at 11:30 AM, Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
>>
>> On Wed, Jan 13, 2016 at 11:24 AM, Mike Carlson <mike@xxxxxxxxxxxx> wrote:
>> > Hello.
>> >
>> > Since we upgraded to Infernalis, we have noticed a severe problem with
>> > cephfs when we share it over Samba and NFS.
>> >
>> > Directory listings are showing an inconsistent view of the files:
>> >
>> >
>> > $ ls /lts-mon/BD/xmlExport/ | wc -l
>> > 100
>> > $ sudo umount /lts-mon
>> > $ sudo mount /lts-mon
>> > $ ls /lts-mon/BD/xmlExport/ | wc -l
>> > 3507
>> >
>> >
>> > The only workaround I have found is un-mounting and re-mounting the NFS
>> > share; that seems to clear it up. The same goes for Samba. I'd post the
>> > output here, but it's thousands of lines; I can add additional details
>> > on request.
>> >
>> > This happened after our upgrade to Infernalis. Is it possible the MDS is
>> > in an inconsistent state?
>>
>> So this didn't happen to you until after you upgraded? Are you seeing
>> missing files when looking at cephfs directly, or only over the
>> NFS/Samba re-exports? Are you also sharing Samba by re-exporting the
>> kernel cephfs mount?
>>
>> Zheng, any ideas about kernel issues that might cause this or be more
>> visible under Infernalis?
>> -Greg
>>
>> >
>> > We have cephfs mounted on a server using the built-in cephfs kernel
>> > module:
>> >
>> > lts-mon:6789:/ /ceph ceph
>> > name=admin,secretfile=/etc/ceph/admin.secret,noauto,_netdev
>> >
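>> > (Mounted by hand, that is the equivalent of something like
>> >
>> > mount -t ceph lts-mon:6789:/ /ceph -o name=admin,secretfile=/etc/ceph/admin.secret
>> >
>> > which is just to illustrate the setup.)
>> >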
>> >
>> > We are running all of our Ceph nodes on Ubuntu 14.04 LTS. Samba is up to
>> > date (4.1.6), and we export NFSv3 to Linux and FreeBSD systems. All of
>> > them exhibit the same behavior.
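>> >
>> > For reference, the re-export setup is essentially stock; the share
>> > definitions look roughly like the following (share name and exact
>> > options are illustrative, paraphrased from memory):
>> >
>> > # /etc/exports  (exact options approximate)
>> > /ceph  *(rw,sync,no_subtree_check)
>> >
>> > # smb.conf  (share name illustrative)
>> > [lts-share]
>> >     path = /ceph
>> >     read only = no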
>> >
>> > system info:
>> >
>> > # uname -a
>> > Linux lts-osd1 3.13.0-63-generic #103-Ubuntu SMP Fri Aug 14 21:42:59 UTC
>> > 2015 x86_64 x86_64 x86_64 GNU/Linux
>> > root@lts-osd1:~# lsb_release -a
>> > No LSB modules are available.
>> > Distributor ID: Ubuntu
>> > Description: Ubuntu 14.04.3 LTS
>> > Release: 14.04
>> > Codename: trusty
>> >
>> >
>> > package info:
>> >
>> > # dpkg -l | grep ceph
>> > ii  ceph            9.2.0-1trusty  amd64  distributed storage and file system
>> > ii  ceph-common     9.2.0-1trusty  amd64  common utilities to mount and interact with a ceph storage cluster
>> > ii  ceph-fs-common  9.2.0-1trusty  amd64  common utilities to mount and interact with a ceph file system
>> > ii  ceph-mds        9.2.0-1trusty  amd64  metadata server for the ceph distributed file system
>> > ii  libcephfs1      9.2.0-1trusty  amd64  Ceph distributed file system client library
>> > ii  python-ceph     9.2.0-1trusty  amd64  Meta-package for python libraries for the Ceph libraries
>> > ii  python-cephfs   9.2.0-1trusty  amd64  Python libraries for the Ceph libcephfs library
>> >
>> >
>> > What is interesting is that a directory or file will not show up in a
>> > listing; however, if we access the file directly, it is there:
>> >
>> >
>> > # ls -al |grep SCHOOL
>> > # ls -alnd SCHOOL667055
>> > drwxrwsr-x 1 21695 21183 2962751438 Jan 13 09:33 SCHOOL667055
>> >
>> >
>> > Any tips are appreciated!
>> >
>> > Thanks,
>> > Mike C
>> >
>> >
>
>
>
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com