On Wed, Mar 14, 2018 at 3:17 AM, David C <dcsysengineer@xxxxxxxxx> wrote:
> Hi All
>
> I have a Samba server that is exporting directories from a CephFS kernel
> mount. Performance has been pretty good for the last year, but users have
> recently been complaining of short "freezes"; these seem to coincide with
> MDS-related slow requests in the monitor ceph.log, such as:
>
>> 2018-03-13 13:34:58.461030 osd.15 osd.15 10.10.10.211:6812/13367 5752 :
>> cluster [WRN] slow request 31.834418 seconds old, received at 2018-03-13
>> 13:34:26.626474: osd_repop(mds.0.5495:810644 3.3e e14085/14019
>> 3:7cea5bac:::10001a88b8f.00000000:head v 14085'846936) currently commit_sent
>> 2018-03-13 13:34:59.461270 osd.15 osd.15 10.10.10.211:6812/13367 5754 :
>> cluster [WRN] slow request 32.832059 seconds old, received at 2018-03-13
>> 13:34:26.629151: osd_repop(mds.0.5495:810671 2.dc2 e14085/14020
>> 2:43bdcc3f:::10001e91a91.00000000:head v 14085'21394) currently commit_sent
>> 2018-03-13 14:23:57.409427 osd.30 osd.30 10.10.10.212:6824/14997 5708 :
>> cluster [WRN] slow request 30.536832 seconds old, received at 2018-03-13
>> 14:23:26.872513: osd_repop(mds.0.5495:865403 2.fb6 e14085/14077
>> 2:6df955ef:::10001e93542.000000c4:head v 14085'21296) currently commit_sent
>> 2018-03-13 14:23:57.409449 osd.30 osd.30 10.10.10.212:6824/14997 5709 :
>> cluster [WRN] slow request 30.529640 seconds old, received at 2018-03-13
>> 14:23:26.879704: osd_repop(mds.0.5495:865407 2.595 e14085/14019
>> 2:a9a56101:::10001e93542.000000c8:head v 14085'20437) currently commit_sent
>> 2018-03-13 14:23:57.409453 osd.30 osd.30 10.10.10.212:6824/14997 5710 :
>> cluster [WRN] slow request 30.503138 seconds old, received at 2018-03-13
>> 14:23:26.906207: osd_repop(mds.0.5495:865423 2.ea e14085/14055
>> 2:57096bbf:::10001e93542.000000d8:head v 14085'21147) currently commit_sent
>
> --
>
> Looking in the MDS log, with debug set to 4, it's full of "setfilelockrule
> 1" and "setfilelockrule 2":
>
>> 2018-03-13 14:23:00.446905 7fde43e73700 4 mds.0.server
>> handle_client_request client_request(client.9174621:141162337
>> setfilelockrule 1, type 4, owner 14971048052668053939, pid 7, start 120,
>> length 1, wait 0 #0x10001e8dc37 2018-03-13 14:22:58.838521 caller_uid=1155,
>> caller_gid=1131{}) v2
>> 2018-03-13 14:23:00.447050 7fde43e73700 4 mds.0.server
>> handle_client_request client_request(client.9174621:141162338
>> setfilelockrule 2, type 4, owner 14971048137043556787, pid 4632, start 0,
>> length 0, wait 0 #0x10001e8dc37 2018-03-13 14:22:58.838521 caller_uid=0,
>> caller_gid=0{}) v2
>> 2018-03-13 14:23:00.447258 7fde43e73700 4 mds.0.server
>> handle_client_request client_request(client.9174621:141162339
>> setfilelockrule 2, type 4, owner 14971048137043550643, pid 4632, start 0,
>> length 0, wait 0 #0x10001e8dc37 2018-03-13 14:22:58.838521 caller_uid=0,
>> caller_gid=0{}) v2
>> 2018-03-13 14:23:00.447393 7fde43e73700 4 mds.0.server
>> handle_client_request client_request(client.9174621:141162340
>> setfilelockrule 1, type 4, owner 14971048052668053939, pid 7, start 124,
>> length 1, wait 0 #0x10001e8dc37 2018-03-13 14:22:58.838521 caller_uid=1155,
>> caller_gid=1131{}) v2
>
> --
>
> I don't have a particularly good monitoring setup on this cluster yet, but
> a cursory look at a few things such as iostat doesn't seem to suggest the
> OSDs are being hammered.
>
> Some questions:
>
> 1) Can anyone recommend a way of diagnosing this issue?
> 2) Are the multiple "setfilelockrule" per inode to be expected?
> I assume this is something to do with the Samba oplocks.
> 3) What's the recommended highest MDS debug setting before performance
> starts to be adversely affected? (I'm aware log files will get huge.)
> 4) What's the best way of matching inodes in the MDS log to the file names
> in cephfs?

If you have a running MDS, the easiest way to find the path of an inode is
to dump the MDS cache and search it for the inode number; a rough command
sketch follows below the quoted text at the end of this mail.

> Hardware/Versions:
>
> Luminous 12.1.1
> CephFS client 3.10.0-514.2.2.el7.x86_64
> Samba 4.4.4
> 4 node cluster, each node 1x Intel 3700 NVMe, 12x SATA, 40Gbps networking
>
> Thanks in advance!
>
> Cheers,
> David
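
For question 4, here is a rough sketch of the cache-dump approach (the MDS
name, the dump path and the grep pattern below are placeholders/examples;
the dump file is written locally on the host running the active MDS):

  # On the MDS host, ask the MDS to dump its cache to a file via the
  # admin socket:
  ceph daemon mds.<name> dump cache /tmp/mds_cache.txt

  # Then search the dump for an inode number taken from the MDS log,
  # e.g. 0x10001e8dc37 from the setfilelock requests quoted above:
  grep 10001e8dc37 /tmp/mds_cache.txt

As long as the inode is still in the MDS cache, the matching [inode ...]
entries should include the file's path. Be aware that the dump can get
quite large on a busy MDS.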