Yep, that seems more likely than anything else: there are no other
running external ops to hold up a read lock, and if restarting the MDS
isn't fixing it, then it's permanent state. So, RADOS.

On Mon, Jul 25, 2016 at 7:53 PM, Oliver Dzombic <info@xxxxxxxxxxxxxxxxx> wrote:
> Hi Greg,
>
> I can see that sometimes it's showing an evict (full)
>
>     cluster a8171427-141c-4766-9e0f-533d86dd4ef8
>      health HEALTH_WARN
>             noscrub,nodeep-scrub,sortbitwise flag(s) set
>      monmap e1: 3 mons at
> {cephmon1=10.0.0.11:6789/0,cephmon2=10.0.0.12:6789/0,cephmon3=10.0.0.13:6789/0}
>             election epoch 126, quorum 0,1,2 cephmon1,cephmon2,cephmon3
>       fsmap e92: 1/1/1 up {0=cephmon1=up:active}, 1 up:standby
>      osdmap e2168: 24 osds: 24 up, 24 in
>             flags noscrub,nodeep-scrub,sortbitwise
>       pgmap v3235879: 2240 pgs, 4 pools, 13308 GB data, 4615 kobjects
>             26646 GB used, 27279 GB / 53926 GB avail
>                 2238 active+clean
>                    2 active+clean+scrubbing+deep
>   client io 5413 kB/s rd, 384 kB/s wr, 233 op/s rd, 1547 op/s wr
>    cache io 498 MB/s evict, 563 op/s promote, 4 PG(s) evicting
>
>     cluster a8171427-141c-4766-9e0f-533d86dd4ef8
>      health HEALTH_WARN
>             noscrub,nodeep-scrub,sortbitwise flag(s) set
>      monmap e1: 3 mons at
> {cephmon1=10.0.0.11:6789/0,cephmon2=10.0.0.12:6789/0,cephmon3=10.0.0.13:6789/0}
>             election epoch 126, quorum 0,1,2 cephmon1,cephmon2,cephmon3
>       fsmap e92: 1/1/1 up {0=cephmon1=up:active}, 1 up:standby
>      osdmap e2168: 24 osds: 24 up, 24 in
>             flags noscrub,nodeep-scrub,sortbitwise
>       pgmap v3235917: 2240 pgs, 4 pools, 13309 GB data, 4601 kobjects
>             26649 GB used, 27277 GB / 53926 GB avail
>                 2239 active+clean
>                    1 active+clean+scrubbing+deep
>   client io 1247 kB/s rd, 439 kB/s wr, 213 op/s rd, 789 op/s wr
>    cache io 253 MB/s evict, 350 op/s promote, 1 PG(s) evicting
>
>     cluster a8171427-141c-4766-9e0f-533d86dd4ef8
>      health HEALTH_WARN
>             noscrub,nodeep-scrub,sortbitwise flag(s) set
>      monmap e1: 3 mons at
> {cephmon1=10.0.0.11:6789/0,cephmon2=10.0.0.12:6789/0,cephmon3=10.0.0.13:6789/0}
>             election epoch 126, quorum 0,1,2 cephmon1,cephmon2,cephmon3
>       fsmap e92: 1/1/1 up {0=cephmon1=up:active}, 1 up:standby
>      osdmap e2168: 24 osds: 24 up, 24 in
>             flags noscrub,nodeep-scrub,sortbitwise
>       pgmap v3235946: 2240 pgs, 4 pools, 13310 GB data, 4589 kobjects
>             26650 GB used, 27275 GB / 53926 GB avail
>                 2239 active+clean
>                    1 active+clean+scrubbing+deep
>   client io 0 B/s rd, 490 kB/s wr, 203 op/s rd, 1185 op/s wr
>    cache io 343 MB/s evict, 408 op/s promote, 1 PG(s) evicting, 1 PG(s) evicting (full)
>
> ceph osd df
> ID  WEIGHT REWEIGHT   SIZE    USE  AVAIL  %USE  VAR PGS
>  4 3.63699  1.00000  3724G  1760G  1964G 47.26 0.96 148
>  5 3.63699  1.00000  3724G  1830G  1894G 49.14 0.99 158
>  6 3.63699  1.00000  3724G  2056G  1667G 55.23 1.12 182
>  7 3.63699  1.00000  3724G  1856G  1867G 49.86 1.01 163
> 20 2.79199  1.00000  2793G  1134G  1659G 40.60 0.82  98
> 21 2.79199  1.00000  2793G   990G  1803G 35.45 0.72  89
> 22 2.79199  1.00000  2793G  1597G  1195G 57.20 1.16 134
> 23 2.79199  1.00000  2793G  1337G  1455G 47.87 0.97 116
> 12 3.63699  1.00000  3724G  1819G  1904G 48.86 0.99 154
> 13 3.63699  1.00000  3724G  1681G  2042G 45.16 0.91 144
> 14 3.63699  1.00000  3724G  1892G  1832G 50.80 1.03 165
> 15 3.63699  1.00000  3724G  1494G  2229G 40.14 0.81 132
> 16 2.79199  1.00000  2793G  1375G  1418G 49.23 1.00 121
> 17 2.79199  1.00000  2793G  1444G  1348G 51.71 1.05 127
> 18 2.79199  1.00000  2793G  1509G  1283G 54.04 1.09 129
> 19 2.79199  1.00000  2793G  1345G  1447G 48.19 0.97 116
>  0 0.21799  1.00000   223G   180G 44268M 80.65 1.63 269
>  1 0.21799  1.00000   223G   201G 22758M 90.05 1.82 303
>  2 0.21799  1.00000   223G   182G 42246M 81.54 1.65 284
>  3 0.21799  1.00000   223G   200G 23599M 89.69 1.81 296
>  8 0.21799  1.00000   223G   177G 46963M 79.48 1.61 272
>  9 0.21799  1.00000   223G   203G 20730M 90.94 1.84 307
> 10 0.21799  1.00000   223G   190G 34104M 85.10 1.72 288
> 11 0.21799  1.00000   223G   193G 31155M 86.38 1.75 285
>               TOTAL 53926G 26654G 27272G 49.43
> MIN/MAX VAR: 0.72/1.84 STDDEV: 21.46
>
>
> --
> Mit freundlichen Gruessen / Best regards
>
> Oliver Dzombic
> IP-Interactive
>
> mailto:info@xxxxxxxxxxxxxxxxx
>
> Address:
>
> IP Interactive UG ( haftungsbeschraenkt )
> Zum Sonnenberg 1-3
> 63571 Gelnhausen
>
> HRB 93402 at Amtsgericht Hanau
> Managing director: Oliver Dzombic
>
> Tax no.: 35 236 3622 1
> VAT ID: DE274086107
>
>
> On 26.07.2016 at 04:47, Gregory Farnum wrote:
>> On Mon, Jul 25, 2016 at 7:38 PM, Oliver Dzombic <info@xxxxxxxxxxxxxxxxx> wrote:
>>> Hi,
>>>
>>> currently some production stuff is down because it cannot be accessed
>>> through cephfs.
>>>
>>> Client server restart did not help.
>>> Cluster restart did not help.
>>>
>>> Only ONE directory inside cephfs has this issue.
>>>
>>> All other directories are working fine.
>>
>> What's the full output of "ceph -s"?
>>
>>>
>>>
>>> MDS Server: Kernel 4.5.4
>>> client server: Kernel 4.5.4
>>> ceph version 10.2.2
>>>
>>> # ceph fs dump
>>> dumped fsmap epoch 92
>>> e92
>>> enable_multiple, ever_enabled_multiple: 0,0
>>> compat: compat={},rocompat={},incompat={1=base v0.20,2=client writeable
>>> ranges,3=default file layouts on dirs,4=dir inode in separate
>>> object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=file
>>> layout v2}
>>>
>>> Filesystem 'ceph-gen2' (2)
>>> fs_name ceph-gen2
>>> epoch 92
>>> flags 0
>>> created 2016-06-11 21:53:02.142649
>>> modified 2016-06-14 11:09:16.783356
>>> tableserver 0
>>> root 0
>>> session_timeout 60
>>> session_autoclose 300
>>> max_file_size 1099511627776
>>> last_failure 0
>>> last_failure_osd_epoch 2164
>>> compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable
>>> ranges,3=default file layouts on dirs,4=dir inode in separate
>>> object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=file
>>> layout v2}
>>> max_mds 1
>>> in 0
>>> up {0=234109}
>>> failed
>>> damaged
>>> stopped
>>> data_pools 4
>>> metadata_pool 5
>>> inline_data disabled
>>> 234109: 10.0.0.11:6801/22255 'cephmon1' mds.0.89 up:active seq 250
>>>
>>>
>>> Standby daemons:
>>>
>>> 204171: 10.0.0.13:6800/19434 'cephmon3' mds.-1.0 up:standby seq 1
>>>
>>>
>>> ceph --admin-daemon ceph-mds.cephmon1.asok dump_ops_in_flight
>>> {
>>>     "ops": [
>>>         {
>>>             "description": "client_request(client.204153:432 getattr pAsLsXsFs #10000001432 2016-07-25 21:57:30.697894 RETRY=2)",
>>>             "initiated_at": "2016-07-26 04:24:05.528832",
>>>             "age": 816.092461,
>>>             "duration": 816.092528,
>>>             "type_data": [
>>>                 "failed to rdlock, waiting",
>>>                 "client.204153:432",
>>>                 "client_request",
>>>                 {
>>>                     "client": "client.204153",
>>>                     "tid": 432
>>>                 },
>>>                 [
>>>                     {
>>>                         "time": "2016-07-26 04:24:05.528832",
>>>                         "event": "initiated"
>>>                     },
>>>                     {
>>>                         "time": "2016-07-26 04:24:07.613779",
>>>                         "event": "failed to rdlock, waiting"
>>>                     }
>>>                 ]
>>>             ]
>>>         }
>>>     ],
>>>     "num_ops": 1
>>> }
>>>
>>>
>>> 2016-07-26 04:32:09.355503 7ffb331ca700 0 log_channel(cluster) log
>>> [WRN] : 1 slow requests, 1 included below; oldest blocked for >
>>> 483.826590 secs
>>>
>>> 2016-07-26 04:32:09.355531 7ffb331ca700 0 log_channel(cluster) log
>>> [WRN] : slow request 483.826590 seconds old, received at 2016-07-26
>>> 04:24:05.528832: client_request(client.204153:432 getattr pAsLsXsFs
>>> #10000001432 2016-07-25 21:57:30.697894 RETRY=2) currently failed to
>>> rdlock, waiting
>>>
>>>
>>> Any idea? :(
>>>
>>> _______________________________________________
>>> ceph-users mailing list
>>> ceph-users@xxxxxxxxxxxxxx
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
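
For anyone hitting the same symptom, here is a minimal sketch of how the
dump_ops_in_flight output quoted in this thread could be scanned for stuck
requests. It is only an illustration: the function name blocked_ops, the
min_age threshold and the SAMPLE constant are made up for this example, and
the JSON layout is assumed to be the one shown above for ceph 10.2.2 (other
releases may lay out "type_data" differently).

```python
import json

# Trimmed copy of the dump_ops_in_flight output quoted in this thread.
SAMPLE = """
{
  "ops": [
    {
      "description": "client_request(client.204153:432 getattr pAsLsXsFs #10000001432 2016-07-25 21:57:30.697894 RETRY=2)",
      "initiated_at": "2016-07-26 04:24:05.528832",
      "age": 816.092461,
      "duration": 816.092528,
      "type_data": [
        "failed to rdlock, waiting",
        "client.204153:432",
        "client_request",
        {"client": "client.204153", "tid": 432},
        [
          {"time": "2016-07-26 04:24:05.528832", "event": "initiated"},
          {"time": "2016-07-26 04:24:07.613779", "event": "failed to rdlock, waiting"}
        ]
      ]
    }
  ],
  "num_ops": 1
}
"""


def blocked_ops(dump, min_age=30.0):
    """Return (age, description, last_event) for ops older than min_age seconds.

    Hypothetical helper: it assumes the event history is the list-valued
    entry inside "type_data", as in the output quoted in this thread.
    """
    results = []
    for op in dump.get("ops", []):
        if op.get("age", 0.0) < min_age:
            continue
        last_event = None
        for item in op.get("type_data", []):
            # The event history is a list of {"time": ..., "event": ...} dicts.
            if isinstance(item, list) and item and isinstance(item[-1], dict):
                last_event = item[-1].get("event")
        results.append((op["age"], op["description"], last_event))
    # Oldest (most stuck) request first.
    return sorted(results, key=lambda r: r[0], reverse=True)


if __name__ == "__main__":
    for age, description, last_event in blocked_ops(json.loads(SAMPLE)):
        print(f"{age:8.1f}s  {last_event}  {description}")
```

To use it on a live MDS, replace SAMPLE with the JSON printed by the
dump_ops_in_flight command shown above. A request whose last event stays at
"failed to rdlock, waiting" across MDS restarts matches the situation
described in this thread.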