Hi Greg,

well, what would be the next step to solve that? For some reason the
cache tier is running flat out: cache io 500 MB/s evict, 627 op/s
promote, 3 PG(s) evicting, even though nothing has changed and there is
essentially no client activity. Where is this pressure coming from?

-- 
Mit freundlichen Gruessen / Best regards

Oliver Dzombic
IP-Interactive

mailto:info@xxxxxxxxxxxxxxxxx

Anschrift:

IP Interactive UG ( haftungsbeschraenkt )
Zum Sonnenberg 1-3
63571 Gelnhausen

HRB 93402 beim Amtsgericht Hanau
Geschäftsführung: Oliver Dzombic

Steuer Nr.: 35 236 3622 1
UST ID: DE274086107


Am 26.07.2016 um 04:56 schrieb Gregory Farnum:
> Yep, that seems more likely than anything else — there are no other
> running external ops to hold up a read lock, and if restarting the MDS
> isn't fixing it, then it's permanent state. So, RADOS.
>
> On Mon, Jul 25, 2016 at 7:53 PM, Oliver Dzombic <info@xxxxxxxxxxxxxxxxx> wrote:
>> Hi Greg,
>>
>> I can see that sometimes it's showing an evict (full):
>>
>>     cluster a8171427-141c-4766-9e0f-533d86dd4ef8
>>      health HEALTH_WARN
>>             noscrub,nodeep-scrub,sortbitwise flag(s) set
>>      monmap e1: 3 mons at
>> {cephmon1=10.0.0.11:6789/0,cephmon2=10.0.0.12:6789/0,cephmon3=10.0.0.13:6789/0}
>>             election epoch 126, quorum 0,1,2 cephmon1,cephmon2,cephmon3
>>       fsmap e92: 1/1/1 up {0=cephmon1=up:active}, 1 up:standby
>>      osdmap e2168: 24 osds: 24 up, 24 in
>>             flags noscrub,nodeep-scrub,sortbitwise
>>       pgmap v3235879: 2240 pgs, 4 pools, 13308 GB data, 4615 kobjects
>>             26646 GB used, 27279 GB / 53926 GB avail
>>                 2238 active+clean
>>                    2 active+clean+scrubbing+deep
>>   client io 5413 kB/s rd, 384 kB/s wr, 233 op/s rd, 1547 op/s wr
>>   cache io 498 MB/s evict, 563 op/s promote, 4 PG(s) evicting
>>
>>     cluster a8171427-141c-4766-9e0f-533d86dd4ef8
>>      health HEALTH_WARN
>>             noscrub,nodeep-scrub,sortbitwise flag(s) set
>>      monmap e1: 3 mons at
>> {cephmon1=10.0.0.11:6789/0,cephmon2=10.0.0.12:6789/0,cephmon3=10.0.0.13:6789/0}
>>             election epoch 126, quorum 0,1,2 cephmon1,cephmon2,cephmon3
>>       fsmap e92: 1/1/1 up {0=cephmon1=up:active}, 1 up:standby
>>      osdmap e2168: 24 osds: 24 up, 24 in
>>             flags noscrub,nodeep-scrub,sortbitwise
>>       pgmap v3235917: 2240 pgs, 4 pools, 13309 GB data, 4601 kobjects
>>             26649 GB used, 27277 GB / 53926 GB avail
>>                 2239 active+clean
>>                    1 active+clean+scrubbing+deep
>>   client io 1247 kB/s rd, 439 kB/s wr, 213 op/s rd, 789 op/s wr
>>   cache io 253 MB/s evict, 350 op/s promote, 1 PG(s) evicting
>>
>>     cluster a8171427-141c-4766-9e0f-533d86dd4ef8
>>      health HEALTH_WARN
>>             noscrub,nodeep-scrub,sortbitwise flag(s) set
>>      monmap e1: 3 mons at
>> {cephmon1=10.0.0.11:6789/0,cephmon2=10.0.0.12:6789/0,cephmon3=10.0.0.13:6789/0}
>>             election epoch 126, quorum 0,1,2 cephmon1,cephmon2,cephmon3
>>       fsmap e92: 1/1/1 up {0=cephmon1=up:active}, 1 up:standby
>>      osdmap e2168: 24 osds: 24 up, 24 in
>>             flags noscrub,nodeep-scrub,sortbitwise
>>       pgmap v3235946: 2240 pgs, 4 pools, 13310 GB data, 4589 kobjects
>>             26650 GB used, 27275 GB / 53926 GB avail
>>                 2239 active+clean
>>                    1 active+clean+scrubbing+deep
>>   client io 0 B/s rd, 490 kB/s wr, 203 op/s rd, 1185 op/s wr
>>   cache io 343 MB/s evict, 408 op/s promote, 1 PG(s) evicting, 1 PG(s)
>> evicting (full)
>>
>> ceph osd df
>> ID WEIGHT  REWEIGHT SIZE   USE    AVAIL  %USE  VAR  PGS
>>  4 3.63699  1.00000  3724G  1760G  1964G 47.26 0.96 148
>>  5 3.63699  1.00000  3724G  1830G  1894G 49.14 0.99 158
>>  6 3.63699  1.00000  3724G  2056G  1667G 55.23 1.12 182
>>  7 3.63699  1.00000  3724G  1856G  1867G 49.86 1.01 163
>> 20 2.79199  1.00000  2793G  1134G  1659G 40.60 0.82  98
>> 21 2.79199  1.00000  2793G   990G  1803G 35.45 0.72  89
>> 22 2.79199  1.00000  2793G  1597G  1195G 57.20 1.16 134
>> 23 2.79199  1.00000  2793G  1337G  1455G 47.87 0.97 116
>> 12 3.63699  1.00000  3724G  1819G  1904G 48.86 0.99 154
>> 13 3.63699  1.00000  3724G  1681G  2042G 45.16 0.91 144
>> 14 3.63699  1.00000  3724G  1892G  1832G 50.80 1.03 165
>> 15 3.63699  1.00000  3724G  1494G  2229G 40.14 0.81 132
>> 16 2.79199  1.00000  2793G  1375G  1418G 49.23 1.00 121
>> 17 2.79199  1.00000  2793G  1444G  1348G 51.71 1.05 127
>> 18 2.79199  1.00000  2793G  1509G  1283G 54.04 1.09 129
>> 19 2.79199  1.00000  2793G  1345G  1447G 48.19 0.97 116
>>  0 0.21799  1.00000   223G   180G 44268M 80.65 1.63 269
>>  1 0.21799  1.00000   223G   201G 22758M 90.05 1.82 303
>>  2 0.21799  1.00000   223G   182G 42246M 81.54 1.65 284
>>  3 0.21799  1.00000   223G   200G 23599M 89.69 1.81 296
>>  8 0.21799  1.00000   223G   177G 46963M 79.48 1.61 272
>>  9 0.21799  1.00000   223G   203G 20730M 90.94 1.84 307
>> 10 0.21799  1.00000   223G   190G 34104M 85.10 1.72 288
>> 11 0.21799  1.00000   223G   193G 31155M 86.38 1.75 285
>>              TOTAL  53926G 26654G 27272G 49.43
>> MIN/MAX VAR: 0.72/1.84  STDDEV: 21.46
>>
>> Am 26.07.2016 um 04:47 schrieb Gregory Farnum:
>>> On Mon, Jul 25, 2016 at 7:38 PM, Oliver Dzombic <info@xxxxxxxxxxxxxxxxx> wrote:
>>>> Hi,
>>>>
>>>> currently some productive stuff is down, because it can not be accessed
>>>> through cephfs.
>>>>
>>>> Client server restart, did not help.
>>>> Cluster restart, did not help.
>>>>
>>>> Only ONE directory inside cephfs has this issue.
>>>>
>>>> All other directories are working fine.
>>>
>>> What's the full output of "ceph -s"?
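[A note on the `ceph osd df` output above: the eight small 223G OSDs, which form the cache tier, sit between 79% and 91% used while the large OSDs are near 50%. That is consistent with the constant eviction traffic: once the tier crosses its full/dirty targets, the tiering agent flushes and evicts continuously. A minimal sketch of that check, with the %USE values copied from the output and an assumed 0.80 ratio mirroring the default `cache_target_full_ratio` (the actual trigger is the cache pool's own settings such as `target_max_bytes` and `cache_target_full_ratio`, not raw OSD fullness):]

```python
# %USE of the eight small (223G) cache-tier OSDs, copied from the
# `ceph osd df` output above.
cache_osd_use = {0: 80.65, 1: 90.05, 2: 81.54, 3: 89.69,
                 8: 79.48, 9: 90.94, 10: 85.10, 11: 86.38}

# Assumption: 0.80 matches the default cache_target_full_ratio; the
# value that actually matters is the per-pool setting, queried with
#   ceph osd pool get <cachepool> cache_target_full_ratio
assumed_full_ratio = 0.80

over = {osd: use for osd, use in cache_osd_use.items()
        if use / 100.0 >= assumed_full_ratio}

print(f"{len(over)} of {len(cache_osd_use)} cache OSDs at/above "
      f"{assumed_full_ratio:.0%} used: {sorted(over)}")
```

[Checking the cache pool's `target_max_bytes`, `cache_target_dirty_ratio`, and `cache_target_full_ratio` with `ceph osd pool get` would be the natural next step for explaining the eviction pressure.]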
>>>
>>>> MDS server: Kernel 4.5.4
>>>> Client server: Kernel 4.5.4
>>>> ceph version 10.2.2
>>>>
>>>> # ceph fs dump
>>>> dumped fsmap epoch 92
>>>> e92
>>>> enable_multiple, ever_enabled_multiple: 0,0
>>>> compat: compat={},rocompat={},incompat={1=base v0.20,2=client writeable
>>>> ranges,3=default file layouts on dirs,4=dir inode in separate
>>>> object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=file
>>>> layout v2}
>>>>
>>>> Filesystem 'ceph-gen2' (2)
>>>> fs_name ceph-gen2
>>>> epoch   92
>>>> flags   0
>>>> created 2016-06-11 21:53:02.142649
>>>> modified        2016-06-14 11:09:16.783356
>>>> tableserver     0
>>>> root    0
>>>> session_timeout 60
>>>> session_autoclose       300
>>>> max_file_size   1099511627776
>>>> last_failure    0
>>>> last_failure_osd_epoch  2164
>>>> compat  compat={},rocompat={},incompat={1=base v0.20,2=client writeable
>>>> ranges,3=default file layouts on dirs,4=dir inode in separate
>>>> object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=file
>>>> layout v2}
>>>> max_mds 1
>>>> in      0
>>>> up      {0=234109}
>>>> failed
>>>> damaged
>>>> stopped
>>>> data_pools      4
>>>> metadata_pool   5
>>>> inline_data     disabled
>>>> 234109: 10.0.0.11:6801/22255 'cephmon1' mds.0.89 up:active seq 250
>>>>
>>>> Standby daemons:
>>>>
>>>> 204171: 10.0.0.13:6800/19434 'cephmon3' mds.-1.0 up:standby seq 1
>>>>
>>>> ceph --admin-daemon ceph-mds.cephmon1.asok dump_ops_in_flight
>>>> {
>>>>     "ops": [
>>>>         {
>>>>             "description": "client_request(client.204153:432 getattr
>>>> pAsLsXsFs #10000001432 2016-07-25 21:57:30.697894 RETRY=2)",
>>>>             "initiated_at": "2016-07-26 04:24:05.528832",
>>>>             "age": 816.092461,
>>>>             "duration": 816.092528,
>>>>             "type_data": [
>>>>                 "failed to rdlock, waiting",
>>>>                 "client.204153:432",
>>>>                 "client_request",
>>>>                 {
>>>>                     "client": "client.204153",
>>>>                     "tid": 432
>>>>                 },
>>>>                 [
>>>>                     {
>>>>                         "time": "2016-07-26 04:24:05.528832",
>>>>                         "event": "initiated"
>>>>                     },
>>>>                     {
>>>>                         "time": "2016-07-26 04:24:07.613779",
>>>>                         "event": "failed to rdlock, waiting"
>>>>                     }
>>>>                 ]
>>>>             ]
>>>>         }
>>>>     ],
>>>>     "num_ops": 1
>>>> }
>>>>
>>>> 2016-07-26 04:32:09.355503 7ffb331ca700  0 log_channel(cluster) log
>>>> [WRN] : 1 slow requests, 1 included below; oldest blocked for >
>>>> 483.826590 secs
>>>> 2016-07-26 04:32:09.355531 7ffb331ca700  0 log_channel(cluster) log
>>>> [WRN] : slow request 483.826590 seconds old, received at 2016-07-26
>>>> 04:24:05.528832: client_request(client.204153:432 getattr pAsLsXsFs
>>>> #10000001432 2016-07-25 21:57:30.697894 RETRY=2) currently failed to
>>>> rdlock, waiting
>>>>
>>>> Any idea? :(
>>>>
>>>> _______________________________________________
>>>> ceph-users mailing list
>>>> ceph-users@xxxxxxxxxxxxxx
>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com