Re: cephfs failed to rdlock, waiting

> -----Original Message-----
> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of Oliver Dzombic
> Sent: 26 July 2016 04:30
> To: ceph-users@xxxxxxxxxxxxxx
> Subject: Re:  cephfs failed to rdlock, waiting
> 
> Hi Greg,
> 
> I switched the cache tier to forward and began to evict everything.
> 
> I restarted the MDS, and it failed over to another node.
> 
> Still the same issue...
> 
> So how can it be a PG full issue this way?

Oliver, I saw in a post a couple of weeks ago that you had raised the promotion throttles to around 1.6 GB/s from the default of 4 MB/s. Is this still the case? Once the cache is full, any further promotions will cause a similar level of evictions, which can lead to a full tier if the evictions cannot keep up.
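
If the throttles are still raised, it may be worth checking what each OSD is actually running with and winding them back down. A rough sketch, assuming the Jewel option name osd_tier_promote_max_bytes_sec and taking 4194304 bytes as the ~4 MB/s default mentioned above:

# ceph daemon osd.0 config get osd_tier_promote_max_bytes_sec
# ceph tell osd.* injectargs '--osd_tier_promote_max_bytes_sec 4194304'

The first command shows the live value on one OSD (run on that OSD's host); the second pushes the lower value to all OSDs at runtime, and it would also need setting in ceph.conf to survive a restart.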

Also, the way PG fullness is calculated is not 100% tied to the fullness of the OSD, due to the uneven way PGs are distributed; it's a rough estimation. So you can find yourself in a situation where a PG thinks it's full, but the OSD it's on still has space available. Is your target full ratio at 0.8? That is normally enough headroom to account for any inaccuracies.
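
To sanity-check what the tier is actually set to (the pool name here is just a placeholder for your cache pool):

# ceph osd pool get <cachepool> cache_target_full_ratio
# ceph osd pool get <cachepool> target_max_bytes

Bear in mind that cache_target_full_ratio is evaluated against target_max_bytes / target_max_objects rather than raw OSD capacity, so if target_max_bytes is unset or set too high, the flush/evict logic has no sensible limit to work against.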

I'm afraid I can't offer much help with the CephFS problem itself, but from what Greg said, I suspect the cache tier may be returning some sort of full error. I would get the caching under control and see if that resolves the problem.
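
If you want to drain the tier by hand while it is in forward mode, something along these lines should flush the dirty objects and evict the clean ones (again, substitute your cache pool's name; objects that are currently locked or watched may refuse to evict):

# rados -p <cachepool> cache-flush-evict-all

Then watch ceph df until the tier has emptied before considering switching back to writeback.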

> 
> 
>     cluster a8171427-141c-4766-9e0f-533d86dd4ef8
>      health HEALTH_OK
>      monmap e1: 3 mons at
> {cephmon1=10.0.0.11:6789/0,cephmon2=10.0.0.12:6789/0,cephmon3=10.0.0.13:6789/0}
>             election epoch 126, quorum 0,1,2 cephmon1,cephmon2,cephmon3
>       fsmap e98: 1/1/1 up {0=cephmon3=up:active}, 1 up:standby
>      osdmap e2173: 24 osds: 24 up, 24 in
>             flags sortbitwise
>       pgmap v3238008: 2240 pgs, 4 pools, 13228 GB data, 3899 kobjects
>             26487 GB used, 27439 GB / 53926 GB avail
>                 2233 active+clean
>                    5 active+clean+scrubbing+deep
>                    2 active+clean+scrubbing
>   client io 0 B/s rd, 7997 kB/s wr, 24 op/s rd, 70 op/s wr
>   cache io 1980 kB/s evict
> 
> # ceph osd df
> ID WEIGHT  REWEIGHT SIZE   USE    AVAIL  %USE  VAR  PGS
>  4 3.63699  1.00000  3724G  1760G  1964G 47.26 0.96 148
>  5 3.63699  1.00000  3724G  1830G  1894G 49.14 1.00 158
>  6 3.63699  1.00000  3724G  2056G  1667G 55.23 1.12 182
>  7 3.63699  1.00000  3724G  1856G  1867G 49.86 1.02 163
> 20 2.79199  1.00000  2793G  1134G  1659G 40.60 0.83  98
> 21 2.79199  1.00000  2793G   990G  1803G 35.45 0.72  89
> 22 2.79199  1.00000  2793G  1597G  1195G 57.20 1.16 134
> 23 2.79199  1.00000  2793G  1337G  1455G 47.87 0.97 116
> 12 3.63699  1.00000  3724G  1819G  1904G 48.86 0.99 154
> 13 3.63699  1.00000  3724G  1681G  2042G 45.16 0.92 144
> 14 3.63699  1.00000  3724G  1892G  1832G 50.80 1.03 165
> 15 3.63699  1.00000  3724G  1494G  2229G 40.14 0.82 132
> 16 2.79199  1.00000  2793G  1375G  1418G 49.23 1.00 121
> 17 2.79199  1.00000  2793G  1444G  1348G 51.71 1.05 127
> 18 2.79199  1.00000  2793G  1509G  1283G 54.04 1.10 129
> 19 2.79199  1.00000  2793G  1345G  1447G 48.19 0.98 116
>  0 0.21799  1.00000   223G   158G 66268M 71.04 1.45 269
>  1 0.21799  1.00000   223G   181G 43363M 81.05 1.65 303
>  2 0.21799  1.00000   223G   166G 57845M 74.72 1.52 284
>  3 0.21799  1.00000   223G   172G 52129M 77.22 1.57 296
>  8 0.21799  1.00000   223G   159G 65453M 71.40 1.45 272
>  9 0.21799  1.00000   223G   187G 37270M 83.71 1.70 307
> 10 0.21799  1.00000   223G   169G 55478M 75.75 1.54 288
> 11 0.21799  1.00000   223G   163G 61722M 73.03 1.49 285
>               TOTAL 53926G 26484G 27442G 49.11 MIN/MAX VAR: 0.72/1.70  STDDEV: 16.36
> 
> 
> 
> --
> Mit freundlichen Gruessen / Best regards
> 
> Oliver Dzombic
> IP-Interactive
> 
> mailto:info@xxxxxxxxxxxxxxxxx
> 
> Address:
> 
> IP Interactive UG ( haftungsbeschraenkt )
> Zum Sonnenberg 1-3
> 63571 Gelnhausen
> 
> HRB 93402, Hanau District Court
> Managing Director: Oliver Dzombic
> 
> Tax No.: 35 236 3622 1
> VAT ID: DE274086107
> 
> 
> On 26.07.2016 at 04:56, Gregory Farnum wrote:
> > Yep, that seems more likely than anything else — there are no other
> > running external ops to hold up a read lock, and if restarting the MDS
> > isn't fixing it, then it's permanent state. So, RADOS.
> >
> > On Mon, Jul 25, 2016 at 7:53 PM, Oliver Dzombic <info@xxxxxxxxxxxxxxxxx> wrote:
> >> Hi Greg,
> >>
> >>
> >> I can see that sometimes it's showing an evict (full)
> >>
> >>
> >>     cluster a8171427-141c-4766-9e0f-533d86dd4ef8
> >>      health HEALTH_WARN
> >>             noscrub,nodeep-scrub,sortbitwise flag(s) set
> >>      monmap e1: 3 mons at
> >> {cephmon1=10.0.0.11:6789/0,cephmon2=10.0.0.12:6789/0,cephmon3=10.0.0.13:6789/0}
> >>             election epoch 126, quorum 0,1,2 cephmon1,cephmon2,cephmon3
> >>       fsmap e92: 1/1/1 up {0=cephmon1=up:active}, 1 up:standby
> >>      osdmap e2168: 24 osds: 24 up, 24 in
> >>             flags noscrub,nodeep-scrub,sortbitwise
> >>       pgmap v3235879: 2240 pgs, 4 pools, 13308 GB data, 4615 kobjects
> >>             26646 GB used, 27279 GB / 53926 GB avail
> >>                 2238 active+clean
> >>                    2 active+clean+scrubbing+deep
> >>
> >>
> >>
> >>   client io 5413 kB/s rd, 384 kB/s wr, 233 op/s rd, 1547 op/s wr
> >>   cache io 498 MB/s evict, 563 op/s promote, 4 PG(s) evicting
> >>
> >>     cluster a8171427-141c-4766-9e0f-533d86dd4ef8
> >>      health HEALTH_WARN
> >>             noscrub,nodeep-scrub,sortbitwise flag(s) set
> >>      monmap e1: 3 mons at
> >> {cephmon1=10.0.0.11:6789/0,cephmon2=10.0.0.12:6789/0,cephmon3=10.0.0.13:6789/0}
> >>             election epoch 126, quorum 0,1,2 cephmon1,cephmon2,cephmon3
> >>       fsmap e92: 1/1/1 up {0=cephmon1=up:active}, 1 up:standby
> >>      osdmap e2168: 24 osds: 24 up, 24 in
> >>             flags noscrub,nodeep-scrub,sortbitwise
> >>       pgmap v3235917: 2240 pgs, 4 pools, 13309 GB data, 4601 kobjects
> >>             26649 GB used, 27277 GB / 53926 GB avail
> >>                 2239 active+clean
> >>                    1 active+clean+scrubbing+deep
> >>   client io 1247 kB/s rd, 439 kB/s wr, 213 op/s rd, 789 op/s wr
> >>   cache io 253 MB/s evict, 350 op/s promote, 1 PG(s) evicting
> >>
> >>
> >>
> >>     cluster a8171427-141c-4766-9e0f-533d86dd4ef8
> >>      health HEALTH_WARN
> >>             noscrub,nodeep-scrub,sortbitwise flag(s) set
> >>      monmap e1: 3 mons at
> >> {cephmon1=10.0.0.11:6789/0,cephmon2=10.0.0.12:6789/0,cephmon3=10.0.0.13:6789/0}
> >>             election epoch 126, quorum 0,1,2 cephmon1,cephmon2,cephmon3
> >>       fsmap e92: 1/1/1 up {0=cephmon1=up:active}, 1 up:standby
> >>      osdmap e2168: 24 osds: 24 up, 24 in
> >>             flags noscrub,nodeep-scrub,sortbitwise
> >>       pgmap v3235946: 2240 pgs, 4 pools, 13310 GB data, 4589 kobjects
> >>             26650 GB used, 27275 GB / 53926 GB avail
> >>                 2239 active+clean
> >>                    1 active+clean+scrubbing+deep
> >>   client io 0 B/s rd, 490 kB/s wr, 203 op/s rd, 1185 op/s wr
> >>   cache io 343 MB/s evict, 408 op/s promote, 1 PG(s) evicting, 1
> >> PG(s) evicting (full)
> >>
> >> ceph osd df
> >> ID WEIGHT  REWEIGHT SIZE   USE    AVAIL  %USE  VAR  PGS
> >>  4 3.63699  1.00000  3724G  1760G  1964G 47.26 0.96 148
> >>  5 3.63699  1.00000  3724G  1830G  1894G 49.14 0.99 158
> >>  6 3.63699  1.00000  3724G  2056G  1667G 55.23 1.12 182
> >>  7 3.63699  1.00000  3724G  1856G  1867G 49.86 1.01 163
> >> 20 2.79199  1.00000  2793G  1134G  1659G 40.60 0.82  98
> >> 21 2.79199  1.00000  2793G   990G  1803G 35.45 0.72  89
> >> 22 2.79199  1.00000  2793G  1597G  1195G 57.20 1.16 134
> >> 23 2.79199  1.00000  2793G  1337G  1455G 47.87 0.97 116
> >> 12 3.63699  1.00000  3724G  1819G  1904G 48.86 0.99 154
> >> 13 3.63699  1.00000  3724G  1681G  2042G 45.16 0.91 144
> >> 14 3.63699  1.00000  3724G  1892G  1832G 50.80 1.03 165
> >> 15 3.63699  1.00000  3724G  1494G  2229G 40.14 0.81 132
> >> 16 2.79199  1.00000  2793G  1375G  1418G 49.23 1.00 121
> >> 17 2.79199  1.00000  2793G  1444G  1348G 51.71 1.05 127
> >> 18 2.79199  1.00000  2793G  1509G  1283G 54.04 1.09 129
> >> 19 2.79199  1.00000  2793G  1345G  1447G 48.19 0.97 116
> >>  0 0.21799  1.00000   223G   180G 44268M 80.65 1.63 269
> >>  1 0.21799  1.00000   223G   201G 22758M 90.05 1.82 303
> >>  2 0.21799  1.00000   223G   182G 42246M 81.54 1.65 284
> >>  3 0.21799  1.00000   223G   200G 23599M 89.69 1.81 296
> >>  8 0.21799  1.00000   223G   177G 46963M 79.48 1.61 272
> >>  9 0.21799  1.00000   223G   203G 20730M 90.94 1.84 307
> >> 10 0.21799  1.00000   223G   190G 34104M 85.10 1.72 288
> >> 11 0.21799  1.00000   223G   193G 31155M 86.38 1.75 285
> >>               TOTAL 53926G 26654G 27272G 49.43 MIN/MAX VAR: 0.72/1.84
> >> STDDEV: 21.46
> >>
> >>
> >> --
> >> Mit freundlichen Gruessen / Best regards
> >>
> >> Oliver Dzombic
> >> IP-Interactive
> >>
> >> mailto:info@xxxxxxxxxxxxxxxxx
> >>
> >> Address:
> >> 
> >> IP Interactive UG ( haftungsbeschraenkt )
> >> Zum Sonnenberg 1-3
> >> 63571 Gelnhausen
> >> 
> >> HRB 93402, Hanau District Court
> >> Managing Director: Oliver Dzombic
> >> 
> >> Tax No.: 35 236 3622 1
> >> VAT ID: DE274086107
> >>
> >>
> >> On 26.07.2016 at 04:47, Gregory Farnum wrote:
> >>> On Mon, Jul 25, 2016 at 7:38 PM, Oliver Dzombic <info@xxxxxxxxxxxxxxxxx> wrote:
> >>>> Hi,
> >>>>
> >>>> Currently some production services are down because they cannot be
> >>>> accessed through CephFS.
> >>>>
> >>>> Restarting the client server did not help.
> >>>> Restarting the cluster did not help.
> >>>>
> >>>> Only ONE directory inside CephFS has this issue.
> >>>>
> >>>> All other directories are working fine.
> >>>
> >>> What's the full output of "ceph -s"?
> >>>
> >>>>
> >>>>
> >>>> MDS Server: Kernel 4.5.4
> >>>> client server: Kernel 4.5.4
> >>>> ceph version 10.2.2
> >>>>
> >>>> # ceph fs dump
> >>>> dumped fsmap epoch 92
> >>>> e92
> >>>> enable_multiple, ever_enabled_multiple: 0,0
> >>>> compat: compat={},rocompat={},incompat={1=base v0.20,2=client
> >>>> writeable ranges,3=default file layouts on dirs,4=dir inode in
> >>>> separate object,5=mds uses versioned encoding,6=dirfrag is stored
> >>>> in omap,8=file layout v2}
> >>>>
> >>>> Filesystem 'ceph-gen2' (2)
> >>>> fs_name ceph-gen2
> >>>> epoch   92
> >>>> flags   0
> >>>> created 2016-06-11 21:53:02.142649
> >>>> modified        2016-06-14 11:09:16.783356
> >>>> tableserver     0
> >>>> root    0
> >>>> session_timeout 60
> >>>> session_autoclose       300
> >>>> max_file_size   1099511627776
> >>>> last_failure    0
> >>>> last_failure_osd_epoch  2164
> >>>> compat  compat={},rocompat={},incompat={1=base v0.20,2=client
> >>>> writeable ranges,3=default file layouts on dirs,4=dir inode in
> >>>> separate object,5=mds uses versioned encoding,6=dirfrag is stored
> >>>> in omap,8=file layout v2} max_mds 1
> >>>> in      0
> >>>> up      {0=234109}
> >>>> failed
> >>>> damaged
> >>>> stopped
> >>>> data_pools      4
> >>>> metadata_pool   5
> >>>> inline_data     disabled
> >>>> 234109: 10.0.0.11:6801/22255 'cephmon1' mds.0.89 up:active seq 250
> >>>>
> >>>>
> >>>> Standby daemons:
> >>>>
> >>>> 204171: 10.0.0.13:6800/19434 'cephmon3' mds.-1.0 up:standby seq 1
> >>>>
> >>>>
> >>>> ceph --admin-daemon ceph-mds.cephmon1.asok dump_ops_in_flight
> >>>> {
> >>>>     "ops": [
> >>>>         {
> >>>>             "description": "client_request(client.204153:432
> >>>> getattr pAsLsXsFs #10000001432 2016-07-25 21:57:30.697894 RETRY=2)",
> >>>>             "initiated_at": "2016-07-26 04:24:05.528832",
> >>>>             "age": 816.092461,
> >>>>             "duration": 816.092528,
> >>>>             "type_data": [
> >>>>                 "failed to rdlock, waiting",
> >>>>                 "client.204153:432",
> >>>>                 "client_request",
> >>>>                 {
> >>>>                     "client": "client.204153",
> >>>>                     "tid": 432
> >>>>                 },
> >>>>                 [
> >>>>                     {
> >>>>                         "time": "2016-07-26 04:24:05.528832",
> >>>>                         "event": "initiated"
> >>>>                     },
> >>>>                     {
> >>>>                         "time": "2016-07-26 04:24:07.613779",
> >>>>                         "event": "failed to rdlock, waiting"
> >>>>                     }
> >>>>                 ]
> >>>>             ]
> >>>>         }
> >>>>     ],
> >>>>     "num_ops": 1
> >>>> }
> >>>>
> >>>>
> >>>> 2016-07-26 04:32:09.355503 7ffb331ca700  0 log_channel(cluster) log
> >>>> [WRN] : 1 slow requests, 1 included below; oldest blocked for >
> >>>> 483.826590 secs
> >>>>
> >>>> 2016-07-26 04:32:09.355531 7ffb331ca700  0 log_channel(cluster) log
> >>>> [WRN] : slow request 483.826590 seconds old, received at 2016-07-26
> >>>> 04:24:05.528832: client_request(client.204153:432 getattr pAsLsXsFs
> >>>> #10000001432 2016-07-25 21:57:30.697894 RETRY=2) currently failed
> >>>> to rdlock, waiting
> >>>>
> >>>>
> >>>> Any idea? :(
> >>>>
> >>>> --
> >>>> Mit freundlichen Gruessen / Best regards
> >>>>
> >>>> Oliver Dzombic
> >>>> IP-Interactive
> >>>>
> >>>> mailto:info@xxxxxxxxxxxxxxxxx
> >>>>
> >>>> Address:
> >>>>
> >>>> IP Interactive UG ( haftungsbeschraenkt )
> >>>> Zum Sonnenberg 1-3
> >>>> 63571 Gelnhausen
> >>>>
> >>>> HRB 93402, Hanau District Court
> >>>> Managing Director: Oliver Dzombic
> >>>>
> >>>> Tax No.: 35 236 3622 1
> >>>> VAT ID: DE274086107
> >>>>
> >>>> _______________________________________________
> >>>> ceph-users mailing list
> >>>> ceph-users@xxxxxxxxxxxxxx
> >>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >> _______________________________________________
> >> ceph-users mailing list
> >> ceph-users@xxxxxxxxxxxxxx
> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



