Re: outdated mds slow requests

Hi,
The warnings were cleared by restarting the Ceph clients that had issues.
To do that, unmount the problematic CephFS volume and remount it. All Ceph
warnings were gone within a couple of minutes, and trimming is working well
now. Indeed, I wouldn't restart the MDS unless I had to.
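
For anyone hitting the same warnings, here is a minimal sketch of how the
clients to remount could be picked out of the dump_ops_in_flight output quoted
further down. It is not an official Ceph tool. The function names and the
one-hour threshold are my own assumptions, and SAMPLE is a trimmed copy of two
ops from that dump (only the fields the sketch reads).

```python
import json


def stale_ops(dump, min_age_seconds=3600.0):
    """Summarize ops whose age is at least min_age_seconds (assumed threshold)."""
    result = []
    for op in dump.get("ops", []):
        age = float(op.get("age", 0.0))
        if age < min_age_seconds:
            continue
        type_data = op.get("type_data", {})
        result.append({
            "reqid": type_data.get("reqid", ""),
            "op_type": type_data.get("op_type", ""),
            "flag_point": type_data.get("flag_point", ""),
            "age_days": round(age / 86400.0, 1),
            # Only client_request ops carry client_info; peer requests do not.
            "client": type_data.get("client_info", {}).get("client"),
        })
    return result


def stale_clients(dump, min_age_seconds=3600.0):
    """Return the sorted client names behind stale client requests."""
    ops = stale_ops(dump, min_age_seconds)
    return sorted({op["client"] for op in ops if op["client"]})


# Two ops copied from the dump in this thread, reduced to the fields used above.
SAMPLE = json.loads("""
{
  "ops": [
    {
      "description": "peer_request(mds.4:4503615.0 authpin)",
      "age": 921530.99314722,
      "type_data": {
        "flag_point": "dispatched",
        "reqid": "mds.4:4503615",
        "op_type": "peer_request"
      }
    },
    {
      "description": "client_request(client.460983:5731820 getattr pAsLsXsFs #0x60001205c4f 2023-09-14T13:40:25.144336+0000 caller_uid=0, caller_gid=0{})",
      "age": 921530.99301089498,
      "type_data": {
        "flag_point": "failed to authpin, inode is being exported",
        "reqid": "client.460983:5731820",
        "op_type": "client_request",
        "client_info": {"client": "client.460983", "tid": 5731820}
      }
    }
  ],
  "num_ops": 2
}
""")

if __name__ == "__main__":
    for op in stale_ops(SAMPLE):
        print(op)
    print("clients to check:", stale_clients(SAMPLE))
```

To run it against a live MDS, the output of ceph tell mds.5 dump_ops_in_flight
can be saved to a file and loaded with json.load instead of SAMPLE. The client
names it reports should then be matched to a host and mount point (the session
list of the MDS shows client metadata) before unmounting anything.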

Many thanks for help,
Ben

On Tue, Oct 10, 2023 at 15:44, Eugen Block <eblock@xxxxxx> wrote:

> Hi,
>
> > 2. Restart the problematic MDS daemons with the trimming-behind issue (3, 4, 5):
> > the MDS will start up quickly, won't they? Investigating...
>
> this one you should be able to answer better than the rest of us. You
> probably have restarted MDS daemons before, I would assume.
> Just don't restart them all at once but one after the other after
> everything has settled.
>
> Quoting Ben <ruidong.gao@xxxxxxxxx>:
>
> > Hi Eugen,
> >
> > The warnings continue to spam the cluster log. For the whole picture of the
> > issue, please see:
> >
> > https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/thread/VDL56J75FG5LO4ZECIWWGGBW4ULPZUIP/
> >
> > I was thinking about the following options:
> > 1. Restart the problematic nodes (24, 32, 34, 36): this needs to be scheduled
> > with business partners.
> > 2. Restart the problematic MDS daemons with the trimming-behind issue (3, 4, 5):
> > the MDS will start up quickly, won't they? Investigating...
> >
> > MDS 3, 4 and 5 have log segment items in their segments lists that are stuck in
> > expiring status, which breaks the trimming process. The growing segments
> > lists are a concern over time.
> > Any other ideas?
> >
> > Thanks,
> > Ben
> >
> >
> > On Wed, Oct 4, 2023 at 16:44, Eugen Block <eblock@xxxxxx> wrote:
> >
> >> Hi,
> >>
> >> is this still an issue? If so, I would try to either evict the client
> >> via admin socket:
> >>
> >> ceph tell mds.5 client evict [<filters>...] --- Evict client
> >> session(s) based on a filter
> >>
> >> alternatively locally on the MDS:
> >> cephadm enter --name mds.<MDS>
> >> ceph daemon mds.<MDS> client evict <client>
> >>
> >> or restart the MDS which should also clear the client, I believe.
> >>
> >> Quoting Ben <ruidong.gao@xxxxxxxxx>:
> >>
> >> > Hi,
> >> > The cluster is running 17.2.5. There are slow request warnings in the
> >> > cluster log.
> >> >
> >> > Running ceph tell mds.5 dump_ops_in_flight
> >> > returns the output shown below.
> >> >
> >> > These ops look outdated, and the clients were k8s pods. There are warnings
> >> > of this kind on other MDS daemons as well. How can they be cleared safely?
> >> >
> >> > Many thanks.
> >> >
> >> > {
> >> > "ops": [
> >> > {
> >> > "description": "peer_request(mds.3:5311742.0 authpin)",
> >> > "initiated_at": "2023-09-14T12:25:43.092201+0000",
> >> > "age": 926013.05098558997,
> >> > "duration": 926013.051015759,
> >> > "type_data": {
> >> > "flag_point": "dispatched",
> >> > "reqid": "mds.3:5311742",
> >> > "op_type": "peer_request",
> >> > "leader_info": {
> >> > "leader": "3"
> >> > },
> >> > "request_info": {
> >> > "attempt": 0,
> >> > "op_type": "authpin",
> >> > "lock_type": 0,
> >> > "object_info": "0x60001205d6d.head",
> >> > "srcdnpath": "",
> >> > "destdnpath": "",
> >> > "witnesses": "",
> >> > "has_inode_export": false,
> >> > "inode_export_v": 0,
> >> > "op_stamp": "0.000000"
> >> > },
> >> > "events": [
> >> > {
> >> > "time": "2023-09-14T12:25:43.092201+0000",
> >> > "event": "initiated"
> >> > },
> >> > {
> >> > "time": "2023-09-14T12:25:43.092202+0000",
> >> > "event": "throttled"
> >> > },
> >> > {
> >> > "time": "2023-09-14T12:25:43.092201+0000",
> >> > "event": "header_read"
> >> > },
> >> > {
> >> > "time": "2023-09-14T12:25:43.092207+0000",
> >> > "event": "all_read"
> >> > },
> >> > {
> >> > "time": "2023-09-14T12:25:43.092218+0000",
> >> > "event": "dispatched"
> >> > }
> >> > ]
> >> > }
> >> > },
> >> > {
> >> > "description": "peer_request(mds.3:5311743.0 authpin)",
> >> > "initiated_at": "2023-09-14T12:25:43.092371+0000",
> >> > "age": 926013.05081614305,
> >> > "duration": 926013.05089185503,
> >> > "type_data": {
> >> > "flag_point": "dispatched",
> >> > "reqid": "mds.3:5311743",
> >> > "op_type": "peer_request",
> >> > "leader_info": {
> >> > "leader": "3"
> >> > },
> >> > "request_info": {
> >> > "attempt": 0,
> >> > "op_type": "authpin",
> >> > "lock_type": 0,
> >> > "object_info": "0x60001205d6d.head",
> >> > "srcdnpath": "",
> >> > "destdnpath": "",
> >> > "witnesses": "",
> >> > "has_inode_export": false,
> >> > "inode_export_v": 0,
> >> > "op_stamp": "0.000000"
> >> > },
> >> > "events": [
> >> > {
> >> > "time": "2023-09-14T12:25:43.092371+0000",
> >> > "event": "initiated"
> >> > },
> >> > {
> >> > "time": "2023-09-14T12:25:43.092371+0000",
> >> > "event": "throttled"
> >> > },
> >> > {
> >> > "time": "2023-09-14T12:25:43.092371+0000",
> >> > "event": "header_read"
> >> > },
> >> > {
> >> > "time": "2023-09-14T12:25:43.092374+0000",
> >> > "event": "all_read"
> >> > },
> >> > {
> >> > "time": "2023-09-14T12:25:43.092381+0000",
> >> > "event": "dispatched"
> >> > }
> >> > ]
> >> > }
> >> > },
> >> > {
> >> > "description": "peer_request(mds.4:4503615.0 authpin)",
> >> > "initiated_at": "2023-09-14T13:40:25.150040+0000",
> >> > "age": 921530.99314722,
> >> > "duration": 921530.99326053297,
> >> > "type_data": {
> >> > "flag_point": "dispatched",
> >> > "reqid": "mds.4:4503615",
> >> > "op_type": "peer_request",
> >> > "leader_info": {
> >> > "leader": "4"
> >> > },
> >> > "request_info": {
> >> > "attempt": 0,
> >> > "op_type": "authpin",
> >> > "lock_type": 0,
> >> > "object_info": "0x60001205c4f.head",
> >> > "srcdnpath": "",
> >> > "destdnpath": "",
> >> > "witnesses": "",
> >> > "has_inode_export": false,
> >> > "inode_export_v": 0,
> >> > "op_stamp": "0.000000"
> >> > },
> >> > "events": [
> >> > {
> >> > "time": "2023-09-14T13:40:25.150040+0000",
> >> > "event": "initiated"
> >> > },
> >> > {
> >> > "time": "2023-09-14T13:40:25.150040+0000",
> >> > "event": "throttled"
> >> > },
> >> > {
> >> > "time": "2023-09-14T13:40:25.150040+0000",
> >> > "event": "header_read"
> >> > },
> >> > {
> >> > "time": "2023-09-14T13:40:25.150045+0000",
> >> > "event": "all_read"
> >> > },
> >> > {
> >> > "time": "2023-09-14T13:40:25.150053+0000",
> >> > "event": "dispatched"
> >> > }
> >> > ]
> >> > }
> >> > },
> >> > {
> >> > "description": "client_request(client.460983:5731820 getattr pAsLsXsFs #0x60001205c4f 2023-09-14T13:40:25.144336+0000 caller_uid=0, caller_gid=0{})",
> >> > "initiated_at": "2023-09-14T13:40:25.150176+0000",
> >> > "age": 921530.99301089498,
> >> > "duration": 921530.99316312897,
> >> > "type_data": {
> >> > "flag_point": "failed to authpin, inode is being exported",
> >> > "reqid": "client.460983:5731820",
> >> > "op_type": "client_request",
> >> > "client_info": {
> >> > "client": "client.460983",
> >> > "tid": 5731820
> >> > },
> >> > "events": [
> >> > {
> >> > "time": "2023-09-14T13:40:25.150176+0000",
> >> > "event": "initiated"
> >> > },
> >> > {
> >> > "time": "2023-09-14T13:40:25.150177+0000",
> >> > "event": "throttled"
> >> > },
> >> > {
> >> > "time": "2023-09-14T13:40:25.150176+0000",
> >> > "event": "header_read"
> >> > },
> >> > {
> >> > "time": "2023-09-14T13:40:25.150180+0000",
> >> > "event": "all_read"
> >> > },
> >> > {
> >> > "time": "2023-09-14T13:40:25.150186+0000",
> >> > "event": "dispatched"
> >> > },
> >> > {
> >> > "time": "2023-09-14T13:40:25.150195+0000",
> >> > "event": "failed to authpin, inode is being exported"
> >> > }
> >> > ]
> >> > }
> >> > }
> >> > ],
> >> > "num_ops": 4
> >> > }
> >> > _______________________________________________
> >> > ceph-users mailing list -- ceph-users@xxxxxxx
> >> > To unsubscribe send an email to ceph-users-leave@xxxxxxx