Re: Many ceph commands hang. broken mgr?

datapoint:  "ceph crash" is also not working.  This is smelling more and more like a mgr issue.

--
Paul Mezzanini
Sr Systems Administrator / Engineer, Research Computing
Information & Technology Services
Finance & Administration
Rochester Institute of Technology
o:(585) 475-3245 | pfmeec@xxxxxxx

CONFIDENTIALITY NOTE: The information transmitted, including attachments, is
intended only for the person(s) or entity to which it is addressed and may
contain confidential and/or privileged material. Any review, retransmission,
dissemination or other use of, or taking of any action in reliance upon this
information by persons or entities other than the intended recipient is
prohibited. If you received this in error, please contact the sender and
destroy any copies of this information.
------------------------

________________________________________
From: Paul Mezzanini <pfmeec@xxxxxxx>
Sent: Monday, November 30, 2020 11:59 AM
To: ceph-users
Subject:  Re: Many ceph commands hang. broken mgr?

Still going on.  I want to start using the balancer module but all of those commands are hanging.
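
The balancer commands in question are the usual ones, along the lines of the
following (upmap here is just an example mode, not necessarily what I'll settle on):

ceph balancer status
ceph balancer mode upmap
ceph balancer on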

I'm just doing shotgun debugging now.

--
Paul Mezzanini
Sr Systems Administrator / Engineer, Research Computing
Information & Technology Services
Finance & Administration
Rochester Institute of Technology
o:(585) 475-3245 | pfmeec@xxxxxxx

------------------------

________________________________________
From: Paul Mezzanini <pfmeec@xxxxxxx>
Sent: Tuesday, November 24, 2020 8:17 PM
To: Rafael Lopez
Cc: ceph-users
Subject:  Re: Many ceph commands hang. broken mgr?

While the "progress off" was hung, I did a systemctl restart of the active ceph-mgr.  The progress toggle command completed and reported that progress disabled.

All commands that were hanging before are still unresponsive.  That was worth a shot.

Thanks

--
Paul Mezzanini
Sr Systems Administrator / Engineer, Research Computing
Information & Technology Services
Finance & Administration
Rochester Institute of Technology
o:(585) 475-3245 | pfmeec@xxxxxxx

------------------------

________________________________________
From: Rafael Lopez <rafael.lopez@xxxxxxxxxx>
Sent: Tuesday, November 24, 2020 8:08 PM
To: Paul Mezzanini
Cc: ceph-users
Subject:  Re: Many ceph commands hang. broken mgr?

Interesting. There is a more forceful way to disable progress which I had
to do as we have an older version. Basically, you stop the mgrs, and then
move the progress module files:

systemctl stop ceph-mgr.target
mv /usr/share/ceph/mgr/progress {some backup location}
systemctl start ceph-mgr.target

It's not nice because Ceph then reports HEALTH_ERR, but the module isn't required
for the cluster to function, and it made things like `ceph fs status` responsive
again for us:
    health: HEALTH_ERR
            Module 'progress' has failed: Not found or unloadable
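
Putting it back later should just be the reverse (same placeholder for wherever
you parked the module); `ceph mgr module ls` afterwards should list progress again:

systemctl stop ceph-mgr.target
mv {some backup location}/progress /usr/share/ceph/mgr/progress
systemctl start ceph-mgr.target
ceph mgr module ls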



On Wed, 25 Nov 2020 at 12:02, Paul Mezzanini <pfmeec@xxxxxxx> wrote:

> "ceph progress off" is just hanging like the others.
>
> I'll fiddle with it later tonight to see if I can get it to stick when I
> bounce a daemon.
>
> --
> Paul Mezzanini
> Sr Systems Administrator / Engineer, Research Computing
> Information & Technology Services
> Finance & Administration
> Rochester Institute of Technology
> o:(585) 475-3245 | pfmeec@xxxxxxx
>
> ------------------------
>
> ________________________________________
> From: Rafael Lopez <rafael.lopez@xxxxxxxxxx>
> Sent: Tuesday, November 24, 2020 6:56 PM
> To: Paul Mezzanini
> Cc: ceph-users
> Subject:  Re: Many ceph commands hang. broken mgr?
>
> Hi Paul,
>
> We had a similar experience with Red Hat Ceph, and it turned out to be the mgr
> progress module. I think there is some work to fix this, though the one I
> thought would affect you appears to already be in 14.2.11:
> https://github.com/ceph/ceph/pull/36076
>
> If you have 14.2.15, you can try turning the progress module off altogether
> to see if it makes a difference. From the release notes
> (https://docs.ceph.com/en/latest/releases/nautilus/): "MGR: progress module
> can now be turned on/off, using the commands: ceph progress on and ceph
> progress off."
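>
> i.e. something along these lines (assuming the daemons really are on 14.2.15;
> `ceph versions` will tell you):
>
> ceph versions        # confirm what the mons/mgrs are running
> ceph progress off    # stop the progress module doing any work
> ceph progress on     # turn it back on later if it wasn't the culprit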
>
> Rafael
>
>
> On Wed, 25 Nov 2020 at 06:04, Paul Mezzanini <pfmeec@xxxxxxx> wrote:
>
> > Ever since we jumped from 14.2.9 to .12 (and beyond) a lot of the ceph
> > commands just hang.  The mgr daemon also just stops responding to our
> > Prometheus scrapes occasionally.  A daemon restart and it wakes back up.  I
> > have nothing pointing to these being related but it feels that way.
> >
> > I also tried to get device health monitoring with SMART up and running
> > around that upgrade time.  It never seemed to be able to pull in and report
> > on the health across the drives.  I did see the osd process firing off
> > smartctl on occasion though, so it was trying to do something.  Again, I
> > have nothing pointing to this being related but it feels like it may be.
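> >
> > (By "toggling" I mean roughly the commands below; the config knob is the
> > same mgr/devicehealth option that shows up in the config dump further down,
> > and <devid> is just a placeholder.)
> >
> > ceph device monitoring on
> > ceph device ls
> > ceph device get-health-metrics <devid>
> > ceph config set mgr mgr/devicehealth/enable_monitoring false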
> >
> > Some commands that currently hang:
> > ceph osd pool autoscale-status
> > ceph balancer *
> > ceph iostat (oddly, this spit out a line of all 0 stats once and then hung)
> > ceph fs status
> > toggling ceph device monitoring on or off and a lot of the device health
> > stuff too
> >
> >
> >
> > Mgr logs on disk show flavors of this:
> > 2020-11-24 13:05:07.883 7f19e2c40700  0 log_channel(audit) log [DBG] : from='mon.0 -' entity='mon.' cmd=[{,",p,r,e,f,i,x,",:, ,",o,s,d, ,p,e,r,f,",,, ,",f,o,r,m,a,t,",:, ,",j,s,o,n,",}]: dispatch
> > 2020-11-24 13:05:07.895 7f19e2c40700  0 log_channel(audit) log [DBG] : from='mon.0 -' entity='mon.' cmd=[{,",p,r,e,f,i,x,",:, ,",o,s,d, ,p,o,o,l, ,s,t,a,t,s,",,, ,",f,o,r,m,a,t,",:, ,",j,s,o,n,",}]: dispatch
> > 2020-11-24 13:05:08.567 7f19e1c3e700  0 log_channel(cluster) log [DBG] :
> > pgmap v587: 17149 pgs: 1 active+remapped+backfill_wait, 2
> > active+clean+scrubbing, 55 active+clean+scrubbing+deep, 9
> > active+remapped+backfilling, 17082 active+clean; 2.1 PiB data, 3.5 PiB
> > used, 2.9 PiB / 6.4 PiB avail; 108 MiB/s rd, 53 MiB/s wr, 1.20k op/s;
> > 7525420/9900121381 objects misplaced (0.076%); 99 MiB/s, 40 objects/s
> > recovering
> >
> > ceph status:
> >   cluster:
> >     id:     971a5242-f00d-421e-9bf4-5a716fcc843a
> >     health: HEALTH_WARN
> >             1 nearfull osd(s)
> >             1 pool(s) nearfull
> >
> >   services:
> >     mon: 3 daemons, quorum ceph-mon-01,ceph-mon-03,ceph-mon-02 (age 4h)
> >     mgr: ceph-mon-01(active, since 97s), standbys: ceph-mon-03, ceph-mon-02
> >     mds: cephfs:1 {0=ceph-mds-02=up:active} 3 up:standby
> >     osd: 843 osds: 843 up (since 13d), 843 in (since 2w); 10 remapped pgs
> >     rgw: 1 daemon active (ceph-rgw-01)
> >
> >   task status:
> >     scrub status:
> >         mds.ceph-mds-02: idle
> >
> >   data:
> >     pools:   16 pools, 17149 pgs
> >     objects: 1.61G objects, 2.1 PiB
> >     usage:   3.5 PiB used, 2.9 PiB / 6.4 PiB avail
> >     pgs:     6482000/9900825469 objects misplaced (0.065%)
> >              17080 active+clean
> >              54    active+clean+scrubbing+deep
> >              9     active+remapped+backfilling
> >              5     active+clean+scrubbing
> >              1     active+remapped+backfill_wait
> >
> >   io:
> >     client:   877 MiB/s rd, 1.8 GiB/s wr, 1.91k op/s rd, 3.33k op/s wr
> >     recovery: 136 MiB/s, 55 objects/s
> >
> > ceph config dump:
> > WHO                  MASK LEVEL    OPTION                                          VALUE                                              RO
> > global                    advanced cluster_network                                 192.168.42.0/24                                    *
> > global                    advanced mon_max_pg_per_osd                              400
> > global                    advanced mon_pg_warn_max_object_skew                     -1.000000
> > global                    dev      mon_warn_on_pool_pg_num_not_power_of_two        false
> > global                    advanced osd_max_backfills                               2
> > global                    advanced osd_max_scrubs                                  4
> > global                    advanced osd_scrub_during_recovery                       false
> > global                    advanced public_network                                  1xx.xx.171.0/24 10.16.171.0/24                     *
> >   mon                     advanced mon_allow_pool_delete                           true
> >   mgr                     advanced mgr/balancer/mode                               none
> >   mgr                     advanced mgr/devicehealth/enable_monitoring              false
> >   osd                     advanced bluestore_compression_mode                      passive
> >   osd                     advanced osd_deep_scrub_large_omap_object_key_threshold  2000000
> >   osd                     advanced osd_op_queue_cut_off                            high                                               *
> >   osd                     advanced osd_scrub_load_threshold                        5.000000
> >   mds                     advanced mds_beacon_grace                                300.000000
> >   mds                     basic    mds_cache_memory_limit                          16384000000
> >   mds                     advanced mds_log_max_segments                            256
> >   client                  advanced rbd_default_features                            5
> >     client.libvirt        advanced admin_socket                                    /var/run/ceph/$cluster-$type.$id.$pid.$cctid.asok *
> >     client.libvirt        basic    log_file                                        /var/log/ceph/qemu-guest-$pid.log                  *
> >
> >
> > /etc/ceph/ceph.conf is the stub file with the fsid and the mons listed.
> > Yes, I have a drive that just started to tickle the nearfull warn limit.
> > That's what pulled me back into the "I should fix this" mode.  I'm manually
> > adjusting the weight on that one for the time being, along with slowly
> > lowering pg_num on an oversized pool.  The cluster still has this issue
> > when in health_ok.
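> >
> > (Roughly speaking, that's just the following two knobs; the OSD id, pool
> > name and targets here are placeholders rather than the exact values I'm using:)
> >
> > ceph osd reweight <osd-id> 0.90            # nudge the nearfull OSD down a bit
> > ceph osd pool set <pool> pg_num <target>   # shrink the oversized pool gradually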
> >
> > I'm free to do a lot of debugging and poking around even though this is
> > our production cluster.  The only service I refuse to play around with is
> > the MDS.  That one bites back.  Does anyone have more ideas on where to
> > look to try and figure out what's going on?
> >
> > --
> > Paul Mezzanini
> > Sr Systems Administrator / Engineer, Research Computing
> > Information & Technology Services
> > Finance & Administration
> > Rochester Institute of Technology
> > o:(585) 475-3245 | pfmeec@xxxxxxx
> >
> > ------------------------
> > _______________________________________________
> > ceph-users mailing list -- ceph-users@xxxxxxx
> > To unsubscribe send an email to ceph-users-leave@xxxxxxx
> >
>
>
> --
> *Rafael Lopez*
> Devops Systems Engineer
> Monash University eResearch Centre
> E: rafael.lopez@xxxxxxxxxx
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>


--
*Rafael Lopez*
Devops Systems Engineer
Monash University eResearch Centre

T: +61 3 9905 9118
E: rafael.lopez@xxxxxxxxxx
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


