Re: Many ceph commands hang. broken mgr?

Interesting. There is a more forceful way to disable the progress module, which
I had to use since we are on an older version. Basically, you stop the mgrs and
then move the progress module files out of the way:

systemctl stop ceph-mgr.target
mv /usr/share/ceph/mgr/progress {some backup location}
systemctl start ceph-mgr.target

It's not nice because ceph then reports HEALTH_ERR, but the progress module
isn't required for the cluster to function, and it made commands like
`ceph fs status` responsive again:
    health: HEALTH_ERR
            Module 'progress' has failed: Not found or unloadable
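
If you want the module back later (assuming the backed-up files are intact),
reversing the same steps should work; {some backup location} below is just a
placeholder for wherever you stashed the files:

systemctl stop ceph-mgr.target
# put the progress module back where ceph-mgr looks for it
mv {some backup location}/progress /usr/share/ceph/mgr/
systemctl start ceph-mgr.target
# the "Module 'progress' has failed" error should clear once the active mgr loads it again
ceph health detail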



On Wed, 25 Nov 2020 at 12:02, Paul Mezzanini <pfmeec@xxxxxxx> wrote:

> "ceph progress off" is just hanging like the others.
>
> I'll fiddle with it later tonight to see if I can get it to stick when I
> bounce a daemon.
>
> --
> Paul Mezzanini
> Sr Systems Administrator / Engineer, Research Computing
> Information & Technology Services
> Finance & Administration
> Rochester Institute of Technology
> o:(585) 475-3245 | pfmeec@xxxxxxx
>
>
> ________________________________________
> From: Rafael Lopez <rafael.lopez@xxxxxxxxxx>
> Sent: Tuesday, November 24, 2020 6:56 PM
> To: Paul Mezzanini
> Cc: ceph-users
> Subject:  Re: Many ceph commands hang. broken mgr?
>
> Hi Paul,
>
> We had a similar experience with Red Hat Ceph, and it turned out to be the mgr
> progress module. I think there is some work in flight to fix this, though the
> fix I thought would affect you appears to already be in 14.2.11.
> https://github.com/ceph/ceph/pull/36076
>
> If you have 14.2.15, you can try turning off the progress module altogether
> to see if it makes a difference.
> https://docs.ceph.com/en/latest/releases/nautilus/
> From the release notes: "MGR: progress module can now be turned on/off, using
> the commands: ceph progress on and ceph progress off."
>
> Rafael
>
>
> On Wed, 25 Nov 2020 at 06:04, Paul Mezzanini <pfmeec@xxxxxxx> wrote:
>
> > Ever since we jumped from 14.2.9 to .12 (and beyond) a lot of the ceph
> > commands just hang.  The mgr daemon also just stops responding to our
> > Prometheus scrapes occasionally.  A daemon restart and it wakes back up.  I
> > have nothing pointing to these being related but it feels that way.
> >
> > I also tried to get device health monitoring with smart up and running
> > around that upgrade time.  It never seemed to be able to pull in and report
> > on the health across the drives.  I did see the osd process firing off
> > smartctl on occasion though so it was trying to do something.  Again, I
> > have nothing pointing to this being related but it feels like it may be.
> >
> > Some commands that currently hang:
> > ceph osd pool autoscale-status
> > ceph balancer *
> > ceph iostat (oddly, this spit out a line of all 0 stats once and then hung)
> > ceph fs status
> > toggling ceph device monitoring on or off and a lot of the device health
> > stuff too
> >
> >
> >
> > Mgr logs on disk show flavors of this:
> > 2020-11-24 13:05:07.883 7f19e2c40700  0 log_channel(audit) log [DBG] : from='mon.0 -' entity='mon.' cmd=[{,",p,r,e,f,i,x,",:, ,",o,s,d, ,p,e,r,f,",,, ,",f,o,r,m,a,t,",:, ,",j,s,o,n,",}]: dispatch
> > 2020-11-24 13:05:07.895 7f19e2c40700  0 log_channel(audit) log [DBG] : from='mon.0 -' entity='mon.' cmd=[{,",p,r,e,f,i,x,",:, ,",o,s,d, ,p,o,o,l, ,s,t,a,t,s,",,, ,",f,o,r,m,a,t,",:, ,",j,s,o,n,",}]: dispatch
> > 2020-11-24 13:05:08.567 7f19e1c3e700  0 log_channel(cluster) log [DBG] : pgmap v587: 17149 pgs: 1 active+remapped+backfill_wait, 2 active+clean+scrubbing, 55 active+clean+scrubbing+deep, 9 active+remapped+backfilling, 17082 active+clean; 2.1 PiB data, 3.5 PiB used, 2.9 PiB / 6.4 PiB avail; 108 MiB/s rd, 53 MiB/s wr, 1.20k op/s; 7525420/9900121381 objects misplaced (0.076%); 99 MiB/s, 40 objects/s recovering
> >
> > ceph status:
> >   cluster:
> >     id:     971a5242-f00d-421e-9bf4-5a716fcc843a
> >     health: HEALTH_WARN
> >             1 nearfull osd(s)
> >             1 pool(s) nearfull
> >
> >   services:
> >     mon: 3 daemons, quorum ceph-mon-01,ceph-mon-03,ceph-mon-02 (age 4h)
> >     mgr: ceph-mon-01(active, since 97s), standbys: ceph-mon-03, ceph-mon-02
> >     mds: cephfs:1 {0=ceph-mds-02=up:active} 3 up:standby
> >     osd: 843 osds: 843 up (since 13d), 843 in (since 2w); 10 remapped pgs
> >     rgw: 1 daemon active (ceph-rgw-01)
> >
> >   task status:
> >     scrub status:
> >         mds.ceph-mds-02: idle
> >
> >   data:
> >     pools:   16 pools, 17149 pgs
> >     objects: 1.61G objects, 2.1 PiB
> >     usage:   3.5 PiB used, 2.9 PiB / 6.4 PiB avail
> >     pgs:     6482000/9900825469 objects misplaced (0.065%)
> >              17080 active+clean
> >              54    active+clean+scrubbing+deep
> >              9     active+remapped+backfilling
> >              5     active+clean+scrubbing
> >              1     active+remapped+backfill_wait
> >
> >   io:
> >     client:   877 MiB/s rd, 1.8 GiB/s wr, 1.91k op/s rd, 3.33k op/s wr
> >     recovery: 136 MiB/s, 55 objects/s
> >
> > ceph config dump:
> > WHO                 MASK  LEVEL     OPTION                                           VALUE                                               RO
> > global                    advanced  cluster_network                                  192.168.42.0/24                                     *
> > global                    advanced  mon_max_pg_per_osd                               400
> > global                    advanced  mon_pg_warn_max_object_skew                      -1.000000
> > global                    dev       mon_warn_on_pool_pg_num_not_power_of_two         false
> > global                    advanced  osd_max_backfills                                2
> > global                    advanced  osd_max_scrubs                                   4
> > global                    advanced  osd_scrub_during_recovery                        false
> > global                    advanced  public_network                                   1xx.xx.171.0/24 10.16.171.0/24                      *
> >   mon                     advanced  mon_allow_pool_delete                            true
> >   mgr                     advanced  mgr/balancer/mode                                none
> >   mgr                     advanced  mgr/devicehealth/enable_monitoring               false
> >   osd                     advanced  bluestore_compression_mode                       passive
> >   osd                     advanced  osd_deep_scrub_large_omap_object_key_threshold   2000000
> >   osd                     advanced  osd_op_queue_cut_off                             high                                                *
> >   osd                     advanced  osd_scrub_load_threshold                         5.000000
> >   mds                     advanced  mds_beacon_grace                                 300.000000
> >   mds                     basic     mds_cache_memory_limit                           16384000000
> >   mds                     advanced  mds_log_max_segments                             256
> >   client                  advanced  rbd_default_features                             5
> >     client.libvirt        advanced  admin_socket                                     /var/run/ceph/$cluster-$type.$id.$pid.$cctid.asok   *
> >     client.libvirt        basic     log_file                                         /var/log/ceph/qemu-guest-$pid.log                   *
> >
> >
> > /etc/ceph/ceph.conf is the stub file with fsid and the mons listed.
> > Yes, I have a drive that just started to tickle the nearfull warning limit.
> > That's what pulled me back into the "I should fix this" mode.  I'm manually
> > adjusting the weight on that one for the time being along with slowly
> > lowering pg_num on an oversized pool.  The cluster still has this issue
> > when in health_ok.
> >
> > I'm free to do a lot of debugging and poking around even though this is
> > our production cluster.  The only service I refuse to play around with is
> > the MDS.  That one bites back.  Does anyone have more ideas on where to
> > look to try and figure out what's going on?
> >
> > --
> > Paul Mezzanini
> > Sr Systems Administrator / Engineer, Research Computing
> > Information & Technology Services
> > Finance & Administration
> > Rochester Institute of Technology
> > o:(585) 475-3245 | pfmeec@xxxxxxx
> >
> > _______________________________________________
> > ceph-users mailing list -- ceph-users@xxxxxxx
> > To unsubscribe send an email to ceph-users-leave@xxxxxxx
> >
>
>
> --
> *Rafael Lopez*
> Devops Systems Engineer
> Monash University eResearch Centre
> E: rafael.lopez@xxxxxxxxxx
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>


-- 
*Rafael Lopez*
Devops Systems Engineer
Monash University eResearch Centre

T: +61 3 9905 9118
E: rafael.lopez@xxxxxxxxxx
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


