Hello! Our test cluster is a few months old. It was
initially set up from scratch with Pacific and has since had two
separate small patches applied to it: 16.2.5 and then, a couple of weeks ago,
16.2.6. The issue I'm describing has been present
since the beginning. We have an active and a standby mgr daemon, and
the dashboard module is installed with SSL turned on. We use
self-signed certificates only, which browsers do not trust, but I always
just click "okay" through Chrome's and Firefox's warnings about
that.

I have noticed that every 2-3 days, in the
morning when I start work, our ceph dashboard page does not
respond in the browser. It works fine throughout the day, but it
seems that after some unknown number of hours without anyone
accessing it (I'm the only one using the dashboard now, since
it's just a test cluster), something must be going wrong with the
dashboard module or the mgr daemon. When I try to load (or
refresh, when it's already loaded) the ceph dashboard site, the
browser just shows the "throbber": no content on the page
ever appears, and there are no errors or anything. None of the pages behind
the buttons load any content, nor do they time out or show a 404. For example,
Block\Images or Cluster\Hosts in the left sidebar will open, but
show empty, and the throbber never stops. I have confirmed that this happens in all browsers
too.

I can easily fix it by running

    ceph mgr module disable dashboard

waiting 10 seconds, and then running

    ceph mgr module enable dashboard

This makes it start working again until the next time
I go a few days without using the dashboard, at which point I
need to repeat the same process.

Any ideas as to what could be causing this? I
have already turned on debug mode. When the dashboard is in this hanging
state, I check the mgr logs with

    cephadm logs --name mgr.ceph01.fblojp -- -f

but there's nothing
obvious (to my untrained eyes at least). When the dashboard is
functional, I can see my own navigation around the dashboard in
the logs, so I know that logging is working:

    Nov 01 15:46:32 ceph01.domain conmon[5814]: debug 2021-11-01T20:46:32.601+0000 7f7cbb42e700 0 [dashboard INFO request] [10.130.50.252:52267] [GET] [200] [0.013s] [admin] [1.0K] /api/summary

I already confirmed that the same thing
happens regardless of whether I'm using the default ports, http://ceph01.domain:8080 or
https://ceph01.domain:8443
(although, as mentioned, I usually use SSL with the self-signed certificate).

At the moment the dashboard is in
this hanging state, so I am happy to try to get logs.

Thanks,
-Zach
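
P.S. In case it helps anyone reason about the workaround, here is a rough sketch of how the disable/wait/enable steps could be scripted, for example from a cron job on the mgr host. It is only a sketch under assumptions, not something I have run against the cluster: the function names dashboard_responds and restart_dashboard, the 10-second timeout, and the choice of probing the dashboard's root URL are all made up for illustration, and I don't know whether the hang actually shows up as a timeout on that URL. Certificate verification is switched off only because the certificate is self-signed.

```python
import http.client
import ssl
import subprocess
import time
import urllib.error
import urllib.request


def dashboard_responds(url, timeout=10.0):
    """Return True if `url` answers an HTTP(S) request within `timeout` seconds.

    Hypothetical helper: the probe URL and timeout are assumptions.
    """
    # Self-signed certificate, so skip verification (test cluster only).
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    try:
        with urllib.request.urlopen(url, timeout=timeout, context=ctx) as resp:
            resp.read(1)  # make sure at least the start of a body arrives
        return True
    except urllib.error.HTTPError:
        # An HTTP error status (401, 404, ...) still means the server answered.
        return True
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Timeout, refused connection, TLS failure: treat as not responding.
        return False


def restart_dashboard(run=subprocess.run, pause=10.0, sleep=time.sleep):
    """Disable the dashboard mgr module, wait `pause` seconds, enable it again.

    `run` and `sleep` are injectable so the sequence can be dry-run.
    """
    run(["ceph", "mgr", "module", "disable", "dashboard"], check=True)
    sleep(pause)
    run(["ceph", "mgr", "module", "enable", "dashboard"], check=True)


# Intended use (not executed here):
#   if not dashboard_responds("https://ceph01.domain:8443/"):
#       restart_dashboard()
```

This would only paper over the problem, of course; I would still like to understand why the module stops responding in the first place.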