Hi Zach,

I remember the CherryPy webserver (Cheroot 8.5.1) had a hellish deadlock-kind of issue <https://github.com/cherrypy/cheroot/issues/358> not that long ago, but that was already fixed in 8.5.2.

Could you please run the same curl command with the "-v" flag to get verbose output? You can compare that with a sample output of a freezing CherryPy server at this tracker: https://tracker.ceph.com/issues/48973

BTW we also managed to speed up reproduction by using a benchmark tool like Apache benchmark. You can get ready-to-use reproducer code here: https://bugzilla.redhat.com/show_bug.cgi?id=1920461#c2

Kind Regards,
Ernesto

On Fri, Nov 19, 2021 at 8:09 PM Zach Heise (SSCC) <heise@xxxxxxxxxxxx> wrote:

> Thanks for writing, Ernesto.
>
> 1. output of ceph mgr services:
>    ceph mgr services
>    {
>        "dashboard": "https://144.92.190.200:8443/",
>        "prometheus": "http://144.92.190.200:9283/"
>    }
> 2. Network tab in dev tools, doing a reload just results in a GET ->
>    DOMAIN: ceph01.ssc.wisc.edu:8443, file /
>    1. Nothing else comes up as the throbber throbs.
>    2. No assets are listed as being downloaded.
> 3. Similar result with curl: curl -k https://144.92.190.200:8443
>    just results in a blinking cursor. No errors, just hanging. If I try any
>    other random port, curl (as expected) says "connection refused" and quits
>    instantly.
>
> Zach
>
>
> On 2021-11-19 10:17 AM, Ernesto Puerta wrote:
>
> Hi Zach,
>
> Thanks for the thorough description. We haven't noticed this issue so far
> and have some long-running clusters, but let's try to debug it:
>
> - First of all, as Kai suggested, let's ensure we're hitting the
>   active manager address (there's a redirection mechanism, but let's ensure
>   it anyway): a "ceph mgr services" should give you the active Dashboard URL.
> - After that, my suggestion for you is to open the Browser's Dev Tools
>   (built into both Chrome and Firefox) and visit the Networking tab.
>   In there, you should be able to see a few network requests on hard reload
>   (remember to keep CTRL+SHIFT pressed while clicking on the reload icon).
>   You should see a few HTML, CSS and JS assets downloading.
> - Let's try to perform a "curl" from the CLI: "curl -k https://<hostname>:<port>".
>   That should return the index HTML file.
>
> Are you using a reverse proxy/cache that might be interfering with this?
>
> Kind Regards,
> Ernesto
>
>
> On Fri, Nov 19, 2021 at 12:04 AM Zach Heise (SSCC) <heise@xxxxxxxxxxxx>
> wrote:
>
>> Hello!
>>
>> Our test cluster is a few months old, was initially set up from scratch
>> with Pacific, and has since had two small patches applied: 16.2.5 and
>> then, a couple of weeks ago, 16.2.6. The issue I'm describing has been
>> present since the beginning.
>>
>> We have an active and standby mgr daemon, and the dashboard module is
>> installed with SSL turned on. Self-signed certificates only, not trusted by
>> browsers, but I always just click "okay" through Chrome and Firefox's
>> warnings about that.
>>
>> I have noticed that every 2-3 days, in the morning when I start work, our
>> ceph dashboard page does not respond in the browser. It works fine
>> throughout the day, but it seems like after a certain unknown number of
>> hours without anyone accessing it (I'm the only one using the dashboard
>> now since it's just a test) something must be going wrong with the
>> dashboard module, or mgr daemon, because when I try to load (or refresh
>> when it's already loaded) the ceph dashboard site, the browser just does
>> the "throbber <https://en.wikipedia.org/wiki/Throbber>" - no content on
>> the page ever appears, no errors or anything. None of the buttons on the
>> page load - nor time out and show a 404 - for example, Block\Images or
>> Cluster\Hosts in the left sidebar will load, but show empty. And the
>> throbber never stops.
>>
>> Confirmed that this happens in all browsers too.
>>
>> I can easily fix it with ceph mgr module disable dashboard and then
>> waiting 10 seconds, then ceph mgr module enable dashboard - this makes
>> it start working again, until the next time I go a few days without using
>> the dashboard, at which point I need to do the same process again.
>>
>> Any ideas as to what could be causing this? I have already turned on
>> debug mode. When I'm in this hanging state, I check the cephadm logs with
>> cephadm logs --name mgr.ceph01.fblojp -- -f but there's nothing obvious
>> (to my untrained eyes at least). When the dashboard is functional, I can
>> see my own navigation around the dashboard in the logs so I know that
>> logging is working:
>>
>> Nov 01 15:46:32 ceph01.domain conmon[5814]: debug 2021-11-01T20:46:32.601+0000 7f7cbb42e700 0 [dashboard INFO request] [10.130.50.252:52267] [GET] [200] [0.013s] [admin] [1.0K] /api/summary
>>
>> I already confirmed that the same thing happens regardless of whether I'm
>> using default ports of http://ceph01.domain:8080 or
>> https://ceph01.domain:8443 (although as mentioned I usually use
>> self-signed SSL).
>>
>> At this moment the dashboard is currently in this hanging state so I am
>> happy to try to get logs.
>>
>> Thanks,
>>
>> -Zach
>> _______________________________________________
>> ceph-users mailing list -- ceph-users@xxxxxxx
>> To unsubscribe send an email to ceph-users-leave@xxxxxxx
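
The curl check discussed in this thread (a hang on the dashboard port versus an immediate "connection refused" on any other port) could be scripted roughly as below. This is only a minimal sketch: the function name `probe`, its return strings and the 5-second default timeout are illustrative assumptions, not part of Ceph, CherryPy or curl. Certificate verification is switched off to mirror `curl -k` against a self-signed certificate.

```python
import socket
import ssl


def probe(host, port, use_tls=True, timeout=5.0):
    """Classify an HTTP(S) endpoint as "refused", "hang", "closed" or "ok: <status line>".

    Hypothetical helper for illustration only. Other failures (DNS errors,
    TLS protocol errors, connection resets) propagate to the caller.
    """
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except ConnectionRefusedError:
        return "refused"  # nothing is listening on this port
    except socket.timeout:
        return "hang"     # the TCP connection attempt itself got no answer
    try:
        if use_tls:
            # Skip certificate checks, like "curl -k" with a self-signed cert.
            ctx = ssl.create_default_context()
            ctx.check_hostname = False
            ctx.verify_mode = ssl.CERT_NONE
            conn = ctx.wrap_socket(conn, server_hostname=host)
        request = "GET / HTTP/1.1\r\nHost: {}\r\nConnection: close\r\n\r\n".format(host)
        conn.sendall(request.encode("ascii"))
        data = conn.recv(1024)
    except socket.timeout:
        return "hang"     # connected, but the handshake or the reply never arrived
    finally:
        conn.close()
    if not data:
        return "closed"   # peer closed the connection without sending anything
    # Return only the first line of the response, e.g. "HTTP/1.1 200 OK".
    return "ok: " + data.split(b"\r\n", 1)[0].decode("latin-1")
```

If the symptoms are as described above, `probe("144.92.190.200", 8443)` would be expected to return "hang" while the dashboard is frozen and an "ok: ..." status line after the module has been disabled and re-enabled, whereas an unused port should return "refused" straight away.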