Re: Dashboard's website hangs during loading, no errors

Ernesto Puerta <epuertat@xxxxxxxxxx> · Fri, 19 Nov 2021 22:06:04 +0100

Hi Zach,

I remember the CherryPy webserver (Cheroot 8.5.1) had a hellish
deadlock-like issue <https://github.com/cherrypy/cheroot/issues/358> not
that long ago, but that was already fixed in 8.5.2.

Could you please run the same curl command with the "-v" flag to get
verbose output?
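
To tell a hang apart from a closed port without watching a blinking cursor,
the same check can be sketched in Python. This is only an illustration, not
part of the Dashboard or of curl: the function name probe(), its return
labels and the 10-second default timeout are assumptions, and certificate
verification is switched off to mirror "curl -k".

```python
import socket
import ssl


def probe(host, port, timeout=10.0, use_tls=True):
    """Classify one GET / as 'refused', 'hang', 'responded' or 'closed'.

    Hypothetical helper for illustration; other errors simply propagate.
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except ConnectionRefusedError:
        return "refused"  # nothing is listening on this port
    except (socket.timeout, TimeoutError):
        return "hang"  # the TCP connect itself never completed
    try:
        sock.settimeout(timeout)
        if use_tls:
            # Accept self-signed certificates, like "curl -k".
            ctx = ssl.create_default_context()
            ctx.check_hostname = False
            ctx.verify_mode = ssl.CERT_NONE
            sock = ctx.wrap_socket(sock, server_hostname=host)
        request = (
            b"GET / HTTP/1.1\r\nHost: " + host.encode()
            + b"\r\nConnection: close\r\n\r\n"
        )
        sock.sendall(request)
        data = sock.recv(1024)
        # An empty read means the server closed without answering.
        return "responded" if data else "closed"
    except (socket.timeout, TimeoutError):
        return "hang"  # connected, but no TLS handshake or no reply in time
    finally:
        sock.close()
```

Calling probe("144.92.190.200", 8443) while the Dashboard is frozen would be
expected to return "hang", whereas an unused port should return "refused",
matching the difference you described with curl.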

You can compare that with the sample output of a frozen CherryPy server in
this tracker issue: https://tracker.ceph.com/issues/48973

BTW we also managed to speed up reproduction by using a benchmark tool like
Apache benchmark. A ready-to-use reproducer is available here:
https://bugzilla.redhat.com/show_bug.cgi?id=1920461#c2
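
For illustration only, the general idea of firing many concurrent requests
and tallying the outcomes can be sketched in Python. This is not the
reproducer from the Bugzilla comment and it is not Apache benchmark: the
name hammer() and its defaults (200 requests, 20 workers, 5-second timeout)
are hypothetical. Proxies are bypassed and certificate checks are disabled,
on the assumption that the Dashboard uses a self-signed certificate.

```python
import collections
import concurrent.futures
import ssl
import urllib.error
import urllib.request


def hammer(url, total=200, concurrency=20, timeout=5.0):
    """Send `total` GET requests using `concurrency` threads; count outcomes."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False  # self-signed certificate, like "curl -k"
    ctx.verify_mode = ssl.CERT_NONE
    # Empty ProxyHandler: talk to the server directly, ignoring proxy settings.
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({}),
        urllib.request.HTTPSHandler(context=ctx),
    )

    def one_request(_):
        try:
            with opener.open(url, timeout=timeout) as resp:
                resp.read()
                return "http %d" % resp.status
        except urllib.error.HTTPError as exc:
            return "http %d" % exc.code  # server answered with an error status
        except Exception as exc:
            return type(exc).__name__  # e.g. a timeout or connection error

    with concurrent.futures.ThreadPoolExecutor(max_workers=concurrency) as pool:
        return collections.Counter(pool.map(one_request, range(total)))
```

A healthy server should yield a counter dominated by "http 200"; a growing
share of timeout entries across repeated runs would suggest the server has
stopped answering.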

Kind Regards,
Ernesto

On Fri, Nov 19, 2021 at 8:09 PM Zach Heise (SSCC) <heise@xxxxxxxxxxxx>
wrote:

> Thanks for writing, Ernesto.
>
>    1. output of ceph mgr services:
>    ceph mgr services
>    {
>        "dashboard": "https://144.92.190.200:8443/",
>        "prometheus": "http://144.92.190.200:9283/"
>    }
>    2. Network tab in dev tools, doing a reload just results in a GET ->
>    DOMAIN: ceph01.ssc.wisc.edu:8443, file /
>       1. Nothing else comes up as the throbber throbs.
>       2. No assets list as being downloaded.
>       3. Similar result with curl: curl -k https://144.92.190.200:8443
>    just results in a blinking cursor. No errors, just hanging. If I try any
>    other random port, curl (as expected) says "connection refused" and quits
>    instantly.
>
> Zach
>
>
> On 2021-11-19 10:17 AM, Ernesto Puerta wrote:
>
> Hi Zach,
>
> Thanks for the thorough description. We haven't noticed this issue so far
> and have some long-running clusters, but let's try to debug it:
>
>    - First of all, as Kai suggested, let's ensure we're hitting the
>    active manager address (there's a redirection mechanism, but let's ensure
>    it anyway): a "ceph mgr services" should give you the active Dashboard URL.
>    - After that, my suggestion for you is to open the browser's Dev Tools
>    (built into both Chrome and Firefox) and visit the Network tab. In
>    there, you should be able to see a few network requests on hard reload
>    (remember to keep CTRL+SHIFT pressed while clicking on the reload icon).
>    You should see a few HTML, CSS and JS assets downloading.
>    - Let's try to perform a "curl" from the CLI: "curl -k https://<hostname>:<port>".
>    That should return the index HTML file.
>
> Are you using a reverse proxy/cache that might be interfering with this?
>
> Kind Regards,
> Ernesto
>
>
> On Fri, Nov 19, 2021 at 12:04 AM Zach Heise (SSCC) <heise@xxxxxxxxxxxx>
> wrote:
>
>> Hello!
>>
>>
>>
>> Our test cluster is a few months old, was initially set up from scratch
>> with Pacific, and has since had two small patch releases applied: 16.2.5
>> and, a couple of weeks ago, 16.2.6. The issue I'm describing has been
>> present since the beginning.
>>
>>
>>
>> We have an active and standby mgr daemon, and the dashboard module is
>> installed with SSL turned on. Self-signed certificates only, not trusted by
>> browsers, but I always just click "okay" through Chrome and Firefox's
>> warnings about that.
>>
>>
>>
>> I have noticed that every 2-3 days, in the morning when I start work, our
>> ceph dashboard page does not respond in the browser. It works fine
>> throughout the day, but it seems that after some unknown number of hours
>> without anyone accessing it (I'm the only one using the dashboard now since
>> it's just a test) something must be going wrong with the dashboard module, or
>> mgr daemon, because when I try to load (or refresh when it's already
>> loaded) the ceph dashboard site, the browser just shows the "throbber"
>> <https://en.wikipedia.org/wiki/Throbber>: no content on the page ever
>> appears, no errors or anything. None of the buttons on the page load
>> anything, nor do they time out and show a 404; for example, Block\Images or
>> Cluster\Hosts in the left sidebar will load, but show empty. And the
>> throbber never stops.
>>
>>
>>
>> Confirmed that this happens in all browsers too.
>>
>>
>>
>> I can easily fix it with "ceph mgr module disable dashboard", waiting
>> 10 seconds, and then "ceph mgr module enable dashboard". This makes
>> it start working again, until the next time I go a few days without using
>> the dashboard, at which point I need to do the same process again.
>>
>>
>>
>> Any ideas as to what could be causing this? I have already turned on
>> debug mode. When the dashboard is in this hanging state, I check the logs
>> with "cephadm logs --name mgr.ceph01.fblojp -- -f" but there's nothing
>> obvious (to my untrained eyes at least). When the dashboard is functional,
>> I can see my own navigation around the dashboard in the logs so I know that
>> logging is working:
>>
>>
>>
>> Nov 01 15:46:32 ceph01.domain conmon[5814]: debug
>> 2021-11-01T20:46:32.601+0000 7f7cbb42e700  0 [dashboard INFO request]
>> [10.130.50.252:52267] [GET] [200] [0.013s] [admin] [1.0K] /api/summary
>>
>>
>>
>> I already confirmed that the same thing happens regardless of whether I'm
>> using default ports of http://ceph01.domain:8080 or
>> https://ceph01.domain:8443 (although as mentioned I usually use
>> self-signed SSL).
>>
>>
>>
>> The dashboard is currently in this hanging state, so I am
>> happy to try to get logs.
>>
>>
>>
>> Thanks,
>>
>> -Zach
>> _______________________________________________
>> ceph-users mailing list -- ceph-users@xxxxxxx
>> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>>
>