Re: Grafana service fails to start due to bad directory name after Quincy upgrade

Hi,

You can upgrade the Grafana version individually by setting the config option
for the Grafana container image, like:

ceph config set mgr mgr/cephadm/container_image_grafana quay.io/ceph/ceph-grafana:8.3.5

and then redeploy the Grafana container, either via the dashboard or cephadm.
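
For example, assuming the default service name "grafana", the cephadm
redeploy can be triggered with:

ceph orch redeploy grafana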

Regards,
Nizam



On Fri, Jun 23, 2023 at 12:05 AM Adiga, Anantha <anantha.adiga@xxxxxxxxx>
wrote:

> Hi Eugen,
>
> Thank you so much for the details.  Here is the update (comments in-line
> >>):
>
> Regards,
> Anantha
> -----Original Message-----
> From: Eugen Block <eblock@xxxxxx>
> Sent: Monday, June 19, 2023 5:27 AM
> To: ceph-users@xxxxxxx
> Subject:  Re: Grafana service fails to start due to bad
> directory name after Quincy upgrade
>
> Hi,
>
> so grafana is starting successfully now? What did you change?
> >> I stopped and removed the Grafana image and started it from the "Ceph
> Dashboard" service. The version is still 6.7.4. I also had to change the
> following.
> I do not have a way to make this permanent; if the service is redeployed,
> I will lose the changes.
> I did not save the file that cephadm generated. This was one reason why the
> Grafana service would not start. I had to replace it with the one below to
> resolve this issue.
> [users]
>   default_theme = light
> [auth.anonymous]
>   enabled = true
>   org_name = 'Main Org.'
>   org_role = 'Viewer'
> [server]
>   domain = 'bootstrap.storage.lab'
>   protocol = https
>   cert_file = /etc/grafana/certs/cert_file
>   cert_key = /etc/grafana/certs/cert_key
>   http_port = 3000
>   http_addr =
> [snapshots]
>   external_enabled = false
> [security]
>   disable_initial_admin_creation = false
>   cookie_secure = true
>   cookie_samesite = none
>   allow_embedding = true
>   admin_password = paswd-value
>   admin_user = user-name
>
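> (A possible way to make such a grafana.ini override permanent, assuming the
> cephadm per-service configuration mechanism from the monitoring docs also
> covers grafana.ini, would be something like:
> ceph config-key set mgr/cephadm/services/grafana/grafana.ini -i grafana.ini
> ceph orch redeploy grafana
> )
>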
> Also, this was the other change:
> # This file is generated by cephadm.
> apiVersion: 1   <--  This was the line added to
> /var/lib/ceph/d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e/grafana.fl31ca104ja0201/etc/grafana/provisioning/datasources/ceph-dashboard.yml
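> (For instance, the line can be prepended in place with GNU sed, using the
> full path above:
> sed -i '1i apiVersion: 1' /var/lib/ceph/d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e/grafana.fl31ca104ja0201/etc/grafana/provisioning/datasources/ceph-dashboard.yml
> )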
> >>
> Regarding the container images, yes there are defaults in cephadm which
> can be overridden with ceph config. Can you share this output?
>
> ceph config dump | grep container_image
> >>
> Here it is:
> root@fl31ca104ja0201:/# ceph config dump | grep container_image
> global                                              basic     container_image                            quay.io/ceph/ceph@sha256:af79fedafc42237b7612fe2d18a9c64ca62a0b38ab362e614ad671efa4a0547e  *
> mgr                                                 advanced  mgr/cephadm/container_image_alertmanager   docker.io/prom/alertmanager:v0.16.2   *
> mgr                                                 advanced  mgr/cephadm/container_image_base           quay.io/ceph/daemon
> mgr                                                 advanced  mgr/cephadm/container_image_grafana        docker.io/grafana/grafana:6.7.4       *
> mgr                                                 advanced  mgr/cephadm/container_image_node_exporter  docker.io/prom/node-exporter:v0.17.0  *
> mgr                                                 advanced  mgr/cephadm/container_image_prometheus     docker.io/prom/prometheus:v2.7.2      *
> client.rgw.default.default.fl31ca104ja0201.ninovs   basic     container_image                            quay.io/ceph/ceph@sha256:af79fedafc42237b7612fe2d18a9c64ca62a0b38ab362e614ad671efa4a0547e  *
> client.rgw.default.default.fl31ca104ja0202.yhjkmb   basic     container_image                            quay.io/ceph/ceph@sha256:af79fedafc42237b7612fe2d18a9c64ca62a0b38ab362e614ad671efa4a0547e  *
> client.rgw.default.default.fl31ca104ja0203.fqnriq   basic     container_image                            quay.io/ceph/ceph@sha256:af79fedafc42237b7612fe2d18a9c64ca62a0b38ab362e614ad671efa4a0547e  *
> >>
> I tend to always use a specific image as described here [2]. I also
> haven't deployed grafana via the dashboard yet, so I can't really comment
> on that, nor on the warnings you report.
>
>
> >> OK. The reason I need that is that in Quincy, when you enable Loki and
> Promtail, the Ceph dashboard pulls in the Grafana dashboard to view the
> daemon logs. I will let you know once that issue is resolved.
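> (For reference, assuming the standard cephadm service names, Loki and
> Promtail are deployed with:
> ceph orch apply loki
> ceph orch apply promtail
> )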
>
> Regards,
> Eugen
>
> [2]
>
> https://docs.ceph.com/en/latest/cephadm/services/monitoring/#using-custom-images
> >> Thank you, I am following the document now.
>
> Zitat von "Adiga, Anantha" <anantha.adiga@xxxxxxxxx>:
>
> > Hi Eugen,
> >
> > Thank you for your response, here is the update.
> >
> > The upgrade to Quincy was done following the cephadm orchestrator upgrade
> > procedure: ceph orch upgrade start --image quay.io/ceph/ceph:v17.2.6
> >
> > The upgrade completed without errors. After the upgrade, upon creating
> > the Grafana service from the Ceph dashboard, it deployed Grafana 6.7.4.
> > The version is hardcoded in the code; should it not be 8.3.5, as listed
> > below in the Quincy documentation? See below.
> >
> > [Grafana service started from the Ceph dashboard]
> >
> > Quincy documentation states:
> > https://docs.ceph.com/en/latest/releases/quincy/
> > ……documentation snippet
> > Monitoring and alerting:
> > 43 new alerts have been added (totalling 68) improving observability
> > of events affecting: cluster health, monitors, storage devices, PGs
> > and CephFS.
> > Alerts can now be sent externally as SNMP traps via the new SNMP
> > gateway service (the MIB is provided).
> > Improved integrated full/nearfull event notifications.
> > Grafana Dashboards now use grafonnet format (though they’re still
> > available in JSON format).
> > Stack update: images for monitoring containers have been updated.
> > Grafana 8.3.5, Prometheus 2.33.4, Alertmanager 0.23.0 and Node
> > Exporter 1.3.1. This reduced exposure to several Grafana
> > vulnerabilities (CVE-2021-43798, CVE-2021-39226, CVE-2021-43798,
> > CVE-2020-29510, CVE-2020-29511).
> > ………………….
> >
> > I notice that the versions of the rest of the stack that the Ceph
> > dashboard deploys are also older than what is documented:
> > Prometheus 2.7.2, Alertmanager 0.16.2 and Node Exporter 0.17.0.
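> > (Presumably these can be pinned the same way as the Grafana image, e.g.,
> > assuming the quay.io/prometheus images named in the Quincy release notes:
> > ceph config set mgr mgr/cephadm/container_image_prometheus quay.io/prometheus/prometheus:v2.33.4
> > ceph config set mgr mgr/cephadm/container_image_alertmanager quay.io/prometheus/alertmanager:v0.23.0
> > ceph config set mgr mgr/cephadm/container_image_node_exporter quay.io/prometheus/node-exporter:v1.3.1
> > followed by redeploying each service.)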
> >
> > Also, the Grafana 6.7.4 service reports a few warnings, highlighted below:
> >
> > root@fl31ca104ja0201:/home/general# systemctl status ceph-d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e@grafana.fl31ca104ja0201.service
> > ● ceph-d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e@grafana.fl31ca104ja0201.service - Ceph grafana.fl31ca104ja0201 for d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e
> >      Loaded: loaded (/etc/systemd/system/ceph-d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e@.service; enabled; vendor preset: enabled)
> >      Active: active (running) since Tue 2023-06-13 03:37:58 UTC; 11h ago
> >    Main PID: 391896 (bash)
> >       Tasks: 53 (limit: 618607)
> >      Memory: 17.9M
> >      CGroup: /system.slice/system-ceph\x2dd0a3b6e0\x2dd2c3\x2d11ed\x2dbe05\x2da7a3a1d7a87e.slice/ceph-d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e@grafana.fl31ca104j
> >              ├─391896 /bin/bash /var/lib/ceph/d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e/grafana.fl31ca104ja0201/unit.run
> >              └─391969 /usr/bin/docker run --rm --ipc=host --stop-signal=SIGTERM --net=host --init --name ceph-d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e-grafana-fl>
> > -- Logs begin at Sun 2023-06-11 20:41:51 UTC, end at Tue 2023-06-13
> > 15:35:12 UTC. --
> > Jun 13 03:37:59 fl31ca104ja0201 bash[391969]: t=2023-06-13T03:37:59+0000 lvl=info msg="Executing migration" logger=migrator id="alter user_auth.auth_id to length 190"
> > Jun 13 03:37:59 fl31ca104ja0201 bash[391969]: t=2023-06-13T03:37:59+0000 lvl=info msg="Executing migration" logger=migrator id="Add OAuth access token to user_auth"
> > Jun 13 03:37:59 fl31ca104ja0201 bash[391969]: t=2023-06-13T03:37:59+0000 lvl=info msg="Executing migration" logger=migrator id="Add OAuth refresh token to user_auth"
> > Jun 13 03:37:59 fl31ca104ja0201 bash[391969]: t=2023-06-13T03:37:59+0000 lvl=info msg="Executing migration" logger=migrator id="Add OAuth token type to user_auth"
> > Jun 13 03:37:59 fl31ca104ja0201 bash[391969]: t=2023-06-13T03:37:59+0000 lvl=info msg="Executing migration" logger=migrator id="Add OAuth expiry to user_auth"
> > Jun 13 03:37:59 fl31ca104ja0201 bash[391969]: t=2023-06-13T03:37:59+0000 lvl=info msg="Executing migration" logger=migrator id="Add index to user_id column in user_auth"
> > Jun 13 03:37:59 fl31ca104ja0201 bash[391969]: t=2023-06-13T03:37:59+0000 lvl=info msg="Executing migration" logger=migrator id="create server_lock table"
> > Jun 13 03:37:59 fl31ca104ja0201 bash[391969]: t=2023-06-13T03:37:59+0000 lvl=info msg="Executing migration" logger=migrator id="add index server_lock.operation_uid"
> > Jun 13 03:37:59 fl31ca104ja0201 bash[391969]: t=2023-06-13T03:37:59+0000 lvl=info msg="Executing migration" logger=migrator id="create user auth token table"
> > Jun 13 03:37:59 fl31ca104ja0201 bash[391969]: t=2023-06-13T03:37:59+0000 lvl=info msg="Executing migration" logger=migrator id="add unique index user_auth_token.auth_token"
> > Jun 13 03:37:59 fl31ca104ja0201 bash[391969]: t=2023-06-13T03:37:59+0000 lvl=info msg="Executing migration" logger=migrator id="add unique index user_auth_token.prev_auth_token"
> > Jun 13 03:37:59 fl31ca104ja0201 bash[391969]: t=2023-06-13T03:37:59+0000 lvl=info msg="Executing migration" logger=migrator id="create cache_data table"
> > Jun 13 03:37:59 fl31ca104ja0201 bash[391969]: t=2023-06-13T03:37:59+0000 lvl=info msg="Executing migration" logger=migrator id="add unique index cache_data.cache_key"
> > Jun 13 03:37:59 fl31ca104ja0201 bash[391969]: t=2023-06-13T03:37:59+0000 lvl=info msg="Created default organization" logger=sqlstore
> > Jun 13 03:37:59 fl31ca104ja0201 bash[391969]: t=2023-06-13T03:37:59+0000 lvl=info msg="Initializing HTTPServer" logger=server
> > Jun 13 03:37:59 fl31ca104ja0201 bash[391969]: t=2023-06-13T03:37:59+0000 lvl=info msg="Initializing BackendPluginManager" logger=server
> > Jun 13 03:37:59 fl31ca104ja0201 bash[391969]: t=2023-06-13T03:37:59+0000 lvl=info msg="Initializing PluginManager" logger=server
> > Jun 13 03:37:59 fl31ca104ja0201 bash[391969]: t=2023-06-13T03:37:59+0000 lvl=info msg="Starting plugin search" logger=plugins
> > Jun 13 03:37:59 fl31ca104ja0201 bash[391969]: t=2023-06-13T03:37:59+0000 lvl=info msg="Initializing HooksService" logger=server
> > Jun 13 03:37:59 fl31ca104ja0201 bash[391969]: t=2023-06-13T03:37:59+0000 lvl=info msg="Initializing OSSLicensingService" logger=server
> > Jun 13 03:37:59 fl31ca104ja0201 bash[391969]: t=2023-06-13T03:37:59+0000 lvl=info msg="Initializing InternalMetricsService" logger=server
> > Jun 13 03:37:59 fl31ca104ja0201 bash[391969]: t=2023-06-13T03:37:59+0000 lvl=info msg="Initializing RemoteCache" logger=server
> > Jun 13 03:37:59 fl31ca104ja0201 bash[391969]: t=2023-06-13T03:37:59+0000 lvl=info msg="Initializing RenderingService" logger=server
> > Jun 13 03:37:59 fl31ca104ja0201 bash[391969]: t=2023-06-13T03:37:59+0000 lvl=info msg="Initializing AlertEngine" logger=server
> > Jun 13 03:37:59 fl31ca104ja0201 bash[391969]: t=2023-06-13T03:37:59+0000 lvl=info msg="Initializing QuotaService" logger=server
> > Jun 13 03:37:59 fl31ca104ja0201 bash[391969]: t=2023-06-13T03:37:59+0000 lvl=info msg="Initializing ServerLockService" logger=server
> > Jun 13 03:37:59 fl31ca104ja0201 bash[391969]: t=2023-06-13T03:37:59+0000 lvl=info msg="Initializing UserAuthTokenService" logger=server
> > Jun 13 03:37:59 fl31ca104ja0201 bash[391969]: t=2023-06-13T03:37:59+0000 lvl=info msg="Initializing DatasourceCacheService" logger=server
> > Jun 13 03:37:59 fl31ca104ja0201 bash[391969]: t=2023-06-13T03:37:59+0000 lvl=info msg="Initializing LoginService" logger=server
> > Jun 13 03:37:59 fl31ca104ja0201 bash[391969]: t=2023-06-13T03:37:59+0000 lvl=info msg="Initializing SearchService" logger=server
> > Jun 13 03:37:59 fl31ca104ja0201 bash[391969]: t=2023-06-13T03:37:59+0000 lvl=info msg="Initializing TracingService" logger=server
> > Jun 13 03:37:59 fl31ca104ja0201 bash[391969]: t=2023-06-13T03:37:59+0000 lvl=info msg="Initializing UsageStatsService" logger=server
> > Jun 13 03:37:59 fl31ca104ja0201 bash[391969]: t=2023-06-13T03:37:59+0000 lvl=info msg="Initializing CleanUpService" logger=server
> > Jun 13 03:37:59 fl31ca104ja0201 bash[391969]: t=2023-06-13T03:37:59+0000 lvl=info msg="Initializing NotificationService" logger=server
> > Jun 13 03:37:59 fl31ca104ja0201 bash[391969]: t=2023-06-13T03:37:59+0000 lvl=info msg="Initializing provisioningServiceImpl" logger=server
> > Jun 13 03:37:59 fl31ca104ja0201 bash[391969]: t=2023-06-13T03:37:59+0000 lvl=warn msg="[Deprecated] the datasource provisioning config is outdated. please upgrade" logger=provisioning.datasources filename=/etc/grafana/provisioning/datasources/ceph-dashboard.yml
> >
> > This warning comes from the missing "apiVersion: 1" first-line entry in
> > /etc/grafana/provisioning/datasources/ceph-dashboard.yml created by
> > cephadm. If the file is modified to include the apiVersion line and the
> > Grafana service is restarted, the warning goes away.
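> > (The restart can be done via the orchestrator, e.g., using the daemon
> > name from the systemctl output above:
> > ceph orch daemon restart grafana.fl31ca104ja0201
> > )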
> >
> > Is this a known issue?
> >
> > Here is the content of the ceph-dashboard.yml produced by cephadm:
> > deleteDatasources:
> >   - name: 'Dashboard1'
> >     orgId: 1
> >
> >   - name: 'Loki'
> >     orgId: 2
> >
> > datasources:
> >   - name: 'Dashboard1'
> >     type: 'prometheus'
> >     access: 'proxy'
> >     orgId: 1
> >     url: 'http://fl31ca104ja0201.xxx.xxx.com:9095'
> >     basicAuth: false
> >     isDefault: true
> >     editable: false
> >
> >   - name: 'Loki'
> >     type: 'loki'
> >     access: 'proxy'
> >     orgId: 2
> >     url: ''
> >     basicAuth: false
> >     isDefault: true
> >     editable: false
> > --------------------------------------------------------------
> > Jun 13 03:37:59 fl31ca104ja0201 bash[391969]: t=2023-06-13T03:37:59+0000 lvl=info msg="inserting datasource from configuration " logger=provisioning.datasources name=Dashboard1
> > Jun 13 03:37:59 fl31ca104ja0201 bash[391969]: t=2023-06-13T03:37:59+0000 lvl=info msg="inserting datasource from configuration " logger=provisioning.datasources name=Loki
> > Jun 13 03:37:59 fl31ca104ja0201 bash[391969]: t=2023-06-13T03:37:59+0000 lvl=info msg="Backend rendering via phantomJS" logger=rendering renderer=phantomJS
> > Jun 13 03:37:59 fl31ca104ja0201 bash[391969]: t=2023-06-13T03:37:59+0000 lvl=warn msg="phantomJS is deprecated and will be removed in a future release. You should consider migrating from phantomJS to grafana-image-renderer plugin. Read more at https://grafana.com/docs/grafana/latest/administration/image_rendering/" logger=rendering renderer=phantomJS
> > Jun 13 03:37:59 fl31ca104ja0201 bash[391969]: t=2023-06-13T03:37:59+0000 lvl=info msg="Initializing Stream Manager"
> > Jun 13 03:37:59 fl31ca104ja0201 bash[391969]: t=2023-06-13T03:37:59+0000 lvl=info msg="HTTP Server Listen" logger=http.server address=[::]:3000 protocol=https subUrl= socket=
> >
> >
> > I also had to change a few other things to keep all the services
> > running. The last issue that I have not been able to resolve yet is that
> > the Ceph dashboard gives this error even though Grafana is running on
> > the same server. However, the Grafana dashboard cannot be accessed
> > without tunnelling.
> >
> > [screenshot attachment: image002.png]
>
>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



