17.2.6 Dashboard/RGW Signature Mismatch

Hi

I have 3 Ceph clusters, all configured similarly, which have been happy for some months on 17.2.5:

1. A test cluster
2. A small production cluster
3. A larger production cluster

All run Debian 11 and were installed from packages, with no cephadm.

I upgraded (1) to 17.2.6 without any problems at all. In particular the Object Gateway sections of the dashboard work as usual.

I then upgraded (2). Nothing seemed amiss, and everything seems to work except... when I try to access the Object Gateway sections of the dashboard I always get:


     *The Object Gateway Service is not configured*


       Error connecting to Object Gateway: RGW REST API failed request
       with status code 403
       (b'{"Code":"SignatureDoesNotMatch","RequestId":"tx0000022ba920e82ac4a9c-0064381'
       b'934-10e73385-default","HostId":"10e73385-default-default"}')

(Just the RequestId changes each time). Before the upgrade it worked just fine.

Other info:

 * RGW requests using awscli and rclone all work with normal RGW
   accounts. Only the dashboard appears to be broken.
 * There is a single zonegroup, with no multisite or replication.
 * "radosgw-admin user info --uid=rgwadmin" gives the correct output
   with the right access_key & secret_key. The other fields are as in (1).
 * "ceph dashboard get-rgw-api-access-key" and
   "ceph dashboard get-rgw-api-secret-key" both give the right values.
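
The comparison in the last two bullets can also be done mechanically rather than by eye. This is a minimal sketch, assuming the JSON from "radosgw-admin user info" has a top-level "keys" list whose entries carry "user", "access_key" and "secret_key", and that the dashboard commands print the bare key. The helper names rgw_user_keys and credentials_match are hypothetical, not part of Ceph. The comparison is deliberately exact, so stray whitespace in a stored key shows up as a mismatch.

```python
import json
from typing import Tuple


def rgw_user_keys(user_info_json: str, uid: str) -> Tuple[str, str]:
    """Return (access_key, secret_key) for uid from `radosgw-admin user info` JSON."""
    info = json.loads(user_info_json)
    for key in info.get("keys", []):
        if key.get("user") == uid:
            return key["access_key"], key["secret_key"]
    raise LookupError(f"no key entry found for user {uid!r}")


def credentials_match(user_info_json: str, uid: str,
                      dashboard_access: str, dashboard_secret: str) -> bool:
    """Compare the RGW user's keys with the dashboard's stored keys, byte for byte."""
    # No stripping here on purpose: a trailing space or newline inside a
    # stored key should be reported as a difference.
    return rgw_user_keys(user_info_json, uid) == (dashboard_access, dashboard_secret)
```

When feeding this from the command output, remove only the single newline that the CLI appends, and pass everything else through unchanged.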

The RGW logs from (2), where the dashboard fails, show:

2023-04-13T16:36:28.720+0100 7fcc7966a700  1 ====== starting new request req=0x7fcd88c10720 =====
2023-04-13T16:36:28.720+0100 7fcc80e79700  1 req 8090309398268968541 0.000000000s op->ERRORHANDLER: err_no=-2027 new_err_no=-2027
2023-04-13T16:36:28.724+0100 7fcc80e79700  1 ====== req done req=0x7fcd88c10720 op status=0 http_status=403 latency=0.003999980s ======
2023-04-13T16:36:28.724+0100 7fcc80e79700  1 beast: 0x7fcd88c10720: 192.168.xx.xx - - [13/Apr/2023:16:36:28.720 +0100] "GET /admin/metadata/user?myself HTTP/1.1" 403 134 - "python-requests/2.25.1" - latency=0.003999980s

(Note this does not have rgwadmin as the user, and is always the same URL)


By contrast, the RGW logs from (1), which works, show entries like:

2023-04-13T15:44:19.396+0000 7f8478da1700  1 ====== starting new request req=0x7f86284f5720 =====
2023-04-13T15:44:19.412+0000 7f8478da1700  1 ====== req done req=0x7f86284f5720 op status=0 http_status=200 latency=0.016000060s ======
2023-04-13T15:44:19.412+0000 7f8478da1700  1 beast: 0x7f86284f5720: 10.xx.xx.xx - rgwadmin [13/Apr/2023:15:44:19.396 +0000] "GET /admin/realm?list HTTP/1.1" 200 31 - "python-requests/2.25.1" - latency=0.016000060s

(Note this has rgwadmin as the user, and various URLs)
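
The two things that differ between these access-log lines (the authenticated user and the request path) can be pulled out of a larger log automatically. This is a minimal sketch, assuming the beast line layout seen in the two excerpts above. The regular expression and the parse_beast_line helper are illustrative only and not part of Ceph.

```python
import re
from typing import NamedTuple, Optional

# Matches the access-log part of an RGW "beast:" line as it appears in the
# excerpts above: client, "-", user, [timestamp], "METHOD path PROTO", status.
BEAST_RE = re.compile(
    r'beast: \S+: (?P<client>\S+) - (?P<user>\S+) \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) '
)


class BeastEntry(NamedTuple):
    client: str
    user: Optional[str]  # None when the log shows "-" (no authenticated user)
    method: str
    path: str
    status: int


def parse_beast_line(line: str) -> Optional[BeastEntry]:
    """Return the parsed fields of a beast access-log line, or None if it is not one."""
    m = BEAST_RE.search(line)
    if m is None:
        return None
    user = m.group("user")
    return BeastEntry(
        client=m.group("client"),
        user=None if user == "-" else user,
        method=m.group("method"),
        path=m.group("path"),
        status=int(m.group("status")),
    )
```

Filtering a log for entries where user is None and status is 403 would list every request that RGW rejected before it could identify the caller.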

The only thing I can see in the release notes that looks even vaguely related is https://github.com/ceph/ceph/pull/47547, but it doesn't seem likely to be the cause.

I am really stumped on this, with no idea what has gone wrong on (2), and what the difference is between (1) and (2). I'm not going to touch (3) until I have resolved this.

Grateful for any help...

And thanks for all the good work.

Regards, Chris


