Re: Fw: Incompatibilities (implicit_tenants & barbican) with Openstack after migrating from Ceph Luminous to Nautilus.


 



Dear Casey



We cherry-picked your backports of the multi-tenant and barbican patches (plus one for keystone caching) onto rgw 14.2.8:

    Merge pull request #26095 from bbc/s3secretcache  
    rgw: Added caching for S3 credentials retrieved from keystone
    (cherry picked from commit affb7d396f76273e885cfdbcd363c1882496726c)

    get barbican secret key request return error code   
    Signed-off-by: Richard Bai(白学余) <baixueyu@xxxxxxxxxx>
    (cherry picked from commit fbe2be57474df43996dd45bf04d1a1137a02c729)

    rgw: making implicit_tenants backwards compatible.
    Signed-off-by: Marcus Watts <mwatts@xxxxxxxxxx>
    (cherry picked from commit 3ba7be8d1ac7ee43e69eebb58263cd080cca1d38)
    
After building this new rgw 14.2.8, we tested it successfully on two stage Ceph clusters:
- one with all Ceph daemons at 14.2.5
- one with all Ceph daemons at 12.2.12

We tested the barbican and keystone integration, put, get and list operations, moving buckets between tenants, and the flat namespace, all without any issue.
Again, a big thanks for your help and PR!



Our understanding was that rgw is a client of the librados/RADOS layers (managed by the OSDs and MONs, with a clean separation of layers)
and that a newer rgw daemon would work with older OSDs and MONs, possibly with some newer rgw features unavailable.

But I was told on the mailing list that
"The reason is that many parts of RGW are implemented in the OSD themselves, so you can't run a new RGW against an old OSD."
cf. https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/thread/45VKDHFLUKG53HSSYUPA7BJKTRZFOYAB/

Does this mean that running a newer rgw against older OSDs could lead to data corruption or loss?
Or is it just about newer features being unavailable?
(Again, we encountered no issues during our tests.)

Doesn't rgw query 'ceph features' to adapt itself to the available feature set?
Is RGW really partly implemented in the OSD code? Or is it just that some RGW features depend on OSD features?

Thank you for your insights!


Cheers
Francois


________________________________________
From: Casey Bodley <cbodley@xxxxxxxxxx>
Sent: Thursday, March 5, 2020 3:57 PM
To: Scheurer François; ceph-users@xxxxxxx
Cc: Engelmann Florian; Rafael Weingärtner
Subject: Re: Fw: Incompatibilities (implicit_tenants & barbican) with Openstack after migrating from Ceph Luminous to Nautilus.

On 3/3/20 2:33 PM, Scheurer François wrote:
>
> /(resending to the new maillist)/
>
>
> Dear Casey, Dear All,
>
>
>
> We tested the migration from Luminous to Nautilus and noticed two
> regressions breaking the RGW integration in Openstack:
>
>
>
>
>
>
> 1)  the following config parameter does not work on Nautilus but is
> valid on Luminous and on master:
>         rgw_keystone_implicit_tenants = swift
>
>         In the log: parse error setting
> 'rgw_keystone_implicit_tenants' to 'swift' (Expected option value to
> be integer, got 'swift')
>
>     This parameter is important to make RGW work for both S3 and Swift.
>     Setting it to false breaks swift/openstack, and setting it to true
> makes S3 incompatible with dns-style bucket names (with shared or
> public access).
>     Please note that path-style bucket names are deprecated by AWS and
> most clients only support dns-style...
>
>     Ref.:
> https://tracker.ceph.com/issues/24348
> <https://tracker.ceph.com/issues/24348>
> https://github.com/ceph/ceph/commit/3ba7be8d1ac7ee43e69eebb58263cd080cca1d38
>
>
Ok, wow. It looks like this commit was backported to luminous in
https://github.com/ceph/ceph/pull/22363 over a year before it actually
merged to master as part of https://github.com/ceph/ceph/pull/28813, so
it missed the mimic and nautilus releases. I prepared those backports in
https://tracker.ceph.com/issues/44445 and
https://tracker.ceph.com/issues/44444.


>
>
>
>
> 2) the server-side encryption (SSE-KMS) is broken on Nautilus:
>
>     to reproduce the issue:
>         s3cmd --access_key $ACCESSKEY --secret_key $SECRETKEY
> --host-bucket "%(bucket)s.$ENDPOINT" --host "$ENDPOINT"
> --region="$REGION" --signature-v2 --no-preserve --no-ssl
> --server-side-encryption --server-side-encryption-kms-id ${SECRET##*/}
> put helloenc.txt s3://testenc/
>
>         output:
>             upload: 'helloenc.txt' -> 's3://testenc/helloenc.txt'  [1
> of 1]
>             9 of 9   100% in    0s    37.50 B/s  done
>             ERROR: S3 error: 403 (AccessDenied): Failed to retrieve
> the actual key, kms-keyid: cd0903db-c613-49be-96d9-165c02544bc7
>         rgw log: see below
>
>
>     TLDR: after investigating, I found that radosgw was actually
> getting the barbican secret correctly but the HTTP status code (=200)
> validation was failing because of a bug in Nautilus.
>
>     My understanding is as follows (please correct me):
>
>         The bug is in src/rgw/rgw_http_client.cc.
>
>         Since Nautilus, HTTP status codes are converted into error codes
> (200 becomes 0) during request processing.
>         This happens in RGWHTTPManager::reqs_thread_entry(), which
> centralizes the processing of (curl) HTTP requests with multi-threading.
>
>         This is fine, but the member variable http_status of the class
> RGWHTTPClient is not updated with the resulting HTTP status code, so the
> variable keeps its initial value of 0.
>
>         Then in src/rgw/rgw_crypt.cc the logic still verifies that
> http_status is in the range [200,299], and this check fails...
>
>         I wrote the following one-line bugfix for
> src/rgw/rgw_http_client.cc:
>
>             diff --git a/src/rgw/rgw_http_client.cc b/src/rgw/rgw_http_client.cc
>             index d0f0baead6..7c115293ad 100644
>             --- a/src/rgw/rgw_http_client.cc
>             +++ b/src/rgw/rgw_http_client.cc
>             @@ -1146,6 +1146,7 @@ void *RGWHTTPManager::reqs_thread_entry()
>                      status = -EAGAIN;
>                    }
>                    int id = req_data->id;
>             +      req_data->client->http_status = http_status;
>                    finish_request(req_data, status);
>                    switch (result) {
>                      case CURLE_OK:
>
>         With this change, s3cmd works fine with KMS server-side encryption.
>
>
>
>
Thanks. This one was also fixed on master in
https://github.com/ceph/ceph/pull/29639 but didn't get backports. I
opened https://tracker.ceph.com/issues/44443 to track those for mimic
and nautilus.

>
>
> Questions:
>
>   *     Could someone please write a fix for the regression of 1) and
>     make a PR ?
>   *     Could somebody also make a PR for 2?
>
>
>
> Thank you for your help. :-)
>
>
>
> Cheers
> Francois Scheurer
>
>
> rgw log:
>         export CLUSTER=ceph; /home/local/ceph/build/bin/radosgw -f
> --cluster ${CLUSTER} --name client.rgw.$(hostname) --setuser ceph
> --setgroup ceph &
>         tail -fn0 /var/log/ceph/ceph-client.rgw.ewos1-osd1-stage.log |
> less -IS
>             2020-02-26 16:32:59.208 7fc1f1c54700 20 Getting KMS
> encryption key for key=cd0903db-c613-49be-96d9-165c02544bc7
>             2020-02-26 16:32:59.208 7fc1f1c54700 20 Requesting secret
> from barbican
> url=http://keystone.service.stage.i.ewcs.ch:5000/v3/auth/tokens
>             2020-02-26 16:32:59.208 7fc1f1c54700 20 ewdebug:
> RGWHTTPClient::process: http_status: 0
>             2020-02-26 16:32:59.208 7fc1f1c54700 20 ewdebug:
> RGWHTTP::process
>             2020-02-26 16:32:59.208 7fc1f1c54700 20 ewdebug: RGWHTTP::send
>             2020-02-26 16:32:59.208 7fc1f1c54700 20 sending request to
> http://keystone.service.stage.i.ewcs.ch:5000/v3/auth/tokens
>             2020-02-26 16:32:59.208 7fc1f1c54700 20 ssl verification
> is set to off
>             2020-02-26 16:32:59.208 7fc1f1c54700 20 ewdebug:
> RGWHTTPManager::add_request: client->init_request(req_data): 0
>             2020-02-26 16:32:59.208 7fc1f1c54700 20 register_request
> mgr=0x56374b865540 req_data->id=4, curl_handle=0x56374c77c4a0
>             2020-02-26 16:32:59.208 7fc1f1c54700 20 ewdebug:
> RGWHTTPManager::signal_thread(): write(thread_pipe[1], (void *)&buf,
> sizeof(buf)): 4
>             2020-02-26 16:32:59.208 7fc1f1c54700 20 ewdebug:
> RGWHTTPManager::add_request: signal_thread(): 0
>             2020-02-26 16:32:59.208 7fc1f1c54700 20 ewdebug:
> RGWHTTP::send: rgw_http_manager->add_request(req): 0
>             2020-02-26 16:32:59.208 7fc1f1c54700 20 ewdebug:
> RGWHTTP::process: send(req): 0
>             2020-02-26 16:32:59.208 7fc1f1c54700 20 ewdebug: struct
> rgw_http_req_data : public RefCountedObject : int wait() : ret: 0
>             2020-02-26 16:32:59.208 7fc2184a1700 20 link_request
> req_data=0x56374c96a240 req_data->id=4, curl_handle=0x56374c77c4a0
>             2020-02-26 16:32:59.608 7fc2184a1700 20 ewdebug:
> RGWHTTPManager::reqs_thread_entry: http_status: 201
>             2020-02-26 16:32:59.608 7fc2184a1700 20 ewdebug:
> RGWHTTPManager::reqs_thread_entry: rgw_http_error_to_errno(http_status): 0
>             2020-02-26 16:32:59.608 7fc2184a1700 20 ewdebug:
> RGWHTTPManager::reqs_thread_entry: finish_request(req_data, status):
> status: 0
>             2020-02-26 16:32:59.608 7fc2184a1700 20 ewdebug: struct
> rgw_http_req_data : public RefCountedObject : void finish(int r) : ret: 0
>             2020-02-26 16:32:59.652 7fc1f1c54700  5 ewdebug:
> request_key_from_barbican: Accept application/octet-stream
> X-Auth-Token gAAAAABeVo-xxx
>             2020-02-26 16:32:59.652 7fc1f1c54700 20 ewdebug:
> RGWHTTPClient::process: http_status: 0
>             2020-02-26 16:32:59.652 7fc1f1c54700 20 ewdebug:
> RGWHTTP::process
>             2020-02-26 16:32:59.652 7fc1f1c54700 20 ewdebug: RGWHTTP::send
>             2020-02-26 16:32:59.652 7fc1f1c54700 20 sending request to
> http://barbican.service.stage.i.ewcs.ch:9311/v1/secrets/cd0903db-c613-49be-96d9-165c02544bc7
>             2020-02-26 16:32:59.652 7fc1f1c54700 20 ewdebug:
> RGWHTTPManager::add_request: client->init_request(req_data): 0
>             2020-02-26 16:32:59.652 7fc1f1c54700 20 register_request
> mgr=0x56374b865540 req_data->id=5, curl_handle=0x56374c77c4a0
>             2020-02-26 16:32:59.652 7fc1f1c54700 20 ewdebug:
> RGWHTTPManager::signal_thread(): write(thread_pipe[1], (void *)&buf,
> sizeof(buf)): 4
>             2020-02-26 16:32:59.652 7fc1f1c54700 20 ewdebug:
> RGWHTTPManager::add_request: signal_thread(): 0
>             2020-02-26 16:32:59.652 7fc1f1c54700 20 ewdebug:
> RGWHTTP::send: rgw_http_manager->add_request(req): 0
>             2020-02-26 16:32:59.652 7fc1f1c54700 20 ewdebug:
> RGWHTTP::process: send(req): 0
>             2020-02-26 16:32:59.652 7fc1f1c54700 20 ewdebug: struct
> rgw_http_req_data : public RefCountedObject : int wait() : ret: 0
>             2020-02-26 16:32:59.652 7fc2184a1700 20 link_request
> req_data=0x56374c96a240 req_data->id=5, curl_handle=0x56374c77c4a0
>          => 2020-02-26 16:32:59.752 7fc2184a1700 20 ewdebug:
> RGWHTTPManager::reqs_thread_entry: http_status: 200
>             2020-02-26 16:32:59.752 7fc2184a1700 20 ewdebug:
> RGWHTTPManager::reqs_thread_entry: rgw_http_error_to_errno(http_status): 0
>             2020-02-26 16:32:59.752 7fc2184a1700 20 ewdebug:
> RGWHTTPManager::reqs_thread_entry: finish_request(req_data, status):
> status: 0
>             2020-02-26 16:32:59.752 7fc2184a1700 20 ewdebug: struct
> rgw_http_req_data : public RefCountedObject : void finish(int r) : ret: 0
>             2020-02-26 16:32:59.752 7fc1f1c54700  5 ewdebug:
> request_key_from_barbican: secret_req.process: 0
>          => 2020-02-26 16:32:59.752 7fc1f1c54700  5 ewdebug:
> request_key_from_barbican: secret_req.get_http_status: 0
>             2020-02-26 16:32:59.752 7fc1f1c54700  5 ewdebug:
> request_key_from_barbican: secret_req.get_http_status not in [200,299]
> range!
>             2020-02-26 16:32:59.752 7fc1f1c54700  5 Failed to retrieve
> secret from barbican:cd0903db-c613-49be-96d9-165c02544bc7
>             2020-02-26 16:32:59.752 7fc1f1c54700  5 ERROR: failed to
> retrieve actual key from key_id: cd0903db-c613-49be-96d9-165c02544bc7
>             2020-02-26 16:32:59.752 7fc1f1c54700  2 req 1 1.092s
> s3:put_obj completing
>             2020-02-26 16:32:59.752 7fc1f1c54700  2 req 1 1.092s
> s3:put_obj op status=-13
>             2020-02-26 16:32:59.752 7fc1f1c54700  2 req 1 1.092s
> s3:put_obj http status=403
>             2020-02-26 16:32:59.752 7fc1f1c54700  1 ====== req done
> req=0x56374c9808d0 op status=-13 http_status=403 latency=1.092s ======
>
>         => we see that http_status is correct (200) but the variable
> secret_req.get_http_status (member of class RGWHTTPClient) is
> incorrect (0 instead of 200)
>
>
>
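
The failure mode analysed in the quoted message can be reproduced in isolation. The following is a minimal, self-contained C++ sketch and not the actual RGW code: `Client`, `http_error_to_errno`, `complete_request` and `secret_fetch_ok` are hypothetical stand-ins for RGWHTTPClient, rgw_http_error_to_errno(), the completion step in RGWHTTPManager::reqs_thread_entry() and the range check in src/rgw/rgw_crypt.cc. The errno mapping is deliberately simplified and the `propagate` flag exists only to toggle the one-line fix.

```cpp
#include <cassert>
#include <cerrno>

// Hypothetical stand-in for RGWHTTPClient: only the status field matters here.
struct Client {
  long http_status = 0;  // initial value, as described in the thread
  long get_http_status() const { return http_status; }
};

// Simplified mapping in the spirit of rgw_http_error_to_errno():
// any 2xx code becomes 0, everything else a negative errno.
// The real function distinguishes more cases; this is an assumption.
int http_error_to_errno(long http_status) {
  if (http_status >= 200 && http_status <= 299) return 0;
  if (http_status == 403) return -EACCES;
  if (http_status == 404) return -ENOENT;
  return -EIO;
}

// Models the completion step of the request thread.
// propagate == false: behaviour described for unpatched Nautilus.
// propagate == true:  behaviour with the one-line fix applied.
int complete_request(Client& client, long http_status, bool propagate) {
  int status = http_error_to_errno(http_status);
  if (propagate) {
    client.http_status = http_status;  // the line added by the fix
  }
  return status;
}

// Models the caller-side validation: the request must have succeeded
// AND the stored HTTP status must be in [200,299].
bool secret_fetch_ok(const Client& client, int status) {
  if (status < 0) return false;
  long code = client.get_http_status();
  return code >= 200 && code <= 299;
}

// Small demonstration of both behaviours for an HTTP 200 response.
void demo() {
  Client unpatched;
  int r1 = complete_request(unpatched, 200, false);
  assert(r1 == 0);                          // request itself succeeded
  assert(unpatched.get_http_status() == 0); // but the status was never stored
  assert(!secret_fetch_ok(unpatched, r1));  // so the range check rejects it

  Client patched;
  int r2 = complete_request(patched, 200, true);
  assert(secret_fetch_ok(patched, r2));     // status stored, check passes
}
```

Calling `demo()` from a `main()` exercises both paths. In the unpatched case the request completes with status 0 while `get_http_status()` still returns 0, which matches the `secret_req.get_http_status: 0` line in the log above.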
