Re: AssumeRoleWithWebIdentity in RGW with Azure AD

This is very helpful, I'll take a look at it.

Thanks,
Pritha

On Thu, Jul 11, 2024 at 8:04 PM Ryan Rempel <rgrempel@xxxxxx> wrote:

> Thanks!
>
> I took a crack at it myself, and have some work-in-progress here:
>
> https://github.com/cmu-rgrempel/ceph/pull/1
>
> Feel free to use any of that if you like it. It's working for me, but I've
> only tested it with Azure AD – I haven't tested the cases that it used to
> work for. (I believe it doesn't break them, but haven't tested).
>
> --
>
> Ryan Rempel
>
>
> ------------------------------
> *From:* Pritha Srivastava <prsrivas@xxxxxxxxxx>
> *Sent:* Monday, July 8, 2024 10:38 PM
>
> Hi Ryan,
>
> This appears to be a known issue and is tracked here:
> https://tracker.ceph.com/issues/54562. There is a workaround mentioned in
> the tracker that has worked and you can try that. Otherwise, I will be
> working on this 'invalid padding' problem very soon.
>
> Thanks,
> Pritha
>
> On Tue, Jul 9, 2024 at 1:16 AM Ryan Rempel <rgrempel@xxxxxx> wrote:
>
> I'm trying to set up the OIDC provider for RGW so that I can have roles
> that can be assumed by people logging into their regular Azure AD
> identities. The client I'm planning to use is Cyberduck – it seems like one
> of the few GUI S3 clients that manages the OIDC login process in a way that
> could work for relatively naive users.
>
> I've gotten a fair ways down the road. I've been able to configure
> Cyberduck so that it performs the login with Azure AD, gets an identity
> token, and then sends it to Ceph to engage with the
> AssumeRoleWithWebIdentity process. However, I then get an error, which
> shows up in the Ceph rgw logs like this:
>
> 2024-07-08T17:18:09.749+0000 7fb2d7845700  0 req 15967124976712370684
> 1.284013867s sts:assume_role_web_identity Signature validation failed: evp
> verify final failed: 0 error:0407008A:rsa
> routines:RSA_padding_check_PKCS1_type_1:invalid padding
>
> I turned the logging for rgw up to 20 to see if I could follow along to
> see how much of the process succeeds and learn more about what fails. I can
> then see logging messages from this file in the source code:
>
>
> https://github.com/ceph/ceph/blob/08d7ff952d78d1bbda04d5ff7e3db1e733301072/src/rgw/rgw_rest_sts.cc
>
> We get to WebTokenEngine::get_from_jwt, and it logs the JWT payload in a
> way that seems to be as expected. The logs then indicate that a request is
> sent to the /.well-known/openid-configuration endpoint that appears to be
> appropriate for the issuer of the JWT. The logs eventually indicate what
> looks like a successful and appropriate response to that. The logs then
> show that a request is sent to the jwks_uri that is indicated in the
> openid-configuration document. The response to that is logged, and it
> appears to be appropriate.
>
> We then get some logging starting with "Certificate is", so it looks like
> we're getting as far as WebTokenEngine::validate_signature. So, several
> things appear to have happened successfully – we've loaded the OIDC
> provider that corresponds to the iss, and we've found a client ID that
> corresponds to what I registered when I configured things. (This is why I
> say we appear to be a fair ways down the road – a lot of this is working).
>
> It looks as though what's happening in the code now is that it's iterating
> through the certificates given in the jwks_uri content. There are 6
> certificates listed, but the code only gets as far as the first one.
> Looking at the code, what appears to be happening is that, among the
> various certificates in the jwks_uri, it's finding the first one which
> matches a thumbprint registered with Ceph (that is, which I registered with
> Ceph). This must be succeeding (for the first certificate), because the
> "Signature validation failed" logging comes later. So, the code does verify
> that the thumbprint of the first certificate matches one of the thumbprints
> I registered with Ceph for this OIDC provider.
>
> We then get to a part of the code where it tries to verify the JWT using
> the certificate, with jwt::verify. Given what gets logged ("Signature
> validation failed: "), this must be throwing an exception.
>
> The thing I find surprising about this is that there really isn't any
> reason to think that the first certificate listed in the jwks_uri content
> is going to be the certificate used to sign the JWT. If I understand JWT
> correctly, it's appropriate to sign the JWT with any of the certificates
> listed in the jwks_uri content. Furthermore, the JWT header includes a
> reference to the kid, so it's possible for Ceph to know exactly which
> certificate the JWT purports to be signed by. And, Ceph knows that there
> might be multiple thumbprints, because we can register up to 5. So, the logic of
> trying the first valid certificate in x5c and then stopping if it fails
> seems broken, actually.
>
> I suppose what I could do as a workaround is try to figure out whether
> Azure AD is consistently using the same kid to sign the JWTs for me, and
> then only register that thumbprint with Ceph. Then, Ceph would actually
> choose the correct certificate (as the others wouldn't match a thumbprint I
> registered). I may try this – in part, just to verify what I think is
> happening. But it would be awfully fragile – I don't believe there is any
> requirement in JWT to just use one of the certificates listed in x5c.
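
To check which kid Azure AD is actually signing with, and which thumbprint would go with it, something like the following Python sketch might help. It is only an illustration: the function names are hypothetical, and it assumes the thumbprint is the uppercase hex SHA-1 digest of the DER certificate carried in the key's x5c entry (the convention AWS IAM uses for OIDC thumbprints; whether RGW computes it identically is an assumption here). It reads the JWT header without verifying anything.

```python
import base64
import hashlib
import json
from typing import List


def b64url_decode(segment: str) -> bytes:
    """Decode a base64url JWT segment, restoring the stripped '=' padding."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def jwt_kid(token: str) -> str:
    """Return the 'kid' from the JWT header (no signature check is done)."""
    header = json.loads(b64url_decode(token.split(".")[0]))
    return header["kid"]


def x5c_thumbprints(jwks: dict, kid: str) -> List[str]:
    """SHA-1 hex digests of each x5c certificate of the key with this kid."""
    for key in jwks.get("keys", []):
        if key.get("kid") == kid:
            # x5c entries are standard (not URL-safe) base64 of the DER bytes.
            return [
                hashlib.sha1(base64.b64decode(cert)).hexdigest().upper()
                for cert in key.get("x5c", [])
            ]
    return []  # no key with that kid in the JWKS document
```

Here `jwks` is the parsed JSON from the jwks_uri, and `token` is the identity token the client sends. Running `x5c_thumbprints(jwks, jwt_kid(token))` against a few tokens would show whether the signing key stays the same over time.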
>
> An alternative would be to try rewriting the code to apply a different
> kind of logic. The way it ought to work (it seems to me) is something like
> this:
>
>
>   *
> Get the openid_configuration, and get the jwks stuff from the jwks_uri
> (which Ceph does already).
>   *
> Look at the header of the JWT to see which kid it purports to be signed by.
>   *
> Find the certificate that corresponds to that kid (from the jwks_uri
> content)
>   *
> Validate the JWT with that certificate.
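
The key-selection part of the steps above could be sketched roughly as follows. This is Python rather than the C++ used in rgw_rest_sts.cc, it is not the code from the tracker or from any pull request, and `select_jwk` is a hypothetical name. It only picks the key; the signature check itself (and any thumbprint check) would still have to follow, using that one key.

```python
import base64
import json


def decode_segment(segment: str) -> bytes:
    """Decode a base64url JWT segment, restoring the stripped '=' padding."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def select_jwk(token: str, jwks: dict) -> dict:
    """Return the JWKS entry whose kid matches the kid in the JWT header."""
    header = json.loads(decode_segment(token.split(".")[0]))
    kid = header.get("kid")
    if kid is None:
        raise ValueError("JWT header carries no kid")
    for key in jwks.get("keys", []):
        if key.get("kid") == kid:
            return key
    # Fail clearly instead of falling back to the first listed key.
    raise LookupError(f"no key with kid {kid!r} in the JWKS document")
```

Raising on a missing kid, rather than falling back to the first key in the list, is a deliberate choice in this sketch: it makes a key mismatch show up as a lookup error rather than as an "invalid padding" failure from the verifier.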
>
> That ought to work, at least given what I'm seeing. (But, I'm not a JWT
> expert, so I don't know whether there is something unusual in how Azure AD
> generates JWTs and handles the jwks_uri content).
>
> Anyway, I'm curious whether anyone else has been trying to get this to
> work with Azure AD, and whether they have run into similar problems. And,
> of course, whether I appear to be misunderstanding anything about how this
> is supposed to work.
>
>
> Ryan Rempel
>
> Director of Information Technology
>
> Canadian Mennonite University
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>
>