On 17/05/2021 14:41, Григорий Сморкалов wrote:
Hello. I am trying to debug some ssl related code and I need some help.
We have an HTTP client based on libuv and libopenssl for TLS. It is an
internal C++ library with its own TCP wrapper around libuv and HTTP
parser. It works fine and our servers make millions of HTTPS requests to
social networks with it. If it is one connection per request
(Connection: Close) there is no problem at all. But sometimes
connections with keep-alive receive strange ssl errors:
error:04067084:rsa routines:rsa_ossl_public_decrypt:data too large for modulus
or
error:04067072:rsa routines:rsa_ossl_public_decrypt:padding check failed
It is a really rare event, once per million I think. The error is
returned from SSL_read when new data comes from the server. It is never
the first response, usually there are more than ten requests/responses
in the connection before the error.
If I understand you correctly then you are seeing this at some point
*after* the initial handshake and the connection has been running for a
while.
If so that is a very strange error indeed. These are RSA errors. But
once the initial handshake is complete there should be no reason for
libssl to be performing RSA calls. RSA is *only* used during the
handshake and not during application data transfer. Unless, that is, there
is a renegotiation handshake happening (this only applies to TLSv1.2 or
below)... but if so, that would be fairly clear in the Wireshark logs.
Does your application do anything with libcrypto directly? Or does it
only use libssl?
Is your client application multi-threaded?
I'm wondering whether this error is actually a stale error left in the
queue from some earlier problem on the same thread.
You could try forcefully clearing any stale errors (ERR_clear_error())
before any SSL_read() calls and see if the problem goes away.
We have a tcpdump of such connections and keylog made with
SSL_CTX_set_keylog_callback. Wireshark opens this dump and decrypts it
normally using the keylog as the pre-master secret key file. The last packet
produces an error in our HTTP client, but in Wireshark it is fine and
contains a normal HTTP response with 200 OK. There is no sign of any error
or data corruption. That makes me think the data is fine and the problem is
in my use of OpenSSL.
I want to reproduce this situation by replaying the tcpdump. That means
running our server (actually only the HTTP client part) and feeding it the
captured data. It is no problem to make a server that sends exactly the same
data from the tcpdump, nor to make exactly the same HTTP request. But I
need to use the pre-master key from the keylog on the client side. I
cannot find any function that sets keys on an SSL_CTX* or SSL*. Is there one?
I tried to build my own libopenssl with constant keys. I put
memcpy(s->session->master_key, overriden_secret, 48) in
ssl_generate_master_secret and tls13_generate_secret. Also
memcpy(s->s3->client_random, overriden_random, 32); in
tls_construct_client_hello and tls_early_post_process_client_hello. It
doesn't work and produces an SSL error during the handshake phase:
error:1416C095:SSL routines:tls_process_finished:digest check failed.
The Client Hello produced by this patched libopenssl is always different,
which means I haven't replaced all the keys. There is something in the
s->tmp structure that I don't understand well enough to replace all its
usages and values.
This is very much a non-trivial task. OpenSSL has no support for this
kind of thing at the moment and it would be difficult to add it. I don't
think the key logging logs the ephemeral keys that are used as input to
the master secret generation. So the ClientHello key_share (assuming
TLSv1.3) is going to be different regardless.
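To make the point about the keylog concrete, here is a minimal sketch of a parser for one line of the NSS key log format, which is what the keylog callback produces. Each line carries only a label, the client random, and a secret (the master secret for TLSv1.2 and below, or a traffic or exporter secret for TLSv1.3). There is no field for an ephemeral private key, which is why the logged data alone cannot drive an identical handshake. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for one parsed keylog line. */
struct keylog_entry {
    char label[48];            /* e.g. "CLIENT_RANDOM" */
    unsigned char random[32];  /* ClientHello random */
    unsigned char secret[64];  /* master secret or TLSv1.3 secret */
    size_t secret_len;
};

/* Decode one hex digit; returns -1 if c is not a hex digit. */
static int hex_val(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Decode a hex string into out; returns the byte count or -1 on error. */
static int hex_decode(const char *hex, unsigned char *out, size_t out_cap)
{
    size_t n = strlen(hex);
    size_t i;

    if (n == 0 || n % 2 != 0 || n / 2 > out_cap)
        return -1;
    for (i = 0; i < n / 2; i++) {
        int hi = hex_val(hex[2 * i]);
        int lo = hex_val(hex[2 * i + 1]);
        if (hi < 0 || lo < 0)
            return -1;
        out[i] = (unsigned char)((hi << 4) | lo);
    }
    return (int)(n / 2);
}

/* Parse "<LABEL> <client_random_hex> <secret_hex>". Returns 0 on success. */
int parse_keylog_line(const char *line, struct keylog_entry *e)
{
    char rand_hex[130];
    char secret_hex[130];
    int n;

    assert(line != NULL && e != NULL);
    if (sscanf(line, "%47s %129s %129s", e->label, rand_hex, secret_hex) != 3)
        return -1;
    if (hex_decode(rand_hex, e->random, sizeof(e->random)) != 32)
        return -1;
    n = hex_decode(secret_hex, e->secret, sizeof(e->secret));
    if (n <= 0)
        return -1;
    e->secret_len = (size_t)n;
    return 0;
}
```

Every field this parser recovers is an output of the key schedule or a public handshake value. The inputs that would have to match for a byte-identical ClientHello are not in the file.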
You could conceivably hack the finished check so that it passes
regardless, if you still end up with the right master secret, but
not everything is necessarily going to be the *same* as in the initial
failing run. If my theory about a stale error on the queue is right, then
even if you got everything else right the replay might still not show the
problem, because the stale error may be related to something else that
happened on the same thread.
Matt
Is there a simpler way?
Without reproducing it, it is practically impossible to find the bug. Even if
it does not reproduce, I'll get some information. Maybe it is undefined
behavior (UB) in a different place.
I've asked the same question on stackoverflow, so you can answer there
if it is easier or better for you:
https://stackoverflow.com/questions/67570255/how-to-replay-encrypted-traffic-with-libopenssl
Thank you!