Re: Securing Ceph with TLS

Thank you for your interest. We are in the process of making the code
available. I will announce it here when it is ready.

Regards,
Kadir

On Wed, Mar 7, 2018 at 6:39 AM, Junwang Zhao <zhjwpku@xxxxxxxxx> wrote:
> Hi Kadir,
>
> I think this is something that I once wanted to implement during my
> GSoC project: On-the-wire encryption, but I am not capable of finishing
> that, so I am really curious to see your code, can you share a branch?
>
> Regards,
> Zhao
>
> On Wed, Mar 7, 2018 at 10:12 AM, Kadir Ozdemir <kozdemir@xxxxxxxxxxxxxx> wrote:
>> Hi All,
>>
>>
>> We cannot deploy Ceph in our data centers without encrypting its
>> network traffic. We have decided to implement SSL/TLS within the
>> Messenger layer using the OpenSSL library on our jewel branch.
>> Although our implementation has not been completed, we do not see any
>> obstacles to completing and deploying it. I would like to give an
>> overview, sketch the design in the rest of this email, and get your
>> feedback. If the community is interested in merging this effort
>> upstream, we will be very happy to collaborate and contribute.
>>
>> Two main structures of the OpenSSL library are SSL Context and SSL. An
>> SSL Context object holds certificates, a private key, and options
>> regarding the TLS protocol and algorithms. An SSL Context object is
>> used to provide the context to create SSL objects which represent SSL
>> sessions. SSL objects are responsible for encryption, the session
>> handshake, and renegotiation, among other things.
>>
>> Ceph Messengers have implemented somewhat sophisticated control logic
>> over the socket layer, and require a different programming model than
>> the simple way of using OpenSSL. OpenSSL has a rich set of APIs at
>> different abstraction levels. In order to decouple the TLS operations
>> from the underlying I/O layer (that is, in our case the socket layer),
>> OpenSSL provides an abstraction for handling input and output at the
>> encrypted side of TLS, which is referred to as a BIO (Basic Input Output).
>> We will be using a pair of (i.e., input and output) memory BIOs for
>> buffering encrypted data to be received from or sent to the socket
>> layer. BIOs allow us to preserve the implementation of the existing
>> socket handling logic within the Ceph networking layer.
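
A minimal sketch of the memory-BIO pair described above. It uses Python's
ssl module (a wrapper over OpenSSL) instead of the C API the design targets,
purely so it runs without extra setup. The host name "osd.example" and the
variable names are illustrative assumptions, and no handshake is completed
because no peer is attached.

```python
import ssl

# Client-side context; this sketch never finishes a handshake, so no
# certificates need to be loaded.
ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)

incoming = ssl.MemoryBIO()  # encrypted bytes received from the socket go in here
outgoing = ssl.MemoryBIO()  # encrypted bytes destined for the socket come out here

# "osd.example" is a placeholder peer name, not a real host.
tls = ctx.wrap_bio(incoming, outgoing, server_hostname="osd.example")

try:
    tls.do_handshake()
    waiting_for_peer = False
except ssl.SSLWantReadError:
    # The TLS engine wrote its first handshake message to the output BIO
    # and now needs encrypted input from the peer.
    waiting_for_peer = True

# The socket layer would send these bytes; the TLS engine never touches a socket.
client_hello = outgoing.read()
```

The point of the arrangement is visible here: the TLS object only reads and
writes memory buffers, so the existing socket code stays in charge of all I/O.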
>>
>> When TLS is enabled on top of a TCP connection, TLS inserts handshake
>> and renegotiation messages into the TCP stream and encapsulates the
>> plaintext stream of bytes transferred by an application into TLS
>> records. An application in our context is a Ceph Messenger. The TLS
>> handshake happens at the beginning of a TLS session as part of
>> establishing the session. Before any application data can be sent or
>> received over a TLS session, the TLS handshake has to be completed.
>> During the handshake, the endpoints of the session are verified using PKI
>> certificates and an encryption key is established, among other things.
>>
>> Assuming that the TLS handshake is completed (after the TCP connection
>> is established), we can now trace the send path. The send path starts
>> with calling SSL_write() which takes the handle for the SSL object,
>> the address of a data buffer, and the length of the buffer. The SSL
>> object encrypts the data into one or more TLS records using its
>> internal buffer and then writes the TLS records to the output BIO.
>> When SSL_write() succeeds, we know that the data is encrypted and
>> copied to the internal buffer of the output BIO. The next step is to
>> read the encrypted data from the output BIO, and write it to the
>> socket. In order to do this operation, a separate buffer needs to be
>> used to copy data from the output BIO to the socket.
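
The last step of the send path, draining the output BIO into the socket
through a separate copy buffer, could be sketched roughly as follows. The
helper name flush_outgoing and the send_all callable are hypothetical; the
16KB chunk size is taken from the buffer size mentioned later in this
message.

```python
import ssl

BUF_SIZE = 16 * 1024  # copy-buffer size (assumed to match the 16KB TlsSocket buffer)


def flush_outgoing(outgoing, send_all, buf_size=BUF_SIZE):
    """Copy all pending encrypted bytes from the output BIO to the socket.

    send_all is any callable that writes a complete bytes object, for
    example socket.sendall. Returns the number of bytes forwarded.
    """
    total = 0
    while outgoing.pending:
        chunk = outgoing.read(buf_size)  # at most buf_size bytes per copy
        send_all(chunk)
        total += len(chunk)
    return total
```

In a real messenger this would run after every successful SSL_write(), and
also whenever the TLS engine reports that it wants to write.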
>>
>> TLS can be in the middle of renegotiating while the application
>> attempts to receive or send data. Please note that the SSL object can
>> only write to or read from a BIO in the context of the caller (i.e.,
>> the calling thread). This is one of the reasons it fails SSL_write or
>> SSL_read calls: to inform the caller that it wants to read or write
>> (SSL_get_error() reports SSL_ERROR_WANT_READ or SSL_ERROR_WANT_WRITE).
>> In this case, the (SSL_write or SSL_read) operation needs to be repeated
>> after the necessary action is taken. If the SSL object needs to read,
>> then the application needs to read more encrypted data from the
>> socket, and write it to the input BIO. If the SSL object needs to
>> write, then the data from the output BIO needs to be read and written
>> to the socket.
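
The retry rule above can be sketched as one loop. This is an illustration
in Python's ssl module, where the two conditions surface as the
SSLWantReadError and SSLWantWriteError exceptions; the function name
call_with_retry and the send_all/recv callables are assumptions made for the
example, not part of the proposed code.

```python
import ssl


def call_with_retry(op, incoming, outgoing, send_all, recv):
    """Run op() and service the BIOs until it succeeds.

    op       -- zero-argument callable wrapping an SSLObject operation
    send_all -- writes a bytes object to the socket
    recv     -- returns the next encrypted bytes from the socket, b"" on EOF
    """
    while True:
        try:
            result = op()
        except ssl.SSLWantReadError:
            # Send whatever the engine already produced, then feed it input.
            if outgoing.pending:
                send_all(outgoing.read())
            data = recv()
            if not data:
                raise ConnectionError("peer closed during TLS operation")
            incoming.write(data)
        except ssl.SSLWantWriteError:
            # The engine has output that must reach the socket first.
            if outgoing.pending:
                send_all(outgoing.read())
        else:
            # Success can still leave encrypted output to transmit.
            if outgoing.pending:
                send_all(outgoing.read())
            return result
```

The same loop serves the handshake, the send path, and the receive path,
since all three can be interrupted by handshake or renegotiation traffic.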
>>
>> The read path is a bit more complicated as the SSL object may have
>> some remaining plaintext from the earlier SSL_read operation. So, the
>> application needs to attempt to read from the SSL object first using
>> SSL_read(). If there is no leftover data, then SSL_read() will fail,
>> returning a value less than or equal to zero. Then, the application
>> needs to check why the read failed using SSL_get_error(), as in the
>> case of an SSL_write() failure. If the SSL layer wants to read, as
>> expected when there is no data in the SSL object, then the application
>> needs to receive more encrypted data from the socket, push this data
>> to the input BIO using BIO_write(), and attempt to read the plaintext
>> again using the SSL_read() operation on the SSL object.
>>
>> There are two main objectives that shape the design to be described
>> here. The first one is to change the existing Ceph code and behavior
>> minimally. The second one is to support Simple and Async Messengers.
>> The main ideas behind the design are as follows:
>>
>> - Introduce a class called Socket to replace the socket descriptor in
>> the Pipe and AsyncConnection classes. Socket will wrap the socket
>> operations used currently in both Messengers. The socket descriptor
>> will be an attribute of Socket. Essentially, the system calls for
>> socket operations such as send and recv will be replaced by the
>> corresponding member functions in this class. This is to achieve the
>> minimal code change objective. Socket implements the plain TCP socket.
>>
>> - Introduce a class called TlsSocket to implement TLS specific
>> behavior common to both Simple and Async Messengers, such as
>> retrieving SSL Context, initiating TLS handshake, and reading from and
>> writing to the SSL object. It maintains two sets each of which
>> contains a lock, buffer, and BIO object; one set for the receive path
>> and the other for the send path. The locks serialize the access to
>> SSL, BIO, buffer and socket on the receive and send paths. The buffer
>> size is set to 16KB. TlsSocket inherits from the Socket class, that
>> is, it is a type of Socket. The existing Ceph code does not need to
>> distinguish if the socket instance is a plain TCP socket or TLS
>> enabled socket. TlsSocket allows us to separate the TLS specific code
>> from the rest.
>>
>> - Introduce SimpleTlsSocket and AsyncTlsSocket classes to implement
>> the behavior specific to Simple and Async Messengers. These classes
>> inherit from TlsSocket. Simple Messenger uses blocking sockets while
>> Async Messenger uses non-blocking sockets. The differences between
>> them will be implemented within these classes. These classes are
>> responsible for interacting with the BIO objects and socket layer for
>> sending and receiving encrypted data.
>>
>> - Introduce a class called SslContext to be a wrapper for the SSL
>> Context object of the OpenSSL library. Each Messenger object can have
>> up to two SslContext objects, one for the client role and the other
>> for the server role. Ceph clients allocate only the client SslContext
>> objects since they only initiate connections. Ceph servers both
>> initiate and accept connections, and thus, allocate both client and
>> server SslContext objects.
>>
>> - Introduce a class called TLS to represent OpenSSL library and be
>> responsible for initializing the library.
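
The class layout listed above might look roughly like this. It is a Python
sketch of the structure only; the proposed implementation is C++ inside the
Messenger layer. Constructor signatures, attribute names, and the
handshake_step method are assumptions made for illustration, and SslContext
is stood in for by Python's ssl.SSLContext.

```python
import socket
import ssl
import threading


class Socket:
    """Plain TCP socket; the descriptor is an attribute of the wrapper."""

    def __init__(self, sock):
        self.sock = sock

    def send(self, data):
        self.sock.sendall(data)
        return len(data)

    def recv(self, n):
        return self.sock.recv(n)


class TlsSocket(Socket):
    """TLS behavior common to both messengers."""

    BUF_SIZE = 16 * 1024

    def __init__(self, sock, ctx, server_side=False, server_hostname=None):
        super().__init__(sock)
        # One lock and one BIO per direction, as in the design above.
        self.recv_lock = threading.Lock()
        self.send_lock = threading.Lock()
        self.rbio = ssl.MemoryBIO()  # encrypted input from the socket
        self.wbio = ssl.MemoryBIO()  # encrypted output for the socket
        self.tls = ctx.wrap_bio(self.rbio, self.wbio, server_side=server_side,
                                server_hostname=server_hostname)

    def _flush(self):
        while self.wbio.pending:
            self.sock.sendall(self.wbio.read(self.BUF_SIZE))

    def handshake_step(self):
        """Attempt the handshake once; return True when it has completed."""
        with self.recv_lock, self.send_lock:
            try:
                self.tls.do_handshake()
                done = True
            except ssl.SSLWantReadError:
                done = False
            self._flush()
            return done


class SimpleTlsSocket(TlsSocket):
    """Blocking variant for the Simple Messenger (socket left blocking)."""


class AsyncTlsSocket(TlsSocket):
    """Non-blocking variant for the Async Messenger."""

    def __init__(self, sock, ctx, **kwargs):
        super().__init__(sock, ctx, **kwargs)
        self.sock.setblocking(False)
```

Because TlsSocket is a Socket, calling code can hold either type behind the
same interface, which is what keeps the change to the existing messenger
code small.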
>>
>> New configuration parameters are defined to enable TLS. These are
>> “tls”, “tls_client_cert_file”, “tls_server_cert_file”,
>> “tls_client_key_file”, “tls_server_key_file”, and “tls_ca_cert_file”.
>> The tls parameter can take one of three values: “none”, “desired”, or
>> “required”. “none” means TLS is not enabled. The default value for
>> this parameter is “none”. Older Ceph versions are treated as if
>> “tls” were “none”.
>>
>> “desired” means if both ends of a TCP connection are configured with
>> “desired”, or “required”, the session between them must be a TLS
>> session. Otherwise, the session would be a plain TCP session. The
>> “desired” value is used temporarily during a rolling upgrade from plain
>> TCP sessions to TLS sessions. If one side is configured with
>> “required” and the other side is “none”, then Ceph connection attempts
>> between them will fail.
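
The outcome for each pair of settings, as described above, can be written
as a small decision function. The name negotiate and the returned strings
are hypothetical; only the three mode values come from the proposal.

```python
MODES = ("none", "desired", "required")


def negotiate(local, remote):
    """Return "tls" or "plain" for a connection between two endpoints.

    Raises ConnectionError when one side requires TLS and the other has it
    disabled.
    """
    if local not in MODES or remote not in MODES:
        raise ValueError("tls must be one of: " + ", ".join(MODES))
    if "none" in (local, remote):
        if "required" in (local, remote):
            raise ConnectionError("tls required by one side, disabled on the other")
        # none/none and none/desired both fall back to plain TCP.
        return "plain"
    # Both sides are "desired" or "required".
    return "tls"
```

For example, negotiate("desired", "required") yields "tls", while
negotiate("none", "desired") yields "plain", which is what lets upgraded and
not-yet-upgraded daemons coexist during the transition.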
>>
>> The rolling upgrade from plain TCP sessions to TLS sessions can be
>> done as follows. After a Ceph client or server is upgraded to a
>> TLS-capable version, the “tls” parameter is set to “desired”. For Ceph
>> clients, this parameter is read from the ceph.conf file, and for the
>> servers, the parameter is dynamically set without restarting the
>> server using the injectargs capability. When all the clients and
>> servers are configured with “tls = desired”, the servers and
>> clients can be configured with “tls = required”. When the “tls” value
>> is changed dynamically to “required”, existing connections initiated
>> or terminated from a server are dropped (and new connections where tls
>> is required are established). Client config files are updated with
>> “tls = required”, and clients can be restarted.
>>
>> We have considered two options for rolling upgrades. The first one
>> requires changing the Ceph protocol to advertise TLS configuration
>> during Ceph handshake. This allows accepting TLS sessions over the
>> existing Ceph ports that are used for plain TCP connections. In this
>> case, the connections are upgraded to the TLS sessions by starting TLS
>> handshake during or immediately after Ceph handshake.
>>
>> The second option is to use a separate set of port numbers for TLS.
>> This does not require changing the existing protocol since Ceph
>> handshake messages (i.e., the banner and the rest) will be
>> exchanged only after TLS sessions are established. The clients
>> configured with the desired mode attempt to connect to servers over the
>> TLS ports first. If that is not successful, then they attempt over the
>> plain TCP ports. The clients and servers configured with the required
>> mode just use the TLS ports. We would appreciate feedback on these
>> options. Our security team prefers the second option.
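
The client-side behavior under the second option (separate TLS ports) could
be sketched like this. The function name connect and the two connector
callables are assumptions; how a connector picks the TLS or plain port is
left out because the proposal does not specify port numbers.

```python
def connect(mode, connect_tls, connect_plain):
    """Open a connection according to the local "tls" mode.

    connect_tls and connect_plain are zero-argument callables that return a
    connected object or raise OSError. Returns a (kind, connection) tuple.
    """
    if mode == "none":
        return ("plain", connect_plain())
    if mode == "required":
        # No fallback: a failure on the TLS port is a connection failure.
        return ("tls", connect_tls())
    if mode == "desired":
        try:
            return ("tls", connect_tls())
        except OSError:
            # The peer is not listening on its TLS port (yet); use plain TCP.
            return ("plain", connect_plain())
    raise ValueError("unknown tls mode: %r" % (mode,))
```

One design consequence worth noting is that a "desired" client pays for a
failed connection attempt on every connect to a server that has not been
upgraded.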
>>
>> The other parameters, the parameters for certificate and key files,
>> hold the locations (i.e., paths) of the corresponding files.
>>
>> The existing Ceph authentication protocol works as before over TLS
>> since this design is compatible with it. The design does NOT support
>> the kernel rbd module (krbd). We are planning to use librbd (via the
>> user space rbd-nbd client) for the block use cases. The overhead of
>> rbd-nbd is mostly around 10 to 15% based on the fio runs on CentOS 7.
>> Our initial performance runs show that TLS overhead should also be
>> about 10 to 15% based on the rados bench throughput tests. More
>> performance characterization needs to be done to get more reliable
>> results.
>>
>> Thanks,
>> Kadir
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html