Re: Securing Ceph with TLS

On Wed, Mar 7, 2018 at 2:12 AM, Kadir Ozdemir <kozdemir@xxxxxxxxxxxxxx> wrote:
> Hi All,
>
>
> We cannot deploy Ceph in our data centers without encrypting its
> network traffic. We have decided to implement SSL/TLS within the
> Messenger layer using the OpenSSL library on our jewel branch.
> Although our implementation is not yet complete, we do not see any
> obstacles to completing and deploying it. I would like to give an
> overview, sketch the design in the rest of this email, and get your
> feedback. If the community is interested in merging this effort
> upstream, we will be very happy to collaborate and contribute.

I'm curious about the motivation for TLS in particular, as opposed to
using a symmetric cipher (plain AES-256 or similar) based on the existing
cephx shared secrets and authorization tickets.  Because the endpoints
are already authorized, it seems like it should be possible to avoid
introducing an additional set of certificates.

John

> Two main structures of the OpenSSL library are SSL Context and SSL. An
> SSL Context object holds certificates, a private key, and options
> regarding the TLS protocol and algorithms. An SSL Context object is
> used to provide the context to create SSL objects which represent SSL
> sessions. SSL objects are responsible for encryption, and session
> handshake and renegotiation among other things.
>
> Ceph Messengers have implemented somewhat sophisticated control logic
> over the socket layer, and require a different programming model than
> the simple way of using OpenSSL. OpenSSL has a rich set of APIs at
> different abstraction levels. In order to decouple the TLS operations
> from the underlying I/O layer (that is, in our case the socket layer),
> OpenSSL provides an abstraction for handling input and output at the
> encrypted side of TLS, which is referred to as a BIO (Basic Input/Output).
> We will be using a pair of (i.e., input and output) memory BIOs for
> buffering encrypted data to be received from or sent to the socket
> layer. BIOs allow us to preserve the implementation of the existing
> socket handling logic within the Ceph networking layer.
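
For illustration, here is a minimal sketch of the memory-BIO arrangement described above. It uses Python's standard ssl module rather than the OpenSSL C API the proposal targets; ssl.MemoryBIO and ssl.SSLObject are thin wrappers over OpenSSL's memory BIO and SSL structures. The host name is a placeholder and no socket is involved, so this only shows where the encrypted bytes end up.

```python
import ssl

# Client-side context; a real deployment would also load CA certificates.
ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)

incoming = ssl.MemoryBIO()  # encrypted bytes received from the socket go in here
outgoing = ssl.MemoryBIO()  # encrypted bytes destined for the socket come out here

# "osd.example" is a placeholder name, not something from the proposal.
tls = ctx.wrap_bio(incoming, outgoing, server_hostname="osd.example")

try:
    tls.do_handshake()
except ssl.SSLWantReadError:
    # The SSL object has written its first handshake message to the output
    # BIO and cannot continue until the peer's reply is fed to the input BIO.
    pass

# These are the bytes the existing socket code would now send unchanged.
client_hello = outgoing.read()
```

The existing socket code never sees the SSL object; it only moves opaque bytes between the socket and the two BIOs.
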
>
> When TLS is enabled on top of a TCP connection, TLS inserts handshake
> and renegotiation messages to the TCP stream and encapsulates the
> plaintext stream of bytes transferred by an application into TLS
> records. An application in our context is a Ceph Messenger. The TLS
> handshake happens at the beginning of a TLS session as part of
> establishing the session. Before any application data can be sent or
> received over a TLS session, the TLS handshake has to be completed.
> During the handshake, the endpoints of the session are verified using PKI
> certificates, and an encryption key is established, among other things.
>
> Assuming that the TLS handshake is complete (after the TCP connection
> is established), we can now trace the send path. The send path starts
> with calling SSL_write() which takes the handle for the SSL object,
> the address of a data buffer, and the length of the buffer. The SSL
> object encrypts the data into one or more TLS records using its
> internal buffer and then writes the TLS records to the output BIO.
> When SSL_write() succeeds, we know that the data is encrypted and
> copied to the internal buffer of the output BIO. The next step is to
> read the encrypted data from the output BIO, and write it to the
> socket. In order to do this operation, a separate buffer needs to be
> used to copy data from the output BIO to the socket.
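
The send path above could be sketched roughly as follows, again in Python's ssl module as a stand-in for the C API. The function name tls_send and the sock parameter are hypothetical; tls is assumed to be an ssl.SSLObject whose handshake has already completed, and out_bio is its output memory BIO.

```python
import ssl

BUF_SIZE = 16 * 1024  # size of the separate copy buffer; 16 KB as in the proposal


def tls_send(tls, out_bio, sock, data):
    """Encrypt data through the SSL object, then drain the output BIO to the socket."""
    written = tls.write(data)  # analogous to SSL_write()
    while out_bio.pending:  # encrypted TLS records are waiting in the BIO
        chunk = out_bio.read(BUF_SIZE)  # analogous to BIO_read()
        sock.sendall(chunk)  # hand the ciphertext to the socket layer
    return written
```
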
>
> TLS can be in the middle of renegotiating while the application
> attempts to receive or send data. Please note that the SSL object can
> only write to or read from a BIO in the context of the caller (i.e.,
> the calling thread). This is one reason why an SSL_write() or
> SSL_read() call can fail: the failure tells the caller that the SSL
> object wants to read or write. In this case, the operation needs to be
> repeated after the necessary action is taken. If the SSL object needs
> to read,
> then the application needs to read more encrypted data from the
> socket, and write it to the input BIO. If the SSL object needs to
> write, then the data from the output BIO needs to be read and written
> to the socket.
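
A sketch of that retry loop for a blocking socket, under the same assumptions as before (Python's ssl module standing in for the OpenSSL C API). The names run_tls_op and flush are hypothetical. Python reports the "wants to read" and "wants to write" conditions as the exceptions ssl.SSLWantReadError and ssl.SSLWantWriteError.

```python
import ssl


def flush(out_bio, sock, bufsize):
    """Copy everything pending in the output BIO to the socket."""
    while out_bio.pending:
        sock.sendall(out_bio.read(bufsize))


def run_tls_op(op, in_bio, out_bio, sock, bufsize=16 * 1024):
    """Repeat op() until it succeeds, servicing the BIOs whenever it asks."""
    while True:
        try:
            result = op()
        except ssl.SSLWantWriteError:
            flush(out_bio, sock, bufsize)
            continue
        except ssl.SSLWantReadError:
            # Send anything the SSL object produced first; the peer may be
            # waiting for it before it replies.
            flush(out_bio, sock, bufsize)
            data = sock.recv(bufsize)
            if not data:
                raise ConnectionError("peer closed the connection")
            in_bio.write(data)
            continue
        flush(out_bio, sock, bufsize)
        return result
```

The same helper can drive a handshake (op = tls.do_handshake) or a read or write, for example op = lambda: tls.write(payload).
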
>
> The read path is a bit more complicated as the SSL object may have
> some remaining plaintext from the earlier SSL_read operation. So, the
> application needs to attempt to read from the SSL object first using
> SSL_read(). If there is no leftover data, then SSL_read() will fail,
> returning a non-positive value. The application then needs to check
> why the read failed
> using SSL_get_error() as in the case of SSL_write() failure. If the
> SSL layer wants to read as expected when there is no data in the SSL
> object, then the application needs to receive more encrypted data from
> the socket, push this data to the input BIO using BIO_write(), and
> attempt to read the plaintext using the SSL_read() operation on the
> SSL object.
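
The read path could look roughly like this for a blocking socket. As before, this is a Python sketch rather than the proposed C++ code, and tls_recv is a hypothetical name. The exception ssl.SSLWantReadError plays the role of checking SSL_get_error() for the "wants to read" result.

```python
import ssl


def tls_recv(tls, in_bio, sock, nbytes, bufsize=16 * 1024):
    """Return up to nbytes of plaintext, pulling ciphertext from the socket as needed."""
    while True:
        try:
            # Try the SSL object first: it may still hold plaintext left over
            # from records that were decrypted by an earlier call.
            return tls.read(nbytes)  # analogous to SSL_read()
        except ssl.SSLWantReadError:
            data = sock.recv(bufsize)  # more encrypted bytes from the peer
            if not data:
                raise ConnectionError("peer closed the connection")
            in_bio.write(data)  # analogous to BIO_write()
```
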
>
> There are two main objectives that shape the design to be described
> here. The first one is to change the existing Ceph code and behavior
> minimally. The second one is to support Simple and Async Messengers.
> The main ideas behind the design are as follows:
>
> - Introduce a class called Socket to replace the socket descriptor in
> the Pipe and AsyncConnection classes. Socket will wrap the socket
> operations used currently in both Messengers. The socket descriptor
> will be an attribute of Socket. Essentially, the system calls for
> socket operations such as send and recv will be replaced by the
> corresponding member functions in this class. This is to achieve the
> minimal code change objective. Socket implements the plain TCP socket.
>
> - Introduce a class called TlsSocket to implement TLS specific
> behavior common to both Simple and Async Messengers, such as
> retrieving SSL Context, initiating TLS handshake, and reading from and
> writing to the SSL object. It maintains two sets, each of which
> contains a lock, a buffer, and a BIO object: one set for the receive
> path and the other for the send path. The locks serialize access to the
> SSL object, BIO, buffer, and socket on the receive and send paths. The
> buffer size is set to 16 KB. TlsSocket inherits from the Socket class;
> that is, it is a type of Socket. The existing Ceph code does not need to
> distinguish if the socket instance is a plain TCP socket or TLS
> enabled socket. TlsSocket allows us to separate the TLS specific code
> from the rest.
>
> - Introduce SimpleTlsSocket and AsyncTlsSocket classes to implement
> the behavior specific to Simple and Async Messengers. These classes
> inherit from TlsSocket. Simple Messenger uses blocking sockets while
> Async Messenger uses non-blocking sockets. The differences between
> them will be implemented within these classes. These classes are
> responsible for interacting with the BIO objects and socket layer for
> sending and receiving encrypted data.
>
> - Introduce a class called SslContext to be a wrapper for the SSL
> Context object of the OpenSSL library. Each Messenger object can have
> up to two SslContext objects, one for the client role and the other
> for the server role. Ceph clients allocate only the client SslContext
> objects since they only initiate connections. Ceph servers both
> initiate and accept connections, and thus, allocate both client and
> server SslContext objects.
>
> - Introduce a class called TLS to represent the OpenSSL library and be
> responsible for initializing the library.
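
To make the shape of the proposed hierarchy concrete, here is a structural sketch in Python; the real classes would be C++ inside the Messenger code. The class names follow the proposal, but every member name (Direction, rx, tx, and so on) is an assumption made for illustration, and the TLS send and receive logic is deliberately left to the messenger-specific subclasses.

```python
import socket
import ssl
import threading
from dataclasses import dataclass, field

BUF_SIZE = 16 * 1024  # buffer size stated in the proposal


class Socket:
    """Plain TCP socket wrapper that replaces the raw descriptor."""

    def __init__(self, sock):
        self.sock = sock

    def send(self, data):
        return self.sock.send(data)

    def recv(self, nbytes):
        return self.sock.recv(nbytes)


@dataclass
class Direction:
    """One lock, buffer, and BIO set; TlsSocket keeps one per direction."""
    lock: threading.Lock = field(default_factory=threading.Lock)
    bio: ssl.MemoryBIO = field(default_factory=ssl.MemoryBIO)
    buf: bytearray = field(default_factory=lambda: bytearray(BUF_SIZE))


class TlsSocket(Socket):
    """TLS behavior common to both messengers."""

    def __init__(self, sock, ctx, server_side=False, server_hostname=None):
        super().__init__(sock)
        self.rx = Direction()  # receive path
        self.tx = Direction()  # send path
        self.ssl = ctx.wrap_bio(self.rx.bio, self.tx.bio,
                                server_side=server_side,
                                server_hostname=server_hostname)

    def send(self, data):
        raise NotImplementedError("implemented by the messenger-specific subclass")

    def recv(self, nbytes):
        raise NotImplementedError("implemented by the messenger-specific subclass")


class SimpleTlsSocket(TlsSocket):
    """Would drive the BIOs over a blocking socket."""


class AsyncTlsSocket(TlsSocket):
    """Would drive the BIOs over a non-blocking socket."""
```

Because TlsSocket is a Socket, callers that hold a Socket reference do not need to know which kind they have.
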
>
> New configuration parameters are defined to enable TLS. These are
> “tls”, “tls_client_cert_file”, “tls_server_cert_file”,
> “tls_client_key_file”, “tls_server_key_file”, and “tls_ca_cert_file”.
> The tls parameter can take one of three values: “none”, “desired”, or
> “required”. “none” means TLS is not enabled. The default value for
> this parameter is “none”. Older Ceph versions are treated as if
> “tls” were “none”.
>
> “desired” means that if both ends of a TCP connection are configured
> with “desired” or “required”, the session between them must be a TLS
> session. Otherwise, the session is a plain TCP session. The
> “desired” value is used temporarily during a rolling upgrade from plain
> TCP sessions to TLS sessions. If one side is configured with
> “required” and the other side is “none”, then Ceph connection attempts
> between them will fail.
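
The rules for the three values can be written down as a small decision function. This is only a restatement of the semantics above in Python; the function name and the returned strings are made up for illustration.

```python
MODES = ("none", "desired", "required")


def negotiate(local, remote):
    """Return "tls" or "tcp" for a pair of tls settings, or raise if they conflict."""
    for mode in (local, remote):
        if mode not in MODES:
            raise ValueError(f"unknown tls mode: {mode!r}")
    pair = {local, remote}
    if pair == {"required", "none"}:
        raise ConnectionError("tls is required on one side and disabled on the other")
    if "none" in pair:
        return "tcp"  # none/none or none/desired: stay on plain TCP
    return "tls"  # both sides are "desired" or "required"
```
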
>
> The rolling upgrade from plain TCP sessions to TLS sessions can be
> done as follows. After a Ceph client or server is upgraded to a TLS
> supported version, the “tls” parameter is set to “desired”. For Ceph
> clients, this parameter is read from the ceph.conf file, and for the
> servers, the parameter is dynamically set without restarting the
> server using the injectargs capability. When all the clients and
> servers are configured with “tls = desired” then the servers and
> clients can be configured with “tls = required”. When the “tls” value
> is changed dynamically to “required”, existing connections initiated
> or terminated from a server are dropped (and new connections where tls
> is required are established). Client config files are updated with
> “tls = required”, and clients can be restarted.
>
> We have considered two options for rolling upgrades. The first one
> requires changing the Ceph protocol to advertise TLS configuration
> during Ceph handshake. This allows accepting TLS sessions over the
> existing Ceph ports that are used for plain TCP connections. In this
> case, the connections are upgraded to the TLS sessions by starting TLS
> handshake during or immediately after Ceph handshake.
>
> The second option is to use a separate set of port numbers for TLS.
> This does not require changing the existing protocol since Ceph
> handshake messages (i.e., the banner and the rest) will be
> exchanged only after TLS sessions are established. Clients
> configured with the desired mode attempt to connect to servers over the
> TLS ports first. If that is not successful, then they attempt over the
> plain TCP ports. The clients and servers configured with the required
> mode just use the TLS ports. We appreciate the feedback on these
> options. Our security team prefers the second option.
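
The client-side behavior of the second option might be sketched like this. The function name and the dial parameter are hypothetical, and the caller supplies both port numbers since the proposal does not name any.

```python
import socket


def connect(host, tls_port, tcp_port, mode, dial=socket.create_connection):
    """Return (connection, use_tls) according to the proposed tls mode."""
    if mode not in ("none", "desired", "required"):
        raise ValueError(f"unknown tls mode: {mode!r}")
    if mode == "none":
        return dial((host, tcp_port)), False
    try:
        # "desired" and "required" both try the TLS port first.
        return dial((host, tls_port)), True
    except OSError:
        if mode == "required":
            raise  # never fall back to plain TCP in required mode
        return dial((host, tcp_port)), False
```

Note that the fallback in the desired mode means a client can end up on plain TCP whenever the TLS port is unreachable, which fits the statement that this mode is only meant to be used temporarily during the upgrade.
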
>
> The other parameters, the parameters for certificate and key files,
> hold the locations (i.e., paths) of the corresponding files.
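
As an illustration of how these parameters might appear together, the sketch below parses a sample configuration fragment with Python's configparser. The parameter names are the ones proposed above; the [global] placement, the file paths, and the function name are assumptions, and none of these options exist in released Ceph.

```python
import configparser

# Hypothetical fragment; all paths are placeholders.
SAMPLE = """
[global]
tls = desired
tls_client_cert_file = /etc/ceph/tls/client.crt
tls_client_key_file = /etc/ceph/tls/client.key
tls_server_cert_file = /etc/ceph/tls/server.crt
tls_server_key_file = /etc/ceph/tls/server.key
tls_ca_cert_file = /etc/ceph/tls/ca.crt
"""

FILE_KEYS = (
    "tls_client_cert_file", "tls_client_key_file",
    "tls_server_cert_file", "tls_server_key_file",
    "tls_ca_cert_file",
)


def load_tls_settings(text):
    """Read the proposed tls options, defaulting tls to "none" when absent."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["global"]
    mode = section.get("tls", "none")
    if mode not in ("none", "desired", "required"):
        raise ValueError(f"unknown tls mode: {mode!r}")
    settings = {"tls": mode}
    for key in FILE_KEYS:
        settings[key] = section.get(key)  # None when the option is not set
    return settings


settings = load_tls_settings(SAMPLE)
```
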
>
> The existing Ceph authentication protocol works as before over TLS
> since this design is compatible with it. The design does NOT support
> the kernel rbd module (krbd). We are planning to use librbd (via the
> user space rbd-nbd client) for the block use cases. The overhead of
> rbd-nbd is mostly around 10 to 15% based on the fio runs on CentOS 7.
> Our initial performance runs show that TLS overhead should also be
> about 10 to 15% based on the rados bench throughput tests. More
> performance characterization needs to be done to get more reliable
> results.
>
> Thanks,
> Kadir
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html