Re: Securing Ceph with TLS

Hi Kadir,

I think this is something I once wanted to implement during my GSoC
project, on-the-wire encryption, but I was not able to finish it. I am
really curious to see your code. Can you share a branch?

Regards,
Zhao

On Wed, Mar 7, 2018 at 10:12 AM, Kadir Ozdemir <kozdemir@xxxxxxxxxxxxxx> wrote:
> Hi All,
>
>
> We cannot deploy Ceph in our data centers without encrypting its
> network traffic. We have decided to implement SSL/TLS within the
> Messenger layer using the OpenSSL library on our jewel branch.
> Although our implementation is not yet complete, we do not see any
> obstacles to completing and deploying it. I would like to give an
> overview and sketch the design in the rest of this email, and get
> your feedback. If the community is interested in merging this effort
> upstream, we will be very happy to collaborate and contribute.
>
> The two main structures of the OpenSSL library are the SSL Context
> (SSL_CTX) and SSL. An SSL Context object holds certificates, a private
> key, and options regarding the TLS protocol and algorithms. It
> provides the context for creating SSL objects, which represent SSL
> sessions. SSL objects are responsible for encryption, the session
> handshake, and renegotiation, among other things.
>
> Ceph Messengers implement somewhat sophisticated control logic over
> the socket layer, and require a different programming model than the
> simple way of using OpenSSL. OpenSSL has a rich set of APIs at
> different abstraction levels. In order to decouple the TLS operations
> from the underlying I/O layer (in our case, the socket layer),
> OpenSSL provides an abstraction for handling input and output on the
> encrypted side of TLS, which is referred to as a BIO (Basic
> Input/Output). We will use a pair of memory BIOs (one for input and
> one for output) to buffer encrypted data received from or to be sent
> to the socket layer. BIOs allow us to preserve the existing socket
> handling logic within the Ceph networking layer.
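
The context/session/BIO relationship described above can be tried out
without any C code. This is only a rough sketch: it uses Python's
standard ssl module (a wrapper around OpenSSL) as a stand-in for the C
API, and the peer name "osd.example" is made up for illustration.

```python
import ssl

# Context object: holds certificates, keys, and protocol options
# (the counterpart of OpenSSL's SSL_CTX).
ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)

# One memory BIO per direction, both on the encrypted side of TLS.
incoming = ssl.MemoryBIO()  # ciphertext received from the socket goes in here
outgoing = ssl.MemoryBIO()  # ciphertext produced by TLS is collected here

# Session object (the counterpart of OpenSSL's SSL). It only talks to the
# two BIOs and never touches a socket itself.
tls = ctx.wrap_bio(incoming, outgoing, server_side=False,
                   server_hostname="osd.example")  # hypothetical peer name

# A memory BIO behaves like a byte queue: whatever one layer writes in,
# the other layer can read out later, in order.
demo = ssl.MemoryBIO()
demo.write(b"bytes from recv()")
first = demo.read(5)   # take five bytes out
rest = demo.read()     # take whatever is left
```

Because the session object only sees the BIOs, the code that owns the
socket stays in charge of when and how bytes actually move.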
>
> When TLS is enabled on top of a TCP connection, TLS inserts handshake
> and renegotiation messages into the TCP stream and encapsulates the
> plaintext stream of bytes transferred by an application into TLS
> records. An application in our context is a Ceph Messenger. The TLS
> handshake happens at the beginning of a TLS session as part of
> establishing the session. Before any application data can be sent or
> received over a TLS session, the TLS handshake has to be completed.
> During the handshake, the end points of the session are verified using
> PKI certificates, and an encryption key is established, among other
> things.
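
To make the handshake step concrete, here is a small sketch of the very
first move on the client side. It again assumes Python's ssl module as a
stand-in for the OpenSSL C API, and "mon.example" is a made-up peer
name. No server is involved, so the handshake cannot finish; the point
is only to show that TLS emits its own records before any application
data.

```python
import ssl

ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
incoming, outgoing = ssl.MemoryBIO(), ssl.MemoryBIO()
tls = ctx.wrap_bio(incoming, outgoing, server_hostname="mon.example")

try:
    tls.do_handshake()       # cannot complete: the server has not replied
    handshake_done = True
except ssl.SSLWantReadError:
    handshake_done = False   # TLS is waiting for the server's answer

# The first handshake message is now sitting in the output BIO. These are
# the bytes that would be written to the TCP socket.
client_hello = outgoing.read()

# Every TLS record starts with a 5-byte header: type, version, length.
content_type = client_hello[0]                          # 22 means "handshake"
record_len = int.from_bytes(client_hello[3:5], "big")   # payload length
```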
>
> Assuming that the TLS handshake has completed (after the TCP
> connection is established), we can now trace the send path. The send
> path starts with a call to SSL_write(), which takes the handle for the
> SSL object, the address of a data buffer, and the length of the
> buffer. The SSL object encrypts the data into one or more TLS records
> using its internal buffer and then writes the TLS records to the
> output BIO. When SSL_write() succeeds, we know that the data has been
> encrypted and copied to the internal buffer of the output BIO. The
> next step is to read the encrypted data from the output BIO and write
> it to the socket. This requires a separate buffer for copying data
> from the output BIO to the socket.
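
The second half of the send path (draining the output BIO into the
socket) might look roughly like this. It is a sketch in Python's ssl
module rather than the C API; flush_output_bio, tls_send, and the
sock_send callable are names invented here, and sock_send is assumed to
behave like socket.send() (it returns how many bytes it accepted).

```python
import ssl

BUF_SIZE = 16 * 1024  # copy-buffer size; 16KB is the size the design mentions


def flush_output_bio(outgoing, sock_send):
    """Copy everything pending in the output BIO to the socket."""
    total = 0
    while outgoing.pending:
        chunk = outgoing.read(BUF_SIZE)   # the separate copy buffer
        sent = 0
        while sent < len(chunk):          # send() may accept only part of it
            sent += sock_send(chunk[sent:])
        total += sent
    return total


def tls_send(tls, outgoing, sock_send, data):
    """Full send path: encrypt into the output BIO, then drain the BIO.

    Assumes the handshake on `tls` (an ssl.SSLObject) already completed.
    """
    written = tls.write(data)             # counterpart of SSL_write()
    flush_output_bio(outgoing, sock_send)
    return written
```

For a non-blocking socket the inner loop would have to stop and remember
the unsent tail instead of spinning, which is the kind of difference the
Simple and Async variants would need to handle separately.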
>
> TLS can be in the middle of renegotiating while the application
> attempts to receive or send data. Please note that the SSL object can
> only write to or read from a BIO in the context of the caller (i.e.,
> the calling thread). This is one of the reasons it fails SSL_write()
> or SSL_read() calls: to inform the caller that it wants to read
> (SSL_ERROR_WANT_READ) or write (SSL_ERROR_WANT_WRITE). In this case,
> the SSL_write() or SSL_read() operation needs to be repeated after the
> necessary action is taken. If the SSL object needs to read, then the
> application needs to read more encrypted data from the socket and
> write it to the input BIO. If the SSL object needs to write, then the
> data from the output BIO needs to be read and written to the socket.
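
The repeat-after-servicing rule can be captured in one small loop. The
sketch below uses Python's ssl module, where the "wants to read" and
"wants to write" conditions surface as SSLWantReadError and
SSLWantWriteError exceptions instead of SSL_get_error() codes. The
helper names and the sock_send/sock_recv callables are assumptions made
for this example (they are expected to behave like socket.send() and
socket.recv()).

```python
import ssl


def _flush(outgoing, sock_send):
    """Drain the output BIO into the socket."""
    while outgoing.pending:
        chunk = outgoing.read(16 * 1024)
        while chunk:
            chunk = chunk[sock_send(chunk):]   # drop the bytes that were sent


def run_tls_op(op, incoming, outgoing, sock_send, sock_recv, bufsize=16 * 1024):
    """Call op() repeatedly until TLS stops asking for I/O.

    `op` is a zero-argument callable such as tls.do_handshake or
    lambda: tls.write(data). Returns whatever op() returns.
    """
    while True:
        try:
            result = op()
        except ssl.SSLWantReadError:
            # TLS may have queued records the peer must see before it will
            # answer, so push them out before waiting for input.
            _flush(outgoing, sock_send)
            data = sock_recv(bufsize)
            if not data:
                raise ConnectionError("peer closed during TLS operation")
            incoming.write(data)
        except ssl.SSLWantWriteError:
            _flush(outgoing, sock_send)
        else:
            _flush(outgoing, sock_send)
            return result
```

With memory BIOs the want-write case should be rare, since the output
BIO grows as needed, but handling it keeps the loop valid for other BIO
types.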
>
> The read path is a bit more complicated, as the SSL object may have
> some remaining plaintext from an earlier SSL_read() operation. So, the
> application needs to attempt to read from the SSL object first using
> SSL_read(). If there is no leftover data, then SSL_read() will fail
> without returning any plaintext (its return value is zero or
> negative). Then, the application needs to check why the read failed
> using SSL_get_error(), as in the case of an SSL_write() failure. If
> the SSL layer wants to read, as expected when there is no data in the
> SSL object, then the application needs to receive more encrypted data
> from the socket, push this data to the input BIO using BIO_write(),
> and attempt to read the plaintext again using SSL_read() on the SSL
> object.
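
The order of operations on the read path (ask the SSL object first, go
to the socket only if it has nothing) can be sketched as follows. This
uses Python's ssl module, so the failed read shows up as an
SSLWantReadError rather than a return value checked with
SSL_get_error(). The function name and the sock_recv callable are
invented for this example, and the sketch ignores the case where the
read needs to send something (for example during renegotiation).

```python
import ssl


def tls_recv(tls, incoming, sock_recv, nbytes, bufsize=16 * 1024):
    """Return up to nbytes of plaintext, or b"" if the peer closed TCP.

    `tls` is an ssl.SSLObject bound to `incoming`; `sock_recv` is assumed
    to behave like socket.recv().
    """
    while True:
        try:
            # Always try the SSL object first: it may still hold plaintext
            # that was decrypted during an earlier call.
            return tls.read(nbytes)
        except ssl.SSLWantReadError:
            # Nothing buffered: fetch more ciphertext from the socket and
            # hand it to TLS through the input BIO (BIO_write() in C).
            encrypted = sock_recv(bufsize)
            if not encrypted:
                return b""
            incoming.write(encrypted)
```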
>
> There are two main objectives that shape the design to be described
> here. The first one is to change the existing Ceph code and behavior
> minimally. The second one is to support Simple and Async Messengers.
> The main ideas behind the design are as follows:
>
> - Introduce a class called Socket to replace the socket descriptor in
> the Pipe and AsyncConnection classes. Socket will wrap the socket
> operations used currently in both Messengers. The socket descriptor
> will be an attribute of Socket. Essentially, the system calls for
> socket operations such as send and recv will be replaced by the
> corresponding member functions in this class. This is to achieve the
> minimal code change objective. Socket implements the plain TCP socket.
>
> - Introduce a class called TlsSocket to implement TLS specific
> behavior common to both Simple and Async Messengers, such as
> retrieving the SSL Context, initiating the TLS handshake, and reading
> from and writing to the SSL object. It maintains two sets, each of
> which contains a lock, a buffer, and a BIO object: one set for the
> receive path and the other for the send path. The locks serialize
> access to the SSL object, BIO, buffer, and socket on the receive and
> send paths. The buffer size is set to 16 KB. TlsSocket inherits from
> the Socket class, that is, it is a type of Socket. The existing Ceph
> code does not need to distinguish whether the socket instance is a
> plain TCP socket or a TLS enabled socket. TlsSocket allows us to
> separate the TLS specific code from the rest.
>
> - Introduce SimpleTlsSocket and AsyncTlsSocket classes to implement
> the behavior specific to Simple and Async Messengers. These classes
> inherit from TlsSocket. Simple Messenger uses blocking sockets while
> Async Messenger uses non-blocking sockets. The differences between
> them will be implemented within these classes. These classes are
> responsible for interacting with the BIO objects and socket layer for
> sending and receiving encrypted data.
>
> - Introduce a class called SslContext to be a wrapper for the SSL
> Context object of the OpenSSL library. Each Messenger object can have
> up to two SslContext objects, one for the client role and the other
> for the server role. Ceph clients allocate only the client SslContext
> objects since they only initiate connections. Ceph servers both
> initiate and accept connections, and thus, allocate both client and
> server SslContext objects.
>
> - Introduce a class called TLS to represent the OpenSSL library and to
> be responsible for initializing it.
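
The Socket/TlsSocket split from the list above can be mocked up in a few
lines to see how callers stay unaware of TLS. This is a loose sketch in
Python, not the proposed C++ classes: it covers only blocking sockets
(the Simple Messenger case), the locking is simplified, and every method
name beyond send and recv is an assumption.

```python
import socket
import ssl
import threading

BUF_SIZE = 16 * 1024


class Socket:
    """Plain TCP socket; callers only ever use send() and recv()."""

    def __init__(self, sock):
        self.sock = sock

    def send(self, data):
        return self.sock.send(data)

    def recv(self, nbytes):
        return self.sock.recv(nbytes)


class TlsSocket(Socket):
    """Same interface as Socket, with TLS handled internally."""

    def __init__(self, sock, ctx, server_side=False, server_hostname=None):
        super().__init__(sock)
        # One lock and one BIO per direction, mirroring the two sets above.
        self.rx_lock, self.rx_bio = threading.Lock(), ssl.MemoryBIO()
        self.tx_lock, self.tx_bio = threading.Lock(), ssl.MemoryBIO()
        self.tls = ctx.wrap_bio(self.rx_bio, self.tx_bio,
                                server_side=server_side,
                                server_hostname=server_hostname)

    def _flush(self):                     # output BIO -> socket
        while self.tx_bio.pending:
            self.sock.sendall(self.tx_bio.read(BUF_SIZE))

    def _fill(self):                      # socket -> input BIO
        data = self.sock.recv(BUF_SIZE)
        if not data:
            raise ConnectionError("peer closed the connection")
        self.rx_bio.write(data)

    def handshake(self):
        with self.rx_lock, self.tx_lock:  # handshake uses both directions
            while True:
                try:
                    self.tls.do_handshake()
                    self._flush()
                    return
                except ssl.SSLWantReadError:
                    self._flush()
                    self._fill()

    def send(self, data):                 # assumes handshake() has completed
        with self.tx_lock:
            written = self.tls.write(data)
            self._flush()
            return written

    def recv(self, nbytes):               # assumes handshake() has completed
        with self.rx_lock:
            while True:
                try:
                    return self.tls.read(nbytes)
                except ssl.SSLWantReadError:
                    self._fill()
```

Code that holds a Socket reference calls send() and recv() the same way
in both cases, which is the property the minimal-change objective relies
on.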
>
> New configuration parameters are defined to enable TLS. These are
> “tls”, “tls_client_cert_file”, “tls_server_cert_file”,
> “tls_client_key_file”, “tls_server_key_file”, and “tls_ca_cert_file”.
> The tls parameter can take one of three values: “none”, “desired”, or
> “required”. “none” means TLS is not enabled. The default value for
> this parameter is “none”. Older Ceph versions are treated as if
> “tls” were “none”.
>
> “desired” means that if both ends of a TCP connection are configured
> with “desired” or “required”, the session between them must be a TLS
> session. Otherwise, the session is a plain TCP session. The
> “desired” value is used temporarily during a rolling upgrade from
> plain TCP sessions to TLS sessions. If one side is configured with
> “required” and the other side with “none”, then Ceph connection
> attempts between them will fail.
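
The outcome for each pair of settings, as I read the rules above, can be
written down as a small function. The function name and the choice of
exception are assumptions; only the three mode names come from the
proposal.

```python
def negotiate(local, remote):
    """Return "tls" or "plain" for a pair of tls settings.

    Raises ConnectionRefusedError when one side requires TLS and the
    other has it disabled.
    """
    modes = ("none", "desired", "required")
    if local not in modes or remote not in modes:
        raise ValueError("unknown tls mode")
    pair = {local, remote}
    if pair == {"none", "required"}:
        raise ConnectionRefusedError("tls required on one side, disabled on the other")
    if "none" in pair:
        return "plain"   # none/none or none/desired
    return "tls"         # both sides are desired or required


# Print the full table of outcomes.
for local in ("none", "desired", "required"):
    for remote in ("none", "desired", "required"):
        try:
            outcome = negotiate(local, remote)
        except ConnectionRefusedError:
            outcome = "fail"
        print(f"{local:>8} / {remote:<8} -> {outcome}")
```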
>
> The rolling upgrade from plain TCP sessions to TLS sessions can be
> done as follows. After a Ceph client or server is upgraded to a
> version that supports TLS, the “tls” parameter is set to “desired”.
> For Ceph clients, this parameter is read from the ceph.conf file; for
> the servers, it is set dynamically, without restarting the server,
> using the injectargs capability. Once all the clients and servers are
> configured with “tls = desired”, they can be configured with “tls =
> required”. When the “tls” value is changed dynamically to “required”,
> existing connections initiated or terminated by a server are dropped
> (and new connections where tls is required are established). Client
> config files are updated with “tls = required”, and clients can be
> restarted.
>
> We have considered two options for rolling upgrades. The first one
> requires changing the Ceph protocol to advertise the TLS configuration
> during the Ceph handshake. This allows accepting TLS sessions over the
> existing Ceph ports that are used for plain TCP connections. In this
> case, the connections are upgraded to TLS sessions by starting the TLS
> handshake during or immediately after the Ceph handshake.
>
> The second option is to use a separate set of port numbers for TLS.
> This does not require changing the existing protocol since the Ceph
> handshake messages (i.e., the banner and the rest) will be exchanged
> only after TLS sessions are established. Clients configured with the
> desired mode attempt to connect to servers over the TLS ports first.
> If that is not successful, they attempt to connect over the plain TCP
> ports. Clients and servers configured with the required mode use only
> the TLS ports. We would appreciate feedback on these options. Our
> security team prefers the second option.
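
For the second option, the client-side port selection could be sketched
like this. The function and the two connect callables are hypothetical;
each callable is assumed to open a connection on the TLS port or the
plain TCP port and to raise OSError on failure. The behavior for "none"
(plain port only) is my assumption, since the text only spells out the
desired and required modes.

```python
def connect_with_mode(mode, connect_tls, connect_plain):
    """Return (kind, connection) where kind is "tls" or "plain"."""
    if mode == "none":
        return "plain", connect_plain()      # assumed: TLS port never tried
    try:
        return "tls", connect_tls()          # desired and required: TLS first
    except OSError:
        if mode == "required":
            raise                            # required mode never falls back
        return "plain", connect_plain()      # desired mode: plain TCP port
```

Since ssl.SSLError derives from OSError in Python, a failed TLS
handshake would take the same fallback path as a refused TCP connection
in this sketch.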
>
> The other parameters, the parameters for certificate and key files,
> hold the locations (i.e., paths) of the corresponding files.
>
> The existing Ceph authentication protocol works as before over TLS
> since this design is compatible with it. The design does NOT support
> the kernel rbd module (krbd). We are planning to use librbd (via the
> user space rbd-nbd client) for the block use cases. The overhead of
> rbd-nbd is mostly around 10 to 15% based on the fio runs on CentOS 7.
> Our initial performance runs show that TLS overhead should also be
> about 10 to 15% based on the rados bench throughput tests. More
> performance characterization needs to be done to get more reliable
> results.
>
> Thanks,
> Kadir
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html