> I've been testing release-3.7 in the lead up to tagging 3.7.11, and
> found that the fix I did to allow daemons to start when management
> encryption is enabled doesn't always work. The daemons fail to start
> because they can't connect to glusterd to fetch the volfiles, and the
> connection failure is partly due to own-thread not being enabled.
>
> I'd like to know why own-thread is kept optional, and is not the
> default for any encrypted connection?
> Encrypted RPC in GlusterFS can only work with a poller on its
> own-thread, and cannot work with epoll. When this is the case, why is
> it even possible to disable own-thread?
>
> In GlusterFS currently, own-thread gets enabled for most encrypted
> connections by default. But in certain cases, it doesn't get enabled
> when it should be and leads to connection failures. This sort of
> failure is most visible when a glusterfs/glusterfsd process attempts
> to fetch volfiles from glusterd.
>
> I'm going to be sending a change that removes the option of disabling
> own-thread, and makes all encrypted connections use it. Do you see any
> reasons not to do this?

The reasons are basically historical. Own-thread was implemented along
with SSL as a way to make up for the performance impact of doing SSL in
our single polling thread. At the time any combination worked, but the
defaults were aligned together because they were both new and kind of
experimental. I figured people would be willing to risk losing a bit of
stability to avoid a significant performance loss when they were already
using another experimental feature to get better security, but they
wouldn't want to make an already-stable system less so to get (what I
thought would be) a modest performance gain otherwise.

Time passed. The performance benefit of own-thread without SSL turned
out to be greater than I'd thought, I implemented SSL in the management
path as well as the I/O path, SSL became TLS, etc.
Somewhere along the line we should have made own-thread the default. We
would have seen a performance benefit, and epoll might not have
happened, but my attention was elsewhere so own-thread didn't become the
default and epoll did happen. Just as I had warned people many times,
loudly, it broke TLS. It also uncovered race conditions elsewhere, and
introduced many other forms of instability - as I'm sure you know. IMO
it was one of the dumber ideas in the history of the project.

So, what do we do *now*? There are good reasons for us to consider
making TLS the default, for both I/O and management. Allowing
unauthenticated connections is just bad in principle, especially in the
cloud. If the quickest route to making TLS stable is to disallow its use
without own-thread, then I say let's do that . . . and if we're going to
do that then we might as well get rid of epoll. If TLS is the default,
and requires own-thread, then epoll is only applicable in an insecure
non-default setting. We don't need to be spending our precious time on
bugs - both those we already know about and those we have yet to find -
that only exist in such a context.

"Thread per connection" isn't my favorite approach to network
concurrency any more than it's anyone else's, but for the connection
counts we're dealing with it's sufficient. I'd rather maximize stability
and development velocity than academic elegance.