On 11/06/2012 11:27 AM, Fernando Frediani (Qube) wrote:
> Why does it need to rely on FUSE. Why can't it be something that run in
> kernel that doesn't have any reliance on FUSE ? I imagine that would require
> a lot of engineering but the benefits no need to mention.

"A lot" doesn't even capture the magnitude of the effort. For all of its warts, which seem to be most of what we hear about, GlusterFS embodies some pretty advanced technology - not just in the I/O path, but things like online maintenance and even reconfiguration as well. None of that would have been possible with the slower development pace of working within the kernel (said as a kernel developer since before there was Linux, BTW).

We'd have to completely stop all other feature development and bug fixing, retrain half the staff, and work for a full year or two to get our own code plus all of the libraries we rely on into the kernel, and then we wouldn't be portable to other platforms as we are now. There's a reason Ceph, which was brilliantly conceived and is being worked on by an excellent team, has taken this long to become semi-stable and still lags in most areas other than performance. The fact that it's in user space is an integral part of how GlusterFS has evolved and will continue to evolve.

> Does anyone know a
> bit of architecture of Isilon and of other POSIX compliant distributed
> filesystems ?

Quite a bit, actually, though of course more about the open-source ones than about the proprietary ones like Isilon. We all face many of the same problems, and make some of the same tradeoffs. Some have chosen single-metadata-server models that work great for small systems that never fail but become a nightmare for truly large systems or those that have to stay up despite hardware failures. Some have chosen to do more caching, either with or without invalidation from the server.

GlusterFS has historically chosen a pretty strong consistency model, but has also chosen a client-centric model that precludes an invalidation-based implementation. One could certainly argue about whether those are the right choices, and they might change some day - I personally would like to see both weaker consistency and more use of lease-based caching - but for now and for the immediate future those choices determine whether a given workload will perform well or poorly. There's only so much we can do with optimization.
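
To make the lease idea a bit more concrete, here is a rough sketch of a client-side cache that trusts an entry only until its lease expires, so the server never has to call back with invalidations. It is purely illustrative and is not how GlusterFS or any other filesystem mentioned here is implemented; the names (LeaseCache, fetch, lease_seconds) are made up for the example. It also shows the cost: a reader can see stale data for up to one lease period, which is the weaker consistency mentioned above.

```python
import time


class LeaseCache:
    """Hypothetical client-side cache whose entries expire after a lease."""

    def __init__(self, fetch, lease_seconds, clock=time.monotonic):
        self._fetch = fetch          # assumed callable: key -> value from the server
        self._lease = lease_seconds  # how long a cached value may be trusted
        self._clock = clock          # injectable so the demo needs no sleeping
        self._entries = {}           # key -> (value, lease expiry time)

    def read(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[1]:
            return entry[0]          # lease still valid: no server round trip
        value = self._fetch(key)     # lease missing or expired: ask the server
        self._entries[key] = (value, now + self._lease)
        return value


# Demo with a fake server and a fake clock.
store = {"a": 1}
fetches = []


def fetch(key):
    fetches.append(key)              # record each server round trip
    return store[key]


now = [0.0]
cache = LeaseCache(fetch, lease_seconds=5.0, clock=lambda: now[0])

first = cache.read("a")   # miss: fetched from the server
store["a"] = 2            # some other client changes the value
stale = cache.read("a")   # lease still valid: the old value is returned
now[0] = 6.0              # time passes beyond the lease
fresh = cache.read("a")   # lease expired: re-fetched from the server
```

A real design would also have to decide whether the server delays writes until outstanding leases expire (keeping consistency strong at the cost of write latency) or lets readers see stale data, as this sketch does.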