GlusterFS compared to KosmosFS (now called cloudstore)?

vikasgp at gmail.com (Vikas Gorur) · Mon, 20 Oct 2008 15:47:57 -0000

2008/10/18 Stas Oskin <stas.oskin at gmail.com>:
> Hi.
>
> I'm evaluating GlusterFS for our DFS implementation, and wondered how it
> compares to KFS/CloudStore?
>
> These features here look especially nice
> (http://kosmosfs.sourceforge.net/features.html). Any idea what of them exist
> in GlusterFS as well?

Stas,

Here's how GlusterFS compares to KFS, feature by feature:

> Incremental scalability:

Currently, adding new storage nodes requires changing the config
file and restarting the servers and clients. However, there is no need
to move/copy data or perform any other maintenance steps. "Hot add"
capability is planned for the 1.5 release.

> Availability

GlusterFS supports n-way data replication through the AFR (Automatic
File Replication) translator.
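
To make the idea of n-way replication concrete, here is a minimal Python
sketch. It is not GlusterFS code and does not model what the AFR translator
really does internally (there is no locking or self-heal here). The function
names and the use of temporary directories as stand-ins for storage nodes
are assumptions made purely for illustration.

```python
import os
import tempfile


def replicated_write(replica_dirs, name, data):
    """Write the same bytes into every replica directory (n-way mirroring)."""
    for d in replica_dirs:
        with open(os.path.join(d, name), "wb") as f:
            f.write(data)


def replicated_read(replica_dirs, name):
    """Return the contents from the first replica that still has the file."""
    for d in replica_dirs:
        path = os.path.join(d, name)
        if os.path.exists(path):
            with open(path, "rb") as f:
                return f.read()
    raise FileNotFoundError(name)


if __name__ == "__main__":
    # Three temporary directories stand in for three storage nodes.
    with tempfile.TemporaryDirectory() as a, \
         tempfile.TemporaryDirectory() as b, \
         tempfile.TemporaryDirectory() as c:
        replicas = [a, b, c]
        replicated_write(replicas, "hello.txt", b"hi")
        os.remove(os.path.join(a, "hello.txt"))  # simulate losing one copy
        print(replicated_read(replicas, "hello.txt"))  # b'hi'
```

The point of the sketch is only that a read still succeeds as long as at
least one copy survives.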

> Per file degree of replication

GlusterFS used to have this feature, but it was dropped due to lack
of interest. It would not be too hard to bring it back.

> Re-balancing

The DHT and unify translators have extensive support for distributing
data across nodes. One can use unify schedulers to define file creation
policies such as:

* ALU - adaptive scheduling of file creation, based on disk space
utilization, disk speed, etc.

* Round robin

* Non-uniform (NUFA) - prefer a local volume for file creation and use remote
ones only when there is no space on the local volume.
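
The three policies above can be sketched roughly in Python. This is an
illustration of the decision each policy makes, not the schedulers'
real implementation. All names (round_robin, nufa, most_free, the brick
names and the free-space numbers) are hypothetical, and most_free is a
much-simplified stand-in for ALU that looks at free space only.

```python
from itertools import cycle


def round_robin(volumes):
    """Return an iterator that hands out volumes in a fixed rotating order."""
    return cycle(volumes)


def nufa(local, remotes, free_bytes, size):
    """Prefer the local volume; use a remote one only if local lacks space."""
    if free_bytes[local] >= size:
        return local
    for v in remotes:
        if free_bytes[v] >= size:
            return v
    raise OSError("no volume has enough free space")


def most_free(volumes, free_bytes):
    """Simplified ALU stand-in: pick the volume with the most free space."""
    return max(volumes, key=lambda v: free_bytes[v])


if __name__ == "__main__":
    # Hypothetical free space per volume, in arbitrary units.
    free = {"brick1": 10, "brick2": 500, "brick3": 200}
    rr = round_robin(["brick1", "brick2", "brick3"])
    print([next(rr) for _ in range(4)])  # ['brick1', 'brick2', 'brick3', 'brick1']
    print(nufa("brick1", ["brick2", "brick3"], free, 5))   # brick1 (fits locally)
    print(nufa("brick1", ["brick2", "brick3"], free, 50))  # brick2 (local is full)
    print(most_free(list(free), free))                     # brick2
```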

> Data integrity

GlusterFS arguably provides better data integrity since it runs over
an existing filesystem, and does not access disks at the block level.
Thus in the worst case (which shouldn't happen), even if GlusterFS
crashes, your data will still be readable with normal tools.

> Rack-aware data placement

None of our users has mentioned this need so far, so GlusterFS
has no rack awareness. One could incorporate this intelligence into
our cluster translators (unify, afr, stripe) quite easily.
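
For what it is worth, the placement decision itself is small. The Python
sketch below shows one possible rule: take one node from each rack before
using a second node from any rack. It is purely hypothetical; GlusterFS has
no such feature today, and place_replicas and the rack/node names are made
up for the example.

```python
def place_replicas(nodes_by_rack, copies):
    """Pick `copies` nodes, using every rack once before reusing any rack."""
    racks = [list(nodes) for nodes in nodes_by_rack.values()]
    chosen = []
    depth = 0  # index of the node to take from each rack in this pass
    while len(chosen) < copies:
        progressed = False
        for rack in racks:
            if depth < len(rack) and len(chosen) < copies:
                chosen.append(rack[depth])
                progressed = True
        if not progressed:
            raise ValueError("not enough nodes for the requested copies")
        depth += 1
    return chosen


if __name__ == "__main__":
    # Hypothetical layout: two racks, three nodes in total.
    layout = {"rack1": ["node-a", "node-b"], "rack2": ["node-c"]}
    print(place_replicas(layout, 2))  # ['node-a', 'node-c']
    print(place_replicas(layout, 3))  # ['node-a', 'node-c', 'node-b']
```

With two copies, the replicas land on different racks, so losing a whole
rack still leaves one copy reachable.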

> File writes and caching

GlusterFS provides a POSIX-compliant filesystem interface. GlusterFS
has fine-tunable caching translators, such as read-ahead (to prefetch
data ahead of reads), write-behind (to reduce write latency), and
io-cache (to cache file data).

> Language support

This is irrelevant to GlusterFS since it is mounted and accessed as a normal
filesystem, through FUSE. This means all your applications can run on GlusterFS
without any modifications.
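
As a small illustration, here is what "no modifications" means in practice,
using Python as an example language. The code uses only ordinary file calls
and knows nothing about GlusterFS. The function name save_and_load is made
up, and a temporary directory stands in for the mount point (something like
/mnt/glusterfs, which is a hypothetical path).

```python
import os
import tempfile


def save_and_load(mount_point, name, text):
    """Write a file under the mount point and read it back with plain I/O."""
    path = os.path.join(mount_point, name)
    with open(path, "w") as f:
        f.write(text)
    with open(path) as f:
        return f.read()


if __name__ == "__main__":
    # A temporary directory stands in for a mounted GlusterFS volume.
    with tempfile.TemporaryDirectory() as mnt:
        print(save_and_load(mnt, "notes.txt", "hello"))  # hello
```

Pointing mount_point at a real GlusterFS mount would require no change to
the function itself.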

> Deploy scripts

Users have found GlusterFS to be so simple to set up compared to other
cluster filesystems that there isn't really a need for deploy scripts. ;)

> Local read optimization

As mentioned earlier, if your data access patterns justify it (that
is, if users generally access local data and only occasionally want
remote data), you can configure 'unify' with the NUFA scheduler to achieve
this.

In addition, I'd like to mention two particular strengths of GlusterFS.

1) GlusterFS has no notion of a 'meta-server'. I have not looked through
KFS' design in detail, but the mention of a 'meta-server' leads me to
believe that failure of the meta-server will take the entire cluster offline.
Please correct me if this impression is wrong.

GlusterFS, on the other hand, has no single point of failure such as a
central meta-server.

2) GlusterFS 1.4 will have a web-based interface which will allow
you to start/stop GlusterFS, monitor logs and performance, and do
other admin activities.

Please contact us if you need further clarifications or details.

Vikas Gorur
Engineer - Z Research