On Fri, Oct 23, 2009 at 5:41 AM, MORITA Kazutaka
<morita.kazutaka@xxxxxxxxxxxxx> wrote:
> On Fri, Oct 23, 2009 at 12:30 AM, Avi Kivity <avi@xxxxxxxxxx> wrote:
>> If so, is it reasonable to compare this to a cluster file system setup (like
>> GFS) with images as files on this filesystem?  The difference would be that
>> clustering is implemented in userspace in sheepdog, but in the kernel for a
>> clustering filesystem.
>
> I think that the major difference between sheepdog and cluster file
> systems such as Google File system, pNFS, etc is the interface between
> clients and a storage system.

Note that GFS here is the "Global File System" (written by Sistina, the
same folks behind LVM, and later bought by Red Hat). The Google File
System is a different thing and, ironically, its client/storage interface
is a little more like sheepdog's and unlike that of a regular cluster
filesystem.

>> How is load balancing implemented?  Can you move an image transparently
>> while a guest is running?  Will an image be moved closer to its guest?
>
> Sheepdog uses consistent hashing to decide where objects store; I/O
> load is balanced across the nodes. When a new node is added or the
> existing node is removed, the hash table changes and the data
> automatically and transparently are moved over nodes.
>
> We plan to implement a mechanism to distribute the data not randomly
> but intelligently; we could use machine load, the locations of VMs, etc.

I don't have much hands-on experience with consistent hashing, but it
sounds reasonable to make each node's ring segment proportional to its
storage capacity. Dynamic load balancing seems a tougher nut to crack,
especially while keeping every client's mapping consistent.

>> Do you support multiple guests accessing the same image?
>
> A VM image can be attached to any VMs but one VM at a time; multiple
> running VMs cannot access to the same VM image.

This is a must-have safety measure, but a 'manual override' would be quite
useful for those who know how to manage a cluster-aware filesystem inside
a VM image, much as Xen's "w!" flag does. Just be sure to avoid
distributed caching for a shared image!

All in all, a great project, and with such a clean patch into KVM/Qemu, I
have high hopes of it making it into regular use.

I'd just like to add my '+1' votes for both getting rid of the JVM
dependency and using block devices (usually LVM) instead of ext3/btrfs.

--
Javier
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
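
P.S. To make the capacity-proportional ring idea above a bit more
concrete, here is a minimal sketch in Python. It is not sheepdog's code;
the class name WeightedRing, the points_per_unit parameter, the node and
object names, and the choice of SHA-1 as the ring hash are all
assumptions made up for illustration. Each node gets a number of virtual
points on the ring proportional to its capacity, and an object goes to
the owner of the first point clockwise from the object's hash.

```python
import bisect
import hashlib


def ring_hash(key: str) -> int:
    """Map a string to a 64-bit position on the ring (SHA-1 is an assumption)."""
    return int.from_bytes(hashlib.sha1(key.encode()).digest()[:8], "big")


class WeightedRing:
    """Hypothetical consistent-hash ring with capacity-weighted virtual points."""

    def __init__(self, points_per_unit: int = 100):
        self.points_per_unit = points_per_unit
        self._points = []  # sorted ring positions
        self._owner = {}   # ring position -> node name

    def add_node(self, name: str, capacity_units: int) -> None:
        # More capacity means more points, hence a larger share of the ring.
        for i in range(capacity_units * self.points_per_unit):
            point = ring_hash(f"{name}#{i}")
            if point in self._owner:
                continue  # keep the existing owner on a (very unlikely) collision
            self._owner[point] = name
            bisect.insort(self._points, point)

    def remove_node(self, name: str) -> None:
        self._points = [p for p in self._points if self._owner[p] != name]
        self._owner = {p: n for p, n in self._owner.items() if n != name}

    def lookup(self, object_id: str) -> str:
        if not self._points:
            raise LookupError("ring is empty")
        # First point clockwise from the object's hash, wrapping at the end.
        idx = bisect.bisect_right(self._points, ring_hash(object_id))
        return self._owner[self._points[idx % len(self._points)]]


ring = WeightedRing()
ring.add_node("node-a", 1)  # one capacity unit
ring.add_node("node-b", 2)  # twice the capacity, so roughly twice the objects
objects = [f"obj-{i}" for i in range(10000)]
before = {o: ring.lookup(o) for o in objects}

ring.add_node("node-c", 1)
after = {o: ring.lookup(o) for o in objects}
moved = [o for o in objects if before[o] != after[o]]
```

With this layout, adding node-c only pulls objects onto node-c; nothing
is shuffled between node-a and node-b, which is the property that keeps
rebalancing cheap. It does nothing for dynamic load balancing, which
would need some way to change the weights while all clients still agree
on the mapping.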