Hi all,
I'm looking for some information on several distributed filesystems for our application. It looks like it has finally come down to two candidates, Ceph being one of them. But there are still a few questions about it that I would really like to clarify, if possible.
Our plan, initially on 6 workstations, is to host a distributed file system that can withstand two simultaneous computer failures without data loss (something reminiscent of RAID 6, but over the network). This file system will also need to be remotely mounted (NFS server with fallbacks) by 5+ other computers. Students will be working on all 11+ computers at the same time, with different requirements from different software packages: some use many small files, others a few really big files of hundreds of GB. Absolutely no hardware modifications are allowed. This initial test bed is for undergraduate student use, but if successful it will also be deployed on our small clusters. The connection is simple GbE.
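To make the RAID 6 analogy concrete, what we have in mind is something along the lines of Ceph's erasure-code profiles; a minimal sketch (the pool and profile names are ours, and we are assuming a release recent enough to have erasure coding):

    # 4 data + 2 coding chunks spread across hosts: any 2 of the
    # 6 workstations can fail without data loss, like RAID 6
    ceph osd erasure-code-profile set raid6like k=4 m=2 crush-failure-domain=host
    ceph osd pool create studentdata 128 128 erasure raid6like

Is that the intended way to get RAID-6-like behaviour, or is plain replication the recommended route on a cluster this small?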
Our main concerns are:
1) Data resilience: It seems that keeping two copies of each object is the default setting; is that correct? And if so, will it stripe parity data among three computers for each block?
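For reference, our current understanding is that the replication level is set per pool, something like the following (the pool name is assumed, and min_size is chosen so I/O can continue even with two hosts down):

    # keep 3 copies of every object so any two hosts can fail
    ceph osd pool set data size 3
    # allow I/O to continue even when only 1 copy remains
    ceph osd pool set data min_size 1

Please correct us if that is not how it works.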
2) Metadata resilience: We have seen that we can now have more than a single metadata server (which was a show-stopper on previous versions). However, do they have to be dedicated boxes, or can they share boxes with the data servers? Can it be configured in such a way that even if two metadata-server computers fail, the whole system's data will still be accessible from the remaining computers, without interruptions? Or do the multiple servers just hold different data, aiming only at performance?
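What we would like to run, sketched as an old-style ceph.conf fragment (hostnames are ours), is an MDS colocated with an OSD on each box, with the extra MDSs acting as standbys:

    ; two of the six workstations, each running both an OSD and an MDS
    [osd.0]
        host = ws1
    [mds.a]
        host = ws1
    [osd.1]
        host = ws2
    [mds.b]
        host = ws2

Is that a supported and sane layout?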
3) Compatibility with other software: We have seen that there is an NFS incompatibility; is that correct? Also, are there any POSIX issues?
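In case it matters, the setup we were hoping for is a kernel CephFS mount re-exported through the standard Linux NFS server, roughly like this (monitor address, key file, and subnet are placeholders):

    # on the gateway box: mount CephFS with the kernel client
    mount -t ceph 192.168.0.1:6789:/ /mnt/ceph -o name=admin,secretfile=/etc/ceph/secret

    # /etc/exports: re-export it; fsid= is needed because CephFS has
    # no stable device number for nfsd to derive one from
    /mnt/ceph 192.168.0.0/24(rw,fsid=20,no_subtree_check)

Is that the kind of combination that is known to be problematic?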
4) No single (or double) point of failure: Every single component has to be able to endure a *double* failure (yes, things can take time to get fixed here). Does Ceph need a single master server for any of its activities? Can it endure a double failure? How long would any sort of failover take to complete, and would users need to wait to regain access?
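Our reading so far is that the monitors are the closest thing to a "master", and that a quorum of floor(n/2)+1 of them must stay alive; if so, we would deploy five across the six workstations (names and addresses below are placeholders):

    ; 5 monitors: quorum is 3, so any 2 boxes can fail
    ; (3 monitors would only survive a single failure)
    [global]
        mon initial members = a, b, c, d, e
        mon host = 192.168.0.1, 192.168.0.2, 192.168.0.3, 192.168.0.4, 192.168.0.5

Is that the right mental model, and is there anything else with a single (or double) point of failure that we should watch for?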
I think that covers our initial questions. Apologies if this happens to be the wrong list for them.
Looking forward to any answers or suggestions,
Regards,
Jones