Hi,
On 10/19/2015 10:34 AM, Shinobu Kinjo wrote:
> What kind of applications are you talking about regarding to applications
> for HPC.
> Are you talking about like netcdf?
> Caching is quite necessary for some applications for computation.
> But it's not always the case.
> It's not quite related to this topic but I'm really interested in your
> thought using Ceph cluster for HPC computation.
Our applications are in the field of bioinformatics. This involves read
mapping, homology search in databases etc.
In almost all cases there's a fixed dataset or database like the human
genome with all read mapping index files (> 20 GB) or the database with
all known protein sequences (> 25 GB). With enough RAM in the cluster
machines most of these datasets can be kept in memory for subsequent
processing runs.
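To illustrate the in-memory point, here is a minimal Python sketch. It assumes the datasets are ordinary files on a mounted file system and that "in memory" means the operating system's page cache. The function name warm_page_cache and the chunk size are made up for illustration, and nothing in it is Ceph-specific.

```python
def warm_page_cache(path, chunk_size=8 * 1024 * 1024):
    """Read a file sequentially and throw the data away.

    The only purpose is to pull the file through the OS page cache so
    that later runs on the same machine may be served from RAM.
    Whether the pages stay cached depends on free memory and on the
    file system client, which this sketch does not control.
    Returns the number of bytes read.
    """
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # empty bytes object means end of file
                break
            total += len(chunk)
    return total
```

Running it once per node before a batch of jobs is one possible way to pay the read cost up front; it is not a description of our actual setup.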
These datasets are updated from time to time, so keeping them on
network storage is simpler than distributing updates to instances on
local hard disks. Local copies would also require intensive interaction
with the queuing system to ensure that one job array operates on a
consistent dataset. It worked fine with NFS-based storage, but NFS
introduces a single point of failure (except for pNFS).
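As a rough sketch of what "consistent dataset" could mean in practice: the Python snippet below computes a cheap fingerprint of a dataset directory from file names, sizes and modification times, without reading the multi-GB files themselves. The name dataset_fingerprint and the idea of comparing it inside each task are assumptions for illustration only, not something our queuing system does.

```python
import hashlib
import os


def dataset_fingerprint(root):
    """Return a hex digest over (relative path, size, mtime) of all files.

    This only inspects metadata, so it is fast even for large index
    files, but it will miss a change that keeps both size and mtime.
    """
    entries = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            st = os.stat(full)
            rel = os.path.relpath(full, root)
            entries.append((rel, st.st_size, st.st_mtime_ns))

    digest = hashlib.sha256()
    # Sort so the result does not depend on directory listing order.
    for rel, size, mtime_ns in sorted(entries):
        digest.update(f"{rel}\0{size}\0{mtime_ns}\n".encode("utf-8"))
    return digest.hexdigest()
```

A job array could record the fingerprint at submission time and have each task compare it before starting. With a single copy on network storage that check is the same for every node, which is the simplification mentioned above.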
Regards,
Burkhard
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com