Newbie questions

pkoelle at gmail.com (pkoelle) · Tue, 04 May 2010 14:25:36 +0200



On 03.05.2010 21:50, Joshua Baker-LePain wrote:
[snip]
> I'm looking at Gluster for 2 purposes:
>
> 1) To host our "database" volume. This volume has copies of several
> protein and gene databases (PDB, UniProt, etc). The databases
> generally consist of tens of thousands of small (a few hundred KB at
> most) files. Users often start array jobs with hundreds or thousands
> of tasks, each task of which accesses many of these files.
In our testing (over GigE) we found Gluster to be rather slow with many
small files. Each open() goes over the network, which effectively kills
read performance (we saw 5-7 MB/sec). We also tried to serve webapps
consisting of many small files, and the startup time was not tolerable.

Of course, you need to test yourself ;)
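
A rough sketch of such a test, assuming Python is available on a client and
the volume is mounted somewhere you can pass as an argument. The function
name and the script layout are made up for illustration, not part of any
Gluster tooling. It opens and fully reads every file under a directory and
reports files/sec and MB/sec:

```python
import os
import sys
import time


def measure_small_file_reads(root):
    """Open and fully read every file under root.

    Returns (file_count, total_bytes, elapsed_seconds).
    """
    n_files = 0
    n_bytes = 0
    start = time.perf_counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # One open() per file: this is the per-file cost in question.
            with open(path, "rb") as fh:
                n_bytes += len(fh.read())
            n_files += 1
    elapsed = time.perf_counter() - start
    return n_files, n_bytes, elapsed


if __name__ == "__main__" and len(sys.argv) > 1:
    files, size, secs = measure_small_file_reads(sys.argv[1])
    secs = max(secs, 1e-9)  # avoid division by zero on tiny trees
    print(f"{files} files, {size / 1e6:.1f} MB in {secs:.2f} s")
    print(f"{files / secs:.0f} files/sec, {size / 1e6 / secs:.2f} MB/sec")
```

Run it once against the Gluster mount and once against a local copy of the
same tree. A second run on the same client may be served from cache, so
compare first runs (or use a fresh client) to get honest numbers.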

hth
  Paul