Hello,
I used GlusterFS 3-4 years ago, and I'm looking at it as a potential solution for creating an HTTP-accessible internal cache of files for a project I'm working on. It doesn't need to scale to a large number of simultaneous users, but relatively consistent sub-second access per file would be important.
I currently have images stored directly on disk, with an Apache service sitting on top to serve the files on demand.
I have a naming structure set up such that the first two characters of a file name determine its folder path: file "abc123.txt" sits at /my/data/a/b/abc123.txt.
This is how I avoid having millions of files in a single directory. The scheme can be extended as needed, of course, and is managed with a simple Apache rewrite rule, roughly like the sketch below.
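For concreteness, the rule is along these lines (simplified; the /files/ URL prefix and the paths are placeholders for the real config):

    RewriteEngine On
    # Map /files/abc123.txt -> /my/data/a/b/abc123.txt
    # $2 = first character, $3 = second character, $1 = full file name
    RewriteRule "^/files/((.)(.).*)$" "/my/data/$2/$3/$1" [L]

(As I understand it, in virtual-host context the substitution is treated as a filesystem path as long as its leading directory exists on disk.)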
If I need to cache and expose millions to billions of relatively small files (average file size ~50 KB), where am I likely to encounter problems with GlusterFS?
Are there block-size issues?
Inode issues?
Obviously raw disk storage is a constraint, but are there any others I should be aware of when planning this service?
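(For rough sizing, taking my own numbers: 10^9 files x 50 KB works out to about 50 TB of raw data before any replication, and at one inode per file that's on the order of 10^9 inodes spread across the backing bricks.)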
Can I point Apache at an NFS-mounted GlusterFS volume and provide the same kind of service I'm providing now?
Is there a better way to do this?
Do I need to do the same kind of file routing I'm doing currently, within Gluster?
That is, will I still need to store data in my Gluster volume at /my/gluster/data/a/b/abc123.txt?
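Concretely, I'm picturing something like this, with "glusterserver" and "imgvol" as placeholder names and assuming the volume's built-in NFS server is enabled:

    # NFS mount (Gluster's built-in NFS server speaks NFSv3 only):
    mount -t nfs -o vers=3,mountproto=tcp glusterserver:/imgvol /my/gluster/data

    # or the native FUSE client instead:
    mount -t glusterfs glusterserver:/imgvol /my/gluster/data

Apache's DocumentRoot (or the rewrite target) would then point at /my/gluster/data, just as it points at /my/data today.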
Thanks!
Josh Harrison