Slow metadata

harry.mangalam at uci.edu (harry mangalam) · Tue, 08 Oct 2013 15:09:45 -0700

This is a widely perceived feature/bug of gluster.  It also affects other 
distributed filesystems, though generally not as much.

We've done two things to address this.  One is a distributed 'du' that is 
clusterfork'ed out to the storage nodes, runs 'du' directly on the bricks, and 
compiles the results.  It runs in real time, so the totals are current as of 
the moment you run it.  If you're interested, let me know and I can provide 
the code.  However, it requires clusterfork and some per-site config, and it 
is specific to 'du', although it could be modified to support other shell 
commands.

Here's the difference in performance on a fairly busy gluster system (4 
storage nodes, 8 volumes, 340TB, 60% used)
=====================
14:54:09 root@hpc-s:/som
1226 $ time du -sh abusch/* 
694M    abusch/MATS-gtf
^C

real    3m58.098s <--- killed after ~4m
user    0m0.033s
sys     0m0.351s

14:58:24 root@hpc-s:/som
1227 $ gfdu abusch/\*

INFO: Corrected gluster starting path: [/som/abusch/*]
About to execute [/root/bin/cf --script --tar=GLSRV "du -s /raid1/som/abusch/*; du -s /raid2/som/abusch/*; "]
Go? [yN]y

INFO: For raw results [cd /root/cf/CF-du--s--raid1-som-abu-14.58.38_2013-10-08]

Size:           File|Dir
693.8203 M      /som/abusch/MATS-gtf
  1.5292 G      /som/abusch/MISO-gffs
764.5117 M      /som/abusch/MISO-gffs-v2
 23.8720 G      /som/abusch/deepSeq
 25.2845 G      /som/abusch/genomes
  5.4239 G      /som/abusch/index
 16.8011 G      /som/abusch/index2
-----------------------------------
 74.3348 G      Total

time was ~4s
=====================
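
For anyone who wants to roll their own, the "compile the results" step is the 
easy part.  What follows is only a rough sketch in Python, not the actual gfdu 
code.  It assumes you have already collected the plain 'du -s' output from 
every brick (via clusterfork, ssh, or anything else), that sizes are in 1 KiB 
blocks, and that stripping a brick root such as /raid1 gives the path as seen 
on the gluster mount.  The names merge_du and human are made up for this 
example.  Also note that summing bricks is only right for a distributed 
volume; on a replicated volume each replica would be counted once per copy.

```python
from collections import defaultdict


def merge_du(outputs, brick_roots):
    """Sum `du -s` sizes (KiB) per path across bricks.

    outputs     -- iterable of raw `du -s` stdout strings, one per brick
    brick_roots -- brick prefixes to strip, e.g. ["/raid1", "/raid2"]
                   (assumption: brick layout mirrors the mount layout)
    """
    totals = defaultdict(int)
    for text in outputs:
        for line in text.splitlines():
            if not line.strip():
                continue
            # du prints "<size><TAB><path>"; split only once so that
            # paths containing spaces survive.
            size_kib, path = line.split(None, 1)
            for root in brick_roots:
                if path == root or path.startswith(root + "/"):
                    path = path[len(root):] or "/"
                    break
            totals[path] += int(size_kib)
    return dict(totals)


def human(kib):
    """Format a KiB count as e.g. '2.0000 M', scaling by 1024."""
    value = float(kib)
    for unit in ("K", "M", "G", "T"):
        if value < 1024 or unit == "T":
            return f"{value:.4f} {unit}"
        value /= 1024


if __name__ == "__main__":
    # Made-up numbers, purely to show the shape of the input.
    brick1 = "1024\t/raid1/som/a\n2048\t/raid1/som/b\n"
    brick2 = "1024\t/raid2/som/a\n"
    merged = merge_du([brick1, brick2], ["/raid1", "/raid2"])
    for path, kib in sorted(merged.items()):
        print(f"{human(kib):>12}      {path}")
```

The speedup comes from each storage node walking its own local disks in 
parallel instead of one client doing every stat() through the gluster mount.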

The other approach is the RobinHood Policy Engine 
<http://sourceforge.net/apps/trac/robinhood>, which runs from cron and recurses 
through your FS.  The scan takes X hours, but it compiles the info into a MySQL 
DB that responds instantly (though it could be slightly out of date).  
Nonetheless, it's a very helpful tool for detecting hotspots and ZOTfiles 
(Zillions Of Tiny files).

We are using it to monitor NFS volumes, Gluster, and Fraunhofer FSs.
It is a very slick system and a student (Adam Brenner) is modifying it to 
generate better stats via the web interface.

See his github and the robinhood trac:

https://github.com/abrenner/robinhood-multifs-web

http://sourceforge.net/apps/trac/robinhood
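
To make "ZOTfiles" concrete, here is a minimal Python sketch of the kind of 
per-directory count such a scan produces.  This is not how RobinHood is 
implemented and it does not touch its database; the function name 
tiny_file_counts and the 4096-byte cutoff are assumptions for illustration.  
The walk itself is exactly the slow metadata crawl discussed above, which is 
why you want to pay for it once from cron and query the saved results 
afterwards.

```python
import os
import stat
from collections import Counter


def tiny_file_counts(root, max_bytes=4096):
    """Count regular files of at most max_bytes in each directory under root."""
    counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                st = os.lstat(full)  # lstat: do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if stat.S_ISREG(st.st_mode) and st.st_size <= max_bytes:
                counts[dirpath] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Print the ten directories holding the most tiny files.
    start = sys.argv[1] if len(sys.argv) > 1 else "."
    for directory, n in tiny_file_counts(start).most_common(10):
        print(f"{n:10d}  {directory}")
```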

On Tuesday, October 08, 2013 09:07:52 AM Anders Salling Andersen wrote:
> Hi all, I have a 50 TB glusterfs replicated setup with many small files. My
> metadata is very slow, e.g. du -sh takes over 24 hours. Is there a way to
> make metadata faster?
> 
> Regards Anders.

---
Harry Mangalam - Research Computing, OIT, Rm 225 MSTB, UC Irvine
[m/c 2225] / 92697 Google Voice Multiplexer: (949) 478-4487
415 South Circle View Dr, Irvine, CA, 92697 [shipping]
MSTB Lat/Long: (33.642025,-117.844414) (paste into Google Maps)
---