This is a widely perceived feature/bug of gluster. It also affects other distributed filesystems, though generally not as much.

We've done 2 things to address this. One is a distributed 'du' that is clusterfork'ed out to the storage nodes and compiles the results. It runs in real time, so the numbers are current as of the moment you run it. If you're interested in it, let me know and I can provide the code. However, it requires clusterfork and some per-site config, and it is specific to 'du', although it could be modified to support other shell commands.

Here's the difference in performance on a fairly busy gluster system (4 storage nodes, 8 volumes, 340TB, 60% used):

=====================
14:54:09 root@hpc-s:/som
1226 $ time du -sh abusch/*
694M    abusch/MATS-gtf
^C
real    3m58.098s   <--- killed after ~4m
user    0m0.033s
sys     0m0.351s

14:58:24 root@hpc-s:/som
1227 $ gfdu abusch/\*
INFO: Corrected gluster starting path: [/som/abusch/*]
About to execute [/root/bin/cf --script --tar=GLSRV "du -s /raid1/som/abusch/* ; du -s /raid2/som/abusch/*; "] Go? [yN]y
INFO: For raw results [cd /root/cf/CF-du--s--raid1-som- abu-14.58.38_2013-10-08]
   Size:      File|Dir
 693.8203 M   /som/abusch/MATS-gtf
   1.5292 G   /som/abusch/MISO-gffs
 764.5117 M   /som/abusch/MISO-gffs-v2
  23.8720 G   /som/abusch/deepSeq
  25.2845 G   /som/abusch/genomes
   5.4239 G   /som/abusch/index
  16.8011 G   /som/abusch/index2
-----------------------------------
  74.3348 G   Total

time was ~4s
=====================

The other approach is the RobinHood Policy Engine <http://sourceforge.net/apps/trac/robinhood>, which runs from cron and recurses through your FS. The scan takes X hours, but it compiles the results into a MySQL DB that responds instantly (though the data could be slightly out of date). Nonetheless, it's a very helpful tool for detecting hotspots and ZOTfiles (Zillions Of Tiny files).

We are using it to monitor NFS volumes, Gluster, and Fraunhofer FSs. It is a very slick system, and a student (Adam Brenner) is modifying it to generate better stats via the web interface.
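Going back to the distributed 'du' for a moment: for anyone who wants the gist of the aggregation step without clusterfork, here is a rough Python sketch. This is NOT the gfdu code. The function name and the idea of passing in the raw per-brick 'du -s' output as strings are my assumptions for illustration. It also assumes du's default "size<TAB>path" output, and that every brick path maps to the gluster path by dropping the brick prefix (/raid1, /raid2 in the run above).

```python
from collections import defaultdict


def merge_brick_du(du_outputs, brick_roots):
    """Sum per-brick `du -s` output into per-path totals (hypothetical helper).

    du_outputs:  iterable of strings, each the raw stdout of a
                 `du -s <brick>/some/path/*` run on one brick.
    brick_roots: brick prefixes to strip, e.g. ["/raid1", "/raid2"].
    Returns {gluster_path: total_size}, in whatever units du reported.
    """
    totals = defaultdict(int)
    for text in du_outputs:
        for line in text.splitlines():
            fields = line.split(None, 1)  # "size<whitespace>path"
            if len(fields) != 2:
                continue  # skip blank or malformed lines
            size_str, path = fields
            # Map the brick-local path back to the gluster path.
            for root in brick_roots:
                if path.startswith(root + "/"):
                    path = path[len(root):]
                    break
            totals[path] += int(size_str)
    return dict(totals)


if __name__ == "__main__":
    # Made-up sizes, only to show the call shape.
    raid1 = "100\t/raid1/som/abusch/index\n200\t/raid1/som/abusch/genomes\n"
    raid2 = "50\t/raid2/som/abusch/index\n"
    for path, size in sorted(merge_brick_du([raid1, raid2], ["/raid1", "/raid2"]).items()):
        print(size, path)
```

Note that this simply adds up what each brick reports. On a replicated volume you would presumably want to query only one brick per replica set, otherwise every copy gets counted.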
For RobinHood, see Adam's github and the robinhood trac:
https://github.com/abrenner/robinhood-multifs-web
http://sourceforge.net/apps/trac/robinhood

On Tuesday, October 08, 2013 09:07:52 AM Anders Salling Andersen wrote:
> Hi all i have a 50tb glusterfs replicated setup, with Many small files. My
> metadata is very slow ex. Du -sh takes over 24 hours. Is there a Way to
> make faster metadata ?
>
> Regards Anders.

---
Harry Mangalam - Research Computing, OIT, Rm 225 MSTB, UC Irvine
[m/c 2225] / 92697 Google Voice Multiplexer: (949) 478-4487
415 South Circle View Dr, Irvine, CA, 92697 [shipping]
MSTB Lat/Long: (33.642025,-117.844414) (paste into Google Maps)
---