Re: Slow performance on samba with small files

Was just reading the small file section of the 3.9 release notes:

http://blog.gluster.org/2016/11/announcing-gluster-3-9/

Setting these options does seem to increase transfer speeds on small files quite a lot:
  # gluster volume set <volname> features.cache-invalidation on
  # gluster volume set <volname> features.cache-invalidation-timeout 600
  # gluster volume set <volname> performance.stat-prefetch on       # This one seemed to have the biggest impact on small-file performance for me
  # gluster volume set <volname> performance.cache-invalidation on
  # gluster volume set <volname> performance.md-cache-timeout 600
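
For anyone who wants to try the same set in one go, here is a minimal sketch that only prints the five commands above for a given volume, so they can be reviewed before being run. The function name md_cache_commands, the VOLNAME variable and the default volume name myvol are illustrative assumptions, not anything from the release notes.

```shell
#!/bin/sh
# Sketch only: print the "gluster volume set" commands listed above.
# md_cache_commands, VOLNAME and "myvol" are hypothetical names.

md_cache_commands() {
    volname=$1
    # Option/value pairs exactly as listed in the post.
    while read -r opt val; do
        echo "gluster volume set $volname $opt $val"
    done <<'EOF'
features.cache-invalidation on
features.cache-invalidation-timeout 600
performance.stat-prefetch on
performance.cache-invalidation on
performance.md-cache-timeout 600
EOF
}

# Dry run: nothing is changed, the commands are only shown.
md_cache_commands "${VOLNAME:-myvol}"
```

To actually apply them, the output could be piped to sh as root on one of the Gluster nodes (md_cache_commands myvol | sh), once the list looks right.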

Setting the following option (only for SMB access), however, makes my client keep losing the state of the server:
  # gluster volume set <volname> performance.cache-samba-metadata on
The shares often disappear or become inaccessible, and I can only get them back by logging off and back on to the machine. This is with the distro Samba 4.4.4.

Has anyone here had the same issue? Does the version of Samba need to be newer to support the feature?

Thanks

Gary Lloyd
________________________________________________
I.T. Systems:Keele University
Finance & IT Directorate
Keele:Staffs:IC1 Building:ST5 5NB:UK
+44 1782 733063
________________________________________________

On 8 February 2017 at 11:49, Дмитрий Глушенок <glush@xxxxxxxxxx> wrote:
For _every_ file copied, Samba performs readdir() to get all entries of the destination folder. The list is then searched for the filename (to prevent name collisions, as SMB shares are not case sensitive). The more files in the folder, the longer readdir() takes. It is a lot worse for Gluster, because the contents of a single folder are distributed among many servers, and Gluster has to join many directory listings (requested via the network) to form one and return it to the caller.
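
To make the cost pattern concrete, here is a rough sketch of "list the whole folder, then search it case-insensitively" done before every file creation. It is not Samba's code, only an emulation with ls and grep on a local temporary directory; the function names name_collides and create_checked are made up for the illustration.

```shell
#!/bin/sh
# Sketch only: emulate a full directory listing plus a case-insensitive
# name search before each create. Function names are hypothetical.

name_collides() {
    # $1 = directory, $2 = candidate filename
    # Whole listing, then a case-insensitive exact-line match.
    ls -A "$1" | grep -qixF -- "$2"
}

create_checked() {
    # $1 = directory, $2 = number of files to create
    dir=$1; count=$2; i=0
    while [ "$i" -lt "$count" ]; do
        name="file_$i.txt"
        # Each create pays for listing everything created so far.
        if ! name_collides "$dir" "$name"; then
            : > "$dir/$name"
        fi
        i=$((i + 1))
    done
}

demo_dir=$(mktemp -d)
create_checked "$demo_dir" 200
# A name differing only in case counts as a collision.
if name_collides "$demo_dir" "FILE_7.TXT"; then
    echo "FILE_7.TXT collides with an existing entry"
fi
rm -rf "$demo_dir"
```

Since each of the N creates scans a listing that grows with N, the total work grows roughly with N squared, and on Gluster every one of those listings has to be assembled over the network.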

Rsync does not perform readdir(); it just checks file existence with stat(), IIRC. And as modern Gluster versions have a default setting to check for a file only at its destination (when the volume is balanced), the check is relatively fast.
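
For contrast, a per-file existence check needs no listing at all. The sketch below uses the shell's [ -e ] test, which looks up a single path; it is an illustration of the idea, not rsync's actual code, and exists_by_lookup is a made-up name.

```shell
#!/bin/sh
# Sketch only: one path lookup per file, no directory listing involved.
# exists_by_lookup is a hypothetical helper name.

exists_by_lookup() {
    # $1 = directory, $2 = filename
    [ -e "$1/$2" ]
}

demo_dir=$(mktemp -d)
: > "$demo_dir/present.txt"
for name in present.txt missing.txt; do
    if exists_by_lookup "$demo_dir" "$name"; then
        echo "$name: exists"
    else
        echo "$name: not found"
    fi
done
rm -rf "$demo_dir"
```

Note that this lookup is case sensitive on a typical Linux filesystem, which is why it cannot by itself replace the case-insensitive collision check described for SMB shares.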

You can hack Samba to prevent such checks if your goal is to make copying less slow (provided you are sure the files you are copying do not exist at the destination). But try running 'ls -l' on a folder with thousands of files that is _not_ cached: it will take tens of seconds. This is time your users will waste browsing shares.
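
A simple way to measure that listing time is sketched below. TARGET_DIR is a hypothetical variable that would point at a folder on the Gluster mount; the helper name time_listing is also made up. It reports whole seconds only, and it cannot tell whether the folder was cached, so the first run after a fresh mount is the interesting one.

```shell
#!/bin/sh
# Sketch only: time an 'ls -l' of one directory, in whole seconds.
# time_listing and TARGET_DIR are hypothetical names.

time_listing() {
    dir=$1
    start=$(date +%s)
    # tail skips the leading "total" line that ls -l prints.
    entries=$(ls -lA "$dir" | tail -n +2 | wc -l | tr -d ' ')
    end=$(date +%s)
    echo "entries=$entries seconds=$((end - start))"
}

# Defaults to the current directory if TARGET_DIR is not set.
time_listing "${TARGET_DIR:-.}"
```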

On 8 Feb 2017, at 13:17, Gary Lloyd <g.lloyd@xxxxxxxxxxx> wrote:

Thanks for the reply

I've just done a bit more testing. If I use rsync from a gluster client to copy the same files to the mount point it only takes a couple of minutes.
For some reason it's very slow on samba though (version 4.4.4).

I have tried various samba tweaks / settings and have yet to get acceptable write speed on small files.



On 8 February 2017 at 10:05, Дмитрий Глушенок <glush@xxxxxxxxxx> wrote:
Hi,

There are a number of tweaks/hacks to make it better, but IMHO overall performance with small files is still unacceptable for folders with thousands of entries.

If your shares are not too large to be placed on a single filesystem and you still want to use Gluster, it is possible to run a VM on top of Gluster. Inside that VM you can create a ZFS/NTFS filesystem to be shared.

On 8 Feb 2017, at 12:10, Gary Lloyd <g.lloyd@xxxxxxxxxxx> wrote:

Hi

I am currently testing Gluster 3.9 replicated/distributed on CentOS 7.3 with Samba/CTDB.
I have been able to get it all up and running, but writing small files is really slow. 

If I copy large files from gluster backed samba I get almost wire speed (We only have 1Gb at the moment). I get around half that speed if I copy large files to the gluster backed samba system, which I am guessing is due to it being replicated (This is acceptable).

Small file write performance seems really poor for us though:
As an example I have an eclipse IDE workspace folder that is 6MB in size that has around 6000 files in it. A lot of these files are <1k in size.

If I copy this up to gluster backed samba it takes almost one hour to get there.
With our basic samba deployment it only takes about 5 minutes.
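
For anyone wanting to reproduce a similar workload without the actual workspace, here is a rough sketch that generates a few thousand files of well under 1k each and times copying them. DEST_DIR and FILE_COUNT are hypothetical variables: DEST_DIR would be a folder on the share under test (it must already exist), and without it the copy simply goes to a local temporary directory.

```shell
#!/bin/sh
# Sketch only: create many tiny files and time a recursive copy.
# make_small_files, timed_copy, DEST_DIR and FILE_COUNT are hypothetical.

make_small_files() {
    # $1 = target directory, $2 = number of files (each well under 1 KB)
    dir=$1; count=$2; i=0
    mkdir -p "$dir"
    while [ "$i" -lt "$count" ]; do
        printf 'sample payload %s\n' "$i" > "$dir/f_$i.txt"
        i=$((i + 1))
    done
}

timed_copy() {
    # $1 = source directory, $2 = existing destination directory
    # Prints the elapsed time in whole seconds.
    start=$(date +%s)
    cp -R "$1/." "$2/"
    end=$(date +%s)
    echo $((end - start))
}

src=$(mktemp -d)
dst=${DEST_DIR:-$(mktemp -d)}
make_small_files "$src" "${FILE_COUNT:-6000}"
echo "copy took $(timed_copy "$src" "$dst") s"
rm -rf "$src"
# Only clean up the destination if it was our own temporary directory.
if [ -z "${DEST_DIR:-}" ]; then rm -rf "$dst"; fi
```

Running it once against the Gluster-backed share and once against the plain Samba share gives a comparable pair of numbers, though a local cp through a mounted share is not identical to a Windows client copy.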

Both systems reside on the same disks/SAN.


I was hoping that we would be able to move away from using a proprietary SAN to house our network shares and use gluster instead.

Does anyone have any suggestions for anything I could tweak to make it better?

Many Thanks


_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-users

--
Dmitry Glushenok
Jet Infosystems


