On February 10, 2020 5:32:29 PM GMT+02:00, Matthias Schniedermeyer <matthias-gluster-users@xxxxxxxxxxxxx> wrote:
>On 10.02.20 16:21, Strahil Nikolov wrote:
>> On February 10, 2020 2:25:17 PM GMT+02:00, Matthias Schniedermeyer <matthias-gluster-users@xxxxxxxxxxxxx> wrote:
>>> Hi
>>>
>>>
>>> I would describe our basic use case for gluster as:
>>> "data-store for a cold-standby application".
>>>
>>> A specific application is installed on 2 hardware machines, the data is
>>> kept in-sync between the 2 machines by a replica-2 gluster volume.
>>> (IOW: "RAID 1")
>>>
>>> At any one time only 1 machine has the volume mounted and the
>>> application running. If the machine goes down the application is started
>>> on the remaining machine.
>>> IOW at any one point in time there is only ever 1 "reader & writer"
>>> running.
>>>
>>> I profiled a performance problem we have with this application, which
>>> unfortunately we can't modify.
>>>
>>> The profile shows many "opendir/readdirp/releasedir" cycles, the
>>> directory in question has about 1000 files and the application "stalls"
>>> for several milliseconds any time it decides to do a readdir.
>>> The volume is mounted via FUSE and it appears that said operation is not
>>> cached at all.
>>>
>>> To provide a test-case i tried to replicate what the application does.
>>> The problematic operation is nearly perfectly emulated just by using
>>> "ls .".
>>>
>>> I created a script that replicates how we use gluster and demonstrates
>>> that a FUSE-mount appears to be lacking any caching of readdir.
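The access pattern described above (listing the same directory of about 1000 files over and over) can be approximated without the real application. What follows is only a minimal sketch, not the attached test script: it assumes that `os.scandir` triggers the same open/read/close directory cycle as `ls .`, and the function name `time_listings` and the temporary demo directory are hypothetical stand-ins for the real directory on the FUSE mount.

```python
import os
import tempfile
import time


def time_listings(path, repeats):
    """List `path` `repeats` times; return (entry count, per-pass seconds)."""
    count = 0
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        # One pass opens the directory, reads every entry, and closes it again.
        with os.scandir(path) as entries:
            count = sum(1 for _ in entries)
        durations.append(time.perf_counter() - start)
    return count, durations


if __name__ == "__main__":
    # Hypothetical stand-in: a local temporary directory with 1000 empty files.
    with tempfile.TemporaryDirectory() as demo_dir:
        for i in range(1000):
            open(os.path.join(demo_dir, f"file-{i:04d}"), "w").close()
        count, durations = time_listings(demo_dir, 10)
        for n, seconds in enumerate(durations, start=1):
            print(f"pass {n}: {count} entries in {seconds * 1e6:.0f} us")
```

Pointed at a directory on the FUSE mount instead of the temporary directory, later passes staying as slow as the first one would be consistent with the listing not being cached.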
>>>
>>> A word about the test-environment:
>>> 2 identical servers
>>> Dual Socket Xeon CPU E5-2640 v3 (8 cores, 2.60GHz, HT enabled)
>>> RAM: 128GB DDR4 ECC (8x16GB)
>>> Storage: 2TB Intel P3520 PCIe-NVMe-SSD
>>> Network: Gluster: 10GB/s direct connect (no switch), external: 1Gbit/s
>>> OS: CentOS 7.7, Installed with "Minimal" ISO, everything: Default
>>> Up2Date as of: 2020-01-21 (Kernel: 3.10.0-1062.9.1.el7.x86_64)
>>> SELinux: Disabled
>>> SSH-Key for 1 -> 2 exchanged
>>> Gluster 6.7 packages installed via 'centos-release-gluster6'
>>>
>>> see attached: gluster-testcase-no-caching-of-dir-operations-for-fuse.sh
>>>
>>> The meat of the testcase is this:
>>> a profile of:
>>> ls .
>>> vs:
>>> ls . . . . . . . . . .
>>> (10 dots)
>>>
>>>> cat /root/profile-1-times | grep DIR | head -n 3
>>>  0.00      0.00 us     0.00 us      0.00 us    1  RELEASEDIR
>>>  0.27     66.79 us    66.79 us     66.79 us    1     OPENDIR
>>> 98.65  12190.30 us  9390.88 us  14989.73 us    2    READDIRP
>>>
>>>> cat /root/profile-10-times | grep DIR | head -n 3
>>>  0.00      0.00 us     0.00 us      0.00 us   10  RELEASEDIR
>>>  0.64    108.02 us    85.72 us    131.96 us   10     OPENDIR
>>> 99.36   8388.64 us  5174.71 us  14808.77 us   20    READDIRP
>>>
>>> This testcase shows perfect scaling.
>>> 10 times the request, results in 10 times the gluster-operations.
>>>
>>> I would say ideally there should be no difference in the number of
>>> gluster-operations, regardless of how often a directory is read in a
>>> short amount of time (with no changes in between)
>>>
>>>
>>> Is there something we can do to enable caching or otherwise improve
>>> performance?
>>
>> Hi Matthias,
>>
>> Have you tried the 'readdir-ahead' option.
>> According to docs it is useful for 'improving sequential directory read performance'.
>> I'm not sure how gluster defines sequential directory read, but it's worth trying.
>
>readdir-ahead is enabled by default. Has been for several years.
>In effect this option changes how many READDIRP OPs are executed for a
>single "ls .".
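The "perfect scaling" observation in the two profiles quoted above can be checked with a little arithmetic. The sketch below copies the six profile lines verbatim and assumes the columns are, in order: percentage of latency, average latency, minimum latency, maximum latency, number of calls, and operation name. That column reading, and the helper names `parse_profile` and `per_listing`, are assumptions made for illustration.

```python
# Lines copied from the two profiles quoted above.
PROFILE_1 = """\
0.00 0.00 us 0.00 us 0.00 us 1 RELEASEDIR
0.27 66.79 us 66.79 us 66.79 us 1 OPENDIR
98.65 12190.30 us 9390.88 us 14989.73 us 2 READDIRP
"""
PROFILE_10 = """\
0.00 0.00 us 0.00 us 0.00 us 10 RELEASEDIR
0.64 108.02 us 85.72 us 131.96 us 10 OPENDIR
99.36 8388.64 us 5174.71 us 14808.77 us 20 READDIRP
"""


def parse_profile(text):
    """Map operation name -> (average latency in us, number of calls)."""
    fops = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 9:
            continue  # skip anything that is not a complete profile line
        # Assumed layout: pct, avg, "us", min, "us", max, "us", calls, name.
        fops[parts[8]] = (float(parts[1]), int(parts[7]))
    return fops


def per_listing(text, listings):
    """Map operation name -> (calls per listing, estimated us per listing)."""
    result = {}
    for fop, (avg_us, calls) in parse_profile(text).items():
        result[fop] = (calls / listings, avg_us * calls / listings)
    return result


if __name__ == "__main__":
    for label, text, listings in (("1x ls", PROFILE_1, 1), ("10x ls", PROFILE_10, 10)):
        for fop, (calls, micros) in per_listing(text, listings).items():
            print(f"{label:7s} {fop:10s} {calls:.1f} calls/ls, ~{micros / 1000:.1f} ms/ls")
```

Under that column reading, both runs come out at 1 OPENDIR, 2 READDIRP and 1 RELEASEDIR per listing, with READDIRP accounting for roughly 24.4 ms in the single run and roughly 16.8 ms per listing in the 10-fold run.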
>(It takes more OPs when readdir-ahead is disabled.)
>
>> Also, you can try metadata caching, as described in:
>> https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.3/html/administration_guide/sect-directory_operations
>> The actual group should contain the following:
>> https://github.com/gluster/glusterfs/blob/master/extras/group-metadata-cache
>
>Metadata-Caching, in general, works:
>e.g. `stat FILE` is cached if executed repeatedly.
>
>AFAICT the big exception to metadata-caching is readdir.

Hi Matthias,

This has now turned into a 'shot in the dark'.

I looked through a nice presentation, and these 2 options caught my attention:

performance.parallel-readdir on
cluster.readdir-hashed on

The presentation can be found at:
https://www.google.com/url?sa=t&source=web&rct=j&url=https://events.static.linuxfound.org/sites/events/files/slides/Gluster_DirPerf_Vault2017_0.pdf&ved=2ahUKEwirh-qBs8fnAhWTTxUIHfn3CWEQFjAAegQIAhAB&usg=AOvVaw1yhHZaWovhYGCexkGaMVQ8&cshid=1581352097024

I hope you find something useful there.

Best Regards,
Strahil Nikolov
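For anyone who wants to try the two options named in this message, the sketch below builds the matching `gluster volume set` command lines and only prints them. The option names are copied verbatim from the message and have not been checked against any particular Gluster release; the volume name `testvol` and the helper `build_set_commands` are hypothetical.

```python
import shlex

# Option names copied verbatim from the message; not verified against any
# particular Gluster release.
CANDIDATE_OPTIONS = [
    ("performance.parallel-readdir", "on"),
    ("cluster.readdir-hashed", "on"),
]


def build_set_commands(volume, options):
    """Build one `gluster volume set` argument list per (option, value) pair."""
    return [["gluster", "volume", "set", volume, key, value] for key, value in options]


if __name__ == "__main__":
    # "testvol" is a hypothetical volume name.
    for command in build_set_commands("testvol", CANDIDATE_OPTIONS):
        # Dry run: print the command instead of executing it.
        print(shlex.join(command))
```

Printing rather than executing keeps the sketch harmless. Changing one option at a time and re-running the profile after each change would show whether the number of READDIRP calls per listing moves at all.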