Hi,
Below is the volume configuration that we used to test the patch.
Volume Name: testvol
Type: Distributed-Replicate
Volume ID: 9aa40921-c5ae-416e-a7d2-0a82d53a8b4d
Status: Started
Snapshot Count: 0
Number of Bricks: 24 x 3 = 72
Transport-type: tcp
Bricks:
Brick1: host003.addr.com:/gluster/brick1/testvol
Brick2: host005.addr.com:/gluster/brick1/testvol
Brick3: host008.addr.com:/gluster/brick1/testvol
Brick4: host011.addr.com:/gluster/brick1/testvol
Brick5: host015.addr.com:/gluster/brick1/testvol
Brick6: host016.addr.com:/gluster/brick1/testvol
Brick7: host003.addr.com:/gluster/brick2/testvol
Brick8: host005.addr.com:/gluster/brick2/testvol
Brick9: host008.addr.com:/gluster/brick2/testvol
Brick10: host011.addr.com:/gluster/brick2/testvol
Brick11: host015.addr.com:/gluster/brick2/testvol
Brick12: host016.addr.com:/gluster/brick2/testvol
Brick13: host003.addr.com:/gluster/brick3/testvol
Brick14: host005.addr.com:/gluster/brick3/testvol
Brick15: host008.addr.com:/gluster/brick3/testvol
Brick16: host011.addr.com:/gluster/brick3/testvol
Brick17: host015.addr.com:/gluster/brick3/testvol
Brick18: host016.addr.com:/gluster/brick3/testvol
Brick19: host003.addr.com:/gluster/brick4/testvol
Brick20: host005.addr.com:/gluster/brick4/testvol
Brick21: host008.addr.com:/gluster/brick4/testvol
Brick22: host011.addr.com:/gluster/brick4/testvol
Brick23: host015.addr.com:/gluster/brick4/testvol
Brick24: host016.addr.com:/gluster/brick4/testvol
Brick25: host003.addr.com:/gluster/brick5/testvol
Brick26: host005.addr.com:/gluster/brick5/testvol
Brick27: host008.addr.com:/gluster/brick5/testvol
Brick28: host011.addr.com:/gluster/brick5/testvol
Brick29: host015.addr.com:/gluster/brick5/testvol
Brick30: host016.addr.com:/gluster/brick5/testvol
Brick31: host003.addr.com:/gluster/brick6/testvol
Brick32: host005.addr.com:/gluster/brick6/testvol
Brick33: host008.addr.com:/gluster/brick6/testvol
Brick34: host011.addr.com:/gluster/brick6/testvol
Brick35: host015.addr.com:/gluster/brick6/testvol
Brick36: host016.addr.com:/gluster/brick6/testvol
Brick37: host003.addr.com:/gluster/brick7/testvol
Brick38: host005.addr.com:/gluster/brick7/testvol
Brick39: host008.addr.com:/gluster/brick7/testvol
Brick40: host011.addr.com:/gluster/brick7/testvol
Brick41: host015.addr.com:/gluster/brick7/testvol
Brick42: host016.addr.com:/gluster/brick7/testvol
Brick43: host003.addr.com:/gluster/brick8/testvol
Brick44: host005.addr.com:/gluster/brick8/testvol
Brick45: host008.addr.com:/gluster/brick8/testvol
Brick46: host011.addr.com:/gluster/brick8/testvol
Brick47: host015.addr.com:/gluster/brick8/testvol
Brick48: host016.addr.com:/gluster/brick8/testvol
Brick49: host003.addr.com:/gluster/brick9/testvol
Brick50: host005.addr.com:/gluster/brick9/testvol
Brick51: host008.addr.com:/gluster/brick9/testvol
Brick52: host011.addr.com:/gluster/brick9/testvol
Brick53: host015.addr.com:/gluster/brick9/testvol
Brick54: host016.addr.com:/gluster/brick9/testvol
Brick55: host003.addr.com:/gluster/brick10/testvol
Brick56: host005.addr.com:/gluster/brick10/testvol
Brick57: host008.addr.com:/gluster/brick10/testvol
Brick58: host011.addr.com:/gluster/brick10/testvol
Brick59: host015.addr.com:/gluster/brick10/testvol
Brick60: host016.addr.com:/gluster/brick10/testvol
Brick61: host003.addr.com:/gluster/brick11/testvol
Brick62: host005.addr.com:/gluster/brick11/testvol
Brick63: host008.addr.com:/gluster/brick11/testvol
Brick64: host011.addr.com:/gluster/brick11/testvol
Brick65: host015.addr.com:/gluster/brick11/testvol
Brick66: host016.addr.com:/gluster/brick11/testvol
Brick67: host003.addr.com:/gluster/brick12/testvol
Brick68: host005.addr.com:/gluster/brick12/testvol
Brick69: host008.addr.com:/gluster/brick12/testvol
Brick70: host011.addr.com:/gluster/brick12/testvol
Brick71: host015.addr.com:/gluster/brick12/testvol
Brick72: host016.addr.com:/gluster/brick12/testvol
Options Reconfigured:
diagnostics.count-fop-hits: on
diagnostics.latency-measurement: on
server.event-threads: 4
client.event-threads: 4
cluster.lookup-optimize: on
network.inode-lru-limit: 200000
performance.md-cache-timeout: 600
performance.cache-invalidation: on
performance.stat-prefetch: on
features.cache-invalidation-timeout: 600
features.cache-invalidation: on
storage.fips-mode-rchecksum: on
transport.address-family: inet
nfs.disable: off
performance.client-io-threads: off
Regarding data size: I untarred linux.tar 600 times. Each linux.tar holds almost 65k files, so the total comes to around 39M files, and the volume size is around 600G (each extracted tar is around 1G).
It would be good if we could run the same test for a longer duration, say around 6000 iterations.
On Tue, Feb 25, 2020 at 1:31 PM Mohit Agrawal <moagrawa@xxxxxxxxxx> wrote:
With these 2 changes, we are getting a good improvement in file creation and a slight improvement in the "ls -l" operation. We are still working to improve it further.
To validate this, we executed the script below from 6 different clients on a 24x3 distributed-replicate environment after enabling the performance-related options:
mkdir /gluster-mount/`hostname`
date;
for i in {1..100}
do
echo "directory $i is created" `date`
mkdir /gluster-mount/`hostname`/dir$i
tar -xvf /root/kernel_src/linux-5.4-rc8.tar.xz -C /gluster-mount/`hostname`/dir$i >/dev/null
done
Without the patch:
tar took almost 36-37 hours
With the patch:
tar takes almost 26 hours
We were getting a similar kind of improvement with the smallfile tool also.

On Tue, Feb 25, 2020 at 1:29 PM Mohit Agrawal <moagrawa@xxxxxxxxxx> wrote:
Hi,
We observed that performance is mainly hurt while .glusterfs holds a huge amount of data. As we know, before executing a fop the POSIX xlator builds an internal path based on the GFID. To validate that path it calls the (l)stat system call, and while .glusterfs is heavily loaded the kernel takes time to look up the inode, which is why performance drops.
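For context, here is a minimal sketch of that per-fop path build and validation; it is illustrative only, not the actual posix xlator code, and validate_gfid_path is a hypothetical helper:

#include <limits.h>
#include <stdio.h>
#include <sys/stat.h>

/* Illustrative only: build the .glusterfs path for a GFID string such
 * as "abcd1234-...". The first two hex characters select the
 * first-level directory, the next two the second level. The lstat()
 * that validates the path is where the kernel's inode lookup time
 * shows up once .glusterfs is huge. */
int validate_gfid_path(const char *brick, const char *gfid)
{
    char path[PATH_MAX];
    struct stat st;

    snprintf(path, sizeof(path), "%s/.glusterfs/%.2s/%.2s/%s",
             brick, gfid, gfid + 2, gfid);
    return lstat(path, &st);
}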
To improve this, we tried two things with this patch (https://review.gluster.org/#/c/glusterfs/+/23783/):
1) To keep the first-level entries always in cache so that inode lookups are faster, we keep the first-level fds (00 to ff, 256 in total) open per brick from the time the brick process starts. Even during cache cleanup the kernel will not evict the inodes pinned by these open fds, and performance improves (see the first sketch below).
2) We tried using "at"-based calls (fstatat, readlinkat, etc.) that take a relative path instead of building and walking the complete path; these calls were also helpful in improving performance (see the second sketch below).
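Here is a rough sketch of the first change; it is not the patch's actual code, and cache_first_level_fds/brick_root are hypothetical names:

#include <fcntl.h>
#include <limits.h>
#include <stdio.h>

static int first_level_fds[256];

/* Illustrative only: at brick start, open all 256 first-level
 * .glusterfs directories (00..ff) and hold the fds for the life of
 * the process, so the kernel keeps their dentries/inodes cached. */
int cache_first_level_fds(const char *brick_root)
{
    char path[PATH_MAX];
    int i;

    for (i = 0; i < 256; i++) {
        snprintf(path, sizeof(path), "%s/.glusterfs/%02x", brick_root, i);
        first_level_fds[i] = open(path, O_RDONLY | O_DIRECTORY);
        if (first_level_fds[i] < 0)
            return -1;
    }
    return 0;
}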
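And a rough sketch of the second change, under the same assumptions: with a first-level fd already open, only the remaining path components have to be resolved, and AT_SYMLINK_NOFOLLOW gives fstatat() the same semantics as lstat():

#include <fcntl.h>
#include <sys/stat.h>

/* Illustrative only: stat a GFID entry relative to a cached
 * first-level fd. 'rel' is the remaining relative path, e.g.
 * "cd/<full-gfid>", so the kernel skips walking the brick and
 * .glusterfs prefix on every fop. */
int stat_gfid_at(int first_level_fd, const char *rel, struct stat *st)
{
    return fstatat(first_level_fd, rel, st, AT_SYMLINK_NOFOLLOW);
}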
Regards,
Mohit Agrawal