Hi Mohit,
Thank you for the update. More inline.
On Wed, May 1, 2019 at 11:45 PM Mohit Agrawal <moagrawa@xxxxxxxxxx> wrote:
Hi Vijay,

I have tried to run the smallfile tool on a volume (12x3), and I have not found any significant performance improvement for smallfile operations. I configured 4 clients and 8 threads to run the operations.
For measuring performance, did you measure both time taken and CPU consumed? O(n) computations are typically CPU expensive, and we might see better results with a hash table when a large number of objects (a few thousand) are present in a single dictionary. If you haven't gathered CPU statistics, please gather those as well for comparison.
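Something along these lines is what I have in mind for the measurement itself (a rough, untested sketch; run_workload() is just a hypothetical placeholder for the smallfile run, everything else is standard POSIX):

/*
 * Minimal sketch: capture both wall-clock time and CPU time around
 * a workload, so list-vs-hash comparisons show CPU cost too.
 */
#include <stdio.h>
#include <time.h>
#include <sys/resource.h>

static double tv2s(struct timeval tv)
{
        return tv.tv_sec + tv.tv_usec / 1e6;
}

int main(void)
{
        struct timespec t0, t1;
        struct rusage ru;

        clock_gettime(CLOCK_MONOTONIC, &t0);
        /* run_workload();  <- hypothetical placeholder for the benchmark */
        clock_gettime(CLOCK_MONOTONIC, &t1);

        getrusage(RUSAGE_SELF, &ru);
        printf("wall=%.3fs user=%.3fs sys=%.3fs\n",
               (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9,
               tv2s(ru.ru_utime), tv2s(ru.ru_stime));
        return 0;
}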
I have generated statedumps and found the below data for dictionaries specific to the gluster processes.

brick
max-pairs-per-dict=50
total-pairs-used=192212171
total-dicts-used=24794349
average-pairs-per-dict=7

glusterd
max-pairs-per-dict=301
total-pairs-used=156677
total-dicts-used=30719
average-pairs-per-dict=5

fuse process
[dict]
max-pairs-per-dict=50
total-pairs-used=88669561
total-dicts-used=12360543
average-pairs-per-dict=7

It seems the dictionary has the most pairs in the glusterd case, and when the number of volumes is high that number can grow further. I think there is no performance regression for the brick and fuse processes. I used a hash_size of 20 for the dictionary. Let me know if you can provide some other test to validate the same.
A few more items to try out:
1. Vary the number of buckets and test (see the sketch after this list).
2. Create about 10000 volumes and measure the performance of a volume info <volname> operation on some random volume.
3. Check the related patch from Facebook and see if we can incorporate any ideas from it.
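For item 1, a throwaway harness along these lines may be enough to show the trend (a sketch only: it uses a stand-in structure and the djb2 string hash, not gluster's dict_t or its hash function, so only the relative numbers across bucket counts are meaningful):

/*
 * Micro-benchmark sketch: time lookups in a chained hash table for
 * different bucket counts.  With nbuckets == 1 every key chains into
 * the same list, which is exactly what hash_size = 1 gives us.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

struct pair {
        char *key;
        struct pair *next;
};

/* djb2 string hash; a stand-in for the sketch, not gluster's hash */
static unsigned long hash_str(const char *s)
{
        unsigned long h = 5381;
        while (*s)
                h = h * 33 + (unsigned char)*s++;
        return h;
}

static double bench(int npairs, int nbuckets)
{
        /* leaks on purpose; this is a throwaway harness */
        struct pair **buckets = calloc(nbuckets, sizeof(*buckets));
        struct timespec t0, t1;
        char key[32];
        int i;

        /* insert npairs unique keys, chaining within each bucket */
        for (i = 0; i < npairs; i++) {
                snprintf(key, sizeof(key), "key-%d", i);
                struct pair *p = malloc(sizeof(*p));
                p->key = strdup(key);
                unsigned long b = hash_str(key) % nbuckets;
                p->next = buckets[b];
                buckets[b] = p;
        }

        /* look every key up once and time it */
        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (i = 0; i < npairs; i++) {
                snprintf(key, sizeof(key), "key-%d", i);
                struct pair *p = buckets[hash_str(key) % nbuckets];
                while (p && strcmp(p->key, key))
                        p = p->next;
        }
        clock_gettime(CLOCK_MONOTONIC, &t1);

        return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
}

int main(void)
{
        int sizes[] = { 1, 20, 128, 1024 };
        int i;

        for (i = 0; i < 4; i++)
                printf("buckets=%4d lookups took %.4fs\n",
                       sizes[i], bench(100000, sizes[i]));
        return 0;
}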
Thanks,
Vijay
Thanks,
Mohit Agrawal

On Tue, Apr 30, 2019 at 2:29 PM Mohit Agrawal <moagrawa@xxxxxxxxxx> wrote:

Thanks, Amar, for sharing the patch. I will test and share the result.

On Tue, Apr 30, 2019 at 2:23 PM Amar Tumballi Suryanarayan <atumball@xxxxxxxxxx> wrote:

Shreyas/Kevin tried to address it some time back via https://bugzilla.redhat.com/show_bug.cgi?id=1428049 (https://review.gluster.org/16830).

I vaguely remember that keeping the hash value at 1 dates from the time when the dictionary itself was sent as the on-wire protocol, and in most other places the number of entries in a dictionary was on average 3. So we felt that saving a bit of memory was the better optimization at that time.

-Amar

On Tue, Apr 30, 2019 at 12:02 PM Mohit Agrawal <moagrawa@xxxxxxxxxx> wrote:

Sure, Vijay, I will try and update.

Regards,
Mohit Agrawal

On Tue, Apr 30, 2019 at 11:44 AM Vijay Bellur <vbellur@xxxxxxxxxx> wrote:

Hi Mohit,

On Mon, Apr 29, 2019 at 7:15 AM Mohit Agrawal <moagrawa@xxxxxxxxxx> wrote:

Hi All,

I was just looking at the dict code, and I have one query about the current dictionary logic. I am not able to understand why we use a hash_size of 1 for a dictionary. IMO, with a hash_size of 1 the dictionary always works like a list, not a hash, so every lookup in the dictionary has O(n) complexity. Before optimizing the code, I just want to know what the exact reason was for defining hash_size as 1.

This is a good question. I looked up the source in gluster's historic repo [1] and hash_size is 1 even there. So, this could have been the case since the first version of the dictionary code.

Would you be able to run some tests with a larger hash_size and share your observations?

Thanks,
Vijay
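P.S. On the mechanics of the experiment: if I remember correctly, libglusterfs already exposes get_new_dict_full(size_hint), with dict_new() simply passing 1, so trying a larger table may not need much new plumbing. Please verify against the current tree; the call below is a hypothetical experiment, not a tested change:

/*
 * Hypothetical experiment, assuming the historic API is unchanged:
 * allocate a wider hash table only where dictionaries grow large
 * (e.g. in glusterd), instead of dict_new()'s default of 1 bucket.
 * Check the ref semantics too: dict_new() refs the dict, and I do
 * not believe get_new_dict_full() historically did.
 */
dict_t *dict = get_new_dict_full (128);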