Can someone help me find documentation for, or explain, what the report below means? I'm concerned that localhost has 1127 failures.

gluster> volume rebalance devstatic status
Node       Rebalanced-files  size    scanned  failures  status     run time in secs
---------  ----------------  ------  -------  --------  ---------  ----------------
localhost  0                 0Bytes  3688880  1127      completed  15788.00
omhq1832   1186              82.9MB  3688880  0         completed  17676.00
omdx1448   0                 0Bytes  3688880  0         completed  16930.00
omdx14f0   0                 0Bytes  3688879  0         completed  16931.00
volume rebalance: devstatic: success:

Khoi

From: gluster-users-request at gluster.org
To: gluster-users at gluster.org
Date: 09/18/2013 06:59 AM
Subject: Gluster-users Digest, Vol 65, Issue 18
Sent by: gluster-users-bounces at gluster.org

Send Gluster-users mailing list submissions to
        gluster-users at gluster.org

To subscribe or unsubscribe via the World Wide Web, visit
        http://supercolony.gluster.org/mailman/listinfo/gluster-users
or, via email, send a message with subject or body 'help' to
        gluster-users-request at gluster.org

You can reach the person managing the list at
        gluster-users-owner at gluster.org

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Gluster-users digest..."

Today's Topics:

   1. Re: gluster volume top issue (Nux!)
   2. Re: gluster volume top issue (Shishir Gowda)
   3. Re: gluster volume top issue (Nux!)
   4. Strange Sync Problem - FS not matching GlusterFS (Cristiano Bianchi)
   5. Re: GlusterFS extended attributes, "system" namespace (Iain Buchanan)
   6. Re: Gluster 3.4 QEMU and Permission Denied Errors (Andrew Niemantsverdriet)
   7. Re: remove-brick question (james.bellinger at icecube.wisc.edu)
   8. Re: [rhos-list] [gluster-swift] Gluster UFO 3.4 swift Multi tenant question (Luis Pabon)
   9. Re: Gluster 3.4 QEMU and Permission Denied Errors (Asias He)
  10. Re: gluster volume top issue (Shishir Gowda)
  11. Re: Gluster samba vfs read performance slow (kane)
  12.
      Re: Gluster samba vfs read performance slow (Anand Avati)
  13. Re: Gluster samba vfs read performance slow (kane)
  14. Mounting same replica-volume on multiple clients. ???? (Bobby Jacob)
  15. Re: Gluster samba vfs read performance slow (Anand Avati)
  16. Re: Mounting same replica-volume on multiple clients. ???? (Daniel Müller)
  17. Re: Gluster samba vfs read performance slow (kane)
  18. Re: Mounting same replica-volume on multiple clients. ???? (Bobby Jacob)
  19. Re: Gluster samba vfs read performance slow (Anand Avati)
  20. Re: Mounting same replica-volume on multiple clients. ???? (Daniel Müller)
  21. Re: Gluster samba vfs read performance slow (kane)
  22. Re: [Gluster-devel] glusterfs-3.4.1qa2 released (Lukáš Bezdička)
  23. Re: Cant see files after network failure (Dragon)
  24. Re: gluster volume top issue (Nux!)
  25. Secure Setup / Separate GlusterFS / Encryption (Michael.OBrien)
  26. Re: Cant see files after network failure (Krishnan Parthasarathi)
  27. Re: Cant see files after network failure (Dragon)

----------------------------------------------------------------------

Message: 1
Date: Tue, 17 Sep 2013 13:01:39 +0100
From: Nux! <nux at li.nux.ro>
To: Gluster Users <gluster-users at gluster.org>
Subject: Re: gluster volume top issue
Message-ID: <3c99760fce535b37e28371a8221e670f at li.nux.ro>
Content-Type: text/plain; charset=UTF-8; format=flowed

On 16.09.2013 11:22, Nux! wrote:
> Hello,
>
> I'm trying to find out the most accessed (read from and/or written
> to) file in a volume and "gluster volume top" does not seem to be
> helping me at all.
> For example the following would only output the list of bricks:
> gluster volume top xenvms write nfs brick localhost:/bricks/xenvms or
> gluster volume top xenvms write
>
> "gluster volume top xenvms open" says 0 fds opened for all bricks.
>
> I'm sure there should be some activity on this volume as I have a
> Xenserver reading & writing at 200-300 Mbps to it over NFS.
>
> Any pointers?

Anyone?
-- Sent from the Delta quadrant using Borg technology! Nux! www.nux.ro ------------------------------ Message: 2 Date: Tue, 17 Sep 2013 08:13:01 -0400 (EDT) From: Shishir Gowda <sgowda at redhat.com> To: Nux! <nux at li.nux.ro> Cc: Gluster Users <gluster-users at gluster.org> Subject: Re: gluster volume top issue Message-ID: <614027677.14062684.1379419981657.JavaMail.root at redhat.com> Content-Type: text/plain; charset=utf-8 Hi Nux, Is only open count being shown as "0", or all stats being shown as "0"? With regards, Shishir ----- Original Message ----- From: "Nux!" <nux at li.nux.ro> To: "Gluster Users" <gluster-users at gluster.org> Sent: Tuesday, September 17, 2013 5:31:39 PM Subject: Re: gluster volume top issue On 16.09.2013 11:22, Nux! wrote: > Hello, > > I'm trying to find out the most accessed (read from and/or written > to) file in a volume and "gluster volume top" does not seem to be > helping me at all. > For example the following would only output the list of bricks: > gluster volume top xenvms write nfs brick localhost:/bricks/xenvms or > gluster volume top xenvms write > > "gluster volume top xenvms open" says 0 fds opened for all bricks. > > I'm sure there should be some activity on this volume as I have a > Xenserver reading & writing at 200-300 Mbps to it over NFS. > > Any pointers? Anyone? -- Sent from the Delta quadrant using Borg technology! Nux! www.nux.ro _______________________________________________ Gluster-users mailing list Gluster-users at gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-users ------------------------------ Message: 3 Date: Tue, 17 Sep 2013 14:16:05 +0100 From: Nux! 
<nux at li.nux.ro>
To: Shishir Gowda <sgowda at redhat.com>
Cc: Gluster Users <gluster-users at gluster.org>
Subject: Re: gluster volume top issue
Message-ID: <10854f971aad0d256b7a8440f6d7b243 at li.nux.ro>
Content-Type: text/plain; charset=UTF-8; format=flowed

On 17.09.2013 13:13, Shishir Gowda wrote:
> Hi Nux,
>
> Is only open count being shown as "0", or all stats being shown as
> "0"?

Hi Shishir,

For all bricks I get:
Current open fds: 0, Max open fds: 0, Max openfd time: N/A

Lucian

--
Sent from the Delta quadrant using Borg technology!

Nux!
www.nux.ro

------------------------------

Message: 4
Date: Tue, 17 Sep 2013 07:42:23 +0100
From: Cristiano Bianchi <c.bianchi at keepthinking.it>
To: gluster-users at gluster.org
Subject: Strange Sync Problem - FS not matching GlusterFS
Message-ID: <CAC=_txqvQfrmJMdZwPL4X15xtBDvcKH1s=dtdQh1bUy5qQikSA at mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Hi all,

we have a strange sync issue. Scenario:

- GlusterFS 3.2.5 with two nodes, N1 and N2, replicating over VLAN
- Both nodes share the /home folder, in both cases mounted on /mnt/glusterfs/
- If I create file in N1 /mnt/glusterfs/user1/test.txt it shows up in /home/user1/test.txt - all normal
- It also shows up on N2 in /mnt/glusterfs/user1/test.txt - but NOT (and this is the odd part) in N2 /home/user1/
- If I do the same starting from N2, creating a file in N2 /mnt/glusterfs/user1/test.txt it all works: the file shows up in N2 /home/user1, N1 /mnt/glusterfs/user1 and N1 /home/user1

My questions are (if they can be answered based on the info provided):

- What could have gone wrong and how to fix it?
- How do I re-sync the /home folder in N2 to match the content of glusterfs - which is correct in the virtual FS N2 /mnt/glusterfs

It seems that N2 has lost the wires between the glusterfs db and the 'real world' of the filesystem.
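
Before attempting any re-sync, it may help to list exactly which paths N2's /home is missing relative to the mounted view. The following is only a rough Python sketch under the assumptions stated in the scenario (the GlusterFS mount is /mnt/glusterfs and the directory behind it is /home); the function name missing_on_brick is made up for illustration. It only reads both trees and reports differences; it does not copy or repair anything.

```python
import os
from typing import List


def missing_on_brick(mount_root: str, brick_root: str) -> List[str]:
    """Return paths (relative to mount_root) that do not exist under brick_root.

    mount_root is assumed to be the GlusterFS mount (/mnt/glusterfs in the
    scenario) and brick_root the local directory behind it (/home).
    """
    missing = []
    for dirpath, dirnames, filenames in os.walk(mount_root):
        rel_dir = os.path.relpath(dirpath, mount_root)
        for name in dirnames + filenames:
            # normpath turns "./user1" into "user1" for entries at the top level
            rel = os.path.normpath(os.path.join(rel_dir, name))
            # lexists also counts dangling symlinks as present
            if not os.path.lexists(os.path.join(brick_root, rel)):
                missing.append(rel)
    return sorted(missing)


if __name__ == "__main__":
    # Paths taken from the scenario; adjust them if your layout differs.
    for rel in missing_on_brick("/mnt/glusterfs", "/home"):
        print(rel)
```

Run on N2, the output would be the files and directories visible through the mount but absent from /home, which gives a concrete list to attach to a follow-up on this thread.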
With many thanks, Cristiano -- Cristiano Bianchi *Keepthinking* 43 Clerkenwell Road London EC1M 5RS tel +44 20 7490 5337 mobile +44 7939 041169 (UK) c.bianchi at keepthinking.it www.keepthinking.it --- Registration no. 04905582 VAT 831 1329 62 -------------- next part -------------- An HTML attachment was scrubbed... URL: < http://supercolony.gluster.org/pipermail/gluster-users/attachments/20130917/b0bbb260/attachment-0001.html > ------------------------------ Message: 5 Date: Tue, 17 Sep 2013 14:36:56 +0100 From: Iain Buchanan <iainbuc at gmail.com> To: Venky Shankar <yknev.shankar at gmail.com> Cc: gluster-users <gluster-users at gluster.org> Subject: Re: GlusterFS extended attributes, "system" namespace Message-ID: <876CC0EB-2E5F-4827-B3ED-8F37E92BF108 at gmail.com> Content-Type: text/plain; charset="windows-1252" Thanks - sorry for the delay responding. I switched it to running as super-user and that has fixed the problem. I'll have to investigate mount-broker in future. Iain On 5 Sep 2013, at 17:30, Venky Shankar <yknev.shankar at gmail.com> wrote: > 'system' namespace is flipped to 'trusted' for geo-replication auxillary mount. So, it should be left as 'system' in the source. > > I see that you're trying to connect to the remote slave as a non-super user. For that, you'd need to access the slave via mount-broker, which would require some modification in the glusterd volfile. > > Thanks, > -venky > > > On Thu, Sep 5, 2013 at 10:48 AM, Amar Tumballi <amarts at redhat.com> wrote: > On 08/28/2013 11:15 PM, Iain Buchanan wrote: > Hi, > > I'm running GlusterFS 3.3.2 and I'm having trouble getting geo-replication to work. I think it is a problem with extended attributes. I'll using ssh with a normal user to perform the replication. > > On the server log in /var/log/glusterfs/geo-replication/VOLNAME/ssh?.log I'm getting an error "ReceClient: call ?:?:? (xtime) failed on peer with OSError". 
On the replication target I'm getting the same error, but with a stack trace leading back to where it tries to set extended attributes in the Python replication code. It appears to be trying to get the attribute "system.glusterfs.xyz.xtime" at line 365 of /usr/lib/glusterfs/glusterfs/python/syncdaemon/resource.py: "Xattr.lgetxattr(path, '.'.join([cls.GX_NSPACE, uuid, 'xtime')], 8))". > I don't know anything about extended attributes, but I can't get anything in the "system" namespace manually, even running as root - e.g. > touch a > getfattr -n system.test a > > The above returns "Operation not supported" rather than "No such attribute". The "user" and "trusted" namespace work fine - this is on ext3 with user_xattr set in the mount options, and also on the server (ext4). > Yes, 'system' is not allowed to be used by a process. > > On the server side I can see files have things set in the "trusted" namespace (e.g. with "getfattr -m - filename"). > > Should the setting of GX_NSPACE set the namespace to be "system" for non-root or should it always be "trusted"? (line 248 in resource.py) If I force it to be "trusted" it seems to get further (I get occasional "Operation not permitted" lines, but I think this is file permission related). > Looks like a bug. Please change 'system' to 'user' in resource.py file, and see if it works. > > Regards, > Amar > > Iain > > > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > http://supercolony.gluster.org/mailman/listinfo/gluster-users > > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > http://supercolony.gluster.org/mailman/listinfo/gluster-users > > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > http://supercolony.gluster.org/mailman/listinfo/gluster-users -------------- next part -------------- An HTML attachment was scrubbed... 
URL: < http://supercolony.gluster.org/pipermail/gluster-users/attachments/20130917/53c6b075/attachment-0001.html > -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 495 bytes Desc: Message signed with OpenPGP using GPGMail URL: < http://supercolony.gluster.org/pipermail/gluster-users/attachments/20130917/53c6b075/attachment-0001.sig > ------------------------------ Message: 6 Date: Tue, 17 Sep 2013 07:59:16 -0600 From: Andrew Niemantsverdriet <andrew at rocky.edu> To: Samuli Heinonen <samppah at neutraali.net> Cc: gluster-users <gluster-users at gluster.org> Subject: Re: Gluster 3.4 QEMU and Permission Denied Errors Message-ID: <CAGn8edbMouCUqi4Rpe=g5-_wPsksU3es8Mf-noB=sRYZxA7t-A at mail.gmail.com> Content-Type: text/plain; charset=ISO-8859-1 Right now I am just using virsh to start the machines, I have also tried using Virtual Machine Manager to start them. I have enabled Gluster mounting from insecure ports, forgot to mention that in my first email. It looks like the disk mounts as it starts to boot but nothing can be written to the disk as it just hangs in an infinite loop. Thanks, _ /-\ ndrew On Tue, Sep 17, 2013 at 1:05 AM, Samuli Heinonen <samppah at neutraali.net> wrote: > Hello Andrew, > > How are you booting/managing VM's? Which user you use to launch them? > > Have you enabled Gluster mounting from insecure ports? It needs two changes. > You have to edit glusterd.vol (in /etc/glusterfs directory) and add line > "option rpc-auth-allow-insecure on". Also you have to set volume option > server.allow-insecure on (ie. gluster volume set volname > server.allow-insecure on). Restart of glusterd and stop and start of the > volume is required for these changes to take effect. > > 16.9.2013 21:38, Andrew Niemantsverdriet kirjoitti: > >> Hey List, >> >> I'm trying to test out using Gluster 3.4 for virtual machine disks. 
My >> enviroment consists of two Fedora 19 hosts with gluster and qemu/kvm >> installed. >> >> I have a single volume on gluster called vmdata that contains my qcow2 >> formated image created like this: >> >> qemu-img create -f qcow2 gluster://localhost/vmdata/test1.qcow 8G >> >> I'm able to boot my created virtual machine but in the logs I see this: >> >> [2013-09-16 15:16:04.471205] E [addr.c:152:gf_auth] 0-auth/addr: >> client is bound to port 46021 which is not privileged >> [2013-09-16 15:16:04.471277] I >> [server-handshake.c:567:server_setvolume] 0-vmdata-server: accepted >> client from >> gluster1.local-1061-2013/09/16-15:16:04:441166-vmdata-client-1-0 >> (version: 3.4.0)[2013-09-16 15:16:04.488000] I >> [server-rpc-fops.c:1572:server_open_cbk] 0-vmdata-server: 18: OPEN >> /test1.qcow (6b63a78b-7d5c-4195-a172-5bb6ed1e7dac) ==> (Permission >> denied) >> >> I have turned off SELinux to be sure that isn't in the way. When I >> look at the permissions on the file using ls -l I see the file is set >> to 600, this doesn't seem right. I tried manually changing the >> permission to 755 as a test and as soon as the machine booted it was >> changed back to 600. >> >> Any hints as to what is going on and how to get the disk functioning? >> The machine will boot but as soon as anything is written to disk it >> will hang forever. >> >> Thanks, >> > -- _ /-\ ndrew Niemantsverdriet Linux System Administrator Academic Computing (406) 238-7360 Rocky Mountain College 1511 Poly Dr. Billings MT, 59102 ------------------------------ Message: 7 Date: Tue, 17 Sep 2013 09:47:14 -0500 From: james.bellinger at icecube.wisc.edu To: "Ravishankar N" <ravishankar at redhat.com> Cc: gluster-users at gluster.org Subject: Re: remove-brick question Message-ID: <e47a5be41156fa52f47e44af55911c81.squirrel at webmail.icecube.wisc.edu> Content-Type: text/plain;charset=iso-8859-1 Thanks for your replies. The vols seem to match the bricks ok. 
FWIW, gfs-node01:/sda is the first brick; perhaps it is getting the lion's share of the pointers? The results of a log search and seem even more confusing. sdb drained rather than sda, but an error in rebalancing shows up in sdd. I include excerpts from scratch-rebalance, ls -l and getfattr, and bricks/sda Does any of this suggest anything? One failure is an apparent duplicate. This seems to refer to the relevant brick, and the date is correct. [2013-09-15 19:10:07.620881] W [client3_1-fops.c:258:client3_1_mknod_cbk] 0-scratch-client-0: remote operation failed: File exists. Path: /nwhitehorn/vetoblast/data/Level2_IC86.2011_data_Run00118906_Qtot1500.h5.out (00000000-0000-0000-0000-000000000000) On the array that actually drained (mostly): [2013-09-15 19:10:19.483040] W [client3_1-fops.c:647:client3_1_unlink_cbk] 0-scratch-client-12: remote operation failed: No such file or directory [2013-09-15 19:10:19.483122] W [client3_1-fops.c:647:client3_1_unlink_cbk] 0-scratch-client-12: remote operation failed: No such file or directory [2013-09-15 19:10:19.494585] W [client3_1-fops.c:258:client3_1_mknod_cbk] 0-scratch-client-12: remote operation failed: File exists. Path: /nwhitehorn/vetoblast/data/Level2_IC86.2011_data_Run00118429_Qtot1500.h5 (00000000-0000-0000-0000-000000000000) [2013-09-15 19:10:19.494701] W [client3_1-fops.c:258:client3_1_mknod_cbk] 0-scratch-client-12: remote operation failed: File exists. Path: /nwhitehorn/vetoblast/data/Level2_IC86.2011_data_Run00118429_Qtot1500.h5 (00000000-0000-0000-0000-000000000000) An example failure where I can trace the files is an apparent duplicate: gfs-node01 # grep -A2 -B2 Level2_IC86.2011_data_Run00118218_Qtot1500.h5 scratch-rebalance.log [2013-09-15 19:10:30.164409] W [client3_1-fops.c:258:client3_1_mknod_cbk] 0-scratch-client-3: remote operation failed: File exists. 
Path: /nwhitehorn/vetoblast/data/Level2a_IC79_data_Run00117874_Qtot1500.h5.out (00000000-0000-0000-0000-000000000000) [2013-09-15 19:10:30.164473] W [client3_1-fops.c:258:client3_1_mknod_cbk] 0-scratch-client-3: remote operation failed: File exists. Path: /nwhitehorn/vetoblast/data/Level2a_IC79_data_Run00117874_Qtot1500.h5.out (00000000-0000-0000-0000-000000000000) [2013-09-15 19:10:30.176606] I [dht-common.c:956:dht_lookup_everywhere_cbk] 0-scratch-dht: deleting stale linkfile /nwhitehorn/vetoblast/data/Level2_IC86.2011_data_Run00118218_Qtot1500.h5 on scratch-client-2 [2013-09-15 19:10:30.176717] I [dht-common.c:956:dht_lookup_everywhere_cbk] 0-scratch-dht: deleting stale linkfile /nwhitehorn/vetoblast/data/Level2_IC86.2011_data_Run00118218_Qtot1500.h5 on scratch-client-2 [2013-09-15 19:10:30.176856] I [dht-common.c:956:dht_lookup_everywhere_cbk] 0-scratch-dht: deleting stale linkfile /nwhitehorn/vetoblast/data/Level2_IC86.2011_data_Run00118218_Qtot1500.h5 on scratch-client-2 [2013-09-15 19:10:30.177232] W [client3_1-fops.c:647:client3_1_unlink_cbk] 0-scratch-client-2: remote operation failed: No such file or directory [2013-09-15 19:10:30.177303] W [client3_1-fops.c:647:client3_1_unlink_cbk] 0-scratch-client-2: remote operation failed: No such file or directory [2013-09-15 19:10:30.178101] W [client3_1-fops.c:258:client3_1_mknod_cbk] 0-scratch-client-3: remote operation failed: File exists. Path: /nwhitehorn/vetoblast/data/Level2_IC86.2011_data_Run00118218_Qtot1500.h5 (00000000-0000-0000-0000-000000000000) [2013-09-15 19:10:30.178150] W [client3_1-fops.c:258:client3_1_mknod_cbk] 0-scratch-client-3: remote operation failed: File exists. Path: /nwhitehorn/vetoblast/data/Level2_IC86.2011_data_Run00118218_Qtot1500.h5 (00000000-0000-0000-0000-000000000000) [2013-09-15 19:10:30.192605] W [client3_1-fops.c:2566:client3_1_opendir_cbk] 0-scratch-client-7: remote operation failed: No such file or directory. 
Path: /nwhitehorn/vetoblast/data (00000000-0000-0000-0000-000000000000) [2013-09-15 19:10:30.192830] W [client3_1-fops.c:2566:client3_1_opendir_cbk] 0-scratch-client-7: remote operation failed: No such file or directory. Path: /nwhitehorn/vetoblast/data (00000000-0000-0000-0000-000000000000) gfs-node01 # ls -l /sdd/nwhitehorn/vetoblast/data/Level2_IC86.2011_data_Run00118218_Qtot1500.h5 ---------T 2 34037 40978 0 Sep 15 14:10 /sdd/nwhitehorn/vetoblast/data/Level2_IC86.2011_data_Run00118218_Qtot1500.h5 gfs-node01 # ssh i3admin at gfs-node06 sudo ls -l /sdb/nwhitehorn/vetoblast/data/Level2_IC86.2011_data_Run00118218_Qtot1500.h5 -rw-r--r-- 2 34037 40978 715359 May 1 22:28 /sdb/nwhitehorn/vetoblast/data/Level2_IC86.2011_data_Run00118218_Qtot1500.h5 gfs-node01 # getfattr -d -m . -e hex /sdd/nwhitehorn/vetoblast/data/Level2_IC86.2011_data_Run00118218_Qtot1500.h5 # file: sdd/nwhitehorn/vetoblast/data/Level2_IC86.2011_data_Run00118218_Qtot1500.h5 trusted.gfid=0x11fb3ffd87be4ce3a88576466279819f trusted.glusterfs.dht.linkto=0x736372617463682d636c69656e742d313200 gfs-node01 # ssh i3admin at gfs-node06 sudo getfattr -d -m . -e hex /sdb/nwhitehorn/vetoblast/data/Level2_IC86.2011_data_Run00118218_Qtot1500.h5 # file: sdb/nwhitehorn/vetoblast/data/Level2_IC86.2011_data_Run00118218_Qtot1500.h5 trusted.gfid=0x11fb3ffd87be4ce3a88576466279819f Further. # getfattr -d -m . -e hex /sdd # file: sdd trusted.gfid=0x00000000000000000000000000000001 trusted.glusterfs.dht=0x0000000100000000bffffffdd5555551 trusted.glusterfs.volume-id=0xde1fbb473e5a45dc8df804f7f73a3ecc gfs-node01 # getfattr -d -m . -e hex /sdc # file: sdc trusted.gfid=0x00000000000000000000000000000001 trusted.glusterfs.dht=0x0000000100000000aaaaaaa8bffffffc trusted.glusterfs.volume-id=0xde1fbb473e5a45dc8df804f7f73a3ecc gfs-node01 # getfattr -d -m . 
-e hex /sdb # file: sdb trusted.gfid=0x00000000000000000000000000000001 trusted.glusterfs.dht=0x00000001000000000000000000000000 trusted.glusterfs.volume-id=0xde1fbb473e5a45dc8df804f7f73a3ecc gfs-node01 # getfattr -d -m . -e hex /sda # file: sda trusted.gfid=0x00000000000000000000000000000001 trusted.glusterfs.dht=0x0000000100000000555555546aaaaaa8 trusted.glusterfs.volume-id=0xde1fbb473e5a45dc8df804f7f73a3ecc gfs-node01 # ssh i3admin at gfs-node04 sudo getfattr -d -m . -e hex /sdb # file: sdb trusted.gfid=0x00000000000000000000000000000001 trusted.glusterfs.dht=0x00000001000000002aaaaaaa3ffffffe trusted.glusterfs.volume-id=0xde1fbb473e5a45dc8df804f7f73a3ecc bricks/sda etc logs have a rather monotonous [2013-09-16 22:23:01.723146] I [server-handshake.c:571:server_setvolume] 0-scratch-server: accepted client from node086-11928-2013/09/16-22:22:57:696729-scratch-client-0-0 (version: 3.3.2) [2013-09-16 22:23:01.769154] I [server.c:703:server_rpc_notify] 0-scratch-server: disconnecting connectionfrom node086-11928-2013/09/16-22:22:57:696729-scratch-client-0-0 [2013-09-16 22:23:01.769211] I [server-helpers.c:741:server_connection_put] 0-scratch-server: Shutting down connection node086-11928-2013/09/16-22:22:57:696729-scratch-client-0-0 [2013-09-16 22:23:01.769253] I [server-helpers.c:629:server_connection_destroy] 0-scratch-server: destroyed connection of node086-11928-2013/09/16-22:22:57:696729-scratch-client-0-0 > On 09/17/2013 03:26 AM, james.bellinger at icecube.wisc.edu wrote: >> I inherited a system with a wide mix of array sizes (no replication) in >> 3.2.2, and wanted to drain data from a failing array. 
>>
>> I upgraded to 3.3.2, and began a
>> gluster volume remove-brick scratch "gfs-node01:/sda" start
>>
>> After some time I got this:
>> gluster volume remove-brick scratch "gfs-node01:/sda" status
>> Node        Rebalanced-files  size    scanned  failures  status
>> ----------  ----------------  ------  -------  --------  -----------
>> localhost   0                 0Bytes  0        0         not started
>> gfs-node06  0                 0Bytes  0        0         not started
>> gfs-node03  0                 0Bytes  0        0         not started
>> gfs-node05  0                 0Bytes  0        0         not started
>> gfs-node01  2257394624        2.8TB   5161640  208878    completed
>>
>> Two things jump instantly to mind:
>> 1) The number of failures is rather large
> Can you see the rebalance logs (/var/log/scratch-rebalance.log) to
> figure out what the error messages are?
>> 2) A _different_ disk seems to have been _partially_ drained.
>> /dev/sda  2.8T  2.7T  12G   100%  /sda
>> /dev/sdb  2.8T  769G  2.0T  28%   /sdb
>> /dev/sdc  2.8T  2.1T  698G  75%   /sdc
>> /dev/sdd  2.8T  2.2T  589G  79%   /sdd
>>
>>
> I know this sounds silly, but just to be sure, is /dev/sda actually
> mounted on "gfs-node01:sda"?
> If yes, the files that _were_ successfully rebalanced should have been
> moved from gfs-node01:sda to one of the other bricks. Is that the case?
>
>> When I mount the system it is read-only (another problem I want to fix
> Again, the mount logs could shed some information ..
> (btw a successful rebalance start/status sequence should be followed by
> the rebalance 'commit' command to ensure the volume information gets
> updated)
>
>> ASAP) so I'm pretty sure the failures aren't due to users changing the
>> system underneath me.
>>
>> Thanks for any pointers.
>>
>> James Bellinger
>>
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> http://supercolony.gluster.org/mailman/listinfo/gluster-users
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://supercolony.gluster.org/mailman/listinfo/gluster-users
>

------------------------------

Message: 8
Date: Tue, 17 Sep 2013 13:52:12 -0400
From: Luis Pabon <lpabon at redhat.com>
To: Paul Robert Marino <prmarino1 at gmail.com>
Cc: "rhos-list at redhat.com" <rhos-list at redhat.com>, Ramana Raja <rraja at redhat.com>, gluster-users at gluster.org, Chetan Risbud <crisbud at redhat.com>
Subject: Re: [rhos-list] [gluster-swift] Gluster UFO 3.4 swift Multi tenant question
Message-ID: <523896CC.70105 at redhat.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

On 09/17/2013 11:13 AM, Paul Robert Marino wrote:
> Luis
> well that's interesting because it was my impression that Gluster UFO
> 3.4 was based on the Grizzly version of Swift.
[LP] Sorry, the gluster-ufo RPM is Essex only.

> Also I was previously unaware of this new rpm which doesn't seem to be
> in a repo anywhere.
[LP] gluster-swift project RPMs have been submitted to Fedora and are
currently being reviewed.

> also there is a line in this new howto that is extremely unclear
>
> "
> /usr/bin/gluster-swift-gen-builders test
> "
> in place of "test" what should go there is it the tenant ID string,
> the tenant name, or just a generic volume you can name whatever you
> want?
> in other words how should the Gluster volumes be named?
[LP] We will clarify that in the quick start guide. Thank you for
pointing it out. While we update the community site, please refer to
the documentation available here http://goo.gl/bQFI8o for a usage guide.

As for the tool, the format is:
gluster-swift-gen-builders [VOLUME] [VOLUME...]

Where VOLUME is the name of the GlusterFS volume to use for object storage.
For example, if the following two GlusterFS volumes, volume1 and volume2,
need to be accessed over Swift, then you can type the following:

# gluster-swift-gen-builders volume1 volume2

For more information please read: http://goo.gl/gd8LkW

Let us know if you have any more questions or comments.

- Luis
>
>
> On Tue, Sep 17, 2013 at 10:10 AM, Luis Pabon <lpabon at redhat.com> wrote:
>> First thing I can see is that you have Essex based gluster-ufo-* which has
>> been replaced by the gluster-swift project. We are currently in the process of
>> replacing the gluster-ufo-* with RPMs from the gluster-swift project in
>> Fedora.
>>
>> Please check out the following quickstart guide which shows how to download
>> the Grizzly version of gluster-swift:
>> https://github.com/gluster/gluster-swift/blob/master/doc/markdown/quick_start_guide.md
>> .
>>
>> For more information please visit: https://launchpad.net/gluster-swift
>>
>> - Luis
>>
>>
>> On 09/16/2013 05:02 PM, Paul Robert Marino wrote:
>>
>> Sorry for the delay on reporting the details. I got temporarily pulled
>> off the project and dedicated to a different project which was
>> considered higher priority by my employer. I'm just getting back to
>> doing my normal work today.
>> >> first here are the rpms I have installed >> " >> rpm -qa |grep -P -i '(gluster|swift)' >> glusterfs-libs-3.4.0-8.el6.x86_64 >> glusterfs-server-3.4.0-8.el6.x86_64 >> openstack-swift-plugin-swift3-1.0.0-0.20120711git.el6.noarch >> openstack-swift-proxy-1.8.0-2.el6.noarch >> glusterfs-3.4.0-8.el6.x86_64 >> glusterfs-cli-3.4.0-8.el6.x86_64 >> glusterfs-geo-replication-3.4.0-8.el6.x86_64 >> glusterfs-api-3.4.0-8.el6.x86_64 >> openstack-swift-1.8.0-2.el6.noarch >> openstack-swift-container-1.8.0-2.el6.noarch >> openstack-swift-object-1.8.0-2.el6.noarch >> glusterfs-fuse-3.4.0-8.el6.x86_64 >> glusterfs-rdma-3.4.0-8.el6.x86_64 >> openstack-swift-account-1.8.0-2.el6.noarch >> glusterfs-ufo-3.4.0-8.el6.noarch >> glusterfs-vim-3.2.7-1.el6.x86_64 >> python-swiftclient-1.4.0-1.el6.noarch >> >> here are some key config files note I've changed the passwords I'm >> using and hostnames >> " >> cat /etc/swift/account-server.conf >> [DEFAULT] >> mount_check = true >> bind_port = 6012 >> user = root >> log_facility = LOG_LOCAL2 >> devices = /swift/tenants/ >> >> [pipeline:main] >> pipeline = account-server >> >> [app:account-server] >> use = egg:gluster_swift_ufo#account >> log_name = account-server >> log_level = DEBUG >> log_requests = true >> >> [account-replicator] >> vm_test_mode = yes >> >> [account-auditor] >> >> [account-reaper] >> >> " >> >> " >> cat /etc/swift/container-server.conf >> [DEFAULT] >> devices = /swift/tenants/ >> mount_check = true >> bind_port = 6011 >> user = root >> log_facility = LOG_LOCAL2 >> >> [pipeline:main] >> pipeline = container-server >> >> [app:container-server] >> use = egg:gluster_swift_ufo#container >> >> [container-replicator] >> vm_test_mode = yes >> >> [container-updater] >> >> [container-auditor] >> >> [container-sync] >> " >> >> " >> cat /etc/swift/object-server.conf >> [DEFAULT] >> mount_check = true >> bind_port = 6010 >> user = root >> log_facility = LOG_LOCAL2 >> devices = /swift/tenants/ >> >> [pipeline:main] >> pipeline = 
object-server >> >> [app:object-server] >> use = egg:gluster_swift_ufo#object >> >> [object-replicator] >> vm_test_mode = yes >> >> [object-updater] >> >> [object-auditor] >> " >> >> " >> cat /etc/swift/proxy-server.conf >> [DEFAULT] >> bind_port = 8080 >> user = root >> log_facility = LOG_LOCAL1 >> log_name = swift >> log_level = DEBUG >> log_headers = True >> >> [pipeline:main] >> pipeline = healthcheck cache authtoken keystone proxy-server >> >> [app:proxy-server] >> use = egg:gluster_swift_ufo#proxy >> allow_account_management = true >> account_autocreate = true >> >> [filter:tempauth] >> use = egg:swift#tempauth >> # Here you need to add users explicitly. See the OpenStack Swift Deployment >> # Guide for more information. The user and user64 directives take the >> # following form: >> # user_<account>_<username> = <key> [group] [group] [...] [storage_url] >> # user64_<account_b64>_<username_b64> = <key> [group] [group] >> [...] [storage_url] >> # Where you use user64 for accounts and/or usernames that include >> underscores. >> # >> # NOTE (and WARNING): The account name must match the device name specified >> # when generating the account, container, and object build rings. >> # >> # E.g. 
>> # user_ufo0_admin = abc123 .admin >> >> [filter:healthcheck] >> use = egg:swift#healthcheck >> >> [filter:cache] >> use = egg:swift#memcache >> >> >> [filter:keystone] >> use = egg:swift#keystoneauth >> #paste.filter_factory = keystone.middleware.swift_auth:filter_factory >> operator_roles = Member,admin,swiftoperator >> >> >> [filter:authtoken] >> paste.filter_factory = keystone.middleware.auth_token:filter_factory >> auth_host = keystone01.vip.my.net >> auth_port = 35357 >> auth_protocol = http >> admin_user = swift >> admin_password = PASSWORD >> admin_tenant_name = service >> signing_dir = /var/cache/swift >> service_port = 5000 >> service_host = keystone01.vip.my.net >> >> [filter:swiftauth] >> use = egg:keystone#swiftauth >> auth_host = keystone01.vip.my.net >> auth_port = 35357 >> auth_protocol = http >> keystone_url = https://keystone01.vip.my.net:5000/v2.0 >> admin_user = swift >> admin_password = PASSWORD >> admin_tenant_name = service >> signing_dir = /var/cache/swift >> keystone_swift_operator_roles = Member,admin,swiftoperator >> keystone_tenant_user_admin = true >> >> [filter:catch_errors] >> use = egg:swift#catch_errors >> " >> >> " >> cat /etc/swift/swift.conf >> [DEFAULT] >> >> >> [swift-hash] >> # random unique string that can never change (DO NOT LOSE) >> swift_hash_path_suffix = gluster >> #3d60c9458bb77abe >> >> >> # The swift-constraints section sets the basic constraints on data >> # saved in the swift cluster. >> >> [swift-constraints] >> >> # max_file_size is the largest "normal" object that can be saved in >> # the cluster. This is also the limit on the size of each segment of >> # a "large" object when using the large object manifest support. >> # This value is set in bytes. Setting it to lower than 1MiB will cause >> # some tests to fail. It is STRONGLY recommended to leave this value at >> # the default (5 * 2**30 + 2). >> >> # FIXME: Really? Gluster can handle a 2^64 sized file? 
And can the fronting >> # web service handle such a size? I think with UFO, we need to keep with the >> # default size from Swift and encourage users to research what size their >> web >> # services infrastructure can handle. >> >> max_file_size = 18446744073709551616 >> >> >> # max_meta_name_length is the max number of bytes in the utf8 encoding >> # of the name portion of a metadata header. >> >> #max_meta_name_length = 128 >> >> >> # max_meta_value_length is the max number of bytes in the utf8 encoding >> # of a metadata value >> >> #max_meta_value_length = 256 >> >> >> # max_meta_count is the max number of metadata keys that can be stored >> # on a single account, container, or object >> >> #max_meta_count = 90 >> >> >> # max_meta_overall_size is the max number of bytes in the utf8 encoding >> # of the metadata (keys + values) >> >> #max_meta_overall_size = 4096 >> >> >> # max_object_name_length is the max number of bytes in the utf8 encoding of >> an >> # object name: Gluster FS can handle much longer file names, but the length >> # between the slashes of the URL is handled below. Remember that most web >> # clients can't handle anything greater than 2048, and those that do are >> # rather clumsy. >> >> max_object_name_length = 2048 >> >> # max_object_name_component_length (GlusterFS) is the max number of bytes in >> # the utf8 encoding of an object name component (the part between the >> # slashes); this is a limit imposed by the underlying file system (for XFS >> it >> # is 255 bytes). 
>>
>> max_object_name_component_length = 255
>>
>> # container_listing_limit is the default (and max) number of items
>> # returned for a container listing request
>>
>> #container_listing_limit = 10000
>>
>>
>> # account_listing_limit is the default (and max) number of items returned
>> # for an account listing request
>>
>> #account_listing_limit = 10000
>>
>>
>> # max_account_name_length is the max number of bytes in the utf8 encoding of
>> # an account name: Gluster FS Filename limit (XFS limit?), must be the same
>> # size as max_object_name_component_length above.
>>
>> max_account_name_length = 255
>>
>>
>> # max_container_name_length is the max number of bytes in the utf8 encoding
>> # of a container name: Gluster FS Filename limit (XFS limit?), must be the same
>> # size as max_object_name_component_length above.
>>
>> max_container_name_length = 255
>>
>> "
>>
>>
>> The volumes
>> "
>> gluster volume list
>> cindervol
>> unified-storage-vol
>> a07d2f39117c4e5abdeba722cf245828
>> bd74a005f08541b9989e392a689be2fc
>> f6da0a8151ff43b7be10d961a20c94d6
>> "
>>
>> if I run the command
>> "
>> gluster-swift-gen-builders unified-storage-vol
>> a07d2f39117c4e5abdeba722cf245828 bd74a005f08541b9989e392a689be2fc
>> f6da0a8151ff43b7be10d961a20c94d6
>> "
>>
>> because of a change in the script in this version as compared to the
>> version I got from
>> http://repos.fedorapeople.org/repos/kkeithle/glusterfs/ the
>> gluster-swift-gen-builders script only takes the first option and
>> ignores the rest.
>>
>> Other than the location of the config files, none of the changes I've
>> made are functionally different from the ones mentioned in
>> http://www.gluster.org/2012/09/howto-using-ufo-swift-a-quick-and-dirty-setup-guide/
>>
>> The result is that the first volume named "unified-storage-vol" winds
>> up being used for everything regardless of the tenant, and users can
>> see and manage each other's objects regardless of what tenant they are
>> members of.
>> through the swift command or via Horizon.
>>
>> In a way this is a good thing for me: it simplifies things significantly
>> and would be fine if it just created a directory for each tenant and
>> only allowed the user to access the individual directories, not the
>> whole gluster volume.
>> By the way, seeing everything includes the service tenant's data, so
>> unprivileged users can delete glance images without being a member of
>> the service group.
>>
>>
>>
>>
>> On Mon, Sep 2, 2013 at 9:58 PM, Paul Robert Marino <prmarino1 at gmail.com>
>> wrote:
>>
>> Well, I'll give you the full details in the morning, but simply put: I used
>> the stock cluster ring builder script that came with the 3.4 rpms. The old
>> version from 3.3 took the list of volumes and would add all of them; the
>> version with 3.4 only takes the first one.
>>
>> Well, I ran the script expecting the same behavior, but instead they all used
>> the first volume in the list.
>>
>> Now I knew from the docs I read that the per-tenant directories in a single
>> volume were one possible plan for 3.4 to deal with the scaling issue with a
>> large number of tenants, so when I saw the difference in the script and that
>> it worked I just assumed that this was done and I missed something.
>>
>>
>>
>> -- Sent from my HP Pre3
>>
>> ________________________________
>> On Sep 2, 2013 20:55, Ramana Raja <rraja at redhat.com> wrote:
>>
>> Hi Paul,
>>
>> Currently, gluster-swift doesn't support the feature of multiple
>> accounts/tenants accessing the same volume. Each tenant still needs his own
>> gluster volume. So I'm wondering how you were able to observe the reported
>> behaviour.
>>
>> How did you prepare the ringfiles for the different tenants, which use the
>> same gluster volume? Did you change the configuration of the servers? Also,
>> how did you access the files that you mention? It'd be helpful if you could
>> share the commands you used to perform these actions.
>>
>> Thanks,
>>
>> Ram
>>
>>
>> ----- Original Message -----
>> From: "Vijay Bellur" <vbellur at redhat.com>
>> To: "Paul Robert Marino" <prmarino1 at gmail.com>
>> Cc: rhos-list at redhat.com, "Luis Pabon" <lpabon at redhat.com>, "Ramana Raja"
>> <rraja at redhat.com>, "Chetan Risbud" <crisbud at redhat.com>
>> Sent: Monday, September 2, 2013 4:17:51 PM
>> Subject: Re: [rhos-list] Gluster UFO 3.4 swift Multi tenant question
>>
>> On 09/02/2013 01:39 AM, Paul Robert Marino wrote:
>>
>> I have Gluster UFO installed as a back end for swift from here
>> http://download.gluster.org/pub/gluster/glusterfs/3.4/3.4.0/RHEL/epel-6/
>> with RDO 3
>>
>> It's working well except for one thing. All of the tenants are seeing
>> one Gluster volume, which is somewhat nice, especially when compared
>> to the old 3.3 behavior of creating one volume per tenant named after
>> the tenant ID number.
>>
>> The problem is that I expected to see a subdirectory created under the
>> volume root for each tenant, but instead what I'm seeing is that all of
>> the tenants can see the root of the Gluster volume. The result is that
>> all of the tenants can access each other's files and even delete them.
>> Even scarier is that the tenants can see and delete each other's
>> glance images and snapshots.
>>
>> Can anyone suggest options to look at or documents to read to try to
>> figure out how to modify the behavior?
>>
>> Adding gluster swift developers who might be able to help.
>>
>> -Vijay
>>
>>

------------------------------

Message: 9
Date: Wed, 18 Sep 2013 07:39:56 +0800
From: Asias He <asias.hejun at gmail.com>
To: Andrew Niemantsverdriet <andrew at rocky.edu>
Cc: gluster-users <gluster-users at gluster.org>
Subject: Re: Gluster 3.4 QEMU and Permission Denied Errors
Message-ID:
        <CAFO3S41gLLdA_9YqBPLD_TE=2dLaPHP6cw+v08+jt6Px_xdA4w at mail.gmail.com>
Content-Type: text/plain; charset=UTF-8

On Tue, Sep 17, 2013 at 9:59 PM, Andrew Niemantsverdriet
<andrew at rocky.edu> wrote:
> Right now I am just using virsh to start the machines, I have also
> tried using Virtual Machine Manager to start them.

Try 'chown qemu:qemu image_on_gluster.qcow2'. This, along with the
'option rpc-auth-allow-insecure on' and 'gluster volume set <volname>
server.allow-insecure on', should make libvirt+qemu+libgfapi work.

> I have enabled Gluster mounting from insecure ports, forgot to mention
> that in my first email. It looks like the disk mounts as it starts to
> boot but nothing can be written to the disk as it just hangs in an
> infinite loop.
>
> Thanks,
>  _
> /-\ ndrew
>
> On Tue, Sep 17, 2013 at 1:05 AM, Samuli Heinonen <samppah at neutraali.net> wrote:
>> Hello Andrew,
>>
>> How are you booting/managing VMs? Which user do you use to launch them?
>>
>> Have you enabled Gluster mounting from insecure ports? It needs two changes.
>> You have to edit glusterd.vol (in the /etc/glusterfs directory) and add the line
>> "option rpc-auth-allow-insecure on". You also have to set the volume option
>> server.allow-insecure on (i.e. gluster volume set volname
>> server.allow-insecure on). A restart of glusterd and a stop and start of the
>> volume are required for these changes to take effect.
>>
>> 16.9.2013 21:38, Andrew Niemantsverdriet wrote:
>>
>>> Hey List,
>>>
>>> I'm trying to test out using Gluster 3.4 for virtual machine disks. My
>>> environment consists of two Fedora 19 hosts with gluster and qemu/kvm
>>> installed.
>>>
>>> I have a single volume on gluster called vmdata that contains my qcow2
>>> formatted image created like this:
>>>
>>> qemu-img create -f qcow2 gluster://localhost/vmdata/test1.qcow 8G
>>>
>>> I'm able to boot my created virtual machine but in the logs I see this:
>>>
>>> [2013-09-16 15:16:04.471205] E [addr.c:152:gf_auth] 0-auth/addr:
>>> client is bound to port 46021 which is not privileged
>>> [2013-09-16 15:16:04.471277] I
>>> [server-handshake.c:567:server_setvolume] 0-vmdata-server: accepted
>>> client from
>>> gluster1.local-1061-2013/09/16-15:16:04:441166-vmdata-client-1-0
>>> (version: 3.4.0)
>>> [2013-09-16 15:16:04.488000] I
>>> [server-rpc-fops.c:1572:server_open_cbk] 0-vmdata-server: 18: OPEN
>>> /test1.qcow (6b63a78b-7d5c-4195-a172-5bb6ed1e7dac) ==> (Permission
>>> denied)
>>>
>>> I have turned off SELinux to be sure that isn't in the way. When I
>>> look at the permissions on the file using ls -l I see the file is set
>>> to 600, this doesn't seem right. I tried manually changing the
>>> permission to 755 as a test and as soon as the machine booted it was
>>> changed back to 600.
>>>
>>> Any hints as to what is going on and how to get the disk functioning?
>>> The machine will boot but as soon as anything is written to disk it
>>> will hang forever.
>>>
>>> Thanks,
>>>
>>
>
>
>
> --
>  _
> /-\ ndrew Niemantsverdriet
> Linux System Administrator
> Academic Computing
> (406) 238-7360
> Rocky Mountain College
> 1511 Poly Dr.
> Billings MT, 59102
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://supercolony.gluster.org/mailman/listinfo/gluster-users



--
Asias

------------------------------

Message: 10
Date: Tue, 17 Sep 2013 23:39:38 -0400 (EDT)
From: Shishir Gowda <sgowda at redhat.com>
To: Nux! <nux at li.nux.ro>
Cc: Gluster Users <gluster-users at gluster.org>
Subject: Re: gluster volume top issue
Message-ID: <975294514.14460863.1379475578881.JavaMail.root at redhat.com>
Content-Type: text/plain; charset=utf-8

Hi Nux,

I am trying to see if the issue of "0" open fd is based on the work-load, or a bug.

Could you check top command output of "read/write" operation too?

With regards,
Shishir

----- Original Message -----
From: "Nux!" <nux at li.nux.ro>
To: "Shishir Gowda" <sgowda at redhat.com>
Cc: "Gluster Users" <gluster-users at gluster.org>
Sent: Tuesday, September 17, 2013 6:46:05 PM
Subject: Re: gluster volume top issue

On 17.09.2013 13:13, Shishir Gowda wrote:
> Hi Nux,
>
> Is only open count being shown as "0", or all stats being shown as
> "0"?

Hi Shishir,

For all bricks I get:
Current open fds: 0, Max open fds: 0, Max openfd time: N/A

Lucian

--
Sent from the Delta quadrant using Borg technology!

Nux!
www.nux.ro

------------------------------

Message: 11
Date: Wed, 18 Sep 2013 12:02:11 +0800
From: kane <stef_9k at 163.com>
To: Vijay Bellur <vbellur at redhat.com>, Anand Avati <avati at redhat.com>
Cc: gluster-users at gluster.org
Subject: Re: Gluster samba vfs read performance slow
Message-ID: <5F332BE6-BBC4-40CD-88E9-5291F15A39A8 at 163.com>
Content-Type: text/plain; charset=GB2312

Hi Vijay

I used the code in https://github.com/gluster/glusterfs.git with the latest commit:

commit de2a8d303311bd600cb93a775bc79a0edea1ee1a
Author: Anand Avati <avati at redhat.com>
Date:   Tue Sep 17 16:45:03 2013 -0700

    Revert "cluster/distribute: Rebalance should also verify free inodes"

    This reverts commit 215fea41a96479312a5ab8783c13b30ab9fe00fa

    Realized soon after merging, ...

which includes the patch you mentioned last time to improve read performance, written by Anand.

But the read performance was still slow:
write: 500MB/s
read: 77MB/s

while via fuse:
write 800MB/s
read 600MB/s

Any advice?

Thank you.
-Kane

On 2013-9-13, at 10:37 PM, kane <stef_9k at 163.com> wrote:

> Hi Vijay,
> > thank you for post this message, i will try it soon > > -kane > > > > ? 2013-9-13???9:21?Vijay Bellur <vbellur at redhat.com> ??? > >> On 09/13/2013 06:10 PM, kane wrote: >>> Hi >>> >>> We use gluster samba vfs test io,but the read performance via vfs is >>> half of write perfomance, >>> but via fuse the read and write performance is almost the same. >>> >>> this is our smb.conf: >>> [global] >>> workgroup = MYGROUP >>> server string = DCS Samba Server >>> log file = /var/log/samba/log.vfs >>> max log size = 500000 >>> # use sendfile = true >>> aio read size = 262144 >>> aio write size = 262144 >>> aio write behind = true >>> min receivefile size = 262144 >>> write cache size = 268435456 >>> security = user >>> passdb backend = tdbsam >>> load printers = yes >>> cups options = raw >>> read raw = yes >>> write raw = yes >>> max xmit = 262144 >>> socket options = TCP_NODELAY IPTOS_LOWDELAY SO_RCVBUF=262144 >>> SO_SNDBUF=262144 >>> kernel oplocks = no >>> stat cache = no >>> >>> any advises helpful? >>> >> >> This patch has shown improvement in read performance with libgfapi: >> >> http://review.gluster.org/#/c/5897/ >> >> Would it be possible for you to try this patch and check if it improves performance in your case? >> >> -Vijay >> > ------------------------------ Message: 12 Date: Tue, 17 Sep 2013 22:19:23 -0700 From: Anand Avati <avati at gluster.org> To: kane <stef_9k at 163.com> Cc: gluster-users <gluster-users at gluster.org>, Anand Avati <avati at redhat.com> Subject: Re: Gluster samba vfs read performance slow Message-ID: <CAFboF2xxQ8rYqdDSBZtLVbae2OJ1yDZRMT4=AiWD_+2stkiRKA at mail.gmail.com> Content-Type: text/plain; charset="iso-2022-jp" How are you testing this? What tool are you using? 
Avati On Tue, Sep 17, 2013 at 9:02 PM, kane <stef_9k at 163.com> wrote: > Hi Vijay > > I used the code in https://github.com/gluster/glusterfs.git with > the lasted commit: > commit de2a8d303311bd600cb93a775bc79a0edea1ee1a > Author: Anand Avati <avati at redhat.com> > Date: Tue Sep 17 16:45:03 2013 -0700 > > Revert "cluster/distribute: Rebalance should also verify free inodes" > > This reverts commit 215fea41a96479312a5ab8783c13b30ab9fe00fa > > Realized soon after merging, ?. > > which include the patch you mentioned last time improve read perf, written > by Anand. > > but the read perf was still slow: > write: 500MB/s > read: 77MB/s > > while via fuse : > write 800MB/s > read 600MB/s > > any advises? > > > Thank you. > -Kane > > ? 2013-9-13???10:37?kane <stef_9k at 163.com> ??? > > > Hi Vijay? > > > > thank you for post this message, i will try it soon > > > > -kane > > > > > > > > ? 2013-9-13???9:21?Vijay Bellur <vbellur at redhat.com> ??? > > > >> On 09/13/2013 06:10 PM, kane wrote: > >>> Hi > >>> > >>> We use gluster samba vfs test io,but the read performance via vfs is > >>> half of write perfomance, > >>> but via fuse the read and write performance is almost the same. > >>> > >>> this is our smb.conf: > >>> [global] > >>> workgroup = MYGROUP > >>> server string = DCS Samba Server > >>> log file = /var/log/samba/log.vfs > >>> max log size = 500000 > >>> # use sendfile = true > >>> aio read size = 262144 > >>> aio write size = 262144 > >>> aio write behind = true > >>> min receivefile size = 262144 > >>> write cache size = 268435456 > >>> security = user > >>> passdb backend = tdbsam > >>> load printers = yes > >>> cups options = raw > >>> read raw = yes > >>> write raw = yes > >>> max xmit = 262144 > >>> socket options = TCP_NODELAY IPTOS_LOWDELAY SO_RCVBUF=262144 > >>> SO_SNDBUF=262144 > >>> kernel oplocks = no > >>> stat cache = no > >>> > >>> any advises helpful? 
> >>> > >> > >> This patch has shown improvement in read performance with libgfapi: > >> > >> http://review.gluster.org/#/c/5897/ > >> > >> Would it be possible for you to try this patch and check if it improves > performance in your case? > >> > >> -Vijay > >> > > > > > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > http://supercolony.gluster.org/mailman/listinfo/gluster-users > -------------- next part -------------- An HTML attachment was scrubbed... URL: < http://supercolony.gluster.org/pipermail/gluster-users/attachments/20130917/98115dbd/attachment-0001.html > ------------------------------ Message: 13 Date: Wed, 18 Sep 2013 13:34:48 +0800 From: kane <stef_9k at 163.com> To: Anand Avati <avati at gluster.org> Cc: gluster-users <gluster-users at gluster.org>, Anand Avati <avati at redhat.com> Subject: Re: Gluster samba vfs read performance slow Message-ID: <D7FF253D-7D90-417A-9D26-543F93F2250D at 163.com> Content-Type: text/plain; charset="iso-2022-jp" Hi Anand, I use 2 gluster server , this is my volume info: Volume Name: soul Type: Distribute Volume ID: 58f049d0-a38a-4ebe-94c0-086d492bdfa6 Status: Started Number of Bricks: 2 Transport-type: tcp Bricks: Brick1: 192.168.101.133:/dcsdata/d0 Brick2: 192.168.101.134:/dcsdata/d0 each brick use a raid 5 logic disk with 8*2TSATA hdd. 
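
As a rough cross-check of the brick layout described above, the arithmetic can be sketched as follows. This is only an illustration: it assumes "8*2TSATA" means eight 2 TB SATA drives per RAID 5 array, uses the usual RAID 5 rule that one drive's worth of space goes to parity, and treats the capacity of a Distribute volume as the sum of its bricks. The function names are hypothetical, and filesystem overhead is ignored.

```python
def raid5_usable_tb(disks, disk_tb):
    """Usable capacity of one RAID 5 array: one disk's worth holds parity."""
    if disks < 3:
        raise ValueError("RAID 5 needs at least 3 disks")
    return (disks - 1) * disk_tb


def distribute_capacity_tb(brick_sizes_tb):
    """A Distribute volume places whole files on single bricks, so sizes add up."""
    return sum(brick_sizes_tb)


# Assumption: two bricks, each backed by an 8 x 2 TB RAID 5 array.
brick_tb = raid5_usable_tb(disks=8, disk_tb=2.0)
volume_tb = distribute_capacity_tb([brick_tb, brick_tb])
print(f"per brick: {brick_tb} TB, volume 'soul': {volume_tb} TB")
```

Because a Distribute volume keeps each file on exactly one brick, a single large file is served by a single RAID array, which may matter when comparing per-process throughput figures.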

smb.conf:
[gvol]
        comment = For samba export of volume test
        vfs objects = glusterfs
        glusterfs:volfile_server = localhost
        glusterfs:volume = soul
        path = /
        read only = no
        guest ok = yes

This is my testparm result:
[global]
        workgroup = MYGROUP
        server string = DCS Samba Server
        log file = /var/log/samba/log.vfs
        max log size = 500000
        max xmit = 262144
        socket options = TCP_NODELAY IPTOS_LOWDELAY SO_RCVBUF=262144 SO_SNDBUF=262144
        stat cache = No
        kernel oplocks = No
        idmap config * : backend = tdb
        aio read size = 262144
        aio write size = 262144
        aio write behind = true
        cups options = raw

On the client, I mounted the SMB share via CIFS at /mnt/vfs,
then ran iozone in the CIFS mount directory "/mnt/vfs":

$ ./iozone -s 10G -r 128k -i0 -i1 -t 4
        File size set to 10485760 KB
        Record Size 128 KB
        Command line used: ./iozone -s 10G -r 128k -i0 -i1 -t 4
        Output is in Kbytes/sec
        Time Resolution = 0.000001 seconds.
        Processor cache size set to 1024 Kbytes.
        Processor cache line size set to 32 bytes.
        File stride size set to 17 * record size.
        Throughput test with 4 processes
        Each process writes a 10485760 Kbyte file in 128 Kbyte records

        Children see throughput for 4 initial writers =  534315.84 KB/sec
        Parent sees throughput for 4 initial writers  =  519428.83 KB/sec
        Min throughput per process                    =  133154.69 KB/sec
        Max throughput per process                    =  134341.05 KB/sec
        Avg throughput per process                    =  133578.96 KB/sec
        Min xfer                                      = 10391296.00 KB

        Children see throughput for 4 rewriters       =  536634.88 KB/sec
        Parent sees throughput for 4 rewriters        =  522618.54 KB/sec
        Min throughput per process                    =  133408.80 KB/sec
        Max throughput per process                    =  134721.36 KB/sec
        Avg throughput per process                    =  134158.72 KB/sec
        Min xfer                                      = 10384384.00 KB

        Children see throughput for 4 readers         =   77403.54 KB/sec
        Parent sees throughput for 4 readers          =   77402.86 KB/sec
        Min throughput per process                    =   19349.42 KB/sec
        Max throughput per process                    =   19353.42 KB/sec
        Avg throughput per process                    =   19350.88 KB/sec
        Min xfer                                      = 10483712.00 KB

        Children see throughput for 4 re-readers      =   77424.40 KB/sec
        Parent sees throughput for 4 re-readers       =   77423.89 KB/sec
        Min throughput per process                    =   19354.75 KB/sec
        Max throughput per process                    =   19358.50 KB/sec
        Avg throughput per process                    =   19356.10 KB/sec
        Min xfer                                      = 10483840.00 KB

Then I ran the same command in the directory mounted with gluster FUSE:
        File size set to 10485760 KB
        Record Size 128 KB
        Command line used: ./iozone -s 10G -r 128k -i0 -i1 -t 4
        Output is in Kbytes/sec
        Time Resolution = 0.000001 seconds.
        Processor cache size set to 1024 Kbytes.
        Processor cache line size set to 32 bytes.
        File stride size set to 17 * record size.
Throughput test with 4 processes Each process writes a 10485760 Kbyte file in 128 Kbyte records Children see throughput for 4 initial writers = 887534.72 KB/sec Parent sees throughput for 4 initial writers = 848830.39 KB/sec Min throughput per process = 220140.91 KB/sec Max throughput per process = 223690.45 KB/sec Avg throughput per process = 221883.68 KB/sec Min xfer = 10319360.00 KB Children see throughput for 4 rewriters = 892774.92 KB/sec Parent sees throughput for 4 rewriters = 871186.83 KB/sec Min throughput per process = 222326.44 KB/sec Max throughput per process = 223970.17 KB/sec Avg throughput per process = 223193.73 KB/sec Min xfer = 10431360.00 KB Children see throughput for 4 readers = 605889.12 KB/sec Parent sees throughput for 4 readers = 601767.96 KB/sec Min throughput per process = 143133.14 KB/sec Max throughput per process = 159550.88 KB/sec Avg throughput per process = 151472.28 KB/sec Min xfer = 9406848.00 KB it shows much higher perf. any places i did wrong? thank you -Kane ? 2013-9-18???1:19?Anand Avati <avati at gluster.org> ??? > How are you testing this? What tool are you using? > > Avati > > > On Tue, Sep 17, 2013 at 9:02 PM, kane <stef_9k at 163.com> wrote: > Hi Vijay > > I used the code in https://github.com/gluster/glusterfs.git with the lasted commit: > commit de2a8d303311bd600cb93a775bc79a0edea1ee1a > Author: Anand Avati <avati at redhat.com> > Date: Tue Sep 17 16:45:03 2013 -0700 > > Revert "cluster/distribute: Rebalance should also verify free inodes" > > This reverts commit 215fea41a96479312a5ab8783c13b30ab9fe00fa > > Realized soon after merging, ?. > > which include the patch you mentioned last time improve read perf, written by Anand. > > but the read perf was still slow: > write: 500MB/s > read: 77MB/s > > while via fuse : > write 800MB/s > read 600MB/s > > any advises? > > > Thank you. > -Kane > > ? 2013-9-13???10:37?kane <stef_9k at 163.com> ??? > > > Hi Vijay? 
> > > > thank you for post this message, i will try it soon > > > > -kane > > > > > > > > ? 2013-9-13???9:21?Vijay Bellur <vbellur at redhat.com> ??? > > > >> On 09/13/2013 06:10 PM, kane wrote: > >>> Hi > >>> > >>> We use gluster samba vfs test io,but the read performance via vfs is > >>> half of write perfomance, > >>> but via fuse the read and write performance is almost the same. > >>> > >>> this is our smb.conf: > >>> [global] > >>> workgroup = MYGROUP > >>> server string = DCS Samba Server > >>> log file = /var/log/samba/log.vfs > >>> max log size = 500000 > >>> # use sendfile = true > >>> aio read size = 262144 > >>> aio write size = 262144 > >>> aio write behind = true > >>> min receivefile size = 262144 > >>> write cache size = 268435456 > >>> security = user > >>> passdb backend = tdbsam > >>> load printers = yes > >>> cups options = raw > >>> read raw = yes > >>> write raw = yes > >>> max xmit = 262144 > >>> socket options = TCP_NODELAY IPTOS_LOWDELAY SO_RCVBUF=262144 > >>> SO_SNDBUF=262144 > >>> kernel oplocks = no > >>> stat cache = no > >>> > >>> any advises helpful? > >>> > >> > >> This patch has shown improvement in read performance with libgfapi: > >> > >> http://review.gluster.org/#/c/5897/ > >> > >> Would it be possible for you to try this patch and check if it improves performance in your case? > >> > >> -Vijay > >> > > > > > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > http://supercolony.gluster.org/mailman/listinfo/gluster-users > -------------- next part -------------- An HTML attachment was scrubbed... URL: < http://supercolony.gluster.org/pipermail/gluster-users/attachments/20130918/fabbf07e/attachment-0001.html > ------------------------------ Message: 14 Date: Wed, 18 Sep 2013 05:35:32 +0000 From: Bobby Jacob <bobby.jacob at alshaya.com> To: "gluster-users at gluster.org" <gluster-users at gluster.org> Subject: Mounting same replica-volume on multiple clients. ???? 
Message-ID: <AC3305F9C186F849B835A3E6D3C9BEFEAFA70E at KWTPRMBX001.mha.local> Content-Type: text/plain; charset="us-ascii" HI, I have 2 gluster nodes (GFS01/GFS02) each with a single brick (B01/B01). I have created a simple replica volume with these bricks. Bricks : GFS01/B01 and GFS02/B01. Volume: TestVol I have 2 clients (C01/C02) which will mount this "testvol" for simultaneous read/write. The 2 clients run the same application which is load-balanced, so user request are end to both the client servers which reads/writes data to both the same volume. Mounting the volume on C1 : mount -t glusterfs -o backupvolfile-server=GFS02 GFS01:/testvol /data Mounting the volume on C2 : mount -t glusterfs -o backupvolfile-server=GFS01 GFS02:/testvol /data Is this the appropriate way to be followed.? At times, I notice that when I write data through C1-mount point the data is written only to GFS01/B01 and if data is written through C2-mount point the data is written only to GFS02/B01. Please advise. !! Thanks & Regards, Bobby Jacob -------------- next part -------------- An HTML attachment was scrubbed... URL: < http://supercolony.gluster.org/pipermail/gluster-users/attachments/20130918/e3b53134/attachment-0001.html > ------------------------------ Message: 15 Date: Tue, 17 Sep 2013 22:38:14 -0700 From: Anand Avati <avati at redhat.com> To: kane <stef_9k at 163.com> Cc: gluster-users <gluster-users at gluster.org> Subject: Re: Gluster samba vfs read performance slow Message-ID: <52393C46.80503 at redhat.com> Content-Type: text/plain; charset=ISO-2022-JP On 9/17/13 10:34 PM, kane wrote: > Hi Anand, > > I use 2 gluster server , this is my volume info: > Volume Name: soul > Type: Distribute > Volume ID: 58f049d0-a38a-4ebe-94c0-086d492bdfa6 > Status: Started > Number of Bricks: 2 > Transport-type: tcp > Bricks: > Brick1: 192.168.101.133:/dcsdata/d0 > Brick2: 192.168.101.134:/dcsdata/d0 > > each brick use a raid 5 logic disk with 8*2TSATA hdd. 
>> >>
>> >> -Vijay
>> >>
>> >
>>
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> http://supercolony.gluster.org/mailman/listinfo/gluster-users
>>
>
Please add 'kernel oplocks = no' in the [gvol] section and try again.

Avati

------------------------------

Message: 16
Date: Wed, 18 Sep 2013 07:45:39 +0200
From: Daniel Müller <mueller at tropenklinik.de>
To: "'Bobby Jacob'" <bobby.jacob at alshaya.com>, <gluster-users at gluster.org>
Subject: Re: Mounting same replica-volume on multiple clients. ????
Message-ID: <001801ceb432$4f1b0b60$ed512220$@de>
Content-Type: text/plain; charset="iso-8859-1"

Hello,
this is the behavior as if you write directly into the glusterd
directory/partition and not to the remounted replicating bricks!?

EDV Daniel Müller
Head of IT (Leitung EDV)
Tropenklinik Paul-Lechler-Krankenhaus
Paul-Lechler-Str. 24
72076 Tübingen
Tel.: 07071/206-463, Fax: 07071/206-499
eMail: mueller at tropenklinik.de
Internet: www.tropenklinik.de

From: gluster-users-bounces at gluster.org [mailto:gluster-users-bounces at gluster.org] On Behalf Of Bobby Jacob
Sent: Wednesday, 18 September 2013 07:36
To: gluster-users at gluster.org
Subject: Mounting same replica-volume on multiple clients. ????

HI,

I have 2 gluster nodes (GFS01/GFS02) each with a single brick (B01/B01). I
have created a simple replica volume with these bricks.
Bricks: GFS01/B01 and GFS02/B01.
Volume: TestVol

I have 2 clients (C01/C02) which will mount this "testvol" for simultaneous
read/write. The 2 clients run the same application which is load-balanced,
so user requests are sent to both the client servers, which read/write data
to the same volume.

Mounting the volume on C1: mount -t glusterfs -o backupvolfile-server=GFS02 GFS01:/testvol /data
Mounting the volume on C2: mount -t glusterfs -o backupvolfile-server=GFS01 GFS02:/testvol /data

Is this the appropriate way to be followed?

At times, I notice that when I write data through the C1 mount point the data
is written only to GFS01/B01, and if data is written through the C2 mount point
the data is written only to GFS02/B01.

Please advise. !!

Thanks & Regards,
Bobby Jacob

------------------------------

Message: 17
Date: Wed, 18 Sep 2013 13:46:09 +0800
From: kane <stef_9k at 163.com>
To: Anand Avati <avati at redhat.com>
Cc: gluster-users <gluster-users at gluster.org>
Subject: Re: Gluster samba vfs read performance slow
Message-ID: <BDD668E0-EA97-4084-ABBF-5508D3723107 at 163.com>
Content-Type: text/plain; charset=iso-2022-jp

I have already used "kernel oplocks = no" in the smb.conf, next is my original smb.conf file global settings:
[global]
workgroup = MYGROUP
server string = DCS Samba Server
log file = /var/log/samba/log.vfs
max log size = 500000
aio read size = 262144
aio write size = 262144
aio write behind = true
security = user
passdb backend = tdbsam
load printers = yes
cups options = raw
read raw = yes
write raw = yes
max xmit = 262144
socket options = TCP_NODELAY IPTOS_LOWDELAY SO_RCVBUF=262144 SO_SNDBUF=262144
# max protocol = SMB2
kernel oplocks = no
stat cache = no

thank you
-Kane

On 2013-9-18, at 1:38, Anand Avati <avati at redhat.com> wrote:

> On 9/17/13 10:34 PM, kane wrote:
>> Hi Anand,
>>
>> I use 2 gluster servers, this is my volume info:
>> Volume Name: soul
>> Type: Distribute
>> Volume ID: 58f049d0-a38a-4ebe-94c0-086d492bdfa6
>> Status: Started
>> Number of Bricks: 2
>> Transport-type: tcp
>> Bricks:
>> Brick1: 192.168.101.133:/dcsdata/d0
>> Brick2: 192.168.101.134:/dcsdata/d0
>>
>> each brick uses a RAID 5 logical disk with 8*2T SATA hdd.
>>
>> smb.conf:
>> [gvol]
>> comment = For samba export of volume test
>> vfs objects = glusterfs
>> glusterfs:volfile_server = localhost
>> glusterfs:volume = soul
>> path = /
>> read only = no
>> guest ok = yes
>>
>> this is my testparm result:
>> [global]
>> workgroup = MYGROUP
>> server string = DCS Samba Server
>> log file = /var/log/samba/log.vfs
>> max log size = 500000
>> max xmit = 262144
>> socket options = TCP_NODELAY IPTOS_LOWDELAY SO_RCVBUF=262144 SO_SNDBUF=262144
>> stat cache = No
>> kernel oplocks = No
>> idmap config * : backend = tdb
>> aio read size = 262144
>> aio write size = 262144
>> aio write behind = true
>> cups options = raw
>>
>> on the client, mount the smb share with cifs to dir /mnt/vfs,
>> then run iozone in the cifs mount dir "/mnt/vfs":
>> $ ./iozone -s 10G -r 128k -i0 -i1 -t 4
>> File size set to 10485760 KB
>> Record Size 128 KB
>> Command line used: ./iozone -s 10G -r 128k -i0 -i1 -t 4
>> Output is in Kbytes/sec
>> Time Resolution = 0.000001 seconds.
>> Processor cache size set to 1024 Kbytes.
>> Processor cache line size set to 32 bytes.
>> File stride size set to 17 * record size.
>> Throughput test with 4 processes
>> Each process writes a 10485760 Kbyte file in 128 Kbyte records
>>
>> Children see throughput for 4 initial writers = 534315.84 KB/sec
>> Parent sees throughput for 4 initial writers = 519428.83 KB/sec
>> Min throughput per process = 133154.69 KB/sec
>> Max throughput per process = 134341.05 KB/sec
>> Avg throughput per process = 133578.96 KB/sec
>> Min xfer = 10391296.00 KB
>>
>> Children see throughput for 4 rewriters = 536634.88 KB/sec
>> Parent sees throughput for 4 rewriters = 522618.54 KB/sec
>> Min throughput per process = 133408.80 KB/sec
>> Max throughput per process = 134721.36 KB/sec
>> Avg throughput per process = 134158.72 KB/sec
>> Min xfer = 10384384.00 KB
>>
>> Children see throughput for 4 readers = 77403.54 KB/sec
>> Parent sees throughput for 4 readers = 77402.86 KB/sec
>> Min throughput per process = 19349.42 KB/sec
>> Max throughput per process = 19353.42 KB/sec
>> Avg throughput per process = 19350.88 KB/sec
>> Min xfer = 10483712.00 KB
>>
>> Children see throughput for 4 re-readers = 77424.40 KB/sec
>> Parent sees throughput for 4 re-readers = 77423.89 KB/sec
>> Min throughput per process = 19354.75 KB/sec
>> Max throughput per process = 19358.50 KB/sec
>> Avg throughput per process = 19356.10 KB/sec
>> Min xfer = 10483840.00 KB
>>
>> then I used the same command to test in the dir mounted with gluster fuse:
>> File size set to 10485760 KB
>> Record Size 128 KB
>> Command line used: ./iozone -s 10G -r 128k -i0 -i1 -t 4
>> Output is in Kbytes/sec
>> Time Resolution = 0.000001 seconds.
>> Processor cache size set to 1024 Kbytes.
>> Processor cache line size set to 32 bytes.
>> File stride size set to 17 * record size.
>>>>>
>>>>> -Vijay
>>>>>
>>>>
>>>
>>
>
> Please add 'kernel oplocks = no' in the [gvol] section and try again.
>
> Avati
>

------------------------------

Message: 18
Date: Wed, 18 Sep 2013 05:48:29 +0000
From: Bobby Jacob <bobby.jacob at alshaya.com>
To: "mueller at tropenklinik.de" <mueller at tropenklinik.de>, "gluster-users at gluster.org" <gluster-users at gluster.org>
Subject: Re: Mounting same replica-volume on multiple clients. ????
Message-ID: <AC3305F9C186F849B835A3E6D3C9BEFEAFA748 at KWTPRMBX001.mha.local>
Content-Type: text/plain; charset="iso-8859-1"

Exactly. !!
BUT I am writing through the volume mount-point from the clients. !! NOT directly into the bricks. !!

I'm using GlusterFS 3.3.2 with Centos6.4 . !

Thanks & Regards,
Bobby Jacob

-----Original Message-----
From: Daniel Müller [mailto:mueller at tropenklinik.de]
Sent: Wednesday, September 18, 2013 8:46 AM
To: Bobby Jacob; gluster-users at gluster.org
Subject: AW: Mounting same replica-volume on multiple clients. ????

Hello,
this is the behavior as if you write directly into the glusterd
directory/partition and not to the remounted replicating bricks!?

EDV Daniel Müller
Head of IT (Leitung EDV)
Tropenklinik Paul-Lechler-Krankenhaus
Paul-Lechler-Str. 24
72076 Tübingen
Tel.: 07071/206-463, Fax: 07071/206-499
eMail: mueller at tropenklinik.de
Internet: www.tropenklinik.de

------------------------------

Message: 19
Date: Tue, 17 Sep 2013 23:45:43 -0700
From: Anand Avati <avati at gluster.org>
To: kane <stef_9k at 163.com>
Cc: gluster-users <gluster-users at gluster.org>, Anand Avati <avati at redhat.com>
Subject: Re: Gluster samba vfs read performance slow
Message-ID: <CAFboF2yGB6UPN-chHDTGf9HgM_0jbPjWDnUeEiyQp+h9qDTV_w at mail.gmail.com>
Content-Type: text/plain; charset="iso-2022-jp"

Can you get the volume profile dumps for both the runs and compare them?
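For readers following along: profiling is switched on with `gluster volume profile <VOLNAME> start`, and `gluster volume profile <VOLNAME> info` then prints per-brick tables with one row per file operation (FOP). Lining up two such dumps by hand is tedious, so here is a minimal Python sketch of one way to pair them up. It assumes each FOP row looks like `%-latency avg us min us max us calls FOP`. The helper names `parse_fops` and `compare` are made up for illustration and are not part of any Gluster tool.

```python
import re

# One per-FOP row of `gluster volume profile <VOLNAME> info` output, e.g.
#   99.46  193.68 us  89.00 us  2121.00 us  3529  READ
# The exact row layout is an assumption; adjust the pattern if yours differs.
FOP_ROW = re.compile(
    r"(\d+\.\d+)\s+(\d+\.\d+) us\s+(\d+\.\d+) us\s+(\d+\.\d+) us\s+(\d+)\s+([A-Z]+)"
)


def parse_fops(dump):
    """Return {fop_name: (avg_latency_us, calls)} for each FOP row in a dump."""
    return {
        m.group(6): (float(m.group(2)), int(m.group(5)))
        for m in FOP_ROW.finditer(dump)
    }


def compare(dump_a, dump_b):
    """Pair the (avg_latency_us, calls) entries of two dumps by FOP name.

    A FOP that is missing from one dump is reported as None on that side.
    """
    a, b = parse_fops(dump_a), parse_fops(dump_b)
    return {fop: (a.get(fop), b.get(fop)) for fop in sorted(set(a) | set(b))}


if __name__ == "__main__":
    # Sample rows, used here purely to illustrate the call pattern.
    run_a = "100.00 133.51 us 36.00 us 1339.00 us 37619 WRITE"
    run_b = "99.46 193.68 us 89.00 us 2121.00 us 3529 READ"
    for fop, (left, right) in compare(run_a, run_b).items():
        print(fop, left, right)
```

In practice `dump_a` and `dump_b` would be the saved text of the `info` output from the Samba VFS run and the FUSE run; a large gap in average READ latency or in call counts between the two is the kind of difference to look for.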

Avati

On Tue, Sep 17, 2013 at 10:46 PM, kane <stef_9k at 163.com> wrote:

> I have already used "kernel oplocks = no" in the smb.conf, next is my
> original smb.conf file global settings:
> [global]
> workgroup = MYGROUP
> server string = DCS Samba Server
> log file = /var/log/samba/log.vfs
> max log size = 500000
> aio read size = 262144
> aio write size = 262144
> aio write behind = true
> security = user
> passdb backend = tdbsam
> load printers = yes
> cups options = raw
> read raw = yes
> write raw = yes
> max xmit = 262144
> socket options = TCP_NODELAY IPTOS_LOWDELAY SO_RCVBUF=262144 SO_SNDBUF=262144
> # max protocol = SMB2
> kernel oplocks = no
> stat cache = no
>
> thank you
> -Kane
>
> On 2013-9-18, at 1:38, Anand Avati <avati at redhat.com> wrote:
>
> > On 9/17/13 10:34 PM, kane wrote:
> >> Hi Anand,
> >>
> >> I use 2 gluster servers, this is my volume info:
> >> Volume Name: soul
> >> Type: Distribute
> >> Volume ID: 58f049d0-a38a-4ebe-94c0-086d492bdfa6
> >> Status: Started
> >> Number of Bricks: 2
> >> Transport-type: tcp
> >> Bricks:
> >> Brick1: 192.168.101.133:/dcsdata/d0
> >> Brick2: 192.168.101.134:/dcsdata/d0
> >>
> >> each brick uses a RAID 5 logical disk with 8*2T SATA hdd.
> >>>>>
> >>>>> -Vijay
> >>>>>
> >>>>
> >>>
> >>
> >
> > Please add 'kernel oplocks = no' in the [gvol] section and try again.
> >
> > Avati
> >
>

------------------------------

Message: 20
Date: Wed, 18 Sep 2013 09:17:19 +0200
From: Daniel Müller <mueller at tropenklinik.de>
To: "'Bobby Jacob'" <bobby.jacob at alshaya.com>, <gluster-users at gluster.org>
Subject: Re: Mounting same replica-volume on multiple clients. ????
Message-ID: <002e01ceb43f$1c8635a0$5592a0e0$@de>
Content-Type: text/plain; charset="iso-8859-1"

What about gluster volume info on both nodes!?

Ex.:
Volume Name: sambacluster
Type: Replicate
Volume ID: 4fd0da03-8579-47cc-926b-d7577dac56cf
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: s4master:/raid5hs/glusterfs/samba
Brick2: s4slave:/raid5hs/glusterfs/samba
Options Reconfigured:
network.ping-timeout: 5
performance.quick-read: on

What are your log files telling you?

-----------------------------------------------
EDV Daniel Müller
Head of IT (Leitung EDV)
Tropenklinik Paul-Lechler-Krankenhaus
Paul-Lechler-Str. 24
72076 Tübingen
Tel.: 07071/206-463, Fax: 07071/206-499
eMail: mueller at tropenklinik.de
Internet: www.tropenklinik.de
-----------------------------------------------

-----Original Message-----
From: Bobby Jacob [mailto:bobby.jacob at alshaya.com]
Sent: Wednesday, 18 September 2013 07:48
To: mueller at tropenklinik.de; gluster-users at gluster.org
Subject: RE: Mounting same replica-volume on multiple clients. ????

Exactly. !!
BUT I am writing through the volume mount-point from the clients. !! NOT directly into the bricks. !!

I'm using GlusterFS 3.3.2 with Centos6.4 . !

Thanks & Regards,
Bobby Jacob

------------------------------

Message: 21
Date: Wed, 18 Sep 2013 16:27:28 +0800
From: kane <stef_9k at 163.com>
To: Anand Avati <avati at gluster.org>
Cc: gluster-users <gluster-users at gluster.org>, Anand Avati <avati at redhat.com>
Subject: Re: Gluster samba vfs read performance slow
Message-ID: <00411A37-CF9E-4598-8BC3-5A080B0B4766 at 163.com>
Content-Type: text/plain; charset="iso-2022-jp"

I compared the profile dumps taken while the write and the read were running separately;

writing:
------------------------------------------------
Interval 58 Stats:
   Block Size:      65536b+     131072b+
 No. of Reads:            0            0
No. of Writes:        27120        10500

%-latency     Avg-latency     Min-Latency     Max-Latency  No. of calls         Fop
---------     -----------     -----------     -----------  ------------        ----
   100.00       133.51 us        36.00 us      1339.00 us         37619       WRITE

    Duration: 12 seconds
   Data Read: 0 bytes
Data Written: 3153854464 bytes
------------------------------------------------

read:
------------------------------------------------
Interval 63 Stats:
   Block Size:     131072b+
 No. of Reads:         3529
No. of Writes:            0

%-latency     Avg-latency     Min-Latency     Max-Latency  No. of calls         Fop
---------     -----------     -----------     -----------  ------------        ----
     0.54        87.86 us        68.00 us       127.00 us            42       FSTAT
    99.46       193.68 us        89.00 us      2121.00 us          3529        READ

    Duration: 12 seconds
   Data Read: 462553088 bytes
Data Written: 0 bytes
------------------------------------------------

two server brick avg dumps:
================================
Brick: 192.168.101.133:/dcsdata/d0
----------------------------------
Cumulative Stats:
   Block Size:       8192b+      16384b+      32768b+
 No. of Reads:            0            0            0
No. of Writes:            2            1            1

   Block Size:      65536b+     131072b+     262144b+
 No. of Reads:            0      1613832            0
No. of Writes:      2282474      1148962          227

%-latency     Avg-latency     Min-Latency     Max-Latency  No. of calls         Fop
---------     -----------     -----------     -----------  ------------        ----
     0.00         0.00 us         0.00 us         0.00 us            14      FORGET
     0.00         0.00 us         0.00 us         0.00 us            39     RELEASE
     0.00         0.00 us         0.00 us         0.00 us           114  RELEASEDIR
     0.00        84.50 us        54.00 us       115.00 us             2     OPENDIR
     0.00        79.00 us        52.00 us       127.00 us             4        OPEN
     0.00        47.00 us        14.00 us       130.00 us             8       FLUSH
     0.00       342.00 us       311.00 us       373.00 us             2      CREATE
     0.00       104.77 us        26.00 us       281.00 us            13      STATFS
     0.01       131.75 us        35.00 us       285.00 us            93      LOOKUP
     0.02      7446.00 us       104.00 us     29191.00 us             4    READDIRP
     0.07      2784.89 us        49.00 us     49224.00 us            36    GETXATTR
     0.20        64.49 us        29.00 us       164.00 us          4506       FSTAT
     1.07    399482.25 us    361616.00 us    450370.00 us             4      UNLINK
    42.87       167.36 us        56.00 us     44827.00 us        381080        READ
    55.76        71.51 us        35.00 us      7032.00 us       1159912       WRITE

    Duration: 22156 seconds
   Data Read: 211528187904 bytes
Data Written: 300276908032 bytes

Interval 71 Stats:
%-latency     Avg-latency     Min-Latency     Max-Latency  No. of calls         Fop
---------     -----------     -----------     -----------  ------------        ----
     0.00         0.00 us         0.00 us         0.00 us             1  RELEASEDIR
     0.18        54.00 us        54.00 us        54.00 us             1     OPENDIR
     1.05       107.33 us        40.00 us       217.00 us             3      STATFS
     2.90       126.57 us        81.00 us       256.00 us             7      LOOKUP
    95.88     14669.00 us       147.00 us     29191.00 us             2    READDIRP

    Duration: 581 seconds
   Data Read: 0 bytes
Data Written: 0 bytes

Brick: 192.168.101.134:/dcsdata/d0
----------------------------------
Cumulative Stats:
   Block Size:       8192b+      16384b+      32768b+
 No. of Reads:            0            0            0
No. of Writes:            2            3           24

   Block Size:      65536b+     131072b+     262144b+
 No. of Reads:           22      1563063            0
No. of Writes:      1522412      1525007          184

%-latency     Avg-latency     Min-Latency     Max-Latency  No. of calls         Fop
---------     -----------     -----------     -----------  ------------        ----
     0.00         0.00 us         0.00 us         0.00 us            14      FORGET
     0.00         0.00 us         0.00 us         0.00 us            39     RELEASE
     0.00         0.00 us         0.00 us         0.00 us           114  RELEASEDIR
     0.00       116.50 us       111.00 us       122.00 us             2     OPENDIR
     0.00        69.25 us        23.00 us        95.00 us             8       FLUSH
     0.00       418.00 us       285.00 us       551.00 us             2      CREATE
     0.00       239.25 us       101.00 us       396.00 us             4    READDIRP
     0.00        93.00 us        39.00 us       249.00 us            13      STATFS
     0.01       142.89 us        78.00 us       241.00 us            87      LOOKUP
     0.09     48402.25 us       114.00 us     99173.00 us             4        OPEN
     0.19     10974.42 us        60.00 us    345979.00 us            36    GETXATTR
     0.20        94.33 us        41.00 us       200.00 us          4387       FSTAT
     0.85    440436.25 us    381525.00 us    582989.00 us             4      UNLINK
    35.80       193.96 us        57.00 us     23312.00 us        380869        READ
    62.86       134.89 us        29.00 us      9976.00 us        961593       WRITE

    Duration: 22155 seconds
   Data Read: 204875400152 bytes
Data Written: 299728837956 bytes
================================

Kane

On 2013-9-18, at 2:45, Anand Avati <avati at gluster.org> wrote:

> Can you get the volume profile dumps for both the runs and compare them?
>
> Avati
>
> On Tue, Sep 17, 2013 at 10:46 PM, kane <stef_9k at 163.com> wrote:
> I have already used "kernel oplocks = no" in the smb.conf, next is my original smb.conf file global settings:
> [global]
> workgroup = MYGROUP
> server string = DCS Samba Server
> log file = /var/log/samba/log.vfs
> max log size = 500000
> aio read size = 262144
> aio write size = 262144
> aio write behind = true
> security = user
> passdb backend = tdbsam
> load printers = yes
> cups options = raw
> read raw = yes
> write raw = yes
> max xmit = 262144
> socket options = TCP_NODELAY IPTOS_LOWDELAY SO_RCVBUF=262144 SO_SNDBUF=262144
> # max protocol = SMB2
> kernel oplocks = no
> stat cache = no
>
> thank you
> -Kane
> On 2013-9-18, at 1:38, Anand Avati <avati at redhat.com> wrote:
> > > On 9/17/13 10:34 PM, kane wrote: > >> Hi Anand, > >> > >> I use 2 gluster server , this is my volume info: > >> Volume Name: soul > >> Type: Distribute > >> Volume ID: 58f049d0-a38a-4ebe-94c0-086d492bdfa6 > >> Status: Started > >> Number of Bricks: 2 > >> Transport-type: tcp > >> Bricks: > >> Brick1: 192.168.101.133:/dcsdata/d0 > >> Brick2: 192.168.101.134:/dcsdata/d0 > >> > >> each brick use a raid 5 logic disk with 8*2TSATA hdd. > >> > >> smb.conf: > >> [gvol] > >> comment = For samba export of volume test > >> vfs objects = glusterfs > >> glusterfs:volfile_server = localhost > >> glusterfs:volume = soul > >> path = / > >> read only = no > >> guest ok = yes > >> > >> this my testparm result: > >> [global] > >> workgroup = MYGROUP > >> server string = DCS Samba Server > >> log file = /var/log/samba/log.vfs > >> max log size = 500000 > >> max xmit = 262144 > >> socket options = TCP_NODELAY IPTOS_LOWDELAY SO_RCVBUF=262144 > >> SO_SNDBUF=262144 > >> stat cache = No > >> kernel oplocks = No > >> idmap config * : backend = tdb > >> aio read size = 262144 > >> aio write size = 262144 > >> aio write behind = true > >> cups options = raw > >> > >> in client mount the smb share with cifs to dir /mnt/vfs, > >> then use iozone executed in the cifs mount dir "/mnt/vfs": > >> $ ./iozone -s 10G -r 128k -i0 -i1 -t 4 > >> File size set to 10485760 KB > >> Record Size 128 KB > >> Command line used: ./iozone -s 10G -r 128k -i0 -i1 -t 4 > >> Output is in Kbytes/sec > >> Time Resolution = 0.000001 seconds. > >> Processor cache size set to 1024 Kbytes. > >> Processor cache line size set to 32 bytes. > >> File stride size set to 17 * record size. 
> >> Throughput test with 4 processes
> >> Each process writes a 10485760 Kbyte file in 128 Kbyte records
> >>
> >> Children see throughput for 4 initial writers = 534315.84 KB/sec
> >> Parent sees throughput for 4 initial writers = 519428.83 KB/sec
> >> Min throughput per process = 133154.69 KB/sec
> >> Max throughput per process = 134341.05 KB/sec
> >> Avg throughput per process = 133578.96 KB/sec
> >> Min xfer = 10391296.00 KB
> >>
> >> Children see throughput for 4 rewriters = 536634.88 KB/sec
> >> Parent sees throughput for 4 rewriters = 522618.54 KB/sec
> >> Min throughput per process = 133408.80 KB/sec
> >> Max throughput per process = 134721.36 KB/sec
> >> Avg throughput per process = 134158.72 KB/sec
> >> Min xfer = 10384384.00 KB
> >>
> >> Children see throughput for 4 readers = 77403.54 KB/sec
> >> Parent sees throughput for 4 readers = 77402.86 KB/sec
> >> Min throughput per process = 19349.42 KB/sec
> >> Max throughput per process = 19353.42 KB/sec
> >> Avg throughput per process = 19350.88 KB/sec
> >> Min xfer = 10483712.00 KB
> >>
> >> Children see throughput for 4 re-readers = 77424.40 KB/sec
> >> Parent sees throughput for 4 re-readers = 77423.89 KB/sec
> >> Min throughput per process = 19354.75 KB/sec
> >> Max throughput per process = 19358.50 KB/sec
> >> Avg throughput per process = 19356.10 KB/sec
> >> Min xfer = 10483840.00 KB
> >>
> >> then I ran the same command in the dir mounted with gluster fuse:
> >> File size set to 10485760 KB
> >> Record Size 128 KB
> >> Command line used: ./iozone -s 10G -r 128k -i0 -i1 -t 4
> >> Output is in Kbytes/sec
> >> Time Resolution = 0.000001 seconds.
> >> Processor cache size set to 1024 Kbytes.
> >> Processor cache line size set to 32 bytes.
> >> File stride size set to 17 * record size.
> >> Throughput test with 4 processes
> >> Each process writes a 10485760 Kbyte file in 128 Kbyte records
> >>
> >> Children see throughput for 4 initial writers = 887534.72 KB/sec
> >> Parent sees throughput for 4 initial writers = 848830.39 KB/sec
> >> Min throughput per process = 220140.91 KB/sec
> >> Max throughput per process = 223690.45 KB/sec
> >> Avg throughput per process = 221883.68 KB/sec
> >> Min xfer = 10319360.00 KB
> >>
> >> Children see throughput for 4 rewriters = 892774.92 KB/sec
> >> Parent sees throughput for 4 rewriters = 871186.83 KB/sec
> >> Min throughput per process = 222326.44 KB/sec
> >> Max throughput per process = 223970.17 KB/sec
> >> Avg throughput per process = 223193.73 KB/sec
> >> Min xfer = 10431360.00 KB
> >>
> >> Children see throughput for 4 readers = 605889.12 KB/sec
> >> Parent sees throughput for 4 readers = 601767.96 KB/sec
> >> Min throughput per process = 143133.14 KB/sec
> >> Max throughput per process = 159550.88 KB/sec
> >> Avg throughput per process = 151472.28 KB/sec
> >> Min xfer = 9406848.00 KB
> >>
> >> it shows much higher perf.
> >>
> >> any places i did wrong?
> >>
> >>
> >> thank you
> >> -Kane
> >>
> >> On 2013-9-18, at 1:19, Anand Avati <avati at gluster.org> wrote:
> >>
> >>> How are you testing this? What tool are you using?
> >>>
> >>> Avati
> >>>
> >>>
> >>> On Tue, Sep 17, 2013 at 9:02 PM, kane <stef_9k at 163.com> wrote:
> >>>
> >>> Hi Vijay
> >>>
> >>> I used the code in
> >>> https://github.com/gluster/glusterfs.git with the latest commit:
> >>> commit de2a8d303311bd600cb93a775bc79a0edea1ee1a
> >>> Author: Anand Avati <avati at redhat.com>
> >>> Date: Tue Sep 17 16:45:03 2013 -0700
> >>>
> >>> Revert "cluster/distribute: Rebalance should also verify free
> >>> inodes"
> >>>
> >>> This reverts commit 215fea41a96479312a5ab8783c13b30ab9fe00fa
> >>>
> >>> Realized soon after merging, ...
> >>>
> >>> which include the patch you mentioned last time improve read perf,
> >>> written by Anand.
> >>>
> >>> but the read perf was still slow:
> >>> write: 500MB/s
> >>> read: 77MB/s
> >>>
> >>> while via fuse :
> >>> write 800MB/s
> >>> read 600MB/s
> >>>
> >>> any advice?
> >>>
> >>>
> >>> Thank you.
> >>> -Kane
> >>>
> >>> On 2013-9-13, at 10:37, kane <stef_9k at 163.com> wrote:
> >>>
> >>>> Hi Vijay,
> >>>>
> >>>> thank you for post this message, i will try it soon
> >>>>
> >>>> -kane
> >>>>
> >>>>
> >>>>
> >>>> On 2013-9-13, at 9:21, Vijay Bellur <vbellur at redhat.com> wrote:
> >>>>
> >>>>> On 09/13/2013 06:10 PM, kane wrote:
> >>>>>> Hi
> >>>>>>
> >>>>>> We use gluster samba vfs test io, but the read performance via vfs is
> >>>>>> half of write performance,
> >>>>>> but via fuse the read and write performance is almost the same.
> >>>>>>
> >>>>>> this is our smb.conf:
> >>>>>> [global]
> >>>>>> workgroup = MYGROUP
> >>>>>> server string = DCS Samba Server
> >>>>>> log file = /var/log/samba/log.vfs
> >>>>>> max log size = 500000
> >>>>>> # use sendfile = true
> >>>>>> aio read size = 262144
> >>>>>> aio write size = 262144
> >>>>>> aio write behind = true
> >>>>>> min receivefile size = 262144
> >>>>>> write cache size = 268435456
> >>>>>> security = user
> >>>>>> passdb backend = tdbsam
> >>>>>> load printers = yes
> >>>>>> cups options = raw
> >>>>>> read raw = yes
> >>>>>> write raw = yes
> >>>>>> max xmit = 262144
> >>>>>> socket options = TCP_NODELAY IPTOS_LOWDELAY SO_RCVBUF=262144 SO_SNDBUF=262144
> >>>>>> kernel oplocks = no
> >>>>>> stat cache = no
> >>>>>>
> >>>>>> any advice helpful?
> >>>>>>
> >>>>>
> >>>>> This patch has shown improvement in read performance with libgfapi:
> >>>>>
> >>>>> http://review.gluster.org/#/c/5897/
> >>>>>
> >>>>> Would it be possible for you to try this patch and check if it improves performance in your case?
> >>>>>
> >>>>> -Vijay
> >>>>>
> >>>>
> >>>
> >>
> >
> > Please add 'kernel oplocks = no' in the [gvol] section and try again.
> >
> > Avati
> >
>

------------------------------

Message: 22
Date: Wed, 18 Sep 2013 11:15:21 +0200
From: Lukáš Bezdička <lukas.bezdicka at gooddata.com>
To: Vijay Bellur <vbellur at redhat.com>
Cc: Emmanuel Dreyfus <manu at netbsd.org>, "gluster-users at gluster.org" <gluster-users at gluster.org>, Gluster Devel <gluster-devel at nongnu.org>
Subject: Re: [Gluster-devel] glusterfs-3.4.1qa2 released
Message-ID: <CAEePdhkatO15oh6JajVogKfhY4cK-P=4G9u6H4wtVTTtO9q+1A at mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Tested with glusterfs-3.4.1qa2-1.el6.x86_64: the issue with ACLs is still there. Unless one applies the patch from http://review.gluster.org/#/c/5693/ (which shoots through the caches and takes ACLs from the server) or sets entry-timeout=0, it returns wrong values. This is probably because the ACL mask is being applied incorrectly in posix_acl_inherit_mode, but I'm no C expert to say so :(

On Mon, Sep 16, 2013 at 9:37 AM, Vijay Bellur <vbellur at redhat.com> wrote:

> On 09/16/2013 05:47 AM, Emmanuel Dreyfus wrote:
>
>> Emmanuel Dreyfus <manu at netbsd.org> wrote:
>>
>>> It seems it has something very broken. A tar -xzf reported no error
>>> while most of the file do not appear in the filesystem on completion.
>>>
>>
>> I am not sure but it seems files appeared afterwards. There was a system
>> clock problem, client being a few seconds in the past. Can it explain
>> the problem?
>>
>
> When the client's time is off and fuse has entry-timeout set to 0, is this
> problem seen?
>
> The fuse forget log message is benign and does not have any functional
> impact. I have sent a patch which addresses this behavior:
>
> http://review.gluster.org/#/c/5932/
>
> -Vijay

------------------------------

Message: 23
Date: Wed, 18 Sep 2013 11:35:26 +0200 (CEST)
From: Dragon <Sunghost at gmx.de>
To: gluster-users at gluster.org
Subject: Re: Cant see files after network failure
Message-ID: <trinity-454d5481-7606-49b1-977b-9619577d4820-1379496926132 at 3capp-gmx-bs16>
Content-Type: text/plain; charset="us-ascii"

An HTML attachment was scrubbed...
URL: <http://supercolony.gluster.org/pipermail/gluster-users/attachments/20130918/f3e5de21/attachment-0001.html>

------------------------------

Message: 24
Date: Wed, 18 Sep 2013 10:35:58 +0100
From: Nux! <nux at li.nux.ro>
To: Shishir Gowda <sgowda at redhat.com>
Cc: Gluster Users <gluster-users at gluster.org>
Subject: Re: gluster volume top issue
Message-ID: <2f58eea8b9dfc00fb8efe932acedc83b at li.nux.ro>
Content-Type: text/plain; charset=UTF-8; format=flowed

On 18.09.2013 04:39, Shishir Gowda wrote:
> Hi Nux,
>
> I am trying to see if the issue of "0" open fd is based on the
> work-load, or a bug.
>
> Could you check top command output of "read/write" operation too?
Shishir,

Those commands only output the bricks and nothing more:

[root at 2216 ~]# gluster volume top xenvms read nfs
NFS Server : localhost
NFS Server : 1726.stocare.domeniu.net
NFS Server : 1613.stocare.domeniu.net
NFS Server : 1631.stocare.domeniu.net

[root at 2216 ~]# gluster volume top xenvms write nfs
NFS Server : localhost
NFS Server : 1631.stocare.domeniu.net
NFS Server : 1613.stocare.domeniu.net
NFS Server : 1726.stocare.domeniu.net

Same without "nfs".

--
Sent from the Delta quadrant using Borg technology!

Nux!
www.nux.ro

------------------------------

Message: 25
Date: Wed, 18 Sep 2013 11:01:36 +0100
From: "Michael.OBrien" <Michael.OBrien at ul.ie>
To: <gluster-users at gluster.org>
Subject: Secure Setup / Separate GlusterFS / Encryption
Message-ID: <2A20FC0CEBD54B4D98950B3B5B1D99FE01400256 at staffexchange3.ul.campus>
Content-Type: text/plain; charset="us-ascii"

Hi Gluster Users,

I'm looking for some advice or best practice recommendations when it comes to designing secure glusterFS environments. I'm talking about the basic design principles that a user should consider irrespective of the content that will be stored. I realise security isn't a destination but a journey, but I'd appreciate any advice you may have, and it goes without saying that if the content is that important it should be separated.

What is the current advice on configuring secure glusterFS environments, or the trade-offs to consider?

Should everything from bricks to storage nodes and the storage data network be separated into different glusterFS's, or can I share storage nodes across different clients without fear of crossed wires or a rogue client being able to list the other mount points of other clients or, worse, access their data?

My mindset would be to try and compare it to a SAN (but I'm not a SAN guy either), where disk storage is pooled and provisioned as LUNs and the LUNs are presented to certain HBAs.
The SAN can be configured so that only particular HBAs can access a LUN, so even if the client is compromised the SAN doesn't allow it to access other LUNs.

Finally, also on the topic of security, how would people suggest handling encryption of client data and working with a storage server hosting different encrypted data?

Michael

------------------------------

Message: 26
Date: Wed, 18 Sep 2013 07:36:32 -0400 (EDT)
From: Krishnan Parthasarathi <kparthas at redhat.com>
To: Dragon <Sunghost at gmx.de>
Cc: gluster-users at gluster.org
Subject: Re: Cant see files after network failure
Message-ID: <569256344.14871568.1379504192724.JavaMail.root at redhat.com>
Content-Type: text/plain; charset=utf-8

Dragon,

Could you attach brick log files, client log file(s) and output of the following commands,

gluster volume info VOLNAME
gluster volume status VOLNAME

Could you attach the "etc-glusterfs.." log as well?

thanks,
krish

----- Original Message -----
> Hello,
> i didnt find any hint of an error. Now i restart all server and watched the
> "etc-glusterfs.." log. The only thing i found is: "rpc actor failed to
> complete successfully". All peers looks good and the volume too. i can see
> the files in the data folder of each brick, but after fuse mount on a
> client, i cant see anything. permissions at the files on each brick are
> root:root.
> What can i do?
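
The two commands requested in message 26 can be gathered in one pass. What follows is only a minimal Python sketch, assuming the `gluster` CLI is on PATH and that the volume name is known; the helper names `diagnostic_commands` and `collect` are hypothetical and are not part of GlusterFS. It does not collect the brick, client, or "etc-glusterfs.." log files, whose locations depend on the installation.

```python
import shutil
import subprocess


def diagnostic_commands(volname):
    """Return the two CLI invocations asked for in the message above."""
    return [
        ["gluster", "volume", "info", volname],
        ["gluster", "volume", "status", volname],
    ]


def collect(volname):
    """Run each command and return {command line: stdout + stderr}.

    Assumption: the gluster CLI is installed and the caller has the
    privileges it needs; nothing here is specific to any one release.
    """
    if shutil.which("gluster") is None:
        raise RuntimeError("gluster CLI not found on PATH")
    report = {}
    for cmd in diagnostic_commands(volname):
        result = subprocess.run(cmd, capture_output=True, text=True)
        report[" ".join(cmd)] = result.stdout + result.stderr
    return report
```

Calling `collect("VOLNAME")` with the real volume name returns a dictionary whose values can be pasted into a reply alongside the log files.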

------------------------------

Message: 27
Date: Wed, 18 Sep 2013 13:51:52 +0200 (CEST)
From: Dragon <Sunghost at gmx.de>
To: gluster-users at gluster.org
Subject: Re: Cant see files after network failure
Message-ID: <trinity-319be117-5bab-441b-9c7c-7794b2523d3a-1379505112825 at 3capp-gmx-bs56>
Content-Type: text/plain; charset="us-ascii"

An HTML attachment was scrubbed...
URL: <http://supercolony.gluster.org/pipermail/gluster-users/attachments/20130918/2349a003/attachment-0001.html>

------------------------------

End of Gluster-users Digest, Vol 65, Issue 18
*********************************************