On 08.07.2014 13:24, Pranith Kumar Karampuri wrote:
>
> On 07/08/2014 04:49 PM, Norman Mähler wrote:
>
>> On 08.07.2014 13:02, Pranith Kumar Karampuri wrote:
>>>> On 07/08/2014 04:23 PM, Norman Mähler wrote:
>>>> Of course:
>>>>
>>>> The configuration is:
>>>>
>>>> Volume Name: gluster_dateisystem
>>>> Type: Replicate
>>>> Volume ID: 2766695c-b8aa-46fd-b84d-4793b7ce847a
>>>> Status: Started
>>>> Number of Bricks: 1 x 2 = 2
>>>> Transport-type: tcp
>>>> Bricks:
>>>> Brick1: filecluster1:/mnt/raid
>>>> Brick2: filecluster2:/mnt/raid
>>>> Options Reconfigured:
>>>> nfs.enable-ino32: on
>>>> performance.cache-size: 512MB
>>>> diagnostics.brick-log-level: WARNING
>>>> diagnostics.client-log-level: WARNING
>>>> nfs.addr-namelookup: off
>>>> performance.cache-refresh-timeout: 60
>>>> performance.cache-max-file-size: 100MB
>>>> performance.write-behind-window-size: 10MB
>>>> performance.io-thread-count: 18
>>>> performance.stat-prefetch: off
>>>>
>>>> The file count in xattrop is
>>>>> Do "gluster volume set gluster_dateisystem cluster.self-heal-daemon off".
>>>>> This should stop all the entry self-heals and should also bring the CPU
>>>>> usage down. When you don't have a lot of activity you can enable it
>>>>> again with "gluster volume set gluster_dateisystem cluster.self-heal-daemon on".
>>>>> If that doesn't bring the CPU down, execute "gluster volume set
>>>>> gluster_dateisystem cluster.entry-self-heal off". Let me know how it goes.
>>>>> Pranith
> Thanks for your help so far, but stopping the self-heal daemon and the
> self-heal mechanism itself did not improve the situation.
>
> Do you have further suggestions? Is it simply the load on the system?
> NFS could handle it easily before...
>> Is it at least a little better, or no improvement at all?
>
>> Pranith
There is a very small improvement of about 1 point in the 15-minute load.
The 15-minute load is at about 20 to 22 at the moment.

Norman
>
> Norman
>
>>>> Brick 1: 2706
>>>> Brick 2: 2687
>>>>
>>>> Norman
>>>>
>>>> On 08.07.2014 12:28, Pranith Kumar Karampuri wrote:
>>>>>>> It seems like entry self-heal is happening. What is the volume
>>>>>>> configuration? Could you give the "ls <brick-path>/.glusterfs/indices/xattrop | wc -l"
>>>>>>> count for all the bricks?
>>>>>>>
>>>>>>> Pranith
>>>>>>> On 07/08/2014 03:36 PM, Norman Mähler wrote:
>>>>>>>> Hello Pranith,
>>>>>>>>
>>>>>>>> here are the logs. I am only giving you the last 3000 lines,
>>>>>>>> because the nfs.log from today is already 550 MB.
>>>>>>>>
>>>>>>>> These are the standard files from a user home on the gluster
>>>>>>>> system. Everything you normally find in a user home: config files,
>>>>>>>> Firefox and Thunderbird files, etc.
>>>>>>>>
>>>>>>>> Thanks in advance
>>>>>>>> Norman
>>>>>>>>
>>>>>>>> On 08.07.2014 11:46, Pranith Kumar Karampuri wrote:
>>>>>>>>> On 07/08/2014 02:46 PM, Norman Mähler wrote:
>>>>>>>>> Hello again,
>>>>>>>>>
>>>>>>>>> I could resolve the self-heal problems with the missing gfid files
>>>>>>>>> on one of the servers by deleting the gfid files on the other server.
>>>>>>>>>
>>>>>>>>> They had a link count of 1, which means that the file the gfid
>>>>>>>>> pointed to was already deleted.
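>>>>>>>>>
>>>>>>>>> In case it helps anyone else, I located them with something along
>>>>>>>>> these lines (rough sketch, assuming the brick root is /mnt/raid as
>>>>>>>>> in our volume info; review the output before deleting anything,
>>>>>>>>> since other housekeeping files under .glusterfs may also match):
>>>>>>>>>
>>>>>>>>>   # regular files under .glusterfs with a link count of 1, i.e. gfid
>>>>>>>>>   # hard links whose real file on the brick is already gone;
>>>>>>>>>   # skip the indices/ directory used by self-heal
>>>>>>>>>   find /mnt/raid/.glusterfs -path '*/indices' -prune -o -type f -links 1 -print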
>>>>>>>>>
>>>>>>>>> We still have these errors,
>>>>>>>>>
>>>>>>>>> [2014-07-08 09:09:43.564488] W [client-rpc-fops.c:2469:client3_3_link_cbk]
>>>>>>>>> 0-gluster_dateisystem-client-0: remote operation failed: File exists
>>>>>>>>> (00000000-0000-0000-0000-000000000000 ->
>>>>>>>>> <gfid:b338b09e-2577-45b3-82bd-032f954dd083>/lock)
>>>>>>>>>
>>>>>>>>> which appear in the glusterfshd.log, and these
>>>>>>>>>
>>>>>>>>> [2014-07-08 09:13:31.198462] E [client-rpc-fops.c:5179:client3_3_inodelk]
>>>>>>>>> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.4.4/xlator/cluster/replicate.so(+0x466b8)
>>>>>>>>> [0x7f5d29d4e6b8]
>>>>>>>>> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.4.4/xlator/cluster/replicate.so(afr_lock_blocking+0x844)
>>>>>>>>> [0x7f5d29d4e2e4]
>>>>>>>>> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.4.4/xlator/protocol/client.so(client_inodelk+0x99)
>>>>>>>>> [0x7f5d29f8b3c9]))) 0-: Assertion failed: 0
>>>>>>>>>
>>>>>>>>> from the nfs.log.
>>>>>>>>>> Could you attach mount (nfs.log) and brick logs please. Do you
>>>>>>>>>> have files with lots of hard links?
>>>>>>>>>> Pranith
>>>>>>>>> I think the error messages belong together, but I don't have any
>>>>>>>>> idea how to solve them.
>>>>>>>>>
>>>>>>>>> We still have a very bad performance problem as well. The system
>>>>>>>>> load on the servers is above 20, and hardly anyone is able to work
>>>>>>>>> on a client...
>>>>>>>>>
>>>>>>>>> Hoping for help
>>>>>>>>> Norman
>>>>>>>>>
>>>>>>>>> On 07.07.2014 15:39, Pranith Kumar Karampuri wrote:
>>>>>>>>>>>> On 07/07/2014 06:58 PM, Norman Mähler wrote:
>>>>>>>>>>>> Dear community,
>>>>>>>>>>>>
>>>>>>>>>>>> we have some serious problems with our Gluster installation.
>>>>>>>>>>>>
>>>>>>>>>>>> Here is the setting:
>>>>>>>>>>>>
>>>>>>>>>>>> We have 2 bricks (version 3.4.4) on Debian 7.5, one of them with
>>>>>>>>>>>> an NFS export. There are about 120 clients connecting to the
>>>>>>>>>>>> exported NFS. These clients are thin clients reading and writing
>>>>>>>>>>>> their Linux home directories from the exported NFS.
>>>>>>>>>>>>
>>>>>>>>>>>> We want to change the access of these clients one by one to
>>>>>>>>>>>> access via the gluster client.
>>>>>>>>>>>>> I did not understand what you meant by this. Are you moving to
>>>>>>>>>>>>> glusterfs-fuse based mounts?
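>>>>>>>>>>>>> If so, for reference, a native mount would look roughly like
>>>>>>>>>>>>> this (just a sketch; /mnt/home is a hypothetical mount point,
>>>>>>>>>>>>> and either of your servers can serve the volfile):
>>>>>>>>>>>>>
>>>>>>>>>>>>>   # mount the replicated volume over FUSE instead of NFS
>>>>>>>>>>>>>   mount -t glusterfs filecluster1:/gluster_dateisystem /mnt/home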
>>>>>>>>>>>> Here are our problems:
>>>>>>>>>>>>
>>>>>>>>>>>> At the moment we have two types of error messages which come in
>>>>>>>>>>>> bursts to our glusterfshd.log:
>>>>>>>>>>>>
>>>>>>>>>>>> [2014-07-07 13:10:21.572487] W [client-rpc-fops.c:1538:client3_3_inodelk_cbk]
>>>>>>>>>>>> 0-gluster_dateisystem-client-1: remote operation failed: No such file or directory
>>>>>>>>>>>> [2014-07-07 13:10:21.573448] W [client-rpc-fops.c:471:client3_3_open_cbk]
>>>>>>>>>>>> 0-gluster_dateisystem-client-1: remote operation failed: No such file or directory.
>>>>>>>>>>>> Path: <gfid:b0c4f78a-249f-4db7-9d5b-0902c7d8f6cc>
>>>>>>>>>>>> (00000000-0000-0000-0000-000000000000)
>>>>>>>>>>>> [2014-07-07 13:10:21.573468] E [afr-self-heal-data.c:1270:afr_sh_data_open_cbk]
>>>>>>>>>>>> 0-gluster_dateisystem-replicate-0: open of
>>>>>>>>>>>> <gfid:b0c4f78a-249f-4db7-9d5b-0902c7d8f6cc> failed on child
>>>>>>>>>>>> gluster_dateisystem-client-1 (No such file or directory)
>>>>>>>>>>>>
>>>>>>>>>>>> This looks like a missing gfid file on one of the bricks. I
>>>>>>>>>>>> looked it up, and yes, the file is missing on the second brick.
>>>>>>>>>>>>
>>>>>>>>>>>> We get these messages the other way round, too (missing on
>>>>>>>>>>>> client-0 and the first brick).
>>>>>>>>>>>>
>>>>>>>>>>>> Is it possible to repair this by copying the gfid file to the
>>>>>>>>>>>> brick where it is missing? Or is there another way to repair it?
>>>>>>>>>>>>
>>>>>>>>>>>> The second message is
>>>>>>>>>>>>
>>>>>>>>>>>> [2014-07-07 13:06:35.948738] W [client-rpc-fops.c:2469:client3_3_link_cbk]
>>>>>>>>>>>> 0-gluster_dateisystem-client-1: remote operation failed: File exists
>>>>>>>>>>>> (00000000-0000-0000-0000-000000000000 ->
>>>>>>>>>>>> <gfid:aae47250-8f69-480c-ac75-2da2f4d21d7a>/lock)
>>>>>>>>>>>>
>>>>>>>>>>>> and I really do not know what to do with this one...
>>>>>>>>>>>>> Did any of the bricks go offline and come back online?
>>>>>>>>>>>>> Pranith
>>>>>>>>>>>> I am really looking forward to your help, because this is an
>>>>>>>>>>>> active system and the system load on the NFS brick is about 25 (!!).
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks in advance!
>>>>>>>>>>>> Norman Maehler

--
Kind regards,

Norman Mähler

Bereichsleiter IT-Hochschulservice
uni-assist e. V.
Geneststr. 5, Aufgang H, 3. Etage
10829 Berlin

Tel.: 030-66644382
n.maehler@xxxxxxxxxxxxx

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://supercolony.gluster.org/mailman/listinfo/gluster-users