hi!

there is no relevant output from dmesg and no entries in the server log - only the one line in the client log that I already posted.

the glusterfs version on the server had been updated to glusterfs 3.2.0 more than a month ago. because of the troubles on the backup server, I deleted the whole backup share and started from scratch.

I looked for an update of "fuse" and upgraded from 2.7.2-61.18.1 to 2.8.5-41.1 - maybe this helps. here is the changelog info:

Authors:
--------
    Miklos Szeredi <miklos at szeredi.hu>

Distribution: systemsmanagement:baracus / SLE_11_SP1

* Tue Mar 29 2011 dbahi at novell.com
  - remove the --no-canonicalize usage for suse_version <= 11.3
* Mon Mar 21 2011 coolo at novell.com
  - licenses package is about to die
* Thu Feb 17 2011 mszeredi at suse.cz
  - In case of failure to add to /etc/mtab don't umount. [bnc#668820] [CVE-2011-0541]
* Tue Nov 16 2010 mszeredi at suse.cz
  - Fix symlink attack for mount and umount [bnc#651598]
* Wed Oct 27 2010 mszeredi at suse.cz
  - Remove /etc/init.d/boot.fuse [bnc#648843]
* Tue Sep 28 2010 mszeredi at suse.cz
  - update to 2.8.5
    * fix option escaping for fusermount [bnc#641480]
* Wed Apr 28 2010 mszeredi at suse.cz
  - keep examples and internal docs in devel package (from jnweiger)
* Mon Apr 26 2010 mszeredi at suse.cz
  - update to 2.8.4
    * fix checking for symlinks in umount from /tmp
    * fix umounting if /tmp is a symlink

kind regards
markus froehlich

On 06.06.2011 21:19, Anthony J. Biacco wrote:
> Could be fuse, check 'dmesg' for kernel module timeouts.
>
> In a similar vein, has anyone seen significant performance/reliability differences with different fuse versions? Say, latest source vs. RHEL distro RPM versions.
>
> -Tony
>
> -----Original Message-----
> From: Mohit Anchlia <mohitanchlia at gmail.com>
> Sent: June 06, 2011 1:14 PM
> To: Markus Fröhlich <markus.froehlich at xidras.com>
> Cc: gluster-users at gluster.org
> Subject: Re: uninterruptible processes writing to glusterfs share
>
> Is there anything in the server logs? Does it follow any particular
> pattern before going into this mode?
>
> Did you upgrade Gluster or is this a new install?
>
> 2011/6/6 Markus Fröhlich <markus.froehlich at xidras.com>:
>> hi!
>>
>> sometimes we have hanging uninterruptible processes on some client servers
>> ("ps aux" stat is "D"), and on one of them the CPU I/O wait grows to 100%
>> within a few minutes.
>> such processes cannot be killed - even "kill -9" doesn't work - and when
>> you attach to one of them with "strace", you see nothing and cannot detach again.
>>
>> there are only two ways out:
>> killing the glusterfs process (unmounting the GFS share) or rebooting the server.
>>
>> the only log entry I found was on one client - just a single line:
>> [2011-06-06 10:44:18.593211] I [afr-common.c:581:afr_lookup_collect_xattr]
>> 0-office-data-replicate-0: data self-heal is pending for
>> /pc-partnerbet-public/Promotionaktionen/Mailakquise_2009/Webmaster_2010/HTML/bilder/Thumbs.db.
>>
>> one of the client servers is a samba server, the other one a backup server
>> based on rsync with millions of small files.
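Processes stuck in uninterruptible sleep, as described above, can be enumerated straight from `ps` without attaching strace. A minimal sketch - the sample lines below are made up to stand in for real `ps -eo pid,stat,cmd` output, so the pipeline is self-contained:

```shell
# On a live system you would run:
#   ps -eo pid,stat,cmd | awk 'NR > 1 && $2 ~ /^D/'
# Hypothetical sample output is used here for illustration:
printf '%s\n' \
  '4711 D    rsync -a /data /mnt/office-data' \
  '4712 Ss   /usr/sbin/sshd' \
| awk '$2 ~ /^D/ {print $1}'
```

This prints only the PID of the rsync process, since its STAT column begins with "D" (uninterruptible sleep), while the sshd line is skipped.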
>>
>> gfs-servers + gfs-clients: SLES11 x86_64, glusterfs 3.2.0
>>
>> and here are the configs from server and client:
>>
>> server config "/etc/glusterd/vols/office-data/office-data.gfs-01-01.GFS-office-data02.vol":
>>
>> volume office-data-posix
>>     type storage/posix
>>     option directory /GFS/office-data02
>> end-volume
>>
>> volume office-data-access-control
>>     type features/access-control
>>     subvolumes office-data-posix
>> end-volume
>>
>> volume office-data-locks
>>     type features/locks
>>     subvolumes office-data-access-control
>> end-volume
>>
>> volume office-data-io-threads
>>     type performance/io-threads
>>     subvolumes office-data-locks
>> end-volume
>>
>> volume office-data-marker
>>     type features/marker
>>     option volume-uuid 3c6e633d-a0bb-4c52-8f05-a2db9bc9c659
>>     option timestamp-file /etc/glusterd/vols/office-data/marker.tstamp
>>     option xtime off
>>     option quota off
>>     subvolumes office-data-io-threads
>> end-volume
>>
>> volume /GFS/office-data02
>>     type debug/io-stats
>>     option latency-measurement off
>>     option count-fop-hits off
>>     subvolumes office-data-marker
>> end-volume
>>
>> volume office-data-server
>>     type protocol/server
>>     option transport-type tcp
>>     option auth.addr./GFS/office-data02.allow *
>>     subvolumes /GFS/office-data02
>> end-volume
>>
>> --------------
>> client config "/etc/glusterd/vols/office-data/office-data-fuse.vol":
>>
>> volume office-data-client-0
>>     type protocol/client
>>     option remote-host gfs-01-01
>>     option remote-subvolume /GFS/office-data02
>>     option transport-type tcp
>> end-volume
>>
>> volume office-data-replicate-0
>>     type cluster/replicate
>>     subvolumes office-data-client-0
>> end-volume
>>
>> volume office-data-write-behind
>>     type performance/write-behind
>>     subvolumes office-data-replicate-0
>> end-volume
>>
>> volume office-data-read-ahead
>>     type performance/read-ahead
>>     subvolumes office-data-write-behind
>> end-volume
>>
>> volume office-data-io-cache
>>     type performance/io-cache
>>     subvolumes office-data-read-ahead
>> end-volume
>>
>> volume office-data-quick-read
>>     type performance/quick-read
>>     subvolumes office-data-io-cache
>> end-volume
>>
>> volume office-data-stat-prefetch
>>     type performance/stat-prefetch
>>     subvolumes office-data-quick-read
>> end-volume
>>
>> volume office-data
>>     type debug/io-stats
>>     option latency-measurement off
>>     option count-fop-hits off
>>     subvolumes office-data-stat-prefetch
>> end-volume
>>
>> --
>> Kind regards
>>
>> Markus Fröhlich
>> Technician
>>
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
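Since the client log reports "data self-heal is pending" on a replicate volume, one workaround is worth noting: glusterfs 3.2 predates the `gluster volume heal` command, so pending self-heals on a replicate volume are typically triggered by stat'ing every file through the FUSE mount. A minimal sketch - the mount point is hypothetical and must be adjusted to the actual client mount:

```shell
# Hypothetical client-side mount point of the replicate volume.
MNT=/mnt/office-data
# A stat() through the FUSE mount makes the replicate translator
# look up each file and repair any pending self-heal it finds.
find "$MNT" -noleaf -print0 | xargs -0 stat >/dev/null
```

On a share with millions of small files this walk can take a long time and generates real I/O load, so it is best scheduled outside the rsync/backup window.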