Hi,

Perhaps you should open a Bugzilla record for this problem, then attach
lock dumps from all nodes and a full call trace (from the magic sysrq
key: <sysrq>t) to the Bugzilla.

https://bugzilla.redhat.com/

Regards,

Bob Peterson
Red Hat

On Thu, 2007-11-15 at 03:40 -0500, Fair, Brian wrote:
> We have a GFS filesystem (1 of 100 on this server in particular) that
> will consistently hang. I haven't identified the circumstances around
> it, but there is some speculation that it may occur during heavy
> usage, though that isn't certain. When this happens, the load average
> on the system skyrockets.
>
> The mountpoint is /omni_mnt/clients/j2
>
> When I say hang: cd sometimes hangs, ls will hang, and programs and
> file operations hang as well. Sometimes it is just cd'ing into the
> mountpoint, other times into a large subdirectory.
>
> For example:
>
> # cd /omni_mnt/clients/j2
> root@hlpom500:[/omni_mnt/clients/j2]
> # ls
> <normal output>
> root@hlpom500:[/omni_mnt/clients/j2]
> # cd stmt
> root@hlpom500:[/omni_mnt/clients/j2/stmt]
> # ls
>
> <hangs here, shell must be killed>
>
> In the past, shutting down and rebooting the 2 systems that mount this
> GFS has cleared the issue.
>
> Info:
>
> RHEL ES 4 u5
> kernel 2.6.9-55.0.2.ELsmp
> GFS 2.6.9-72.2.0.2
>
> Not sure what is helpful, but here are some outputs from the system
> while the filesystem was hung. I also have a lockdump, but it is 4,650
> lines; I can send it along if needed. Any suggestions on data to
> gather in the future are welcome.
>
> Thanks!
>
> Brian Fair
>
> gfs_tool gettune
> ************************************************************************
>
> ilimit1 = 100
> ilimit1_tries = 3
> ilimit1_min = 1
> ilimit2 = 500
> ilimit2_tries = 10
> ilimit2_min = 3
> demote_secs = 300
> incore_log_blocks = 1024
> jindex_refresh_secs = 60
> depend_secs = 60
> scand_secs = 5
> recoverd_secs = 60
> logd_secs = 1
> quotad_secs = 5
> inoded_secs = 15
> glock_purge = 0
> quota_simul_sync = 64
> quota_warn_period = 10
> atime_quantum = 3600
> quota_quantum = 60
> quota_scale = 1.0000 (1, 1)
> quota_enforce = 1
> quota_account = 1
> new_files_jdata = 0
> new_files_directio = 0
> max_atomic_write = 4194304
> max_readahead = 262144
> lockdump_size = 131072
> stall_secs = 600
> complain_secs = 10
> reclaim_limit = 5000
> entries_per_readdir = 32
> prefetch_secs = 10
> statfs_slots = 64
> max_mhc = 10000
> greedy_default = 100
> greedy_quantum = 25
> greedy_max = 250
> rgrp_try_threshold = 100
> statfs_fast = 0
>
> gfs_tool counters
> ************************************************************************
>
> locks 246
> locks held 127
> freeze count 0
> incore inodes 101
> metadata buffers 4
> unlinked inodes 2
> quota IDs 3
> incore log buffers 0
> log space used 0.05%
> meta header cache entries 0
> glock dependencies 0
> glocks on reclaim list 0
> log wraps 85
> outstanding LM calls 2
> outstanding BIO calls 0
> fh2dentry misses 0
> glocks reclaimed 1316856
> glock nq calls 194073094
> glock dq calls 193851427
> glock prefetch calls 102749
> lm_lock calls 903612
> lm_unlock calls 833348
> lm callbacks 1769983
> address operations 71707236
> dentry operations 23750382
> export operations 0
> file operations 139487453
> inode operations 38356847
> super operations 110620113
> vm operations 1052447
> block I/O reads 241669
> block I/O writes 3295626
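To gather the data mentioned above, something along these lines should
work on each node while the filesystem is hung (the /tmp file names are
only examples, and gfs_tool usage can vary slightly between versions;
see man gfs_tool):

  # Dump glock state for the hung mount, on every node that mounts it:
  gfs_tool lockdump /omni_mnt/clients/j2 > /tmp/lockdump.$(hostname)

  # Make sure sysrq is enabled, then request a trace of all tasks:
  echo 1 > /proc/sys/kernel/sysrq
  echo t > /proc/sysrq-trigger

  # The task trace goes to the kernel log; save it from the ring buffer
  # (or pull it from /var/log/messages if it has already scrolled off):
  dmesg > /tmp/sysrq-t.$(hostname)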
--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster