Some additional info: I ran gfs_tool counters <mountpoint> on all stuck nodes. They all seem to have a large number of outstanding BIO calls. Is there any way I can find out what is causing this? Is there any other information I should look for? As far as I can see, there's no bottleneck at the shared Coraid AoE storage.

gfs_tool counters output on: mrcluster1

    locks 3024
    locks held 108
    incore inodes 12
    metadata buffers 166
    unlinked inodes 0
    quota IDs 0
    incore log buffers 1
    log space used 0.15%
    meta header cache entries 9998
    glock dependencies 33
    glocks on reclaim list 0
    log wraps 8
    outstanding LM calls 0
    outstanding BIO calls 805
    fh2dentry misses 0
    glocks reclaimed 5992
    glock nq calls 4784596
    glock dq calls 4784236
    glock prefetch calls 25
    lm_lock calls 6341
    lm_unlock calls 5269
    lm callbacks 11907
    address operations 254155990
    dentry operations 1978
    export operations 0
    file operations 1457916
    inode operations 5489
    super operations 1577443
    vm operations 0
    block I/O reads 136200
    block I/O writes 81780453

gfs_tool counters output on: mrcluster2

    locks 3020
    locks held 107
    incore inodes 10
    metadata buffers 604
    unlinked inodes 0
    quota IDs 0
    incore log buffers 1
    log space used 0.15%
    meta header cache entries 10000
    glock dependencies 34
    glocks on reclaim list 0
    log wraps 8
    outstanding LM calls 0
    outstanding BIO calls 490
    fh2dentry misses 0
    glocks reclaimed 3725
    glock nq calls 4755003
    glock dq calls 4754287
    glock prefetch calls 12
    lm_lock calls 4364
    lm_unlock calls 3017
    lm callbacks 8140
    address operations 252523873
    dentry operations 1957
    export operations 0
    file operations 1444785
    inode operations 5425
    super operations 1564779
    vm operations 0
    block I/O reads 135658
    block I/O writes 81574696

gfs_tool counters output on: mrcluster3

    locks 3018
    locks held 135
    incore inodes 9
    metadata buffers 1
    unlinked inodes 0
    quota IDs 0
    incore log buffers 1
    log space used 0.15%
    meta header cache entries 9997
    glock dependencies 20
    glocks on reclaim list 0
    log wraps 25
    outstanding LM calls 0
    outstanding BIO calls 191
    fh2dentry misses 0
    glocks reclaimed 11097
    glock nq calls 15308139
    glock dq calls 15307573
    glock prefetch calls 13
    lm_lock calls 8734
    lm_unlock calls 7813
    lm callbacks 17167
    address operations 941469125
    dentry operations 5308
    export operations 0
    file operations 4730084
    inode operations 17157
    super operations 5170894
    vm operations 0
    block I/O reads 333851
    block I/O writes 4449228

gfs_tool counters output on: mrcluster4

    locks 3017
    locks held 206
    incore inodes 7
    metadata buffers 2945
    unlinked inodes 0
    quota IDs 2
    incore log buffers 5
    log space used 0.24%
    meta header cache entries 9343
    glock dependencies 54
    glocks on reclaim list 0
    log wraps 2
    outstanding LM calls 0
    outstanding BIO calls 249
    fh2dentry misses 0
    glocks reclaimed 2075
    glock nq calls 1485236
    glock dq calls 1485023
    glock prefetch calls 0
    lm_lock calls 1967
    lm_unlock calls 1326
    lm callbacks 3657
    address operations 66603382
    dentry operations 1747
    export operations 0
    file operations 457700
    inode operations 4821
    super operations 484938
    vm operations 0
    block I/O reads 28969
    block I/O writes 21973543

Grtz,
Ramon