This is probably not a cluster issue, just a pure kernel question.

It sounds like the driver or the device has locked up, so the processes attached to it are hung. To be honest, I've had similar problems on pretty much all Unixes for many years, and I've never found a good way out of it.

It may not be an option in your case and with your application, but I guess this is why most people run their backup systems on separate dedicated boxes, so they can be rebooted without affecting production systems.

I wish there was a way of saying to the kernel something like: "I want to forcibly unload the driver for this device, and you can kill any processes attached to it." Then you could reinitialise the driver and the processes.

Resetting the physical device might work (it has for me in the past), but I'd guess it could equally panic the kernel.

If someone else has a better way out of a hung device driver on Linux, I'd love to know too (in my experience it seems particularly bad for tape devices when it happens).

Colin

On Mon, 2011-08-15 at 03:55 +0100, sunhux G wrote:
> Apologies if this is not the right list to post but getting desperate:
>
> I have 2 processes (shown by ps -ef below) which has 'jammed' the tape
> drive below & I can't "kill -9" them.
>
> Is there any way short of reboot to stop them, say "service xxx restart"
> or anything else other than rebooting this Linux 4.x server? Since reboot
> involves doing "service stop xxx" of various services, surely one of the
> xxx must be able to stop the processes (just an educated guess). We
> faced this issue with our Dataprotector quite often so frequent reboot
> is not an option.
>
> # ps -ef |grep -i bma |grep -v grep
> root 10197 1 0 Aug13 ?
00:00:08 /opt/omni/lbin/vbda
> -bmaname HP:Ultrium 4-SCSI_4 -type 2 -start 1313175661 -level 0
> -access 1 0 -protection 2 1209600 -name / -ma xxxdgjt1.ss.de 22000 -id
> 1313175612 -volume / -profile -no_lock -hlink -no_touch -no_encode
> -no_expand_sparse -no_nwuncompress -no_compress -no_preview -profile
> -report 0 -on_busy 2 -no_nthlink -archattr -share_info -objname 02
> xxxdgjt1.ss.de:/ // / -no_aligned
> root 23303 1 0 Aug13 ? 00:00:03 /opt/omni/lbin/vbda
> -bmaname HP:Ultrium 4-SCSI_1 -type 2 -start 1313192083 -level 0
> -access 1 0 -protection 2 1209600 -name / -ma xxxdgjt1.ss.de 22000 -id
> 1313192026 -volume / -profile -no_lock -hlink -no_touch -no_encode
> -no_expand_sparse -no_nwuncompress -no_compress -no_preview -profile
> -report 0 -on_busy 2 -no_nthlink -archattr -share_info -objname 02
> xxxdgjt1.ss.de:/ // / -no_aligned
> root 25618 1 0 Aug13 ? 00:00:03 /opt/omni/lbin/vbda
> -bmaname HP:Ultrium 4-SCSI_1 -type 2 -start 1313195066 -level 0
> -access 1 0 -protection 2 1209600 -name / -ma xxxdgjt1.ss.de 22000 -id
> 1313195016 -volume / -profile -no_lock -hlink -no_touch -no_encode
> -no_expand_sparse -no_nwuncompress -no_compress -no_preview -profile
> -report 0 -on_busy 2 -no_nthlink -archattr -share_info -objname 02
> xxxdgjt1.ss.de:/ // / -no_aligned
>
>
> they're listening on the Tcp ports :
>
> [root@xxxdgjt1 ~]# netstat -antp | grep 25618
> tcp 21 0 172.17.1.47:5555 172.17.12.12:2128
> CLOSE_WAIT 25618/vbda
> [root@xxxdgjt1 ~]# netstat -antp | grep 23303
> tcp 21 0 172.17.1.47:5555 172.17.12.12:2073
> CLOSE_WAIT 23303/vbda
>
>
> fuser all other partitions do not show processes locking/opening
> files, only the
> root (ie / ) partition :
>
> # fuser / |grep 25618 ==> will show 25618 & 25618r as amongst the
> processes
> # fuser / |grep 23303 ==> will show 23303 & 23303r as amongst the
> processes
>
>
> # cd /etc
> # ls */*omni*
> xinetd.d/omni
>
> opt/omni:
> client server
>
> --
> Linux-cluster mailing list
> Linux-cluster@xxxxxxxxxx
> https://www.redhat.com/mailman/listinfo/linux-cluster
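
One thing that may help confirm the diagnosis before deciding on a reboot: a process blocked inside a driver usually sits in uninterruptible sleep (state D), and signals, including SIGKILL, stay pending until the kernel call returns. That would explain why "kill -9" appears to do nothing. The following is only a rough sketch, not something from the original thread. It assumes a Python 3 interpreter is available on the box (an old distribution may not ship one) and that /proc is mounted; the function names are made up for illustration.

```python
#!/usr/bin/env python3
"""Report the scheduler state of the given PIDs by reading /proc (Linux only)."""
import sys


def parse_state(stat_text):
    """Return the one-letter state from the contents of /proc/<pid>/stat.

    The command name is wrapped in parentheses and may itself contain
    spaces or ')', so split on the last ')' rather than on whitespace.
    """
    after_comm = stat_text.rsplit(")", 1)[1]
    return after_comm.split()[0]


def read_text(path):
    """Read a small /proc file, returning '' if it is missing or unreadable."""
    try:
        with open(path) as handle:
            return handle.read().strip()
    except OSError:
        return ""


def describe(pid):
    """Return a one-line summary: PID, state, and kernel wait channel if any."""
    stat_text = read_text("/proc/%d/stat" % pid)
    if not stat_text:
        return "%d: no such process (or /proc not readable)" % pid
    state = parse_state(stat_text)
    # wchan names the kernel function the process is sleeping in, if known.
    wchan = read_text("/proc/%d/wchan" % pid) or "-"
    note = " (uninterruptible sleep, signals stay pending)" if state == "D" else ""
    return "%d: state=%s wchan=%s%s" % (pid, state, wchan, note)


if __name__ == "__main__":
    for arg in sys.argv[1:]:
        print(describe(int(arg)))
```

Saved under a name of your choosing (say procstate.py, purely hypothetical), it could be run against the PIDs from the ps listing above: python3 procstate.py 10197 23303 25618. If the processes show state D with a wait channel inside the tape or SCSI code, that points at the driver or device rather than at Data Protector itself.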