Re: [dm-devel] multipath-tools-0.4.4 on 3par unknown path failure issue

Alan Kasindorf <akasindorf@xxxxxxxxxxxxxxxxxxxxxxxxx> · Thu, 11 Aug 2005 16:19:24 -0400

I've had problems like this happen to me on 3par too.  What kernel version
are you using?  It almost always happened when the SAN got an RSCN (usually
when another server was rebooted).  I found, at least in kernel 2.6.11.7,
that if I changed the line

bio->bi_rw |= (1 << BIO_RW_FAILFAST); to
bio->bi_rw |= (0 << BIO_RW_FAILFAST);

in drivers/md/dm-mpath.c

the problem went away.  Now, in the newest kernels, after there was a big
change to the qla drivers (2.6.12-rc? and beyond, I believe), I no longer
need the above change, but I now get aborts sometimes (these aborts
apparently come from the QLogic card).  The aborts recover, but I have been
unable to determine why I am getting them.
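
For readers following along, here is a minimal user-space sketch of what the
one-line change quoted above does at the bit level. It is not the kernel
code itself: the helper names are hypothetical, and the value used for
BIO_RW_FAILFAST is an assumption for illustration (the real definition
lives in the kernel's include/linux/bio.h and may differ between versions).

```c
#include <assert.h>

/* Assumed bit position, for illustration only; check include/linux/bio.h
 * in the kernel tree you are actually building. */
#define BIO_RW_FAILFAST 3

/* Stock line: OR-ing in (1 << bit) sets the fail-fast flag. */
static unsigned long with_failfast(unsigned long bi_rw)
{
    bi_rw |= (1UL << BIO_RW_FAILFAST);
    return bi_rw;
}

/* Modified line: (0 << bit) is 0, so the OR leaves bi_rw unchanged. */
static unsigned long without_failfast(unsigned long bi_rw)
{
    bi_rw |= (0UL << BIO_RW_FAILFAST);
    return bi_rw;
}

/* Small self-check of both behaviours (hypothetical helper). */
static void failfast_demo(void)
{
    assert(with_failfast(0UL) == (1UL << BIO_RW_FAILFAST));
    assert(without_failfast(0UL) == 0UL);
    assert(without_failfast(0x5UL) == 0x5UL);
}
```

In other words, the modified line turns the statement into a no-op, so the
fail-fast bit is simply never set on the bio by that code path. Whether that
is why the path failures stopped is not something this sketch can show.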

Andy

We're running 2.6.9-11.ELsmp, from Red Hat ES 4.1. I don't have the
entire list of Red Hat patches on hand, so I can't say for sure. Nor
can I actually modify our kernel without losing support for the box. If
this is fixed with a kernel upgrade, we can open a support ticket with
Red Hat and scream/yell until they apply the patch.

However, I'd like to know what the exact issue is. I'm not exactly great
at diagnosing issues with the Linux kernel right now. How were you
monitoring what events the SAN was sending up through the card? I could
use this to at least verify what is happening if/when we lose another
mount. None of our servers were being rebooted when this happened, though.

-Alan