Hi James,

We haven't yet been able to ask our telco to switch the DWDM links back
to their original configuration.

However, since logging was activated on the server, I'm getting a lot of
messages like this one:

lpfc 0000:10:00.1: 1:(0):0730 FCP command x26 failed: x2 SNS x70000500 x20000000 Data: xa x200 x10 x0 x0

for which I couldn't find any explanation in
http://www-dl.emulex.com/support/linux/820482p/linux.pdf

Do you have any information on this?

Also, there are other lpfc parameters that could be tweaked, if I
understand their meaning correctly:

lpfc_hba_queue_depth, currently set to 1024: does it represent the
number of [IOs/Exchanges] the HBA will queue until the remote port acks
them or until it is considered down?

lpfc_max_scsicmpl_time, set to 0: does 0 represent some infinite value,
meaning it won't time out any IO for which the driver did not receive a
completion ack?

Thanks

Brem

2010/4/27 brem belguebli <brem.belguebli@xxxxxxxxx>:
> Hi James,
>
> I could set lpfc_log_verbose on both HBA's to 4115, I hope it'll be high
> enough to get interesting traces.
>
> On Mon, 2010-04-26 at 23:52 +0200, brem belguebli wrote:
>> Hi James,
>>
>> On Mon, 2010-04-26 at 13:50 -0400, James Smart wrote:
>> > Brem,
>> >
>> > I'm not understanding you.
>> >
>> >
>> > brem belguebli wrote:
>> > > We have sg3_utils installed, and I think we ran sg_verify on one or 2
>> > > unresponsive /dev/sd and it didn't give the hand back.
>> > >
>> > what do you mean "give the hand back" ? was the operation
>> > successful or not ?
>> >
>> When I say it didn't give the hand back, I mean the one or 2 processes
>> got stuck in D state, thus not returning success .
>> > > It was exactly
>> > > cd /sys/block
>> > > for DEV in `ls -1d dev*`; do
>> > > echo ${DEV}
>> > > dd if=/dev/${DEV} of=/dev/null bs=1024 count=1 &
>> > > echo
>> > > done
>> > >
>> > > And yes it really works, never seen any kind of preemption of DM-MP over
>> > > direct sd access. I've cc'ed dm-devel may be some DM guru could give his
>> > > opinion on this.
>> > >
>> > > Next time, I'll use a sg_dd instead of dd, to bypass any cache effect
>> > > (by the way, does VFS cache anything when addressing /dev/X devices ?)
>> > >
>> > ok - by "works" means "dd successfully read 1 block from the device" -
>> > right ?
>> >
>> Yes, the devices on which dd was successful were the ones from FABRIC1,
>> dd completed successfully by reading the first 1024 bytes to copy them
>> to /dev/null
>>
>> > > > The most interesting for the lpfc driver would be the lpfc module
>> > > > parameter "lpfc_log_verbose=4115"
>> > > > which turns on discovery log messages, els messages, link events, and
>> > > > FCP i/o error messages.
>> > > >
>> > >
>> > > As our DWDM ring switch is on the less optimal path, there will be a
>> > > switch back to nominal soon.
>> > >
>> > > I'll activate this log level on the HBA's and check the firmware
>> > > versions you gave me .
>> > >
>> > ok. I believe that the shost for the adapters in question, have a
>> > sysfs variable for lpfc_log_verbose, that sets the log level on the
>> > individual adapter. This would not require you to unload/reload the
>> > driver to set the option.
>> >
>> I'll tell you tomorrow (was off today) if the parameter exists for these
>> HBA's.
>
>
>> > > Hopefully, we will be able to provide you something deeper to
>> > > investigate.
>> > >
>> > > Brem
>> > >
>> >
>> > ok.
>> >
>> > -- james
>> >
>> >
>> Thanks
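
For reference, the 0730 message quoted at the top of this mail can be unpacked roughly as follows. This is only a sketch, not something taken from the Emulex documentation: it assumes that the value after "failed:" is the SCSI status byte and that the two SNS values are 32-bit big-endian words 0 and 3 (bytes 0-3 and 12-15) of a fixed-format sense buffer. The function name `decode_0730` and the small lookup tables are hypothetical and deliberately incomplete.

```python
import re

# Partial lookup tables from the SCSI standards (not exhaustive).
SCSI_STATUS = {0x00: "GOOD", 0x02: "CHECK CONDITION", 0x08: "BUSY",
               0x18: "RESERVATION CONFLICT", 0x28: "TASK SET FULL"}
SENSE_KEYS = {0x0: "NO SENSE", 0x1: "RECOVERED ERROR", 0x2: "NOT READY",
              0x3: "MEDIUM ERROR", 0x4: "HARDWARE ERROR",
              0x5: "ILLEGAL REQUEST", 0x6: "UNIT ATTENTION",
              0x7: "DATA PROTECT", 0xB: "ABORTED COMMAND"}
ASC_ASCQ = {(0x20, 0x00): "INVALID COMMAND OPERATION CODE",
            (0x24, 0x00): "INVALID FIELD IN CDB"}

PATTERN = re.compile(
    r"FCP command x([0-9a-f]+) failed: x([0-9a-f]+) "
    r"SNS x([0-9a-f]+) x([0-9a-f]+)")


def decode_0730(line):
    """Decode one log line; returns None if the line does not match."""
    m = PATTERN.search(line)
    if m is None:
        return None
    opcode, status, sns0, sns3 = (int(g, 16) for g in m.groups())
    # ASSUMPTION: sns0 holds sense bytes 0-3, sns3 holds sense bytes 12-15.
    sense_key = (sns0 >> 8) & 0x0F   # low nibble of sense byte 2
    asc = (sns3 >> 24) & 0xFF        # sense byte 12
    ascq = (sns3 >> 16) & 0xFF       # sense byte 13
    return {
        "opcode": opcode,
        "status": SCSI_STATUS.get(status, "unknown"),
        "response_code": (sns0 >> 24) & 0x7F,  # 0x70 = fixed format, current
        "sense_key": SENSE_KEYS.get(sense_key, "unknown"),
        "asc": asc,
        "ascq": ascq,
        "additional_sense": ASC_ASCQ.get((asc, ascq), "unknown"),
    }


if __name__ == "__main__":
    line = ("lpfc 0000:10:00.1: 1:(0):0730 FCP command x26 failed: x2 "
            "SNS x70000500 x20000000 Data: xa x200 x10 x0 x0")
    print(decode_0730(line))
```

If those assumptions hold, the line reads as a CHECK CONDITION with sense key ILLEGAL REQUEST and ASC/ASCQ 0x20/0x00 (invalid command operation code), i.e. the target rejecting opcode 0x26 rather than a link or transport failure. The trailing "Data:" fields are not interpreted here.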