Re: lpfc SAN/SCSI issue

brem belguebli <brem.belguebli@xxxxxxxxx> · Mon, 3 May 2010 18:39:41 +0200

Hi James,

We haven't yet been able to ask our Telco to switch the DWDM links
back to their original configuration.

However, since logging was activated on the server, I have been getting
a lot of messages like this one:

lpfc 0000:10:00.1: 1:(0):0730 FCP command x26 failed: x2 SNS x70000500 x20000000 Data: xa x200 x10 x0 x0

for which I couldn't find any explanation
(http://www-dl.emulex.com/support/linux/820482p/linux.pdf)

Do you have any information on this?
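
For reference, the two SNS words can be unpacked by hand. The sketch below
assumes (this is not confirmed by the Emulex document above) that the first
word holds sense bytes 0-3 and the second word holds sense bytes 12-15 of
fixed-format sense data, so that the sense key sits in byte 2 and ASC/ASCQ in
bytes 12/13. The function name decode_sns is only illustrative. Under that
assumption the message reads as status x2 (CHECK CONDITION) with sense key 5h
(ILLEGAL REQUEST) and ASC/ASCQ 20h/00h (INVALID COMMAND OPERATION CODE), which
would suggest the target is rejecting opcode x26 rather than the link dropping
the I/O.

```shell
#!/usr/bin/env bash
# Decode the two SNS words printed by lpfc message 0730.
# ASSUMPTION: word 1 = sense bytes 0-3, word 2 = sense bytes 12-15
# (fixed-format sense data). decode_sns is a hypothetical helper name.
decode_sns() {
    # Strip the leading "x" that lpfc prints and parse as hexadecimal.
    local w1=$((16#${1#x})) w2=$((16#${2#x}))
    local resp=$(( (w1 >> 24) & 0x7f ))   # byte 0: response code
    local key=$(( (w1 >> 8) & 0x0f ))     # byte 2, low nibble: sense key
    local asc=$(( (w2 >> 24) & 0xff ))    # byte 12: additional sense code
    local ascq=$(( (w2 >> 16) & 0xff ))   # byte 13: additional sense code qualifier
    printf 'response_code=0x%02x sense_key=0x%x asc=0x%02x ascq=0x%02x\n' \
        "$resp" "$key" "$asc" "$ascq"
}

decode_sns x70000500 x20000000
```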

Also, there are other lpfc parameters that could be tweaked, if I
understand their meaning correctly:

lpfc_hba_queue_depth, currently set to 1024: does it represent the
number of [IOs/Exchanges] the HBA will queue until the remote port
acks them or until it is considered down?

lpfc_max_scsicmpl_time, set to 0: does 0 represent an infinite
value, meaning the driver won't time out any IO for which it did not
receive a completion ack?
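
To check what each adapter is actually running with, the values can be read
back per SCSI host. This is a minimal sketch, assuming the driver exposes
these parameters as files under /sys/class/scsi_host/hostN/ (the thread only
mentions lpfc_log_verbose being available there; the other two names are
assumed to follow the same pattern). show_lpfc_params is a hypothetical helper
name, and hosts or attributes that are missing are simply skipped.

```shell
#!/usr/bin/env bash
# Print selected lpfc attributes for every SCSI host that exposes them.
# ASSUMPTION: the attributes appear as readable files under
# /sys/class/scsi_host/hostN/. show_lpfc_params is a hypothetical helper name.
show_lpfc_params() {
    local base=${1:-/sys/class/scsi_host}
    local host attr
    for host in "$base"/host*; do
        [ -d "$host" ] || continue
        for attr in lpfc_hba_queue_depth lpfc_max_scsicmpl_time lpfc_log_verbose; do
            # Skip hosts that are not lpfc adapters or lack this attribute.
            [ -r "$host/$attr" ] || continue
            printf '%s %s=%s\n' "${host##*/}" "$attr" "$(cat "$host/$attr")"
        done
    done
}

show_lpfc_params "$@"
```

An alternate directory can be passed as the first argument, which is handy for
trying the script against a copy of the sysfs tree.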

Thanks

Brem

2010/4/27 brem belguebli <brem.belguebli@xxxxxxxxx>:
> Hi James,
>
> I could set lpfc_log_verbose on both HBA's to 4115, I hope it'll be high
> enough to get interesting traces.
>
> On Mon, 2010-04-26 at 23:52 +0200, brem belguebli wrote:
>> Hi James,
>>
>> On Mon, 2010-04-26 at 13:50 -0400, James Smart wrote:
>> > Brem,
>> >
>> > I'm not understanding you.
>> >
>> >
>> > brem belguebli wrote:
>> > > We have sg3_utils installed, and I think we ran sg_verify on one or 2
>> > > unresponsive /dev/sd and it didn't give the hand back.
>> > >
>> > what do you mean "give the hand back" ?    was the operation
>> > successful or not ?
>> >
>> When I say it didn't give the hand back, I mean the one or 2 processes
>> got stuck in D state, thus not returning success .
>> > > It was exactly
>> > > cd /sys/block
>> > > for DEV in `ls -1d dev*`; do
>> > > echo ${DEV}
>> > >         dd if=/dev/${DEV} of=/dev/null bs=1024 count=1 &
>> > >         echo
>> > > done
>> > >
>> > > And yes it really works, never seen any kind of preemption of DM-MP over
>> > > direct sd access. I've cc'ed dm-devel may be some DM guru could give his
>> > > opinion on this.
>> > >
>> > > Next time, I'll use a sg_dd instead of dd, to bypass any cache effect
>> > > (by the way, does VFS cache anything when addressing /dev/X devices ?)
>> > >
>> > ok - by "works" means "dd successfully read 1 block from the device" -
>> > right ?
>> >
>> Yes, the devices on which dd was successful were the ones from FABRIC1,
>> dd completed successfully by reading the first 1024 bytes to copy them
>> to /dev/null
>>
>> > > > The most interesting for the lpfc driver would be the lpfc module
>> > > > parameter "lpfc_log_verbose=4115"
>> > > > which turns on discovery log messages, els messages, link events, and
>> > > > FCP i/o error messages.
>> > > >
>> > >
>> > > As our DWDM ring switch is on the less optimal path, there will be a
>> > > switch back to nominal soon.
>> > >
>> > > I'll activate this log level on the HBA's and check the firmware
>> > > versions you gave me .
>> > >
>> > ok. I believe that the shost for the adapters in question, have a
>> > sysfs variable for lpfc_log_verbose, that sets the log level on the
>> > individual adapter. This would not require you to unload/reload the
>> > driver to set the option.
>> >
>> I'll tell you tomorrow (was off today) if the parameter exists for these
>> HBA's.
>
>
>> > > Hopefully, we will be able to provide you something deeper to
>> > > investigate.
>> > >
>> > > Brem
>> > >
>> >
>> > ok.
>> >
>> > -- james
>> >
>> >
>> Thanks
>>
>>
>
>
>