[PATCH] for Deadlock in transport_fc

> currently I'm trying to use the new transport_fc to read the
> very often changing FibreChannel configuration in a test system.
> 
> To avoid a growing list of consistent binding entries (which
> make no sense in this special case), I tried to switch off this
> feature by
>      "echo none > /sys/class/fc_host/host1/tgtid_bind_type"
> 
> Unfortunately, the system stalls immediately, I guess the reason
> is store_fc_private_host_tgtid_bind_type() calling
> fc_rport_terminate() while holding host_lock.

Yep. That is a rather blatant locking bug; it slipped through because
testing was done on a non-SMP box.  Try the attached patch.

> 
> If I understand the code correctly, even if tgtid_bind_type
> would work correctly, still the rport number and scsi-target-id
> would count up on configuration changes. In the lpfc-driver, I
> saw:
> #define MAX_FCP_TARGET              256     /* max num of FCP targets supported */
> Will this result in problems after 256 configuration changes?
> If so, what could I do?

Yes, it will. Once the assigned target id becomes larger than 256,
scsi scans won't see the remote port.

I admit, a more involved implementation is possible if this is a
goal. In general, a production system will always manage devices
by wwpn assignments, and will usually use fabric zoning to minimize
its view. Thus, a configuration such as yours, with high variability
in the fabric, is unusual.

I'm open to a different implementation if deemed necessary.

> BTW: My Emulex boards do not recognize a change behind the
> FibreChannel switch. So I force them to scan the configuration
> using "echo [01] >/sys/class/scsi_host/host1/board_online".
> Is there a better way to do this?

This is an issue worth noting. The lpfc driver registers for RSCN
events, so it should be seeing changes. There could be switch issues
with RSCNs not being posted (rare, and seen a long time ago). The driver
does qualify its nameserver request by FC4 type of FCP. Is the device in
question registering as an FCP type device with the fabric?
Please follow up on this. This should not be happening.

Also - tweaking the lpfc-specific board_online attribute is a little
odd as a way to make things scan. It resets and restarts the entire
adapter. For a link rescan, we recommend that you bounce the link via
     echo 1 > /sys/class/scsi_host/host1/issue_lip
If all you needed was a scsi scan, try
     echo "- - -" > /sys/class/scsi_host/host2/scan
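As a rough sketch, the two lighter-weight options above could be wrapped
in small shell helpers. The function names and the SYSFS_ROOT variable are
hypothetical conveniences, not part of any driver; the attribute names are
only the ones mentioned in this thread, and whether issue_lip exists
depends on the driver version in use.

```shell
#!/bin/sh
# Sketch only: SYSFS_ROOT and the helper names are made up for illustration.
# The attribute paths are the ones quoted in this thread.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# Write a value to a scsi_host attribute; fail clearly if it is absent.
write_host_attr() {
    host="$1"; attr="$2"; value="$3"
    path="$SYSFS_ROOT/class/scsi_host/$host/$attr"
    if [ ! -e "$path" ]; then
        echo "no such attribute: $path" >&2
        return 1
    fi
    printf '%s\n' "$value" > "$path"
}

# Lightest option: wildcard scan ("- - -" = all channels, targets, LUNs).
scsi_rescan() { write_host_attr "$1" scan "- - -"; }

# Heavier option: bounce the link (attribute name as given in this thread).
link_bounce() { write_host_attr "$1" issue_lip 1; }
```

Run as root, for example "scsi_rescan host1", and fall back to
"link_bounce host1" only if the scan alone does not pick up the change.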

-- James


> 
> Regards
> Bodo
> 
> P.S.: Please CC me, I'm not on the list.

Attachment: patch.fc_xpt
Description: patch.fc_xpt

