On Thu, 2012-02-09 at 14:01 +0800, Jim Barber wrote:
> On 9/02/2012 11:29 AM, Nicholas A. Bellinger wrote:

<SNIP>

> > You should still definitely *not* be seeing NON_EXISTENT_LUN errors
> > after a session reset, regardless of the unsupported bits mentioned
> > above.
> >
> > What does your running qla2xxx configuration look like..? Also, what
> > lio-core.git HEAD is this with again..?
>
> The errors I posted were encountered with a kernel compiled from the latest qla_tgt-3.3 HEAD.
> The first entry of a 'git log' shows:
>
> commit 7a4dc04b51cf01d5cea3332db61c0ac609b09b13
> Author: Roland Dreier <roland@xxxxxxxxxxxxxxx>
> Date: Wed Jan 11 16:58:05 2012 -0800
>
> For your question regarding what does my configuration look like, I'll give you what I think you might need (but I'm just guessing).
> From the targetcli command, an 'ls' at the root results in the following:
>

Hi again Jim,

Thanks for the additional information. So looking at your configuration
below, nothing appears out of the ordinary.

After looking at the problem output from the previous email a bit more,
one thing that is strange is the fact that all of the NON_EXISTENT_LUN
errors are for LUN > 0, when only LUN=0 has been configured on the
individual tcm_qla2xxx ports. Really quite strange for the ESX client to
start hammering new LUNs that aren't registered during TMR handling..

I've tried reproducing this case with unsupported TMRs with a normal
Linux FC qla2xxx client, but I've not had any luck reproducing a similar
case thus far..

What I'm not sure about yet is if a single rejected ABORT is actually
making the ESX client blow up (instead of falling back to LUN_RESET for
example), or if it's the locally generated events (eg: Unknown task mgmt
fn 0xfffd) on the qla_target.c side not being handled that is causing
the hard failure.

If at all possible, it would be very helpful to try to reproduce using
ql2xextended_error_logging so that I can get a better idea of what's
going on.
This is enabled at module load time with:

    modprobe qla2xxx ql2xextended_error_logging=0xffffff

Since you've mentioned it only seems to happen occasionally, you'll
likely end up with gigs of logs with debug enabled.

I'm still looking at how to reproduce this case, so any more logs would
be very helpful to that end.

Thanks,

--nab

> /> ls
> o- / ..................................................................... [...]
>   o- backstores .......................................................... [...]
>   | o- fileio ............................................... [0 Storage Object]
>   | o- iblock ............................................... [1 Storage Object]
>   | | o- datastore0 ............................ [/dev/vg1/datastore0 activated]
>   | o- pscsi ................................................ [0 Storage Object]
>   | o- rd_dr ................................................ [0 Storage Object]
>   | o- rd_mcp ............................................... [0 Storage Object]
>   o- iscsi .......................................................... [0 Target]
>   o- loopback ....................................................... [0 Target]
>   o- qla2xxx ....................................................... [4 Targets]
>   | o- 21:00:00:1b:32:1b:27:5d ....................................... [enabled]
>   | | o- acls ......................................................... [4 ACLs]
>   | | | o- 21:00:00:1b:32:83:ac:c5 .............................. [1 Mapped LUN]
>   | | | | o- mapped_lun0 ........................................... [lun0 (rw)]
>   | | | o- 21:00:00:1b:32:83:ba:c4 .............................. [1 Mapped LUN]
>   | | | | o- mapped_lun0 ........................................... [lun0 (rw)]
>   | | | o- 21:00:00:1b:32:83:c3:c2 .............................. [1 Mapped LUN]
>   | | | | o- mapped_lun0 ........................................... [lun0 (rw)]
>   | | | o- 21:00:00:1b:32:83:f3:c6 .............................. [1 Mapped LUN]
>   | | |   o- mapped_lun0 ........................................... [lun0 (rw)]
>   | | o- luns .......................................................... [1 LUN]
>   | |   o- lun0 ...................... [iblock/datastore0 (/dev/vg1/datastore0)]
>   | o- 21:00:00:1b:32:1b:f8:5e ....................................... [enabled]
>   | | o- acls ......................................................... [4 ACLs]
>   | | | o- 21:00:00:1b:32:83:ac:c5 .............................. [1 Mapped LUN]
>   | | | | o- mapped_lun0 ........................................... [lun0 (rw)]
>   | | | o- 21:00:00:1b:32:83:ba:c4 .............................. [1 Mapped LUN]
>   | | | | o- mapped_lun0 ........................................... [lun0 (rw)]
>   | | | o- 21:00:00:1b:32:83:c3:c2 .............................. [1 Mapped LUN]
>   | | | | o- mapped_lun0 ........................................... [lun0 (rw)]
>   | | | o- 21:00:00:1b:32:83:f3:c6 .............................. [1 Mapped LUN]
>   | | |   o- mapped_lun0 ........................................... [lun0 (rw)]
>   | | o- luns .......................................................... [1 LUN]
>   | |   o- lun0 ...................... [iblock/datastore0 (/dev/vg1/datastore0)]
>   | o- 21:01:00:1b:32:3b:27:5d ....................................... [enabled]
>   | | o- acls ......................................................... [4 ACLs]
>   | | | o- 21:00:00:1b:32:83:ac:c5 .............................. [1 Mapped LUN]
>   | | | | o- mapped_lun0 ........................................... [lun0 (rw)]
>   | | | o- 21:00:00:1b:32:83:ba:c4 .............................. [1 Mapped LUN]
>   | | | | o- mapped_lun0 ........................................... [lun0 (rw)]
>   | | | o- 21:00:00:1b:32:83:c3:c2 .............................. [1 Mapped LUN]
>   | | | | o- mapped_lun0 ........................................... [lun0 (rw)]
>   | | | o- 21:00:00:1b:32:83:f3:c6 .............................. [1 Mapped LUN]
>   | | |   o- mapped_lun0 ........................................... [lun0 (rw)]
>   | | o- luns .......................................................... [1 LUN]
>   | |   o- lun0 ...................... [iblock/datastore0 (/dev/vg1/datastore0)]
>   | o- 21:01:00:1b:32:3b:f8:5e ....................................... [enabled]
>   |   o- acls ......................................................... [4 ACLs]
>   |   | o- 21:00:00:1b:32:83:ac:c5 .............................. [1 Mapped LUN]
>   |   | | o- mapped_lun0 ........................................... [lun0 (rw)]
>   |   | o- 21:00:00:1b:32:83:ba:c4 .............................. [1 Mapped LUN]
>   |   | | o- mapped_lun0 ........................................... [lun0 (rw)]
>   |   | o- 21:00:00:1b:32:83:c3:c2 .............................. [1 Mapped LUN]
>   |   | | o- mapped_lun0 ........................................... [lun0 (rw)]
>   |   | o- 21:00:00:1b:32:83:f3:c6 .............................. [1 Mapped LUN]
>   |   |   o- mapped_lun0 ........................................... [lun0 (rw)]
>   |   o- luns .......................................................... [1 LUN]
>   |     o- lun0 ...................... [iblock/datastore0 (/dev/vg1/datastore0)]
>   o- tcm_fc ......................................................... [0 Target]
>
> From this you can see that I have four fibre channel (FC) ports on my Linux SAN (2x dual port cards).
> I have two VMware ESXi hosts, that each have two FC ports.
> The same LUN is mapped from each local FC port to all the FC ports on the ESXi servers.
>

<SNIP>