RE: [PATCH v2] libsas: fix bug for vacant phy

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



This fixes the issue I was seeing. Thanks!

-----Original Message-----
From: Jack Wang [mailto:jack_wang@xxxxxxxxx] 
Sent: Sunday, September 19, 2010 10:52 PM
To: 'Jack Wang'; Chuck Tuffli; 'James Bottomley'; linux-scsi@xxxxxxxxxxxxxxx
Cc: 'lindar_liu'; 'roy'
Subject: [PATCH v2] libsas: fix bug for vacant phy


Hi, James

Please drop the previous patch and apply this new one.
Attached patch fix following bugs reported by Chuck.

Chuck
Could you test if this solve your problem ?

Jack -
Signed-off-by: Jack Wang <jack_wang@xxxxxxxxx>

Signed-off-by: Lindar <lindar_liu@xxxxxxxxx>
---
 sas_expander.c |    7 ++++---
 1 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/sas_expander.c b/sas_expander.c
index d1d86a6..f63dbea 100644
--- a/sas_expander.c
+++ b/sas_expander.c
@@ -175,10 +175,10 @@ static void sas_set_ex_phy(struct domain_device *dev,
int phy_id,
 	switch (resp->result) {
 	case SMP_RESP_PHY_VACANT:
 		phy->phy_state = PHY_VACANT;
-		return;
+		break;
 	default:
 		phy->phy_state = PHY_NOT_PRESENT;
-		return;
+		break;
 	case SMP_RESP_FUNC_ACC:
 		phy->phy_state = PHY_EMPTY; /* do not know yet */
 		break;
@@ -209,7 +209,8 @@ static void sas_set_ex_phy(struct domain_device *dev,
int phy_id,
 	phy->phy->negotiated_linkrate = phy->linkrate;
 
 	if (!rediscover)
-		sas_phy_add(phy->phy);
+		if (sas_phy_add(phy->phy))
+			sas_phy_free(phy->phy);
 
 	SAS_DPRINTK("ex %016llx phy%02d:%c attached: %016llx\n",
 		    SAS_ADDR(dev->sas_addr), phy->phy_id,
-- 
1.7.2.3.msysgit.0


I finally had a chance to try something more recent (2.6.34) and I still see
the problem. I posted my findings to linux-scsi
(http://marc.info/?l=linux-scsi&m=128254243405363&w=2), but no one has
commented. Do you have any suggestions for approaches to fix this? I'm
willing to do the work, but am a little unclear where to look. Thanks!

-----Original Message-----
From: jack wang [mailto:jack_wang@xxxxxxxxx] 
Sent: Thursday, July 08, 2010 6:35 PM
To: Chuck Tuffli
Cc: 'lindar_liu'; 'aoqingy'; 'roy'
Subject: RE: BUG reported during pm8001 driver rmmod

Hi Jack.

We have been using your Linux driver for testing and it has been working
great (thanks!). Today I tested against a new JBOD (HP D2700) and am
hitting an error when unloading the driver. Note that I don't see this
error with other JBODs (IBM, USI, etc), only the new one. Have you seen
anything like this before?

Chuck

[510] uname -srv
Linux 2.6.31-22-server #60-Ubuntu SMP Thu May 27 03:42:09 UTC 2010
[511] sudo insmod ./pm8001.ko 
[512] sudo rmmod pm8001 
Segmentation fault
[513] dmesg
...
[14131.624620] pm8001 0000:09:00.0: pm8001: driver version 0.1.36
[14131.630619] pm8001 0000:09:00.0: PCI INT A -> GSI 16 (level, low) ->
IRQ 16
[14131.637752] pm8001 0000:09:00.0: setting latency timer to 64
[14132.541239] scsi4 : pm8001
[14132.544347]   alloc irq_desc for 97 on node 0
[14132.548799]   alloc kstat_irqs on node 0
[14132.552851] pm8001 0000:09:00.0: irq 97 for MSI/MSI-X
[14248.581889] scsi 4:0:0:0: Direct-Access     HP       DG0300FARVV
HPD6 PQ: 0 ANSI: 5
[14248.590706] sd 4:0:0:0: Attached scsi generic sg2 type 0
[14248.596738] sd 4:0:0:0: [sdb] 585937500 512-byte logical blocks: (300
GB/279 GiB)
[14248.605469] sd 4:0:0:0: [sdb] Write Protect is off
[14248.610371] sd 4:0:0:0: [sdb] Mode Sense: eb 00 10 08
[14248.616130] sd 4:0:0:0: [sdb] Write cache: disabled, read cache:
enabled, supports DPO and FUA
[14248.627080]  sdb: unknown partition table
[14248.634138] sd 4:0:0:0: [sdb] Attached SCSI disk
[14248.656076] scsi 4:0:1:0: Enclosure         HP       D2700 SAS AJ941A
0052 PQ: 0 ANSI: 5
[14248.667852] scsi 4:0:1:0: Attached scsi generic sg3 type 13
[14248.945688] ses 4:0:1:0: Attached Enclosure device
[14288.328616] pm8001 0000:09:00.0: PCI INT A disabled
[14288.333712] ------------[ cut here ]------------
[14288.338429] kernel BUG at
/build/buildd/linux-2.6.31/include/linux/transport_class.h:92!
[14288.343639] invalid opcode: 0000 [#1] SMP 
[14288.343639] last sysfs file:
/sys/devices/pci0000:00/0000:00:1c.0/0000:09:00.0/host4/port-4:0/expande
r-4:0/port-4:0:36/end_device-4:0:36/target4:0:1/4:0:1:0/type
[14288.362505] CPU 0 
[14288.362505] Modules linked in: ses enclosure pm8001(-) nfs lockd
nfs_acl auth_rpcgss sunrpc radeon ttm drm libsas i2c_algo_bit
scsi_transport_sas iptable_filter psmouse ip_tables i5400_edac edac_core
lp x_tables serio_raw i5k_amb shpchp parport floppy igb dca
[14288.391254] Pid: 1204, comm: rmmod Not tainted 2.6.31-22-server
#60-Ubuntu X7DW3
[14288.392505] RIP: 0010:[<ffffffffa00c10c8>]  [<ffffffffa00c10c8>]
sas_release_transport+0x88/0x90 [scsi_transport_sas]
[14288.402505] RSP: 0018:ffff88003cdd5eb8  EFLAGS: 00010286
[14288.412505] RAX: 00000000fffffff0 RBX: ffff88003b698000 RCX:
01000000000000c1
[14288.422505] RDX: ffff88003b698700 RSI: ffffffff812782b0 RDI:
ffffffff817d8fa0
[14288.422505] RBP: ffff88003cdd5ec8 R08: 0000000000000000 R09:
0000000000000000
[14288.432505] R10: 0000000000000000 R11: ffff88003d94d9f4 R12:
ffffffffa02a6fa0
[14288.442505] R13: 0000000000000000 R14: 00007fff3e63a700 R15:
0000000000000001
[14288.451254] FS:  00007f8dd6c6d6f0(0000) GS:ffff8800019f3000(0000)
knlGS:0000000000000000
[14288.452505] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[14288.462505] CR2: 00007fed69a640a0 CR3: 000000003c8ef000 CR4:
00000000000006f0
[14288.472505] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[14288.472505] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
0000000000000400
[14288.482505] Process rmmod (pid: 1204, threadinfo ffff88003cdd4000,
task ffff88003ca516b0)
[14288.492505] Stack:
[14288.492505]  0000000000000000 0000000000000880 ffff88003cdd5ed8
ffffffffa029dd20
[14288.502505] <0> ffff88003cdd5f78 ffffffff8108ed38 ffff88003cdd5ef8
ffffffff8107c909
[14288.512505] <0> ffffffffa02a6fa0 ffffffff00000880 ffff88003cdd5f14
0000000000000014
[14288.521254] Call Trace:
[14288.522505]  [<ffffffffa029dd20>] pm8001_exit+0x1c/0x1e [pm8001]
[14288.522505]  [<ffffffff8108ed38>] sys_delete_module+0x1a8/0x280
[14288.532505]  [<ffffffff8107c909>] ? up_read+0x9/0x10
[14288.541254]  [<ffffffff81012042>] system_call_fastpath+0x16/0x1b
[14288.542505] Code: 0f 6c 26 e1 85 c0 75 13 48 89 df e8 d3 31 05 e1 48
83 c4 08 5b c9 c3 0f 0b eb fe 0f 0b eb fe 0f 0b eb fe 0f 0b eb fe 0f 0b
eb fe <0f> 0b eb fe 0f 1f 40 00 55 48 89 e5 41 56 41 55 41 54 53 4c 8b 
[14288.562505] RIP  [<ffffffffa00c10c8>] sas_release_transport+0x88/0x90
[scsi_transport_sas]
[14288.572505]  RSP <ffff88003cdd5eb8>
[14288.580975] ---[ end trace 0436c237fa6eeca0 ]---

Hi, Chuck
I haven't seen this bug before and We don't have the HP JBOD to test. It
seams you hit the bug in linux-2.6.31/include/linux/transport_class.h:92!,
the problem is transport unresgister the HP Enclosure HP D2700,
the bug appears. Maybe there something wrong with the ses && enclosure
modules , could you update to newer kernel to see whether the bug 
still exists?
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [SCSI Target Devel]     [Linux SCSI Target Infrastructure]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Linux IIO]     [Samba]     [Device Mapper]
  Powered by Linux