Re: [PATCH 00/20, v4] Make ib_srp better suited for H.A. purposes

Hi Bart,

On 28.08.2012 10:04, Bart Van Assche wrote:
> On 08/27/12 18:37, Dongsu Park wrote:
> > while testing ib_srp based on your srp-ha,
> > we sometimes hit kernel crashes with the call trace below.
> > 
> > How to reproduce:
> > 
> > 0. Kernel 3.2.15 with SCST v4193 on the target,
> >    Kernel 3.2.8 with ib_srp-ha on the initiator.
> > 1. Configure 500+ vdisks on target, and get initiator connected.
> > 2. Exchange data intensively, which works well.
> > 3. (On initiator) delete SRP remote port occasionally, e.g.
> >    # echo "1" > /sys/class/srp_remote_ports/port-6\:1/delete
> >    And configure again the SRP target.
> > 4. (On target) disable Infiniband interface, and enable it again.
> > 5. Repeat 3 and 4.
> > 
> > Then the initiator's kernel suddenly crashes. (but not always)
> > 
> > Do you have any idea why?
> 
> Hello Dongsu,
> 
> That's unfortunate. I've just finished running the above test 1000 times
> on my test setup. The test ran perfectly - login succeeded every time,
> the test finished in the expected time, no kernel crash did occur and no
> memory was leaked. I've been running my test with kernel 3.6-rc3 instead
> of kernel 3.2.8 though. Can you repeat your test with kernel 3.6-rc3 on
> the initiator system instead of kernel 3.2.8 ? The 3.6-rc3 kernel
> contains multiple patches that improve robustness with regard to SCSI
> device removal.

OK, once I get a chance to set up a new test system with kernel 3.6-rc3,
I'll rerun the test and let you know.

By the way, as far as I have observed today, the crash occurs only if
rport_dev_loss_timedout() is called. That is, without a device loss,
a simple rport_delete does not cause a crash.

Could that be because the arguments to pr_err() access invalid
addresses?

drivers/scsi/scsi_transport_srp.c:275

        pr_err("SRP transport: dev_loss_tmo (%ds) expired - removing %s.\n",
               rport->dev_loss_tmo, dev_name(&rport->dev));
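
To make the question concrete: if the rport were freed before the timeout
handler runs, both arguments above would read freed memory. That is only a
guess at this point, not a confirmed diagnosis. Below is a minimal
user-space sketch of the usual remedy, where whoever arms the timeout holds
its own reference until the handler has finished. All names (fake_rport,
fake_arm_dev_loss and so on) are hypothetical stand-ins, not the
scsi_transport_srp code; in the kernel the analogous calls would be
get_device() and put_device().

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the rport; NOT the kernel's struct srp_rport. */
struct fake_rport {
	int refcount;
	int dev_loss_tmo;
	char name[32];
};

/* Allocate an rport whose creator holds the first reference. */
static struct fake_rport *fake_rport_new(const char *name, int tmo)
{
	struct fake_rport *r = calloc(1, sizeof(*r));

	if (!r)
		return NULL;
	r->refcount = 1;
	r->dev_loss_tmo = tmo;
	snprintf(r->name, sizeof(r->name), "%s", name);
	return r;
}

/* Drop one reference; returns 1 if this call freed the object. */
static int fake_rport_put(struct fake_rport *r)
{
	assert(r->refcount > 0);
	if (--r->refcount == 0) {
		free(r);
		return 1;
	}
	return 0;
}

/* Arming the timeout takes a reference on behalf of the handler. */
static void fake_arm_dev_loss(struct fake_rport *r)
{
	r->refcount++;
}

/* Deleting the port only drops the creator's reference. */
static int fake_rport_delete(struct fake_rport *r)
{
	return fake_rport_put(r);
}

/*
 * Timeout handler: the object is still valid here because the reference
 * taken in fake_arm_dev_loss() has not been dropped yet. The message is
 * formatted first, then that reference is released.
 */
static int fake_dev_loss_timedout(struct fake_rport *r, char *buf, size_t len)
{
	snprintf(buf, len,
		 "SRP transport: dev_loss_tmo (%ds) expired - removing %s.",
		 r->dev_loss_tmo, r->name);
	return fake_rport_put(r);
}
```

With this ordering, a delete that races with an armed timeout does not free
the object; the handler's final put does. Whether the crash seen here is
really caused by a missing reference of this kind remains to be verified.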

Cheers,
Dongsu


