Re: Serious regression caused by fix for [BUG 1/3] bsg queue oops with iscsi logout

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, 22 Mar 2008 11:06:00 -0500
James Bottomley <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx> wrote:

> On Tue, 2008-03-11 at 00:36 -0500, Mike Christie wrote:
> > Mike Christie wrote:
> > > Pete Wyckoff wrote:
> > >> I think this used not to happen; not sure.  But I changed two things
> > > 
> > > This most likely did not happen before 2.6.25-rc* or it broke in 
> > > slightly different ways, because iscsi used to try and do
> > > 
> > > echo 1 > /sys/block/sdX/device/delete
> > > 
> > > from userspace instead of calling scsi_remove_target from the kernel.
> > > 
> > > As you know around 2.6.21, the behavior of doing the echo to the delete 
> > > file changed due to a driver model and scsi change and that broke the 
> > > iscsi tools. The iscsi tools userspace removal was sort of hack in the 
> > > first place and was racey, so we switched to removing devices/target 
> > > like the FC class.
> > > 
> > > 
> > >> lately.  2.6.25-rc1 to -rc4 and fedora 8 iscsi-initiator-utils (865) to
> > >> fedora devel (868).  Bidi and varlen patches always too.
> > >>
> > >> I'll follow with some more variations on this theme.  Looks like bsg
> > >> needs to protect more carefully against the device going away.  Any
> > >> ideas how best to do this?  What was the approach in sg?
> > >>
> > > 
> > > I think sg is broken in similar ways. The iser guys have some tests 
> > > cases that have broken sg while IO is outstanding. I am ccing Erez.
> > 
> > Actually one of the problems looks a little different than some of the 
> > problems hit with sg and are caused because we remove the bsg device too 
> > soon. I think we want to wait until all the references from the 
> > commands/requests are released. The attached patch (untested) moves the 
> > bsg unreg call to the scsi device release fn.
> 
> Well, this fix is now upstream.  However, it's causing all our
> scsi_devices never to get released, which is a serious regression.
> We're also doing spurious bsg_unregister_queue() for things that never
> actually registered one (all scan devices that return DID_NO_CONNECT),
> but bsg doesn't seem to be complaining about this.
> 
> The essence of the problem is that bsg_register_queue() takes a ref to
> the sdev_gendev, so you can't move bsg_unregister_queue() into the
> release function because nothing ever puts bsg's device ref and so
> release is never called.
> 
> Options for fixing this before 2.6.25 are
> 
>      1. revert the patch
>      2. Do an additional put for the bsg reference in
>         __scsi_remove_device (patch below).  It's nasty but it preserves
>         the semantics and does what you want

After some investigation, this patch doesn't fix the bug that Pete
reported (I'll send a new patch shortly).

Can you revert the commit 4b6f5b3a993cbe34b4280f252bccc76967c185c8
instead of merging this?
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [SCSI Target Devel]     [Linux SCSI Target Infrastructure]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Linux IIO]     [Samba]     [Device Mapper]
  Powered by Linux