Re: PATCH [1/1]: sd_remove() hangs waiting on async_synchronize of unrelated threads

James Bottomley wrote:
> On Tue, 2009-12-01 at 16:28 -0600, Michael Reed wrote:
>> James Bottomley wrote:
>>> On Tue, 2009-12-01 at 15:45 -0600, Michael Reed wrote:
>>>> Prevent delays and hangs due to sd_remove() waiting for the completion of
>>>> async threads executing sd_probe_async of disks on unrelated host adapters.
>>>> This patch executes every sd_probe_async in its own async domain allowing
>>>> sd_remove() to wait for just the completion of the async thread associated with
>>>> the scsi_disk being removed.
>>> This patch was thought of a while ago.  Unfortunately, some of the
>>> unrelated threads we end up waiting on are libata ones.  If you confine sd
>>> to only its own probes, we end up unsynchronised with respect to libata
>>> probes and we might cause ordering problems amongst the ata devices.
>>>
>> Isn't sd_remove() only concerned with the removal of a single scsi_disk?
>> Shouldn't libata use reference counting if it has an issue with a scsi_disk
>> being prematurely removed?  Or is this a concern about "sd" naming?  Or something
>> else that I admittedly don't understand (but should)?
> 
> Well, no ... the sync on remove is preventing us removing a device whose
> async part is still running.  That async part for libata includes pieces
> kicked off from the sd probe.

What does the call stack look like that spawns the async part of libata?

What kind of hardware do I need to demonstrate this?

> 
>> I would truly like to better understand the issue.  Would someone mind expanding
>> upon the concern about ata ordering issues associated with the removal of a
>> single scsi_disk?
> 
> The problem isn't removal per se ... it's the fact that remove can't
> complete until any async pieces remaining from probe have run. 

Yes, I understand that.  I didn't realize that the sd_probe resulted in any
async work other than sd_probe_async().  It does complicate serialization
at removal.
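
To make the trade-off concrete, here is a small userspace model in Python.
It is not kernel code and does not call any kernel API.  The class, the
domain names and the "follow-on work" job are all hypothetical, and the
follow-on job is only a stand-in for whatever libata work James is
referring to.  It just shows which pending jobs a global wait would cover
versus a wait confined to one disk's own domain.

```python
from collections import defaultdict


class AsyncModel:
    """Toy bookkeeping of pending async jobs, grouped by domain (hypothetical)."""

    def __init__(self):
        # domain name -> list of jobs that have been scheduled but not finished
        self.pending = defaultdict(list)

    def schedule(self, domain, job):
        """Record a job as pending in the given domain."""
        self.pending[domain].append(job)

    def waits_global(self):
        """Jobs a wait across every domain would have to see finish."""
        return sorted(job for jobs in self.pending.values() for job in jobs)

    def waits_domain(self, domain):
        """Jobs a wait confined to a single domain would have to see finish."""
        return sorted(self.pending.get(domain, []))


def build_example():
    """Two disks probing, plus assumed follow-on work queued outside disk-A's domain."""
    model = AsyncModel()
    model.schedule("disk-A", "sd_probe_async(disk-A)")
    model.schedule("disk-B", "sd_probe_async(disk-B)")
    # Assumption for illustration: work started by disk-A's probe but
    # queued in a different domain.
    model.schedule("other", "follow-on work for disk-A")
    return model


if __name__ == "__main__":
    model = build_example()
    print("global wait covers:   ", model.waits_global())
    print("per-disk wait covers: ", model.waits_domain("disk-A"))
```

In this model the global wait is held up by disk-B's unrelated probe, which
is the delay the patch is trying to avoid.  The per-disk wait avoids that,
but it also no longer covers the follow-on job, which is the gap James is
pointing at.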

I'll try to capture the "motivation" from within my fibre channel centric world
for the change and see if anyone's got some ideas on how to resolve the issue.

Mike


> 
> James