Re: [RFC PATCH] scsi: libsas: fix WARN on device removal

On 09/11/2016 20:35, Dan Williams wrote:
On Wed, Nov 9, 2016 at 11:09 AM, Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
On Wed, Nov 9, 2016 at 9:36 AM, John Garry <john.garry@xxxxxxxxxx> wrote:
On 09/11/2016 12:28, John Garry wrote:

On 03/11/2016 14:58, John Garry wrote:

The following patch introduces an annoying WARN
when a device is removed from the SAS topology:
[SCSI] libsas: prevent domain rediscovery competing with ata error
handling


Are there any views on this patch? I would have thought that the parties
who use the drivers based on libsas would be interested in fixing this
bug.


I should have added the before and after logs earlier, so the issue is
illustrated. Now attached. When a 24-port expander is unplugged we get >6k
lines of WARN on the console, lasting >30 seconds. Not nice.


I might be mistaken, but this patch seems functionally identical to
this attempt:

http://marc.info/?l=linux-scsi&m=143459794823595&w=2

Hi Dan,

They're not the same. I don't see how your solution properly deals with remote sas_port deletion.

When we unplug a device connected to an expander, can't the sas_port be deleted twice: once in sas_unregister_devs_sas_addr() during domain revalidation, and now again in sas_destruct_devices()? I think this would cause a NULL pointer dereference. We would also still get the WARN, because the sas_port is still deleted before the device.

In my solution, we should always delete the sas_port after the attached device.
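
To illustrate the ordering argument, here is a minimal sketch using toy types. It is not libsas code: every name in it (toy_port, toy_device, and the helper functions) is hypothetical, and the "warned" flag only models the WARN that fires when a port disappears before its attached device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins, NOT the real libsas structures. */
struct toy_port {
    bool deleted;
};

struct toy_device {
    struct toy_port *port;  /* port the device hangs off */
    bool removed;
    bool warned;            /* models the WARN: port gone before device */
};

/* Remove the device; flag a warning if its port has already gone. */
static void toy_device_remove(struct toy_device *dev)
{
    if (dev->removed)
        return;
    if (dev->port && dev->port->deleted)
        dev->warned = true;
    dev->removed = true;
}

/* Delete the port; a NULL port is ignored so a second caller is harmless. */
static void toy_port_delete(struct toy_port *port)
{
    if (!port)
        return;
    port->deleted = true;
}

/* Ordering argued for above: device first, then its port. */
static void toy_teardown_device_first(struct toy_device *dev)
{
    struct toy_port *port = dev->port;

    toy_device_remove(dev);
    toy_port_delete(port);
    dev->port = NULL;   /* drop the reference so a later path cannot reuse it */
}

/* Problematic ordering: port first, then device. */
static void toy_teardown_port_first(struct toy_device *dev)
{
    toy_port_delete(dev->port);
    toy_device_remove(dev);
}

/* Small self-check contrasting the two orderings. */
static void toy_demo(void)
{
    struct toy_port pa = { false }, pb = { false };
    struct toy_device da = { &pa, false, false };
    struct toy_device db = { &pb, false, false };

    toy_teardown_device_first(&da);
    toy_teardown_port_first(&db);
    assert(!da.warned);  /* device-first: no warning */
    assert(db.warned);   /* port-first: warning is raised */
}
```

In this model, clearing dev->port after the teardown is what keeps a second deletion path from touching the port again. Whether the real code can guarantee that is exactly the question raised above.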


i.e. it moves the port destruction to the workqueue and still suffers
from the flutter problem:

http://marc.info/?l=linux-scsi&m=143801026028006&w=2
http://marc.info/?l=linux-scsi&m=143801971131073&w=2

Perhaps we instead need to quiet this warning?

http://marc.info/?l=linux-scsi&m=143802229932175&w=2

I have not seen the flutter issue. I am just trying to solve the horrible WARN dump. However, I do understand that there may be an issue related to how we queue the events; there was a recent attempt to fix this, but it came to nothing:
https://www.spinics.net/lists/linux-scsi/msg99991.html

Cheers,
John


Alternatively we need a mechanism to cancel in-flight port shutdown
requests when we start re-attaching devices before queued port
destruction events have run.
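
As a rough illustration of what such a cancel mechanism could look like, here is a sketch with toy state only. All names are hypothetical and no libsas or workqueue APIs are used; a real implementation would also need locking between the event paths, which this model leaves out.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a port with a queued destruction request. */
struct toy_port_state {
    bool attached;          /* devices currently attached */
    bool destroy_pending;   /* destruction queued but not yet run */
    bool destroyed;
};

/* Link loss: queue destruction instead of destroying immediately. */
static void toy_queue_destroy(struct toy_port_state *p)
{
    p->attached = false;
    p->destroy_pending = true;
}

/* Re-attach: cancel a queued destruction that has not run yet.
 * Returns true if the existing port is usable afterwards. */
static bool toy_reattach(struct toy_port_state *p)
{
    if (p->destroyed)
        return false;       /* too late, a new port would be needed */
    p->destroy_pending = false;
    p->attached = true;
    return true;
}

/* Deferred worker: destroy only if the request is still pending. */
static void toy_destroy_worker(struct toy_port_state *p)
{
    if (!p->destroy_pending)
        return;
    p->destroy_pending = false;
    p->destroyed = true;
}

/* Flutter case: link drops and returns before the worker runs. */
static void toy_flutter_demo(void)
{
    struct toy_port_state p = { true, false, false };

    toy_queue_destroy(&p);
    assert(toy_reattach(&p));   /* cancels the queued destruction */
    toy_destroy_worker(&p);     /* runs late, finds nothing to do */
    assert(!p.destroyed && p.attached);
}
```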






