Hi all,
I am seeing an issue while using an LSI-3008-based adapter (mpt3sas
driver) on a PowerPC system (although I am not yet convinced it is
architecture dependent). When I create a RAID1 volume, the physical disk
devices get "hidden" as expected, but the various kernel objects end up
out of sync. The corresponding bit in the "sd_index_ida" bitmap gets
cleared, and the symlink in /sys/dev/block for this major:minor pair
gets removed, but none of the other major:minor entries in sysfs get
removed. The next time a new device is added (for example, during
another RAID volume create or delete), the recently-freed major:minor
number is picked up from the "sd_index_ida" bitmap but the attempt to
create sysfs entries fails with EEXIST due to an entry of the same name
already (still) existing. This failure goes unhandled and later the
kernel panics in sd_probe_async while dereferencing an (apparently)
invalid backing_dev_info structure (presumably left invalid due to the
EEXIST error).
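
To illustrate the sequence described above, here is a toy model in Python.
It is only a sketch, not kernel code: IndexPool and NameRegistry are
hypothetical stand-ins for the "sd_index_ida" bitmap and a sysfs directory,
and it assumes the allocator always hands out the lowest free index.

```python
import errno


class IndexPool:
    """Hypothetical stand-in for sd_index_ida: lowest free index wins."""

    def __init__(self):
        self.used = set()

    def alloc(self):
        i = 0
        while i in self.used:
            i += 1
        self.used.add(i)
        return i

    def free(self, i):
        self.used.discard(i)


class NameRegistry:
    """Hypothetical stand-in for a sysfs directory: names must be unique."""

    def __init__(self):
        self.names = set()

    def add(self, name):
        if name in self.names:
            raise FileExistsError(errno.EEXIST, "entry already exists", name)
        self.names.add(name)


def disk_name(index):
    # Simplified naming, valid only for indexes 0..25 (sda..sdz).
    return "sd" + chr(ord("a") + index)


def simulate():
    pool, sysfs = IndexPool(), NameRegistry()
    # Three disks are probed and registered: sda, sdb, sdc.
    for _ in range(3):
        sysfs.add(disk_name(pool.alloc()))
    # "Hide" sdb: its index is released, but its name is left behind.
    pool.free(1)
    # The next new device picks up the freed index and collides on the name.
    idx = pool.alloc()
    try:
        sysfs.add(disk_name(idx))
    except FileExistsError as exc:
        return idx, exc.errno
    return idx, 0
```

In this model simulate() returns index 1 together with errno.EEXIST, which
mirrors the reuse-then-collide pattern above. Whether the real failure follows
exactly this path is an assumption.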
A reboot clears this up (both the bitmap and sysfs), and the second RAID
volume (if a create was done) shows up normally. However, even if the
panic were avoided by better error handling in sd_probe_async, there
would still be the problem of not being able to create more than one
RAID volume without rebooting.
I am wondering if this issue has been seen elsewhere, and also just what
might be going wrong. For mpt3sas, it appears that the firmware largely
drives the hiding/exposing of devices, but I don't see an issue with the
ordering of those events. I am wondering if the driver is failing to
set up the device attributes correctly in order to get the proper sysfs
handling.
I am seeing this on Ubuntu 16.04, but also see it on the upstream
kernel. Oddly, it does not happen on RHEL 7.2 (an older kernel).
A possibly related issue we see is that when a RAID volume is deleted,
none of its device nodes (in /dev as well as /sys) get removed, even
though they are no longer usable. Deleting before creating does not
produce the panic, so I believe the "sd_index_ida" bitmap is not getting
updated by the delete.
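
For anyone who wants to check whether their system shows the same mismatch,
the following Python sketch lists block devices that still have an entry
under /sys/class/block but no matching major:minor symlink under
/sys/dev/block. The function name find_orphans and the choice of
/sys/class/block as the place where stale entries remain are assumptions
made for illustration; the leftover entries may live elsewhere in sysfs.

```python
import os


def find_orphans(sysfs_root="/sys"):
    """Return (name, "major:minor") pairs lacking a /sys/dev/block link."""
    class_dir = os.path.join(sysfs_root, "class", "block")
    dev_dir = os.path.join(sysfs_root, "dev", "block")
    orphans = []
    for name in sorted(os.listdir(class_dir)):
        dev_file = os.path.join(class_dir, name, "dev")
        try:
            with open(dev_file) as f:
                devno = f.read().strip()
        except OSError:
            continue  # no readable "dev" attribute; skip this entry
        if not devno:
            continue
        # lexists() also counts a dangling symlink as present.
        if not os.path.lexists(os.path.join(dev_dir, devno)):
            orphans.append((name, devno))
    return orphans
```

Calling find_orphans() with no arguments inspects the live /sys tree. The
sysfs_root parameter exists only so that the function can be pointed at a
copy of the tree or a test fixture.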
Any help would be appreciated.
Thanks,
Doug
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html