On 5/16/19 2:26 PM, Alibek Amaev wrote:
Hi!
I want to address the following problem:
On the system with hot-attached new storage volume, such as FC-switch
update configuration for connected FC-HBA on servers, linux kernel
reorder block devices and change names of block devices. Becouse
scsi-id, wwn-id and other is a symbol links to block device names than
on change block device name change path to device.
This causes the server to stop working.
For example, on server present ZFS pool with attached device by scsi-id
# zpool status
pool: pool
state: ONLINE
scan: scrub repaired 0 in 1h39m with 0 errors on Sun Oct 8 02:03:34 2017
config:
NAME STATE READ WRITE CKSUM
pool ONLINE 0 0 0
scsi-3600144f0c7a5bc61000058d3b96d001d ONLINE 0 0 0
Before export new block device from storage to hba, scsi-id have next
path to device:
/dev/disk/by-id/scsi-3600144f0c7a5bc61000058d3b96d001d -> ../../sdd
When added new block device by FC-switch, FC-HBA kernel change block
device names:
/dev/disk/by-id/scsi-3600144f0c7a5bc61000058d3b96d001d -> ../../sdf
and ZFS can't access to device until reboot (partprobe, zpool online
-e pool scsi-3600144f0c7a5bc61000058d3b96d001d - may help or may not
help)
Hmm. That really is curious; typically existing devices will not be
reassigned. Especially not if they are in use by something.
And the FC layer is going into quite some lengths to prevent this from
happening.
So this really looks more like an issue with how exactly this 'adding
new block device' step was done.
Is there any way to fix or change this behavior of the kernel?
As I said, this typically does not happen.
It would need closer examination to figure out what really happened.
It may be more reasonable to immediately assign an unique persistent
identifier of device and linking other identifiers with it?
Which is what we try ...
Also I think this is not specific problem of ZFS. And can occur with other
file system modules.Moreover, I had previously encountered a similar
problem - NetAPP
storage attached to servers by FC and export multiple LUN - suddenly
decided to change the order of LUNs and Ext4 on servers is switch to
readonly mode because driver detect changes of magic number in
superblocks of partitions.
Suddently changing the order of LUNs is _not_ what is supposed to
happen. This really sounds more like an issue with NetApp.
Cheers,
Hannes
--
Dr. Hannes Reinecke Teamlead Storage & Networking
hare@xxxxxxx +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah
HRB 21284 (AG Nürnberg)