Hey guys,

Let me introduce myself. I'm Rik Bobbaers, working at the KULeuven (a university in Belgium). I'm a Linux sysadmin and have followed a Linux kernel internals course and a device drivers course.

I have some machines that I want to connect to a SAN, which also has an IBM ESS (Shark) connection. The SAN is split in two (for failover). Each server has 2 QLogic Fibre Channel cards, each connected to a different SAN switch. I want failover/load balancing over those (duh ;)).

hardware: Dell PE 1750, dual Xeon CPU, 1 GB RAM
distro: Debian/unstable (normally it's stable)
kernel version: vanilla 2.6.13-rc6
bootloader: lilo (maybe a switch to grub is necessary?)
udev version: 0.067-1
multipath version: 0.4.2.4-2
fibre channel cards:
0000:01:04.0 Fibre Channel: QLogic Corp. QLA2312 Fibre Channel Adapter (rev 02)
0000:03:06.0 Fibre Channel: QLogic Corp. QLA2312 Fibre Channel Adapter (rev 02)

Now the questions:

1. If I unplug one cable (simulating a SAN switch breakdown/upgrade/...), a timer starts and expires after fc_dev_loss_tmo seconds. Normally this is 35 seconds (the QLogic driver adds 5 seconds to the default 30). The timer can be set to a maximum of SCSI_DEVICE_BLOCK_MAX_TIMEOUT, which seems to be 15000 seconds, i.e. 4 hours 10 minutes.

If you plug the cable in again before the timer reaches 0, the device is reactivated and can be reintegrated in the multipath (the disks keep the same major:minor numbers). If the timer runs out, the devices are removed permanently. When you then plug the cable back in, the newly found disks get other major:minor numbers and device names (same major, but well... ;)), which makes reintegration in the multipath impossible unless you unmount everything etc...

This is not the behaviour you want on a SAN in a failover/load balancing environment, imho. Is there a way to have the disks re-recognised as "the same disks as before"?
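To double-check the figures in question 1, here is a minimal Python sketch of the arithmetic. The 30 s, 5 s and 15000 s values are the ones quoted above; the helper names are made up for illustration, and the jiffies check rests on an unverified assumption: that the timeout in seconds is multiplied by HZ and kept in a 32-bit counter on this machine.

```python
# Values as quoted in the mail above.
BASE_DEV_LOSS_TMO = 30   # seconds, the "normal" default
QLOGIC_EXTRA = 5         # seconds the QLogic driver adds
MAX_TIMEOUT = 15000      # seconds, what SCSI_DEVICE_BLOCK_MAX_TIMEOUT seems to be


def split_seconds(total):
    """Return (hours, minutes, seconds) for a duration given in seconds."""
    hours, rest = divmod(total, 3600)
    minutes, seconds = divmod(rest, 60)
    return hours, minutes, seconds


def fits_in_u32_ticks(seconds, hz):
    """True if seconds * hz fits in an unsigned 32-bit counter.

    Assumption (not verified against the kernel source): the timeout is
    converted to timer ticks by multiplying it by HZ.
    """
    return seconds * hz < 2**32


if __name__ == "__main__":
    print(BASE_DEV_LOSS_TMO + QLOGIC_EXTRA)      # 35
    print(split_seconds(MAX_TIMEOUT))            # (4, 10, 0)
    # hz=1000 is only an example value; the real HZ depends on the kernel config.
    print(fits_in_u32_ticks(MAX_TIMEOUT, 1000))  # True
    print(fits_in_u32_ticks(2**32, 1000))        # False
```

Under that assumption, 15000 seconds is comfortably within range, but a timeout anywhere near 2^32 seconds would no longer fit once multiplied by HZ, so raising the limit that far looks risky at first sight.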
To put question 1 another way: can the driver remove the devices, but re-create them when the SAN connection returns after an undetermined amount of time?

If not, is it dangerous to set SCSI_DEVICE_BLOCK_MAX_TIMEOUT to ... let's say the 32-bit unsigned maximum (2^32 - 1)? Or to redefine it in drivers/scsi/scsi_transport_fc.c? What would be the consequences? I can imagine that this could cause memory problems, workqueue problems or the like. I didn't find anything about this in the lkml archives.

2. Is it possible to make the Fibre Channel driver get loaded AFTER the others? (Our kernels have no module support; I built everything into the kernel for security reasons etc...)

The reason is very simple. If I have 1 disk on the ESS, it now becomes /dev/sda on scsi0 and /dev/sdb on scsi1, and the local disks are /dev/sdc and /dev/sdd. If I add another ESS disk, it will be /dev/sde and /dev/sdf. After a reboot, it will be /dev/sdc and /dev/sdd, and the local disks will become /dev/sde and /dev/sdf.

At first I thought of using e2label, but that only works for ext2/3 filesystems, and our systems run on reiserfs. Are there any solutions for this in a monolithic kernel, or is the only fix to compile the drivers as modules and load them at boot time?

I tried patching the current code a little bit, which was quite hard (since I've never done this before). I already learned a lot, but I would like to learn some more so that I could eventually even try to help out on these things.

I hope this is enough information. If not, please ask!

thanks a million,
--
harry
aka Rik Bobbaers

K.U.Leuven - LUDIT -=- Tel: +32 485 52 71 50
Rik.Bobbaers@xxxxxxxxxxxxxx -=- http://harry.ulyssis.org
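
PS: to make the renaming in question 2 concrete, here is a small Python sketch that hands out sdX names purely in probe order, which is the behaviour I described. It is only a toy model: the disk labels and the exact order among the Fibre Channel paths are made up for illustration. The point is just that every extra path probed before the local disks pushes them further down the alphabet.

```python
import string


def assign_names(probe_order):
    """Give each disk the next free sdX name, in the order it is probed."""
    return {
        disk: "sd" + string.ascii_lowercase[index]
        for index, disk in enumerate(probe_order)
    }


# Hypothetical labels: one ESS disk seen through both HBAs, then the local disks.
boot_one_ess_disk = ["ess1-via-scsi0", "ess1-via-scsi1", "local0", "local1"]

# After adding a second ESS disk and rebooting, all Fibre Channel paths are
# probed before the local disks (the order among the paths is an assumption).
boot_two_ess_disks = [
    "ess1-via-scsi0", "ess1-via-scsi1",
    "ess2-via-scsi0", "ess2-via-scsi1",
    "local0", "local1",
]

if __name__ == "__main__":
    before = assign_names(boot_one_ess_disk)
    after = assign_names(boot_two_ess_disks)
    for disk in ("local0", "local1"):
        print(disk, before[disk], "->", after[disk])
```

In this model the local disks move from sdc/sdd to sde/sdf across the reboot, which is exactly what breaks anything that refers to them by device name.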