On Thu, 2010-04-08 at 15:44 +0200, Hannes Reinecke wrote:
> Nicholas A. Bellinger wrote:
> > On Thu, 2010-04-01 at 10:15 +0200, Hannes Reinecke wrote:
> >> Hi all,
> >>
> >
> > Greetings Hannes,
> >
> > Just a few comments on your proposal..
> >
> >> [Topic]
> >> Handling of invalid requests in virtual HBAs
> >>
> >> [Abstract]
> >> This discussion will focus on the problem of correct request handling with virtual HBAs.
> >> For KVM I have implemented a 'megasas' HBA emulation which serves as a backend for the
> >> megaraid_sas linux driver.
> >> It is now possible to connect several disks from different (physical) HBAs to that
> >> HBA emulation, each having different logical capabilities wrt transfersize,
> >> sgl size, sgl length etc.
> >>
> >> The goal of this discussion is how to determine the 'best' capability setting for the
> >> virtual HBA and how to handle hotplug scenarios, where a disk might be plugged in
> >> which has incompatible settings from the one the virtual HBA is using currently.
> >>

<SNIP>

> > What values should be enforced by TCM based on metadata presented by TCM
> > subsystem plugins (pSCSI, IBLOCK, FILEIO) for struct block_device, and
> > what is expected to be enforced by underlying Linux subsystems
> > presenting struct block_device..?
> >
> > For the virtual TCM subsystem plugin cases (IBLOCK, FILEIO, RAMDISK) the
> > can_queue is a completely arbitrary value and is enforced by the
> > underlying Linux subsystem. There are a couple of special cases:
> >
> > *) For TCM/pSCSI, can_queue is enforced from struct scsi_device->queue_depth
> > and max_sectors from the smaller of the two values from struct Scsi_Host->max_sectors
> > and struct scsi_device->request_queue->limits.max_sectors.
> >
> > *) For TCM/IBLOCK, max_sectors is enforced based on struct request_queue->limits.max_sectors.
> >
> > *) For TCM/FILEIO and TCM/RAMDISK, both can_queue and max_sectors are
> > set to arbitrarily high values.
> >
> > Also I should mention that TCM_Loop code currently uses a hardcoded
> > struct scsi_host_template->can_queue=1 and ->max_sectors=128, but will
> > work fine with larger values. Being able to change the Linux/SCSI
> > queue depth on the fly for TCM_Loop virtual SAS ports being used in KVM
> > guest could be quite useful for managing KVM Guest megasas emulation I/O
> > traffic on a larger scale..
> >
> And my question / topic here is how to handle a hotplug capability in these
> cases: What happens if a device / HBA is plugged in with different / lower
> capabilities than those announced?

I think this question depends a great deal upon the coupling of the
virtual HBA queue depth and the queue depth reported by each physical
Linux/SCSI device.

Using the TCM/pSCSI subsystem plugin as an example here to reference plain
/dev/sdX backstores, there are two possible modes of operation using the
referenced struct scsi_device's and their parent struct Scsi_Host's:

Virtual HBA Mode:

Present an arbitrarily high virtual HBA queue depth and allow individual
struct scsi_device's from different underlying struct Scsi_Host's to hang
from a single TCM HBA. TCM will enforce the per device queue depth
presented by struct scsi_device->queue_depth.

Physical HBA Mode:

Enforce a physical LLD queue_depth from each underlying struct Scsi_Host
and all struct scsi_device's attached to it. This is required for SCSI
LLDs that report a higher struct scsi_device->queue_depth than what the
underlying hardware for struct Scsi_Host is capable of. TCM will enforce
the per HBA and per device queue depths presented by the SCSI LLD.
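
To make the difference between the two modes concrete, here is a minimal
sketch in Python rather than kernel C. It is not the TCM implementation:
the names HbaMode, Host, Device, host_depth and may_dispatch are all
hypothetical, and host_depth simply stands in for whatever per-HBA limit
the SCSI LLD reports.

```python
from dataclasses import dataclass
from enum import Enum


class HbaMode(Enum):
    VIRTUAL = "virtual"    # only the per-device limit is enforced
    PHYSICAL = "physical"  # per-HBA and per-device limits are enforced


@dataclass
class Host:
    # Hypothetical stand-in for the per-HBA limit reported by the SCSI LLD.
    host_depth: int


@dataclass
class Device:
    host: Host
    # Stand-in for struct scsi_device->queue_depth.
    queue_depth: int


def may_dispatch(dev, dev_in_flight, host_in_flight, mode):
    """Return True if one more command may be queued to dev."""
    # Both modes honour the per-device queue depth.
    if dev_in_flight >= dev.queue_depth:
        return False
    # Only physical HBA mode also honours the parent host's limit.
    if mode is HbaMode.PHYSICAL and host_in_flight >= dev.host.host_depth:
        return False
    return True
```

With a device reporting queue_depth=8 on a host that can only handle 4
commands, virtual HBA mode would still admit a sixth command, while
physical HBA mode would hold it back.
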

The main requirement for SCSI LLDs with the first mode to function
properly is that the underlying Linux/SCSI LLD must present the proper
struct scsi_device->queue_depth, and the sum total of queue slots exposed
by struct scsi_device's cannot exceed what the parent struct Scsi_Host is
capable of (this can also change based on the number of LUNs presented by
the SCSI LLD).

I had run into some buggy SCSI LLDs in v2.4 kernel days that reported
their queue depths improperly, but I do not recall coming across this
issue personally on modern v2.6 drivers/scsi/ (not sure if they are
completely gone now).

So with this in mind, I added support for virtual HBA mode (called
PHV_VIRUTAL_HOST_ID, the default) while leaving the legacy physical HBA
mode available (called PHV_LLD_SCSI_HOST_NO) for broken SCSI LLDs. The
commit for doing this with TCM/pSCSI is here:

[Target_Core_Mod/pSCSI]: Decouple subsystem plugin from struct Scsi_Host
http://git.kernel.org/?p=linux/kernel/git/nab/lio-core-2.6.git;a=commitdiff;h=da5ed2625e7690c33f776dd1a907a2739fe7f779

> Can we change the announced settings for the HBA on the fly?

In existing TCM v3.x code, the HBA queue depth is not exposed as a
configfs attribute, so unfortunately this cannot be changed just yet..
However, the per TCM device virtual and physical queue_depth is available
at:

/sys/kernel/config/target/core/$HBA/$DEV/attrib/[hw_]queue_depth

The 'queue_depth' attribute here is what is being actively enforced by TCM
for the backstore device, and the 'hw_queue_depth' attribute is what had
been reported by TCM/pSCSI via struct scsi_device->queue_depth.

Changing 'queue_depth' for the backstore currently requires that no fabric
module port symlinks exist, but this is something that will be changing
for TCM 4.0. Also, changing 'hw_queue_depth' from the underlying struct
scsi_device for the plain /dev/sdX currently requires that the device be
re-registered from TCM.
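
As a rough illustration of inspecting the two attributes, a small Python
helper could look like the sketch below. The function name
read_queue_depths is hypothetical, and it assumes each attribute file
holds a single decimal integer, which I have not checked against every
TCM version.

```python
from pathlib import Path

# configfs location taken from the path quoted above.
CONFIGFS_CORE = Path("/sys/kernel/config/target/core")


def read_queue_depths(hba, dev, root=CONFIGFS_CORE):
    """Return the enforced and hardware-reported queue depths for a device.

    Assumption: each attribute file contains one decimal integer.
    """
    attrib = Path(root) / hba / dev / "attrib"
    return {
        name: int((attrib / name).read_text().strip())
        for name in ("queue_depth", "hw_queue_depth")
    }
```

Calling read_queue_depths("pscsi_0", "sdb") would return a dict with both
values; the HBA and device names in that call are placeholders only.
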
However, it would be easy enough to update 'hw_queue_depth' on the fly if
there was a target mode callback present in
drivers/scsi/scsi.c:scsi_adjust_queue_depth() to tell me when the change
is happening within the LLD. :-)

> >
> > The other big advantage of using TCM_Loop with your megasas guest
> > emulation is that existing TCM logic for >= SPC-3 T10 NAA naming, PR,
> > and ALUA emulation is immediately available to KVM guest, and does not
> > have to be reproduced in QEMU code.
> >
> I'm not doubting that using TCM_loop here would be advantageous.
> But I have to find a solution for folks just wanting to run on plain /dev/sdX.
>

Well, I think that using a scsi-debug-esque model like TCM_Loop + SG_IO on
top of a target infrastructure enforcing underlying HBA and device
requirements would give KVM Guests a lot of flexibility with existing
code, even for the plain /dev/sdX case.

> And I need to find a common ground here to argue with the KVM folks,
> whose main objection against the megasas emulation is this issue.
>
> Either way would be fine by me, I just think we should come to a common
> understanding.
>

Completely understood. I will give SG_IO + TCM_Loop a shot with megasas
emulation in a KVM Guest and see how things look when using backstores
configured with the two HBA Modes for TCM/pSCSI (plain /dev/sdX) discussed
above.

Best,

--nab