We've had many email threads on the subject of storage, but none have resulted in a satisfactory way forward to implementing any storage management APIs. Part of the problem, I think, is that we've not tried to understand all the various concepts / technologies which are available & how they relate to each other. This mail attempts to outline all the different technologies. There's a short list of API operations, but I don't particularly want to get into API details until we have a good understanding of the concepts.

First and foremost, I don't believe it is acceptable to say we're only going to allow one kind of storage. Storage is the key piece of infrastructure for any serious network, and we have to be able to adapt to the deployment scenarios that present themselves.

Second, there is clearly a huge number of storage technologies here and there's no way we'll implement support for all of them in one go. So we need to prioritize getting the conceptual model correct, to allow us to incrementally support new types of storage backend.

Taxonomy of storage types
=========================

 |
 +- Block
 |   |
 |   +- Disk
 |   |   |
 |   |   +- Direct attached
 |   |   |   |
 |   |   |   +- IDE/ATA disk
 |   |   |   +- SCSI disk
 |   |   |   +- FibreChannel disk
 |   |   |   +- USB disk/flash
 |   |   |   +- FireWire disk/flash
 |   |   |
 |   |   +- Remote attached
 |   |       |
 |   |       +- iSCSI disk
 |   |       +- GNBD disk
 |   |
 |   +- Partition
 |   |
 |   +- Virtual
 |       |
 |       +- Direct attached
 |       |   |
 |       |   +- LVM
 |       |   +- ZFS
 |       |
 |       +- Remote attached
 |           |
 |           +- Cluster LVM
 |
 +- FileSystem
 |   |
 |   +- Direct attached
 |   |   |
 |   |   +- ext2/3/4
 |   |   +- xfs
 |   |   +- ZFS
 |   |
 |   +- Remote attached
 |       |
 |       +- NFS
 |       +- GFS
 |       +- OCFS2
 |
 +- Directory
 |
 +- File
     |
     +- Raw allocated
     +- Raw sparse
     +- QCow2
     +- VMDK

Storage attributes
==================

 - Local vs network (ext3 vs NFS, SCSI vs iSCSI)
 - Private vs shared (IDE vs FibreChannel)
 - Pool vs volume (LVM VG vs LV, Directory vs File, Disk vs Partition)
 - Container vs guest (OpenVZ vs Xen)
 - Attributes
    - Compressed
    - Encrypted
    - Auto-extend
    - Snapshots
    - RO
    - RW
 - Partition table
    - MBR
    - GPT
 - UUID
    - 16 hex digits
    - Unique string
    - SCSI WWID (world wide ID)
 - Local Path(s) (/dev/sda, /var/lib/xen/images/foo.img)
 - Server Hostname
 - Server Identifier (export path/target)
 - MAC security label (SELinux)
 - Redundancy
    - Mirrored
    - Striped
    - Multipath
 - Pool operation
    - RO
    - RW

Nesting hierarchy
=================

Many possibilities...

 - 1 x Host -> N x iSCSI target -> N x LUN -> N x Partition
 - N x Disk/Partition -> 1 x LVM VG -> N x LVM LV
 - 1 x Filesystem -> N x directory -> N x file
 - 1 x File -> 1 x Block (loopback)

Application users
=================

 - virt-manager / virt-install
    - Enumerate available pools
    - Allocate volume from pool
    - Create guest with volume
 - virt-clone
    - Copy disks
    - Snapshot disks
 - virt-df
    - Filesystem usage
 - pygrub
    - Extract kernel/initrd from filesystem
 - virt-factory
    - Manage storage pools
    - Pre-migration sanity checks
 - virt-backup
    - Snapshot disks
 - virt-p2v
    - Snapshot disks

Storage representation
======================

Two core concepts

 - Volume
    - a chunk of storage
    - assignable to a guest
    - assignable to a pool
    - optionally part of a pool

 - Pool
    - a chunk of storage
    - contains free space
    - allocate to provide volumes
    - comprised of volumes

Recursive!

    n x Volume -> Pool -> n x Volume

Nesting to many levels...

Do we need an explicit Filesystem concept?

Operations
==========

Limited set of operations to perform

 - List host volumes (physical attached devices)
 - List pools (logical volume groups, partitioned devs, filesystems)
 - List pool volumes (dev partitions, LVM logical volumes, files)

 - Define pool (eg create directory, or define iSCSI target)
 - Undefine pool (delete directory, undefine iSCSI config)
 - Activate pool (mount NFS volume, login to iSCSI target)
 - Deactivate pool (unmount volume, logout of iSCSI)

 - Dump pool XML (get all the metadata)
 - Lookup by path
 - Lookup by UUID
 - Lookup by name

 - Create volume (create a file, allocate a LVM LV, etc)
 - Destroy volume (delete a file, deallocate a LVM LV)
 - Resize volume (grow or shrink volume)
 - Copy volume (copy data between volumes)
 - Snapshot volume (snapshot a volume)

 - Dump volume XML (get all the metadata)
 - Lookup by path
 - Lookup by UUID
 - Lookup by name

 http://www.redhat.com/archives/libvir-list/2007-February/msg00010.html
 http://www.redhat.com/archives/libvir-list/2007-September/msg00119.html

Do
we also need some explicit Filesystem APIs?

XML description
===============

The horrible recursiveness & specific attributes are all in the XML description for different storage pool / volume types. This is where we define things like which physical volumes are in a volume group, iSCSI server / target names, login details, etc.

XXX fill in the hard stuff for metadata description here

Implementation backends
=======================

 - FileSystem/Directory/File - POSIX APIs
 - LVM - LVM tools, or libLVM
 - Disk/partitions - sysfs / parted
 - iSCSI - sysfs / iscsi utils
 - ZFS - ZFS tools

Implementation strategy
=======================

We should prioritize implementation according to immediate application needs.

Initial goal is to support remote guest creation on par with current capabilities:

 - Directory + allocating raw sparse files
 - Enumerate existing disks, partitions & LVM volumes

Further work:

 - Allocating LVM volumes
 - Defining LVM volume groups
 - Partitioning disks
 - Mounting networked filesystems
 - Accessing iSCSI volumes
 - Copying existing volumes
 - Snapshotting volumes
 - Cluster aware filesystems (GFS)
 - Various file formats (QCow, VMDK, etc)

Dan.
-- 
|=- Red Hat, Engineering, Emerging Technologies, Boston.  +1 978 392 2496 -=|
|=-           Perl modules: http://search.cpan.org/~danberr/              -=|
|=-               Projects: http://freshmeat.net/~danielpb/               -=|
|=-  GnuPG: 7D3B9505   F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505  -=|
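
PS: To make the recursive "n x Volume -> Pool -> n x Volume" idea from the Storage representation section a bit more concrete, here is a rough sketch in Python, purely for illustration. Every class and method name below (Volume, Pool, create_volume, nesting_depth, ...) is hypothetical and not a proposed libvirt API. It also assumes a pool's capacity is simply the sum of its source volumes, ignoring metadata overhead and sparse allocation.

```python
from typing import Dict, List, Optional
from uuid import uuid4

GiB = 1024 ** 3


class Volume:
    """A chunk of storage: assignable to a guest, or usable as pool backing."""

    def __init__(self, name: str, capacity: int, pool: Optional["Pool"] = None):
        self.name = name
        self.capacity = capacity   # size in bytes
        self.pool = pool           # pool it was allocated from (None = host volume)
        self.uuid = str(uuid4())


class Pool:
    """Free space built from source volumes and carved up into new volumes."""

    def __init__(self, name: str, sources: List[Volume]):
        self.name = name
        self.sources = sources                 # n x Volume -> Pool
        self.volumes: Dict[str, Volume] = {}   # Pool -> n x Volume

    @property
    def capacity(self) -> int:
        # Assumption: no metadata overhead, capacity is the sum of the sources.
        return sum(v.capacity for v in self.sources)

    @property
    def available(self) -> int:
        return self.capacity - sum(v.capacity for v in self.volumes.values())

    def create_volume(self, name: str, capacity: int) -> Volume:
        if name in self.volumes:
            raise ValueError(f"volume {name!r} already exists in pool {self.name!r}")
        if capacity > self.available:
            raise ValueError(f"pool {self.name!r} has only {self.available} bytes free")
        vol = Volume(name, capacity, pool=self)
        self.volumes[name] = vol
        return vol

    def destroy_volume(self, name: str) -> None:
        del self.volumes[name]

    def resize_volume(self, name: str, capacity: int) -> None:
        vol = self.volumes[name]
        if capacity - vol.capacity > self.available:
            raise ValueError(f"pool {self.name!r} cannot grow {name!r} to {capacity} bytes")
        vol.capacity = capacity

    def lookup_by_name(self, name: str) -> Optional[Volume]:
        return self.volumes.get(name)

    def lookup_by_uuid(self, uuid: str) -> Optional[Volume]:
        return next((v for v in self.volumes.values() if v.uuid == uuid), None)


def nesting_depth(vol: Volume) -> int:
    """Number of pool layers between a volume and the host's physical devices."""
    if vol.pool is None:
        return 0
    return 1 + max((nesting_depth(s) for s in vol.pool.sources), default=0)


# N x Disk -> 1 x LVM VG -> N x LVM LV, then an LV partitioned in turn.
sda = Volume("sda", 100 * GiB)
sdb = Volume("sdb", 50 * GiB)
vg0 = Pool("vg0", [sda, sdb])
lv = vg0.create_volume("guest1", 20 * GiB)
ptable = Pool("guest1-ptable", [lv])
part = ptable.create_volume("guest1p1", 10 * GiB)
print(vg0.available // GiB, nesting_depth(part))
```

The only point of the sketch is that one Volume type can sit on either side of a Pool, so the nesting cases listed earlier (disk -> VG -> LV, filesystem -> directory -> file) would not each need their own object model. Whether a Filesystem deserves a third concept is left open, as in the mail.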