Just as a note for now, I won't be saying much about naming stuff. 1) it could lead us into a rabbit hole that has no way out, 2) it is not needed for the first RFCs, 3) you'll see how other things will be named around, so I guess you will feel yourself what the right name will be, and finally 4) it is *very* likely that it will be done by others and discussed more than the rest of the code ;)
Thank you :) Naming will be dealt with later. We will have the possibility of adding an address pool to the allocator. For example, when we create a root PCI bus, we tell the allocator that the following PCI addresses are possible: {0..0}:{0..31}.{0..7}, where {a..b} means a range from a to b, inclusive.
This function call could look like this: allocatorAddPool( allocator[PCI_ADDRESS_TYPE], &outputAddress, AllocatorRange(0, 0), AllocatorRange(0, 31), AllocatorRange(0, 7));
The outputAddress would be an array owned by the caller (only filled in by the allocator).
If we were to allocate IP addresses for computers from the range 192.168.1.2 to 192.168.1.240, where the router has to have the address 192.168.1.254:
Even though we won't be assigning IP addresses, it is really niceanalogy to use and explain stuff on since hopefully everyone on the listknows more or less about the limits and properties of them.First, we reserve the address for the router to let the Allocator know that it's in use, and that we can track collisions in manually assigned addresses:
allocatorReserveAddress( allocator[IP_ADDRESS_TYPE], &outputAddress, AllocatorValue(192), AllocatorValue(168), AllocatorValue(1), AllocatorValue(254));
Then, we assign the addresses for the computers:
allocatorReserveAddress( allocator[IP_ADDRESS_TYPE], &outputAddress, AllocatorValue(192), AllocatorValue(168), AllocatorValue(1), AllocatorRange(1, 254)); Please note that AllocatorValue() is now in use.
Since libvirt is pure C, no C++, you need to make sure errors arecaught, esp. allocation errors. If AllocatorValue() is a function andit can fail, then this won't work. But you probably know that, right?
These functions/macros are just for the convenience of using the allocator.
An idea on how AllocatorRange/Value might work as a macro: Every field (IP has 4 of them) of an address would be provided as a range. For example, if we wanted to allocate an address from the range 192.168.1.1 - 192.168.1.254 without using any macros, it would be: AllocatorReserveAddr(…, 192, 192, 168, 168, 1, 1, 1, 254); but we can use AllocatorValue(X), which would expand to “X, X”, and AllocatorRange(X, Y), which expands to “X, Y”, so the function call looks nicer:
allocatorReserveAddress(…, AllocatorValue(192), allocatorValue(168), allocatorValue(1), allocatorRange(1, 254));
Another idea is to use a helper struct with two integers inside. This struct would represent a range. Then a call to AllocatorValue(192) would return a struct with fields {192, 192}, and AllocatorRange(1, 254) would return a struct with fields {1, 254}. AllocatorValue() could be a macro or a function in this case.
These functions/macros are just for the convenience of using the allocator. I am open to suggestions though. There could be a different function call to simply assign any address: allocatorReserveAny(Allocator* allocator, &outputAddress);
Let's say that we want an "sda" disk. We could create a wrapper: allocatorReserveSCSIAddress(allocator, "sda"); All this wrapper does is representing the string "sda" as an integer mapping to the letter (eg. 'a' = 0, 'z' = 25): allocatorReserveAddress(allocator[SCSI_ADDRESS_TYPE], &outputAddress, 0);
If an address is already determined, because it was specified in the XML or it's some specific device that has to be at some specific address, we still reserve it to let the Allocator know that it's in use.
How would this work internally? One of the possible solutions is keeping a set of ranges of addresses. For example, suppose that we have two PCI busses to use. Our address pool is stored as one range: {0..1} {0..31} {0..7}
Now someone reserves address 0:13.2
Our new free addresses pool is now stored as this set of ranges: {0..0} {0..12} {0..7}, {0..0} {12..12} {0..1}, {0..0} {12..12} {3..7}, {0..0} {13..31} {0..7}, {1..1} {0..31} {0..7}
If we kept every address separately, it would require 2*32*8=512 addresses.
The set data structure from gnulib's gl_oset.h is a candidate for keeping the ranges in a sorted fashion. Another candidate is simply a sorted list from gl_list.h.
We cannot use those since they are GPL and the library part of libvirtis LGPL. We can either create our own or just use whatever there'sprovided already.We have macros and functions for handling arrays; if you want it sorted,then VIR_INSERT_ELEMENT and others are your friends. With arrays wewould just keep pointers there and move those which is not thatexpensive. Sparse arrays don't make sense to me in this situation.You could also convert it to bunch of bytes (either specifying a newformat or just converting them to a string) and use our hash table (seevirHashCreate()).But I don't think we'll need to handle these, let me explain why a bitbelow.This structure would be able to handle all types of addresses that are convertible to a fixed-length list of integers. We don't mind how many of these integers there are, because we can use variadic arguments. It won't allow duplicates if we stick to using it in every place where we have some addresses. It will also be very generic, with the possibility of writing wrappers on top of it that might be more convenient for some particular address type.
This way we would keep qemu_* files as they are for now, and just replace existing allocators (and manual assignment of addresses) to calls to our new allocator.
Actually, in order for us to be able to replace the calls we have there,we would need to go a bit different route. The one I suggested wouldactually look a bit different:Let's say we would start by moving qemuDomainAssignSpaprVIOAddress() to,for example, src/conf/domain_addr.c and renaming it to, for examplevirDomainDeviceAddressAssignSpaprVIO(). Then we would replace all itscallers and so on. Every time the function needs to access something inthe private data of the domain (qemuDomainObjPrivate), such data wouldbe instead moved into the virDomainObj (or virDomainDef. Whenever qemu-specificinformation needs to be passed into that function, it could be convertedto a parameter of that function. Either some value, flag or some morecomplex structure, e.g. allocopt (similarly to xmlopt which we have forparsing and formating XML). Slowly we might get to the point where lotof the code is changed to data that driver has stored either per-domainor globally.
I started doing it. I moved almost every function that assigns addresses (apart from PCI addresses, they are the most complex ones) from qemu_domain_address.c to domain_addr.c. For an overview of what happened, look at the function qemuDomainAssignAddresses from qemu_domain_address.c.
What happened: I changed a lot of function names from qemu* to vir* and moved them to another files. I also moved the structures virDomainCCWAddressSetPtr and virDomainVirtioSerialAddrSetPtr from qemu’s private data to virDomainObj.
I had to move some definitions from domain_addr.h to domain_conf.h, because the address sets are now in domainObj, so virDomainObjDispose() has to dispose them. I am not totally sure that the resources are freed properly, because their owner changed, but I will deal with it later, especially that we haven’t even decided whether they should belong to domainDef or domainObj.
My next step is to make qemuDomainReleaseDeviceAddress generic and then start doing the same thing with PCI address assignment functions.
Is that what you meant? Does it look good at first glance?
Thank you very much for the email!
Tomasz
My ideas: - A possibility of reserving a range of addresses might be used to reserve a whole PCI slot at once.
I don't see a use case for that.- Flags could be added, changing the behavior of particular addresses or the whole allocator.
As above, we'll most probably need that.My questions: - Is this approach good long-term? - Is address releasing expected to be required at any point? (think device hotplug/unplug, etc.)
Every time device is unplugged. At first I thought it will be only onhot-unplug, but it needs to be freed whenever the device is removed andthe data will not be recalculated. And that can happen even oncold-unplug (virsh detach-device --config).- What other features are needed?
We'll see :) Let's start with either of the approaches (I wouldpersonally suggest the one I had in mind as I think it goes nicely withcurrent approaches in libvirt and will have less hurdles along the way.Think it through and let us know about the next design steps.Please speak up your mind if you have remarks or thoughts about the idea. I'd really appreciate it.
Tomasz
Have a nice day,Martin
|