Hi all, It turns out that anaconda uses Size() objects for a number of things that are only tangentially related to storage. There's a bunch of stuff in packaging, for instance, which does not seem that nearly related to storage but which does use Size(). It seems cruel to make anaconda restrict its usage of this handy Size class to only storage related stuff, but inconsistent, if it is a storage package, not to. Clearly, a separate package is the solution and might prove to be generally useful. Other changes that should be made at the same time are: 1) Disentangle parsing user input from Size() construction. This would change the interface, so that: * Size("10 MiB") would have to be changed to Size(10, size.MiB) * Size("%d KiB" % int(val[i])) would have to be changed to Size(int(val[i]), size.KiB) Note: All other things being equal, the above changes would result in a speed up of about 4x, when the argument is changed from a string to a tuple. * Size(user_input) would have to be changed to Size(parseSpec(user_input)) 2) Use int instead of Decimal for underlying representation and use delegation, not extension. int is unbounded, and appropriate as Size() stores exact number of bytes. The implementation will still use Decimal within humanReadable() and convertTo(), where Decimal proves its worth. This would make (1) a bit easier, for it would be possible to override object.__init__() instead of overriding Decimal.__new__() which is awkward because of the context parameter which Decimal.__new__ requires. Benefits to the change would be a 5 to 10x reduction in size of Size() objects and a 10 to 15x speedup in constructing Size() objects and in arithmetic operations. The only arithmetic operation I profiled was addition; I'm generalizing here. There would be no change to the interface, and the changes in the class are actually very simple. 3) Make arithmetic operations more type-sensible. Currently, our code does a bunch of not so good things, like: >>> divmod(Size("3 MiB"), Size("2 MiB")) (Decimal('1'), Decimal('1048576')) The semantics of divmod should agree with the, in this situation, correct semantics of mod, which gives the type of the remainder as a Size. >>> Size("3 MiB") % Size("2 MiB") Size('1 MiB') 2 multiplied by itself 3 bytes times should raise an exception or something, but it doesn't: >>> 2 ** Size("3 B") Decimal('8') The number of times 2 MiB can be subtracted from 3 MiB is not 1 bytes times, it's 1 times, but: >>> Size("3 MiB") / Size("2 MiB") Size('1 B') and so forth. In theory, our code should not be asking for really nonsensical operations like 2 ** Size("1 MiB"), and I think it is not. It can be made configurable how extreme the reaction is to a requested operation that makes no sense. There are certain operations that are mathematically sensible, but which the implementation would not, because it could not, support. One such operation is 2 MiB * 2 KiB which yields 4096 KiB^^2. We have no ability to represent that, so it would be better to raise a TypeError when called upon to calculate it rather than return, as now, Size('4 GiB'). I believe that computations in this category are not ones that we actually do in Blivet or Anaconda or should expect to. The context in which they could arise is in, e.g., statistics, where in order to calculate the standard deviation, one has to square the values. But if we start doing statistics on some MiB's we should not be using Size class at all, we can use pint, which is good for that, but bad for our regular purposes. NOTE: Most of the _implementation_ is already done, leaving just packaging and a few places in anaconda where it will require a little thought to decide if parseSpec is called for. Please give me your opinions. Maybe we can discuss this in installer meeting. - mulhern _______________________________________________ Anaconda-devel-list mailing list Anaconda-devel-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/anaconda-devel-list