On Fri, 2005-09-09 at 14:41, Bryan J. Smith wrote:
> Les Mikesell <lesmikesell@xxxxxxxxx> wrote:
> > it would be trivial for a client to decide whether it is
> > more efficient to apply a delta to an existing cached or
> > locally available version or pull the latest.
>
> But how does the "appropriate delta" get built?
> At the server! That's more server overhead, let alone the
> "service" that the client uses to query.

It would be built by the person who maintains the master
repository - only once per new RPM rev.

> Now let's say you just have the client download the _entire_
> delta for the RPM to avoid all that extra server overhead.
> Now you're actually _increasing_ the amount you download.

Huh? You don't build the delta on demand, you offer a choice
of full revs and deltas and their sizes. The client can easily
compute which to download based on what it already has.

> > Likewise, how does the style used in rdiff-backup compare?
>
> rdiff-backup does 1 delta, against 2 files.

Yes, that would be the way to do it. As a final step of
creating an RPM update, build that delta. Offer it to the
millions of clients who already have the original file. It
would be a nice proof-of-concept to feed the updates at the
end of a Fedora cycle to it to see what the savings would be
from using deltas vs. full downloads.

> rsync does *1* delta.
> Furthermore, it's not as efficient as a straight HTTP stream
> when it comes to the server.
>
> > With rdiff the server side work only has to be done once.
>
> Not for rippling through multiple deltas, which is what
> versioned files are. Several of you seem to forget that
> aspect.

The delta between any version and its next update will never
change. Make it once, store it, mirror it, whatever. Do it
again separately for the next version (delta from its
immediately prior version only).
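
To make the client-side choice concrete, here is a rough sketch in
Python. It is only an illustration, not part of any existing tool:
the `Release` record, its size fields, and `plan_download` are
hypothetical names, and it assumes the repository publishes, for each
version, the full RPM size and the size of the delta from the
immediately prior version.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Release:
    """One published version of a package (hypothetical metadata record)."""
    version: str
    full_size: int             # bytes to download the full RPM
    delta_size: Optional[int]  # bytes for the delta from the prior version, if any


def plan_download(releases: List[Release],
                  cached_version: Optional[str],
                  target_version: str) -> Tuple[str, List[str], int]:
    """Pick the cheaper of a delta chain or a full download.

    `releases` is assumed to be ordered oldest to newest.
    Returns (method, versions_to_fetch, total_bytes).
    """
    versions = [r.version for r in releases]
    target_idx = versions.index(target_version)
    full_cost = releases[target_idx].full_size

    if cached_version in versions:
        cached_idx = versions.index(cached_version)
        if cached_idx == target_idx:
            return ("none", [], 0)  # already up to date
        if cached_idx < target_idx:
            # Every delta between the cached version and the target.
            chain = releases[cached_idx + 1:target_idx + 1]
            if all(r.delta_size is not None for r in chain):
                delta_cost = sum(r.delta_size for r in chain)
                if delta_cost < full_cost:
                    return ("deltas", [r.version for r in chain], delta_cost)

    # No usable cached copy, a missing delta, or the chain is not cheaper.
    return ("full", [target_version], full_cost)


if __name__ == "__main__":
    repo = [
        Release("1.0", 1000, None),
        Release("1.1", 1020, 100),
        Release("1.2", 1050, 150),
        Release("1.3", 1100, 900),
    ]
    print(plan_download(repo, "1.0", "1.2"))
    print(plan_download(repo, "1.0", "1.3"))
```

With the made-up sizes above, going from 1.0 to 1.2 picks the two
deltas, while going from 1.0 to 1.3 falls back to the full download
because the chain adds up to more than the full RPM. The server does
nothing per request beyond serving static files and their sizes.
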
Leave it up to the client to figure out what it has to start
with and whether it is cheaper to apply a series of deltas or
pull the full copy of the version it wants.

> You're talking a lot more overhead than just a HTTP access.

No, just one extra, probably automated step in adding a new
update version, and it only gets done once.

> But what do clients need? Re-assembly in the check-out! So
> if there have been 5 version updates since, then _all_5_
> deltas will need to be "rippled through."

Done on the client side - and only after computing that the
operation is cheaper than jumping directly to the desired
version.

> So why not just store the _whole_ versions?
> That's my point.

You should. Just give the client the choice.

> > You would trade that off against the
> > network traffic saved when the client chooses the smaller
> > delta. But, for this to work you need an on-line local
> > cache of the base rpms.
>
> Hence why you should have a local repository!
> Isn't that what you were arguing against?!?!?!

I'm not against having local repositories. I'm against
anything that takes human time/intervention. Put a price tag
on it and you'll see why. I have no problem tossing a few
hundred gigs of disk space out if no one has to manually edit
anything anywhere to use it. If my install procedure starts by
copying the base RPMs somewhere, or saves the NFS-shared
directory that was specified during the install as the location
of the downloaded ISOs, that would be fine with me.

-- 
  Les Mikesell
    lesmikesell@xxxxxxxxx