Re: F21 downloads repository metadata in 3 places!

Hedayat Vatankhah <hedayat.fwd@xxxxxxxxx> · Mon, 15 Dec 2014 16:39:03 +0330

Richard Hughes <hughsient@xxxxxxxxx> wrote on Mon, 15 Dec 2014 09:37:27 +0000:
On 13 December 2014 at 21:10, Hedayat Vatankhah <hedayat.fwd@xxxxxxxxx> wrote:
Surprisingly, PackageKit uses its own separate cache.
Not surprising at all, when you're familiar with how PackageKit works.
PackageKit has to accept transactions from clients and return results
very quickly. Just something as simple as SHA'ing a metadata file
destroys our latency, which is one of the biggest reasons nobody liked
the command-not-found functionality when it was introduced: it was
SLOW. This interactive command had to return results in ~100ms, not
tens of seconds.

By having 100% complete control of a copy of the cache we can keep
certain files locked in memory, and we can be aggressive about caching
pools of packages. This allows us to achieve the low-latency design
required by gnome-software, which is firing off tons of transactions
in parallel at startup with expected latency guarantees. Another thing
it allows us to do is atomically update the cache, so if we're
updating the cache in the background and we get interrupted or the
transaction is cancelled to make room for a user-requested
"interactive" transaction, we can just continue to use the old cache,
and then atomically rename the new location to the proper location and
update pools when done. You just can't do this when there are three
things fiddling with files behind your back without any co-ordination.
<...>
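
A rough sketch of the atomic update described above, for illustration only; this is not PackageKit's actual code. It assumes a layout in which a symlink named "current" points at the live cache generation, and the names refresh_cache, download, "current" and the "gen-" prefix are all hypothetical. New metadata is built in a fresh directory; if the download is interrupted or cancelled, the partial data is thrown away and the old cache stays in use.

```python
import os
import shutil
import tempfile


def refresh_cache(cache_root, download):
    """Build a new cache generation, then switch to it atomically.

    `download(target_dir)` is a hypothetical callback that fills target_dir
    with fresh metadata and raises if it is interrupted or cancelled.
    """
    current = os.path.join(cache_root, "current")  # symlink to the live generation
    new_dir = tempfile.mkdtemp(prefix="gen-", dir=cache_root)
    try:
        download(new_dir)
    except BaseException:
        # Interrupted or cancelled: drop the partial data, keep the old cache.
        shutil.rmtree(new_dir, ignore_errors=True)
        raise

    old_dir = os.path.realpath(current) if os.path.islink(current) else None

    # Create the new symlink under a temporary name, then rename it over
    # "current". rename(2) is atomic on POSIX, so readers resolve either the
    # old generation or the new one, never a half-written mix.
    tmp_link = os.path.join(cache_root, "current.tmp")
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(os.path.basename(new_dir), tmp_link)
    os.replace(tmp_link, current)

    # The previous generation is no longer referenced and can be removed.
    if old_dir and old_dir != os.path.realpath(new_dir):
        shutil.rmtree(old_dir, ignore_errors=True)
    return new_dir
```

The switch can only be atomic if a single writer owns cache_root, which is the point made above about other tools touching the same files.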

Note, if yum or DNF wanted to use the PK cache, it's guaranteed to be
valid, complete and up to date, although I'm not sure a dependency
from the package manager CLI to PK would be acceptable for their
maintainers.

Richard.

What I think about this (I'm looking at the distribution level rather
than at specific packages):
1. If PK really needs its own *copy* of the cache, that's OK (well, not 
OK but acceptable), but IMHO it should not also download it independently. 
I think it should just copy the DNF (librepo) cache if that is 
considered valid and up-to-date, or ask DNF to bring its cache up-to-date 
and then copy it atomically into its own cache (preferably using 
hardlinks if possible).

2. I believe that the user should know, and more importantly be able to 
control, WHEN the repo data is being updated. At the very least, he 
should be able to specify whether the updates are automatic or not, using a 
very user-friendly method (probably during/after the installation, or 
per network connection).

3. I think the repository data management backend should be separate 
from the frontends (including PK, and dnf cli). Also, I like the idea of 
having a working cache even when new repodata is being downloaded, and I 
think it is something that DNF/Yum/... should also do. There were many 
times that I ended up with a half-updated repo cache which prevented me 
from using Yum, as I didn't want to (or couldn't) let it download the 
whole repodata. Probably this should be filed as a feature request 
against DNF.
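
To make point 1 more concrete, here is a minimal sketch of copying an already downloaded cache with hardlinks instead of fetching it again. It is only an illustration: the function names and the ".staging" suffix are made up, it assumes private_dir does not exist yet, and it does not reflect how DNF, librepo or PK actually lay out their caches.

```python
import os
import shutil


def link_or_copy(src, dst):
    """Hardlink src to dst; fall back to a real copy (e.g. across filesystems)."""
    try:
        os.link(src, dst)
    except OSError:
        shutil.copy2(src, dst)
    return dst


def snapshot_cache(shared_dir, private_dir):
    """Populate private_dir from shared_dir without downloading anything.

    Assumes private_dir does not exist yet (hypothetical layout).
    """
    staging = private_dir + ".staging"
    shutil.rmtree(staging, ignore_errors=True)
    # Recreate the directory tree, hardlinking each file where possible.
    shutil.copytree(shared_dir, staging, copy_function=link_or_copy)
    # Publish the finished tree in one step, so a partially filled
    # directory is never visible under the final name.
    os.rename(staging, private_dir)
```

One caveat: hardlinked files share their data with the originals, so this is only safe if the owner of shared_dir replaces files by renaming new ones into place rather than rewriting them in place.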

Regards,
Hedayat
--
devel mailing list
devel@xxxxxxxxxxxxxxxxxxxxxxx
https://admin.fedoraproject.org/mailman/listinfo/devel
Fedora Code of Conduct: http://fedoraproject.org/code-of-conduct