On Tue, 8 Oct 2019 at 02:42, Adrian Reber <adrian@xxxxxxxx> wrote:
>
>
> Fedora's complete MirrorManager setup is still running on Python2. The
> code was ported to Python3 probably over two years ago but we have
> not switched yet. One of the reasons is that the backend is running on
> RHEL7 which means we are not in a hurry to deploy the Python3 version.
>
> The mirrorlist server which is answering the actual dnf/yum queries for
> a mirrorlist/metalink is, however, running in a Fedora 29 container.
> This container also still uses Python2 and it actually cannot use the
> Python3 version.
>
> One of MirrorManager's design points is that the mirrorlist servers,
> which are answering around 27 000 000 requests per day, are not directly
> accessing the database. The backend creates a snapshot of the relevant
> data (113MB) and the mirrorlist servers are using this snapshot to
> answer client requests.
>
> This data exchange is based on Python's pickle format and that does not
> seem to work with Python3 if it is generated using Python2.
>
> Having used protobuf before, I added code to also export the data for the
> mirrorlist servers based on protobuf.
>
> The good news with protobuf is that the resulting file is only 66MB
> instead of 113MB. The bad news is that loading it from Python requires
> 3.5 times the amount of memory during runtime (3.5GB instead of 1GB).
>
> In addition to the data exchange problems between backend and
> mirrorlist servers the architecture of the mirrorlist server does not
> really make sense today. 12 years ago it made a lot of sense as it could
> be easily integrated into httpd and it could be easily reloaded without
> stopping the service. Today the mirrorlist server and httpd are all part
> of a container which is then behind haproxy. So there is a lot of
> infrastructure in the container which is not really useful.
>
> To get rid of the pickle format and to have a simpler architecture I
> reimplemented the mirrorlist-server in Rust. This was brought up some
> time ago on a ticket and with the protobuf problems I was seeing in
> Python it made sense to try it out.
>
> My code currently can be found at https://github.com/adrianreber/mirrorlist-server
> and so far the results from the new mirrorlist server are the same as
> from the Python based mirrorlist server.
>
> It requires less than 700MB instead of the 1GB in Python with production
> based data and seems really fast.
>
> I have set up a test instance with the mirror data from Sunday at:
>
> https://lisas.de/metalink?repo=updates-testing-f31&arch=x86_64
> https://lisas.de/mirrorlist?repo=updates-testing-f31&arch=x86_64
>

Nice. Very very nice. Mirror management software is hard. Thank you for
doing all this work.

--
Stephen J Smoogen.
_______________________________________________
infrastructure mailing list -- infrastructure@xxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to infrastructure-leave@xxxxxxxxxxxxxxxxxxxxxxx
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/infrastructure@xxxxxxxxxxxxxxxxxxxxxxx
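
A note on the pickle problem described in the quoted mail. The mail does
not say what exactly fails, so the following is only a sketch of one common
cause, not a diagnosis of MirrorManager: Python 2 pickles 8-bit str objects
as raw bytes, and Python 3 decodes them as ASCII by default, which raises
an error as soon as a non-ASCII byte shows up. The byte string below is
hand-written to match what Python 2 would be expected to emit for a small
dict. The names PY2_PICKLE, load_py2_pickle and default_load_fails are
made up for this illustration.

```python
import pickle

# Hand-written protocol-2 stream, assumed equivalent to Python 2's
# pickle.dumps({'site': 'caf\xe9'}, 2). Both strings are 8-bit str objects
# (SHORT_BINSTRING opcode), which is what trips up Python 3.
PY2_PICKLE = b"\x80\x02}q\x00U\x04siteq\x01U\x04caf\xe9q\x02s."


def load_py2_pickle(data, encoding="latin1"):
    """Load a Python 2 pickle under Python 3 with an explicit str encoding."""
    return pickle.loads(data, encoding=encoding)


def default_load_fails(data):
    """Return True if a plain pickle.loads() raises UnicodeDecodeError."""
    try:
        pickle.loads(data)  # default encoding is ASCII
    except UnicodeDecodeError:
        return True
    return False


if __name__ == "__main__":
    print(default_load_fails(PY2_PICKLE))
    print(load_py2_pickle(PY2_PICKLE))                    # str keys and values
    print(load_py2_pickle(PY2_PICKLE, encoding="bytes"))  # bytes keys and values
```

Passing encoding="latin1" or encoding="bytes" gets the data loaded, but
every consumer then has to agree on which of the two it expects, which is
one argument for a language-neutral format such as protobuf.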