New mirrorlist server implementation

Fedora's complete MirrorManager setup is still running on Python2. The
code was ported to Python3 probably more than two years ago, but we have
not switched yet. One of the reasons is that the backend is running on
RHEL7, which means we are not in a hurry to deploy the Python3 version.

The mirrorlist server which is answering the actual dnf/yum queries for
a mirrorlist/metalink is, however, running in a Fedora 29 container.
This container also still uses Python2 and it actually cannot use the
Python3 version.

One of MirrorManager's design points is that the mirrorlist servers,
which are answering around 27 000 000 requests per day, are not directly
accessing the database. The backend creates a snapshot of the relevant
data (113MB) and the mirrorlist servers are using this snapshot to
answer client requests.
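
As an illustration only (this is not MirrorManager's code), the snapshot
design can be sketched as follows. The JSON file format, the function and
class names and the "repo/arch" key layout are assumptions made for this
example; the real snapshot uses pickle or protobuf.

```python
import json
import os
import tempfile


def write_snapshot(path, data):
    """Backend side: dump the relevant data and replace the file atomically."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        json.dump(data, handle)
    # os.replace is atomic on the same filesystem, so a reader never sees
    # a half-written snapshot.
    os.replace(tmp, path)


class MirrorlistServer:
    """Server side: answers from an in-memory copy, never from the database."""

    def __init__(self, path):
        self.path = path
        self.reload()

    def reload(self):
        # Called whenever the backend has published a new snapshot.
        with open(self.path) as handle:
            self.snapshot = json.load(handle)

    def mirrorlist(self, repo, arch):
        # Hypothetical key layout: "repo/arch" -> list of mirror URLs.
        return self.snapshot.get(f"{repo}/{arch}", [])
```

The point of the design is that a request only touches data already in
memory, so the database load does not depend on the request rate.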

This data exchange is based on Python's pickle format and that does not
seem to work with Python3 if it is generated using Python2.
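
One common reason for this kind of failure, though not necessarily the
one hit here, is that Python2 byte strings are decoded as ASCII when
Python3 loads them. A minimal sketch, assuming a hand-written pickle
shaped like what Python2 emits for a non-ASCII str (the payload is
invented for this example):

```python
import pickle

# Protocol-0 pickle shaped like Python2's output for the byte string
# 'caf\xc3\xa9' (the UTF-8 encoding of "café").
PY2_PICKLE = b"S'caf\\xc3\\xa9'\np0\n."


def load_default(data):
    """Load with Python3 defaults; return the exception instead of raising."""
    try:
        return pickle.loads(data)
    except UnicodeDecodeError as exc:
        return exc


def load_as_bytes(data):
    """Keep Python2 str objects as bytes instead of decoding them."""
    return pickle.loads(data, encoding="bytes")
```

With the default settings the load fails with a UnicodeDecodeError, while
encoding="bytes" (or the correct text encoding) lets the same data load.
For a large nested structure this means every string comes back as bytes
and has to be converted by hand.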

Having used protobuf before, I added code to also export the data for the
mirrorlist servers based on protobuf.

The good news with protobuf is that the resulting file is only 66MB
instead of 113MB. The bad news is that loading it from Python requires
3.5 times as much memory at runtime (3.5GB instead of 1GB).

In addition to the data exchange problems between the backend and the
mirrorlist servers, the architecture of the mirrorlist server does not
really make sense today. 12 years ago it made a lot of sense, as it could
be easily integrated into httpd and it could be easily reloaded without
stopping the service. Today the mirrorlist server and httpd are both part
of a container which is then behind haproxy. So there is a lot of
infrastructure in the container which is not really useful.

To get rid of the pickle format and to have a simpler architecture I
reimplemented the mirrorlist-server in Rust. This was brought up some
time ago on a ticket and with the protobuf problems I was seeing in
Python it made sense to try it out.

My code currently can be found at https://github.com/adrianreber/mirrorlist-server
and so far the results from the new mirrorlist server are the same as
from the Python based mirrorlist server.

It requires less than 700MB of memory, instead of the 1GB the Python
version needs, with production data and seems really fast.

I have set up a test instance with the mirror data from Sunday at:

https://lisas.de/metalink?repo=updates-testing-f31&arch=x86_64
https://lisas.de/mirrorlist?repo=updates-testing-f31&arch=x86_64
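
For anyone who wants to compare the answers with the current servers, a
small sketch that builds the same query URLs and fetches them. The helper
names are made up for this example; only the host, the two endpoints and
the repo/arch parameters come from the URLs above.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

BASE = "https://lisas.de"  # test instance from this mail


def build_url(endpoint, repo, arch):
    """Build a mirrorlist or metalink query URL for the test instance."""
    query = urlencode({"repo": repo, "arch": arch})
    return f"{BASE}/{endpoint}?{query}"


def fetch(endpoint, repo, arch, timeout=10):
    """Fetch the answer as text (needs network access)."""
    with urlopen(build_url(endpoint, repo, arch), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

For example, fetch("mirrorlist", "updates-testing-f31", "x86_64") requests
the second URL above.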

The instance is based on the container I pushed to quay.io:

 $ podman run quay.io/adrianreber/mirrorlist-server:latest -h

With this change the mirrorlist server would also finally switch to
geoip2. The currently running mirrorlist server still uses the legacy
geoip database.

After the Fedora 31 freeze I would like to introduce this new mirrorlist
server implementation on the proxies. I already verified that I can run
this mirrorlist container rootless. This new container can be a drop-in
replacement for the current container and no infrastructure around it
needs to be changed.

The main changes needed to get it into production are to edit
mirrorlist1.service and mirrorlist2.service to include a line
"User=mirrormanager" and to replace the current container name with the
name of the new container.

		Adrian