Re: New mirrorlist server implementation

Stephen John Smoogen <smooge@xxxxxxxxx> · Tue, 8 Oct 2019 09:40:09 -0400

On Tue, 8 Oct 2019 at 02:42, Adrian Reber <adrian@xxxxxxxx> wrote:
>
>
> Fedora's complete MirrorManager setup is still running on Python2. The
> code was ported to Python3 probably over two years ago, but we have
> not switched yet. One of the reasons is that the backend is running on
> RHEL7 which means we are not in a hurry to deploy the Python3 version.
>
> The mirrorlist server which is answering the actual dnf/yum queries for
> a mirrorlist/metalink is, however, running in a Fedora 29 container.
> This container also still uses Python2 and it actually cannot use the
> Python3 version.
>
> One of MirrorManager's design points is that the mirrorlist servers,
> which are answering around 27 000 000 requests per day, are not directly
> accessing the database. The backend creates a snapshot of the relevant
> data (113MB) and the mirrorlist servers are using this snapshot to
> answer client requests.
>
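
For readers who do not know the design, the snapshot approach described above can be sketched roughly as follows. This is only an illustration, not MirrorManager's actual code: JSON stands in for the real pickle/protobuf format, the function names, the "repo/arch" key layout and the example.org URL are all made up, and the write-then-rename step is my assumption about how a snapshot could be swapped in safely.

```python
import json
import os
import tempfile


def write_snapshot(path, data):
    """Backend side (hypothetical): dump the data, then rename it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        json.dump(data, handle)
    # os.replace swaps the file in one step, so a reader never sees a
    # half-written snapshot (assumption: same filesystem, POSIX semantics).
    os.replace(tmp_path, path)


def load_snapshot(path):
    """Mirrorlist side (hypothetical): load the whole snapshot into memory."""
    with open(path) as handle:
        return json.load(handle)


def mirrors_for(snapshot, repo, arch):
    """Answer a client query from memory only, without touching a database."""
    return snapshot.get(f"{repo}/{arch}", [])


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    snapshot_path = os.path.join(workdir, "snapshot.json")
    write_snapshot(
        snapshot_path,
        {"updates-testing-f31/x86_64": ["https://mirror.example.org/fedora/"]},
    )
    snapshot = load_snapshot(snapshot_path)
    print(mirrors_for(snapshot, "updates-testing-f31", "x86_64"))
```

The point of the design is that request handling only reads an in-memory structure, so the database load does not grow with the number of client requests.
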
> This data exchange is based on Python's pickle format and that does not
> seem to work with Python3 if it is generated using Python2.
>
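
One well-known way a Python2 pickle can fail to load under Python3 is an 8-bit str with non-ASCII bytes. I do not know whether that is what breaks the MirrorManager snapshot, so take this as a small sketch of that failure class only. The byte string below is hand-assembled (protocol 2, Python2-style short string) and the helper names are made up.

```python
import pickle

# Hand-assembled protocol 2 stream holding a Python2-style 8-bit string
# "caf\xe9"; the last byte (0xe9) is outside ASCII.
PY2_STYLE_PICKLE = b"\x80\x02U\x04caf\xe9q\x00."


def load_py2_pickle(data, encoding="ASCII"):
    """Load with an explicit encoding for Python2 8-bit strings."""
    return pickle.loads(data, encoding=encoding)


def try_default(data):
    """Return the loaded object, or the decode error raised by the default."""
    try:
        return load_py2_pickle(data)
    except UnicodeDecodeError as exc:
        return exc


if __name__ == "__main__":
    # Default encoding is ASCII, so the non-ASCII byte cannot be decoded.
    print(type(try_default(PY2_STYLE_PICKLE)).__name__)
    # latin1 maps every byte to a code point, so loading succeeds.
    print(repr(load_py2_pickle(PY2_STYLE_PICKLE, "latin1")))
    # "bytes" keeps the Python2 str as a bytes object.
    print(repr(load_py2_pickle(PY2_STYLE_PICKLE, "bytes")))
```

Passing encoding="latin1" or encoding="bytes" gets around this particular error, but every consumer of the data then has to cope with the resulting str/bytes mix, which is one reason a language-neutral format is attractive.
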
> Having used protobuf before, I added code to also export the data for the
> mirrorlist servers based on protobuf.
>
> The good news with protobuf is that the resulting file is only 66MB
> instead of 113MB. The bad news is that loading it from Python requires
> 3.5 times the amount of memory during runtime (3.5GB instead of 1GB).
>
> In addition to the data exchange problems between backend and
> mirrorlist servers, the architecture of the mirrorlist server does not
> really make sense today. 12 years ago it made a lot of sense as it could
> be easily integrated into httpd and it could be easily reloaded without
> stopping the service. Today the mirrorlist server and httpd are all part
> of a container which is then behind haproxy. So there is a lot of
> infrastructure in the container which is not really useful.
>
> To get rid of the pickle format and to have a simpler architecture, I
> reimplemented the mirrorlist-server in Rust. This was brought up some
> time ago on a ticket and with the protobuf problems I was seeing in
> Python it made sense to try it out.
>
> My code currently can be found at https://github.com/adrianreber/mirrorlist-server
> and so far the results from the new mirrorlist server are the same as
> from the Python based mirrorlist server.
>
> It requires less than 700MB instead of the 1GB in Python with
> production-based data and seems really fast.
>
> I have set up a test instance with the mirror data from Sunday at:
>
> https://lisas.de/metalink?repo=updates-testing-f31&arch=x86_64
> https://lisas.de/mirrorlist?repo=updates-testing-f31&arch=x86_64
>

Nice. Very very nice.

Mirror management software is hard. Thank you for doing all this work.

-- 
Stephen J Smoogen.