On Thu, Jun 04, 2020 at 07:02:41PM -0700, Kevin Fenzi wrote:
> On Mon, Jun 01, 2020 at 04:18:35PM +0200, Adrian Reber wrote:
> > Our MirrorManager setup exports the current state of all mirrors every
> > hour at :30 to a protobuf based file which is then used by the
> > mirrorlist servers to answer the requests from yum and dnf.
> >
> > The Python script requires up to 10GB of memory and takes between 35 and
> > 50 minutes. The script does a lot of SQL queries and also some really
> > big SQL queries joining up to 6 large MirrorManager tables.
> >
> > I have rewritten this Python script in Rust and now it only needs around
> > 1 minute instead of 35 to 50 minutes and only 600MB instead of 10GB.
>
> Wow. nice!
>
> > I think the biggest difference is that I am almost not doing any joins
> > in my SQL request. I download all the tables once and then I do a lot of
> > loops over the downloaded tables and this seems to be massively faster.
> >
> > As the mirrorlist-server in Rust has proven to be extremely stable over
> > the last months we have been using it I would also like to replace the
> > mirrorlist protobuf input generation with my new Rust based code.
> >
> > I am planning to try out the new protobuf file in staging in the next
> > days and would then try to get my new protobuf generation program into
> > Fedora. Once it is packaged I would discuss here how and if we want to
> > deploy in Fedora's infrastructure.
>
> Cool. You will need to hurry as staging goes off on Monday, and back in
> a few weeks. :)

Then I just have to wait a bit. No problem.

> > Having the possibility to generate the mirrorlist input data in about a
> > minute would significantly reduce the load on the database server and
> > enable us to react much faster if broken protobuf data has been synced
> > to the mirrorlist servers on the proxies.
>
> Yeah, and I wonder if it would let us revisit the entire sequence from
> 'update push finished' to updated mirrorlist server.

Probably.

As the new code will not run on the current RHEL 7 based mm-backend01,
would it make sense to run a short running service like this on Fedora's
OpenShift? We could also create a new read-only (SELECT only) database
account for this.

		Adrian
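
The approach described in the quoted text (download each table once, then
loop over the downloaded rows instead of joining in SQL) could look roughly
like the following Rust sketch. This is only an illustration under stated
assumptions: the `Host` and `HostCategory` structs, their fields, the
function names and the sample rows are all hypothetical and are not taken
from the MirrorManager schema or from the rewritten tool.

```rust
use std::collections::HashMap;

// Hypothetical, simplified rows. The real MirrorManager tables are not
// shown in this thread and certainly have more columns.
struct Host {
    id: u32,
    name: String,
}

struct HostCategory {
    host_id: u32,
    category: String,
}

/// Join two already-downloaded tables in memory: build a lookup index
/// over one table, then loop once over the other.
fn categories_by_host(
    hosts: &[Host],
    host_categories: &[HostCategory],
) -> HashMap<String, Vec<String>> {
    // Index: host id -> host name.
    let names: HashMap<u32, &str> =
        hosts.iter().map(|h| (h.id, h.name.as_str())).collect();

    let mut result: HashMap<String, Vec<String>> = HashMap::new();
    for hc in host_categories {
        // Rows that point at an unknown host are skipped, which mirrors
        // the behaviour of an SQL inner join.
        if let Some(&name) = names.get(&hc.host_id) {
            result
                .entry(name.to_string())
                .or_default()
                .push(hc.category.clone());
        }
    }
    result
}

// Made-up sample data standing in for the result of two plain
// "SELECT ... FROM table" queries without joins.
fn sample_data() -> (Vec<Host>, Vec<HostCategory>) {
    let hosts = vec![
        Host { id: 1, name: "mirror-a.example.org".to_string() },
        Host { id: 2, name: "mirror-b.example.org".to_string() },
    ];
    let host_categories = vec![
        HostCategory { host_id: 1, category: "Fedora Linux".to_string() },
        HostCategory { host_id: 1, category: "Fedora EPEL".to_string() },
        HostCategory { host_id: 2, category: "Fedora Linux".to_string() },
        // No host with id 99 exists, so this row is dropped by the join.
        HostCategory { host_id: 99, category: "Fedora Linux".to_string() },
    ];
    (hosts, host_categories)
}

fn main() {
    let (hosts, host_categories) = sample_data();
    let joined = categories_by_host(&hosts, &host_categories);

    // Sort the host names so the output order is stable.
    let mut names: Vec<&String> = joined.keys().collect();
    names.sort();
    for name in names {
        println!("{}: {:?}", name, joined[name]);
    }
}
```

Building the index costs one pass over the first table and each lookup is
a hash access, so the work grows roughly with the sum of the table sizes.
Whether this is the reason for the speedup reported above is not something
the sketch can show.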