Re: rpmbuild is very slow with large files

John Reiser <jreiser@xxxxxxxxxxxx> · Wed, 13 Jul 2022 07:30:10 -0700

On 7/11/22 Marius Schwarz wrote:
I have just created (well, not finished yet; it started 15 minutes ago) a ~2.5 GB rpm and found that rpmbuild is an extreme bottleneck.

IMHO, this is caused by a file-read function that reads files in 32k blocks, which is very slow and extremely IO intensive.  The result is a task permanently running one core at 100%. By changing to larger chunks, we could speed up many build tasks on the farm.

Multicore use would also be helpful, e.g. while packing the files.

Any counter-arguments?
If you give the complete package name and the URL of the repo,
then more people are likely to help investigate.
Specifying a reproducible example is always good.
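
As an illustration of what a reproducible check of the 32k-block claim
could look like, here is a minimal Python sketch that times sequential
reads of one file at two chunk sizes.  It is not rpmbuild's code: the
function name time_chunked_read, the 64 MiB random test file, and the
1 MiB comparison size are all assumptions chosen for the example.

```python
import os
import tempfile
import time


def time_chunked_read(path, chunk_size):
    """Read the whole file in chunk_size pieces.

    Returns (seconds, number of non-empty reads, total bytes read).
    """
    calls = 0
    total = 0
    start = time.perf_counter()
    # buffering=0 so that each f.read() is roughly one read(2) call.
    with open(path, "rb", buffering=0) as f:
        while True:
            block = f.read(chunk_size)
            if not block:
                break
            calls += 1
            total += len(block)
    return time.perf_counter() - start, calls, total


if __name__ == "__main__":
    # Hypothetical test input: 64 MiB of random bytes in a temp file.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(64 * 1024 * 1024))
        path = tmp.name
    try:
        for chunk in (32 * 1024, 1024 * 1024):
            seconds, calls, total = time_chunked_read(path, chunk)
            print(f"chunk {chunk:>8}: {calls} reads, "
                  f"{total} bytes, {seconds:.4f} s")
    finally:
        os.unlink(path)
```

A freshly written file is usually still in the page cache, so this
mostly measures per-call overhead rather than disk IO.  If the two
chunk sizes take about the same time, the read size is probably not
the bottleneck.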

If you know "strace -p $PID" then please learn "perf record -p $PID",
which samples where the process spends its CPU time
(view the result with "perf report").

If the size of the package is in gigabytes, then upstream bears some
responsibility for investigating and documenting the use of
data compression with the package.  What does upstream say?

In the few samples of "read(" in the strace output,
I see text similar to JSON or XML tags.  A large dataset
that contains zillions of repetitions of only a few dozen
tags can create O(n**2) work for deflation.  Finding many matches
of any particular tag is quick, but which match can be extended
the most, considering the exact context of prefixes and suffixes?
A "looser" compression such as "gzip -3" or lzo might be
much faster with only slightly larger output.
A software implementation of a hardware technique such as WK,
or even "ancient" modem compression MNP5 or MNP10,
might also be a good choice.
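
The speed/size trade-off between compression levels can be tried out
on synthetic data.  The following is only a sketch under assumptions:
it uses Python's zlib module (deflate) rather than whatever compressor
the package payload actually uses, and the XML-like records produced
by make_tagged_data are invented for illustration.

```python
import time
import zlib


def make_tagged_data(records=20000):
    """Build synthetic XML-like text: a few tags repeated many times."""
    parts = []
    for i in range(records):
        parts.append(
            f"<item><id>{i}</id><name>pkg{i % 97}</name>"
            f"<size>{i * 31 % 4096}</size></item>\n"
        )
    return "".join(parts).encode("ascii")


def measure(data, level):
    """Return (seconds, compressed size) for one zlib level."""
    start = time.perf_counter()
    out = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    # Round-trip check so the timing refers to a valid stream.
    assert zlib.decompress(out) == data
    return elapsed, len(out)


if __name__ == "__main__":
    data = make_tagged_data()
    for level in (1, 3, 6, 9):
        seconds, size = measure(data, level)
        print(f"level {level}: {seconds:.4f} s, {size} bytes "
              f"({size / len(data):.1%} of input)")
```

Running it on a sample of the real payload instead of the synthetic
records would show whether a lower level is an acceptable trade for
this package.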