On 7/11/22 Marius Schwarz wrote:
I have just created (not finished yet; started 15 minutes ago) a ~2.5 GB rpm and found that rpmbuild is an extreme bottleneck. IMHO, this is caused by a file-read function which reads files in 32k blocks, which is very slow and extremely IO intensive. The result is a task running on 1 core at 100% permanently. With changes to larger chunks, we can speed up so many build tasks on the farm. Multicore use would also be helpful, e.g. while packing the files. Any counter-arguments?
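One way to check the block-size hypothesis before changing anything is to time plain reads of one file at several chunk sizes. The sketch below is only an illustration: the helper names (read_in_chunks, make_sample_file) are made up, the 32 KiB figure is taken from the message above and not verified against the rpm sources, and a freshly written temporary file will mostly sit in the page cache, so this measures per-read overhead and not raw disk throughput.

```python
import os
import tempfile
import time


def read_in_chunks(path, chunk_size):
    """Read the whole file with unbuffered reads of chunk_size bytes.

    Returns (total_bytes, number_of_reads, elapsed_seconds).
    """
    total = 0
    reads = 0
    start = time.perf_counter()
    # buffering=0 disables Python's own buffer, so each .read() call
    # goes to the raw file object with the requested size.
    with open(path, "rb", buffering=0) as f:
        while True:
            block = f.read(chunk_size)
            if not block:
                break
            total += len(block)
            reads += 1
    return total, reads, time.perf_counter() - start


def make_sample_file(size):
    """Create a temporary file of `size` random bytes; the caller deletes it."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(os.urandom(size))
    return path


if __name__ == "__main__":
    # Hypothetical sizes chosen for illustration only.
    path = make_sample_file(64 * 1024 * 1024)
    try:
        for chunk in (32 * 1024, 1024 * 1024, 8 * 1024 * 1024):
            total, reads, secs = read_in_chunks(path, chunk)
            print(f"{chunk:>9} B chunks: {reads:>5} reads, {secs:.3f} s")
    finally:
        os.remove(path)
```

If the timings barely differ between chunk sizes, the reads are probably not the bottleneck and the CPU time is being spent elsewhere, for example in compression.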
If you give the complete package name and the URL of the repo, then more people are likely to help investigate. Specifying a reproducible example is always good.

If you know "strace -p $PID" then please learn "perf record -p $PID".

If the size of the package is in gigabytes, then upstream bears some responsibility for investigating and documenting the use of data compression with the package. What does upstream say?

In the few samples of "read(" from the output of strace, I see text similar to JSON or XML tags. A large dataset that contains zillions of repetitions of only a few dozen tags creates O(n**2) work for deflation. Finding many matches of any particular tag is quick, but which match can be extended the most, considering the exact context of prefixes and suffixes?

A "looser" compression such as "gzip -3" or lzo might be much faster with only slightly larger output. A software implementation of a hardware technique such as WK, or even "ancient" modem compression MNP5 or MNP10, might also be a good choice.
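To get a rough feel for the speed/size trade-off between compression levels on tag-heavy data, something like the following could be run. It is a sketch under assumptions: the synthetic data generator and the function names are invented for illustration, Python's zlib module stands in for "gzip -N" (both use deflate), and the compressor that rpmbuild actually uses for the payload may be a different one.

```python
import time
import zlib


def make_tagged_data(records):
    """Build synthetic XML-like bytes with a few heavily repeated tags."""
    parts = []
    for i in range(records):
        parts.append(
            f"<item><id>{i}</id><name>n{i * 7919 % 1000}</name>"
            f"<value>{i * 31 % 977}</value></item>\n"
        )
    return "".join(parts).encode("ascii")


def compare_levels(data, levels=(1, 3, 6, 9)):
    """Compress data at each level; return {level: (size, seconds)}."""
    results = {}
    for level in levels:
        start = time.perf_counter()
        packed = zlib.compress(data, level)
        elapsed = time.perf_counter() - start
        # Sanity check: the compressed bytes must round-trip.
        assert zlib.decompress(packed) == data
        results[level] = (len(packed), elapsed)
    return results


if __name__ == "__main__":
    data = make_tagged_data(500_000)  # hypothetical record count
    print(f"input: {len(data)} bytes")
    for level, (size, secs) in compare_levels(data).items():
        print(f"level {level}: {size:>10} bytes, {secs:.3f} s")
```

Running the same comparison on a sample of the real payload, not synthetic data, would give numbers that actually apply to the package in question.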