On Wed, Jan 08, 2025 at 05:22:56PM +0000, Alan Maguire wrote:
> On 08/01/2025 16:38, Ihor Solodrai wrote:
> > On Wednesday, January 8th, 2025 at 5:55 AM, Alan Maguire <alan.maguire@xxxxxxxxxx> wrote:
> >
> >> On 21/12/2024 03:04, Ihor Solodrai wrote:
> >>
> >>> In dwarf_loader with growing nr_jobs the wall-clock time of BTF
> >>> encoding starts worsening after a certain point [1].
> >>>
> >>> While some overhead of additional threads is expected, it's not
> >>> supposed to be noticeable unless nr_jobs is set to an unreasonably big
> >>> value.
> >>>
> >>> It turns out when there are "too many" threads decoding DWARF, they
> >>> start competing for memory allocation: significant number of cycles is
> >>> spent in osq_lock - in the depth of malloc called within
> >>> cu__zalloc. Which suggests that many threads are trying to allocate
> >>> memory at the same time.
> >>>
> >>> See an example on a perf flamegraph for run with -j240 [2]. This is
> >>> 12-core machine, so the effect is small. On machines with more cores
> >>> this problem is worse.
> >>>
> >>> Increasing the chunk size of obstacks associated with CUs helps to
> >>> reduce the performance penalty caused by this race condition.
> >>
> >> Is this because starting with a larger obstack size means we don't have
> >> to keep reallocating as the obstack grows?
> >
> > Yes. Bigger obstack size leads to lower number of malloc calls. The
> > mallocs tend to happen at the same time between threads in the case of
> > DWARF decoding.
> >
> > Curiously, setting a higher obstack chunk size (like 1Mb), does not
> > improve the overall wall-clock time, and can even make it worse.
> > This happens because the kernel takes a different code path to allocate
> > bigger chunks of memory. And also most CUs are not big (at least in case
> > of vmlinux), so a bigger chunk size probably increases wasted memory.
> >
> > 128Kb seems to be close to a sweet spot for the vmlinux.
> > The default is 4Kb.
> >
>
> Thanks for the additional details!
>
> Reviewed-by: Alan Maguire <alan.maguire@xxxxxxxxxx>

I'm adding these details and your reviewed-by tag to that cset.

Thanks!

- Arnaldo
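
For illustration only (this is not code from pahole): the trade-off described in the quoted thread can be sketched as back-of-the-envelope arithmetic. A larger obstack chunk means fewer malloc calls per CU, but more unused space when the CU is small. The CU footprints below are made-up values, and the model assumes one malloc per chunk, ignoring obstack's per-chunk header, objects larger than a chunk, and allocator internals.

```python
import math

KIB = 1024


def chunk_allocations(cu_bytes: int, chunk_bytes: int) -> int:
    """Approximate number of chunk-sized malloc calls needed to hold cu_bytes.

    Simplified model: every chunk is exactly chunk_bytes and at least one
    chunk is always allocated.
    """
    return max(1, math.ceil(cu_bytes / chunk_bytes))


def wasted_bytes(cu_bytes: int, chunk_bytes: int) -> int:
    """Approximate unused space left in the last chunk under the same model."""
    return chunk_allocations(cu_bytes, chunk_bytes) * chunk_bytes - cu_bytes


if __name__ == "__main__":
    # Hypothetical per-CU memory footprints, not measured from vmlinux.
    cu_sizes = (24 * KIB, 300 * KIB, 3000 * KIB)
    # Chunk sizes mentioned in the thread: 4 KiB default, 128 KiB, 1 MiB.
    chunk_sizes = (4 * KIB, 128 * KIB, 1024 * KIB)

    for cu in cu_sizes:
        for chunk in chunk_sizes:
            print(
                f"CU {cu // KIB:5d} KiB, chunk {chunk // KIB:5d} KiB: "
                f"{chunk_allocations(cu, chunk):4d} mallocs, "
                f"{wasted_bytes(cu, chunk) // KIB:5d} KiB unused"
            )
```

Under this model, moving from 4 KiB to 128 KiB chunks cuts the malloc count for mid-sized CUs sharply, while 1 MiB chunks leave most of each chunk unused for small CUs. It does not model the lock contention or the kernel allocation path mentioned in the thread.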