On Fri, 10 Jan 2025 at 08:33, Joey Jiao <quic_jiangenj@xxxxxxxxxxx> wrote: > > From: "Jiao, Joey" <quic_jiangenj@xxxxxxxxxxx> > > The current design of KCOV risks frequent buffer overflows. To mitigate > this, new modes are introduced: KCOV_TRACE_UNIQ_PC, KCOV_TRACE_UNIQ_EDGE, > and KCOV_TRACE_UNIQ_CMP. These modes allow for the recording of unique > PCs, edges, and comparison operands (CMP). There ought to be a cover letter explaining the motivation for this, and explaining why the new modes would help. Ultimately, what are you using KCOV for where you encountered this problem? > Key changes include: > - KCOV_TRACE_UNIQ_[PC|EDGE] can be used together to replace KCOV_TRACE_PC. > - KCOV_TRACE_UNIQ_CMP can be used to replace KCOV_TRACE_CMP mode. > - Introduction of hashmaps to store unique coverage data. > - Pre-allocated entries in kcov_map_init during KCOV_INIT_TRACE to avoid > performance issues with kmalloc. > - New structs and functions for managing memory and unique coverage data. > - Example program demonstrating the usage of the new modes. This should be a patch series, carefully splitting each change into a separate patch. https://docs.kernel.org/process/submitting-patches.html#split-changes > With the new hashmap and pre-alloced memory pool added, cover size can't > be set to higher value like 1MB in KCOV_TRACE_PC or KCOV_TRACE_CMP modes > in 2GB device with 8 procs, otherwise it causes frequent oom. > > For KCOV_TRACE_UNIQ_[PC|EDGE|CMP] modes, smaller cover size like 8KB can > be used. > > Signed-off-by: Jiao, Joey <quic_jiangenj@xxxxxxxxxxx> As-is it's hard to review, and the motivation is unclear. A lot of code was moved and changed, and reviewers need to understand why that was done besides your brief explanation above. Generally, KCOV has very tricky constraints, due to being callable from any context, including NMI. This means adding new dependencies need to be carefully reviewed. For one, we can see this in genalloc's header: > * The lockless operation only works if there is enough memory > * available. If new memory is added to the pool a lock has to be > * still taken. So any user relying on locklessness has to ensure > * that sufficient memory is preallocated. > * > * The basic atomic operation of this allocator is cmpxchg on long. > * On architectures that don't have NMI-safe cmpxchg implementation, > * the allocator can NOT be used in NMI handler. So code uses the > * allocator in NMI handler should depend on > * CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG. And you are calling gen_pool_alloc() from __sanitizer_cov_trace_pc. Which means this implementation is likely broken on !CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG architectures (do we have architectures like that, that support KCOV?). There are probably other sharp corners due to the contexts KCOV can run in, but would simply ask you to carefully reason about why each new dependency is safe.