On Tue, Jan 14, 2025 at 12:02:31PM +0100, Dmitry Vyukov wrote:
> On Tue, 14 Jan 2025 at 11:43, Marco Elver <elver@xxxxxxxxxx> wrote:
> > On Tue, 14 Jan 2025 at 06:35, Jiao, Joey <quic_jiangenj@xxxxxxxxxxx> wrote:
> > >
> > > Hi,
> > >
> > > This patch series introduces new kcov unique modes:
> > > `KCOV_TRACE_UNIQ_[PC|EDGE|CMP]`, which are used to collect unique PC, EDGE,
> > > CMP information.
> > >
> > > Background
> > > ----------
> > >
> > > In the current kcov implementation, when `__sanitizer_cov_trace_pc` is hit,
> > > the instruction pointer (IP) is stored sequentially in an area. Userspace
> > > programs then read this area to record covered PCs and calculate covered
> > > edges. However, recent syzkaller runs show that many syscalls likely have
> > > `pos > t->kcov_size`, leading to kcov overflow. To address this issue, we
> > > introduce new kcov unique modes.
> >
> > Overflow by how much? How much space is missing?
> >
> > > Solution Overview
> > > -----------------
> > >
> > > 1. [P 1] Introduce `KCOV_TRACE_UNIQ_PC` Mode:
> > >    - Export `KCOV_TRACE_UNIQ_PC` to userspace.
> > >    - Add `kcov_map` struct to manage memory during the KCOV lifecycle.
> > >    - `kcov_entry` struct as a hashtable entry containing unique PCs.
> > >    - Use hashtable buckets to link `kcov_entry`.
> > >    - Preallocate memory using genpool during KCOV initialization.
> > >    - Move `area` inside `kcov_map` for easier management.
> > >    - Use `jhash` for hash key calculation to support `KCOV_TRACE_UNIQ_CMP`
> > >      mode.
> > >
> > > 2. [P 2-3] Introduce `KCOV_TRACE_UNIQ_EDGE` Mode:
> > >    - Save `prev_pc` to calculate edges with the current IP.
> > >    - Add unique edges to the hashmap.
> > >    - Use a lower 12-bit mask to make hash independent of module offsets.
> > >    - Distinguish areas for `KCOV_TRACE_UNIQ_PC` and `KCOV_TRACE_UNIQ_EDGE`
> > >      modes using `offset` during mmap.
> > >    - Support enabling `KCOV_TRACE_UNIQ_PC` and `KCOV_TRACE_UNIQ_EDGE`
> > >      together.
> > >
> > > 3. [P 4] Introduce `KCOV_TRACE_UNIQ_CMP` Mode:
> > >    - Shares the area with `KCOV_TRACE_UNIQ_PC`, making these modes
> > >      exclusive.
> > >
> > > 4. [P 5] Add Example Code Documentation:
> > >    - Provide examples for testing different modes:
> > >      - `KCOV_TRACE_PC`: `./kcov` or `./kcov 0`
> > >      - `KCOV_TRACE_CMP`: `./kcov 1`
> > >      - `KCOV_TRACE_UNIQ_PC`: `./kcov 2`
> > >      - `KCOV_TRACE_UNIQ_EDGE`: `./kcov 4`
> > >      - `KCOV_TRACE_UNIQ_PC|KCOV_TRACE_UNIQ_EDGE`: `./kcov 6`
> > >      - `KCOV_TRACE_UNIQ_CMP`: `./kcov 8`
> > >
> > > 5. [P 6-7] Disable KCOV Instrumentation:
> > >    - Disable instrumentation like genpool to prevent recursive calls.
> > >
> > > Caveats
> > > -------
> > >
> > > The userspace program has been tested on Qemu x86_64 and two real Android
> > > phones with different ARM64 chips. More syzkaller-compatible tests have
> > > been conducted. However, due to limited knowledge of other platforms,
> > > assistance from those with access to other systems is needed.
> > >
> > > Results and Analysis
> > > --------------------
> > >
> > > 1. KMEMLEAK Test on Qemu x86_64:
> > >    - No memory leaks found during the `kcov` program run.
> > >
> > > 2. KCSAN Test on Qemu x86_64:
> > >    - No KCSAN issues found during the `kcov` program run.
> > >
> > > 3. Existing Syzkaller on Qemu x86_64 and Real ARM64 Device:
> > >    - Syzkaller can fuzz, show coverage, and find bugs. Adjusting `procs`
> > >      and `vm mem` settings can avoid OOM issues caused by genpool in the
> > >      patches, so `procs:4 + vm:2GB` or `procs:4 + vm:2GB` are used for
> > >      Qemu x86_64.
> > >    - `procs:8` is kept on Real ARM64 Device with 12GB/16GB mem.
> > >
> > > 4. Modified Syzkaller to Support New KCOV Unique Modes:
> > >    - Syzkaller runs fine on both Qemu x86_64 and ARM64 real devices.
> > >      Limited `Cover overflows` and `Comps overflows` observed.
> > >
> > > 5. Modified Syzkaller + Upstream Kernel Without Patch Series:
> > >    - Not tested. The modified syzkaller will fall back to `KCOV_TRACE_PC`
> > >      or `KCOV_TRACE_CMP` if `ioctl` fails for Unique mode.
> > >
> > > Possible Further Enhancements
> > > -----------------------------
> > >
> > > 1. Test more cases and setups, including those in syzbot.
> > > 2. Ensure `hash_for_each_possible_rcu` is protected for reentrance
> > >    and atomicity.
> > > 3. Find a simpler and more efficient way to store unique coverage.
> > >
> > > Conclusion
> > > ----------
> > >
> > > These patches add new kcov unique modes to mitigate the kcov overflow
> > > issue, compatible with both existing and new syzkaller versions.
> >
> > Thanks for the analysis, it's clearer now.
> >
> > However, the new design you introduce here adds lots of complexity.
> > Answering the question of how much overflow is happening, might give
> > better clues if this is the best design or not. Because if the
> > overflow amount is relatively small, a better design (IMHO) might be
> > simply implementing a compression scheme, e.g. a simple delta
> > encoding.
>
> Joey, do you have corresponding patches for syzkaller? I wonder how
> the integration looks like, in particular when/how these maps are
> cleared.

Uploaded in https://github.com/google/syzkaller/pull/5673
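
For reference, the "simple delta encoding" alternative mentioned in the quoted
review could look roughly like the userspace sketch below. It is not taken from
the patch series or from the existing kcov code: the function names
`pc_delta_encode`/`pc_delta_decode` and the zigzag + variable-length byte
layout are assumptions chosen only to illustrate the idea that consecutive PCs
are usually close together, so their differences need far fewer bytes than a
full 64-bit word each.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of kcov): encode n PCs as zigzag-mapped
 * deltas from the previous PC, each stored as a little-endian base-128
 * varint. Returns the number of bytes written, or 0 if `out` is too small
 * (or n is 0).
 */
size_t pc_delta_encode(const uint64_t *pcs, size_t n, uint8_t *out, size_t cap)
{
	uint64_t prev = 0;
	size_t pos = 0;

	assert(pcs && out);
	for (size_t i = 0; i < n; i++) {
		uint64_t d = pcs[i] - prev;	/* wraps modulo 2^64 */
		uint64_t z = d << 1;		/* zigzag: sign goes to bit 0 */

		if (d >> 63)
			z = ~z;
		do {
			uint8_t b = z & 0x7f;

			if (pos == cap)
				return 0;
			z >>= 7;
			out[pos++] = b | (z ? 0x80 : 0); /* 0x80 = more bytes follow */
		} while (z);
		prev = pcs[i];
	}
	return pos;
}

/*
 * Hypothetical inverse of pc_delta_encode(). Returns the number of PCs
 * decoded, or 0 on truncated/oversized input or if `pcs` is too small.
 */
size_t pc_delta_decode(const uint8_t *in, size_t len, uint64_t *pcs, size_t cap)
{
	uint64_t prev = 0;
	size_t pos = 0, n = 0;

	assert(in && pcs);
	while (pos < len) {
		uint64_t z = 0;
		unsigned int shift = 0;
		uint8_t b;

		do {
			if (pos == len || shift > 63)
				return 0;
			b = in[pos++];
			z |= (uint64_t)(b & 0x7f) << shift;
			shift += 7;
		} while (b & 0x80);
		if (n == cap)
			return 0;
		prev += (z >> 1) ^ (0 - (z & 1));	/* undo zigzag, add delta */
		pcs[n++] = prev;
	}
	return n;
}
```

With a scheme like this the first PC of a trace still costs several bytes, but
subsequent nearby PCs typically fit in one or two bytes each. Whether that is
enough to avoid the overflow depends on how large the overflow actually is,
which is the open question raised in the review.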