On Mon, 24 Jan 2022 at 12:24, liupeng (DM) <liupeng256@xxxxxxxxxx> wrote: > > > On 2022/1/24 16:19, Marco Elver wrote: > > On Mon, Jan 24, 2022 at 02:52AM +0000, Peng Liu wrote: > > KFENCE is designed to be enabled in production kernels, but it can > be also useful in some debug situations. For machines with limited > memory and CPU resources, KASAN is really hard to run. Fortunately, > > If these are arm64 based machines, see if CONFIG_KASAN_SW_TAGS works for > you. In future, we believe that CONFIG_KASAN_HW_TAGS will be suitable > for a variety of scenarios, including debugging scenarios of resource > constrained environments. > > Thank you for your good suggestion, we will try it. > > KFENCE can be a suitable candidate. For KFENCE running on a single > machine, the possibility of discovering existed bugs will increase > as the increasing of KFENCE objects, but this will cost more memory. > In order to balance the possibility of discovering existed bugs and > memory cost, KFENCE objects need to be adjusted according to memory > resources for a compiled kernel Image. Add a module parameter to > adjust KFENCE objects will make kfence to use in different machines > with the same kernel Image. > > In short, the following reasons motivate us to add this parameter. > 1) In some debug situations, this will make kfence flexible. > 2) For some production machines with different memory and CPU size, > this will reduce the kernel-Image-version burden. > > [...] > > This patch (of 3): > > [ Note for future: No need to add "This patch (of X)" usually -- this is > added by maintainers if deemed appropriate, and usually includes the > cover letter. ] > > The most important motivation of this patch series is to make > KFENCE easy-to-use in business situations. > > Signed-off-by: Peng Liu <liupeng256@xxxxxxxxxx> > --- > Documentation/dev-tools/kfence.rst | 14 ++-- > include/linux/kfence.h | 3 +- > mm/kfence/core.c | 108 ++++++++++++++++++++++++----- > mm/kfence/kfence.h | 2 +- > mm/kfence/kfence_test.c | 2 +- > 5 files changed, 103 insertions(+), 26 deletions(-) > > [...] > > diff --git a/include/linux/kfence.h b/include/linux/kfence.h > index 4b5e3679a72c..aec4f6b247b5 100644 > --- a/include/linux/kfence.h > +++ b/include/linux/kfence.h > @@ -17,12 +17,13 @@ > #include <linux/atomic.h> > #include <linux/static_key.h> > > +extern unsigned long kfence_num_objects; > /* > * We allocate an even number of pages, as it simplifies calculations to map > * address to metadata indices; effectively, the very first page serves as an > * extended guard page, but otherwise has no special purpose. > */ > -#define KFENCE_POOL_SIZE ((CONFIG_KFENCE_NUM_OBJECTS + 1) * 2 * PAGE_SIZE) > +#define KFENCE_POOL_SIZE ((kfence_num_objects + 1) * 2 * PAGE_SIZE) > extern char *__kfence_pool; > > I appreciate the effort, but you could have gotten a quicker answer if > you had first sent us an email to ask why adjustable number of objects > hasn't been done before. Because if it was trivial, we would have > already done it. > > What you've done is turned KFENCE_POOL_SIZE into a function instead of a > constant (it still being ALL_CAPS is now also misleading). > > This is important here: > > /** > * is_kfence_address() - check if an address belongs to KFENCE pool > * @addr: address to check > * > * Return: true or false depending on whether the address is within the KFENCE > * object range. > * > * KFENCE objects live in a separate page range and are not to be intermixed > * with regular heap objects (e.g. KFENCE objects must never be added to the > * allocator freelists). Failing to do so may and will result in heap > * corruptions, therefore is_kfence_address() must be used to check whether > * an object requires specific handling. > * > * Note: This function may be used in fast-paths, and is performance critical. > * Future changes should take this into account; for instance, we want to avoid > * introducing another load and therefore need to keep KFENCE_POOL_SIZE a > * constant (until immediate patching support is added to the kernel). > */ > static __always_inline bool is_kfence_address(const void *addr) > { > /* > * The __kfence_pool != NULL check is required to deal with the case > * where __kfence_pool == NULL && addr < KFENCE_POOL_SIZE. Keep it in > * the slow-path after the range-check! > */ > return unlikely((unsigned long)((char *)addr - __kfence_pool) < KFENCE_POOL_SIZE && __kfence_pool); > } > > Unfortunately I think you missed the "Note". > > Which means that ultimately your patch adds another LOAD to the fast > path, which is not an acceptable trade-off. > > This would mean your change would require benchmarking, but it'd also > mean we and everyone else would have to re-benchmark _all_ systems where > we've deployed KFENCE. > > I think the only reasonable way forward is if you add immediate patching > support to the kernel as the "Note" suggests. > > May you give us more details about "immediate patching"? Another option may be as follows: Have a config for _max_ pool size. Always reserve max amount of virtual address space, and do the range check for the max amount. But actually allocate pages potentially for a smaller number of objects (configured with a runtime parameter). > In the meantime, while not a single kernel imagine, we've found that > debug scenarios usually are best served with a custom debug kernel, as > there are other debug features that are only Kconfig configurable. Thus, > having a special debug kernel just configure KFENCE differently > shouldn't be an issue in the majority of cases. > > Should this answer not be satisfying for you, the recently added feature > skipping already covered allocations (configurable via > kfence.skip_covered_thresh) alleviates some of the issue of a smaller > pool with a very low sample interval (viz. high sample rate). > > The main thing to watch out for is KFENCE's actual sample rate vs > intended sample rate (per kfence.sample_interval). If you monitor > /sys/kernel/debug/kfence/stats, you can compute the actual sample rate. > If the actual sample rate becomes significantly lower than the intended > rate, only then does it make sense to increase the pool size. My > suggestion for you is therefore to run some experiments, while adjusting > kfence.sample_interval and kfence.skip_covered_thresh until you reach a > sample rate that is close to intended. > > Thanks, > -- Marco > . > > Thank you for your patient suggestions, it's actually helpful and inspired. > We have integrated your latest work "skipping already covered allocations", > and will do more experiments about KFENCE. Finally, we really hope you can > give us more introductions about "immediate patching". > > Thanks, > -- Peng Liu > .