On Tue 2023-01-10 12:05:20, Leizhen (ThunderTown) wrote:
> 
> 
> On 2023/1/9 21:40, Petr Mladek wrote:
> > On Wed 2022-12-28 09:45:11, Zhen Lei wrote:
> >> [T58] BUG: sleeping function called from invalid context at kernel/kallsyms.c:305
> >>
> >> The execution time of function kallsyms_on_each_match_symbol() is very
> >> short, about ten microseconds, so the probability of this process being
> >> interrupted is very small. And even if it happens, we just have to try
> >> again.
> >>
> >> The execution time of function kallsyms_on_each_symbol() is very long,
> >> about tens of milliseconds, so context switches are likely to occur
> >> during this period. If the time obtained by task_cputime() is accurate,
> >> it is preferred. Otherwise, use local_clock() directly; the time taken
> >> by irqs and high-priority tasks is not deducted because they always run
> >> for a short time.
> >>
> >> --- a/kernel/kallsyms_selftest.c
> >> +++ b/kernel/kallsyms_selftest.c
> >> +	/*
> >> +	 * The test thread has been bound to a fixed CPU in advance. If the
> >> +	 * number of irqs does not change, no new scheduling request will be
> >> +	 * generated. That is, the performance test process is atomic.
> >> +	 */
> >> +	do {
> >> +		nr_irqs = kstat_cpu_irqs_sum(cpu);
> >> +		cond_resched();
> >> +		t0 = local_clock();
> >> +		kallsyms_on_each_match_symbol(match_symbol, stat.name, &stat);
> >> +		t1 = local_clock();
> >> +	} while (nr_irqs != kstat_cpu_irqs_sum(cpu));
> >
> > Huh, is this guaranteed to ever finish?
> >
> > What if there is a regression and kallsyms_on_each_match_symbol()
> > never finishes without rescheduling?
> 
> The execution time of kallsyms_on_each_match_symbol() does not exceed 10 us.
> Assume that an interrupt is generated every 100 us (10000 interrupts per
> second, which is very high). In this case, interrupts and high-priority
> tasks would need to run for more than 90 us each time to prevent the loop
> from exiting normally.
> However, the CPU usage would then be 90+%, which is unreasonable.
> 
> Function kallsyms_on_each_symbol() takes a long time, about 20 milliseconds,
> and I'm already using task_cputime() for it.

IMHO, this is the wrong mind set. After all, this tests a function that was
optimized because it took too long. The function even contains
cond_resched() because it caused problems.

I know that kallsyms_on_each_match_symbol() is faster because it searches
only one module, but still. The cond_resched() called before taking t0 is
just a horrible, non-reliable hack.

I have seen many problematic non-reliable tests that relied on timing.
They just hoped for the best. I am sure that we could do better.

> > This is yet another unreliable hack.
> >
> >
> > Use a standard solution:
> >
> > I did the homework for you and checked how the "time" command measures
> > the time spent in the system. It actually uses several methods.
> >
> > One is the times() syscall. It uses thread_group_cputime_adjusted(),
> > see do_sys_times() in kernel/sys.c.
> >
> > Or it uses the wait4() syscall to get struct rusage, which provides
> > this information.
> >
> > Please, stop inventing crazy hacks, and use these standard methods.
> > If the "time" tool is enough for userspace performance tests then it
> > must be enough in this case as well.
> 
> Okay, I'll study it in depth; I'm just worried about the different
> precision requirements.

If the problem is timer precision then the solution would be to repeat the
operation more times or to use a better clock source. It is actually good
practice to repeat the action in performance tests because it helps to
reduce noise.

Well, thread_group_cputime_adjusted() works with struct task_cputime. It
seems to store the time in nanoseconds, see include/linux/sched/types.h.
I might be wrong but I would expect that it is accurate and precise enough.

> By the way, we still have to actively embrace new things.
> 
> In fact, I've thought of another way, which is to measure nine times,
> sort in ascending order, and take the middle one. Based on probabilistic
> statistics, the middle result is reliable.

> > Or remove this test:
> >
> > Seriously, what is the value of this test?
> > Is anyone going to use it at all in the future?
> 
> There's really someone interested in the performance data.
> https://lkml.org/lkml/2022/12/15/134

I know. But this code had very bad performance for years and nobody cared.
I am not sure if anyone would care about the info printed by this selftest.
I am not sure if it is worth the effort. We have already spent quite some
time on attempts to make it usable and are still not there.

Best Regards,
Petr