On Fri, 12 Apr 2019 09:53:33 -0400
Phil Auld <pauld@xxxxxxxxxx> wrote:

> Hi,
>
> I was trying to get some sched traces on a 160 cpu box yesterday. Trace-cmd
> failed with

Thanks for the report!

>
> # ./tracecmd/trace-cmd record -e "sched:*" sleep 2
> none
> trace-cmd: Invalid argument
> Failed filter of /sys/kernel/tracing/events/sched/sched_switch/filter
>
> trace-cmd: No such file or directory
> can not stat 'trace.dat.cpu0'
> #
>
>
> Which can be seen better with strace
>
> [pid 97653] open("/sys/kernel/tracing/events/sched/sched_swap_numa/filter", O_WRONLY|O_TRUNC) = 5
> [pid 97653] write(5, "(common_pid!=97652)&&(common_pid"..., 3358) = 3358
> [pid 97653] close(5) = 0
> [pid 97653] open("/sys/kernel/tracing/events/sched/sched_switch/filter", O_WRONLY|O_TRUNC) = 5
> [pid 97653] write(5, "(common_pid!=97652)&&(common_pid"..., 6398) = -1 EINVAL (Invalid argument)
> [pid 97653] close(5) = 0
>
> The filter file can only take a max write of length PAGE_SIZE.

Ah yeah. By default we try not to trace the recorders. Newer kernels
have a set_event_pid which is used for only tracing specific tasks for
the events. I wonder if we should allow for "!pid" to be sent to that
file as something to not be traced?

But that doesn't help you now. Hmm, I thought we had an option to
disable this, but I don't see one. That's the first thing we should do.
Add an option such that you record all events, even the threads (which
is something I would definitely want!).

>
> The extra pid filtering added for "next_pid" more or less doubles length
> and pushes it over the 4k limit.
>
> WRITE: /sys/kernel/tracing/events/sched/sched_switch/filter, len 6718, data "(common_pid!=100199)&&(common_pid!=100198)&&(common_pid!=100197)&&(common_pid!=100196)&&(common_pid!=100195)&&(common_pid!=100194)&&(common_pid!=100193)&&(common_pid!=100192)&&(common_pid!=100191) ... 160 of these ...
> &&(common_pid!=100040)||(next_pid!=100199)&&(next_pid!=100198)&&(next_pid!=100197)&&(next_pid!=100196)&&(next_pid!=100195)&&(next_pid!=100194)&&(next_pid!=100193)&&(next_pid!=100192)... 160 of these...
>
>
> I suppose the answer is don't run on a system with that many cpus :)
>
> But I wonder if it would be possible to have the threads each handle say 8 cpu
> files or something.

Actually, I think another solution is to consolidate the pids that are
to be excluded and sort them. Thus if we have (which is very likely the
case)

  (common_pid!=1000)&&(common_pid!=1001)&&(common_pid!=1002)

we change that to:

  !((common_pid>=1000)&&(common_pid<=1002))

Which would also have the effect of improving the filter logic within
the kernel as well.

Tzvetomir or Slavomir, would either of you be able to implement the
above? Both adding an option to disable this (--no-filter) and the
sorting of the excluded pids?

Thanks!

-- Steve

>
> Or maybe have the kernel filter accept an "all_pid" that covered common_pid, next_pid, pid to reduce
> the number of items needed in there?
>
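To make the consolidation idea concrete, here is a rough sketch in
Python (not trace-cmd code, which is C). It sorts the excluded pids,
merges consecutive ones into inclusive ranges, and emits one term per
range, where a range is excluded by negating "pid >= lo and pid <= hi".
The helper names (consolidate, exclude_filter, naive_filter) are made up
for illustration, PAGE_SIZE is assumed to be 4096 (the "4k limit"
mentioned in the report), and whether a given kernel's filter parser
accepts the negated-range form is an assumption that would need
checking.

```python
PAGE_SIZE = 4096  # assumption: 4k pages, per the "4k limit" in the report


def consolidate(pids):
    """Collapse pids into sorted, inclusive (lo, hi) runs of consecutive values."""
    runs = []
    for pid in sorted(set(pids)):
        if runs and pid == runs[-1][1] + 1:
            runs[-1] = (runs[-1][0], pid)  # extend the current run
        else:
            runs.append((pid, pid))        # start a new run
    return runs


def exclude_filter(field, pids):
    """Build a filter string that rejects every pid in `pids` for `field`."""
    terms = []
    for lo, hi in consolidate(pids):
        if lo == hi:
            terms.append(f"({field}!={lo})")
        else:
            terms.append(f"!(({field}>={lo})&&({field}<={hi}))")
    return "&&".join(terms)


def naive_filter(field, pids):
    """One '!=' term per pid, highest pid first, as in the strace output."""
    return "&&".join(f"({field}!={pid})" for pid in sorted(pids, reverse=True))


# 160 recorder pids, assumed consecutive as in the quoted WRITE line.
recorders = range(100040, 100200)

naive = naive_filter("common_pid", recorders) + "||" + naive_filter("next_pid", recorders)
merged = exclude_filter("common_pid", recorders) + "||" + exclude_filter("next_pid", recorders)

print(len(naive), len(naive) > PAGE_SIZE)    # 6718 True
print(len(merged), len(merged) > PAGE_SIZE)
print(merged)
```

The naive string comes out at 6718 bytes, the same length as in the
quoted WRITE line, while the merged one needs a single term per field
as long as the recorder pids are contiguous. If other tasks get pids in
between, the result is a few ranges instead of one, which is still far
shorter than one term per pid.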