On 2018-12-05 21:53:37 [+0800], He Zhe wrote: > For call trace 1: … > Since kmemleak would most likely be used to debug in environments where > we would not expect as great performance as without it, and kfree() has raw locks > in its main path and other debug function paths, I suppose it wouldn't hurt that > we change to raw locks. okay. > >> >From what I reached above, this is RT-only and happens on v4.18 and v4.19. > >> > >> The call trace above is caused by grabbing kmemleak_lock and then getting > >> scheduled and then re-grabbing kmemleak_lock. Using raw lock can also solve > >> this problem. > > But this is a reader / writer lock. And if I understand the other part > > of the thread then it needs multiple readers. > > For call trace 2: > > I don't get what "it needs multiple readers" exactly means here. > > In this call trace, the kmemleak_lock is grabbed as write lock, and then scheduled > away, and then grabbed again as write lock from another path. It's a > write->write locking, compared to the discussion in the other part of the thread. > > This is essentially because kmemleak hooks on the very low level memory > allocation and free operations. After scheduled away, it can easily re-enter itself. > We need raw locks to prevent this from happening. With raw locks you wouldn't have multiple readers at the same time. Maybe you wouldn't have recursion but since you can't have multiple readers you would add lock contention where was none (because you could have two readers at the same time). > > Couldn't we just get rid of that kfree() or move it somewhere else? > > I mean if the free() memory on CPU-down and allocate it again CPU-up > > then we could skip that, rigth? Just allocate it and don't free it > > because the CPU will likely get up again. > > For call trace 1: > > I went through the CPU hotplug code and found that the allocation of the > problematic data, cpuc->shared_regs, is done in intel_pmu_cpu_prepare. And > the free is done in intel_pmu_cpu_dying. They are handlers triggered by two > different perf events. > > It seems we can hardly form a convincing method that holds the data while > CPUs are off and then uses it again. raw locks would be easy and good enough. Why not allocate the memory in intel_pmu_cpu_prepare() if it is not already there (otherwise skip the allocation) and in intel_pmu_cpu_dying() not free it. It looks easy. > Thanks, > Zhe Sebastian