> On Jan 16, 2020, at 11:04 AM, David Hildenbrand <david@xxxxxxxxxx> wrote:
>
> On 16.01.20 16:54, Michal Hocko wrote:
>> On Thu 16-01-20 09:53:13, Qian Cai wrote:
>>>
>>>
>>>> On Jan 16, 2020, at 9:28 AM, Michal Hocko <mhocko@xxxxxxxxxx> wrote:
>>>>
>>>> On Wed 15-01-20 12:29:16, Qian Cai wrote:
>>>>> It is guaranteed to trigger a lockdep splat if calling printk() with
>>>>> zone->lock held because there are many places (tty, console drivers,
>>>>> debugobjects etc) would allocate some memory with another lock
>>>>> held which is proved to be difficult to fix them all.
>>>>
>>>> I am still not happy with the above much. What would say about something
>>>> like below instead?
>>>> "
>>>> It is not that hard to trigger lockdep splats by calling printk from
>>>> under zone->lock. Most of them are false positives caused by lock chains
>>>> introduced early in the boot process and they do not cause any real
>>>> problems. There are some console drivers which do allocate from the
>>>> printk context as well and those should be fixed. In any case false
>>>> positives are not that trivial to workaround and it is far from optimal
>>>> to lose lockdep functionality for something that is a non-issue.
>>>> <An example of such a false positive goes here>
>>>> "
>>>
>>> I feel like I repeated myself too many times. A call trace for one lock dependency
>>> is sometimes from early boot process because lockdep will save the first one it
>>> encountered, but it does not mean the lock dependency will only not happen in
>>> early boot. I spent some time to study those early boot call traces in the given
>>> lockdep splats, and it looks to me the lock dependency is also possible after
>>> the boot.
>>
>> Then state it explicitly with an example of the trace and explanation
>> that the deadlock is real. If the deadlock is real then it shouldn't be
>> really terribly hard to notice even without lockdep splats which get
>> disabled after the first false positive, right?
>
> I was asking myself for a long time: did anybody actually see this
> deadlock in real life?

Nobody knows for sure. I think one reason is that not many people use
memory offline, and even if they do, it is mostly not a continuous
activity in the system. debugobjects makes it way easier to reproduce
because it allocates memory in random places, but then it is not all
that popular.