I keep tripping a BUG() in isolate_lru_pages() at mm/vmscan.c:1345:

	switch (__isolate_lru_page(page, mode)) {
	case 0:
		nr_pages = hpage_nr_pages(page);
		mem_cgroup_update_lru_size(lruvec, lru, -nr_pages);
		list_move(&page->lru, dst);
		nr_taken += nr_pages;
		break;

	case -EBUSY:
		/* else it is being freed elsewhere */
		list_move(&page->lru, src);
		continue;

	default:
		BUG();
	}

This is on an SGI Onyx2 platform (MIPS, IP27) with two node boards (4x R14000 CPUs) and 8G of RAM.

The problem appears tied to heavy disk I/O, typically writes. I can sometimes reproduce it with a long bonnie++ run, but I haven't captured a panic() message under 4.0 yet. Most of the time the machine silently hard-locks. I only have serial console access at 9600bps, so it may be locking up before the serial driver can dump the panic.

Is there any information on the purpose of this BUG() or on what triggers it? I went back through git all the way to the initial 2006 commit that added this function, but could not find any comments or explanation of what it is protecting against. That makes it hard to know where to start debugging.

I've already tried switching filesystems, first ext4 and now XFS. Enabling CONFIG_NUMA seems to make it harder to trigger, but that is only an impression, not a measurement. An md RAID resync doesn't appear to trigger it either.

Help?
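
To make the question concrete, here is how I read that switch, reduced to a small user-space C model. This is only a sketch: the enum and both function names are my own (hypothetical, not kernel identifiers), and it says nothing about why __isolate_lru_page() would return anything other than 0 or -EBUSY. It only shows that every other return value ends up in the default case, which is where BUG() sits.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical labels for this model only; these are not kernel names. */
enum isolate_outcome {
	OUTCOME_TAKEN,		/* return 0: page is moved to dst and counted */
	OUTCOME_PUT_BACK,	/* -EBUSY: page is moved back onto src */
	OUTCOME_BUG		/* anything else: the kernel code calls BUG() */
};

/* Mirrors the three arms of the switch quoted above. */
enum isolate_outcome classify_isolate_result(int ret)
{
	switch (ret) {
	case 0:
		return OUTCOME_TAKEN;
	case -EBUSY:
		return OUTCOME_PUT_BACK;
	default:
		return OUTCOME_BUG;
	}
}

/*
 * Walk an array of simulated return codes in scan order.
 * Returns the index of the first code that would reach BUG(),
 * or -1 if every code is handled. The optional counters report
 * how many entries were taken or put back before that point.
 */
long first_bug_index(const int *rets, size_t n,
		     size_t *taken, size_t *put_back)
{
	size_t n_taken = 0, n_put_back = 0;
	long bug_at = -1;
	size_t i;

	for (i = 0; i < n; i++) {
		enum isolate_outcome o = classify_isolate_result(rets[i]);

		if (o == OUTCOME_BUG) {
			bug_at = (long)i;
			break;
		}
		if (o == OUTCOME_TAKEN)
			n_taken++;
		else
			n_put_back++;
	}

	if (taken)
		*taken = n_taken;
	if (put_back)
		*put_back = n_put_back;
	return bug_at;
}
```

Feeding it the sequence 0, -EBUSY, 0, -EINVAL, 0 stops at index 3 with two pages taken and one put back. -EINVAL is just a stand-in for "some other error code" here; what I am trying to find out is which real condition produces such a value on this machine.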