On (23/04/17 01:29), Yosry Ahmed wrote:
> > @@ -2239,8 +2241,8 @@ static unsigned long __zs_compact(struct zs_pool *pool,
> >  		if (fg == ZS_INUSE_RATIO_0) {
> >  			free_zspage(pool, class, src_zspage);
> >  			pages_freed += class->pages_per_zspage;
> > -			src_zspage = NULL;
> >  		}
> > +		src_zspage = NULL;
> >
> >  		if (get_fullness_group(class, dst_zspage) == ZS_INUSE_RATIO_100
> >  		    || spin_is_contended(&pool->lock)) {
>
> For my own education, how can this result in the "next is NULL" debug
> error Yu Zhao is seeing?
>
> IIUC if we do not set src_zspage to NULL properly after putback, then
> we will attempt to putback again after the main loop in some cases.
> This can result in a zspage being present more than once in the
> per-class fullness list, right?
>
> I am not sure how this can lead to "next is NULL", which sounds like a
> corrupted list_head, because the next ptr should never be NULL as far
> as I can tell. I feel like I am missing something.

That's a good question to which I don't have an answer. We can list_add()
the same zspage twice, unlocking the pool after first list_add() so that
another process (including another zs_compact()) can do something to that
zspage. The answer is somewhere between these lines, I guess.

I can see how, for example, another DEBUG_LIST check can be triggered:
"list_add double add", because we basically can do

	list_add(page, list)
	list_add(page, list)

I can also see how lockdep can be unhappy with us doing

	write_unlock(&zspage->lock);
	write_unlock(&zspage->lock);

But I don't think I see how "next is NULL" happens (I haven't observed it).
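
To make the double list_add() case concrete, here is a minimal userspace
sketch, not the kernel code. It assumes a circular doubly linked list with
the usual insert-at-head pointer updates; struct list_node, list_add_head(),
is_double_add(), list_terminates() and demo_double_add() are hypothetical
names made up for this illustration. It only models the single-threaded
case, with nobody else touching the entry between the two adds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a kernel-style list node. */
struct list_node {
	struct list_node *next, *prev;
};

static void list_init(struct list_node *head)
{
	head->next = head;
	head->prev = head;
}

/* Insert 'new' right after 'head' in a circular doubly linked list. */
static void list_add_head(struct list_node *new, struct list_node *head)
{
	struct list_node *prev = head;
	struct list_node *next = head->next;

	next->prev = new;
	new->next = next;
	new->prev = prev;
	prev->next = new;
}

/* Simplified form of the condition behind a "double add" debug check. */
static bool is_double_add(const struct list_node *new,
			  const struct list_node *head)
{
	return new == head || new == head->next;
}

/* Walk forward; report whether we get back to 'head' within 'limit' steps. */
static bool list_terminates(const struct list_node *head, int limit)
{
	const struct list_node *pos = head->next;

	while (limit-- > 0) {
		if (pos == head)
			return true;
		pos = pos->next;
	}
	return false;
}

/* Returns 1 if the second add leaves the entry linked to itself. */
int demo_double_add(void)
{
	struct list_node head, entry;

	list_init(&head);
	list_add_head(&entry, &head);
	assert(list_terminates(&head, 8));

	/* A debug check would fire here: entry is already head.next. */
	assert(is_double_add(&entry, &head));

	/* Without the check, the second add makes entry.next point to entry. */
	list_add_head(&entry, &head);
	return entry.next == &entry && !list_terminates(&head, 8);
}
```

In this sketch the second add leaves entry.next pointing at entry itself, so
a forward walk from the head never gets back to the head. No pointer becomes
NULL here, so the sketch covers the "double add" case only and does not
explain "next is NULL".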