On Mon, May 18, 2020 at 02:45:02PM +0200, Miklos Szeredi wrote:
> On Sun, May 3, 2020 at 12:27 PM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
> >
> > On Sun, May 03, 2020 at 09:43:41AM +0100, Nikolaus Rath wrote:
> > > Here's what I got:
> > >
> > > [ 221.277260] page:ffffec4bbd639880 refcount:1 mapcount:0 mapping:0000000000000000 index:0xd9
> > > [ 221.277265] flags: 0x17ffffc0000097(locked|waiters|referenced|uptodate|lru)
> > > [ 221.277269] raw: 0017ffffc0000097 ffffec4bbd62f048 ffffec4bbd619308 0000000000000000
> > > [ 221.277271] raw: 00000000000000d9 0000000000000000 00000001ffffffff ffff9aec11beb000
> > > [ 221.277272] page dumped because: fuse: trying to steal weird page
> > > [ 221.277273] page->mem_cgroup:ffff9aec11beb000
> >
> > Great! Here's the condition:
> >
> >         if (page_mapcount(page) ||
> >             page->mapping != NULL ||
> >             page_count(page) != 1 ||
> >             (page->flags & PAGE_FLAGS_CHECK_AT_PREP &
> >              ~(1 << PG_locked |
> >                1 << PG_referenced |
> >                1 << PG_uptodate |
> >                1 << PG_lru |
> >                1 << PG_active |
> >                1 << PG_reclaim))) {
> >
> > mapcount is 0, mapping is NULL, refcount is 1, so that's all fine.
> > flags has 'waiters' set, which is not in the allowed list. I don't
> > know the internals of FUSE, so I don't know why that is.
> >
> > Also, page_count() is unstable. Unless there has been an RCU grace period
> > between when the page was freed and now, a speculative reference may exist
> > from the page cache. So I would say this is a bad thing to check for.
>
> page_cache_pipe_buf_steal() calls remove_mapping() which calls
> page_ref_unfreeze(page, 1). That sets the refcount to 1, right?
>
> What am I missing?

find_get_entry() calling page_cache_get_speculative().

In a previous allocation, this page belonged to the page cache. Then it
was freed, but another thread is in the middle of a page cache lookup and
has already loaded the pointer. It is now delayed by a few clock ticks.

Now the page is allocated to FUSE, which calls page_ref_unfreeze().
And then the refcount gets bumped to 2 by page_cache_get_speculative().
find_get_entry() calls xas_reload() and discovers this page is no longer
at that index, so it calls put_page(), but in that narrow window, FUSE
checks the refcount and finds it's not 1.

Monumentally unlikely, of course, so you've probably never seen it,
but multiply by the hundreds of millions of devices running Linux,
and somebody will hit it someday.
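
To make the ordering concrete, here is a userspace sketch that replays the interleaving described above in a single thread. It is not kernel code: struct toy_page and every toy_* function are hypothetical stand-ins that model only the refcount arithmetic of page_ref_unfreeze(), page_cache_get_speculative() and put_page(). It assumes the speculative get behaves as "take a reference unless the count is zero".

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page; only the refcount is modelled. */
struct toy_page {
	atomic_int refcount;
};

/* Models page_ref_unfreeze(page, count): a frozen page (count 0)
 * becomes visible again with 'count' references. */
static void toy_unfreeze(struct toy_page *p, int count)
{
	atomic_store(&p->refcount, count);
}

/* Models page_cache_get_speculative() under the assumption that it
 * takes a reference unless the count is currently zero. */
static bool toy_get_speculative(struct toy_page *p)
{
	int old = atomic_load(&p->refcount);

	while (old != 0) {
		/* On failure 'old' is refreshed and the loop retries. */
		if (atomic_compare_exchange_weak(&p->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/* Models put_page(): drop one reference. */
static void toy_put(struct toy_page *p)
{
	int old = atomic_fetch_sub(&p->refcount, 1);

	assert(old > 0);
}

struct toy_result {
	bool got_ref;       /* did the stale lookup get its reference? */
	int seen_by_owner;  /* what the new owner's "== 1" check reads */
	int after_put;      /* count once the stale lookup backs off */
};

/* Replays the steps in the order given in the explanation above. */
static struct toy_result toy_replay_race(void)
{
	struct toy_page page;
	struct toy_result r;

	atomic_init(&page.refcount, 0);          /* frozen */
	toy_unfreeze(&page, 1);                  /* new owner: count is 1 */
	r.got_ref = toy_get_speculative(&page);  /* stale lookup: 1 -> 2 */
	r.seen_by_owner = atomic_load(&page.refcount);
	toy_put(&page);                          /* reload fails, ref dropped */
	r.after_put = atomic_load(&page.refcount);
	return r;
}
```

Calling toy_replay_race() shows the owner reading a count of 2 in the window between the speculative get and the put, even though the count returns to 1 right afterwards. That transient value is what a page_count(page) != 1 check can trip over.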