Since even just grab_cache_page_write_begin() might deadlock with page writeback from __ext4_journalled_writepage() if stable pages are required (as it does "lock_page(); wait_for_stable_page();") and the handle to the same/running transaction is currently held, it _cannot_ be called between ext4_journal_start/stop() as usual, we have to be careful. We have two options: 1) open-code the first part of grab_cache_page_write_begin() (before wait_for_stable_pages()) just before calling it, then check for deadlock and retry (a bit ugly but valid.) 2) move it before ext4_journal_start() to avoid the deadlock. Option 2 has been done as optimization to ext4_write_begin() in commit 47564bfb95bf ("ext4: grab page before starting transaction handle in write_begin()"), and can similarly apply to this case. Since it sounds more official, counts as optimization that prevents long transaction hold time, and isn't ugly, do it. Signed-off-by: Mauricio Faria de Oliveira <mfo@xxxxxxxxxxxxx> --- fs/ext4/inline.c | 48 ++++++++++++++++++++++++++++++++++-------------- 1 file changed, 34 insertions(+), 14 deletions(-) diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c index f35e289e17aa..5fd275098d10 100644 --- a/fs/ext4/inline.c +++ b/fs/ext4/inline.c @@ -548,23 +548,42 @@ static int ext4_convert_inline_data_to_extent(struct address_space *mapping, if (ret) return ret; -retry: + /* + * grab_cache_page_write_begin() can take a long time if the + * system is thrashing due to memory pressure, or if the page + * is being written back. So grab it first before we start + * the transaction handle. This also allows us to allocate + * the page (if needed) without using GFP_NOFS. + */ +retry_grab: + page = grab_cache_page_write_begin(mapping, 0, flags); + if (!page) { + ret = -ENOMEM; + handle = NULL; + goto out; + } + unlock_page(page); + +retry_journal: handle = ext4_journal_start(inode, EXT4_HT_WRITE_PAGE, needed_blocks); if (IS_ERR(handle)) { + put_page(page); ret = PTR_ERR(handle); handle = NULL; + page = NULL; goto out; } - /* We cannot recurse into the filesystem as the transaction is already - * started */ - flags |= AOP_FLAG_NOFS; - - page = grab_cache_page_write_begin(mapping, 0, flags); - if (!page) { - ret = -ENOMEM; - goto out; + lock_page(page); + if (page->mapping != mapping) { + /* The page got truncated from under us */ + unlock_page(page); + put_page(page); + ext4_journal_stop(handle); + goto retry_grab; } + /* In case writeback began while the page was unlocked */ + wait_for_stable_page(page); ext4_write_lock_xattr(inode, &no_expand); sem_held = 1; @@ -600,8 +619,6 @@ static int ext4_convert_inline_data_to_extent(struct address_space *mapping, if (ret) { unlock_page(page); - put_page(page); - page = NULL; ext4_orphan_add(handle, inode); ext4_write_unlock_xattr(inode, &no_expand); sem_held = 0; @@ -616,10 +633,13 @@ static int ext4_convert_inline_data_to_extent(struct address_space *mapping, */ if (inode->i_nlink) ext4_orphan_del(NULL, inode); - } - if (ret == -ENOSPC && ext4_should_retry_alloc(inode->i_sb, &retries)) - goto retry; + if (ret == -ENOSPC && + ext4_should_retry_alloc(inode->i_sb, &retries)) + goto retry_journal; + put_page(page); + page = NULL; + } if (page) block_commit_write(page, from, to); -- 2.20.1