From: Toshiyuki Okajima <toshi.okajima@xxxxxxxxxxxxxx>

Pages in the page cache belonging to ext4 data files are released via
the ext4_releasepage() function specified in the ext4 inode's
address_space_ops.  However, metadata blocks (such as indirect blocks,
directory blocks, etc.) are managed via the block device
address_space_ops, and they cannot be released by
try_to_free_buffers() if they have a journal head attached to them.

To address this, we supply a release_metadata function.  It is called
by the block device's blkdev_releasepage() function and in turn calls
jbd2_journal_try_to_free_buffers() to free the metadata.

Signed-off-by: Toshiyuki Okajima <toshi.okajima@xxxxxxxxxxxxxx>
Signed-off-by: "Theodore Ts'o" <tytso@xxxxxxx>
Cc: linux-fsdevel@xxxxxxxxxxxxxxx
---
 fs/ext4/ext4.h  |    2 ++
 fs/ext4/inode.c |   29 +++++++++++++++++++++++++++++
 fs/ext4/super.c |    4 ++++
 3 files changed, 35 insertions(+), 0 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 695b45c..91e06e4 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1099,6 +1099,8 @@ extern int ext4_chunk_trans_blocks(struct inode *, int nrblocks);
 extern int ext4_block_truncate_page(handle_t *handle,
 		struct address_space *mapping, loff_t from);
 extern int ext4_page_mkwrite(struct vm_area_struct *vma, struct page *page);
+extern int ext4_release_metadata(void *client, struct page *page,
+		gfp_t wait);
 
 /* ioctl.c */
 extern long ext4_ioctl(struct file *, unsigned int, unsigned long);
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index f6d9447..1647903 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3045,6 +3045,35 @@ static int ext4_releasepage(struct page *page, gfp_t wait)
 }
 
 /*
+ * Try to release metadata pages (indirect blocks, directories) which are
+ * mapped via the block device.  Since these pages could have journal heads
+ * which would prevent try_to_free_buffers() from freeing them, we must use
+ * jbd2 layer's try_to_free_buffers() function to release them.
+ *
+ * Note: we have to strip the __GFP_WAIT flag before calling
+ * jbd2_journal_try_to_free_buffers because blkdev_releasepage is
+ * called while holding a spinlock (bdev_inode.client_lock).
+ * Fortunately the metadata buffers we are interested in are freed right
+ * away and do not require calling journal_wait_for_transaction_sync_data().
+ */
+int ext4_release_metadata(void *client, struct page *page, gfp_t wait)
+{
+	struct super_block *sb = (struct super_block*)client;
+	journal_t *journal;
+
+	WARN_ON(PageChecked(page));
+	if (!page_has_buffers(page))
+		return 0;
+	BUG_ON(EXT4_SB(sb) == NULL);
+	journal = EXT4_SB(sb)->s_journal;
+	if (journal != NULL)
+		return jbd2_journal_try_to_free_buffers(journal, page,
+						wait & ~__GFP_WAIT);
+	else
+		return try_to_free_buffers(page);
+}
+
+/*
  * If the O_DIRECT write will extend the file then add this inode to the
  * orphan list.  So recovery will truncate it back to the original size
  * if the machine crashes during the write.
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index bd41fad..f447c46 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -3689,6 +3689,8 @@ static struct file_system_type ext4_fs_type = {
 	.name		= "ext4",
 	.get_sb		= ext4_get_sb,
 	.kill_sb	= kill_block_super,
+	.release_metadata
+			= ext4_release_metadata,
 	.fs_flags	= FS_REQUIRES_DEV,
 };
 
@@ -3708,6 +3710,8 @@ static struct file_system_type ext4dev_fs_type = {
 	.name		= "ext4dev",
 	.get_sb		= ext4dev_get_sb,
 	.kill_sb	= kill_block_super,
+	.release_metadata
+			= ext4_release_metadata,
 	.fs_flags	= FS_REQUIRES_DEV,
 };
 MODULE_ALIAS("ext4dev");
-- 
1.6.0.4.8.g36f27.dirty
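
For readers who want to see the control flow outside the kernel tree, here
is a minimal userspace sketch of the dispatch the patch describes.  It is
not kernel code: every sketch_* name, both struct layouts and the two flag
values are hypothetical stand-ins invented for illustration, and the
journal path simply assumes the journal head can be dropped right away.
It only models two points from the patch: the client pointer is cast back
to the superblock, and the wait flag is masked off before the journal
helper is called.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins; real kernel types and flag values differ. */
typedef unsigned int sketch_gfp_t;
#define SKETCH_GFP_WAIT 0x10u	/* models __GFP_WAIT (caller may sleep) */
#define SKETCH_GFP_FS   0x80u	/* some unrelated flag that must survive */

struct sketch_sb   { int has_journal; };
struct sketch_page { int has_buffers; int has_journal_head; };

/* Records the mask the journal helper last received, for inspection. */
static sketch_gfp_t sketch_last_mask;

/* Models try_to_free_buffers(): refuses buffers with a journal head. */
static int sketch_try_to_free_buffers(struct sketch_page *page)
{
	if (page->has_journal_head)
		return 0;
	page->has_buffers = 0;
	return 1;
}

/*
 * Models the journal-aware helper.  Assumption: the journal no longer
 * needs the buffer, so the journal head is detached without waiting.
 */
static int sketch_journal_try_to_free_buffers(struct sketch_page *page,
					      sketch_gfp_t mask)
{
	sketch_last_mask = mask;
	page->has_journal_head = 0;
	return sketch_try_to_free_buffers(page);
}

/* Models the shape of ext4_release_metadata() from the patch. */
static int sketch_release_metadata(void *client, struct sketch_page *page,
				   sketch_gfp_t wait)
{
	struct sketch_sb *sb = client;

	if (!page->has_buffers)
		return 0;
	if (sb->has_journal)
		/* Strip the wait flag: the caller holds a spinlock. */
		return sketch_journal_try_to_free_buffers(page,
						wait & ~SKETCH_GFP_WAIT);
	return sketch_try_to_free_buffers(page);
}

/* Small demonstration; call it from main() to try the sketch. */
int sketch_demo(void)
{
	struct sketch_sb sb = { .has_journal = 1 };
	struct sketch_page page = { .has_buffers = 1, .has_journal_head = 1 };
	int freed = sketch_release_metadata(&sb, &page,
					    SKETCH_GFP_WAIT | SKETCH_GFP_FS);

	assert((sketch_last_mask & SKETCH_GFP_WAIT) == 0);
	printf("freed=%d mask=0x%x\n", freed, sketch_last_mask);
	return freed;
}
```

Without a journal the same page is left alone, because the plain helper
refuses buffers that still carry a journal head.  That is the case the
patch is meant to cover for journalled filesystems.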