Re: [PATCH] nilfs2: fix issue of nilfs_set_page_dirty for page at EOF boundary

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Andrew,

Could you please queue this fix in -mm tree and send upstream ?

Thanks,
Ryusuke Konishi

On Wed, 1 May 2013 18:14:56 +0400, Vyacheslav Dubeyko wrote:
> Hi Ryusuke,
> 
> On May 1, 2013, at 11:39 AM, Ryusuke Konishi wrote:
> 
>> Hi Vyacheslav,
>> 
>> Here I post the bug fix of the problem that Anthony Doggett reported
>> and you originally proposed a bug fix with the patch titled "nilfs2:
>> fix issue with broken bmap for the case of block size lesser than 4
>> KB".
>> 
>> I tested this patch against the mainline and stable trees, and so far
>> it works fine (the issue is fixed without any side effect).
>> 
>> If you meet an issue on this patch, or have a question, please let me
>> know.
>> 
> 
> Yes, this patch much better than mine. :-)
> 
> With the best regards,
> Vyacheslav Dubeyko.


On Wed, 01 May 2013 16:39:30 +0900 (JST), Ryusuke Konishi wrote:
> Hi Vyacheslav,
> 
> Here I post the bug fix of the problem that Anthony Doggett reported
> and you originally proposed a bug fix with the patch titled "nilfs2:
> fix issue with broken bmap for the case of block size lesser than 4
> KB".
> 
> I tested this patch against the mainline and stable trees, and so far
> it works fine (the issue is fixed without any side effect).
> 
> If you meet an issue on this patch, or have a question, please let me
> know.
> 
> I appreciate very much that you are always making effort to solve
> problems of NILFS users.
> 
> Thanks,
> Ryusuke Konishi
> --
> From: Ryusuke Konishi <konishi.ryusuke@xxxxxxxxxxxxx>
> Subject: [PATCH] nilfs2: fix issue of nilfs_set_page_dirty for page at EOF
>  boundary
> 
> DESCRIPTION:
> There are use-cases when NILFS2 file system (formatted with block size
> lesser than 4 KB) can be remounted in RO mode because of encountering
> of "broken bmap" issue.
> 
> The issue was reported by Anthony Doggett
> <Anthony2486@xxxxxxxxxxxxxxxxx>: "The machine I've been trialling
> nilfs on is running Debian Testing, Linux version 3.2.0-4-686-pae
> (debian-kernel@xxxxxxxxxxxxxxxx) (gcc version 4.6.3 (Debian 4.6.3-14)
> ) #1 SMP Debian 3.2.35-2), but I've also reproduced it (identically)
> with Debian Unstable amd64 and Debian Experimental (using the
> 3.8-trunk kernel).  The problematic partitions were formatted with
> "mkfs.nilfs2 -b 1024 -B 8192"."
> 
> SYMPTOMS:
> (1) System log contains error messages likewise:
> 
>     [63102.496756] nilfs_direct_assign: invalid pointer: 0
>     [63102.496786] NILFS error (device dm-17): nilfs_bmap_assign: broken bmap (inode number=28)
>     [63102.496798]
>     [63102.524403] Remounting filesystem read-only
> 
> (2) The NILFS2 file system is remounted in RO mode.
> 
> REPRODUSING PATH:
> (1) Create volume group with name "unencrypted" by means of vgcreate utility.
> (2) Run script (prepared by Anthony Doggett <Anthony2486@xxxxxxxxxxxxxxxxx>):
> 
> ----------------[BEGIN SCRIPT]--------------------
> 
> VG=unencrypted
> lvcreate --size 2G --name ntest $VG
> mkfs.nilfs2 -b 1024 -B 8192 /dev/mapper/$VG-ntest
> mkdir /var/tmp/n
> mkdir /var/tmp/n/ntest
> mount /dev/mapper/$VG-ntest /var/tmp/n/ntest
> mkdir /var/tmp/n/ntest/thedir
> cd /var/tmp/n/ntest/thedir
> sleep 2
> date
> darcs init
> sleep 2
> dmesg|tail -n 5
> date
> darcs whatsnew || true
> date
> sleep 2
> dmesg|tail -n 5
> ----------------[END SCRIPT]--------------------
> 
> REPRODUCIBILITY: 100%
> 
> INVESTIGATION:
> As it was discovered, the issue takes place during segment
> construction after executing such sequence of user-space operations:
> 
> open("_darcs/index", O_RDWR|O_CREAT|O_NOCTTY, 0666) = 7
> fstat(7, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
> ftruncate(7, 60)
> 
> The error message "NILFS error (device dm-17): nilfs_bmap_assign:
> broken bmap (inode number=28)" takes place because of trying to get
> block number for third block of the file with logical offset #3072
> bytes. As it is possible to see from above output, the file has 60
> bytes of the whole size. So, it is enough one block (1 KB in size)
> allocation for the whole file. Trying to operate with several blocks
> instead of one takes place because of discovering several dirty
> buffers for this file in nilfs_segctor_scan_file() method.
> 
> The root cause of this issue is in nilfs_set_page_dirty function which
> is called just before writing to a mmapped page.
> 
> When nilfs_page_mkwrite function handles a page at EOF boundary, it
> fills hole blocks only inside EOF through __block_page_mkwrite().
> 
> The __block_page_mkwrite() function calls set_page_dirty() after
> filling hole blocks, thus nilfs_set_page_dirty function (=
> a_ops->set_page_dirty) is called.  However, the current implementation
> of nilfs_set_page_dirty() wrongly marks all buffers dirty even for
> page at EOF boundary.
> 
> As a result, buffers outside EOF are inconsistently marked dirty and
> queued for write even though they are not mapped with nilfs_get_block
> function.
> 
> FIX:
> This modifies nilfs_set_page_dirty() not to mark hole blocks dirty.
> 
> Thanks to Vyacheslav Dubeyko for his effort on analysis and proposals
> for this issue.
> 
> Reported-by: Anthony Doggett <Anthony2486@xxxxxxxxxxxxxxxxx>
> Reported-by: Vyacheslav Dubeyko <slava@xxxxxxxxxxx>
> Signed-off-by: Ryusuke Konishi <konishi.ryusuke@xxxxxxxxxxxxx>
> Cc: Vyacheslav Dubeyko <slava@xxxxxxxxxxx>
> Tested-by: Ryusuke Konishi <konishi.ryusuke@xxxxxxxxxxxxx>
> Cc: <stable@xxxxxxxxxxxxxxx>
> ---
>  fs/nilfs2/inode.c |   20 ++++++++++++++++----
>  1 file changed, 16 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/nilfs2/inode.c b/fs/nilfs2/inode.c
> index ba7a1da..436b239 100644
> --- a/fs/nilfs2/inode.c
> +++ b/fs/nilfs2/inode.c
> @@ -219,13 +219,25 @@ static int nilfs_writepage(struct page *page, struct writeback_control *wbc)
>  
>  static int nilfs_set_page_dirty(struct page *page)
>  {
> -	int ret = __set_page_dirty_buffers(page);
> +	int ret = __set_page_dirty_nobuffers(page);
>  
> -	if (ret) {
> +	if (page_has_buffers(page)) {
>  		struct inode *inode = page->mapping->host;
> -		unsigned nr_dirty = 1 << (PAGE_SHIFT - inode->i_blkbits);
> +		unsigned nr_dirty = 0;
> +		struct buffer_head *bh, *head;
>  
> -		nilfs_set_file_dirty(inode, nr_dirty);
> +		bh = head = page_buffers(page);
> +		do {
> +			/* Do not mark hole blocks dirty */
> +			if (!buffer_dirty(bh) || !buffer_mapped(bh))
> +				continue;
> +
> +			set_buffer_dirty(bh);
> +			nr_dirty++;
> +		} while (bh = bh->b_this_page, bh != head);
> +
> +		if (nr_dirty)
> +			nilfs_set_file_dirty(inode, nr_dirty);
>  	}
>  	return ret;
>  }
> -- 
> 1.7.9.3
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux