Hi Huajun, 2013-11-20 (수), 20:51 +0800, Huajun Li: > On Fri, Nov 15, 2013 at 3:49 PM, Jaegeuk Kim <jaegeuk.kim@xxxxxxxxxxx> wrote: > > Hi Huajun, > > > > [snip] > > > >> +static int __f2fs_convert_inline_data(struct inode *inode, struct page *page) > >> +{ > >> + int err; > >> + struct page *ipage; > >> + struct dnode_of_data dn; > >> + void *src_addr, *dst_addr; > >> + block_t old_blk_addr, new_blk_addr; > >> + struct f2fs_sb_info *sbi = F2FS_SB(inode->i_sb); > >> + > >> + f2fs_lock_op(sbi); > >> + ipage = get_node_page(sbi, inode->i_ino); > >> + if (IS_ERR(ipage)) > >> + return PTR_ERR(ipage); > >> + > >> + /* > >> + * i_addr[0] is not used for inline data, > >> + * so reserving new block will not destroy inline data > >> + */ > >> + set_new_dnode(&dn, inode, ipage, ipage, 0); > >> + err = f2fs_reserve_block(&dn, 0); > >> + if (err) { > >> + f2fs_put_page(ipage, 1); > >> + f2fs_unlock_op(sbi); > >> + return err; > >> + } > >> + > >> + src_addr = inline_data_addr(ipage); > >> + dst_addr = page_address(page); > >> + zero_user_segment(page, 0, PAGE_CACHE_SIZE); > >> + > >> + /* Copy the whole inline data block */ > >> + memcpy(dst_addr, src_addr, MAX_INLINE_DATA); > >> + > >> + /* write data page to try to make data consistent */ > >> + old_blk_addr = dn.data_blkaddr; > >> + set_page_writeback(page); > >> + write_data_page(inode, page, &dn, > >> + old_blk_addr, &new_blk_addr); > >> + update_extent_cache(new_blk_addr, &dn); > >> + f2fs_wait_on_page_writeback(page, DATA, true); > >> + > >> + /* clear inline data and flag after data writeback */ > >> + zero_user_segment(ipage, INLINE_DATA_OFFSET, > >> + INLINE_DATA_OFFSET + MAX_INLINE_DATA); > >> + clear_inode_flag(F2FS_I(inode), FI_INLINE_DATA); > >> + > >> + sync_inode_page(&dn); > >> + f2fs_put_page(ipage, 1); > > > > Again, it seems that you missed what I mentioned. > > If we write the inlined data block only, we cannot recover the data > > block after SPO. > > In order to avoid that, we should write its dnode block too by > > triggering sync_node_pages(ino) at this point as similar as fsync > > routine. > > > > Thanks, > > > > -- > > Jaegeuk Kim > > Samsung > > > > Hi Jaegeuk, > Previously, I refactored f2fs_sync_file() and tried to call it while > converting inline data, but find it is easily dead lock, so just write > data block here. > Then, how about following enhancement to this patch ? it only copies > some codes to the end of __f2fs_convert_inline_data(), and keeps > others same as before. Sorry for the late response. It takes some time for me to verify the consistency problem in more detail. What I've concerned was the following issues: - inlined data was synced before or not, - inlined data was fsynced before or not, - its parent directory inode was synced before or not, - recovery can be safe? - ... Most of these issues are based on the question that "can we recover the inlined data after sudden-power-off safely?". And initially what I concerned was from the following scenario. 1. user writes 3KB data 2. sync or fsync 3. user writes 4KB data : remove direct pointers in the inode page, and cache a converted data page. 4. do checkpoint : write the inode page only ** After power-cut, user expect at least the file should have 3KB data, but there is no data due to the converted inline data. Lastly, I found that, it'd be ok if we can cover the following lock coverages. - f2fs_lock_op | - lock_page(inode_page) | | -- convert_inline_data() | | 1. write_data_page() | | 2. update its inode page() | - unlock_page(inode_page) - f2fs_unlock_op This means that, the step #4 can guarantee that the inode has the direct pointer of 4KB data. And when considering other cases, I couldn't find any issues. So, yes, I concluded that your first approach which writes data pages only was correct. However I found that it needs to modify some recovery routine integrated to your patch, [5/6]. In do_recover_data(inode_page), 1. get the first file offset of the inode_page, 2. get its previous written inode page, 3. diff direct pointers between previous inode page and current inode page 4. check previous and current direct pointers Let's suppose that the previous inode page has inline data and current inode page is a coverted node page or vice versa. [direct pointers] [previous inode page] [current inode page] [0] abcd (inline data) xxx (block addr) or, [0] xxx (block addr) abcd (inline data) In this case, f2fs will recover errorneous block addresses, so it needs to avoid mishandling the direct pointers too. Thanks, -- Jaegeuk Kim Samsung -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html