Re: [PATCH 3/5] ext4: Speedup ext4 orphan inode handling

On Thu 12-08-21 11:01:34, Theodore Ts'o wrote:
> On Wed, Aug 11, 2021 at 12:19:13PM +0200, Jan Kara wrote:
> > +static int ext4_orphan_file_del(handle_t *handle, struct inode *inode)
> > +{
> > +	struct ext4_orphan_info *oi = &EXT4_SB(inode->i_sb)->s_orphan_info;
> > +	__le32 *bdata;
> > +	int blk, off;
> > +	int inodes_per_ob = ext4_inodes_per_orphan_block(inode->i_sb);
> > +	int ret = 0;
> > +
> > +	if (!handle)
> > +		goto out;
> > +	blk = EXT4_I(inode)->i_orphan_idx / inodes_per_ob;
> > +	off = EXT4_I(inode)->i_orphan_idx % inodes_per_ob;
> > +	if (WARN_ON_ONCE(blk >= oi->of_blocks))
> > +		goto out;
> > +
> > +	ret = ext4_journal_get_write_access(handle, inode->i_sb,
> > +				oi->of_binfo[blk].ob_bh, EXT4_JTR_ORPHAN_FILE);
> > +	if (ret)
> > +		goto out;
> 
> If ext4_journal_get_write_access() fails, we effectively drop the
> inode from the orphan list (as far as the in-memory inode is
> concerned), although the inode will still be listed in the orphan
> file.  This can be really unfortunate since if the inode gets
> reallocated for some other purpose, since its inode number is left in
> the orphan block, on the next remount, this could lead to data loss.
> 
> In the orphan list code, we leave the inode on the linked list, which
> is not great, since that will prevent the inode from being freed, but
> at least we're keeping the in-memory and on-disk state in sync and we
> avoid the data loss scenario when the inode gets reused.

Actually, in the orphan list code, we leave the inode in the on-disk list
but remove it from the in-memory list - see how
list_del_init(&ei->i_orphan) is called very early in ext4_orphan_del(). The
reason for this unconditional deletion is that if we do not remove the
inode from the in-memory orphan list, the filesystem will complain and
corrupt memory on unmount.
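
To illustrate the ordering, here is a toy user-space model, not kernel
code. struct model_orphan and model_orphan_del() are made-up names for this
sketch, and the boolean flags only stand in for the in-memory list
membership and the on-disk entry:

```c
#include <stdbool.h>

/* Hypothetical toy model of the ordering described above; not kernel code. */
struct model_orphan {
	bool in_memory_list;	/* models membership via ei->i_orphan */
	bool on_disk_list;	/* models the entry in the on-disk orphan list */
};

/*
 * Models the order of operations: drop the in-memory entry first, then
 * update the on-disk list only if journal write access succeeds.
 * Returns 0 on success and -1 on the simulated journal failure.
 */
static int model_orphan_del(struct model_orphan *o, bool journal_access_ok)
{
	/* Unconditional, in the spirit of the early list_del_init() call. */
	o->in_memory_list = false;

	if (!journal_access_ok)
		return -1;	/* the on-disk entry is left behind */

	o->on_disk_list = false;
	return 0;
}
```

In this model a failed journal access leaves on_disk_list set while
in_memory_list is already clear, which is the state I am describing.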

Also note that leaving the inode in the on-disk orphan list actually does no
serious harm, because the orphan cleanup code just checks i_nlink and
i_disksize: it truncates the inode down to the current i_disksize, and
removes the inode completely only if i_nlink is 0. So even if an inode on
the orphan list gets reused, orphan cleanup will just do nothing for it. The
worst problem that will likely happen is that the on-disk orphan linked list
becomes corrupted, but there's no data loss AFAICT.
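
A minimal sketch of that decision, as a simplified user-space model rather
than the actual cleanup code. struct model_inode, alloc_end and
model_orphan_cleanup() are names invented for this example, and the model
assumes a reused inode has nothing allocated past its i_disksize:

```c
#include <stdint.h>

/* Hypothetical user-space model; not the kernel's data structures. */
struct model_inode {
	unsigned int nlink;	/* models i_nlink */
	uint64_t alloc_end;	/* end of allocated data, in bytes (assumption) */
	uint64_t disksize;	/* models i_disksize */
	int freed;		/* set when cleanup deletes the inode */
};

enum cleanup_action { CLEANUP_DELETED, CLEANUP_TRUNCATED, CLEANUP_NOOP };

/* Models what orphan cleanup does for one inode found on the list. */
static enum cleanup_action model_orphan_cleanup(struct model_inode *inode)
{
	if (inode->nlink == 0) {
		/* Unlinked inode: remove it completely. */
		inode->freed = 1;
		return CLEANUP_DELETED;
	}
	if (inode->alloc_end > inode->disksize) {
		/* Interrupted truncate: drop data past i_disksize. */
		inode->alloc_end = inode->disksize;
		return CLEANUP_TRUNCATED;
	}
	/* Linked inode with nothing past i_disksize: nothing to do. */
	return CLEANUP_NOOP;
}
```

Under these assumptions, a reused inode with i_nlink > 0 and no data beyond
i_disksize takes the last branch, so the stale orphan entry does not touch it.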

Is it clearer now or am I missing something?

								Honza
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR


