Re: [RFCv4 3/3] iomap: Support subpage size dirty tracking to improve write performance

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Matthew Wilcox <willy@xxxxxxxxxxxxx> writes:

> On Thu, May 04, 2023 at 08:21:09PM +0530, Ritesh Harjani (IBM) wrote:
>> @@ -90,12 +116,21 @@ iomap_page_create(struct inode *inode, struct folio *folio, unsigned int flags)
>>  	else
>>  		gfp = GFP_NOFS | __GFP_NOFAIL;
>>
>> -	iop = kzalloc(struct_size(iop, state, BITS_TO_LONGS(nr_blocks)),
>> +	/*
>> +	 * iop->state tracks two sets of state flags when the
>> +	 * filesystem block size is smaller than the folio size.
>> +	 * The first state tracks per-filesystem block uptodate
>> +	 * and the second tracks per-filesystem block dirty
>> +	 * state.
>> +	 */
>
> "per-filesystem"?  I think you mean "per-block (uptodate|block) state".
> Using "per-block" naming throughout this patchset might help readability.
> It's currently an awkward mix of "subpage", "sub-page" and "sub-folio".
and subfolio to add.

Yes, I agree it got all mixed up in the comments.
Let me stick to sub-folio (which was what we were using earlier [1])

[1]: https://lore.kernel.org/all/20211216210715.3801857-5-willy@xxxxxxxxxxxxx/

> It also feels like you're adding a comment to every non-mechanical change
> you make, which isn't necessarily helpful.  Changelog, sure, but
> sometimes your comments simply re-state what your change is doing.
>

Sure, I will keep that in mind for next rev to remove unwanted comments.

>> -static void iomap_iop_set_range_uptodate(struct folio *folio,
>> -		struct iomap_page *iop, size_t off, size_t len)
>> +static void iomap_iop_set_range(struct folio *folio, struct iomap_page *iop,
>> +		size_t off, size_t len, enum iop_state state)
>>  {
>>  	struct inode *inode = folio->mapping->host;
>> -	unsigned first = off >> inode->i_blkbits;
>> -	unsigned last = (off + len - 1) >> inode->i_blkbits;
>> +	unsigned int blks_per_folio = i_blocks_per_folio(inode, folio);
>> +	unsigned int first_blk = (off >> inode->i_blkbits);
>> +	unsigned int last_blk = ((off + len - 1) >> inode->i_blkbits);
>> +	unsigned int nr_blks = last_blk - first_blk + 1;
>>  	unsigned long flags;
>> -	unsigned int nr_blocks = i_blocks_per_folio(inode, folio);
>>
>> -	spin_lock_irqsave(&iop->state_lock, flags);
>> -	iop_set_range_uptodate(iop, first, last - first + 1, nr_blocks);
>> -	if (iop_uptodate_full(iop, nr_blocks))
>> -		folio_mark_uptodate(folio);
>> -	spin_unlock_irqrestore(&iop->state_lock, flags);
>> +	switch (state) {
>> +	case IOP_STATE_UPDATE:
>> +		if (!iop) {
>> +			folio_mark_uptodate(folio);
>> +			return;
>> +		}
>> +		spin_lock_irqsave(&iop->state_lock, flags);
>> +		iop_set_range_uptodate(iop, first_blk, nr_blks, blks_per_folio);
>> +		if (iop_uptodate_full(iop, blks_per_folio))
>> +			folio_mark_uptodate(folio);
>> +		spin_unlock_irqrestore(&iop->state_lock, flags);
>> +		break;
>> +	case IOP_STATE_DIRTY:
>> +		if (!iop)
>> +			return;
>> +		spin_lock_irqsave(&iop->state_lock, flags);
>> +		iop_set_range_dirty(iop, first_blk, nr_blks, blks_per_folio);
>> +		spin_unlock_irqrestore(&iop->state_lock, flags);
>> +		break;
>> +	}
>>  }
>
> I can't believe this is what Dave wanted you to do.  iomap_iop_set_range()
> should be the low-level helper called by iop_set_range_uptodate() and
> iop_set_range_dirty(), not the other way around.

Ok, I see the confusion, I think if we make
iomap_iop_set_range() to iomap_set_range(), then that should be good.
Then it becomes iomap_set_range() calling
iop_set_range_update() & iop_set_range_dirty() as the lower level helper routines.

Based on the the existing code, I believe this ^^^^ is how the heirarchy
should look like. Does it look good then? If yes, I will simply drop the
"_iop" part in the next rev.

<Relevant function list>
    iop_set_range_uptodate
    iop_clear_range_uptodate
    iop_test_uptodate
    iop_uptodate_full
    iop_set_range_dirty
    iop_clear_range_dirty
    iop_test_dirty
    iomap_page_create
    iomap_page_release
    iomap_iop_set_range -> iomap_set_range()
    iomap_iop_clear_range  -> iomap_clear_range()

-ritesh



[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux