Re: [PATCHv10 8/8] iomap: Add per-block dirty state tracking to improve performance

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Matthew Wilcox <willy@xxxxxxxxxxxxx> writes:

> On Mon, Jun 19, 2023 at 09:55:53PM +0530, Ritesh Harjani wrote:
>> Matthew Wilcox <willy@xxxxxxxxxxxxx> writes:
>> 
>> > On Mon, Jun 19, 2023 at 07:58:51AM +0530, Ritesh Harjani (IBM) wrote:
>> >> +static void ifs_calc_range(struct folio *folio, size_t off, size_t len,
>> >> +		enum iomap_block_state state, unsigned int *first_blkp,
>> >> +		unsigned int *nr_blksp)
>> >> +{
>> >> +	struct inode *inode = folio->mapping->host;
>> >> +	unsigned int blks_per_folio = i_blocks_per_folio(inode, folio);
>> >> +	unsigned int first = off >> inode->i_blkbits;
>> >> +	unsigned int last = (off + len - 1) >> inode->i_blkbits;
>> >> +
>> >> +	*first_blkp = first + (state * blks_per_folio);
>> >> +	*nr_blksp = last - first + 1;
>> >> +}
>> >
>> > As I said, this is not 'first_blkp'.  It's first_bitp.  I think this
>> > misunderstanding is related to Andreas' complaint, but it's not quite
>> > the same.
>> >
>> 
>> We represent each FS block as a bit in the bitmap. So first_blkp or
>> first_bitp or first_blkbitp essentially means the same. 
>> I went with first_blk, first_blkp in the first place based on your
>> suggestion itself [1].
>
> No, it's not the same!  If you have 1kB blocks in a 64kB page, they're
> numbered 0-63.  If you 'calc_range' for any of the dirty bits, you get
> back a number in the range 64-127.  That's not a block number!  It's
> the number of the bit you want to refer to.  Calling it blkp is going
> to lead to confusion -- as you yourself seem to be confused.
>
>> [1]: https://lore.kernel.org/linux-xfs/Y%2FvxlVUJ31PZYaRa@xxxxxxxxxxxxxxxxxxxx/
>
> Those _were_ block numbers!  off >> inode->i_blkbits calculates a block
> number.  (off >> inode->i_blkbits) + blocks_per_folio() does not calculate
> a block number, it calculates a bit number.
>

Yes, I don't mind changing it to _bit. It is derived out of an FS block
representation only. But I agree with your above argument using _bit in
variable name makes it explicit and clear.

>> >> -	return bitmap_full(ifs->state, i_blocks_per_folio(inode, folio));
>> >> +	return bitmap_full(ifs->state, nr_blks);
>> >
>> > I think we have a gap in our bitmap APIs.  We don't have a
>> > 'bitmap_range_full(src, pos, nbits)'.  We could use find_next_zero_bit(),
>> > but that's going to do more work than necessary.
>> >
>> > Given this lack, perhaps it's time to say that you're making all of
>> > this too hard by using an enum, and pretending that we can switch the
>> > positions of 'uptodate' and 'dirty' in the bitmap just by changing
>> > the enum.
>> 
>> Actually I never wanted to use the the enum this way. That's why I was
>> not fond of the idea behind using enum in all the bitmap state
>> manipulation APIs (test/set/).
>> 
>> It was only intended to be passed as a state argument to ifs_calc_range()
>> function to keep all the first_blkp and nr_blksp calculation at one
>> place. And just use it's IOMAP_ST_MAX value while allocating state bitmap.
>> It was never intended to be used like this.
>> 
>> We can even now go back to this original idea and keep the use of the
>> enum limited to what I just mentioned above i.e. for ifs_calc_range().
>> 
>> And maybe just use this in ifs_alloc()?
>> BUILD_BUG_ON(IOMAP_ST_UPTODATE == 0);
>> BUILD_BUG_ON(IOMAP_ST_DIRTY == 1);
>> 
>> > Define the uptodate bits to be the first ones in the bitmap,
>> > document it (and why), and leave it at that.
>> 
>> Do you think we can go with above suggestion, or do you still think we
>> need to drop it?
>> 
>> In case if we drop it, then should we open code the calculations for
>> first_blk, last_blk? These calculations are done in exact same fashion
>> at 3 places ifs_set_range_uptodate(), ifs_clear_range_dirty() and
>> ifs_set_range_dirty().
>> Thoughts?
>
> I disliked the enum from the moment I saw it, but didn't care enough to
> say so.
>
> Look, an abstraction should have a _purpose_.  The enum doesn't.  I'd
> ditch this calc_range function entirely; it's just not worth it.

I guess enum is creating more confusion with almost everyone than adding value.
So I don't mind ditching it (unless anyone else opposes for keeping it).

Also it would be helpful if you could let me know of any other review
comments on the rest of the patch? Does the rest looks good to you?

-ritesh



[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux