Re: [PATCHv10 8/8] iomap: Add per-block dirty state tracking to improve performance

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Jun 19, 2023 at 10:59:09PM +0530, Ritesh Harjani wrote:
> Matthew Wilcox <willy@xxxxxxxxxxxxx> writes:
> 
> > On Mon, Jun 19, 2023 at 09:55:53PM +0530, Ritesh Harjani wrote:
> >> Matthew Wilcox <willy@xxxxxxxxxxxxx> writes:
> >> 
> >> > On Mon, Jun 19, 2023 at 07:58:51AM +0530, Ritesh Harjani (IBM) wrote:
> >> >> +static void ifs_calc_range(struct folio *folio, size_t off, size_t len,
> >> >> +		enum iomap_block_state state, unsigned int *first_blkp,
> >> >> +		unsigned int *nr_blksp)
> >> >> +{
> >> >> +	struct inode *inode = folio->mapping->host;
> >> >> +	unsigned int blks_per_folio = i_blocks_per_folio(inode, folio);
> >> >> +	unsigned int first = off >> inode->i_blkbits;
> >> >> +	unsigned int last = (off + len - 1) >> inode->i_blkbits;
> >> >> +
> >> >> +	*first_blkp = first + (state * blks_per_folio);
> >> >> +	*nr_blksp = last - first + 1;
> >> >> +}
> >> >
> >> > As I said, this is not 'first_blkp'.  It's first_bitp.  I think this
> >> > misunderstanding is related to Andreas' complaint, but it's not quite
> >> > the same.
> >> >
> >> 
> >> We represent each FS block as a bit in the bitmap. So first_blkp or
> >> first_bitp or first_blkbitp essentially means the same. 
> >> I went with first_blk, first_blkp in the first place based on your
> >> suggestion itself [1].
> >
> > No, it's not the same!  If you have 1kB blocks in a 64kB page, they're
> > numbered 0-63.  If you 'calc_range' for any of the dirty bits, you get
> > back a number in the range 64-127.  That's not a block number!  It's
> > the number of the bit you want to refer to.  Calling it blkp is going
> > to lead to confusion -- as you yourself seem to be confused.
> >
> >> [1]: https://lore.kernel.org/linux-xfs/Y%2FvxlVUJ31PZYaRa@xxxxxxxxxxxxxxxxxxxx/
> >
> > Those _were_ block numbers!  off >> inode->i_blkbits calculates a block
> > number.  (off >> inode->i_blkbits) + blocks_per_folio() does not calculate
> > a block number, it calculates a bit number.
> >
> 
> Yes, I don't mind changing it to _bit. It is derived out of an FS block
> representation only. But I agree with your above argument using _bit in
> variable name makes it explicit and clear.
> 
> >> >> -	return bitmap_full(ifs->state, i_blocks_per_folio(inode, folio));
> >> >> +	return bitmap_full(ifs->state, nr_blks);
> >> >
> >> > I think we have a gap in our bitmap APIs.  We don't have a
> >> > 'bitmap_range_full(src, pos, nbits)'.  We could use find_next_zero_bit(),
> >> > but that's going to do more work than necessary.
> >> >
> >> > Given this lack, perhaps it's time to say that you're making all of
> >> > this too hard by using an enum, and pretending that we can switch the
> >> > positions of 'uptodate' and 'dirty' in the bitmap just by changing
> >> > the enum.
> >> 
> >> Actually I never wanted to use the the enum this way. That's why I was
> >> not fond of the idea behind using enum in all the bitmap state
> >> manipulation APIs (test/set/).
> >> 
> >> It was only intended to be passed as a state argument to ifs_calc_range()
> >> function to keep all the first_blkp and nr_blksp calculation at one
> >> place. And just use it's IOMAP_ST_MAX value while allocating state bitmap.
> >> It was never intended to be used like this.
> >> 
> >> We can even now go back to this original idea and keep the use of the
> >> enum limited to what I just mentioned above i.e. for ifs_calc_range().
> >> 
> >> And maybe just use this in ifs_alloc()?
> >> BUILD_BUG_ON(IOMAP_ST_UPTODATE == 0);
> >> BUILD_BUG_ON(IOMAP_ST_DIRTY == 1);
> >> 
> >> > Define the uptodate bits to be the first ones in the bitmap,
> >> > document it (and why), and leave it at that.
> >> 
> >> Do you think we can go with above suggestion, or do you still think we
> >> need to drop it?
> >> 
> >> In case if we drop it, then should we open code the calculations for
> >> first_blk, last_blk? These calculations are done in exact same fashion
> >> at 3 places ifs_set_range_uptodate(), ifs_clear_range_dirty() and
> >> ifs_set_range_dirty().
> >> Thoughts?
> >
> > I disliked the enum from the moment I saw it, but didn't care enough to
> > say so.
> >
> > Look, an abstraction should have a _purpose_.  The enum doesn't.  I'd
> > ditch this calc_range function entirely; it's just not worth it.
> 
> I guess enum is creating more confusion with almost everyone than adding value.
> So I don't mind ditching it (unless anyone else opposes for keeping it).
> 
> Also it would be helpful if you could let me know of any other review
> comments on the rest of the patch? Does the rest looks good to you?

I deleted my entire angry rant about how this review has turned a
fairly simple design change into a big mess that even the reviewers
don't understand anymore.  I'm on vacation, I DGAF anymore.

Ritesh: Dump the enum; "because btrfs does it" is not sufficient
justification.  The rest is good enough, I'll put it in iomap-for-next
along with willy's thing as soon as 6.5-rc1 closes, and if you all have
further complaints, send your own patches.

--D

> -ritesh



[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux