Re: [FYI] tux3: Core changes

Jan Kara <jack@xxxxxxx> · Wed, 20 May 2015 16:44:29 +0200



On Tue 19-05-15 13:33:31, David Lang wrote:
> On Tue, 19 May 2015, Daniel Phillips wrote:
> 
> >>I understand that Tux3 may avoid these issues due to some other mechanisms
> >>it internally has but if page forking should get into mm subsystem, the
> >>above must work.
> >
> >It does work, and by example, it does not need a lot of code to make
> >it work, but the changes are not trivial. Tux3's delta writeback model
> >will not suit everyone, so you can't just lift our code and add it to
> >Ext4. Using it in Ext4 would require a per-inode writeback model, which
> >looks practical to me but far from a weekend project. Maybe something
> >to consider for Ext5.
> >
> >It is the job of new designs like Tux3 to chase after that final drop
> >of performance, not our trusty Ext4 workhorse. Though stranger things
> >have happened - as I recall, Ext4 had O(n) directory operations at one
> >time. Fixing that was not easy, but we did it because we had to. Fixing
> >Ext4's write performance is not urgent by comparison, and the barrier
> >is high, you would want jbd3 for one thing.
> >
> >I think the meta-question you are asking is, where is the second user
> >for this new CoW functionality? With a possible implication that if
> >there is no second user then Tux3 cannot be merged. Is that is the
> >question?
> 
> I don't think they are asking for a second user. What they are
> saying is that for this functionality to be accepted in the mm
> subsystem, these problem cases need to work reliably, not just work
> for Tux3 because of your implementation.
> 
> So for things that you don't use, you need to make it an error if
> they get used on a page that's been forked (or not be an error and
> 'do the right thing')
> 
> For cases where it doesn't matter because Tux3 controls the
> writeback, and it's undefined in general what happens if writeback
> is triggered twice on the same page, you will need to figure out how
> to either prevent the second writeback from triggering if there's
> one in process, or define how the two writebacks are going to happen
> so that you can't end up with them re-ordered by some other
> filesystem.
> 
> I think that that's what's meant by the top statement that I left in
> the quote. Even if your implementation details make it safe, these
> need to be safe even without your implementation details to be
> acceptable in the core kernel.
  Yeah, that's what I meant. If you create a function which manipulates
page cache, you better make it work with other functions manipulating page
cache. Otherwise it's a landmine waiting to be tripped by some unsuspecting
developer. Sure you can document all the conditions under which the
function is safe to use but a function that has several paragraphs in front
of it explaning when it is safe to use isn't very good API...

								Honza
-- 
Jan Kara <jack@xxxxxxx>
SUSE Labs, CR
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html