[TOPIC 9/12] Code churn and cleanups

Taylor Blau <me@xxxxxxxxxxxx> · Mon, 2 Oct 2023 11:21:41 -0400



(Presenter: Calvin Wan, Notetaker: Taylor Blau)

* Question: When is refactoring worth the churn? The refactoring may or may not
  contribute to a different goal (e.g. libification). Other factors:
   * Should those refactor series be included with the feature?
   * Should they be split up?
   * Do they make sense as isolated units?
* Some examples: Elijah's cache.h cleanup series, which was obviously good.
  Others of dubious value.
* (Elijah) May have done the cache.h series a year or two earlier, but wasn't
  sure that it was obviously good.
* (Jonathan Tan) First have to define the churn. Two kinds:
   * Having reviewers look at it in the first place, since there are no
     objective user-facing improvements.
   * Causes additional toil in revision history.
* (Jonathan Tan) Let's start with reviewer churn. What constitutes "good" or
  "clean" code is subjective, so authors and reviewers may spend a large amount
  of time debating whether or not the refactoring meets that criteria. Can be
  avoided when the feature is on top in the same series.
   * (Junio) Speaking cynically: the new feature may be taking a subjective
     change or rejection of it hostage.
   * (Calvin) In other words, refactorings are of lower value than features?
   * (Junio) After you implement some features, you may discover opportunities
     for clean-up after the fact.
   * (Jonathan Nieder) In the course of solving a given problem, may come up
     with a lot of different changes that all help. If you generate a long patch
     series, you are over-constraining the maintainer in determining how to slot
     those changes in. Also makes applying to a maintenance branch, rolling back
     particular pieces harder, etc harder.
      * If I make a one-line bug fix and notice "this code was hard to
        understand, here's a refactoring that makes it more obvious", it's often
        more helpful to the project for the one-line bug fix to come first in
        the series and the refactoring to be a followup or later part of the
        series.
   * (Taylor) One thing that helps is motivating a refactoring. Saying "here's
     what this refactoring makes easier".
   * (Martin) What is "refactoring for its own sake"? For example, is removing
     global state something that we want without additional justification?
   * (Emily) Can we split the difference? Can we send cleanup patches with less
     context? With more context? Should we be better about committing to a
     feature and presumptively merging clean-up patches along the way.
   * (Junio) I rarely hear from reviewers the signals that would allow me to do
     this. "I have reviewed this series, and patches 1-4 look ready, I'd be
     happy with those landing and being built on top of".
   * (Emily) Could change our habits to add "LGTMs" part of the way through the
     series.
   * (Jonathan Tan) We often need to add a feature to "sweeten the deal". The
     feature proves that the refactoring is good. Doesn't add to the overall
     value, but makes it cost less to review the refactoring. Perhaps that the
     presence of the feature is proof enough, even if it isn't merged right
     away.
   * (Terry) Sounds like the question is, "what is the value proposition for
     refactoring?" Usually to lower tech debt. Challenge: maybe every
     refactoring should stand on its own?
      * In implementing a feature, I might notice "the object database interface
        is causing problems in this way". Then my cover letter can spell out
        those problems and how the refactoring addresses them.
      * It's hard work to describe why something isn't good, especially in a
        legacy codebase with some tech debt and some old changes missing clear
        commit messages. It's work but I think it's worthwhile. It builds an
        understanding in the community of how that subsystem should function.
   * (Elijah) My series might be an example of that, didn't have a feature
     associated with it. Helped with libification effort, and started with a
     smaller series to illustrate the direction. Guessing that there are certain
     types of refactoring that we already consider to be good.
   * (Jonathan Nieder) Could having a wiki page that lists helpful refactorings
     that would be likely to be accepted on their own?
   * (Jonathan Tan) I'd like to challenge Terry's challenge. It's a laudable
     goal, but a subsequent patch implementing the feature is worth 1,000 words.
   * (Jonathan Nieder) If we want to be doing more refactoring, then we're going
     to have to develop different skills as developers and reviewers. Reviewing
     refactoring is more like reviewing technical writing. Having examples to
     illustrate the idea can help, even if those examples are changes that
     aren't changes we want to make right now to Git.
   * (Terry) Some people are visual learners, some people are auditory learners,
     and so on. Having a change in place on top of a refactoring is worth 1,000
     words. But if you write well, maybe you don't need the latter patch.
   * (Taylor) I think I agree with both these things - I like having the
     documentation and explanation, but I also see Jonathan Tan's point about
     examples being helpful.
   * We should become more comfortable with throwing away work. Suppose I've
     made a refactoring and we decide not to ship the change it was meant to
     support. Is it worth the reviewer's time to take anyway?
      * We need to make the cover letters clearer, make the case for it being
        worth the time.
   * (Calvin) I think I agree with Taylor. To re-describe: our cost is code
     churn and reviewer time. Feature patches show that there is a 100%
     guarantee the preceding changes are worthwhile. There is a discount factor
     when you don't have a feature to illustrate the value. Authors need to be
     more clear when there doesn't exist a feature patch on what the value is.
      * Reviewers can encourage the author to give better examples of how the
        change will pay off.
   * (Glen) Are there things we could do to help newer contributors in this
     regard? Should we have a more opinionated style guide?
      * (Taylor) Separate CodingGuidelines into semantic requirements and more
        subjective "here are kinds of refactorings we like"
   * (Jonathan Nieder) For newer contributors: better/more worked examples of
     how experienced contributors justify their refactoring changes. E.g. "here
     are some series in the past that were harder to review because of the lack
     of this change". If people had examples to emulate, they would be doing it
     more.
   * (Emily) Difficult to synthesize commit messages without examples,
     especially for non-native English speakers, people who aren't great
     writers, etc.
* (Jonathan Tan) The other kind of churn in looking back at history and seeing
  what has happened in the file. One thing I worry about is that there may be
  another feature in the future that forces us to partially or entirely revert
  the refactoring. That reduces the probability of the refactoring being "good"
  in the first place.
* (Terry) Emily's point about inclusivity: that work (writing a persuasive
  essay, emulating examples) is tedious and difficult, it may not be natural to
  everybody. As a project, we should be creating those examples. Reviewers
  should help newer contributors along the way.