16 August 2024

Less is more: Embrace the power of small pull requests

Shady Botros

Does this sound familiar? You come back from lunch and see that you were asked to review a pull request. You think to yourself, I'm gonna review this PR before working on my stuff to unblock my teammate. You open it and find a 1000-line PR sitting there, waiting for you 😱 And you just know you're gonna spend at least half an hour wrapping your head around it before you can review it properly. It doesn't stop there either, because the first review round reveals new findings and turns out to be the first of several rounds for this PR. You're tempted to just approve at this point, but your conscience doesn't let you, so you power through and keep reviewing until the PR looks good and you approve it. You wanna work on your stuff now, but you can't help but wonder, is there a better way? The answer is 'Yes!' - and it comes in the form of small pull requests, a game changer for your team's collaboration and productivity 🚀

After working in a couple of different teams, I noticed that big PRs are not as rare as I'd prefer. I also noticed that some questions and points of discussion around this topic keep coming up. That's why I decided to write down my 2 cents, so that I have a reference I can come back to and share with others.

Are you skeptical about small PRs? Do you often find yourself creating big PRs that take days to review? If you answered either of those questions with 'yes', then this blog post is for you.

What we'll cover:

What is a big PR?
Why are small PRs better?
Some blockers and pitfalls when creating small PRs
- The 1 story = 1 PR myth
- The delayed breakdown inaccuracy
tl;dr
Now what?

What is a big PR?

A big PR is complex and takes a long time to review. After reviewing a big PR, people usually suffer from a mild headache. Long-term exposure to big PRs might result in chronic review fatigue, which usually manifests in review requests being ignored or constantly approved without a single comment.

It all boils down to cognitive load. A higher cognitive load means that reviewers need to spend more time to get into the context and do a proper code review. And this load comes from how complex the PR is. A PR can be complex because there are just too many lines of code changed, too many files touched, too much new logic introduced, or just too much happening, causing reviewers to initially ignore the review request because "this is gonna take a while and I don't have time for this right now".

If the above feels like a long and vague way of saying "it depends" or "it's subjective", that's because it does, and it is. If you need a hard rule, here's a suggestion: If the reviewer says it's a big PR, then it's a big PR.
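
If you want a rough signal to go along with the reviewer's verdict, the size-related factors above (lines changed, files touched) can be counted straight from a diff. Here's a minimal Python sketch, assuming a git-style unified diff as input. The names `diff_stats` and `looks_big`, and especially the thresholds, are made up for illustration; they are not a standard, and they say nothing about how much new logic a PR introduces.

```python
def diff_stats(diff_text: str) -> dict:
    """Count files touched and lines added/removed in a git-style unified diff."""
    files = set()
    added = removed = 0
    in_hunk = False  # True once we are past a file's headers, inside an @@ hunk

    for line in diff_text.splitlines():
        if line.startswith("diff "):
            # A new file section begins; its header lines follow.
            in_hunk = False
        elif line.startswith("@@"):
            in_hunk = True
        elif not in_hunk:
            # Header lines such as "--- a/path" and "+++ b/path".
            if line.startswith(("--- ", "+++ ")):
                path = line[4:].strip()
                if path != "/dev/null":
                    if path.startswith(("a/", "b/")):
                        path = path[2:]
                    files.add(path)
        elif line.startswith("+"):
            added += 1
        elif line.startswith("-"):
            removed += 1

    return {
        "files_touched": len(files),
        "lines_added": added,
        "lines_removed": removed,
    }


def looks_big(stats: dict, max_lines: int = 200, max_files: int = 10) -> bool:
    """Hypothetical thresholds; tune them (or drop them) for your own team."""
    lines_changed = stats["lines_added"] + stats["lines_removed"]
    return lines_changed > max_lines or stats["files_touched"] > max_files
```

You could feed it the output of something like `git diff main...HEAD` (assuming your base branch is called `main`). Treat the result as a conversation starter only: if the reviewer still says it's a big PR, it's a big PR.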

Why are small PRs better?

By contrast, and going by the definition above, a small PR has low complexity and puts a low cognitive load on reviewers. This leads to many benefits, and they all revolve around one main point: We optimize for learning about the validity of our approach.

🚀 short review cycle
Because who doesn't have 10 minutes or less to review a small code change?

👀 early feedback
Breaking down your changes into small PRs gives you early feedback from reviewers. When a PR is approved, this can give you higher confidence in your overall solution. When you receive comments or when changes are requested, this can spark new ideas for your implementation that you didn't think of, in which case you'd only need to rewrite a small portion of your code because the whole change was small to begin with.

🔄 short integration cycle
Merging your change and seeing it in action, whether in a staging environment or production, leads to valuable insights. If you see the results that you expect, that's a valuable insight because you know it works. If something breaks, that's also a valuable insight and you'd probably be thankful that you can look for the root cause in a 100-line PR and not a 1000-line one.

🔎 high-quality review
No smile-and-wave "LGTM" approvals; if the change is small, people tend to review it more closely and discuss aspects of the code that they wouldn't otherwise discuss. Even you, as the PR author, are more likely to catch a bug when you self-review if the PR is small.

✨ more dopamine
Every merged PR gives you a dopamine hit. It just feels good when you finish something, no matter how small.

Dopamine aside, you can hopefully see how small PRs help us optimize for learning about the validity of our approach. We gather learnings faster thanks to fast reviews, early feedback from reviewers, and early feedback from integrating our changes into the system and seeing them in action. We also increase the quality of our learnings because the code reviews are usually of a higher quality for small PRs, but also because we can see the effect of each small code change on the system's behavior; we can't do that if we deploy one big PR with all code changes in one go.

Some blockers and pitfalls when creating small PRs

Switching to small PRs takes some getting used to. We need to get into the habit of breaking down our solution early on, but we also need to shift our mindset a bit when it comes to this topic.

The 1 story = 1 PR myth

When a reviewer says that a PR is too big, one classic answer that I've seen quite often is, "Yes, because it's one story" (story, ticket, task, etc.). I'm not sure where this comes from, but it seems like there's an unwritten rule that we need to implement a user story in a single pull request.

I don't see many benefits from adopting this rule. In the best case, the story is very simple and can be done in a small PR. In the worst case, it results in a huge PR. The way I see it, we can, and probably should, break down our solution into smaller steps whenever needed. Of course, there's an overhead from breaking it down, but in most cases, I've found that the benefits of small PRs mentioned above outweigh the overhead. And the more we train our small-PRs muscle, the better we get at breaking down our solutions, and the smaller that overhead becomes.

The delayed breakdown inaccuracy

...or "implement everything and then break it down".

This is a phenomenon that I've observed a couple of times. Here's how it usually goes: A teammate creates a big PR; I ask them if they can split it into smaller PRs; they ask what is wrong with big PRs; I share the points that I mentioned above; if I'm lucky, they're convinced and they break it down. But now comes the interesting part: The next time they work on a task, they create a small PR, but I notice that it takes them a relatively long time to open that first PR. I then learn that they first code the whole solution locally and only then break it down and create small PRs out of it. If this were a chess game, this is what might be called an inaccuracy: It's not necessarily a bad move, but it's also not the best move. Why? Because you either reduce or lose altogether the benefits of small PRs mentioned above.

Let's look at each benefit of small PRs and see how they might be affected when we implement the whole solution first and only break it down at the end.

🛩 longer review cycle
If you create a bunch of stacked PRs all at once, even if they're small, chances are you will only get reviews for the first couple of PRs. That's because it takes a long time to review all of them in one shot. This results in a longer review cycle.

👻 late feedback
You completely lose the benefit of early feedback. If reviewers request changes that affect the core approach of your solution, you'll likely have to rewrite a large part of the code.

🌀 longer integration cycle
If you merge your first PR and realize that it breaks an unexpected part of the system, forcing you to change the core approach, you'll likely have to rewrite a large part of your solution. At least if something breaks, it's still easier to find the root cause than with a big PR because you'd be merging one small change at a time. But still, overall this results in a longer integration cycle.

🔎 lower-quality review
The benefit of high-quality reviews might still hold here because the solution is split into small PRs, although if someone has to review many stacked PRs all at once, they might feel pressured into not reviewing each PR as well as they would if they were only reviewing a single small PR.

✨ delayed dopamine
Last but not least, your PR-induced dopamine intake, although still divided into small hits, is delayed while you build the whole solution. This might seem like an unimportant point, but I personally don't feel like I've achieved something when I'm writing code for days on end without merging any of my changes.

Overall, both the speed and quality of the learnings we gain from small PRs take a serious hit when we implement the whole solution first and only break it down at the end. We learn faster and better if we break it down early on and create one small PR at a time as we build our solution. While the first PR is being reviewed, we can start working on the next one.
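
To put rough numbers on the late-feedback point, here's a back-of-the-envelope Python sketch. It assumes a solution that splits into equally sized pieces, a fixed review time, and reviewers who start as soon as a PR is opened. The function name and the numbers are hypothetical and only there to show the shape of the argument.

```python
def first_feedback_day(pieces: int, days_per_piece: float,
                       review_days: float, incremental: bool) -> float:
    """Day on which the first review feedback arrives (toy model)."""
    # Incremental: open a PR right after finishing the first piece.
    # Delayed breakdown: build all pieces locally, then open the PRs.
    pieces_before_first_pr = 1 if incremental else pieces
    return pieces_before_first_pr * days_per_piece + review_days


if __name__ == "__main__":
    # Hypothetical task: 5 pieces, 1 day each, half a day per review.
    early = first_feedback_day(5, 1.0, 0.5, incremental=True)
    late = first_feedback_day(5, 1.0, 0.5, incremental=False)
    print(f"incremental: first feedback on day {early}")
    print(f"delayed breakdown: first feedback on day {late}")
```

With these made-up numbers, the first feedback arrives on day 1.5 instead of day 5.5, and in the delayed case all five pieces are already written by the time a reviewer questions the approach.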

tl;dr

Small pull requests receive better and faster reviews because they are less complex. We get early feedback on our solution, not only from reviewers but also from the system when we merge our changes and see them in action. This way, we optimize for the speed and quality of our learning to validate our approach. We gain the most when we plan and break down our solution early on and create small PRs as we build it.

Now what?

I hope I was able to convince you of the benefits of small PRs and how they help us optimize for learning about the validity of our approach. If you're now wondering how to create small PRs, stay tuned for the next blog post, where I plan to share some strategies that I've found to work well.

If you found this blog post helpful, feel free to share it with anyone who could benefit from it. I'm also always happy to receive feedback on my writing. If you'd like to share some feedback, the best way to do that is by sending me a message on LinkedIn.