Monorepo changed packages done right

I noticed a problem while building a bot that reports which packages have changed on each merge request. Consider the following dependency graph:

Change package-c -> test package-{a,c}

Change package-e -> test package-{a,b,c,e}

Change package-d -> test package-{a,d}

You get the point: we only want to test the packages that are affected by a change. Getting the list of changed packages is quite straightforward:

  1. Git diff between 2 commits to get the files changed
  2. Compile a list of packages that are directly affected by the files changed
  3. Compile a list of packages that are indirectly affected by the packages directly affected
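As a rough sketch, steps 2 and 3 could look like this in Python (step 1 only supplies the list of changed file paths). The DEPENDS_ON graph is inferred from the examples above rather than copied from the real repository, and the packages/<name>/... layout and the function names are assumptions made for illustration.

```python
from collections import deque

# Assumed dependency graph, inferred from the examples above:
# each key depends on the packages listed in its value.
DEPENDS_ON = {
    "package-a": {"package-b", "package-c", "package-d"},
    "package-b": {"package-e"},
    "package-c": {"package-e"},
    "package-d": {"package-f"},
    "package-e": set(),
    "package-f": set(),
}


def directly_changed(files, package_root="packages"):
    """Step 2: map changed file paths to the packages that own them.

    Assumes a hypothetical layout of <package_root>/<package-name>/<file...>.
    Files outside that layout are ignored.
    """
    changed = set()
    for path in files:
        parts = path.split("/")
        if len(parts) >= 3 and parts[0] == package_root:
            changed.add(parts[1])
    return changed


def affected_packages(changed, depends_on):
    """Step 3: add every package that transitively depends on a changed one."""
    # Invert the graph so we can walk from a package to its dependents.
    dependents = {name: set() for name in depends_on}
    for name, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(name)

    # Breadth-first walk over dependents, starting from the changed packages.
    affected = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for parent in dependents.get(current, ()):
            if parent not in affected:
                affected.add(parent)
                queue.append(parent)
    return affected
```

With this graph, affected_packages({"package-e"}, DEPENDS_ON) gives package-{a,b,c,e}, matching the second example above.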

Now onto the actual problem. An example:

  • feature-branch-1 branches off at ref1 and is only merged after feature-branch-2 (for no specific reason)
  • feature-branch-1 only touches package-b, so the changed packages for this should be package-{a,b}
  • feature-branch-2 touches package-f, which means the changed packages for this one should be package-{a,d,f}

To my surprise, after running this bot in the wild for a while, I noticed some weird behaviour. Without any code changes, just running the feature-branch-1 pipeline before and after feature-branch-2 was merged, the second run tested a lot more than what feature-branch-1 touched (package-{a,b,d,f}).

How did this happen? If my code hasn’t changed, and I definitely haven’t touched package-d, how come the bot is telling me I’ve touched it?

The problem was essentially step #1. I was comparing against origin/main, which in this case is the wrong thing to do. The ref I want to compare against in a feature-branch-* pipeline is the one I branched off from, in this case:

  • feature-branch-1 -> ref1
  • feature-branch-2 -> ref2

This way, even if feature-branch-2 is merged, feature-branch-1 won’t know about it until it is updated. And if we enforce up-to-date branches before merging back to main, then everything still gets tested, with no unnecessary testing cycles.
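To make the difference concrete, here is a toy model where a commit is just a mapping of file path to content. It is only an illustration: the paths and the diff_paths helper are made up, and real git diffs trees rather than dictionaries.

```python
def diff_paths(old, new):
    """Return the sorted paths whose content differs between two snapshots."""
    return sorted(p for p in old.keys() | new.keys() if old.get(p) != new.get(p))


# Hypothetical state of the repository at ref1.
ref1 = {
    "packages/package-b/index.js": "v1",
    "packages/package-f/index.js": "v1",
}

# feature-branch-1 branches off at ref1 and only edits package-b.
feature_branch_1 = {**ref1, "packages/package-b/index.js": "v2"}

# Meanwhile feature-branch-2 is merged, so main now has a new package-f.
main_after_merge = {**ref1, "packages/package-f/index.js": "v2"}

# Comparing against the tip of main also picks up the package-f change.
against_main_tip = diff_paths(main_after_merge, feature_branch_1)

# Comparing against the fork point only shows what the branch touched.
against_fork_point = diff_paths(ref1, feature_branch_1)

print(against_main_tip)    # ['packages/package-b/index.js', 'packages/package-f/index.js']
print(against_fork_point)  # ['packages/package-b/index.js']
```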

So the solution is not to compare against origin/main, but against the output of git merge-base --fork-point origin/main, which is essentially saying “give me the ref where I branched off origin/main”.
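Wired into step 1, that could look roughly like the sketch below. The function names and the injectable run parameter are my own assumptions, added so the logic can be exercised without a real repository. The fallback to a plain git merge-base is also an addition: --fork-point relies on the reflog of origin/main and exits with a non-zero status when it finds nothing, which may happen on a fresh CI clone.

```python
import subprocess


def git(*args, run=subprocess.run):
    """Run a git command and return its trimmed stdout."""
    result = run(["git", *args], capture_output=True, text=True, check=True)
    return result.stdout.strip()


def fork_point(upstream="origin/main", head="HEAD", run=subprocess.run):
    """Return the commit where `head` branched off `upstream`."""
    try:
        return git("merge-base", "--fork-point", upstream, head, run=run)
    except subprocess.CalledProcessError:
        # Assumed fallback: no fork point found via the reflog,
        # so use the ordinary merge base instead.
        return git("merge-base", upstream, head, run=run)


def changed_files(upstream="origin/main", head="HEAD", run=subprocess.run):
    """Step 1: list the files changed between the fork point and `head`."""
    base = fork_point(upstream, head, run=run)
    output = git("diff", "--name-only", base, head, run=run)
    return [line for line in output.splitlines() if line]
```

Calling changed_files() inside a feature branch checkout should then return only the paths that branch touched, which can be fed into the package mapping from steps 2 and 3.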

The solution assumes you’re running a monorepo and that you enforce up-to-date merge requests before merging. If you don’t enforce this, then you need to run the full test suite on the main branch. Otherwise, changes that were only ever tested separately meet on main untested, and you’re leaving yourself very vulnerable.