Detecting dodgy Git practices

Posted July 2016

First, let me tell you a story…

Once upon a time there was a start-up (in London). Developers loved their BDD, TDD, pairing and peer review… Maybe a bit too much, even. They used GitHub PRs from branches because they’re less hassle and more generic than forks. Anyway, then more developers joined, and deadlines started coming as fast as the pivots were happening.

And actually, not all of them loved having their code reviewed, maybe because they were Ninja Rock-Star 10x devs, or something.

But more and more often, people would pull master and find it had magically changed despite no PRs being merged. Rebasing became more fun. Tickets didn’t get auto-updated. CI pushed things straight to environments before reviews…

Definition

Along the way, there was even a term coined for this…

london-style
adj.
A manner of committing directly to a source code repository’s main branch without review.
vb.
To commit in the London Style, directly to master without too many concerns for impact.
e.g. “Yeah I London-styled that config update cos nobody was around”

So what now…

As a quick(ish) experiment, and to play more with my new best friend Haskell, I set about seeing if we could detect this behaviour. In theory it’s simple - just find all the commits on master that haven’t been merged in… right?

The development

Libraries

As with any tooling, it’s all about the libraries. After some investigation and a few failed experiments, I decided on

  • a libgit wrapper, for higher-level Git access (as in: less dealing with C internals), and
  • the optparse-applicative library, now my go-to option parser for Haskell. It keeps the (simple) CLI applicative and type-safe, and generally stays out of the way.
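
To give a flavour of the applicative style, here is a minimal sketch of what such a CLI could look like with optparse-applicative. The `Options` record, the `REPO` argument and the `--exclude-email` flag are made up for illustration; they are assumptions, not necessarily what the tool actually accepts.

```haskell
import Control.Applicative (many, (<**>))
import Options.Applicative

-- Hypothetical options; the real tool's flags may well differ.
data Options = Options
  { repoPath       :: FilePath  -- local Git repository to inspect
  , excludedEmails :: [String]  -- emails to ignore, e.g. commit bots
  } deriving (Show, Eq)

-- Each field gets its own small parser; <$> and <*> glue them together.
optionsParser :: Parser Options
optionsParser = Options
  <$> strArgument
        ( metavar "REPO"
       <> value "."
       <> showDefault
       <> help "Path to a local Git repository" )
  <*> many (strOption
        ( long "exclude-email"
       <> metavar "EMAIL"
       <> help "Ignore commits from this email (repeatable)" ))

-- Wrap the parser with --help support and a short description.
cliInfo :: ParserInfo Options
cliInfo = info (optionsParser <**> helper)
  ( fullDesc
 <> progDesc "List commits made directly to the main branch" )

main :: IO ()
main = execParser cliInfo >>= print
```

The parser is an ordinary value, so it can be exercised without touching the real command line, for example via `execParserPure defaultPrefs cliInfo ["--exclude-email", "bot@example.com", "some/repo"]`.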

Meet the tool

London Calling (erm, pending a better name), in its own words, detects commits directly to master on a single local Git codebase. More specifically, it finds and outputs tab-separated summaries of all the commits on HEAD that meet none of these criteria:

  • merged from another branch
  • in the excluded email list (e.g. commit bots)
  • committed by someone other than the author of the changeset (as per squashed branch merges).
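
The three criteria above boil down to a small pure filter. What follows is only a sketch over a simplified, hypothetical `Commit` record (the real tool reads its data through the Git library), and it assumes the exclusion list is matched against the author's email.

```haskell
import Data.List (intercalate)

-- Hypothetical, simplified view of a commit, for illustration only.
data Commit = Commit
  { commitId       :: String
  , parentCount    :: Int     -- two or more parents means a merge commit
  , authorEmail    :: String
  , committerEmail :: String
  , subject        :: String
  } deriving (Show, Eq)

-- A commit is flagged only when none of the three exemptions apply.
isDirectCommit :: [String] -> Commit -> Bool
isDirectCommit excluded c = not (isMerge || isExcluded || isSquash)
  where
    isMerge    = parentCount c > 1                 -- merged from another branch
    isExcluded = authorEmail c `elem` excluded     -- assumption: match on author
    isSquash   = committerEmail c /= authorEmail c -- committer is not the author

-- One tab-separated summary line per commit.
summarise :: Commit -> String
summarise c = intercalate "\t" [commitId c, authorEmail c, subject c]

report :: [String] -> [Commit] -> [String]
report excluded = map summarise . filter (isDirectCommit excluded)

-- Made-up sample history covering each case.
sampleCommits :: [Commit]
sampleCommits =
  [ Commit "c1" 2 "dev@example.com"    "dev@example.com"    "Merge feature branch"
  , Commit "c2" 1 "ci-bot@example.com" "ci-bot@example.com" "Bump version"
  , Commit "c3" 1 "dev@example.com"    "dev@example.com"    "Tweak config"
  , Commit "c4" 1 "dev@example.com"    "noreply@github.com" "Squashed PR"
  ]

main :: IO ()
main = mapM_ putStrLn (report ["ci-bot@example.com"] sampleCommits)
```

With this sample data only `c3` is reported: `c1` is a merge, `c2` comes from an excluded email, and `c4` was committed by someone other than its author.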

To be continued…

With a little more spare time, the detection of self-merged PRs would also be a big win. Whilst self-merging can be prevented by config on larger projects, smaller teams often don’t enable that, as it can become a bottleneck. Note that the options have changed in GitHub recently - squashing PRs (to a single commit) is now a UI option.