Before starting to work on Spring Data I hardly ever used git rebase. In the teams I had worked with before, everybody was happy when a commit message included a useful comment and the id of the issue. Some even considered that too much. And no one really worried about the benefits of merge vs rebase. Therefore working on the Spring Data repositories was a serious change for me, since heavy rebasing is an integral part of the workflow. But after getting used to it I really like the approach, and I think many teams might benefit from something similar. So a blog post is in order:

When working on an issue the first thing to do is to ensure that there actually is an issue in Jira. You assign it to yourself and hit “Start progress”. This ensures some transparency in the team about who is working on what and avoids multiple people working on the same issue (without knowing it). I guess that is pretty standard for most teams.

The next step is to create a feature branch and a first “Prepare branch commit.” which changes the version in the pom.xml to contain the issue number. This is important since the CI build actually installs the artifacts, and if it does that with the version of the master branch funny things happen to downstream projects. They are not amused about that kind of stuff. I tried. Once. :-/
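To make the version change concrete, here is a minimal sketch of how such a branch version could be derived. The function name `branch_version`, the example issue key and the exact naming scheme (replacing the qualifier in front of `-SNAPSHOT` with the issue key) are assumptions for illustration; the only thing that matters is that the version contains the issue number.

```python
import re


def branch_version(pom_version: str, issue_key: str) -> str:
    """Return a snapshot version that embeds the issue key.

    Assumed scheme: "<numbers>.<qualifier>-SNAPSHOT" becomes
    "<numbers>.<issue key>-SNAPSHOT".
    """
    match = re.fullmatch(r"(\d+(?:\.\d+)*)\.[A-Za-z]+-SNAPSHOT", pom_version)
    if match is None:
        raise ValueError(f"unexpected snapshot version: {pom_version!r}")
    return f"{match.group(1)}.{issue_key}-SNAPSHOT"


# Hypothetical values, just to show the shape of the result.
print(branch_version("2.1.0.BUILD-SNAPSHOT", "DATAJPA-1234"))
```

Because the artifacts installed by the CI build carry the issue key in their version, they cannot be mistaken for the artifacts of the master branch.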

You then start working on the issue and basically commit in whatever rhythm you like. It happens mostly on your local machine, and even when you decide to push your branch to GitHub in order to let the CI build have a go or to discuss stuff with a coworker, the commit structure isn’t very relevant. For me this means I produce at least one commit an hour, and often one every few minutes. Basically I commit when either something works (a test or the code that makes the test green) or before I start something that I might want to roll back without rolling back what I have just done.

Once the feature is complete I push my branch to GitHub as a simple form of backup. Just in case I mess something up in the steps that follow.

Next I squash all commits except the prepare branch commit and write a proper commit message. So now there are exactly two commits.
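One way to do the squash is a soft reset to the prepare branch commit followed by a fresh commit; an interactive rebase works just as well. The sketch below only builds the git commands. The helper name `squash_commands` and the example values are made up for illustration, and it assumes a clean working tree.

```python
from typing import List


def squash_commands(prepare_commit: str, message: str) -> List[List[str]]:
    """Commands that collapse everything after `prepare_commit` into one commit."""
    return [
        # Move the branch back to the prepare commit, but keep all later
        # changes staged in the index.
        ["git", "reset", "--soft", prepare_commit],
        # Record the staged changes as a single commit with a proper message.
        ["git", "commit", "-m", message],
    ]


# Hypothetical commit id and message.
for command in squash_commands("abc1234", "DATAJPA-1234 - Describe the change."):
    print(" ".join(command))
```

To actually execute the commands, each entry can be passed to `subprocess.run(command, check=True)` inside the repository.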

Then I go through every single file of the commit and make sure it is nice and squeaky clean: Formatting as desired. All methods small and with meaningful names. Visibility as small as possible. Appropriate documentation. All new tests tagged with the issue number. This kind of stuff.

All changes from this get amended to the original commit. If there are changes in the commit that are purely cosmetic, like removing/adding blank lines or changes to JavaDoc, these get extracted into a separate commit. If the master branch moved forward in the meantime I also rebase everything on the current master. Once I’m satisfied I force push the result to GitHub, open a PR and assign it to a suitable coworker. Yes, it does feel weird to force push onto a heavily used public repository on GitHub, but remember that in 99% of the cases only one person works on a feature branch. And if anybody else is working on the same branch this is coordinated with you and is almost certainly done before you start polishing, rebasing and force pushing.
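The rebase and force push could be scripted roughly like this. The helper name `publish_commands`, the remote name `origin` and the example branch name are assumptions. Note that the sketch uses `--force-with-lease` instead of a plain force push; that is a more cautious variant I swapped in here, which refuses to overwrite the remote branch if somebody else pushed to it in the meantime.

```python
from typing import List


def publish_commands(branch: str, remote: str = "origin", base: str = "master") -> List[List[str]]:
    """Commands to rebase the checked-out feature branch and publish it."""
    return [
        # Get the current state of the remote, including the base branch.
        ["git", "fetch", remote],
        # Replay the commits of the feature branch on top of the current base.
        ["git", "rebase", f"{remote}/{base}"],
        # Overwrite the remote feature branch with the rewritten history.
        ["git", "push", "--force-with-lease", remote, branch],
    ]


# Hypothetical branch name.
for command in publish_commands("issue/DATAJPA-1234"):
    print(" ".join(command))
```

This assumes the feature branch is currently checked out; the commands are only printed here, not executed.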

As part of the review often another polishing commit is added (I will probably perform a spontaneous dance if I ever manage to get a PR through without any polishing).

And finally the commits except the Prepare branch commit get cherry-picked onto master (and possibly other branches for older versions if applicable), the PR is closed and the issue resolved.
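A minimal sketch of that last step, again only building the commands. The helper name `integrate_commands` and the example values are hypothetical. The range `<prepare commit>..<branch>` selects every commit on the branch after the prepare commit, so the version change itself never reaches master. After a rebase the prepare commit has a new id, so the current one has to be used.

```python
from typing import List


def integrate_commands(prepare_commit: str, branch: str, target: str = "master") -> List[List[str]]:
    """Commands to bring the feature commits, minus the prepare commit, onto `target`."""
    return [
        # Switch to the branch that should receive the feature.
        ["git", "checkout", target],
        # Apply all commits that come after the prepare commit on the branch.
        ["git", "cherry-pick", f"{prepare_commit}..{branch}"],
    ]


# Hypothetical commit id and branch name.
for command in integrate_commands("abc1234", "issue/DATAJPA-1234"):
    print(" ".join(command))
```

For a maintenance branch the same function can be called again with a different `target`.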

This process has a couple of effects:

  1. History on master looks really nice and is easy to read.

  2. I spend some serious time on polishing code and commits. Obviously it took even longer the first few times and I expect to become more efficient with it in the future, but I suspect it will always take some serious time.

  3. You lose the history of all the small steps involved.

The first point is obviously a win I think. If we find a regression later it is very easy to pin it to one change. This can be much more difficult when you have lots of parallel branches and merges. Did a regression occur in branch A, B or just in the merge commit? Basically the process replaces the search afterwards with more work during rebasing. And really I don’t think it makes much of a difference if a regression was introduced during implementing a feature or when rebasing/merging it with master.
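To illustrate why a linear history makes this easy: finding the offending change is just a binary search over the list of commits, which is what `git bisect` does for you. The sketch below models that idea in plain Python. The function name and the commit ids are made up, and it assumes that the newest commit shows the regression and that every commit after the first bad one is bad as well.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Binary search over a linear history, oldest commit first."""
    if not commits:
        raise ValueError("history must not be empty")
    low, high = 0, len(commits) - 1  # commits[high] is assumed to be bad
    while low < high:
        mid = (low + high) // 2
        if is_bad(commits[mid]):
            high = mid      # the regression was introduced at mid or earlier
        else:
            low = mid + 1   # the regression was introduced after mid
    return commits[low]


# Hypothetical history: the regression came in with commit "d".
history = ["a", "b", "c", "d", "e"]
print(first_bad_commit(history, lambda commit: commit in {"d", "e"}))
```

With one commit per feature the answer points directly at the change that caused the regression, and the number of checks grows only with the logarithm of the number of commits.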

I don’t think number 3 is a big issue. The commits I do in the middle of a feature are really quick things. Sometimes the code doesn’t even compile, let alone pass all the tests. And since I strictly follow TDD most of the time, almost every newly introduced line gets changed three times before it reaches the final version for the feature. After a couple of days these commits really lose their value. I also don’t want to spend much time on them because they are just fallback points for myself.

Steven Schwenke has a different view on these things. He switched with his team from an approach similar to ours to one without squashing.

There are two concerns with the Spring Data or similar approaches I’d like to address.

Is it really continuous integration when you work with feature branches?

Well, I guess strictly speaking it isn’t. But I see this as a problem only when many feature branches exist on a single project for a long time. If you have this situation I’d expect a more fruitful discussion would be whether you can separate the project into more independent modules and what you can do to make your features smaller, so your branches don’t live so long.

The real dirty history of how features got developed in parallel gets lost, including the tiny commit that introduced a bug

I already described it earlier: That tiny commit is probably useless, because while it might have introduced a bug, the code probably already looks different now. And instead of looking at a single somewhat larger commit you now look at many tiny commits that just make things more complex. So in the end life gets more difficult when you don’t squash your commits.

Anyway, that’s how we work.

How does your team work?

And how do you like it?