Sunday 5 July 2009

Breaking up work for review

It was Friday morning, after three days of working on one feature. Last thing Thursday I counted the size of the change: it was over 1100 lines, and I wasn't quite finished. I found myself in a situation that turns up periodically: I wanted to break up my work into cohesive, reviewable chunks. It wasn't a matter of taking commits x through y as chunk one, and so on, because the change had grown organically. I changed what needed to be changed and wrote what needed to be written, without really stopping to think about the size of the change until it was done. Now that it was done, I wanted to break it up.

Last time I did this, I used looms, but Aaron told me we could do it easily using his new Bazaar pipeline plugin. So I spent some time talking through how to do it with Aaron, promising to write it up if it worked well. I must say that it was good. During the process we identified a number of enhancements to the plugin to make it even easier.

I'm going to show the progression we made, along with our thoughts. I have trimmed some of the output when I've decided that it doesn't add value.

The first thing I had to do was to get the pipeline plugin.

$ bzr branch lp:bzr-pipeline ~/.bazaar/plugins/pipeline


Unfortunately this seemed to clash slightly with the QBzr plugin. They were both trying to redefine merge. Personally I don't use QBzr and had probably just installed it to take a look, so I removed that plugin.

Caution: the pipeline plugin relies on switch, so it works with lightweight checkouts. This is how I work anyway, so I didn't have anything to do here, but if you work differently, YMMV.


The pipeline plugin is designed around having a set of branches (individual pipes), one after the other, that form a pipeline. Clever, eh? When you have the pipeline plugin, any branch is also considered a pipeline of one.

$ bzr show-pipeline
* nice-distribution-code-listing


What I wanted to do was to break up this work into a number of distinct change sets, each of which could be reviewed independently. We decided that the way to do this was to create a pipe before the current one, and bring changes in. This is done with the command add-pipe.

$ bzr add-pipe factory-tweaks --before nice-distribution-code-listing
$ bzr show-pipeline
factory-tweaks
* nice-distribution-code-listing
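
To make the ordering concrete, here is a toy Python model of what add-pipe did to the pipeline above. It is only a sketch: the add_pipe function and the plain list are my own illustration, not the plugin's code. It assumes that --before inserts ahead of the named pipe, and that without it the new pipe goes straight after the current one, which is what show-pipeline reports later in this post.

```python
def add_pipe(pipeline, current, new, before=None):
    """Return a copy of `pipeline` with `new` inserted (hypothetical helper)."""
    pipes = list(pipeline)
    if before is not None:
        # Assumed meaning of --before: insert ahead of the named pipe.
        index = pipes.index(before)
    else:
        # Assumed default: insert directly after the current pipe.
        index = pipes.index(current) + 1
    pipes.insert(index, new)
    return pipes


pipeline = ["nice-distribution-code-listing"]
pipeline = add_pipe(
    pipeline,
    current="nice-distribution-code-listing",
    new="factory-tweaks",
    before="nice-distribution-code-listing",
)
print(pipeline)
```

The order of the list is the order that show-pipeline prints.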


Right here we decided that there should be an easier way to add a pipe before the current pipe, as right now it needs a pipe name. A bug was filed to track this.

You can see from the show-pipeline command that the new pipe is before the current one. The pipeline plugin adds a number of branch aliases:

  • :first - the first pipe in the pipeline

  • :prev - the pipe before the current pipe

  • :next - the pipe after the current pipe

  • :last - the last pipe in the pipeline
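
One way to picture these aliases is as lookups into an ordered list of pipe names, relative to the current pipe. The sketch below is a toy model in Python, not the plugin's implementation; resolve_alias is a made-up name, and returning None when there is no previous or next pipe is my assumption rather than documented behaviour.

```python
def resolve_alias(pipeline, current, alias):
    """Map an alias to a pipe name (hypothetical helper, toy model only)."""
    index = pipeline.index(current)
    if alias == ":first":
        return pipeline[0]
    if alias == ":last":
        return pipeline[-1]
    if alias == ":prev":
        # Assumption: no previous pipe is modelled as None.
        return pipeline[index - 1] if index > 0 else None
    if alias == ":next":
        # Assumption: no next pipe is modelled as None.
        return pipeline[index + 1] if index + 1 < len(pipeline) else None
    raise ValueError("unknown alias: %s" % alias)


pipeline = ["factory-tweaks", "nice-distribution-code-listing"]
current = "nice-distribution-code-listing"
print(resolve_alias(pipeline, current, ":prev"))
print(resolve_alias(pipeline, current, ":first"))
```

With only two pipes and the second one current, :prev and :first land on the same name.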



Now to make the switch to the first pipe. Both :prev and :first refer to the same branch here, and I could have used either.

$ bzr switch :prev
... changed files shown
All changes applied successfully.
Now on revision 8747.


Now this pipe was added from the pipe after it, so it starts off with the same head revision. Not exactly the starting point I wanted, so we replaced the head of this branch with the last revision of the trunk branch that we had merged in.

$ bzr pull --overwrite -r submit:


The submit: alias refers to the submit branch. This is often trunk, and is in my project layout (specified using submit_branch in .bazaar/locations.conf).

Now the lower pipe was a copy of trunk. A good place to start adding changes I think. The next problem was how to get the changes from the following pipe into this one. Our first attempt was to merge in the following branch, shelve what we didn't want, throw away the actual merge, but keep the changed text, and commit.

$ bzr merge :next
$ bzr shelve
$ bzr stat
modified:
lib/lp/testing/factory.py
pending merge tips: (use -v to see all merge revisions)
Tim Penhey 2009-07-02 New view added.
$ bzr revert --forget-merges
$ bzr stat
modified:
lib/lp/testing/factory.py
$ bzr commit -m "More default args to factory methods and whitespace cleanup."


Now this seemed very convoluted. Why merge and then forget the merge? It seemed kinda icky, but it worked. The next thing to do is to merge these changes down the pipeline. This is done through another command, pump.

$ bzr pump


This merges and commits the changes down the pipeline. If there are conflicts, it stops and leaves you in the conflicted pipe. This didn't occur here, nor did it occur for any of my other ones. Here you can see the commit message that pump used:

$ bzr switch :next
$ bzr log --line -r -1
8714: Tim Penhey 2009-07-03 [merge] Merged factory-tweaks into nice-distribution-code-listing.
$ bzr switch :prev
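
As a rough mental model of pump, think of each pipe being merged into the one after it, starting from the current pipe, with one commit per merge. The Python below is a toy sketch under that assumption: the pump function, the sets of change labels, and the conflicts argument are all invented for illustration and are not the plugin's API. Only the commit message format is taken from the log output above.

```python
def pump(pipeline, changes, start, conflicts=()):
    """Merge each pipe into the next one, beginning at `start` (toy model).

    Returns (messages, stopped_at). stopped_at is the pipe where a
    conflict halted the run, or None if every merge applied cleanly.
    """
    messages = []
    index = pipeline.index(start)
    # Pair each pipe with the pipe that follows it.
    for source, target in zip(pipeline[index:], pipeline[index + 1:]):
        if (source, target) in conflicts:
            # Assumed behaviour: stop and stay in the conflicted pipe.
            return messages, target
        changes[target] |= changes[source]
        messages.append("Merged %s into %s." % (source, target))
    return messages, None


pipeline = ["factory-tweaks", "nice-distribution-code-listing"]
changes = {
    "factory-tweaks": {"factory default args"},
    "nice-distribution-code-listing": {"new view"},
}
messages, stopped_at = pump(pipeline, changes, "factory-tweaks")
print(messages)
```

The earlier pipe is left untouched; only the pipes after it pick up its changes.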


Now it was time to add the next pipe.

$ bzr add-pipe code-test-helpers
$ bzr show-pipeline
* factory-tweaks
code-test-helpers
nice-distribution-code-listing
$ bzr switch :next


This time, instead of merging in the changes, we shelved them in. The shelve command can apply changes from arbitrary revisions, and it also knows about files. The change that I wanted in this branch was a single added file, so I could tell shelve about that file.

$ bzr shelve -r branch::next lib/lp/code/tests/helpers.py
Selected changes:
+ lib/lp/code/tests/helpers.py
Changes shelved with id "2".


However the big problem with this is it all looks backwards. We are shelving from the future not the past. This really did my head in. Shelve would say "remove this file?" and by shelving it, it would add it in. It worked but made my head fuzzy. We filed a bug about this too. By adding a better way to take the changes, the command could do the reversal for you and provide you with a nicer question.
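
The reversal is easier to see in a toy model. The Python sketch below assumes that shelve looks at how the working tree differs from the revision it is given, and that shelving a change puts that path back to its state in that revision. Trees are plain dicts of path to content, and pending_changes and shelve are hypothetical helpers of mine, not Bazaar code.

```python
def pending_changes(target, working):
    """Describe how `working` differs from `target`, per path (toy model)."""
    changes = {}
    for path in sorted(set(target) | set(working)):
        if path not in working:
            changes[path] = "removed"
        elif path not in target:
            changes[path] = "added"
        elif target[path] != working[path]:
            changes[path] = "modified"
    return changes


def shelve(target, working, path):
    """Undo the working tree's change to `path`, restoring the `target` state."""
    result = dict(working)
    if path in target:
        result[path] = target[path]
    else:
        result.pop(path, None)
    return result


helpers = "lib/lp/code/tests/helpers.py"
next_pipe = {helpers: "helper code"}  # the following pipe already has the file
working = {}                          # the current pipe does not

print(pending_changes(next_pipe, working))
working = shelve(next_pipe, working, helpers)
print(working)
```

Seen from the following pipe, the current tree has "removed" the file, so shelving that removal is what brings the file in.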

More of the same happened for the next few pipes, and I won't bore you with repeated commands.

On the whole, the pipeline plugin worked really well. I was able to break my work up into five hunks which could be reviewed easily. In the end I kept working on the branch that was my original, so all my original history remained intact. It would have been just as easy to add another pipe and take the remaining changes. This would have left me with five branches, each with one commit. This works well for the way we work as we have reviews based on branches. Each pipe could be pushed to Launchpad and a review initiated for it. With some more UI polish, I think pipelines will be even more awesome than I think they are now.

3 comments:

Unknown said...

This looks really promising.
It would be nice to see the log -n0 of all your changes after this process. IIUC, it would show your original commit messages as part of your final merge. This would hopefully ensure that annotate would show your original revision for these changes, which would be excellent.

Duncan McGreggor said...

Great post! Thanks for sharing :-)

Jamu and I were just talking about this on Friday regarding a large branch that will be up for review. This approach looks very promising!

Aaron Bentley said...

For users who don't use lightweight checkouts already, the reconfigure-pipeline command will convert your tree into a lightweight checkout of a pipe.