Monday, 20 February 2012

Guilt reduction

So it is now Monday morning and I'm sitting next to Thomi.  We are going to pair program on this test stuff.  Partly because I think that pair programming is really cool, and partly due to Thomi knowing the autopilot test infrastructure really well, and that'll make this go much faster.

The bug in question related to the launcher getting into a very confused state where it thought there were multiple active applications.  And clicking on a launcher icon that was in this confused state caused a new application to be started rather than switching to the one that was running.

The first step in making all this work then, is to create a branch that is based off a revision that was before the fix.  This way we can write a test that fails first.  A key part of tests is to make sure they fail first.  Then when they start passing, you know it isn't by mistake, and that you have tested what you think, not just created something that passes.

Firstly, find that revision...

$ bzr log | less

The fix is revision 1977, so lets make a branch of trunk from revision 1976.

$ bzr cbranch trunk -r 1976 hud-ap-test
$ cd hud-ap-test/
$ bzr revno
1976


I use light weight checkouts for the unity repo, hence cbranch rather than branch.

At this revision, there is a HUD test that really just checks the reveal. Lets make sure it passes...

$ cd tests/autopilot/
$ python -m testtools.run autopilot.tests.test_hud
Tests running...
No handlers could be found for logger "autopilot.emulators.X11"

Ran 1 test in 4.238s
OK


I deleted a bunch of gtk warnings, they don't add any value for what I'm trying to show here.  Would be great if someone fixed them though :-)

Now I need to actually build and run my local unity (and test the autopilot test again).

Found out that my machine was failing to build for other reasons, so we switched to Thomi's.  The existing test still passed (of course it did), so the next step was to write a test that encapsulated the broken behaviour that we had found during the many hours of analysis.

That can be found at lp:~thomir/unity/autopilot-hud-triple-hit.

The test failed with the old revision, we then merged trunk, rebuilt, and ran the test again.  Test passed.  Job done.

Saturday, 18 February 2012

That guilty feeling

Today had been a frustrating day.  I had been quick to anger and my family bore the brunt of that. It wasn't until I was confronted with this that I actually took a minute to think why I was feeling this way.  It came back to something I read on IRC this morning, where I read that some people I deeply respect were disappointed with the test coverage with Unity 5.4.

I took this disappointment the way people often take it from their parents.  Remember when as a child, one of the worst things you could feel was the disappointment of your parents.  Well I guess that is how I felt.

I took over the engineering manager position of the unity team at the end of last year, and I tend to take criticism of the project and team personally.

So... why the guilty feeling?

Well, back around the time I took over managing the team, the general acceptance criteria for getting Canonical projects into Ubuntu changed.  This includes Unity.  There were a number of automated tests for Unity, and a series of distro acceptance tests that were manually executed.  What we needed to do was to really change the team culture to one where tests were not only written, but expected.  New features needed test coverage, bug fixes needed test coverage.  The idea here, for all those that understand test driven development, and automated testing, was to make sure that bugs that were fixed, and new features, didn't get broken accidentally by new changes.

The guilt really came from knowing that I had allowed code reviews through the process without enforcing the need for tests.  And that as a senior person on the team, others took a lead from what I did.  If I was letting things through, so would others.  This is where the feeling really came from.

It is very easy to land fixes to crashes quickly when under pressure.  Especially when you've spent the last eight hours debugging in gdb, and auditing all the recently landed code looking for that change that would contribute to the broken behaviour that you have been trying to fix.  When you finally find that one line fix, it is so tempting to just commit the one line.  You know it works, you've just spent the last freaking eight hours looking at the weird behaviour.  What you haven't done however, is stopped it from happening again, by encapsulating the behaviour in an automated test.

I plan to spend some of Monday going back and adding an automated test to cover the particular behaviour that we fixed the other day.  I'll also write up what, and how this test gets written.  Hopefully by writing this, not only will Unity get better test coverage, but I'll personally feel better knowing that I've done the right thing.

Sunday, 30 October 2011

6 months on Unity

We have just finished another design sprint prior to UDS-P.

While talking with some others I realise that I have worked on Unity for six months, and not changed a single pixel on the output.  No graphical changes, no moving widgets, no changing colours.

So what have I been doing?

First step was getting some new coding standards accepted by the team, which was much easier than I was expecting.

I added some property classes to nux, and did some general clean up in the code of nux and unity.

Refactored the indicator internals for the panel service which started off the shared unity core library for sharing code between the 2D and 3D code-bases.

Then I focused primarily on fixing memory leaks and crashes.

Once we hit final freeze, I did a little more refactoring internally, and now we are on to Precise Pangolin.

Monday, 11 July 2011

Properties in C++

Once you have done any development in a language that natively supports properties, like Python or C#, going back to C++ and not having them often feels like a real pain. I've just proposed my second attempt at C++ properties for the nux library.

This change leans heavily on a paper written by Lois Goldthwaite: SC22/WG21/N1615 - C++ Properties -- a Library Solution. I added change notifications using sigc++. I found that using sigc::slot was nicer than templating the properties on the class and member function pointer. This also meant that I could provide a way for a simple property to get its own custom setter method while still having a sensible default.

Compiling C++ templates still gives absolutely horrendous error message that can take a while to mentally parse. I guess one advantage of having done a lot of template programming in the past is that I don't get too phased by copious quantities of error messages, especially for templates, as for one example today, I had just forgotten to change a template arg in a test function, and got way too many lines to sensibly look at. One benefit of that was it caused me to look at what I was doing, and I ended up simplifying my tests a little more.

Thank you Lois for the time you spent writing up the C++ properties proposal, it was a fantastic starting point for me.

Monday, 23 May 2011

Getting back into C++

I have to admit that unit testing in C++, even with google-test, is so much more of a PITA than in python. Especially when checking string output. Simple string matching using split and regular expressions has really spoiled me.

Another thing that I've noticed is that I spend more time thinking about object design, and what should that object really be able to do, and what should it allow others to do to it.

It is an interesting time as I realise how much I still have to learn for our current domain. Most of my previous C++ experience has been on server side processing. Drawing stuff on the screen, real end user stuff, is still relatively new for me.

Sunday, 8 May 2011

DX Sprint

What a week. I've spent the last week in Budapest sprinting with the rest of the Desktop Experience (DX) team. This week was also my first official week with the DX team as I have moved now from the Launchpad team to the DX team. This was a good week. I had met some of the DX team before at other company get togethers, but not really talked to them much. A really important part of any new job is meeting the people that you are working with. This is always ends up happening when you work in the same office with them, but for a distributed company like Canonical, you can end up working on the same team with people that you don't get to meet for months.

It was great to meet different sub-teams of DX, especially those that I'll be working closely with. I'll be hanging out in the #ayatana irc channel now, but I'll also still be in #launchpad and #launchpad-dev. There are some very interesting plans for oneiric, and it will be interesting to see how much we can end up getting done. In the normal way we have "too much to do" and the gauntlet has been thrown.

So... I'll be hacking on the unity stack. Please don't ask me to fix any particular bugs yet as it'll no doubt take me time to find my way through the code :-)

Now for the 40+ hour journey home.

Thursday, 21 April 2011

Launchpad and stacked branches

As I'm sure most of you are aware, Launchpad hosts Bazaar branches. One early design decision that we had on Launchpad was that branches should be able to be renamed, moved between people and projects without limitation. This is one reason why each branch that was pushed to Launchpad had a complete history. We wanted to make sure that there weren't any problems where one person was blocking another pushing branches, or that people weren't able to get at revisions that they shouldn't be able to.

The Bazaar library, bzrlib, gives you awesome power to manipulate the internals giving you access to the repository of revisions through the branch. This can be a blessing and a curse, as if you have private revisions, then they can't be in a public repository.

Having a complete copy of the data for every branch became a severe limitation, especially for larger projects, of which Launchpad itself is one. A solution to this was a change in Bazaar itself that allowed a fallback repository which contained some of the revisions. This is what we call stacked branches. The repository for the branch on Launchpad has a fallback to another repository, which is linked to a different Launchpad branch. We ideally wanted all of this to be entirely transparent to the users of Launchpad. What it means is that when you are pushing a new branch to Launchpad, the bzr client asks for a stacked location. If there is a development focus branch specified for the project, this is then offered back to the client. The new branch then only adds revisions to its repository that don't exist in the development focus branch's repository. This makes for faster pushes, and smaller server side repositories.

The problem though was what do we specify the stacked on location to be? When we created the feature, we used absolute paths from the transport root. What the mean was that we stored the path aspect of the branch. For example, lp:wikkid gets translated to bzr+ssh://bazaar.launchpad.net/~wikkid/wikkid/trunk or http://bazaar.launchpad.net/~wikkid/wikkid/trunk depending on whether the bzr client knows your Launchpad id. The absolute path stored would be /~wikkid/wikkid/trunk. This information was then stored in the branch data on the file system.

The problem however was that the web interface allows you to rename branches. The actual branch itself on disk is referred to using a database id, which is hidden from the user using a virtual file system which has rewrite rules for http and at the bazaar transport level. However since the stacked on location refers to a full branch path, changing any part of that, whether it is the branch owner, branch name, or the project or package that the branch is for, would cause any branches stacked on that changed branch to break, bug 377519.

In order to fix this we had to change the location that the branch is stacked on to be independent of the branch path. The best solution here is to use the database id. I really didn't want to expose the user to this opaque id, but one opaque id is as good as another. Now when pushing branches to Launchpad, when it is creating a stacked branch you'll see a message like:

Created new stacked branch referring to /+branch-id/317141.

Existing branches still have their old branch paths saved for now. We'll run a migration script early next week to fix all these up, and hopefully we'll have seen the last of this bug.