How Bazaar

Monday, 29 July 2013

A personal thank you

Yesterday evening I had a wonderful IM conversation with a previous team member. He moved on from Canonical to new challenges last year and he was just getting in touch to let me know that I had been a significant positive influence in his professional development. This gave me nice warm fuzzies, but also made me think of those that had helped me over the years. This post is dedicated to those people, who I am going to attempt to recall in roughly historical order. I am however going to try to keep this limited to significant remembered events otherwise this list may get too huge (it may well anyway).

Firstly I'd like to thank Jason Butler. You taught me an important lesson very early on. Jason and I worked together as interns (as close a term as I can work out) while at university. Jason taught me this:

Just because someone talks slowly, doesn't mean that they think slowly.

I'd like to thank Jason Ngaio for my first real exposure to C++. Jason was the instructor of the C++ course that my first employers sent me on. This was my first real job, and the first time that I think I really got object oriented programming.

I'd like to thank Derrick and Pam Finlayson, Arran Finlayson, Blair Crookston, Jenny Cohen, Mathew Downes and Rachel Saunders. You guys helped me develop personally. The confidence and people skills that I learnt while around you have undoubtedly helped me in my professional career in software development.

David Cittadini from the then Sapphire Technology company based in Wellington really expanded my vision and understanding of developing complex systems. David also got me back into reading around the programming topic. My technical library started there. Working with Chris Double helped me understand what it is like to work with someone else in synergy. Our joint output I'm sure was a lot more than what we would have both produced independently added together.

David Ewing made a significant impression on me around knowing my worth and helped in contract negotiations. David has a wonderful way of dealing with people.

Moving over to London gave me the opportunity to meet up with some truly awesome people. Getting involved with ACCU was great for me. I worked briefly with Allan Kelly at Reuters, but learned a lot in a brief time. I also had the opportunity to work with Giovanni Asproni and Alan Griffiths at Barclays Capital. Working with you two really helped me understand the power that the developers hold when talking to the business. A few other people I'd like to make a personal note of from this time in the UK are Kevlin Henney, Roger Orr and Pete Goodliffe.

From my early time at Canonical, I'd like to personally thank Jonathan Lange, Robert Collins and Michael Hudson-Doyle. You guys really helped me understand the importance of writing good tests, and test driven development. The hammering in the code reviews also taught me how to write those tests well.

There are so many other people that I have had great connections with over my professional career and I'd like to thank you all. Work is more than just what you produce; it is also the friendships and connections you make with the people you are creating things with.

Wednesday, 17 July 2013

Stunned by Go

The original working title for this post was "Go is hostile to developers". This was named at a time of extreme frustration, and it didn't quite seem right in the cooler light of days later. Instead I've settled on the term "stunned", because I really was. I felt like the built-in standard library had really let me down.

Let's take a small step back in time to the end of last week as I was debugging a problem. In our codebase, we had an open file that we would read from, seek back to the start, and re-read, sometimes several times. This file was passed as an io.Reader into another of our interfaces which had a Put method. This stored the content of the io.Reader in a remote location. The Put succeeded the first time, but every later attempt failed with "bad file descriptor".

The more irritating bit was that the same code worked perfectly as expected with one of our interface implementations but not another. The one that failed was our "simple" one. All it used was the built-in http library to serve a directory using GET and PUT http commands.

@TheMue suggested that our simple storage provider must be closing the file somehow. Some digging ensued. What I found had me a little exasperated. The standard http library was calling Close on my io.Reader. This is not expected behaviour when the interface clearly just takes an io.Reader (which exposes one and only one method Read).

This clearly breaks the "Principle of Least Astonishment":

People are part of the system. The design should match the user's experience, expectations, and mental models.

Developers and maintainers are users of the development language. As an experienced developer, it is my expectation that if a method says it takes an interface that exposes only Read, then only Read will be called. This is not the case in the Go standard library.

While I have found just one case, I have been informed that this is common in Go, and that interfaces are just a "minimum" requirement.

@howbazaar you'll also find it's pretty common. io.Copy() will call ReadFrom and WriteTo methods, WriteString() is called to avoid copy.
— Jesse McNelis (@jessemcnelis) July 13, 2013

It seems to me that Go uses the interface casting mechanism as a way to allow the function implementation to see if the underlying structure supports other methods, or to check for actual concrete implementation types so the function can take advantage of extra knowledge. It is one thing to call methods that don't modify state. Calling a mutating method that the function signature gave no hint of is not merely unexpected, it is astonishing.

The types of the parameters being passed into a function form a contract. This has been formalized in a number of languages, particularly D and Eiffel.

I found myself asking the question "Why do they do this?" I came up with two answers:

To take advantage of extra information about the underlying object to make the execution of the function more efficient
To work around the lack of function overloading

Now the second point is tightly coupled to the first point, because if there were function overloading, then you could clearly have another function that took an io.ReadCloser and it would be clear that the Close method may well be called.

My fundamental issue here is that the contract between the function and the caller has been broken. There was not even any documentation to suggest that the contract may be broken. In this case, the calling of the Close method on my io.Reader broke our code in unexpected ways. As a language that is supposed to be used for systems programming, this just seems crazy.

Friday, 10 May 2013

juju switch

The switch command has recently landed in trunk, and will be included in the next juju-core release.

juju switch is another way to specify the current working environment. Current precedence for environment lookup still holds, but this now sits between the JUJU_ENV environment variable and the default value in environments.yaml.

If you have multiple environments defined, there are several different ways to tell juju which environment you mean when executing commands.

Prior to switch, there were three ways to specify the environment.

The first and default way to specify the environment is to use the default value in the environments.yaml file. This was always the fallback position if one of the other ways was not specified.

Another way was to be explicit for some commands, and use the -e or --environment command line argument.


$ juju bootstrap -e hpcloud

There is also an environment variable that can be set which will override the default specified in the environments.yaml file.


$ export JUJU_ENV=hpcloud
$ juju bootstrap          # now bootstraps hpcloud
$ juju deploy wordpress   # deploys to hpcloud

The switch option effectively overrides what the default is for the environments.yaml file without actually changing the environments.yaml file. This means that -e and the JUJU_ENV options still override the environment defined by switch.


$ juju help switch
usage: juju switch [options] [environment name]
purpose: show or change the default juju environment name
  
options:
-l, --list  (= false)
    list the environment names
  
Show or change the default juju environment name.
  
If no command line parameters are passed, switch will output the current
environment as defined by the file $JUJU_HOME/current-environment.
  
If a command line parameter is passed in, that value will is stored in the
current environment file if it represents a valid environment name as
specified in the environments.yaml file.
  
aliases: env

It works something like this:


$ juju env
Current environment: "amazon-ap"
$ juju switch
Current environment: "amazon-ap"
$ juju switch -l
Current environment: "amazon-ap"

Environments:
        amazon
        amazon-ap
        hpcloud
        openstack
$ juju switch amazon
Changed default environment from "amazon-ap" to "amazon"
$ juju switch amazon
Current environment: "amazon"
$ juju switch
Current environment: "amazon"

If you have JUJU_ENV set, then you get told that the current environment is defined by this. Also if you try to use switch to change the current environment when the environment is defined by JUJU_ENV, you will get an error.


$ export JUJU_ENV="amazon-ap"
$ juju switch
Current environment: "amazon-ap" (from JUJU_ENV)
$ juju switch amazon
error: Cannot switch when JUJU_ENV is overriding the environment (set to "amazon-ap")
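The lookup order described in this post can be sketched in a few lines of Go. This is not the juju-core implementation: the sources struct, resolve, switchTo and the error strings are all hypothetical, and placing -e ahead of JUJU_ENV is my assumption about the relative order of those two.

```go
package main

import (
	"errors"
	"fmt"
)

// sources holds the four places an environment name can come from.
// The struct and its field names are hypothetical, not juju-core types.
type sources struct {
	flag        string // value of -e / --environment, if given
	envVar      string // value of JUJU_ENV, if set
	currentFile string // contents of $JUJU_HOME/current-environment, if any
	yamlDefault string // default from environments.yaml
}

// resolve returns the environment name and the source that supplied it.
// Assumption: an explicit -e takes precedence over JUJU_ENV.
func resolve(s sources) (name, origin string) {
	switch {
	case s.flag != "":
		return s.flag, "-e"
	case s.envVar != "":
		return s.envVar, "JUJU_ENV"
	case s.currentFile != "":
		return s.currentFile, "switch"
	default:
		return s.yamlDefault, "environments.yaml"
	}
}

// switchTo mimics "juju switch <name>": it refuses to act while JUJU_ENV
// is set and only accepts names from the known list.
func switchTo(s *sources, name string, known []string) error {
	if s.envVar != "" {
		return errors.New("cannot switch when JUJU_ENV is overriding the environment")
	}
	for _, k := range known {
		if k == name {
			s.currentFile = name
			return nil
		}
	}
	return fmt.Errorf("%q is not a known environment", name)
}

func main() {
	known := []string{"amazon", "amazon-ap", "hpcloud", "openstack"}
	s := sources{yamlDefault: "amazon-ap"}

	if err := switchTo(&s, "amazon", known); err != nil {
		fmt.Println("error:", err)
	}
	name, origin := resolve(s)
	fmt.Printf("%s (from %s)\n", name, origin)

	s.envVar = "hpcloud"
	name, origin = resolve(s)
	fmt.Printf("%s (from %s)\n", name, origin)
	fmt.Println("switch while JUJU_ENV is set:", switchTo(&s, "openstack", known))
}
```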

Wednesday, 17 April 2013

The Go Language - My thoughts

I've been using the Go programming language for almost three months now pretty much full time. I have moved from the Desktop Experience team working on Unity, into the Juju team. One of the main reasons I moved was to learn Go. It had been too long since I had learned another language, and I felt it was better to dive in, than to just mess with it on my own time.

A friend of mine had poked around with Go during a hack fest and blogged about his thoughts. This was just before I really started poking around. Interestingly, the main things that Aldo found frustrating, the errors for unused variables and unused imports, I have found not to be such a big deal. Worth a passing gripe, sure, but not a big issue. I see having the language enforce what is often left to a lint checker in other languages as an overall benefit. Also, even though I don't agree with the Go formatting rules enforced by gofmt, it doesn't matter, because all code is formatted by the tool prior to commit. As an emacs user, I found go-mode to be extremely helpful, as I have it formatting all my code using gofmt before saving. I never have to think about it. One thing I couldn't handle, though, was the eight character tabs. Luckily emacs can hide this from me.


;; Go bits.

(require 'go-mode-load)
(add-hook 'before-save-hook #'gofmt-before-save)
(add-hook 'go-mode-hook (lambda () (setq tab-width 4)))

There are some nice bits to Go. I very much approve of channels being first class objects, and the use of channels to communicate between concurrently executing code. Goroutines are also nifty, although I've not used them too much myself yet. Our codebase does, but I've not poked into all the nooks and crannies yet.

However there are several things which irritate the crap out of me with Go.

Error handling

The first one I guess is a fundamental design decision which I don't really agree with. That is around error handling being in your face so you have to deal with it, as opposed to exceptions, which are all too often not thought about. Now if our codebase is in any way representative of Go code out there, this is just flat out wrong. The most repeated lines of code in the codebase would have to be:


if err != nil {
	return nil, err
}

This isn't error handling. This is just passing it up the chain, which is exactly what exception propagation does, only Go makes your codebase two to three times larger due to needing these three lines after every line of code that calls into another function. This is one thing I really dislike, but it is unlikely to change.

As a user of a language though, there are other things that could be added at the language level to make things slightly nicer. Syntactic sugar, as it is often known, makes the code easier to read.

If the language wants to keep the explicit handling of errors as it is, how about some sugar with that?

Instead of

func magic() (*Type, error) {
	something, err := somefunc("blah")
	if err != nil {
		// Pass the failure straight up to the caller.
		return nil, err
	}
	otherThing, err := otherfunc("blah")
	if err != nil {
		return nil, err
	}
	return foo(something, otherThing), nil
}

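suppose we had some magic sugar, say a built-in function like raise_error that interrogated the enclosing function's signature. When given an error, it would return from the enclosing function with zeroed values for all non-error results along with that error; otherwise it would pass through only the non-error values. Then we could have this: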


func magic() (*Type, error) {
	something := raise_error(somefunc("blah"))
	otherThing := raise_error(otherfunc("blah"))
	return foo(something, otherThing), nil
}
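Go has no such built-in, but the shape of the idea can be approximated today with panic and recover. The following is only a sketch under assumptions: raiseError, catch and the failure type are hypothetical helpers written for this post, somefunc and otherfunc are toy stand-ins, and the helper is fixed to string results because, without generics, it has to be written once per type.

```go
package main

import (
	"errors"
	"fmt"
)

// failure wraps an error so that catch can tell our own panics apart
// from unrelated ones.
type failure struct{ err error }

// raiseError is a hypothetical helper: it passes v through when err is
// nil and otherwise unwinds the calling function with a panic.
func raiseError(v string, err error) string {
	if err != nil {
		panic(failure{err})
	}
	return v
}

// catch turns a failure panic back into an ordinary error result.
// It must be deferred directly by the function being protected.
func catch(err *error) {
	if r := recover(); r != nil {
		f, ok := r.(failure)
		if !ok {
			panic(r) // not ours, keep unwinding
		}
		*err = f.err
	}
}

// Toy stand-ins for the functions used in the example above.
func somefunc(s string) (string, error) { return "some-" + s, nil }

func otherfunc(s string) (string, error) {
	if s == "" {
		return "", errors.New("empty input")
	}
	return "other-" + s, nil
}

// magic reads like the sugared version: no explicit error checks.
func magic(arg string) (result string, err error) {
	defer catch(&err)
	something := raiseError(somefunc("blah"))
	otherThing := raiseError(otherfunc(arg))
	return something + "/" + otherThing, nil
}

func main() {
	fmt.Println(magic("blah"))
	fmt.Println(magic(""))
}
```

This is not something I'd recommend sprinkling through a codebase. It needs a named error result and a deferred call in every function, which is the kind of ceremony real language support would remove.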

The range function

There are several different issues I have with the range function.

range returns one or two parameters, but the language doesn't allow any user defined functions to return one or two parameters, range is super special
using range with a slice or array and getting a single value, doesn't give you the value, but instead the index - I never want this
there is no way to define range behaviour for a user defined type

These three things are mostly equal in annoyance factor. I'd love to see this change.

No generics

Initially I accepted this as a general part of the language. Shouldn't be a big deal right? C doesn't have generics. I guess I spent too long with C++ then.
My first real annoyance was when I had two integers, and I wanted to find the maximum value of the two. I go to look in the standard library and find math.Max. However that is just for the float64 type. The standard response from the team was "it is only a two line function". My response is "that's not the point".

Since there is no function overloading, nor generics, there is no way with the language at this stage to make a standard library function that determines the maximum value of two or more numeric types, and return that maximum in the same type as the parameters. Generics would help here.
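To make the complaint concrete, here is roughly what you end up writing. The function names are hypothetical; the only library call is math.Max, which takes and returns float64.

```go
package main

import (
	"fmt"
	"math"
)

// maxInt is the "only a two line function" for int.
func maxInt(a, b int) int {
	if a > b {
		return a
	}
	return b
}

// maxInt64 is the same logic again, because maxInt cannot accept int64.
func maxInt64(a, b int64) int64 {
	if a > b {
		return a
	}
	return b
}

// maxViaFloat reuses math.Max at the cost of converting to float64 and
// back, which is noisy and can lose precision for very large integers.
func maxViaFloat(a, b int) int {
	return int(math.Max(float64(a), float64(b)))
}

func main() {
	fmt.Println(maxInt(3, 7))
	fmt.Println(maxInt64(10, -2))
	fmt.Println(maxViaFloat(3, 7))
}
```

Each new numeric type means another copy of the same body, and none of them can live in the standard library under a single name.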

A second case for generics is standard containers. The primary container in Go at this stage is the map. So many places in our codebase we have map[string]interface{}. The problem with this is that you have to cast all values retrieved from the map. There is also no set, multi_map, or multi_set. Since there is no way to provide simple iteration for user defined types, you can't easily define your own set type and have simple iteration using range.

Interfaces that aren't explicitly marked as being implemented help in some ways to provide features provided by generic types and functions, but they are a poor substitute.

So far...

Learning Go has been an interesting experience so far. I like learning new things, and I'm going to be using Go for some time now with the current project. No doubt I'll have more to write about later.

Tuesday, 27 March 2012

Unity 5.8 issues and workarounds

Well... with the release of Unity 5.8 and associated dependencies, we got the extra testing we were after in precise, and with it a number of bugs. The positive side to this is that with the extra information from our wonderful beta-testers we have been able to work out how to reproduce a number of the issues. As any developer would tell you, being able to reproduce your user's problems is often the biggest hurdle.

Over the weekend I noticed a number of issues around the release of Unity 5.8, and this morning while going through the bug reports, I was happy to notice that we had some way to work around most of them.

Unity 5.8: Flickering and corruption on Unity UI elements - a fix for many is "unity --reset". The cause appears to be how compiz is dealing with plug-ins that are no longer around. For some there have been plug-ins that existed with Oneiric that are no longer around in Precise, and the reset caused them to be removed from the list to load.

Unity 5.8: Login to blank screen (all black or just wallpaper) - some have been fixed by "unity --reset", but the underlying cause of this one is still a bit of a mystery.

Unity 5.8: Can't login to Unity since upgrade to 5.8 - some have found that disabling "Unity MT Grab Handles" compiz plug-in fixes this issue. We still need to work out what the underlying problem is.

white box randomly shows up at top left corner blocking applications from using stuff under it - this one appears to be triggered by chromium desktop notifications. There have been reports that disabling the animations plug-in in compiz, and then re-enabling it fixes this. We are still investigating why.

If you are getting these issues, you can try the workarounds suggested here.

Monday, 20 February 2012

Guilt reduction

So it is now Monday morning and I'm sitting next to Thomi. We are going to pair program on this test stuff. Partly because I think that pair programming is really cool, and partly due to Thomi knowing the autopilot test infrastructure really well, and that'll make this go much faster.

The bug in question related to the launcher getting into a very confused state where it thought there were multiple active applications. And clicking on a launcher icon that was in this confused state caused a new application to be started rather than switching to the one that was running.

The first step in making all this work then, is to create a branch that is based off a revision that was before the fix. This way we can write a test that fails first. A key part of tests is to make sure they fail first. Then when they start passing, you know it isn't by mistake, and that you have tested what you think, not just created something that passes.

Firstly, find that revision...

$ bzr log | less

The fix is revision 1977, so let's make a branch of trunk from revision 1976.

$ bzr cbranch trunk -r 1976 hud-ap-test
$ cd hud-ap-test/
$ bzr revno
1976

I use lightweight checkouts for the unity repo, hence cbranch rather than branch.

At this revision, there is a HUD test that really just checks the reveal. Let's make sure it passes...

$ cd tests/autopilot/
$ python -m testtools.run autopilot.tests.test_hud
Tests running...
No handlers could be found for logger "autopilot.emulators.X11"

Ran 1 test in 4.238s
OK

I deleted a bunch of gtk warnings, they don't add any value for what I'm trying to show here. Would be great if someone fixed them though :-)

Now I need to actually build and run my local unity (and test the autopilot test again).

Found out that my machine was failing to build for other reasons, so we switched to Thomi's. The existing test still passed (of course it did), so the next step was to write a test that encapsulated the broken behaviour that we had found during the many hours of analysis.

That can be found at lp:~thomir/unity/autopilot-hud-triple-hit.

The test failed with the old revision, we then merged trunk, rebuilt, and ran the test again. Test passed. Job done.

Saturday, 18 February 2012

That guilty feeling

Today had been a frustrating day. I had been quick to anger and my family bore the brunt of that. It wasn't until I was confronted with this that I actually took a minute to think why I was feeling this way. It came back to something I read on IRC this morning, where I read that some people I deeply respect were disappointed with the test coverage with Unity 5.4.

I took this disappointment the way people often take it from their parents. Remember when, as a child, one of the worst things you could feel was the disappointment of your parents? Well, I guess that is how I felt.

I took over the engineering manager position of the unity team at the end of last year, and I tend to take criticism of the project and team personally.

So... why the guilty feeling?

Well, back around the time I took over managing the team, the general acceptance criteria for getting Canonical projects into Ubuntu changed. This includes Unity. There were a number of automated tests for Unity, and a series of distro acceptance tests that were manually executed. What we needed to do was to really change the team culture to one where tests were not only written, but expected. New features needed test coverage, bug fixes needed test coverage. The idea here, for all those that understand test driven development, and automated testing, was to make sure that bugs that were fixed, and new features, didn't get broken accidentally by new changes.

The guilt really came from knowing that I had allowed code reviews through the process without enforcing the need for tests. And that as a senior person on the team, others took a lead from what I did. If I was letting things through, so would others. This is where the feeling really came from.

It is very easy to land fixes to crashes quickly when under pressure. Especially when you've spent the last eight hours debugging in gdb, and auditing all the recently landed code looking for that change that would contribute to the broken behaviour that you have been trying to fix. When you finally find that one line fix, it is so tempting to just commit the one line. You know it works, you've just spent the last freaking eight hours looking at the weird behaviour. What you haven't done however, is stopped it from happening again, by encapsulating the behaviour in an automated test.

I plan to spend some of Monday going back and adding an automated test to cover the particular behaviour that we fixed the other day. I'll also write up what, and how this test gets written. Hopefully by writing this, not only will Unity get better test coverage, but I'll personally feel better knowing that I've done the right thing.