Wednesday, 17 July 2013

Stunned by Go

The original working title for this post was "Go is hostile to developers". This was named at a time of extreme frustration, and it didn't quite seem right in the cooler light of days later. Instead I've settled on the term "stunned", because I really was. I felt like the built-in standard library had really let me down.

Let's take a small step back in time to the end of last week as I was debugging a problem. In our codebase, we had an open file that we would read from, seek back to the start, and re-read, sometimes several times. This file was passed as an io.Reader into another of our interfaces which had a Put method. This stored the content of the io.Reader in a remote location. I was getting this succeeding the first time, but then erroring out with "bad file descriptor".

The more irritating bit was that the same code worked perfectly as expected with one of our interface implementations but not another. The one that failed was our "simple" one. All it used was the built-in http library to serve a directory using GET and PUT http commands.

@TheMue suggested that our simple storage provider must be closing the file somehow. Some digging ensued. What I found had me a little exasperated. The standard http library was calling Close on my io.Reader. This is not expected behaviour when the interface clearly just takes an io.Reader (which exposes one and only one method Read).

This clearly breaks the "Principle of Least Astonishment"

People are part of the system. The design should match the user's experience, expectations, and mental models.

Developers and maintainers are users of the development language. As an experienced developer, it is my expectation that if a method says it takes an interface that exposes only Read, then only Read will be called. This is not the case in Go standard library.

While I have found just one case, I have been informed that this is common in Go, and that interfaces are just a "minimum" requirement.

It seems to me that Go uses the interface casting mechanism as a way to allow the function implementation to see if the underlying structure supports other methods, or to check for actual concrete implementation types so the function can take advantage of extra knowledge. It is one thing to call methods that don't modify state, however calling a mutating function that the original function did not express an intent to call is so much more than just unexpected, but astonishing.

The types of the parameters being passed into a function form a contract. This has been formalized in a number of languages, particularly D and Eiffel.

I found myself asking the question "Why do they do this?" The answer I came up with two things:

  1. To take advantage of extra information about the underlying object to make the execution of the function more efficient
  2. To work around the lack of function overloading

Now the second point is tightly coupled to the first point, because if there was function overloading, then you could clearly have another function that took a ReaderCloser and it would be clear that the Close method may well be called.

My fundamental issue here is that the contract between the function and the caller has been broken. There was not even any documentation to suggest that the contract may be broken. In this case, the calling of the Close method on my io.Reader broke our code in unexpected ways. As a language that is supposed to be used for systems programming, this just seems crazy.

16 comments:

Jeroen Vermeulen said...

Stunned is the word. If this were a real-world contract, you'd be filing charges for fraud! :)

(Fraud: engaging in a contract without the intention to live up to it. Or at least that's what I learned.)

Is it not so bad if the method doesn't modify any state? I don't think it makes a difference. Your method might still read something that's also being updated concurrently. If you have no reason to expect that method to be called here, you can't be expected to prevent a concurrency bug.

Unknown said...

If this is normal in Go, my question is, how are you supposed to know what is called and when? Do I have to start naming my methods in odd ways to make sure Go libraries don't call things unexpectedly?

It's crazy.

Tom Burdick said...

Unexpected mutation is exactly why functional languages are such a joy in some cases... and a pain in others.

In other news why is a file descriptor ever closed before all references are lost?

Andrew Gerrand said...

It's not really normal in Go. We were surprised by it when it came up, and had to document it as an extraordinary case.

Tim Penhey (thumper) said...

Andrew, I'm pleased it isn't really normal in Go. That would just freak me out. I ended up tweeting about it when I found out, and was genuinely surprised when someone said that this wasn't abnormal. The very minimum should be updated documentation, with a clear indication that Close will be called, and preferably with a work around if you don't want that. I used:

justReader := &struct{io.Reader}{theReader}

That worked, and is now also in our test suite.

Alexandre Jasmin said...

@Tom Burdick I don't know the specifics in Go but closing a file can have noticeable side effects. It could cause the last bits of data to be written out e.g. fclose() in C can fail just like fwrite().

Also freeing the file descriptor can reclaim disk space when the file is unlinked or allow the file to be overwritten on Windows.

Also closing one end of a socket or a pipe cause the other end to see the end-of-file.

So you probably don't want to leave closing to the GC, even RAII style object lifetime can lead to bad surprises if you don't think about it.

Again I don't know how any of this applies to GO...

jaybuff said...

http://en.wikipedia.org/wiki/Action_at_a_distance_(computer_programming)

jaybuff said...
This comment has been removed by the author.
Nate Finch said...

It seems as though the correct solution would be to simply take a ReadCloser and if someone only had a reader, it would be on them to wrap their reader with a type that has a noop close method.

Matt Joiner said...

How unfortunate. This occurs throughout the standard library too.

rsc said...

Go developer here.

This is a mistake, plain and simple. It was pointed out on the Go developer mailing list a few days before your post, and you can see my reply there.

NewRequest should take an io.ReadCloser. Unfortunately, due to backwards compatibility, a long standing semantic mistake like this is not something we can just fix, but we've at least documented it.

The tweet you quoted is wrong, and I want to explain why. If an interface value v is passed to a function f, then what does happen from time to time is that f will look for custom methods on v that are at least logically equivalent to the static interface type f is declared to expect. For example, io.Copy takes an io.Reader and an io.Writer and does the obvious thing, Reading into a buffer and then passing that buffer to Write. Of course, it would be nice to bypass the buffer when possible, and so there is a standard WriterTo interface that an io.Reader can also implement that, in contrast to Read, which copies into a buffer, says "read the data out of yourself and directly into this Writer". If io.Copy finds that method, it will use it, but the operation - reading - is the same. There is an analogous case for the Writer too. If you pass a value v to json.Marshal, Marshal will check to see if v implements json.Marshaler, meaning v knows how to marshal itself as JSON, in which case v is given that opportunity. Again, same operation you requested.

In contrast, if you have an io.Reader, while it might be okay to look for other Read-like methods, it is certainly not okay to look for Close. That's a very different operation, it's surprising, and it shouldn't be there. FWIW, I'm not aware of any other instances of this kind of semantic mixup in the standard library. But again, unfortunately, we can't change it until Go 2. All we can do for now is document it and move on.

Tim Penhey (thumper) said...

@rsc thank you very much for the detailed reply.

I only found out today that Rog had brought this up on the dev list as I don't follow golang-dev. I think Rog brought it up because of this issue I hit.

I am pleased to hear that the expected behaviour of functions is what I expect, and that this particular issue is an aberration, and not considered OK.

I'm pleased that Go has a strong stance on backwards compatibility, and I think documenting this at least will be helpful. Preferably along with the sample code to how to avoid the Close being called (i.e. wrapping in a struct that only exposes io.Reader.

gryftir said...

If they don't want to fix it because it could break legacy code, I suppose that's understandable, but then they should add a ReadNoClose method as a stop gap, to make later refactoring easier if they ever do change it.

mbp said...

ALWAYS BE CLOSING.

openid said...

Maybe you can relate to my little, affectionate rant about the encoding/csv package... http://mwholt.blogspot.com/2013/07/why-yes-gogolang-i-still-want-to-read.html

David said...

a work-around is to wrap the reader to an ioutil.NopCloser