[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
|
|
Subscribe / Log in / New account

Generics for Go

July 1, 2020

This article was contributed by Ben Hoyt

The Go programming language was first released in 2009, with its 1.0 release made in March 2012. Even before the 1.0 release, some developers criticized the language as being too simplistic, partly due to its lack of user-defined generic types and functions parameterized by type. Despite this omission, Go is widely used, with an estimated 1-2 million developers worldwide. Over the years there have been several proposals to add some form of generics to the language, but the recent proposal written by core developers Ian Lance Taylor and Robert Griesemer looks likely to be included in a future version of Go.

Background

Go is a statically typed language, so types are specified in the source code (or inferred from it) and checked by the compiler. The compiler produces optimized machine code, so CPU-intensive code is significantly more efficient than languages like Python or Ruby, which have bytecode compilers and use virtual machines for execution.

Generics, also known as "parameterized types" or "parametric polymorphism", are a way to write code or build data structures that will work for any data type; the code or data structure can be instantiated to process each different data type, without having to duplicate code. They're useful when writing generalized algorithms like sorting and searching, as well as type-independent data structures like trees, thread-safe maps, and so on. For example, a developer might write a generic min() function that works on all integer and floating-point types, or create a binary tree that can associate a key type to a value type (and work with strings, integers, or user-defined types). With generics, you can write this kind of code without any duplication, and the compiler will still statically check the types.

Like the first versions of Java, Go doesn't ship with user-defined generics. As the Go FAQ notes, generics "may well be added at some point"; it also describes how leaving them out was an intentional trade-off:

Generics are convenient but they come at a cost in complexity in the type system and run-time. We haven't yet found a design that gives value proportionate to the complexity, although we continue to think about it. Meanwhile, Go's built-in maps and slices, plus the ability to use the empty interface to construct containers (with explicit unboxing) mean in many cases it is possible to write code that does what generics would enable, if less smoothly.

Part of the reason actual users of the language don't complain loudly about the lack of generics is that Go does include them for the built-in container types, specifically slices (Go's growable array type), maps (hash tables), and channels (thread-safe communication queues). For example, a developer writing blog software might write a function to fetch a list of articles or a mapping of author ID to author information:

    // takes ID, returns "slice of Article" (compiler checks types)
    func GetLatestArticles(num int) []Article {
        ...
    }

    // takes "slice of int" of IDs, returns "map of int IDs to Author"
    func GetAuthors(authorIDs []int) map[int]Author {
        ...
    }

Built-in functions like len() and append() work on these container types, though there's no way for a developer to define their own equivalents of those generic built-in functions. As many Go developers will attest, having built-in versions of growable arrays and maps that are parameterized by type goes a long way, even without user-defined generic types.

In addition, Go has support for two features that are often used instead of generics or to work around their lack: interfaces and closures. For example, sorting in Go is done using the sort.Interface type, which is an interface requiring three methods:

    type Interface interface {
        Len() int           // length of this collection
        Less(i, j int) bool // true if i'th element < j'th element
        Swap(i, j int)      // swap i'th and j'th elements
    }

If a user-defined collection implements this interface, it is sortable using the standard library's sort.Sort() function. Since sort.Slice() was added in Go 1.8, developers can use that function and pass in a "less-than closure" rather than implementing the full sorting interface; for example:

    // declare a struct for names and ages and a slice of those structs with four entries
    people := []struct {
        Name string
        Age  int
    }{
        {"Gopher", 7},
        {"Alice", 55},
        {"Vera", 24},
        {"Bob", 75},
    }

    // sort people using the "less-than closure" specified in the call
    sort.Slice(
        people,
        func(i, j int) bool { // i and j are the two slice indices
            return people[i].Name < people[j].Name
        },
    )

There are other ways to work around Go's lack of generics, such as creating container types that use interface{} (the "empty interface"). This effectively boxes every value inserted into the collection, and requires run-time type assertions, so it is neither particularly efficient nor type-safe. However, it works and even some standard library types like sync.Map use this approach.

Some developers go so far as to argue that generics shouldn't be added to Go at all, since they will bring too much complexity. For example, Greg Hall hopes "that Go never has generics, or if it does, the designers find some way to avoid the complexity and difficulties I have seen in both Java generics and C++ templates".

The Go team takes the complexity issue seriously. As core developer Russ Cox states in his 2009 article "The Generic Dilemma":

It seems like there are three basic approaches to generics:

  1. (The C approach.) Leave them out. This slows programmers. But it adds no complexity to the language.
  2. (The C++ approach.) Compile-time specialization or macro expansion. This slows compilation. It generates a lot of code, much of it redundant, and needs a good linker to eliminate duplicate copies. [...]
  3. (The Java approach.) Box everything implicitly. This slows execution. [...]

The generic dilemma is this: do you want slow programmers, slow compilers and bloated binaries, or slow execution times?

Still, many Go developers are asking for generics, and there has been a huge amount of discussion over the years on the best way to add them in a Go-like way. Several developers have provided thoughtful rationale in "experience reports" from their own usage of Go. Taylor's entry in the official Go blog, "Why Generics?", details what adding generics will bring to Go, and lists the guidelines the Go team is following when adding them:

Most importantly, Go today is a simple language. Go programs are usually clear and easy to understand. A major part of our long process of exploring this space has been trying to understand how to add generics while preserving that clarity and simplicity. We need to find mechanisms that fit well into the existing language, without turning it into something quite different.

These guidelines should apply to any generics implementation in Go. That's the most important message I want to leave you with today: generics can bring a significant benefit to the language, but they are only worth doing if Go still feels like Go.

The recent proposal

Taylor, in particular, has been prolific on the subject of adding generics to Go, having written no fewer than six proposals. The first four, written from 2010 through 2013, are listed at the bottom of his document, "Go should have generics". About them, he notes: "all are flawed in various ways". In July 2019 he posted the "Why Generics?" blog article mentioned above, which links to the lengthy 2019 proposal written by Taylor and Griesemer for a version of generics based on "contracts". Almost a year later, in June 2020, Taylor and Griesemer published the current proposal, which avoids adding contracts. In Taylor's words:

An earlier draft design of generics implemented constraints using a new language construct called contracts. Type lists appeared only in contracts, rather than on interface types. However, many people had a hard time understanding the difference between contracts and interface types. It also turned out that contracts could be represented as a set of corresponding interfaces; thus there was no loss in expressive power without contracts. We decided to simplify the approach to use only interface types.

The removal of contracts comes in part based on work by Philip Wadler and his collaborators in their May 2020 paper, "Featherweight Go [PDF]" (video presentation). Wadler is a type theorist who has contributed to the design of Haskell, and was involved in adding generics to Java back in 2004. Rob Pike, one of Go's creators, had asked Wadler if he would "be interested in helping us get polymorphism right (and/or figuring out what 'right' means) for some future version of Go"; this paper is the response to Pike's request.

The 2020 proposal suggests adding optional type parameters to functions and types, allowing generic algorithms and generic container types, respectively. Here is an example of what a generic function looks like under this proposal:

    // Stringify calls the String method on each element of s,
    // and returns the results.
    func Stringify(type T Stringer)(s []T) []string {
        var ret []string
        for _, v := range s {
            ret = append(ret, v.String())
        }
        return ret
    }

    // Stringer is a type constraint that requires the type argument to have
    // a String method and permits the generic function to call String.
    // The String method should return a string representation of the value.
    type Stringer interface {
        String() string
    }

The type parameter is T (an arbitrary name), specified in the extra set of parentheses after the function name, along with the Stringer constraint: type T Stringer. The actual arguments to the function are in the second set of parentheses, s []T. Writing functions like this is not currently possible in Go; it does not allow passing a slice of a concrete type to a function that accepts a slice of an interface type (e.g., Stringer).

In addition to generic functions, the new proposal also supports parameterization of types, to support type-safe collections such as binary trees, graph data structures, and so on. Here is what a generic Vector type might look like:

    // Vector is a name for a slice of any element type.
    type Vector(type T) []T

    // Push adds a value to the end of a vector.
    func (v *Vector(T)) Push(x T) {
        *v = append(*v, x)
    }

    // v is a Vector of Authors
    var v Vector(Author)
    v.Push(Author{Name: "Ben Hoyt"})

Because Go doesn't support operator overloading or define operators in terms of methods, there's no way to use interface constraints to specify that a type must support the < operator (as an example). In the proposal, this is done using a new feature called "type lists", an example of which is shown below:

    // Ordered is a type constraint that matches any ordered type.
    // An ordered type is one that supports the <, <=, >, and >= operators.
    type Ordered interface {
        type int, int8, int16, int32, int64,
            uint, uint8, uint16, uint32, uint64, uintptr,
            float32, float64,
            string
    }

In practice, a constraints package would probably be added to the standard library which pre-defined common constraints like Ordered. Type lists allow developers to write generic functions that use built-in operators:

    // Smallest returns the smallest element in a slice of "Ordered" values.
    func Smallest(type T Ordered)(s []T) T {
        r := s[0]
        for _, v := range s[1:] {
            if v < r { // works due to the "Ordered" constraint
                r = v
            }
        }
        return r
    }

The one constraint that can't be written as a type list is a constraint for the == and != operators, because Go allows comparing structs, arrays, and interface types for equality. To solve this, the proposal suggests adding a built-in comparable constraint to allow equality operators. This would be useful, for example, in a function that finds the index of a value in a slice or array:

    // Index returns the index of x in s, or -1 if not found.
    func Index(type T comparable)(s []T, x T) int {
        for i, v := range s {
            // v and x are type T, which has the comparable
            // constraint, so we can use == here.
            if v == x {
                return i
            }
        }
        return -1
    }

Taylor and Griesemer have developed a tool for experimentation (on the go2go branch) that converts the Go code as specified in this proposal to normal Go code, allowing developers to compile and run generic code today. There's even a version of the Go playground that lets people share and run code written under this proposal online — for example, here is a working example of the Stringify() function above.

The Go team is asking developers to try to solve their own problems with the generics experimentation tool and send detailed feedback in response to the following questions:

First, does generic code make sense? Does it feel like Go? What surprises do people encounter? Are the error messages useful?

Second, we know that many people have said that Go needs generics, but we don't necessarily know exactly what that means. Does this draft design address the problem in a useful way? If there is a problem that makes you think "I could solve this if Go had generics," can you solve the problem when using this tool?

Discussion

There has been a lot of public discussion about generics on the main golang-nuts mailing list since the latest proposal was published, as well as on Hacker News and reddit.com/r/golang threads.

As Pike said [YouTube] last year, "syntax is not the problem, at least not yet", however, many of the threads on the mailing list have been immediately critical of the syntax. Admittedly, the syntax is unusual, and it adds another set of (round) parentheses to Go, which is already known for having lots of parentheses (for example, Go's method definitions use one set for the method's receiver type, and another for the method's arguments). The proposal tries to preempt the syntax bikeshedding with an explanation of why they chose parentheses instead of angle brackets:

When parsing code within a function, such as v := F<T>, at the point of seeing the < it's ambiguous whether we are seeing a type instantiation or an expression using the < operator. Resolving that requires effectively unbounded lookahead. In general we strive to keep the Go parser efficient.

Most responders on the mailing list are proposing the use of angle brackets like C++, Java, and C#, for example, using List<T> instead of List(T). Taylor is much more interested in whether the semantics of the new proposal make sense, but has been patiently replying to each of these syntax threads with something like the following:

Let's see what real code looks like with the suggested syntax, before we worry about alternatives. Thanks.

This has happened so many times that one mailing list contributor, Tyler Compton, compiled a helpful list of all the syntax-related threads.

Generics will help eliminate types and functions repeated for multiple types, for example sort.Ints, sort.Float64s, and sort.Strings in the sort package. In a comment on Hacker News, Kyle Conroy showed "a four-line replacement for the various sql.Null* types in the standard library":

    type Null(type T) struct {
        Val   T
        Valid bool // Valid is true if Val is not NULL
    }

Mailing list contributor Pee Jai wondered whether there's a way to constrain a type to only allow structs, but Taylor indicated that's not possible; he noted that "generics don't solve all problems". Robert Engels said that the reflect package would still be needed for this case anyway.

In one thread, "i3dmaster" asked some questions about custom map types, and Taylor clarified that "custom container types aren't going to support len() or range". Creators of collection types won't have access to this special syntax, but will need to define their own Len() method, and their own way to iterate through the collection.

Go core contributor Bryan Mills has posted insightful replies on a number of threads. He has also created his own repository with various notes and code examples from his experiments with generics, including an explanation about why he considers type lists less than ideal. The repository also includes various attempts at re-implementing the append() built-in using generics as proposed.

Timeline

In their recent blog entry, Taylor and Griesemer are clear that adding generics to the language won't be a quick process — they want to get it right, and take into account community feedback:

We will use the feedback we gather from the Go community to decide how to move forward. If the draft design is well received and doesn't need significant changes, the next step would be a formal language change proposal. To set expectations, if everybody is completely happy with the design draft and it does not require any further adjustments, the earliest that generics could be added to Go would be the Go 1.17 release, scheduled for August 2021. In reality, of course, there may be unforeseen problems, so this is an optimistic timeline; we can't make any definite prediction.

My own guess is that August 2021 (just over a year away) is optimistic for a feature of this size. It's going to take quite a while to solicit feedback, iterate on the design, and implement generics in a production-ready way instead of using the current Go-to-Go translator. But given the number of proposals and the amount of feedback so far, generics are sure to be a much-used (and hopefully little-abused) feature whenever they do arrive.


Index entries for this article
GuestArticlesHoyt, Ben


to post comments

Generics for Go

Posted Jul 1, 2020 16:44 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link]

I'm honestly amazed by Ian Lance Taylor. He's active in most of discussions about generics and always takes time to calmly discuss questions and explain issues.

Generics for Go

Posted Jul 1, 2020 17:21 UTC (Wed) by kjp (guest, #39639) [Link] (3 responses)

If only google and msft could team up and get native compilation going for C#. Nope, gotta be NIH.

Generics for Go

Posted Jul 1, 2020 20:15 UTC (Wed) by epa (subscriber, #39769) [Link] (1 responses)

Yes, I was surprised not to see C# mentioned in the discussion, since it seems to avoid the macro expansion approach of C++, while not suffering the boxing overhead of Java. (Though surely Java too, these days, lets you create a vector of ints without having to box each one?)

Generics for Go

Posted Jul 1, 2020 20:17 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link]

> Though surely Java too, these days, lets you create a vector of ints without having to box each one?
Nope. That's on the TODO list, though: https://openjdk.java.net/projects/valhalla/

Generics for Go

Posted Jul 1, 2020 20:27 UTC (Wed) by dvdeug (guest, #10998) [Link]

How is this related? Go seems like a decidedly different language for a decidedly different audience. What's the expected win from native compilation? https://benchmarksgame-team.pages.debian.net/benchmarksga... says that C# already ties or beats Go on every single problem.

Generics for Go

Posted Jul 1, 2020 19:13 UTC (Wed) by acarno (subscriber, #123476) [Link] (4 responses)

Not that I know much about Go, but why not look at generics in languages other than C/C++/Java? For example, Ada has (strongly-typed) generics:
https://www.adaic.org/resources/add_content/standards/05r...

Generics for Go

Posted Jul 1, 2020 19:53 UTC (Wed) by dvdeug (guest, #10998) [Link] (2 responses)

C++ templates were designed after Ada generics, and Alexander Stepanov, who wrote the Standard Template Library for C++ and pushed for C++ templates with Bjarne Stroustrup, did his first proto-STL in Ada. So C++ and Java should show a more refined form of Ada generics.

Personally, I've worked with Ada generics. The Ada type system added complexities in places that don't seem to have wins in retrospect, and Ada generics reflect that. They don't have the sheer power that C++ templates have, or the usual elegance of Java. I'd be interested to see languages from a universe where Ada was more successful and modern languages stuck more to its design principles, but in any case, Ada generics were the first major generics implementation, and it shows.

Generics for Go

Posted Jul 8, 2020 6:58 UTC (Wed) by ncm (guest, #165) [Link] (1 responses)

C++ generics do not, in fact, draw on Ada generics, and the Ada attempt at an STL was a dismal failure, as it would also be in Java were it attempted.

C++ generics do owe a lot to ML (not "machine learning"; there was a language ML that also influenced Haskell). Java generics are extremely limited, and their backward compatibility means they offer no performance improvement over older mechanisms.

Generics for Go

Posted Jul 16, 2020 10:17 UTC (Thu) by jch (guest, #51929) [Link]

> there was a language ML

Why in the past tense? ML is doing fine, the dialect with the most active community would appear to be OCaml, https://ocaml.org/ . (It looks to me like ML was an important inspiration for Rust.)

Generics for Go

Posted Jul 2, 2020 8:25 UTC (Thu) by k3ninho (subscriber, #50375) [Link]

>Not that I know much about Go
Read the "Featherweight Go" journal article [1] and its parent, "Featherweight Java"[2]. I think Rob Pike looked and though Featherweight Java more relevant because of the user-facing side of Go is more like the C++/C#/Java kind of object orientation than the approach taken by Ada. You should be heartened that both Featherweight approaches prove (rather than merely providing) type safety.

K3n.

1: https://arxiv.org/pdf/2005.11710.pdf
2: https://dl.acm.org/doi/10.1145/503502.503505

Generics for Go

Posted Jul 2, 2020 9:32 UTC (Thu) by ragnar (guest, #139237) [Link] (1 responses)

I really like the approach zig has taken with generics. They simply have types as compile-time parameters to functions and data structures.

https://ziglang.org/#Generic-data-structures-and-functions

Generics for Go

Posted Jul 4, 2020 8:19 UTC (Sat) by cpitrat (subscriber, #116459) [Link]

IIUC, this is conceptually like the C++ approach but with a nicer syntax (same as the language rather than a dedicated one). AmI missing something?

Generics parameter syntax

Posted Jul 2, 2020 15:34 UTC (Thu) by davecb (subscriber, #1574) [Link] (9 responses)

I rather wonder if v := F<T> can be parsed both accurately and efficiently with strongly bounded lookahead?

Generics parameter syntax

Posted Jul 3, 2020 18:49 UTC (Fri) by mathstuf (subscriber, #69389) [Link] (3 responses)

How complicated can `T` get syntactically? Can you decide before you see the `>` somehow? This is the reason for the weird `obj.mem1.mem2.template foo<T>` syntax in C++: T can be very complicated and "type or expression" is not something one can answer without actually having a name lookup table.

Generics parameter syntax

Posted Jul 3, 2020 22:39 UTC (Fri) by davecb (subscriber, #1574) [Link] (2 responses)

Thanks: I come from an era when we assumed a symbol table, and so did not realize Go does not have one in it's parser. Rather amazing!

Generics parameter syntax

Posted Jul 5, 2020 2:26 UTC (Sun) by mathstuf (subscriber, #69389) [Link] (1 responses)

Note that I actually don't know how Go's parser in particular works. I surmise that the parser requiring access to a symbol table would be unacceptable for performance. It'd be better to let the parser go and make ASTs for all the source and only then figure out what symbols are defined where (using the imports as a guideline). In fact, the mere existence of the AST module in the standard library likely means that symbol tables cannot be allowed (because requiring `import` to actually do work and look at what symbols are available just to parse `sym<arg>` would be a little ridiculous IMO).

Generics parameter syntax

Posted Jul 17, 2020 17:08 UTC (Fri) by anton (subscriber, #25547) [Link]

I expect that symbol table lookup in the parser is not a performance problem. It is a conceptual problem because you need to know the parse tree to know the scope of an identifier; of course, for some kinds of languages the problem is solvable (LL(1) languages with definitions before use of symbols should be fine), but it means that the language requires an integration of parsing and "static semantics". And possibly scanning, if, e.g., the syntax requires a token for a type that's separate from an identifier token.

Generics parameter syntax

Posted Jul 5, 2020 10:06 UTC (Sun) by dezgeg (subscriber, #92243) [Link] (4 responses)

In some languages (Haskell comes to mind, also Erlang I think) case-sensitivity is used to enforce this kind of problems without requiring a symbol table in the parser. In Haskell, uppercase names always refer to types or type constructors; and lowercase names always refer to variables or type parameters. Sadly not many new languages seem to take use of this as I personally find it quite elegant.

Generics parameter syntax

Posted Jul 5, 2020 19:11 UTC (Sun) by Cyberax (✭ supporter ✭, #52523) [Link]

Go uses upper case names for exported symbols and lower-case names for package-local ones.

Generics parameter syntax

Posted Jul 5, 2020 22:23 UTC (Sun) by dtlin (subscriber, #36537) [Link] (2 responses)

In Haskell, types do begin with an upper-case letter or : symbol, but data constructors (which are values) also begin with an upper-case letter or : symbol and type variables begin with a lower-case letter or non-: symbol, so that's not useful for disambiguation.

But only values appear to the left of :: and only types appear to the right of ::, so there's no confusion. Except in import lists, or when -XTypeApplications, -XDataKinds, -XStarIsType, … No angle brackets so it doesn't suffer from C++'s problems though.

Generics parameter syntax

Posted Jul 6, 2020 20:38 UTC (Mon) by dezgeg (subscriber, #92243) [Link] (1 responses)

Haskell was just an example of some other language where caseness of a name matters (but apparently even Go already uses it in other manner). I was thinking about having names starting with Uppercase always refer to types and names starting lowercase be variables. Thus in the v := F<T> example the caseness of F would tell whether the < is the less than operator or the opening bracket of a type parameter.

Generics parameter syntax

Posted Jul 7, 2020 12:19 UTC (Tue) by dtlin (subscriber, #36537) [Link]

It helps for that case, but if you have higher-order kinds, f<t> is also a type, for some appropriate type variables f and t.

Generics for Go

Posted Jul 4, 2020 8:26 UTC (Sat) by cpitrat (subscriber, #116459) [Link]

"(The C approach.) Leave them out. This slows programmers. But it adds no complexity to the language."

Actually the C approach is drop the safety net (typing) and use (void *)

"The generic dilemma is this: do you want slow programmers, slow compilers and bloated binaries, or slow execution times?"

Actually the first option's issue is not slow development, it's choosing between loosing type safety or duplicating code which in both cases, means higher risk of bugs. Yes it also means slower development, but not by a huge amount and that's not a big issue.

Generics for Go

Posted Jul 7, 2020 20:49 UTC (Tue) by NYKevin (subscriber, #129325) [Link] (2 responses)

> When parsing code within a function, such as v := F<T>, at the point of seeing the < it's ambiguous whether we are seeing a type instantiation or an expression using the < operator. Resolving that requires effectively unbounded lookahead. In general we strive to keep the Go parser efficient.

Eh... Python used square brackets to solve this exact problem (which had the bonus of being entirely backwards-compatible with their existing syntax; Python's types are "just" variables pointing to class objects, so applying the subscript operator was already syntactically legal). Perhaps Go didn't want to parse generics as an extension/generalization of the subscript operator? I suppose that saves you the trouble of disambiguating generics from "classical" subscripting at the semantics layer, but you're already going to have to do kind-checking there anyway (i.e. to prohibit Int<List> and similar errors), so additionally checking that the argument *is* a type isn't really that much more work.

Maybe they just thought that parentheses were less ugly than square brackets?

Generics for Go

Posted Jul 8, 2020 13:41 UTC (Wed) by mathstuf (subscriber, #69389) [Link]

> but you're already going to have to do kind-checking there anyway (i.e. to prohibit Int<List> and similar errors), so additionally checking that the argument *is* a type isn't really that much more work.

But you don't want your AST module to have to decide that. It feels like a layering violation to me.

Generics for Go

Posted Jul 9, 2020 17:10 UTC (Thu) by renox (guest, #23785) [Link]

I wonder why everyone just doesn't copy DLang solution: using !( ) for generics: no ambiguity, no need to hack the parser, no conflict with with arrays.

Generics for Go

Posted Jul 8, 2020 21:34 UTC (Wed) by briangordon (guest, #135428) [Link]

So I guess the suggested usage is still to use interface{} when there are no type constraints...
func Keys(type K comparable, V interface{})(m map[K]V) []K {
That being said I do think the choice to forgo covariance and contravariance in favor of simply talking of type constraints as interfaces is very golang. Not exactly the language I would prefer to use but it's at least consistent.

Generics for Go

Posted Aug 10, 2020 23:14 UTC (Mon) by mcortese (guest, #52099) [Link]

I'll tangentially note that a problem with 3 horns is not a "dilemma".


Copyright © 2020, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds