Golang JSON encoding / decoding and text / HTML templates

1, JSON encoding and decoding

JSON is a Unicode textual encoding of various types of values in JavaScript -- strings, numbers, Booleans, and objects. It can represent basic data types and aggregate data types such as array, slice, structure and map in an effective and readable way.

Consider an application that collects Movie reviews and provides feedback. Its Movie data type and a typical list of values representing the Movie are shown below. (in the struct declaration, the string literal after the Year and Color members is the struct member Tag)

type Movie struct {
    Title  string
    Year   int  `json:"released"`
    Color  bool `json:"color,omitempty"`
    Actors []string
}

var movies = []Movie{
    {Title: "Casablanca", Year: 1942, Color: false,
        Actors: []string{"Humphrey Bogart", "Ingrid Bergman"}},
    {Title: "Cool Hand Luke", Year: 1967, Color: true,
        Actors: []string{"Paul Newman"}},
    {Title: "Bullitt", Year: 1968, Color: true,
        Actors: []string{"Steve McQueen", "Jacqueline Bisset"}},
}

Such a data structure is particularly suitable for JSON format, and it is easy to transform between the two. The process of changing a movie like structure slice into JSON in Go language is called marshalling. Marshalling is done by calling the json.Marshal function:

data, err := json.Marshal(movies)
if err != nil {
    log.Fatalf("JSON marshaling failed: %s", err)
}
fmt.Printf("%s\n", data)

The Marshal function returns a encoded byte slice, which contains a long string and has no blank indentation; I wrap it to display:

[{"Title":"Casablanca","released":1942,"Actors":["Humphrey Bogart","Ingr id Bergman"]},
{"Title":"Cool Hand Luke","released":1967,"color":true,"Actors":["Paul Newman"]},
{"Title":"Bullitt","released":1968,"color":true,"Actors":["Steve McQueen","Jacqueline Bisset"]}]

Although this compact representation contains all the information, it is difficult to read. To generate a format that is easy to read, another JSON. Marshaldindent function will produce indented output. This function has two additional string parameters to represent the prefix of each line output and the indentation of each level:

data, err := json.MarshalIndent(movies, "", "    ")
if err != nil {
    log.Fatalf("JSON marshaling failed: %s", err)
}
fmt.Printf("%s\n", data)

The above code will produce output like this (Note: there is no comma separator after the last member or element):

[
    {
        "Title": "Casablanca",
        "released": 1942,
        "Actors": [
            "Humphrey Bogart",
            "Ingrid Bergman"
        ]
    },
    {
        "Title": "Cool Hand Luke",
        "released": 1967,
        "color": true,
        "Actors": [
            "Paul Newman"
        ]
    },
    {
        "Title": "Bullitt",
        "released": 1968,
        "color": true,
        "Actors": [
            "Steve McQueen",
            "Jacqueline Bisset"
        ]
    }
]

When encoding, the member name of the Go language structure is used as the JSON object by default. Only exported struct members are encoded, which is why we choose member names that start with uppercase letters.

Careful readers may have noticed that the members of the Year name become released after encoding, and the color members become the color with lowercase letters after encoding. This is caused by the construct member Tag. A construct member Tag is a meta information string associated with the member during compilation:

Year  int  `json:"released"`
Color bool `json:"color,omitempty"`

The member Tag of a structure can be any string face value, but it is usually a series of key value pairs separated by spaces: "value". Because the value contains double quotation marks, the member Tag is usually written in the form of a native string face value. The value corresponding to the beginning key name of JSON is used to control the encoding and decoding behavior of encoding/json package, and encoding / The following other packages follow this Convention. The first part of JSON corresponding value in member Tag is used to specify the name of JSON object, such as mapping total count member in Go language to total count object in JSON. The Tag of the Color member also comes with an additional option, which means that JSON objects are not generated when the Go language structure member is null or zero (where false is zero). As expected, Casablanca is a black-and-white movie with no output Color member.

The reverse operation of encoding is decoding, corresponding to decoding JSON data into the data structure of Go language. In Go language, it is generally called unmarshaling, which is completed by json.Unmarshal function. The following code decodes the movie data in JSON format into a structure slice with only the Title member. By defining the appropriate Go language data structure, we can selectively decode the interested members in JSON. When the Unmarshal function call returns, slice will be filled with only the Title information value, and other JSON members will be ignored.

var titles []struct{ Title string }
if err := json.Unmarshal(data, &titles); err != nil {
    log.Fatalf("JSON unmarshaling failed: %s", err)
}
fmt.Println(titles) // "[{Casablanca} {Cool Hand Luke} {Bullitt}]"

Many web services provide JSON interfaces, which send JSON format requests and return JSON format information through the HTTP interface. To illustrate this, we demonstrate a similar use through Github's issue query service. First, we define the appropriate types and constants:

package github

import "time"

const IssuesURL = "https://api.github.com/search/issues"

type IssuesSearchResult struct {
    TotalCount int `json:"total_count"`
    Items          []*Issue
}

type Issue struct {
    Number    int
    HTMLURL   string `json:"html_url"`
    Title     string
    State     string
    User      *User
    CreatedAt time.Time `json:"created_at"`
    Body      string    // in Markdown format
}

type User struct {
    Login   string
    HTMLURL string `json:"html_url"`
}

As before, even if the corresponding JSON object name is a lowercase letter, the member name of each structure is declared to start with an uppercase letter. Because some JSON member names are not the same as the Go struct member names, you need the Go language struct member Tag to specify the corresponding JSON names. Similarly, we need to do the same when decoding. The GitHub service returns a lot more information than we defined.

The SearchIssues function makes an HTTP request and decodes the returned JSON formatted result. Because the query conditions provided by users may contain special characters such as? And &, in order to avoid URL conflicts, we use url.QueryEscape to escape the special characters in the query.

package github

import (
    "encoding/json"
    "fmt"
    "net/http"
    "net/url"
    "strings"
)

func SearchIssues(terms []string) (*IssuesSearchResult, error) {
    q := url.QueryEscape(strings.Join(terms, " "))
    resp, err := http.Get(IssuesURL + "?q=" + q)
    if err != nil {
        return nil, err
    }

    if resp.StatusCode != http.StatusOK {
        resp.Body.Close()
        return nil, fmt.Errorf("search query failed: %s", resp.Status)
    }

    var result IssuesSearchResult
    if err := json.NewDecoder(resp.Body).Decode(&result); err != nil {
        resp.Body.Close()
        return nil, err
    }
    resp.Body.Close()
    return &result, nil
}

In an earlier example, we used the json.Unmarshal function to decode a JSON formatted string into a byte slice. But in this example, we use a stream based decoder, json.Decoder, which can decode JSON data from an input stream and a json.Encoder encoding object for an output stream.

We call the Decode method to populate the variable. There are many ways to format structures. Here's the simplest way to print each issue with a fixed width, but in the next section we'll see how to use templates to output complex formats.

package main

import (
    "fmt"
    "log"
    "os"

    "gopl.io/ch4/github"
)

func main() {
    result, err := github.SearchIssues(os.Args[1:])
    if err != nil {
        log.Fatal(err)
    }
    fmt.Printf("%d issues:\n", result.TotalCount)
    for _, item := range result.Items {
        fmt.Printf("#%-5d %9.9s %.55s\n",
            item.Number, item.User.Login, item.Title)
    }
}

Specify the retrieval criteria through command line parameters. The following command is to query the problems related to JSON decoding in the Go language project, as well as the results returned by the query:

$ go build gopl.io/ch4/issues
$ ./issues repo:golang/go is:open json decoder
13 issues:
#5680    eaigner encoding/json: set key converter on en/decoder
#6050  gopherbot encoding/json: provide tokenizer
#8658  gopherbot encoding/json: use bufio
#8462  kortschak encoding/json: UnmarshalText confuses json.Unmarshal
#5901        rsc encoding/json: allow override type marshaling
#9812  klauspost encoding/json: string tag not symmetric
#7872  extempora encoding/json: Encoder internally buffers full output
#9650    cespare encoding/json: Decoding gives errPhase when unmarshalin
#6716  gopherbot encoding/json: include field name in unmarshal error me
#6901  lukescott encoding/json, encoding/xml: option to treat unknown fi
#6384    joeshaw encoding/json: encode precise floating point integers u
#6647    btracey x/tools/cmd/godoc: display type kind of each named type
#4237  gjemiller encoding/base64: URLEncoding padding is optional
2, Text and HTML templates

A template is a string or a file that contains one or more {{action}} objects contained in double curly braces. Most of the strings are printed only as literals, but other behaviors are triggered for the actions section. Each action contains an expression written in the template language. Although an action is short, it can output complex print values. The template language includes many features such as selecting members of the structure, calling functions or methods, expression control flow if else statement and range loop sentence, and other instantiation templates. Here is a simple template string:

const templ = `{{.TotalCount}} issues:
{{range .Items}}----------------------------------------
Number: {{.Number}}
User:   {{.User.Login}}
Title:  {{.Title | printf "%.64s"}}
Age:    {{.CreatedAt | daysAgo}} days
{{end}}`

For each action, there is a current value concept, corresponding to the point operator, which is written with ".". The current value "." is initially initialized as a parameter when calling the template. In the current example, it corresponds to a variable of type github.IssuesSearchResult. The action corresponding to {. TotalCount}} in the template will be expanded to the value printed by the TotalCount member in the structure by default. In the template, {range. Items}} and {end}} correspond to a circular action, so their direct content may be expanded many times, and the current value of each iteration of the circular corresponds to the value of the current items element.

In an action, the| operator represents taking the result of the previous expression as the input of the latter function, similar to the concept of pipeline in UNIX. In the action of the Title line, the second operation is a printf function, which is a built-in function based on the implementation of fmt.Sprintf. All templates can be used directly. For the Age part, the second action is a function called daysAgo, which uses the time.Since function to convert the CreatedAt member to the past time length:

func daysAgo(t time.Time) int {
    return int(time.Since(t).Hours() / 24)
}

Note that the parameter type of CreatedAt is time.Time, not string. In the same way, we can control the formatting of strings by defining some methods. A type can also customize its own JSON encoding and decoding behavior. The JSON value corresponding to the time.Time type is a string in standard time format.

Generating the output of a template requires two steps. The first step is to analyze the template and turn it into an internal representation, and then execute the template based on the specified input. Generally, the analysis template part only needs to be executed once. The following code creates and analyzes the template templ defined above. Pay attention to the order of method call chain: template.New creates and returns a template first; the Funcs method registers the custom functions such as daysAgo into the template and returns to the template; finally, Parse function is used to analyze the template.

report, err := template.New("report").
    Funcs(template.FuncMap{"daysAgo": daysAgo}).
    Parse(templ)
if err != nil {
    log.Fatal(err)
}

Because templates are usually tested at compile time, it is a fatal error if template parsing fails. The template.most helper function can simplify the handling of this fatal error: it takes a template and an error type parameter, detects whether the error is nil (if not, issues a panic exception), and then returns the passed in template.

Once the template has been created and registered with the daysAgo function, and through analysis and detection, we can use github.IssuesSearchResult as the input source and os.Stdout as the output source to execute the template:

var report = template.Must(template.New("issuelist").
    Funcs(template.FuncMap{"daysAgo": daysAgo}).
    Parse(templ))

func main() {
    result, err := github.SearchIssues(os.Args[1:])
    if err != nil {
        log.Fatal(err)
    }
    if err := report.Execute(os.Stdout, result); err != nil {
        log.Fatal(err)
    }
}

The program outputs a plain text report:

$ go build gopl.io/ch4/issuesreport
$ ./issuesreport repo:golang/go is:open json decoder
13 issues:
----------------------------------------
Number: 5680
User:      eaigner
Title:     encoding/json: set key converter on en/decoder
Age:       750 days
----------------------------------------
Number: 6050
User:      gopherbot
Title:     encoding/json: provide tokenizer
Age:       695 days
----------------------------------------
...

Now let's go to the html/template template package. It uses the same API and template language as the text/template package, but adds a feature to automatically escape strings, which can avoid conflicts between input strings and HTML, JavaScript, CSS or URL syntax. This feature can also avoid some long-standing security problems, such as generating HTML injection attacks, constructing a problem title with malicious code, which may make the template output the wrong output, so that they can control the page.

The following template outputs the issue list in HTML format. Note the difference in the import statement:

import "html/template"

var issueList = template.Must(template.New("issuelist").Parse(`
<h1>{{.TotalCount}} issues</h1>
<table>
<tr style='text-align: left'>
  <th>#</th>
  <th>State</th>
  <th>User</th>
  <th>Title</th>
</tr>
{{range .Items}}
<tr>
  <td><a href='{{.HTMLURL}}'>{{.Number}}</a></td>
  <td>{{.State}}</td>
  <td><a href='{{.User.HTMLURL}}'>{{.User.Login}}</a></td>
  <td><a href='{{.HTMLURL}}'>{{.Title}}</a></td>
</tr>
{{end}}
</table>
`))

The following command performs a slightly different query on the new template:

$ go build gopl.io/ch4/issueshtml
$ ./issueshtml repo:golang/go commenter:gopherbot json encoder >issues.html

Figure 4.4 shows the rendering in a web browser. Each issue contains a link to the corresponding Github page.

The issue in Figure 4.4 does not contain special characters that conflict with the HTML format, but we will see the issue with the & and < characters in the title soon. The following command selects two such issues:

$ ./issueshtml repo:golang/go 3133 10535 >issues2.html

Figure 4.5 shows the resu lt s of the query. Note that the html/template package has automatically escaped special characters, so we can still see the correct literal value. If we use the text/template package, these two issue s will generate errors, in which "< 4 characters will be treated as less than" < ", and the" "string will be treated as a link element, which will lead to changes in the structure of HTML documents, leading to unknown risks.

We can also suppress this auto escaping behavior by using the template. HTML type for trusted HTML strings. There are many string types named by type corresponding to trusted JavaScript, CSS and URL respectively. The following program demonstrates two different results of using the same string of different types: A is a normal string, and B is a trusted template.HTML string type.

func main() {
    const templ = `<p>A: {{.A}}</p><p>B: {{.B}}</p>`
    t := template.Must(template.New("escape").Parse(templ))
    var data struct {
        A string        // untrusted plain text
        B template.HTML // trusted HTML
    }
    data.A = "<b>Hello!</b>"
    data.B = "<b>Hello!</b>"
    if err := t.Execute(os.Stdout, data); err != nil {
        log.Fatal(err)
    }
}

Figure 4.6 shows the template output that appears in the browser. We see that the black mark of A has been escaped, but B has not.

We only cover the most basic features of the template system here. As always, if you want to learn more, check the package documentation yourself:

$ go doc text/template
$ go doc html/template

If you have any questions, please leave a message in the comment area and discuss together!

Published 5 original articles, praised 0, visited 45
Private letter follow

Tags: JSON encoding github Go

Posted on Sun, 12 Jan 2020 23:12:50 -0500 by lawnmowerman