Abhinav Gupta | About

Parsing Newline Delimited JSON in Go

Introduction

Newline Delimited JSON (ndjson) is a format for representing a stream of structured objects, with one JSON value per line. A stream in this format takes the following form.

{"base": "white rice", "proteins": ["tofu"]}
{"base": "salad", "proteins": ["tuna", "salmon"]}

We can parse such a stream with Go’s encoding/json package.

Parsing JSON

Before delving into Newline Delimited JSON, let’s review how to parse a single JSON value with encoding/json.

Consider the Order type.

type Order struct {
	Base     string   `json:"base"`
	Proteins []string `json:"proteins"`
}

To parse a JSON payload into an Order object, we use json.Unmarshal.

var order Order
if err := json.Unmarshal(data, &order); err != nil {
	return fmt.Errorf("parse order: %w", err)
}

fmt.Println(order)
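
For a version that can be run as-is, here is a minimal sketch that wraps the json.Unmarshal call above in a small helper. The helper name parseOrder and the sample payload (borrowed from the introduction) are assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Order mirrors the type defined earlier in this post.
type Order struct {
	Base     string   `json:"base"`
	Proteins []string `json:"proteins"`
}

// parseOrder is a hypothetical helper: it decodes a single JSON document
// held entirely in memory into an Order.
func parseOrder(data []byte) (Order, error) {
	var order Order
	if err := json.Unmarshal(data, &order); err != nil {
		return Order{}, fmt.Errorf("parse order: %w", err)
	}
	return order, nil
}

func main() {
	data := []byte(`{"base": "white rice", "proteins": ["tofu"]}`)

	order, err := parseOrder(data)
	if err != nil {
		fmt.Println(err)
		return
	}

	fmt.Println(order) // prints {white rice [tofu]}
}
```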

Using json.Decoder

The encoding/json package exports json.Decoder, which provides more control over JSON parsing. Part of its interface is shown below.

type Decoder
	func NewDecoder(io.Reader) *Decoder
	func (*Decoder) Decode(interface{}) error

The following use of json.NewDecoder to parse a single JSON value is roughly equivalent to the prior use of json.Unmarshal.

var order Order
if err := json.NewDecoder(src).Decode(&order); err != nil {
	return fmt.Errorf("parse order: %w", err)
}

fmt.Println(order)

It’s roughly equivalent with one significant difference: the input is now an io.Reader instead of a []byte.

json.Unmarshal(data, &order)        // data is a []byte
json.NewDecoder(src).Decode(&order) // src is an io.Reader

Parsing Newline Delimited JSON

json.Decoder can parse a JSON value from an io.Reader without reading its entire contents.

Additionally, if Decode is called on the same decoder multiple times, each call parses the next JSON value from the io.Reader.

decoder := json.NewDecoder(src)

var o1 Order
if err := decoder.Decode(&o1); err != nil {
	return err
}

var o2 Order
if err := decoder.Decode(&o2); err != nil {
	return err
}

// ...
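
To see this behavior end to end, here is a self-contained sketch. It assumes an in-memory string stands in for the stream, and the helper name decodeTwo is made up for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Order mirrors the type defined earlier in this post.
type Order struct {
	Base     string   `json:"base"`
	Proteins []string `json:"proteins"`
}

// decodeTwo is a hypothetical helper: it reads two consecutive JSON values
// from input using a single decoder.
func decodeTwo(input string) (Order, Order, error) {
	var o1, o2 Order

	// strings.NewReader stands in for any io.Reader, such as a file.
	decoder := json.NewDecoder(strings.NewReader(input))

	if err := decoder.Decode(&o1); err != nil {
		return o1, o2, fmt.Errorf("parse first order: %w", err)
	}
	// The second call picks up where the first one stopped.
	if err := decoder.Decode(&o2); err != nil {
		return o1, o2, fmt.Errorf("parse second order: %w", err)
	}
	return o1, o2, nil
}

func main() {
	input := `{"base": "white rice", "proteins": ["tofu"]}
{"base": "salad", "proteins": ["tuna", "salmon"]}
`
	o1, o2, err := decodeTwo(input)
	if err != nil {
		fmt.Println(err)
		return
	}

	fmt.Println(o1) // prints {white rice [tofu]}
	fmt.Println(o2) // prints {salad [tuna salmon]}
}
```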

However, it’s not obvious when we should stop calling Decode. To help with that, json.Decoder includes the More() bool method, which, when called between top-level values, reports whether there’s more JSON input available on the io.Reader.

type Decoder
	func (*Decoder) More() bool

With its help, we can use json.Decoder to parse Newline Delimited JSON like so.

decoder := json.NewDecoder(src)
for decoder.More() {
	var order Order
	if err := decoder.Decode(&order); err != nil {
		return fmt.Errorf("parse order: %w", err)
	}

	fmt.Println(order)
}
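
Putting the pieces together, the loop above can be sketched as a complete program. The names parseOrders and parseOrdersString are hypothetical, and collecting the results into a slice (rather than printing each order inside the loop) is a choice made for this example only.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// Order mirrors the type defined earlier in this post.
type Order struct {
	Base     string   `json:"base"`
	Proteins []string `json:"proteins"`
}

// parseOrders is a hypothetical helper: it decodes Newline Delimited JSON
// from src, one Order per value, until More reports no further input.
func parseOrders(src io.Reader) ([]Order, error) {
	var orders []Order

	decoder := json.NewDecoder(src)
	for decoder.More() {
		var order Order
		if err := decoder.Decode(&order); err != nil {
			return nil, fmt.Errorf("parse order: %w", err)
		}
		orders = append(orders, order)
	}
	return orders, nil
}

// parseOrdersString is a convenience wrapper for in-memory input.
func parseOrdersString(s string) ([]Order, error) {
	return parseOrders(strings.NewReader(s))
}

func main() {
	input := `{"base": "white rice", "proteins": ["tofu"]}
{"base": "salad", "proteins": ["tuna", "salmon"]}
`
	orders, err := parseOrdersString(input)
	if err != nil {
		fmt.Println(err)
		return
	}

	// Prints {white rice [tofu]} and then {salad [tuna salmon]}.
	for _, order := range orders {
		fmt.Println(order)
	}
}
```

Decode also returns io.EOF once the input is exhausted, so looping until that error appears is another way to detect the end of the stream.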

Written on 2020-07-02.