Abhinav Gupta

Understanding Go's token.Pos

Introduction

When writing tooling that interacts with Go code, the packages in the go/* tree of the standard library are both invaluable and necessary. Nearly every package in this tree depends directly on go/token.

Let’s look at a subset of the API exported by go/token.

type File
  func (*File) AddLine(offset int)
  func (*File) Offset(Pos) int
  func (*File) Pos(offset int) Pos
  func (*File) Position(Pos) Position

type FileSet
  func NewFileSet() *FileSet
  func (*FileSet) AddFile(name string, base, size int) *File
  func (*FileSet) File(Pos) *File
  func (*FileSet) Position(Pos) Position

type Pos int

type Position struct {
  Filename string
  Offset   int
  Line     int
  Column   int
}

Keeping the above subset in mind, this post discusses the purpose of these APIs, how they work, and how to use them.

What and Why

A FileSet manages state across zero or more Files. Files get added to the FileSet with the AddFile method. Each File knows its name, length, and offsets within it where new lines start. A Pos is an integer unique across the FileSet that indexes to a specific offset in a specific File in that FileSet.

The key insight here is that Files know offsets at which new lines start, and Pos maps to an offset within a File. Given an offset, and the offsets at which new lines start, the algorithm to calculate the line and column number of that offset is pretty straightforward.
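As a rough sketch of that calculation (lineColumn is a hypothetical helper written for this post, not something go/token exports), a binary search over the line-start offsets is enough:

```go
package main

import (
	"fmt"
	"sort"
)

// lineColumn returns the 1-based line and column for offset, given the
// sorted offsets at which each line starts. lineStarts[0] must be 0.
// Hypothetical helper for illustration; not part of go/token.
func lineColumn(lineStarts []int, offset int) (line, column int) {
	// Find the first line that starts after offset. The line that
	// contains offset is the one just before it.
	i := sort.Search(len(lineStarts), func(i int) bool {
		return lineStarts[i] > offset
	})
	line = i // i-1 is the 0-based index, so i is the 1-based line number
	column = offset - lineStarts[i-1] + 1
	return line, column
}

func main() {
	// Line starts for the content "ab\ncde\n\nfgh\n".
	lineStarts := []int{0, 3, 7, 8}
	for _, off := range []int{0, 4, 7, 10} {
		line, col := lineColumn(lineStarts, off)
		fmt.Printf("offset %d -> %d:%d\n", off, line, col)
	}
	// Output:
	// offset 0 -> 1:1
	// offset 4 -> 2:2
	// offset 7 -> 3:1
	// offset 10 -> 4:3
}
```

The lookup costs O(log n) in the number of lines, which is why storing only line starts is sufficient.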

This makes Pos a cheap representation of positional information which would otherwise need a struct with 4 fields: file name, line number, column number, and offset in file. Pos being an integer also makes moving between offsets a matter of basic arithmetic.

For example, imagine you have 4 files of lengths 10, 15, 5, and 16 bytes. The following visualizes the FileSet built from those files.

File # 1          2               3     4
       +----------+---------------+-----+----------------+
       | 10 bytes | 15 bytes      | 5 b | 16 bytes       |
       +----------+---------------+-----+----------------+
   Pos 1          12              28    34               51

Integers in the range [1, 51) map to a specific file and an offset within that file.
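To make that mapping concrete, here is a minimal sketch of the lookup. The file type and the lookup function are assumptions made up for this illustration; go/token's real implementation differs in its details. Each file owns the positions from its base through base+size, where the last one stands for the end of the file.

```go
package main

import (
	"fmt"
	"sort"
)

// file is a hypothetical stand-in for token.File: a name, the first
// position assigned to the file, and its size in bytes.
type file struct {
	name string
	base int
	size int
}

// lookup returns the name of the file that owns pos and the offset of
// pos within that file. files must be sorted by base.
func lookup(files []file, pos int) (name string, offset int, ok bool) {
	// Find the first file whose base lies beyond pos. The owner, if
	// there is one, is the file just before it.
	i := sort.Search(len(files), func(i int) bool {
		return files[i].base > pos
	}) - 1
	if i < 0 || pos > files[i].base+files[i].size {
		return "", 0, false
	}
	return files[i].name, pos - files[i].base, true
}

func main() {
	files := []file{
		{"file1", 1, 10},
		{"file2", 12, 15},
		{"file3", 28, 5},
		{"file4", 34, 16},
	}
	for _, pos := range []int{5, 12, 30, 40, 51} {
		name, off, ok := lookup(files, pos)
		fmt.Println(pos, name, off, ok)
	}
	// Output:
	// 5 file1 4 true
	// 12 file2 0 true
	// 30 file3 2 true
	// 40 file4 6 true
	// 51  0 false
}
```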

File # 1          2               3     4
       +----------+---------------+-----+----------------+
       | 10 bytes | 15 bytes      | 5 b | 16 bytes       |
       +----------+---------------+-----+----------------+
   Pos 1    ^     12              28    34               51
            |
            5

Given the Pos for offset 5, adding 3 to it moves to the Pos for offset 8 within the same file.

The following table presents other examples of Pos values given the above FileSet.

Pos   File   Offset in File
5     1      4
6     1      5
12    2      0
30    3      2
40    4      6
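These values can be checked against go/token itself. The sketch below registers four files with the sizes used above (the file names and the buildFileSet helper are placeholders chosen for this example) and asks the FileSet to resolve each Pos.

```go
package main

import (
	"fmt"
	"go/token"
)

// buildFileSet registers four files of 10, 15, 5, and 16 bytes.
// A negative base tells AddFile to use the FileSet's current base.
func buildFileSet() *token.FileSet {
	fset := token.NewFileSet()
	fset.AddFile("file1", -1, 10)
	fset.AddFile("file2", -1, 15)
	fset.AddFile("file3", -1, 5)
	fset.AddFile("file4", -1, 16)
	return fset
}

func main() {
	fset := buildFileSet()
	for _, p := range []token.Pos{5, 6, 12, 30, 40} {
		f := fset.File(p)
		fmt.Printf("Pos %d -> %s, offset %d\n", p, f.Name(), f.Offset(p))
	}

	// Pos arithmetic: three bytes past offset 5 is offset 8.
	f := fset.File(6)
	p := f.Pos(5)
	fmt.Println(f.Offset(p + 3))
	// Output:
	// Pos 5 -> file1, offset 4
	// Pos 6 -> file1, offset 5
	// Pos 12 -> file2, offset 0
	// Pos 30 -> file3, offset 2
	// Pos 40 -> file4, offset 6
	// 8
}
```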

Usage

token.NewFileSet creates a new empty FileSet.

fset := token.NewFileSet()

Adding Files

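go/parser handles adding files to the FileSet when parsing Go code, but it may sometimes be necessary to do it yourself. Add a file to the FileSet with AddFile(name, base, size), where name is the file name, base is the Pos at which the file starts (a negative value means "use the FileSet's current base", fset.Base()), and size is the length of the file in bytes.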

file1 := fset.AddFile("file1", -1, 10)  // base == 1
file2 := fset.AddFile("file2", -1, 15)  // base == 12
file3 := fset.AddFile("file3", -1, 5)   // base == 28
file4 := fset.AddFile("file4", -1, 16)  // base == 34

fmt.Println(fset.Base())  // 51, the base the next file would get

Inform a File where new lines begin using one of the following methods: AddLine(offset), SetLines([]int), or SetLinesForContent([]byte).

Note that the offsets accepted by AddLine and SetLines are not those of the newline character (\n). They’re offsets of the character following that: the first character of each new line.

Accessing Positional Information

Retrieve the File for a Pos using the FileSet.File(Pos) method.

file := fset.File(pos)

Convert a Pos to an offset within its File or the reverse with the File.Pos(int) and File.Offset(Pos) methods.

off := file.Offset(pos)  // offset for the Pos
pos = file.Pos(5)        // Pos for offset 5

fmt.Println(file.Pos(file.Offset(pos)) == pos)  // true

Extract positional information from a FileSet with the FileSet.Position(Pos) method. This returns a Position struct.

fmt.Println(fset.Position(pos))  // example.go:5:3

The same information is available from a File with the Position(Pos) method. This is more efficient if you already have the File for a Pos.

file := fset.File(pos)
fmt.Println(file.Position(pos))  // example.go:5:3
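In practice, the Pos values usually come from AST nodes produced by go/parser. Here is a small end-to-end sketch: the source text, the file name, and the declPositions helper are all invented for this example.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
)

const src = `package example

func Hello() {}

var X = 1
`

// declPositions parses src and returns the Position at which each
// top-level declaration starts.
func declPositions(filename, src string) ([]token.Position, error) {
	fset := token.NewFileSet()
	// ParseFile adds the file to fset and records its line starts.
	f, err := parser.ParseFile(fset, filename, src, 0)
	if err != nil {
		return nil, err
	}
	var out []token.Position
	for _, decl := range f.Decls {
		out = append(out, fset.Position(decl.Pos()))
	}
	return out, nil
}

func main() {
	positions, err := declPositions("example.go", src)
	if err != nil {
		panic(err)
	}
	for _, p := range positions {
		fmt.Println(p)
	}
	// Output:
	// example.go:3:1
	// example.go:5:1
}
```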

Correlating Positions in Generated Code

For generated files, positions within the generated output are rarely what a reader wants. Positions in the source that the file was generated from are more useful. go/token supports this with the File.AddLineColumnInfo(offset, filename, line, column) method.

file.AddLineColumnInfo(5, "src.in", 15, 1)

This indicates that the contents of the file from offset 5 onwards correspond to line 15 and column 1 of src.in. The Positions for these offsets will be in src.in, relative to line 15 and column 1.

fmt.Println(file.Position(file.Pos(5)))  // src.in:15:1
fmt.Println(file.Position(file.Pos(6)))  // src.in:15:2
fmt.Println(file.Position(file.Pos(7)))  // src.in:16:1

go/scanner uses this API to interpret //line directives.

Raw positional information is still accessible with the PositionFor(Pos, bool) method on File and FileSet. The second argument specifies whether to respect positional overrides in calculating the Position.

fmt.Println(file.PositionFor(file.Pos(5), false))  // example.go:3:2
fmt.Println(file.PositionFor(file.Pos(6), false))  // example.go:3:3
fmt.Println(file.PositionFor(file.Pos(7), false))  // example.go:4:1
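Putting the two views side by side, the sketch below builds a File whose line table is chosen to match the outputs shown above. The 10-byte size, the line starts, and the newFile helper are assumptions made for this example.

```go
package main

import (
	"fmt"
	"go/token"
)

// newFile builds a 10-byte file whose lines start at offsets 0, 2, 4,
// and 7, then declares that offset 5 onwards comes from src.in:15:1.
func newFile() *token.File {
	f := token.NewFileSet().AddFile("example.go", -1, 10)
	f.SetLines([]int{0, 2, 4, 7})
	f.AddLineColumnInfo(5, "src.in", 15, 1)
	return f
}

func main() {
	f := newFile()
	for _, off := range []int{5, 6, 7} {
		p := f.Pos(off)
		// Position applies the override; PositionFor with false ignores it.
		fmt.Println(f.Position(p), f.PositionFor(p, false))
	}
	// Output:
	// src.in:15:1 example.go:3:2
	// src.in:15:2 example.go:3:3
	// src.in:16:1 example.go:4:1
}
```

Offset 7 starts a new line in the generated file, so the adjusted line advances to 16 and the column falls back to the raw column on that line.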

Conclusion

token.Pos and related types use a clever technique to cheaply and efficiently track and represent positional information in a parser. You can reuse this technique in your own systems if you are writing a parser.

Written on Mar 28, 2019.