When writing tooling that interacts with Go code, the packages in the go/* tree in the standard library are both invaluable and necessary.
All packages in this tree have a direct dependency on go/token.
Let’s look at a subset of the API exported by go/token.
type File
func (*File) AddLine(offset int)
func (*File) Offset(Pos) int
func (*File) Pos(offset int) Pos
func (*File) Position(Pos) Position
type FileSet
func NewFileSet() *FileSet
func (*FileSet) AddFile(name string, base, size int) *File
func (*FileSet) File(Pos) *File
func (*FileSet) Position(Pos) Position
type Pos int
type Position struct {
    Filename string
    Offset   int
    Line     int
    Column   int
}
Keeping the above subset in mind, this post discusses the purpose of these APIs, how they work, and how to use them.
1. What and why
FileSet
A FileSet manages state across zero or more Files.
File
Files get added to the FileSet with the AddFile method. Each File knows its name, length, and the offsets within it where new lines start.
Pos
A Pos is an integer unique across the FileSet that indexes to a specific offset in a specific File in that FileSet.
The key insight here is that Files know the offsets at which new lines start, and Pos maps to an offset within a File.
Given an offset, and the offsets at which new lines start, the algorithm to calculate the line and column number of that offset is pretty straightforward.
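The calculation described above can be sketched as a binary search over the line-start offsets. This is only an illustration of the idea, not go/token’s actual implementation: lineCol is a hypothetical helper, and the sample content is made up.

```go
package main

import (
	"fmt"
	"sort"
)

// lineCol returns the 1-based line and column for offset, given the sorted
// offsets at which each line starts (lineStarts[0] is expected to be 0).
// Hypothetical helper for illustration; it is not part of go/token.
func lineCol(lineStarts []int, offset int) (line, col int) {
	// Find the first line that starts after offset.
	// The line just before it is the one containing offset.
	i := sort.Search(len(lineStarts), func(i int) bool {
		return lineStarts[i] > offset
	})
	if i == 0 {
		return 0, 0 // offset precedes the first recorded line
	}
	return i, offset - lineStarts[i-1] + 1
}

func main() {
	// For the content "ab\ncde\nf", lines begin at offsets 0, 3, and 7.
	lineStarts := []int{0, 3, 7}
	fmt.Println(lineCol(lineStarts, 4)) // 2 2 (the 'd' in "cde")
}
```

Because the line starts are sorted, the lookup costs a logarithmic number of comparisons, which is part of what makes carrying around a bare integer practical.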
This makes Pos a cheap representation of positional information which would otherwise need a struct with 4 fields: file name, line number, column number, and offset in file.
Pos, being an integer, also makes moving between offsets a matter of basic arithmetic.
For example, imagine you have 4 files of lengths 10, 15, 5, and 16 bytes. The following visualizes the FileSet built from those files.
File #      1             2          3           4
       +----------+---------------+-----+----------------+
       | 10 bytes |   15 bytes    | 5 b |    16 bytes    |
       +----------+---------------+-----+----------------+
Pos    1          12              28    34               51
Integers in the range [1, 51) map to a specific file and an offset within that file.
File #      1             2          3           4
       +----------+---------------+-----+----------------+
       | 10 bytes |   15 bytes    | 5 b |    16 bytes    |
       +----------+---------------+-----+----------------+
Pos    1    ^     12              28    34               51
            |
            5
Given the Pos for offset 5, adding 3 to it moves to the Pos for offset 8 within the same file.
The following table presents other examples of Pos values given the above FileSet.
Pos | File | Offset in File |
---|---|---|
5 | 1 | 4 |
6 | 1 | 5 |
12 | 2 | 0 |
30 | 3 | 2 |
40 | 4 | 6 |
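The table can be checked against go/token directly. The sketch below rebuilds the four-file FileSet and looks up each Pos; buildFileSet and locate are hypothetical helpers, and the file names file1 through file4 are placeholders. Passing a negative base to AddFile asks the FileSet to continue from its current Base.

```go
package main

import (
	"fmt"
	"go/token"
)

// buildFileSet recreates the four-file FileSet from the diagram.
func buildFileSet() *token.FileSet {
	fset := token.NewFileSet()
	for i, size := range []int{10, 15, 5, 16} {
		// A negative base means "start where the previous file's range ended".
		fset.AddFile(fmt.Sprintf("file%d", i+1), -1, size)
	}
	return fset
}

// locate reports the name of the file that pos falls in and the offset
// within that file.
func locate(fset *token.FileSet, pos token.Pos) (name string, offset int) {
	f := fset.File(pos)
	if f == nil {
		return "", -1 // pos doesn't belong to any file in the set
	}
	return f.Name(), f.Offset(pos)
}

func main() {
	fset := buildFileSet()
	for _, pos := range []token.Pos{5, 6, 12, 30, 40} {
		name, off := locate(fset, pos)
		fmt.Printf("Pos %d -> %s, offset %d\n", pos, name, off)
	}

	// Pos arithmetic: move 3 bytes forward from offset 5 in the first file.
	file1 := fset.File(1)
	p := file1.Pos(5)
	fmt.Println(file1.Offset(p + 3)) // 8
}
```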
2. Usage
token.NewFileSet creates a new empty FileSet.
fset := token.NewFileSet()
2.1. Adding files
go/parser handles adding files to the FileSet when parsing Go code, but it may sometimes be necessary to do it yourself.
Add a file to the FileSet with AddFile(name, base, size), where:
name
Name of the file. It’s not required to be unique across the FileSet.
base
Position within the FileSet at which the range for this file starts. Set this to -1 to say that the range for the new file starts when the range for the previous file ends.
size
Number of bytes in the file.
For example,
file1 := fset.AddFile("file1", -1, 10) // base == 1
file2 := fset.AddFile("file2", -1, 15) // base == 12
file3 := fset.AddFile("file3", -1, 5) // base == 28
file4 := fset.AddFile("file4", -1, 16) // base == 34
fmt.Println(fset.Base()) // base == 51
Inform Files where new lines begin using one of the following methods.
- Zero or more file.AddLine(offset) calls informing it of the offsets at which each new line begins. go/scanner uses this API as it encounters newlines while tokenizing a Go file.
- A single file.SetLines([]int) call which accepts a series of offsets of the first characters of each line.
- A single file.SetLinesForContent([]byte) call which accepts the contents of the file.
📝 Note: The offsets accepted by AddLine and SetLines are not those of the newline character (\n). They’re offsets of the character following that: the first character of each new line.
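To make the three options concrete, here is a small sketch that records the same line table in three different ways and confirms that they agree. The content "ab\ncde\nf" and the names lineTables, describe, a.txt, b.txt, and c.txt are made up for illustration.

```go
package main

import (
	"fmt"
	"go/token"
)

// lineTables registers the same 8-byte content three times, each time
// recording the line starts with a different method.
func lineTables() (a, b, c *token.File) {
	src := []byte("ab\ncde\nf") // lines start at offsets 0, 3, and 7
	fset := token.NewFileSet()

	// AddLine: report line starts one at a time.
	// A new File already has offset 0 recorded as its first line.
	a = fset.AddFile("a.txt", -1, len(src))
	a.AddLine(3) // 'c', the character after the first '\n'
	a.AddLine(7) // 'f', the character after the second '\n'

	// SetLines: hand over the whole table at once.
	b = fset.AddFile("b.txt", -1, len(src))
	if !b.SetLines([]int{0, 3, 7}) {
		panic("SetLines rejected the offsets")
	}

	// SetLinesForContent: let go/token derive the table from the bytes.
	c = fset.AddFile("c.txt", -1, len(src))
	c.SetLinesForContent(src)
	return a, b, c
}

// describe formats the Position of the given offset within f.
func describe(f *token.File, offset int) string {
	return f.Position(f.Pos(offset)).String()
}

func main() {
	a, b, c := lineTables()
	for _, f := range []*token.File{a, b, c} {
		// Offset 4 is the 'd' on the second line.
		fmt.Println(describe(f, 4), f.LineCount())
	}
}
```

All three files should report line 2, column 2 for offset 4, and a line count of 3.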
2.2. Accessing positional information
Retrieve the File for a Pos using the FileSet.File(Pos) method.
file := fset.File(pos)
Convert a Pos to an offset within its File with the File.Offset(Pos) method, or do the reverse with the File.Pos(int) method.
off := file.Offset(pos) // offset for the Pos
pos = file.Pos(5)       // Pos for offset 5
fmt.Println(file.Pos(file.Offset(pos)) == pos) // true
Extract positional information from a FileSet with the FileSet.Position(Pos) method. This returns a Position struct.
fmt.Println(fset.Position(pos)) // example.go:5:3
The same information is available from a File with the Position(Pos) method. This is more efficient if you already have the File for a Pos.
file := fset.File(pos)
fmt.Println(file.Position(pos)) // example.go:5:3
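Putting these lookups together, the following sketch parses a small source string with go/parser and resolves the Pos of each function declaration. The source text, the example.go file name, and the funcPositions helper are assumptions made for the example.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// src is a made-up input; the raw string starts right at "package".
const src = `package example

func Hello() {}

func World() {}
`

// funcPositions parses src and returns "name at file:line:column" for
// every top-level function declaration.
func funcPositions() ([]string, error) {
	fset := token.NewFileSet()
	// go/parser adds the file to fset and records its line starts.
	f, err := parser.ParseFile(fset, "example.go", src, 0)
	if err != nil {
		return nil, err
	}
	var out []string
	for _, decl := range f.Decls {
		fn, ok := decl.(*ast.FuncDecl)
		if !ok {
			continue
		}
		pos := fn.Pos() // a token.Pos, meaningful only together with fset
		// Look up the File first, then ask it for the Position.
		file := fset.File(pos)
		out = append(out, fmt.Sprintf("%s at %s", fn.Name.Name, file.Position(pos)))
	}
	return out, nil
}

func main() {
	lines, err := funcPositions()
	if err != nil {
		panic(err)
	}
	for _, l := range lines {
		fmt.Println(l)
	}
	// Hello at example.go:3:1
	// World at example.go:5:1
}
```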
2.3. Correlating positions in generated code
For generated files, positions within the generated output are rarely useful on their own. It’s more useful to correlate them to the source files from which the output was generated.
go/token supports this with the File.AddLineColumnInfo(offset, filename, line, column) method.
file.AddLineColumnInfo(5, "src.in", 15, 1)
This indicates that the contents of the file from offset 5 onwards correspond to line 15 and column 1 of src.in.
The Positions for these offsets will be in src.in, relative to line 15 and column 1.
fmt.Println(file.Position(file.Pos(5))) // src.in:15:1
fmt.Println(file.Position(file.Pos(6))) // src.in:15:2
fmt.Println(file.Position(file.Pos(7))) // src.in:16:1
go/scanner uses this API to interpret //line directives.
Raw positional information is still accessible with the PositionFor(Pos, bool) method on File and FileSet.
The second argument specifies whether to respect positional overrides in calculating the Position: pass true to apply them, or false to ignore them.
fmt.Println(file.PositionFor(file.Pos(5), false)) // example.go:3:2
fmt.Println(file.PositionFor(file.Pos(6), false)) // example.go:3:3
fmt.Println(file.PositionFor(file.Pos(7), false)) // example.go:4:1
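The outputs above can be reproduced with a file whose line table is set by hand. In this sketch, the 10-byte size and the line starts at offsets 0, 2, 4, and 7 are assumptions chosen so that offset 5 falls on line 3, column 2 of example.go; newGeneratedFile is a hypothetical helper.

```go
package main

import (
	"fmt"
	"go/token"
)

// newGeneratedFile builds a 10-byte file whose lines start at offsets
// 0, 2, 4, and 7, then maps everything from offset 5 onwards to
// line 15, column 1 of src.in.
func newGeneratedFile() *token.File {
	fset := token.NewFileSet()
	file := fset.AddFile("example.go", -1, 10)
	if !file.SetLines([]int{0, 2, 4, 7}) {
		panic("SetLines rejected the offsets")
	}
	file.AddLineColumnInfo(5, "src.in", 15, 1)
	return file
}

func main() {
	file := newGeneratedFile()
	for _, off := range []int{5, 6, 7} {
		pos := file.Pos(off)
		// Position applies the override; PositionFor(pos, false) ignores it.
		fmt.Printf("%v (raw: %v)\n", file.Position(pos), file.PositionFor(pos, false))
	}
	// src.in:15:1 (raw: example.go:3:2)
	// src.in:15:2 (raw: example.go:3:3)
	// src.in:16:1 (raw: example.go:4:1)
}
```

Offsets before 5 are unaffected by the override and keep reporting positions in example.go.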