1. Introduction
With gopkg.in/yaml.v2, you can evolve a YAML shape in a backwards-compatible manner by implementing the yaml.Unmarshaler interface on your type.
Consider the following shape.
type Config struct {
	Users []string `yaml:"users"`
}
Valid YAML input for this shape takes the following form.
users:
- alice
- bob
- carol
- dave
Suppose that the program evolves over time, and there is now a need to optionally specify a role for each user. Valid roles are: User, Mod, and Admin. User is the default role.
type Role int

const (
	RoleUser Role = iota
	RoleMod
	RoleAdmin
)

func (r Role) String() string {
	switch r {
	case RoleUser:
		return "user"
	case RoleMod:
		return "mod"
	case RoleAdmin:
		return "admin"
	default:
		// unknown role
		return fmt.Sprintf("Role(%d)", int(r))
	}
}
💡 Tip
If there is no obvious default, start enumerations at one.
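To illustrate the tip, here is a minimal sketch of an enumeration that has no natural default. The Color type and its Valid method are hypothetical names chosen for this example; they are not part of the Config code in this article. Starting at one leaves the zero value unassigned, so an unset field can be told apart from every real value.

```go
package main

import "fmt"

// Color is a hypothetical enumeration with no obvious default value.
type Color int

const (
	// iota + 1 keeps zero unassigned, so a zero Color means "not set".
	ColorRed Color = iota + 1
	ColorGreen
	ColorBlue
)

// Valid reports whether c is one of the declared colors.
func (c Color) Valid() bool {
	return c >= ColorRed && c <= ColorBlue
}

func main() {
	var unset Color // zero value, never assigned
	fmt.Println(unset.Valid(), ColorRed.Valid()) // prints: false true
}
```

Role does have an obvious default (User), which is why it starts at zero instead.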
2. False start
One way to quickly hack in support for roles is by adding a separate roles section.
roles:
  admins: [alice]
  mods: [dave]
This is brittle, complex, and inflexible.
- Brittle: End users must now duplicate names from the users section into the roles section. It is easy to get a name wrong with a bad copy-paste or a typo.
- Complex: The implementation must now cross-validate entries between users and roles.
- Inflexible: This leaves no room for more properties to be associated with users. Every new property will end up with its own similar section, further exacerbating the prior two issues.
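To make the "Complex" point concrete, the following is a rough sketch of the cross-validation that a separate roles section would demand. The Roles type and the validateRoles function are hypothetical; they only illustrate the extra bookkeeping and are not part of the design this article ends up with.

```go
package main

import "fmt"

// Roles mirrors the hypothetical separate "roles" section.
type Roles struct {
	Admins []string
	Mods   []string
}

// validateRoles checks that every name under roles also appears in users
// and that no name is assigned two roles.
func validateRoles(users []string, roles Roles) error {
	known := make(map[string]bool, len(users))
	for _, u := range users {
		known[u] = true
	}

	assigned := make(map[string]string) // name -> role already given
	check := func(role string, names []string) error {
		for _, n := range names {
			if !known[n] {
				return fmt.Errorf("%s %q is not listed in users", role, n)
			}
			if prev, ok := assigned[n]; ok {
				return fmt.Errorf("%q is both %s and %s", n, prev, role)
			}
			assigned[n] = role
		}
		return nil
	}

	if err := check("admin", roles.Admins); err != nil {
		return err
	}
	return check("mod", roles.Mods)
}

func main() {
	users := []string{"alice", "bob", "carol", "dave"}
	// "alise" is a typo that only this extra validation can catch.
	err := validateRoles(users, Roles{Admins: []string{"alise"}, Mods: []string{"dave"}})
	fmt.Println(err)
}
```

Every additional per-user property would need another section and another pass like this one.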
3. Evolve the schema
A better approach is to evolve the YAML schema so that each item in users can be one of two things:
- a string specifying the name; this will behave like before
- a user object specifying the name and other properties
With this schema, the earlier example takes the following form.
users:
- name: alice (1)
  role: admin
- bob (2)
- carol
- name: dave
  role: mod
(1) Alice and Dave use the new form, specifying an object with the fields name and role.
(2) Bob and Carol use the old form, specifying just the name as a string.
Compare this with the false start:
- user names do not need to be duplicated
- the implementation validates the shape once
- the user object has room for new properties
4. Implement it
To implement this with gopkg.in/yaml.v2, switch the old Config.Users field to a list of newly-declared User structs.
  type Config struct {
-   Users []string `yaml:"users"`
+   Users []*User `yaml:"users"`
  }
+
+ type User struct {
+   Name string `yaml:"name"`
+   Role Role `yaml:"role"`
+ }
Teach the YAML library how to parse Role values from YAML by implementing encoding.TextUnmarshaler for Role.
// UnmarshalText specifies how to parse a Role from a string.
func (r *Role) UnmarshalText(bs []byte) error {
	switch string(bs) {
	case "user":
		*r = RoleUser
	case "mod":
		*r = RoleMod
	case "admin":
		*r = RoleAdmin
	default:
		return fmt.Errorf("unknown role %q", bs)
	}
	return nil
}
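As a quick sanity check, UnmarshalText can be exercised directly, without any YAML library. This sketch repeats the Role declarations from this article so that it is self-contained; the main function and the "owner" input are only for illustration.

```go
package main

import (
	"encoding"
	"fmt"
)

type Role int

const (
	RoleUser Role = iota
	RoleMod
	RoleAdmin
)

// UnmarshalText specifies how to parse a Role from a string.
func (r *Role) UnmarshalText(bs []byte) error {
	switch string(bs) {
	case "user":
		*r = RoleUser
	case "mod":
		*r = RoleMod
	case "admin":
		*r = RoleAdmin
	default:
		return fmt.Errorf("unknown role %q", bs)
	}
	return nil
}

// Compile-time check that *Role satisfies encoding.TextUnmarshaler.
var _ encoding.TextUnmarshaler = (*Role)(nil)

func main() {
	var r Role
	err := r.UnmarshalText([]byte("admin"))
	fmt.Println(r == RoleAdmin, err) // known role: parsed without error

	err = r.UnmarshalText([]byte("owner"))
	fmt.Println(err) // unknown role: an error, and r is left unchanged
}
```

The compile-time assertion is a cheap way to notice if the method signature ever drifts from the interface.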
💡 Tip
|
Use dmarkham/enumer with the |
This gets the implementation as far as supporting objects with a name and role.
users:
- name: alice
  role: admin
- name: dave
  role: mod
But it does not yet support the old form:
users:
- carol # ERROR
5. Make it flexible
To make the new format fully backwards-compatible, it needs to support plain strings for users whose role is user. Do this by implementing the yaml.Unmarshaler interface for User. The interface expects the following method:
func (*User) UnmarshalYAML(
	unmarshal func(interface{}) error,
) error
When decoding a User, the YAML library will call this method with a reference to a function called unmarshal. This function, when invoked with a pointer to a value, will attempt to decode the underlying YAML into that value. For example,
var s string
err := unmarshal(&s)
The key feature here is this: you can call unmarshal any number of times. This lets User.UnmarshalYAML attempt to decode its YAML data into a string (for the old form), and if that fails, into a full User object.
func (u *User) UnmarshalYAML(
	unmarshal func(interface{}) error,
) error {
	var name string
	if err := unmarshal(&name); err == nil {
		// The old format was used. Only the name was specified.
		// For example,
		//
		//   - bob
		//
		// Set just the name.
		u.Name = name
		return nil
	}

	// The new format was used. A full object was specified.
	// For example,
	//
	//   - name: dave
	//     role: mod
	//
	// Decode the whole object.
	type rawUser User
	if err := unmarshal((*rawUser)(u)); err != nil {
		return err
	}

	// Nothing to do. (*rawUser)(u) above hydrated *u.
	return nil
}
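📢 Important
The new rawUser type is necessary. Passing u itself to unmarshal would invoke User.UnmarshalYAML again and recurse without end.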
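The conversion (*rawUser)(u) in the method above is load-bearing: in Go, a newly defined type keeps the fields of its underlying type but none of its methods, so decoding into *rawUser cannot re-enter UnmarshalYAML. The sketch below demonstrates this with only the standard library. The yamlUnmarshaler interface is a local stand-in with the same method as yaml.v2's Unmarshaler, and implementsUnmarshaler is a hypothetical helper; both are declared here so that the example has no external dependencies.

```go
package main

import "fmt"

// yamlUnmarshaler is a local stand-in mirroring yaml.v2's Unmarshaler.
type yamlUnmarshaler interface {
	UnmarshalYAML(unmarshal func(interface{}) error) error
}

type User struct {
	Name string
}

// UnmarshalYAML is a placeholder; only its presence matters here.
func (u *User) UnmarshalYAML(unmarshal func(interface{}) error) error {
	return nil
}

// rawUser has the same fields as User but an empty method set.
type rawUser User

// implementsUnmarshaler reports whether v has an UnmarshalYAML method.
func implementsUnmarshaler(v interface{}) bool {
	_, ok := v.(yamlUnmarshaler)
	return ok
}

func main() {
	u := &User{Name: "bob"}
	fmt.Println(implementsUnmarshaler(u))             // true: *User has the method
	fmt.Println(implementsUnmarshaler((*rawUser)(u))) // false: *rawUser does not
}
```

Because both pointers refer to the same memory, fields decoded through *rawUser are visible through u afterwards.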
6. What about yaml.v3?
gopkg.in/yaml.v3 includes a similar yaml.Unmarshaler interface.
type Unmarshaler interface {
	UnmarshalYAML(value *yaml.Node) error
}
Instead of an unmarshal function, yaml.v3 gives UnmarshalYAML a yaml.Node. This object includes the following method, which behaves similarly to the unmarshal function in yaml.v2.
// Decode decodes the node and stores its data
// into the value pointed to by v.
func (*Node) Decode(interface{}) error
Change the parameter in User.UnmarshalYAML to a *yaml.Node and all unmarshal(x) function calls to value.Decode(x). This now works with gopkg.in/yaml.v3.
  func (u *User) UnmarshalYAML(
-   unmarshal func(interface{}) error,
+   value *yaml.Node,
  ) error {
    var name string
-   if err := unmarshal(&name); err == nil {
+   if err := value.Decode(&name); err == nil {
      ...
    }
    ...
-   if err := unmarshal((*rawUser)(u)); err != nil {
+   if err := value.Decode((*rawUser)(u)); err != nil {
      return err
    }
7. JSON
The same method can be adapted for flexible JSON. The encoding/json package from the Go standard library includes a json.Unmarshaler interface.
type Unmarshaler interface {
	UnmarshalJSON([]byte) error
}
Although it does not supply a handy unmarshal function like yaml.v2, there is nothing stopping direct use of json.Unmarshal on the provided byte slice.
func (u *User) UnmarshalJSON(data []byte) error {
	var name string
	if err := json.Unmarshal(data, &name); err == nil {
		// {"users": ["carol"]}
		u.Name = name
		return nil
	}

	// {"users": [{"name": "alice", "role": "admin"}]}
	type rawUser User
	if err := json.Unmarshal(data, (*rawUser)(u)); err != nil {
		return err
	}
	return nil
}
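Put together, the JSON variant can be run end to end with only the standard library. The following is a sketch under a few assumptions: the json struct tags and the parseConfig helper are additions for this example (the article's structs carry only yaml tags), and Role relies on encoding/json honoring encoding.TextUnmarshaler for JSON strings.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type Role int

const (
	RoleUser Role = iota
	RoleMod
	RoleAdmin
)

// UnmarshalText lets encoding/json parse a Role from a JSON string.
func (r *Role) UnmarshalText(bs []byte) error {
	switch string(bs) {
	case "user":
		*r = RoleUser
	case "mod":
		*r = RoleMod
	case "admin":
		*r = RoleAdmin
	default:
		return fmt.Errorf("unknown role %q", bs)
	}
	return nil
}

type User struct {
	Name string `json:"name"` // json tags are assumptions for this sketch
	Role Role   `json:"role"`
}

type Config struct {
	Users []*User `json:"users"`
}

// UnmarshalJSON accepts either a bare string or a full object.
func (u *User) UnmarshalJSON(data []byte) error {
	var name string
	if err := json.Unmarshal(data, &name); err == nil {
		u.Name = name // old form: only the name, role stays RoleUser
		return nil
	}
	type rawUser User // no methods, so this does not recurse
	return json.Unmarshal(data, (*rawUser)(u))
}

// parseConfig is a hypothetical helper that decodes a JSON document.
func parseConfig(data []byte) (*Config, error) {
	var cfg Config
	if err := json.Unmarshal(data, &cfg); err != nil {
		return nil, err
	}
	return &cfg, nil
}

func main() {
	cfg, err := parseConfig([]byte(`{"users": ["carol", {"name": "alice", "role": "admin"}]}`))
	if err != nil {
		panic(err)
	}
	for _, u := range cfg.Users {
		fmt.Println(u.Name, u.Role == RoleAdmin)
	}
}
```

One trade-off of the try-string-first approach is that the error from the first attempt is discarded, so a malformed object reports only the error from the second attempt.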
7.1. Share logic with YAML
(Added on 2022-01-24.)
If a type implements both UnmarshalYAML and UnmarshalJSON, you can share the decoding logic between them by taking advantage of the unmarshal argument of UnmarshalYAML.
func (*User) UnmarshalYAML(
	unmarshal func(interface{}) error,
) error
This argument typically comes from the YAML library, but you can also provide your own implementation.
Before we do this, note that you can restructure the UnmarshalJSON implementation above like so:
  func (u *User) UnmarshalJSON(data []byte) error {
+   unmarshal := func(target interface{}) error {
+     return json.Unmarshal(data, target)
+   }
+
    var name string
-   if err := json.Unmarshal(data, &name); err == nil {
+   if err := unmarshal(&name); err == nil {
      // {"users": ["carol"]}
      u.Name = name
      return nil
    }

    // {"users": [{"name": "alice", "role": "admin"}]}
    type rawUser User
-   if err := json.Unmarshal(data, (*rawUser)(u)); err != nil {
+   if err := unmarshal((*rawUser)(u)); err != nil {
      return err
    }
    return nil
  }
This defines an anonymous function unmarshal which decodes data into a target using json.Unmarshal. The rest of the UnmarshalJSON implementation uses that function everywhere it previously called json.Unmarshal.
The UnmarshalJSON implementation above should look familiar. Besides the anonymous function defined at the start, it’s exactly the same as UnmarshalYAML.
// UnmarshalJSON(data)                             | // UnmarshalYAML(unmarshal)
unmarshal := func(target interface{}) error {      |
  return json.Unmarshal(data, target)              |
}                                                  |
                                                   |
var name string                                    | var name string
if err := unmarshal(&name); err == nil {           | if err := unmarshal(&name); err == nil {
  u.Name = name                                    |   u.Name = name
  return nil                                       |   return nil
}                                                  | }
                                                   |
type rawUser User                                  | type rawUser User
if err := unmarshal((*rawUser)(u)); err != nil {   | if err := unmarshal((*rawUser)(u)); err != nil {
  return err                                       |   return err
}                                                  | }
                                                   |
return nil                                         | return nil
And that’s the key insight to make this re-use possible: call UnmarshalYAML from UnmarshalJSON, passing in that anonymous function as the unmarshal argument.
func (u *User) UnmarshalJSON(data []byte) error {
	unmarshal := func(target interface{}) error {
		return json.Unmarshal(data, target)
	}
	return u.UnmarshalYAML(unmarshal)
}
💡 Tip
|
The above works but it merits a minor refactor to avoid questions like "Why does the JSON decoding method call the YAML decoding method?" in the future. If you do this, move the core decoding logic into a separate
|
8. Conclusion
Interfaces like yaml.Unmarshaler, its v3 variant, and json.Unmarshaler facilitate evolution of YAML and JSON shapes in a backwards-compatible way. This pattern likely applies to libraries for other encoding formats too.
Edit(2022-01-24): Added Share logic with YAML.