# Encoding

While encoding in the Cosmos SDK used to be handled mainly by the go-amino codec, the Cosmos SDK is moving towards using gogoprotobuf for both state and client-side encoding.

# Encoding

The Cosmos SDK utilizes two binary wire encoding protocols: Amino, an object encoding specification that is a subset of Proto3 with an extension for interface support, and Protocol Buffers. See the Proto3 spec for more information on Proto3, with which Amino is largely compatible (it is not compatible with Proto2).

Because Amino has significant performance drawbacks, is reflection-based, and lacks meaningful cross-language/client support, Protocol Buffers, specifically gogoprotobuf, is being used in place of Amino. Note that this migration from Amino to Protocol Buffers is still ongoing.

Binary wire encoding of types in the Cosmos SDK can be broken down into two main categories, client encoding and store encoding. Client encoding mainly revolves around transaction processing and signing, whereas store encoding revolves around types used in state-machine transitions and what is ultimately stored in the Merkle tree.

For store encoding, protobuf definitions can exist for any type and will typically have an Amino-based "intermediary" type. Specifically, the protobuf-based type definition is used for serialization and persistence, whereas the Amino-based type is used for business logic in the state-machine, where the two may be converted back and forth. Note that the Amino-based types may slowly be phased out in the future, so developers should use the protobuf message definitions where possible.

In the codec package, there exist two core interfaces, BinaryCodec and JSONCodec. The former encapsulates the current Amino interface, except that it operates on types implementing ProtoMarshaler instead of generic interface{} types.

In addition, there exist two implementations of Codec. The first is AminoCodec, where both binary and JSON serialization are handled via Amino. The second is ProtoCodec, where both binary and JSON serialization are handled via Protobuf.

This means that modules may use Amino or Protobuf encoding, but the types must implement ProtoMarshaler. If modules wish to avoid implementing this interface for their types, they may use an Amino codec directly.

# Amino

Every module uses an Amino codec to serialize types and interfaces. This codec typically has types and interfaces registered in that module's domain only (e.g. messages), but there are exceptions like x/gov. Each module exposes a RegisterLegacyAminoCodec function that allows a user to provide a codec and have all the types registered. An application will call this method for each necessary module.

Where there is no protobuf-based type definition for a module (see below), Amino is used to encode and decode raw wire bytes to the concrete type or interface:

```go
bz := keeper.cdc.MustMarshal(typeOrInterface)
keeper.cdc.MustUnmarshal(bz, &typeOrInterface)
```

Note that there are length-prefixed variants of the above functionality. These are typically used when the data needs to be streamed or grouped together (e.g. ResponseDeliverTx.Data).
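To illustrate why a length prefix helps when several encoded values are grouped in one byte stream, here is a minimal standalone sketch. It assumes a uvarint length prefix and uses only the Go standard library; the helper names `appendLengthPrefixed` and `readLengthPrefixed` are hypothetical and are not part of the Cosmos SDK codec API.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// appendLengthPrefixed appends a uvarint length prefix followed by msg to buf.
// (Hypothetical helper, not an SDK function.)
func appendLengthPrefixed(buf, msg []byte) []byte {
	var prefix [binary.MaxVarintLen64]byte
	n := binary.PutUvarint(prefix[:], uint64(len(msg)))
	buf = append(buf, prefix[:n]...)
	return append(buf, msg...)
}

// readLengthPrefixed returns the first message in buf and the remaining bytes.
func readLengthPrefixed(buf []byte) (msg, rest []byte, err error) {
	size, n := binary.Uvarint(buf)
	if n <= 0 {
		return nil, nil, errors.New("invalid length prefix")
	}
	if uint64(len(buf)-n) < size {
		return nil, nil, errors.New("truncated message")
	}
	end := n + int(size)
	return buf[n:end], buf[end:], nil
}

func main() {
	// Group two already-encoded values into a single stream.
	var stream []byte
	stream = appendLengthPrefixed(stream, []byte("first"))
	stream = appendLengthPrefixed(stream, []byte("second message"))

	// The reader can split the stream without knowing the message types.
	for len(stream) > 0 {
		msg, rest, err := readLengthPrefixed(stream)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s\n", msg)
		stream = rest
	}
}
```

Without the prefix, a reader would have no way to tell where one encoded value ends and the next begins.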

# Authz authorizations

Since the MsgExec message type can contain different message instances, it is important that developers add the following code inside the init function of their module's codec.go file:

```go
import authzcodec "github.com/cosmos/cosmos-sdk/x/authz/codec"

func init() {
	// Register all Amino interfaces and concrete types on the authz Amino codec so that this can later be
	// used to properly serialize MsgGrant and MsgExec instances
	RegisterLegacyAminoCodec(authzcodec.Amino)
}
```

This allows the x/authz module to properly serialize and deserialize MsgExec instances using Amino, which is required when signing this kind of message using a Ledger.

# Gogoproto

Modules are encouraged to utilize Protobuf encoding for their respective types. In the Cosmos SDK, we use the Gogoproto specific implementation of the Protobuf spec that offers speed and DX improvements compared to the official Google protobuf implementation.

# Guidelines for protobuf message definitions

In addition to following official Protocol Buffer guidelines, we recommend using these annotations in .proto files when dealing with interfaces:

  • use cosmos_proto.accepts_interface to annotate fields that accept interfaces, passing the same fully qualified name as protoName to InterfaceRegistry.RegisterInterface
  • annotate interface implementations with cosmos_proto.implements_interface, passing the same fully qualified name as protoName to InterfaceRegistry.RegisterInterface

# Transaction Encoding

Another important use of Protobuf is the encoding and decoding of transactions. Transactions are defined by the application or the Cosmos SDK but are then passed to the underlying consensus engine to be relayed to other peers. Since the underlying consensus engine is agnostic to the application, the consensus engine accepts only transactions in the form of raw bytes.

  • The TxEncoder object performs the encoding.
  • The TxDecoder object performs the decoding.

```go
// TxDecoder unmarshals transaction bytes
type TxDecoder func(txBytes []byte) (Tx, error)

// TxEncoder marshals transaction to bytes
type TxEncoder func(tx Tx) ([]byte, error)
```
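As a rough illustration of how such a pair of function types fits together, the following standalone sketch round-trips a transaction through raw bytes. The `Tx` struct here is a hypothetical simplification, and JSON is used only as a stand-in wire format so the example runs with the standard library; the SDK's real encoders use Protobuf.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Tx is a hypothetical, simplified transaction used only for this sketch.
type Tx struct {
	Memo string   `json:"memo"`
	Msgs []string `json:"msgs"`
}

// TxDecoder unmarshals transaction bytes.
type TxDecoder func(txBytes []byte) (Tx, error)

// TxEncoder marshals a transaction to bytes.
type TxEncoder func(tx Tx) ([]byte, error)

// jsonTxEncoder returns a TxEncoder that uses JSON as a stand-in wire format.
func jsonTxEncoder() TxEncoder {
	return func(tx Tx) ([]byte, error) {
		return json.Marshal(tx)
	}
}

// jsonTxDecoder returns the matching TxDecoder.
func jsonTxDecoder() TxDecoder {
	return func(txBytes []byte) (Tx, error) {
		var tx Tx
		if err := json.Unmarshal(txBytes, &tx); err != nil {
			return Tx{}, fmt.Errorf("tx decode: %w", err)
		}
		return tx, nil
	}
}

func main() {
	encode, decode := jsonTxEncoder(), jsonTxDecoder()

	// The application encodes the transaction...
	txBytes, err := encode(Tx{Memo: "hello", Msgs: []string{"send"}})
	if err != nil {
		panic(err)
	}

	// ...the consensus engine only ever sees txBytes...
	// ...and the application decodes them again on the receiving side.
	decoded, err := decode(txBytes)
	if err != nil {
		panic(err)
	}
	fmt.Println(decoded.Memo, decoded.Msgs)
}
```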

A standard implementation of both these objects can be found in the auth module:

```go
package tx

import (
	"fmt"

	"google.golang.org/protobuf/encoding/protowire"

	"github.com/cosmos/cosmos-sdk/codec"
	"github.com/cosmos/cosmos-sdk/codec/unknownproto"
	sdk "github.com/cosmos/cosmos-sdk/types"
	sdkerrors "github.com/cosmos/cosmos-sdk/types/errors"
	"github.com/cosmos/cosmos-sdk/types/tx"
)

// DefaultTxDecoder returns a default protobuf TxDecoder using the provided Marshaler.
func DefaultTxDecoder(cdc codec.ProtoCodecMarshaler) sdk.TxDecoder {
	return func(txBytes []byte) (sdk.Tx, error) {
		// Make sure txBytes follow ADR-027.
		err := rejectNonADR027TxRaw(txBytes)
		if err != nil {
			return nil, sdkerrors.Wrap(sdkerrors.ErrTxDecode, err.Error())
		}

		var raw tx.TxRaw

		// reject all unknown proto fields in the root TxRaw
		err = unknownproto.RejectUnknownFieldsStrict(txBytes, &raw, cdc.InterfaceRegistry())
		if err != nil {
			return nil, sdkerrors.Wrap(sdkerrors.ErrTxDecode, err.Error())
		}

		err = cdc.Unmarshal(txBytes, &raw)
		if err != nil {
			return nil, err
		}

		var body tx.TxBody

		// allow non-critical unknown fields in TxBody
		txBodyHasUnknownNonCriticals, err := unknownproto.RejectUnknownFields(raw.BodyBytes, &body, true, cdc.InterfaceRegistry())
		if err != nil {
			return nil, sdkerrors.Wrap(sdkerrors.ErrTxDecode, err.Error())
		}

		err = cdc.Unmarshal(raw.BodyBytes, &body)
		if err != nil {
			return nil, sdkerrors.Wrap(sdkerrors.ErrTxDecode, err.Error())
		}

		var authInfo tx.AuthInfo

		// reject all unknown proto fields in AuthInfo
		err = unknownproto.RejectUnknownFieldsStrict(raw.AuthInfoBytes, &authInfo, cdc.InterfaceRegistry())
		if err != nil {
			return nil, sdkerrors.Wrap(sdkerrors.ErrTxDecode, err.Error())
		}

		err = cdc.Unmarshal(raw.AuthInfoBytes, &authInfo)
		if err != nil {
			return nil, sdkerrors.Wrap(sdkerrors.ErrTxDecode, err.Error())
		}

		theTx := &tx.Tx{
			Body:       &body,
			AuthInfo:   &authInfo,
			Signatures: raw.Signatures,
		}

		return &wrapper{
			tx:                           theTx,
			bodyBz:                       raw.BodyBytes,
			authInfoBz:                   raw.AuthInfoBytes,
			txBodyHasUnknownNonCriticals: txBodyHasUnknownNonCriticals,
		}, nil
	}
}

// DefaultJSONTxDecoder returns a default protobuf JSON TxDecoder using the provided Marshaler.
func DefaultJSONTxDecoder(cdc codec.ProtoCodecMarshaler) sdk.TxDecoder {
	return func(txBytes []byte) (sdk.Tx, error) {
		var theTx tx.Tx
		err := cdc.UnmarshalJSON(txBytes, &theTx)
		if err != nil {
			return nil, sdkerrors.Wrap(sdkerrors.ErrTxDecode, err.Error())
		}

		return &wrapper{
			tx: &theTx,
		}, nil
	}
}

// rejectNonADR027TxRaw rejects txBytes that do not follow ADR-027. This is NOT
// a generic ADR-027 checker, it only applies decoding TxRaw. Specifically, it
// only checks that:
// - field numbers are in ascending order (1, 2, and potentially multiple 3s),
// - and varints are as short as possible.
// All other ADR-027 edge cases (e.g. default values) are not applicable with
// TxRaw.
func rejectNonADR027TxRaw(txBytes []byte) error {
	// Make sure all fields are ordered in ascending order with this variable.
	prevTagNum := protowire.Number(0)

	for len(txBytes) > 0 {
		tagNum, wireType, m := protowire.ConsumeTag(txBytes)
		if m < 0 {
			return fmt.Errorf("invalid length; %w", protowire.ParseError(m))
		}
		// TxRaw only has bytes fields.
		if wireType != protowire.BytesType {
			return fmt.Errorf("expected %d wire type, got %d", protowire.BytesType, wireType)
		}
		// Make sure fields are ordered in ascending order.
		if tagNum < prevTagNum {
			return fmt.Errorf("txRaw must follow ADR-027, got tagNum %d after tagNum %d", tagNum, prevTagNum)
		}
		prevTagNum = tagNum

		// All 3 fields of TxRaw have wireType == 2, so their next component
		// is a varint, so we can safely call ConsumeVarint here.
		// Byte structure: <varint of bytes length><bytes sequence>
		// Inner fields are verified in `DefaultTxDecoder`
		lengthPrefix, m := protowire.ConsumeVarint(txBytes[m:])
		if m < 0 {
			return fmt.Errorf("invalid length; %w", protowire.ParseError(m))
		}
		// We make sure that this varint is as short as possible.
		n := varintMinLength(lengthPrefix)
		if n != m {
			return fmt.Errorf("length prefix varint for tagNum %d is not as short as possible, read %d, only need %d", tagNum, m, n)
		}

		// Skip over the bytes that store fieldNumber and wireType bytes.
		_, _, m = protowire.ConsumeField(txBytes)
		if m < 0 {
			return fmt.Errorf("invalid length; %w", protowire.ParseError(m))
		}
		txBytes = txBytes[m:]
	}

	return nil
}

// varintMinLength returns the minimum number of bytes necessary to encode an
// uint using varint encoding.
func varintMinLength(n uint64) int {
	switch {
	// Note: 1<<N == 2**N.
	case n < 1<<(7):
		return 1
	case n < 1<<(7*2):
		return 2
	case n < 1<<(7*3):
		return 3
	case n < 1<<(7*4):
		return 4
	case n < 1<<(7*5):
		return 5
	case n < 1<<(7*6):
		return 6
	case n < 1<<(7*7):
		return 7
	case n < 1<<(7*8):
		return 8
	case n < 1<<(7*9):
		return 9
	default:
		return 10
	}
}
```

```go
package tx

import (
	"fmt"

	"github.com/gogo/protobuf/proto"

	"github.com/cosmos/cosmos-sdk/codec"
	sdk "github.com/cosmos/cosmos-sdk/types"
	txtypes "github.com/cosmos/cosmos-sdk/types/tx"
)

// DefaultTxEncoder returns a default protobuf TxEncoder using the provided Marshaler
func DefaultTxEncoder() sdk.TxEncoder {
	return func(tx sdk.Tx) ([]byte, error) {
		txWrapper, ok := tx.(*wrapper)
		if !ok {
			return nil, fmt.Errorf("expected %T, got %T", &wrapper{}, tx)
		}

		raw := &txtypes.TxRaw{
			BodyBytes:     txWrapper.getBodyBytes(),
			AuthInfoBytes: txWrapper.getAuthInfoBytes(),
			Signatures:    txWrapper.tx.Signatures,
		}

		return proto.Marshal(raw)
	}
}

// DefaultJSONTxEncoder returns a default protobuf JSON TxEncoder using the provided Marshaler.
func DefaultJSONTxEncoder(cdc codec.ProtoCodecMarshaler) sdk.TxEncoder {
	return func(tx sdk.Tx) ([]byte, error) {
		txWrapper, ok := tx.(*wrapper)
		if ok {
			return cdc.MarshalJSON(txWrapper.tx)
		}

		protoTx, ok := tx.(*txtypes.Tx)
		if ok {
			return cdc.MarshalJSON(protoTx)
		}

		return nil, fmt.Errorf("expected %T, got %T", &wrapper{}, tx)
	}
}
```

See ADR-020 for details of how a transaction is encoded.
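The decoder's requirement that varints be "as short as possible" can be seen in isolation with a small standalone sketch. It uses only the standard library's `encoding/binary` package rather than `protowire`, and the helper names `minimalUvarintLen` and `isMinimalUvarint` are hypothetical. The point is that the same number can be encoded with redundant continuation bytes, which a canonical decoder has to reject so that each transaction has exactly one valid byte representation.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// minimalUvarintLen returns the fewest bytes needed to varint-encode n.
// Each varint byte carries 7 bits of payload.
func minimalUvarintLen(n uint64) int {
	size := 1
	for n >= 0x80 {
		n >>= 7
		size++
	}
	return size
}

// isMinimalUvarint reports whether buf starts with a varint in its shortest form.
func isMinimalUvarint(buf []byte) bool {
	value, read := binary.Uvarint(buf)
	if read <= 0 {
		return false // empty, truncated, or overflowing varint
	}
	return read == minimalUvarintLen(value)
}

func main() {
	canonical := []byte{0x05}    // the value 5 in one byte
	padded := []byte{0x85, 0x00} // also decodes to 5, but with a redundant byte

	fmt.Println(isMinimalUvarint(canonical)) // true
	fmt.Println(isMinimalUvarint(padded))    // false
}
```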

# Interface Encoding and Usage of Any

The Protobuf DSL is strongly typed, which can make inserting variable-typed fields difficult. Imagine we want to create a Profile protobuf message that serves as a wrapper over an account:

```protobuf
message Profile {
  // account is the account associated to a profile.
  cosmos.auth.v1beta1.BaseAccount account = 1;
  // bio is a short description of the account.
  string bio = 4;
}
```

In this Profile example, we hardcoded account as a BaseAccount. However, there are several other types of user accounts related to vesting, such as BaseVestingAccount or ContinuousVestingAccount. All of these accounts are different, but they all implement the AccountI interface. How would you create a Profile that allows all these types of accounts with an account field that accepts an AccountI interface?

```go
// AccountI is an interface used to store coins at a given address within state.
// It presumes a notion of sequence numbers for replay protection,
// a notion of account numbers for replay protection for previously pruned accounts,
// and a pubkey for authentication purposes.
//
// Many complex conditions can be used in the concrete struct which implements AccountI.
type AccountI interface {
	proto.Message

	GetAddress() sdk.AccAddress
	SetAddress(sdk.AccAddress) error // errors if already set.

	GetPubKey() cryptotypes.PubKey // can return nil.
	SetPubKey(cryptotypes.PubKey) error

	GetAccountNumber() uint64
	SetAccountNumber(uint64) error

	GetSequence() uint64
	SetSequence(uint64) error

	// Ensure that account implements stringer
	String() string
}
```

ADR-019 decided to use Anys to encode interfaces in protobuf. An Any contains an arbitrary serialized message as bytes, along with a URL that acts as a globally unique identifier for, and resolves to, that message's type. This strategy allows us to pack arbitrary Go types inside protobuf messages. Our new Profile then looks like:

```protobuf
message Profile {
  // account is the account associated to a profile.
  // The accepts_interface option asserts that this field only accepts Go types
  // implementing `AccountI`. It is purely informational for now.
  google.protobuf.Any account = 1 [(cosmos_proto.accepts_interface) = "AccountI"];
  // bio is a short description of the account.
  string bio = 4;
}
```

To add an account inside a profile, we need to "pack" it inside an Any first, using codectypes.NewAnyWithValue:

```go
var myAccount AccountI
myAccount = ... // Can be a BaseAccount, a ContinuousVestingAccount or any struct implementing `AccountI`

// Pack the account into an Any
accAny, err := codectypes.NewAnyWithValue(myAccount)
if err != nil {
	return nil, err
}

// Create a new Profile with the any.
profile := Profile{
	Account: accAny,
	Bio:     "some bio",
}

// We can then marshal the profile as usual.
// A pointer is passed because generated proto types implement their methods on pointer receivers.
bz, err := cdc.Marshal(&profile)
jsonBz, err := cdc.MarshalJSON(&profile)
```

To summarize, to encode an interface, you must (1) pack the interface into an Any and (2) marshal the Any. For convenience, the Cosmos SDK provides a MarshalInterface method to bundle these two steps. Have a look at a real-life example in the x/auth module.

The reverse operation of retrieving the concrete Go type from inside an Any, called "unpacking", is done with the GetCachedValue() method on Any.

```go
profileBz := ... // The proto-encoded bytes of a Profile, e.g. retrieved through gRPC.
var myProfile Profile
// Unmarshal the bytes into the myProfile struct.
err := cdc.Unmarshal(profileBz, &myProfile)

// Let's see the types of the Account field.
fmt.Printf("%T\n", myProfile.Account)                  // Prints "Any"
fmt.Printf("%T\n", myProfile.Account.GetCachedValue()) // Prints "BaseAccount", "ContinuousVestingAccount" or whatever was initially packed in the Any.

// Get the address of the account.
accAddr := myProfile.Account.GetCachedValue().(AccountI).GetAddress()
```

It is important to note that for GetCachedValue() to work, Profile (and any other structs embedding Profile) must implement the UnpackInterfaces method:

```go
func (p *Profile) UnpackInterfaces(unpacker codectypes.AnyUnpacker) error {
	if p.Account != nil {
		var account AccountI
		return unpacker.UnpackAny(p.Account, &account)
	}

	return nil
}
```

The UnpackInterfaces method is called recursively on all structs that implement it, so that every Any has its GetCachedValue() correctly populated.

For more information about interface encoding, and especially on UnpackInterfaces and how the Any's type_url gets resolved using the InterfaceRegistry, please refer to ADR-019.
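The pack/unpack cycle and the role of a registry in resolving a type URL can be sketched without any SDK dependency. The example below is a toy model, not the SDK implementation: the `Account` interface, the two account structs, the `/toy.*` type URLs, and the `pack`/`unpack` helpers are all hypothetical, and JSON stands in for Protobuf so that the code runs with the standard library alone.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Account is a hypothetical stand-in for the AccountI interface.
type Account interface {
	GetAddress() string
}

// BaseAccount and VestingAccount are toy concrete implementations.
type BaseAccount struct {
	Address string
}

func (a *BaseAccount) GetAddress() string { return a.Address }

type VestingAccount struct {
	Address string
	EndTime int64
}

func (a *VestingAccount) GetAddress() string { return a.Address }

// Any mirrors the shape of google.protobuf.Any: a type URL plus opaque bytes.
type Any struct {
	TypeURL string
	Value   []byte
}

// registry maps a type URL to a constructor for the registered concrete type.
var registry = map[string]func() Account{
	"/toy.BaseAccount":    func() Account { return &BaseAccount{} },
	"/toy.VestingAccount": func() Account { return &VestingAccount{} },
}

// pack serializes acc and records which concrete type the bytes belong to.
func pack(typeURL string, acc Account) (Any, error) {
	bz, err := json.Marshal(acc)
	if err != nil {
		return Any{}, err
	}
	return Any{TypeURL: typeURL, Value: bz}, nil
}

// unpack resolves the type URL through the registry and decodes the bytes
// into a fresh value of the concrete type.
func unpack(a Any) (Account, error) {
	newAccount, ok := registry[a.TypeURL]
	if !ok {
		return nil, fmt.Errorf("unregistered type URL %q", a.TypeURL)
	}
	acc := newAccount()
	if err := json.Unmarshal(a.Value, acc); err != nil {
		return nil, err
	}
	return acc, nil
}

func main() {
	packed, err := pack("/toy.VestingAccount", &VestingAccount{Address: "addr1", EndTime: 100})
	if err != nil {
		panic(err)
	}

	acc, err := unpack(packed)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%T %s\n", acc, acc.GetAddress())
}
```

Rejecting unregistered type URLs is the important design point: only types that a module has explicitly registered can be instantiated from untrusted bytes.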

# Any Encoding in the Cosmos SDK

The above Profile example is a fictitious one used for educational purposes. In the Cosmos SDK, we use Any encoding in several places (non-exhaustive list):

  • the cryptotypes.PubKey interface for encoding different types of public keys,
  • the sdk.Msg interface for encoding different Msgs in a transaction,
  • the AccountI interface for encoding different types of accounts (similar to the above example) in the x/auth query responses,
  • the Evidence interface for encoding different types of evidence in the x/evidence module,
  • the AuthorizationI interface for encoding different types of x/authz authorizations,
  • the Validator struct that contains information about a validator.

The following real-life example shows how the pubkey is encoded as an Any inside the Validator struct in x/staking:

```go
// NewValidator constructs a new Validator
//nolint:interfacer
func NewValidator(operator sdk.ValAddress, pubKey cryptotypes.PubKey, description Description) (Validator, error) {
	pkAny, err := codectypes.NewAnyWithValue(pubKey)
	if err != nil {
		return Validator{}, err
	}

	return Validator{
		OperatorAddress:   operator.String(),
		ConsensusPubkey:   pkAny,
		Jailed:            false,
		Status:            Unbonded,
		Tokens:            sdk.ZeroInt(),
		DelegatorShares:   sdk.ZeroDec(),
		Description:       description,
		UnbondingHeight:   int64(0),
		UnbondingTime:     time.Unix(0, 0).UTC(),
		Commission:        NewCommission(sdk.ZeroDec(), sdk.ZeroDec(), sdk.ZeroDec()),
		MinSelfDelegation: sdk.OneInt(),
	}, nil
}
```


# How to create modules using protobuf encoding

# Defining module types

Protobuf types can be defined to encode:

# Naming and conventions

We encourage developers to follow industry guidelines: the Protocol Buffers style guide and Buf. See ADR 023 for more details.

# How to update modules to protobuf encoding

If modules do not contain any interfaces (e.g. Account or Content), then they may simply migrate any existing types that are encoded and persisted via their concrete Amino codec to Protobuf (see 1. for further guidelines) and accept a Marshaler as the codec which is implemented via the ProtoCodec without any further customization.

However, if a module type composes an interface, it must wrap it in the sdk.Any (from /types package) type. To do that, a module-level .proto file must use google.protobuf.Any for the message fields that hold an interface type.

For example, the x/evidence module defines an Evidence interface, which is used by MsgSubmitEvidence. The structure definition must use sdk.Any to wrap the evidence. In the proto file we define it as follows:

```protobuf
// proto/cosmos/evidence/v1beta1/tx.proto

message MsgSubmitEvidence {
  string              submitter = 1;
  google.protobuf.Any evidence  = 2 [(cosmos_proto.accepts_interface) = "Evidence"];
}
```

The Cosmos SDK codec.Codec interface provides the support methods MarshalInterface and UnmarshalInterface to ease the encoding of state to Any.

Modules should register interfaces using InterfaceRegistry, which provides a mechanism for registering interfaces, RegisterInterface(protoName string, iface interface{}, impls ...proto.Message), and implementations, RegisterImplementations(iface interface{}, impls ...proto.Message), that can be safely unpacked from Any, similarly to type registration with Amino:

```go
// InterfaceRegistry provides a mechanism for registering interfaces and
// implementations that can be safely unpacked from Any
type InterfaceRegistry interface {
	AnyUnpacker
	jsonpb.AnyResolver

	// RegisterInterface associates protoName as the public name for the
	// interface passed in as iface. This is to be used primarily to create
	// a public facing registry of interface implementations for clients.
	// protoName should be a well-chosen public facing name that remains stable.
	// RegisterInterface takes an optional list of impls to be registered
	// as implementations of iface.
	//
	// Ex:
	//   registry.RegisterInterface("cosmos.base.v1beta1.Msg", (*sdk.Msg)(nil))
	RegisterInterface(protoName string, iface interface{}, impls ...proto.Message)

	// RegisterImplementations registers impls as concrete implementations of
	// the interface iface.
	//
	// Ex:
	//  registry.RegisterImplementations((*sdk.Msg)(nil), &MsgSend{}, &MsgMultiSend{})
	RegisterImplementations(iface interface{}, impls ...proto.Message)

	// ListAllInterfaces list the type URLs of all registered interfaces.
	ListAllInterfaces() []string

	// ListImplementations lists the valid type URLs for the given interface name that can be used
	// for the provided interface type URL.
	ListImplementations(ifaceTypeURL string) []string
}
```

In addition, an UnpackInterfaces phase should be introduced to deserialization to unpack interfaces before they're needed. Protobuf types that contain a protobuf Any either directly or via one of their members should implement the UnpackInterfacesMessage interface:

```go
type UnpackInterfacesMessage interface {
	UnpackInterfaces(AnyUnpacker) error
}
```

# Custom Stringer

Using option (gogoproto.goproto_stringer) = false; in a proto message definition leads to unexpected behaviour, such as wrong output or missing fields in the output. For that reason, a proto Message's String() must not be customized, and the goproto_stringer option must be avoided.

A correct YAML output can be obtained through ProtoJSON, using the JSONToYAML function:

```go
// MarshalYAML marshals toPrint using JSONCodec to leverage specialized MarshalJSON methods
// (usually related to serialize data with protobuf or amino depending on a configuration).
// This involves additional roundtrip through JSON.
func MarshalYAML(cdc JSONCodec, toPrint proto.Message) ([]byte, error) {
	// We are OK with the performance hit of the additional JSON roundtrip. MarshalYAML is not
	// used in any critical parts of the system.
	bz, err := cdc.MarshalJSON(toPrint)
	if err != nil {
		return nil, err
	}
	return yaml.JSONToYAML(bz)
}
```

For example:

```go
func (acc BaseAccount) String() string {
	out, _ := acc.MarshalYAML()
	return out.(string)
}

// MarshalYAML returns the YAML representation of an account.
func (acc BaseAccount) MarshalYAML() (interface{}, error) {
	bz, err := codec.MarshalYAML(codec.NewProtoCodec(codectypes.NewInterfaceRegistry()), &acc)
	if err != nil {
		return nil, err
	}
	return string(bz), err
}
```
