ADR 047: Extend Upgrade Plan
Changelog
- Nov 23, 2021: Initial Draft
Status
PROPOSED Not Implemented
Abstract
This ADR expands the existing x/upgrade Plan proto message to include new fields for defining pre-run and post-run processes within upgrade tooling.
It also defines a structure for providing downloadable artifacts involved in an upgrade.
Context
The upgrade module, in conjunction with Cosmovisor, is designed to facilitate and automate a blockchain's transition from one version to another.
Users submit a software upgrade governance proposal containing an upgrade Plan.
The Plan currently contains the following fields:
- name: A short string identifying the new version.
- height: The chain height at which the upgrade is to be performed.
- info: A string containing information about the upgrade.
The info string can be anything.
However, Cosmovisor will try to use the info field to automatically download a new version of the blockchain executable.
For the auto-download to work, Cosmovisor expects it to be either a stringified JSON object (with a specific structure defined through documentation), or a URL that will return such JSON.
The JSON object identifies URLs used to download the new blockchain executable for different platforms (OS and Architecture, e.g. "linux/amd64").
Such a URL can either return the executable file directly or can return an archive containing the executable and possibly other assets.
If the URL returns an archive, it is decompressed into {DAEMON_HOME}/cosmovisor/{upgrade name}.
Then, if {DAEMON_HOME}/cosmovisor/{upgrade name}/bin/{DAEMON_NAME} does not exist, but {DAEMON_HOME}/cosmovisor/{upgrade name}/{DAEMON_NAME} does, the latter is copied to the former.
If the URL returns something other than an archive, it is downloaded to {DAEMON_HOME}/cosmovisor/{upgrade name}/bin/{DAEMON_NAME}.
If an upgrade height is reached and the new version of the executable isn't available, Cosmovisor will stop running.
Both DAEMON_HOME and DAEMON_NAME are environment variables used to configure Cosmovisor.
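The placement rules above can be illustrated with a small sketch. It is written in Python purely for brevity and is not Cosmovisor's actual code; the function names upgrade_dir and ensure_bin_layout are hypothetical, and only the directory layout comes from the text above.

```python
import shutil
from pathlib import Path


def upgrade_dir(daemon_home: str, upgrade_name: str) -> Path:
    """Return {DAEMON_HOME}/cosmovisor/{upgrade name}."""
    return Path(daemon_home) / "cosmovisor" / upgrade_name


def ensure_bin_layout(root: Path, daemon_name: str) -> Path:
    """Make sure bin/{DAEMON_NAME} exists after an archive was unpacked into root.

    If bin/{DAEMON_NAME} is missing but {DAEMON_NAME} sits at the top level,
    the top-level file is copied into bin/. Otherwise nothing is changed.
    """
    target = root / "bin" / daemon_name
    fallback = root / daemon_name
    if not target.exists() and fallback.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(fallback, target)
    return target
```

A file that is not an archive would simply be written to the path returned by upgrade_dir(...) / "bin" / daemon_name.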
Currently, there is no mechanism that makes Cosmovisor run a command after the upgraded chain has been restarted.
The current upgrade process has this timeline:
- An upgrade governance proposal is submitted and approved.
- The upgrade height is reached.
- The x/upgrade module writes the upgrade-info.json file.
- The chain halts.
- Cosmovisor backs up the data directory (if set up to do so).
- Cosmovisor downloads the new executable (if not already in place).
- Cosmovisor executes ${DAEMON_NAME} pre-upgrade.
- Cosmovisor restarts the app using the new version and same args originally provided.
Decision
Protobuf Updates
We will update the x/upgrade.Plan message for providing upgrade instructions.
The upgrade instructions will contain a list of artifacts available for each platform.
They also allow for the definition of pre-run and post-run commands.
These commands are not consensus guaranteed; they will be executed by Cosmovisor (or another tool) during its upgrade handling.
message Plan {
  // ... (existing fields)
  UpgradeInstructions instructions = 6;
}
The new UpgradeInstructions instructions field MUST be optional.
message UpgradeInstructions {
  string pre_run = 1;
  string post_run = 2;
  repeated Artifact artifacts = 3;
  string description = 4;
}
All fields in the UpgradeInstructions are optional.
- pre_run is a command to run prior to the upgraded chain restarting. If defined, it will be executed after halting and downloading the new artifact but before restarting the upgraded chain. The working directory this command runs from MUST be {DAEMON_HOME}/cosmovisor/{upgrade name}. This command MUST behave the same as the current pre-upgrade command. It does not take in any command-line arguments and is expected to terminate with the following exit codes:

  | Exit status code | How it is handled in Cosmovisor |
  | --- | --- |
  | 0 | Assumes pre-upgrade command executed successfully and continues the upgrade. |
  | 1 | Default exit code when pre-upgrade command has not been implemented. |
  | 30 | pre-upgrade command was executed but failed. This fails the entire upgrade. |
  | 31 | pre-upgrade command was executed but failed. But the command is retried until exit code 1 or 30 is returned. |

  If defined, then the app supervisors (e.g. Cosmovisor) MUST NOT run app pre-upgrade.
- post_run is a command to run after the upgraded chain has been started. If defined, this command MUST be executed at most once by an upgrading node. The output and exit code SHOULD be logged but SHOULD NOT affect the running of the upgraded chain. The working directory this command runs from MUST be {DAEMON_HOME}/cosmovisor/{upgrade name}.
- artifacts define items to be downloaded. It SHOULD have only one entry per platform.
- description contains human-readable information about the upgrade and might contain references to external resources. It SHOULD NOT be used for structured processing information.
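As an illustration of the exit codes listed for pre_run, the following Python sketch shows one way a supervisor could interpret them. The function run_pre_run, the run_once callable, the returned labels, and the max_attempts limit are all assumptions made for this example; the ADR itself does not define a retry limit or how codes outside the listed set are treated.

```python
from typing import Callable

# Exit codes as listed for the pre_run command.
SUCCESS = 0
NOT_IMPLEMENTED = 1
FAILED = 30
RETRY = 31


def run_pre_run(run_once: Callable[[], int], max_attempts: int = 5) -> str:
    """Run the command until it returns something other than the retry code.

    run_once is a hypothetical callable that executes the command once and
    returns its exit code. max_attempts is an assumption added so that the
    sketch always terminates.
    """
    for _ in range(max_attempts):
        code = run_once()
        if code == SUCCESS:
            return "success"            # continue the upgrade
        if code == NOT_IMPLEMENTED:
            return "not_implemented"    # command is not implemented
        if code == FAILED:
            return "failed"             # fails the entire upgrade
        if code != RETRY:
            return "unexpected"         # not covered by the listed codes
        # code == RETRY: loop and run the command again
    return "retries_exhausted"
```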
message Artifact {
  string platform = 1;
  string url = 2;
  string checksum = 3;
  string checksum_algo = 4;
}
- platform is a required string that SHOULD be in the format {OS}/{CPU}, e.g. "linux/amd64". The string "any" SHOULD also be allowed. An Artifact with a platform of "any" SHOULD be used as a fallback when a specific {OS}/{CPU} entry is not found. That is, if an Artifact exists with a platform that matches the system's OS and CPU, that should be used; otherwise, if an Artifact exists with a platform of "any", that should be used; otherwise no artifact should be downloaded.
- url is a required URL string that MUST conform to RFC 1738: Uniform Resource Locators. A request to this url MUST return either an executable file or an archive containing either bin/{DAEMON_NAME} or {DAEMON_NAME}. The URL should not contain a checksum; it should be specified by the checksum attribute.
- checksum is a checksum of the expected result of a request to the url. It is not required, but is recommended. If provided, it MUST be a hex encoded checksum string. Tools utilizing these UpgradeInstructions MUST fail if a checksum is provided but is different from the checksum of the result returned by the url.
- checksum_algo is a string identifying the algorithm used to generate the checksum. Recommended algorithms: sha256, sha512. Algorithms also supported (but not recommended): sha1, md5. If a checksum is provided, a checksum_algo MUST also be provided.
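The platform fallback rule described for the platform field can be sketched as follows. This is a minimal Python illustration, not the SDK's implementation; the Artifact dataclass mirrors the proto message, and select_artifact is a hypothetical helper name.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Artifact:
    # Mirrors the fields of the Artifact proto message.
    platform: str
    url: str
    checksum: str = ""
    checksum_algo: str = ""


def select_artifact(
    artifacts: Sequence[Artifact], os_name: str, cpu: str
) -> Optional[Artifact]:
    """Prefer an exact {OS}/{CPU} match, then "any", else return None."""
    wanted = f"{os_name}/{cpu}"
    fallback = None
    for artifact in artifacts:
        if artifact.platform == wanted:
            return artifact
        if artifact.platform == "any" and fallback is None:
            fallback = artifact
    return fallback
```

Returning None corresponds to the case where no artifact should be downloaded.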
A url is not required to contain a checksum query parameter.
If the url does contain a checksum query parameter, the checksum and checksum_algo fields MUST also be populated, and their values MUST match the value of the query parameter.
For example, if the url is "https://example.com?checksum=md5:d41d8cd98f00b204e9800998ecf8427e", then the checksum field must be "d41d8cd98f00b204e9800998ecf8427e" and the checksum_algo field must be "md5".
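The relationship between the checksum query parameter and the two fields can be checked mechanically. The sketch below uses Python's standard urllib.parse; the helper names checksum_from_url and url_matches_fields are hypothetical, and the comparison is an exact string match, which is an assumption (the ADR does not say whether hex case should be normalized).

```python
from typing import Optional, Tuple
from urllib.parse import parse_qs, urlparse


def checksum_from_url(url: str) -> Optional[Tuple[str, str]]:
    """Return (checksum_algo, checksum) from the url, or None if absent."""
    values = parse_qs(urlparse(url).query).get("checksum")
    if not values:
        return None
    algo, sep, digest = values[0].partition(":")
    if not sep or not algo or not digest:
        raise ValueError("checksum must look like {checksum_algo}:{checksum}")
    return algo, digest


def url_matches_fields(url: str, checksum: str, checksum_algo: str) -> bool:
    """True if the url has no checksum parameter or it equals the fields."""
    parsed = checksum_from_url(url)
    if parsed is None:
        return True
    return parsed == (checksum_algo, checksum)
```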
Upgrade Module Updates
If an upgrade Plan does not use the new UpgradeInstructions field, existing functionality will be maintained.
The parsing of the info field as either a URL or binaries JSON will be deprecated.
During validation, if the info field is used as such, a warning will be issued, but not an error.
We will update the creation of the upgrade-info.json file to include the UpgradeInstructions.
We will update the optional validation available via CLI to account for the new Plan structure.
We will add the following validation:
- If UpgradeInstructions are provided:
  - There MUST be at least one entry in artifacts.
  - All of the artifacts MUST have a unique platform.
  - For each Artifact, if the url contains a checksum query parameter:
    - The checksum query parameter value MUST be in the format of {checksum_algo}:{checksum}.
    - The {checksum} from the query parameter MUST equal the checksum provided in the Artifact.
    - The {checksum_algo} from the query parameter MUST equal the checksum_algo provided in the Artifact.
- The following validation is currently done using the info field. We will apply similar validation to the UpgradeInstructions. For each Artifact:
  - The platform MUST have the format {OS}/{CPU} or be "any".
  - The url field MUST NOT be empty.
  - The url field MUST be a proper URL.
  - A checksum MUST be provided either in the checksum field or as a query parameter in the url.
  - If the checksum field has a value and the url also has a checksum query parameter, the two values MUST be equal.
  - The url MUST return either a file or an archive containing either bin/{DAEMON_NAME} or {DAEMON_NAME}.
  - If a checksum is provided (in the field or as a query param), the checksum of the result of the url MUST equal the provided checksum.
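A few of the validation rules above need no network access and can be sketched as a single pass over the artifacts. This Python example is illustrative only: validate_artifacts is a hypothetical name, artifacts are modeled as plain dictionaries, and the "proper URL" test (a scheme and a host must be present) is a simplifying assumption rather than a full RFC 1738 check. Checks that require downloading the artifact are left out.

```python
import re
from typing import Dict, List
from urllib.parse import urlparse

# {OS}/{CPU}: two non-empty parts separated by a single slash.
PLATFORM_RE = re.compile(r"[^/\s]+/[^/\s]+")


def validate_artifacts(artifacts: List[Dict[str, str]]) -> List[str]:
    """Return a list of human-readable validation errors (empty if valid)."""
    errors: List[str] = []
    if not artifacts:
        errors.append("artifacts must contain at least one entry")
    seen = set()
    for index, artifact in enumerate(artifacts):
        prefix = f"artifacts[{index}]: "
        platform = artifact.get("platform", "")
        url = artifact.get("url", "")
        if platform != "any" and not PLATFORM_RE.fullmatch(platform):
            errors.append(prefix + "platform must be {OS}/{CPU} or 'any'")
        if platform in seen:
            errors.append(prefix + "duplicate platform")
        seen.add(platform)
        parsed = urlparse(url)
        if not url:
            errors.append(prefix + "url must not be empty")
        elif not parsed.scheme or not parsed.netloc:
            errors.append(prefix + "url is not a proper URL")
        if artifact.get("checksum") and not artifact.get("checksum_algo"):
            errors.append(prefix + "checksum_algo is required with checksum")
    return errors
```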
Downloading of an Artifact will happen the same way that URLs from info are currently downloaded.
Cosmovisor Updates
If the upgrade-info.json file does not contain any UpgradeInstructions, existing functionality will be maintained.
We will update Cosmovisor to look for and handle the new UpgradeInstructions in upgrade-info.json.
If the UpgradeInstructions are provided, we will do the following:
- The info field will be ignored.
- The artifacts field will be used to identify the artifact to download based on the platform that Cosmovisor is running in.
- If a checksum is provided (either in the field or as a query param in the url), and the downloaded artifact has a different checksum, the upgrade process will be interrupted and Cosmovisor will exit with an error.
- If a pre_run command is defined, it will be executed at the same point in the process where the app pre-upgrade command would have been executed. It will be executed using the same environment as other commands run by Cosmovisor.
- If a post_run command is defined, it will be executed after executing the command that restarts the chain. It will be executed in a background process using the same environment as the other commands. Any output generated by the command will be logged. Once complete, the exit code will be logged.
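The checksum comparison mentioned in the list above could look roughly like this. The sketch uses Python's standard hashlib module; checksum_matches and SUPPORTED_ALGOS are hypothetical names, and lower-casing the expected value before comparing is an assumption, since the ADR only requires a hex encoded string.

```python
import hashlib

# Algorithms named in the ADR (the first two are the recommended ones).
SUPPORTED_ALGOS = ("sha256", "sha512", "sha1", "md5")


def checksum_matches(data: bytes, checksum: str, checksum_algo: str) -> bool:
    """Compare the hex digest of the downloaded bytes with the expected value."""
    if checksum_algo not in SUPPORTED_ALGOS:
        raise ValueError(f"unsupported checksum_algo: {checksum_algo}")
    actual = hashlib.new(checksum_algo, data).hexdigest()
    return actual == checksum.lower()
```

A tool following the ADR would stop the upgrade and exit with an error when this returns False.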
We will deprecate the use of the info field for anything other than human readable information.
A warning will be logged if the info field is used to define the assets (either by URL or JSON).
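The new upgrade timeline is very similar to the current one. The changes are the optional UpgradeInstructions in the upgrade-info.json file, the pre_run command, and the new post_run step: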
- An upgrade governance proposal is submitted and approved.
- The upgrade height is reached.
- The x/upgrade module writes the upgrade-info.json file (now possibly with UpgradeInstructions).
- The chain halts.
- Cosmovisor backs up the data directory (if set up to do so).
- Cosmovisor downloads the new executable (if not already in place).
- Cosmovisor executes the pre_run command if provided, or else the ${DAEMON_NAME} pre-upgrade command.
- Cosmovisor restarts the app using the new version and same args originally provided.
- Cosmovisor immediately runs the post_run command in a detached process.
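The final step of the timeline, running post_run without blocking the chain and logging its output and exit code, might be sketched as below. This is an assumption-laden Python illustration: start_post_run and collect_post_run_log are hypothetical names, the command string is split with shlex as a shell would, and the child is an ordinary background process rather than a fully detached one.

```python
import shlex
import subprocess


def start_post_run(command: str, workdir: str) -> subprocess.Popen:
    """Start the post_run command in the background and return its handle."""
    return subprocess.Popen(
        shlex.split(command),
        cwd=workdir,                  # {DAEMON_HOME}/cosmovisor/{upgrade name}
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,     # capture both streams for logging
        text=True,
    )


def collect_post_run_log(proc: subprocess.Popen) -> str:
    """Wait for the command and build a log line; the result is only logged."""
    output, _ = proc.communicate()
    return f"post_run exited with code {proc.returncode}: {output.strip()}"
```

In this sketch the log line is returned to the caller; the exit code never influences the running chain, in line with the SHOULD NOT requirement for post_run.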
Consequences
Backwards Compatibility
Since the only change to existing definitions is the addition of the instructions field to the Plan message, and that field is optional, there are no backwards incompatibilities with respect to the proto messages.
Additionally, current behavior will be maintained when no UpgradeInstructions are provided, so there are no backwards incompatibilities with respect to either the upgrade module or Cosmovisor.
Forwards Compatibility
In order to utilize the UpgradeInstructions as part of a software upgrade, both of the following must be true:
- The chain must already be using a sufficiently advanced version of the Cosmos SDK.
- The chain's nodes must be using a sufficiently advanced version of Cosmovisor.
Positive
- The structure for defining artifacts is clearer since it is now defined in the proto instead of in documentation.
- Availability of a pre-run command becomes more obvious.
- A post-run command becomes possible.
Negative
- The Plan message becomes larger. This is negligible because A) the x/upgrade module only stores at most one upgrade plan, and B) upgrades are rare enough that the increased gas cost isn't a concern.
- There is no option for providing a URL that will return the UpgradeInstructions.
- The only way to provide multiple assets (executables and other files) for a platform is to use an archive as the platform's artifact.
Neutral
- Existing functionality of the info field is maintained when the UpgradeInstructions aren't provided.
Further Discussions
- Draft PR #10032 Comment: Consider different names for UpgradeInstructions instructions (either the message type or field name).
- Draft PR #10032 Comment:
  - Consider putting the string platform field inside UpgradeInstructions and make UpgradeInstructions a repeated field in Plan.
  - Consider using a oneof field in the Plan which could either be UpgradeInstructions or else a URL that should return the UpgradeInstructions.
  - Consider allowing info to either be a JSON serialized version of UpgradeInstructions or else a URL that returns that.
- Draft PR #10032 Comment: Consider not including the UpgradeInstructions.description field, using the info field for that purpose instead.
- Draft PR #10032 Comment: Consider allowing multiple artifacts to be downloaded for any given platform by adding a name field to the Artifact message.
- PR #10502 Comment: Allow the new UpgradeInstructions to be provided via URL.
- PR #10502 Comment: Allow definition of a signer for assets (as an alternative to using a checksum).