How Go Mitigates Supply Chain Attacks

Filippo Valsorda 31 March 2022

Modern software engineering is collaborative, and based on reusing Open Source software. That exposes targets to supply chain attacks, where software projects are attacked by compromising their dependencies.


Despite any process or technical measure, every dependency is unavoidably a trust relationship. However, the Go tooling and design help mitigate risk at various stages.


All builds are “locked”

There is no way for changes in the outside world—such as a new version of a dependency being published—to automatically affect a Go build.


Unlike most other package manager files, Go modules don’t have a separate list of constraints and a lock file pinning specific versions. The version of every dependency contributing to any Go build is fully determined by the go.mod file of the main module.


Since Go 1.16, this determinism is enforced by default, and build commands (go build, go test, go install, go run, …) will fail if the go.mod is incomplete. The only commands that will change the go.mod (and therefore the build) are go get and go mod tidy. These commands are not expected to be run automatically or in CI, so changes to dependency trees must be made deliberately and have the opportunity to go through code review.


This is very important for security, because when a CI system or new machine runs go build, the checked-in source is the ultimate and complete source of truth for what will get built. There is no way for third parties to affect that.


Moreover, when a dependency is added with go get, its transitive dependencies are added at the version specified in the dependency’s go.mod file, not at their latest versions, thanks to Minimal version selection. The same happens for invocations of go install example.com/cmd/devtoolx@latest, the equivalents of which in some ecosystems bypass pinning. In Go, the latest version of example.com/cmd/devtoolx will be fetched, but then all the dependencies will be set by its go.mod file.

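
To make the selection rule concrete, here is a minimal sketch of minimal version selection. It is an illustration under simplifying assumptions, not the go command's implementation: versions are plain integers rather than semantic versions, and the module names and the requirement graph are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// req is one requirement: a module path at a minimum version.
// Versions are integers only to keep the sketch short.
type req struct {
	path    string
	version int
}

// graph maps "path@version" to the requirements listed in that
// version's go.mod. Every entry here is hypothetical.
var graph = map[string][]req{
	"main@1":          {{"example.com/a", 1}, {"example.com/b", 2}},
	"example.com/a@1": {{"example.com/c", 1}},
	"example.com/b@2": {{"example.com/c", 2}},
	"example.com/c@3": nil, // published, but nothing requires it
}

func key(r req) string { return fmt.Sprintf("%s@%d", r.path, r.version) }

// buildList visits every requirement reachable from root and keeps,
// for each module, the highest version that some go.mod asks for.
func buildList(root req) map[string]int {
	selected := map[string]int{}
	seen := map[string]bool{}
	var visit func(r req)
	visit = func(r req) {
		k := key(r)
		if seen[k] {
			return
		}
		seen[k] = true
		if r.version > selected[r.path] {
			selected[r.path] = r.version
		}
		for _, dep := range graph[k] {
			visit(dep)
		}
	}
	visit(root)
	return selected
}

func main() {
	list := buildList(req{"main", 1})
	paths := make([]string, 0, len(list))
	for p := range list {
		paths = append(paths, p)
	}
	sort.Strings(paths)
	for _, p := range paths {
		fmt.Printf("%s v%d\n", p, list[p])
	}
}
```

In this sketch example.com/c resolves to version 2, the highest version any go.mod in the graph requires, even though version 3 has been published. A newly released version changes nothing until some go.mod is edited to require it.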

If a module gets compromised and a new malicious version is published, no one will be affected until they explicitly update that dependency, providing the opportunity to review the changes and time for the ecosystem to detect the event.


Version contents never change

Another key property necessary to ensure third parties can’t affect builds is that the contents of a module version are immutable. If an attacker that compromises a dependency could re-upload an existing version, they could automatically compromise all projects that depend on it.


That’s what the go.sum file is for. It contains a list of cryptographic hashes of each dependency that contributes to the build. Again, an incomplete go.sum causes an error, and only go get and go mod tidy will modify it, so any changes to it will accompany a deliberate dependency change. Other builds are guaranteed to have a full set of checksums.

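
As a rough illustration of what such a hash covers, the sketch below hashes an in-memory file tree in the style of the "h1:" entries found in go.sum: each file is hashed with SHA-256, the per-file digests are listed in sorted order, and the listing itself is hashed and base64-encoded. This is a simplified sketch; the file names and contents are hypothetical, and the real logic lives in the golang.org/x/mod/sumdb/dirhash package.

```go
package main

import (
	"crypto/sha256"
	"encoding/base64"
	"fmt"
	"sort"
)

// hashTree returns an "h1:"-style digest of a set of files.
// The map key is the file name, the value is the file content.
func hashTree(files map[string]string) string {
	names := make([]string, 0, len(files))
	for name := range files {
		names = append(names, name)
	}
	sort.Strings(names) // the result must not depend on map order

	outer := sha256.New()
	for _, name := range names {
		sum := sha256.Sum256([]byte(files[name]))
		// One line per file: hex digest, two spaces, file name.
		fmt.Fprintf(outer, "%x  %s\n", sum, name)
	}
	return "h1:" + base64.StdEncoding.EncodeToString(outer.Sum(nil))
}

func main() {
	// Hypothetical module contents.
	original := map[string]string{
		"example.com/modulex@v1.9.2/go.mod":  "module example.com/modulex\n",
		"example.com/modulex@v1.9.2/main.go": "package modulex\n",
	}
	tampered := map[string]string{
		"example.com/modulex@v1.9.2/go.mod":  "module example.com/modulex\n",
		"example.com/modulex@v1.9.2/main.go": "package modulex // backdoor\n",
	}
	fmt.Println("original:", hashTree(original))
	fmt.Println("tampered:", hashTree(tampered))
	fmt.Println("match:   ", hashTree(original) == hashTree(tampered))
}
```

Changing a single byte in any file yields a different digest, which is why a re-uploaded version with modified contents would fail verification against an existing go.sum entry.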

This is a common feature of most lock files. Go goes beyond it with the Checksum Database (sumdb for short), a global append-only cryptographically-verifiable list of go.sum entries. When go get needs to add an entry to the go.sum file, it fetches it from the sumdb along with cryptographic proof of the sumdb integrity. This ensures that not only every build of a certain module uses the same dependency contents, but that every module out there uses the same dependency contents!

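
The "append-only, cryptographically verifiable" property comes from storing the entries in a Merkle tree. The sketch below computes a tree head using the hashing scheme from RFC 6962 and shows that rewriting an existing entry changes the head. It only illustrates the idea: the entries are hypothetical placeholders, and the real sumdb adds signed tree heads, inclusion and consistency proofs, and a tiled storage layout that are not shown here.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

type hash [sha256.Size]byte

// leafHash and nodeHash use distinct prefixes (0x00 and 0x01) so that
// a leaf can never be confused with an interior node.
func leafHash(entry string) hash {
	return sha256.Sum256(append([]byte{0x00}, entry...))
}

func nodeHash(left, right hash) hash {
	buf := append([]byte{0x01}, left[:]...)
	buf = append(buf, right[:]...)
	return sha256.Sum256(buf)
}

// root computes the tree head over one or more entries, splitting at
// the largest power of two smaller than the number of entries.
func root(entries []string) hash {
	if len(entries) == 1 {
		return leafHash(entries[0])
	}
	k := 1
	for k*2 < len(entries) {
		k *= 2
	}
	return nodeHash(root(entries[:k]), root(entries[k:]))
}

func main() {
	// Hypothetical log entries; the hashes are placeholders.
	entries := []string{
		"example.com/modulex v1.9.1 h1:placeholder-1",
		"example.com/modulex v1.9.2 h1:placeholder-2",
		"example.com/othermod v0.3.0 h1:placeholder-3",
	}
	head := root(entries)
	fmt.Printf("tree head over %d entries: %x\n", len(entries), head)

	// Rewriting an existing entry produces a different tree head.
	tampered := append([]string(nil), entries...)
	tampered[1] = "example.com/modulex v1.9.2 h1:backdoored"
	fmt.Println("tampered log matches head:", root(tampered) == head)
}
```

Because every client checks entries against a tree head, the log operator cannot serve a modified entry to one client without that entry hashing to a head that differs from the one everyone else sees.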

The sumdb makes it impossible for compromised dependencies or even Google-operated Go infrastructure to target specific dependents with modified (e.g. backdoored) source. You’re guaranteed to be using the exact same code that everyone else who’s using e.g. v1.9.2 of example.com/modulex is using and has reviewed.


Finally, my favorite features of the sumdb: it doesn’t require any key management on the part of module authors, and it works seamlessly with the decentralized nature of Go modules.


The VCS is the source of truth

Most projects are developed through some version control system (VCS) and then, in other ecosystems, uploaded to the package repository. This means there are two accounts that could be compromised, the VCS host and the package repository, the latter of which is used more rarely and more likely to be overlooked. It also means it’s easier to hide malicious code in the version uploaded to the repository, especially if the source is routinely modified as part of the upload, for example to minimize it.


In Go, there is no such thing as a package repository account. The import path of a package embeds the information that go mod download needs in order to fetch its module directly from the VCS, where tags define versions.


We do have the Go Module Mirror, but that’s only a proxy. Module authors don’t register an account and don’t upload versions to the proxy. The proxy uses the same logic that the go tool uses (in fact, the proxy runs go mod download) to fetch and cache a version. Since the Checksum Database guarantees that there can be only one source tree for a given module version, everyone using the proxy will see the same result as everyone bypassing it and fetching directly from the VCS. (If the version is not available anymore in the VCS or if its contents changed, fetching directly will lead to an error, while fetching from the proxy might still work, improving availability and protecting the ecosystem from “left-pad” issues.)

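
Because the proxy is only a cache, fetching from it requires no credentials: a module version lives at a path derived from the module path and the version. The sketch below builds the URL of a version's zip file following the GOPROXY protocol layout. It is a simplified illustration: the module name is hypothetical, and the real path escaping (in golang.org/x/mod/module) also validates its input.

```go
package main

import (
	"fmt"
	"strings"
)

// escape replaces each uppercase ASCII letter with '!' followed by its
// lowercase form, so that paths stay distinct on case-insensitive
// file systems.
func escape(s string) string {
	var b strings.Builder
	for _, r := range s {
		if 'A' <= r && r <= 'Z' {
			b.WriteByte('!')
			r += 'a' - 'A'
		}
		b.WriteRune(r)
	}
	return b.String()
}

// zipURL returns where a proxy serves the source archive of a version.
func zipURL(proxy, module, version string) string {
	return proxy + "/" + escape(module) + "/@v/" + escape(version) + ".zip"
}

func main() {
	// example.com/ModuleX is a hypothetical module path.
	fmt.Println(zipURL("https://proxy.golang.org", "example.com/ModuleX", "v1.9.2"))
	// https://proxy.golang.org/example.com/!module!x/@v/v1.9.2.zip
}
```

Whatever the proxy returns at that path is still checked against go.sum and the Checksum Database, so the proxy does not need to be trusted for integrity.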

Running VCS tools on the client exposes a pretty large attack surface. That’s another place the Go Module Mirror helps: the go tool on the proxy runs inside a robust sandbox and is configured to support every VCS tool, while the default is to only support the two major VCS systems (git and Mercurial). Anyone using the proxy can still fetch code published using off-by-default VCS systems, but attackers can’t reach that code in most installations.


Building code doesn’t execute it

It is an explicit security design goal of the Go toolchain that neither fetching nor building code will let that code execute, even if it is untrusted and malicious. This is different from most other ecosystems, many of which have first-class support for running code at package fetch time. These “post-install” hooks have been used in the past as the most convenient way to turn a compromised dependency into compromised developer machines, and to worm through module authors.


To be fair, if you’re fetching some code it’s often to execute it shortly afterwards, either as part of tests on a developer machine or as part of a binary in production, so lacking post-install hooks is only going to slow down attackers. (There is no security boundary within a build: any package that contributes to a build can define an init function.) However, it can be a meaningful risk mitigation, since you might be executing a binary or testing a package that only uses a subset of the module’s dependencies. For example, if you build and execute example.com/cmd/devtoolx on macOS there is no way for a Windows-only dependency or a dependency of example.com/cmd/othertool to compromise your machine.


In Go, modules that don’t contribute code to a specific build have no security impact on it.
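
The remark about init functions can be seen in a few lines. In this minimal single-file sketch, the init function stands in for one defined by a dependency that is linked into the binary; the variable name is hypothetical.

```go
package main

import "fmt"

// sideEffects records what ran before main started.
var sideEffects []string

// init runs automatically at program start. Any package linked into
// the binary can declare one, and the main package cannot opt out.
func init() {
	sideEffects = append(sideEffects, "init ran before main")
}

func main() {
	fmt.Println(sideEffects)
}
```

The code runs when the binary is executed, not when it is fetched or built, and only packages that are actually compiled into the binary get that chance.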


“A little copying is better than a little dependency”

The final and maybe most important software supply chain risk mitigation in the Go ecosystem is the least technical one: Go has a culture of rejecting large dependency trees, and of preferring a bit of copying to adding a new dependency. It goes all the way back to one of the Go proverbs: “a little copying is better than a little dependency”. The label “zero dependencies” is proudly worn by high-quality reusable Go modules. If you find yourself in need of a library, you’re likely to find it will not cause you to take on a dependency on dozens of other modules by other authors and owners.


That’s enabled also by the rich standard library and additional modules (the golang.org/x/... ones), which provide commonly used high-level building blocks such as an HTTP stack, a TLS library, JSON encoding, etc.


All together this means it’s possible to build rich, complex applications with just a handful of dependencies. No matter how good the tooling is, it can’t eliminate the risk involved in reusing code, so the strongest mitigation will always be a small dependency tree.


