Merge pull request #694 from phil-opp/rustcc-translation

Add translations from rustcc/writing-an-os-in-rust
This commit is contained in:
Philipp Oppermann
2020-02-17 10:37:17 +01:00
committed by GitHub
4 changed files with 2024 additions and 305 deletions

View File

@@ -1,18 +1,18 @@
+++
title = "独立的Rust二进制"
title = "独立式可执行程序"
weight = 1
path = "zh-CN/freestanding-rust-binary"
date = 2018-02-10
+++
创建我们自己的操作系统内核的第一步是创建一个不链接标准库的Rust可执行文件。 这使得无需基础操作系统即可在[裸机][bare metal]上运行Rust代码。
创建一个不链接标准库的 Rust 可执行文件,将是我们迈出的第一步。无需底层操作系统的支撑,这样才能在**裸机**[bare metal]上运行 Rust 代码。
[bare metal]: https://en.wikipedia.org/wiki/Bare_machine
<!-- more -->
此博客在[GitHub]上公开开发. 如果您有任何问题或疑问,请在此处打开一个问题。 您也可以在[底部][at the bottom]发表评论. 这篇文章的完整源代码可以在[`post-01`] [post branch]分支中找到。
此博客在 [GitHub] 上公开开发. 如果您有任何问题或疑问,请在此处打开一个 issue。 您也可以在[底部][at the bottom]发表评论. 这篇文章的完整源代码可以在 [`post-01`] [post branch] 分支中找到。
[GitHub]: https://github.com/phil-opp/blog_os
[at the bottom]: #comments
@@ -20,41 +20,27 @@ date = 2018-02-10
<!-- toc -->
## 介
要编写操作系统内核,我们需要不依赖于任何操作系统功能的代码。 这意味着我们不能使用线程文件堆内存网络随机数标准输出或任何其他需要操作系统抽象或特定硬件的功能。这很有意义因为我们正在尝试编写自己的OS和我们的驱动程序。
##
这意味着我们不能使用大多数[Rust标准库][Rust standard library]但是我们可以使用很多Rust功能。 例如,我们可以使用[迭代器][iterators][闭包][closures][模式匹配][pattern matching][option]和[result][string formatting],当然也可以使用[所有权系统][ownership system]。 这些功能使以一种非常有表现力的高级方式编写内核成为可能,而无需担心[不确定的行为][undefined behavior]或[内存安全性][memory safety]
要编写一个操作系统内核,我们需要编写不依赖任何操作系统特性的代码。这意味着我们不能使用线程、文件、堆内存、网络、随机数、标准输出,或其它任何需要操作系统抽象和特定硬件的特性;因为我们正在编写自己的操作系统和硬件驱动
[option]: https://doc.rust-lang.org/core/option/
[result]:https://doc.rust-lang.org/core/result/
[Rust standard library]: https://doc.rust-lang.org/std/
[iterators]: https://doc.rust-lang.org/book/ch13-02-iterators.html
[closures]: https://doc.rust-lang.org/book/ch13-01-closures.html
[pattern matching]: https://doc.rust-lang.org/book/ch06-00-enums.html
[string formatting]: https://doc.rust-lang.org/core/macro.write.html
[ownership system]: https://doc.rust-lang.org/book/ch04-00-understanding-ownership.html
[undefined behavior]: https://www.nayuki.io/page/undefined-behavior-in-c-and-cplusplus-programs
[memory safety]: https://tonyarcieri.com/it-s-time-for-a-memory-safety-intervention
实现这一点,意味着我们不能使用 [Rust标准库](https://doc.rust-lang.org/std/)的大部分;但还有很多 Rust 特性是我们依然可以使用的。比如说,我们可以使用[迭代器](https://doc.rust-lang.org/book/ch13-02-iterators.html)、[闭包](https://doc.rust-lang.org/book/ch13-01-closures.html)、[模式匹配](https://doc.rust-lang.org/book/ch06-00-enums.html)、[Option](https://doc.rust-lang.org/core/option/)、[Result](https://doc.rust-lang.org/core/result/index.html)、[字符串格式化](https://doc.rust-lang.org/core/macro.write.html),当然还有[所有权系统](https://doc.rust-lang.org/book/ch04-00-understanding-ownership.html)。这些功能让我们能够编写表达性强、高层抽象的操作系统,而无需关心[未定义行为](https://www.nayuki.io/page/undefined-behavior-in-c-and-cplusplus-programs)和[内存安全](https://tonyarcieri.com/it-s-time-for-a-memory-safety-intervention)。
为了Rust中创建OS内核,我们需要创建一个无需底层操作系统即可运行的可执行文件。 此类可执行文件通常称为“独立式”或“裸机”可执行文件
为了Rust 编写一个操作系统内核,我们需要创建一个独立于操作系统的可执行程序。这样的可执行程序常被称作**独立式可执行程序**freestanding executable或**裸机程序**(bare-metal executable)
这篇文章描述了创建一个独立的Rust二进制文件的必要步骤解释为什么需要这些步骤。 如果您仅对一个最小的示例感兴趣,可以 **[跳转到摘要](summary)**
这篇文章里,我们将逐步地创建一个独立式可执行程序,并且详细解释为什么每个步骤都是必须的。如果读者只对最终的代码感兴趣,可以跳转到本篇文章的小结部分
## 禁用标准库
默认情况下所有Rust crate都链接[标准库][standard library],该库取决于操作系统的线程,文件或网络等功能。 它还依赖于C标准库“ libc”该库与OS服务紧密交互。 由于我们的计划是编写一个操作系统因此我们不能使用任何依赖于OS的库。因此我们必须通过[`no_std` 属性][`no_std` attribute]禁用自动包含标准库。
[standard library]: https://doc.rust-lang.org/std/
[`no_std` attribute]: https://doc.rust-lang.org/1.30.0/book/first-edition/using-rust-without-the-standard-library.html
在默认情况下,所有的 Rust **包**crate都会链接**标准库**[standard library](https://doc.rust-lang.org/std/)),而标准库依赖于操作系统功能,如线程、文件系统、网络。标准库还与 **Rust 的 C 语言标准库实现库**libc相关联它也是和操作系统紧密交互的。既然我们的计划是编写自己的操作系统我们就需要不使用任何与操作系统相关的库——因此我们必须禁用**标准库自动引用**automatic inclusion。使用 [no_std 属性](https://doc.rust-lang.org/book/first-edition/using-rust-without-the-standard-library.html)可以实现这一点。
我们首先创建一个新的货物应用项目。 最简单的法是通过命令
我们可以从创建一个新的 cargo 项目开始。最简单的法是使用下面的命令:
```
cargo new blog_os --bin --edition 2018
```bash
> cargo new blog_os
```
我将项目命名为`blog_os`但是您当然可以选择自己的名字。 --bin标志指定我们要创建一个可执行二进制文件(与库相反),而--edition 2018标志指定我们要为crate使用Rust的[2018版][2018 edition]。 当我们运行命令时cargo为我们创建以下目录结构:
[2018 edition]: https://rust-lang-nursery.github.io/edition-guide/rust-2018/index.html
在这里我把项目命名为 `blog_os`,当然读者也可以选择自己的项目名称。这里cargo 默认为我们添加了`--bin` 选项,说明我们要创建一个可执行文件而不是一个库cargo还为我们添加了`--edition 2018` 标签,指明项目的包要使用 Rust**2018 版次**[2018 edition](https://rust-lang-nursery.github.io/edition-guide/rust-2018/index.html))。当我们执行这行指令的时候cargo 为我们创建目录结构如下
```
blog_os
@@ -63,12 +49,11 @@ blog_os
└── main.rs
```
`Cargo.toml`包含crate构造例如crate名称作者[语义化版本][semantic version]号码,和依赖关系。 `src/main.rs`文件包含crate的根模块和main函数。您可以通过`cargo build`来编译crate,然后在`target/debug`文件夹中运行已编译的`blog_os`二进制文件。
[semantic version]: http://semver.org/
这里,`Cargo.toml` 文件包含了包的**配置**configuration比如包的名称作者、[semver版本](http://semver.org/) 和项目依赖项;`src/main.rs` 文件包含包的**根模块**root modulemain 函数。我们可以使用 `cargo build` 来编译这个包,然后在 `target/debug` 文件夹内找到编译好的 `blog_os` 二进制文件。
### `no_std` 属性
### no_std 属性
现在我们的crate隐式链接了标准库。 让我们尝试通过添加[`no_std` 属性]禁用此功能
现在我们的包依然隐式地与标准库链接。为了禁用这种链接,我们可以尝试添加 [no_std 属性](https://doc.rust-lang.org/book/first-edition/using-rust-without-the-standard-library.html)
```rust
// main.rs
@@ -80,22 +65,19 @@ fn main() {
}
```
当我们尝试立即构建它(通过运行`cargo build`)时,会发生以下错误:
看起来很顺利。当我们使用 `cargo build` 来编译的时候,却出现了下面的错误:
```
```rust
error: cannot find macro `println!` in this scope
--> src/main.rs:4:5
--> src\main.rs:4:5
|
4 | println!("Hello, world!");
| ^^^^^^^
```
发生此错误的原因是[`println`宏]是标准库的一部分,我们不再包含这个库。 因此我们无法再打印东西。这是有道理的,因为`println`写入[标准输出][standard output]这是操作系统提供的特殊文件描述符
出现这个错误的原因是[println! 宏](https://doc.rust-lang.org/std/macro.println.html)是标准库的一部分,我们的项目不再依赖于标准库。我们选择不再打印字符串。这也很好理解,因为 `println!` 将会向**标准输出**[standard output](https://en.wikipedia.org/wiki/Standard_streams#Standard_output_.28stdout.29))打印字符,它依赖于特殊的文件描述符,而这是操作系统提供的特
[`println` macro]: https://doc.rust-lang.org/std/macro.println.html
[standard output]: https://en.wikipedia.org/wiki/Standard_streams#Standard_output_.28stdout.29
因此让我们删除打印件然后使用空的main函数重试
所以我们可以移除这行代码,使用一个空的 main 函数再次尝试编译:
```rust
// main.rs
@@ -111,50 +93,37 @@ error: `#[panic_handler]` function required, but not found
error: language item required, but not found: `eh_personality`
```
现在,编译器缺少`#[panic_handler]`函数和_language item_
现在我们发现,编译器缺少一个 `#[panic_handler]` 函数和一个**语言项**language item
## Panic 实现
## 实现 panic 处理函数
`panic_handler`属性定义了发生[panic]时编译器应调用的函数。标准库提供了自己的应急处理函数,但`no_std`环境中,我们需要自己定义它
[panic]: https://doc.rust-lang.org/stable/book/ch09-01-unrecoverable-errors-with-panic.html
`panic_handler` 属性定义了一个函数,它会在一个 panic 发生时被调用。标准库提供了自己的 panic 处理函数,但在 `no_std` 环境中,我们需要定义一个自己的 panic 处理函数
```rust
// in main.rs
use core::panic::PanicInfo;
/// This function is called on panic.
/// 这个函数将在 panic 时被调用
#[panic_handler]
fn panic(_info: &PanicInfo) -> ! {
loop {}
}
```
[`PanicInfo`参数][PanicInfo]包含发生异常的文件和行以及可选的异常消息。该函数永远不应该返回,因此通过返回[“never” type] `!`将其标记为[diverging function]。 目前,我们无法在此函数中执行太多操作,因此我们只是做无限循环
类型为 [PanicInfo](https://doc.rust-lang.org/nightly/core/panic/struct.PanicInfo.html) 的参数包含了 panic 发生的文件名、代码行数和可选的错误信息。这个函数从不返回,所以他被标记为**发散函数**[diverging function](https://doc.rust-lang.org/book/first-edition/functions.html#diverging-functions))。发散函数的返回类型称作 **Never 类型**["never" type](https://doc.rust-lang.org/nightly/std/primitive.never.html)),记为`!`。对这个函数,我们目前能做的很少,所以我们只需编写一个无限循环 `loop {}`
[PanicInfo]: https://doc.rust-lang.org/nightly/core/panic/struct.PanicInfo.html
[diverging function]: https://doc.rust-lang.org/1.30.0/book/first-edition/functions.html#diverging-functions
[“never” type]: https://doc.rust-lang.org/nightly/std/primitive.never.html
## eh_personality 语言项
## `eh_personality` 语言项
语言项是一些编译器需求的特殊函数或类型。举例来说Rust 的 [Copy](https://doc.rust-lang.org/nightly/core/marker/trait.Copy.html) trait 是一个这样的语言项,告诉编译器哪些类型需要遵循**复制语义**[copy semantics](https://doc.rust-lang.org/nightly/core/marker/trait.Copy.html))——当我们查找 `Copy` trait 的[实现](https://github.com/rust-lang/rust/blob/485397e49a02a3b7ff77c17e4a3f16c653925cb3/src/libcore/marker.rs#L296-L299)时,我们会发现,一个特殊的 `#[lang = "copy"]` 属性将它定义为了一个语言项,达到与编译器联系的目的。
语言项是编译器内部所需的特殊功能和类型。例如,[`Copy`]特征是一种语言项目,它告诉编译器哪些类型具有 [_copy语义_][`Copy`]。当我们查看[实现][copy code]时,我们看到它具有特殊的`#[lang = "copy"]`属性,将该属性定义为语言项。
我们可以自己实现语言项,但这是下下策:目前来看,语言项是高度不稳定的语言细节实现,它们不会经过编译期类型检查(所以编译器甚至不确保它们的参数类型是否正确)。幸运的是,我们有更稳定的方式,来修复上面的语言项错误
[`Copy`]: https://doc.rust-lang.org/nightly/core/marker/trait.Copy.html
[copy code]: https://github.com/rust-lang/rust/blob/485397e49a02a3b7ff77c17e4a3f16c653925cb3/src/libcore/marker.rs#L296-L299
`eh_personality` 语言项标记的函数,将被用于实现**栈展开**[stack unwinding](http://www.bogotobogo.com/cplusplus/stackunwinding.php))。在使用标准库的情况下,当 panic 发生时Rust 将使用栈展开,来运行在栈上所有活跃的变量的**析构函数**destructor——这确保了所有使用的内存都被释放允许调用程序的**父进程**parent thread捕获 panic处理并继续运行。但是栈展开是一个复杂的过程如 Linux 的 [libunwind](http://www.nongnu.org/libunwind/) 或 Windows 的**结构化异常处理**[structured exception handling, SEH](https://msdn.microsoft.com/en-us/library/windows/desktop/ms680657(v=vs.85).aspx)),通常需要依赖于操作系统的库;所以我们不在自己编写的操作系统中使用它。
提供自己的语言项目实现是可能的,但这只能作为最后的选择。 原因是语言项是高度不稳定的实现细节,甚至没有类型检查(因此编译器甚至不检查函数是否具有正确的参数类型)。幸运的是,有更稳定的方法来修复上述语言项错误。
### 禁用栈展开
`eh_personality`语言项标记了用于实现[堆栈展开][stack unwinding]的功能。默认情况下Rust使用展开来运行所有活动堆栈变量的析构函数以防出现[panic]情况。 这样可以确保释放所有使用的内存并允许父线程捕获紧急情况并继续执行。但是展开是一个复杂的过程需要某些特定于操作系统的库例如Linux上的[libunwind]或Windows上的[结构化异常处理][structured exception handling]),因此我们不想在操作系统中使用它。
[stack unwinding]: http://www.bogotobogo.com/cplusplus/stackunwinding.php
[libunwind]: http://www.nongnu.org/libunwind/
[structured exception handling]: https://msdn.microsoft.com/en-us/library/windows/desktop/ms680657(v=vs.85).aspx
### 禁用展开
还有其他一些用例不希望展开因此Rust提供了[中止异常][abort on panic]的选项。 这禁用了展开符号信息的生成,因此大大减小了二进制大小。我们可以在多个地方禁用展开功能。 最简单的方法是将以下几行添加到我们的`Cargo.toml`:
在其它一些情况下栈展开并不是迫切需求的功能因此Rust 提供了**在 panic 时中止**[abort on panic](https://github.com/rust-lang/rust/pull/32900))的选项。这个选项能禁用栈展开相关的标志信息生成,也因此能缩小生成的二进制程序的长度。有许多方式能打开这个选项,最简单的方式是把下面的几行设置代码加入我们的 `Cargo.toml`
```toml
[profile.dev]
@@ -164,33 +133,28 @@ panic = "abort"
panic = "abort"
```
会将`dev`配置文件(用于`cargo build`)和`release`配置文件(用于`cargo build --release`)的应急策略设置为`abort`。 现在,不再需要`eh_personality`语言项目了
些选项能将 **dev 配置**dev profile**release 配置**release profile的 panic 策略设为 `abort``dev` 配置用于 `cargo build`,而 `release` 配置用于 `cargo build --release`。现在编译器应该不再要求我们提供 `eh_personality` 语言项实现
[abort on panic]: https://github.com/rust-lang/rust/pull/32900
现在我们已经修复了出现的两个错误,可以开始编译了。然而,尝试编译运行后,一个新的错误出现了:
现在,我们修复了以上两个错误。 但是,如果我们现在尝试对其进行编译,则会发生另一个错误:
```
```bash
> cargo build
error: requires `start` lang_item
```
我们的程序缺少定义入口点的`start`语言项
## start 语言项
## `start` 属性
这里,我们的程序遗失了 `start` 语言项,它将定义一个程序的**入口点**entry point
可能会认为`main`函数是运行程序时调用的第一个函数。 但是,大多数语言都有一个[运行时系统][runtime system]它负责诸如垃圾回收例如Java或软件线程例如Go中的goroutines之类的事情。 这个运行时需要在`main`之前调用,因为它需要初始化自己
我们通常会认为,当运行一个程序时,首先被调用的是 `main` 函数。但是,大多数语言都有一个**运行时系统**[runtime system](https://en.wikipedia.org/wiki/Runtime_system)),它通常为**垃圾回收**garbage collection或**绿色线程**software threads或 green threads服务如 Java 的 GC 或 Go 语言的协程goroutine这个运行时系统需要在 main 函数前启动,因为它需要让程序初始化。
[runtime system]: https://en.wikipedia.org/wiki/Runtime_system
在一个典型的使用标准库的 Rust 程序中,程序运行是从一个名为 `crt0` 的运行时库开始的。`crt0` 意为 C runtime zero它能建立一个适合运行 C 语言程序的环境,这包含了栈的创建和可执行程序参数的传入。在这之后,这个运行时库会调用 [Rust 的运行时入口点](https://github.com/rust-lang/rust/blob/bb4d1491466d8239a7a5fd68bd605e3276e97afb/src/libstd/rt.rs#L32-L73),这个入口点被称作 **start语言项**"start" language item。Rust 只拥有一个极小的运行时,它被设计为拥有较少的功能,如爆栈检测和打印**堆栈轨迹**stack trace。这之后这个运行时将会调用 main 函数。
在链接标准库的典型Rust二进制文件中执行从名为`crt0`(“ C运行时零”的C运行时库开始该运行时库为C应用程序设置了环境。这包括创建堆栈并将参数放在正确的寄存器中。 然后C运行时调用[Rust运行时的入口点][rt::lang_start],该入口由`start`语言项标记。Rust的运行时非常短它可以处理一些小事情例如设置堆栈溢出防护或在紧急情况下打印回溯。 然后,运行时最终调用`start`函数
我们的独立式可执行程序并不能访问 Rust 运行时或 `crt0` 库,所以我们需要定义自己的入口点。只实现一个 `start` 语言项并不能帮助我们,因为这之后程序依然要求 `crt0` 库。所以,我们要做的是,直接重写整个 `crt0` 库和它定义的入口点
[rt::lang_start]: https://github.com/rust-lang/rust/blob/bb4d1491466d8239a7a5fd68bd605e3276e97afb/src/libstd/rt.rs#L32-L73
### 重写入口点
Our freestanding executable does not have access to the Rust runtime and `crt0`, so we need to define our own entry point. Implementing the `start` language item wouldn't help, since it would still require `crt0`. Instead, we need to overwrite the `crt0` entry point directly.
### Overwriting the Entry Point
To tell the Rust compiler that we don't want to use the normal entry point chain, we add the `#![no_main]` attribute.
要告诉 Rust 编译器我们不使用预定义的入口点,我们可以添加 `#![no_main]` 属性。
```rust
#![no_std]
@@ -198,14 +162,14 @@ To tell the Rust compiler that we don't want to use the normal entry point chain
use core::panic::PanicInfo;
/// This function is called on panic.
/// 这个函数将在 panic 时被调用
#[panic_handler]
fn panic(_info: &PanicInfo) -> ! {
loop {}
}
```
You might notice that we removed the `main` function. The reason is that a `main` doesn't make sense without an underlying runtime that calls it. Instead, we are now overwriting the operating system entry point with our own `_start` function:
读者也许会注意到,我们移除了 `main` 函数。原因很显然,既然没有底层运行时调用它,`main` 函数也失去了存在的必要性。为了重写操作系统的入口点,我们转而编写一个 `_start` 函数:
```rust
#[no_mangle]
@@ -214,32 +178,25 @@ pub extern "C" fn _start() -> ! {
}
```
By using the `#[no_mangle]` attribute we disable the [name mangling] to ensure that the Rust compiler really outputs a function with the name `_start`. Without the attribute, the compiler would generate some cryptic `_ZN3blog_os4_start7hb173fedf945531caE` symbol to give every function an unique name. The attribute is required because we need to tell the name of the entry point function to the linker in the next step.
我们使用 `no_mangle` 标记这个函数,来对它禁用**名称重整**[name mangling](https://en.wikipedia.org/wiki/Name_mangling))——这确保 Rust 编译器输出一个名为 `_start` 的函数;否则,编译器可能最终生成名为 `_ZN3blog_os4_start7hb173fedf945531caE` 的函数,无法让链接器正确辨别。
We also have to mark the function as `extern "C"` to tell the compiler that it should use the [C calling convention] for this function (instead of the unspecified Rust calling convention). The reason for naming the function `_start` is that this is the default entry point name for most systems.
我们还将函数标记为 `extern "C"`,告诉编译器这个函数应当使用 [C 语言的调用约定](https://en.wikipedia.org/wiki/Calling_convention),而不是 Rust 语言的调用约定。函数名为 `_start` ,是因为大多数系统默认使用这个名字作为入口点名称。
[name mangling]: https://en.wikipedia.org/wiki/Name_mangling
[C calling convention]: https://en.wikipedia.org/wiki/Calling_convention
与前文的 `panic` 函数类似,这个函数的返回值类型为`!`——它定义了一个发散函数,或者说一个不允许返回的函数。这一点很重要,因为这个入口点不会被任何函数调用,但将直接被操作系统或**引导程序**bootloader调用。所以作为函数返回的替代这个入口点应该去调用比如操作系统提供的 **exit 系统调用**["exit" system call](https://en.wikipedia.org/wiki/Exit_(system_call)))函数。在我们编写操作系统的情况下,关机应该是一个合适的选择,因为**当一个独立式可执行程序返回时,不会留下任何需要做的事情**there is nothing to do if a freestanding binary returns。现在来看我们可以添加一个无限循环来满足对返回值类型的需求。
The `!` return type means that the function is diverging, i.e. not allowed to ever return. This is required because the entry point is not called by any function, but invoked directly by the operating system or bootloader. So instead of returning, the entry point should e.g. invoke the [`exit` system call] of the operating system. In our case, shutting down the machine could be a reasonable action, since there's nothing left to do if a freestanding binary returns. For now, we fulfill the requirement by looping endlessly.
如果我们现在编译这段程序,会出来一大段不太好看的**链接器错误**linker error
[`exit` system call]: https://en.wikipedia.org/wiki/Exit_(system_call)
## 链接器错误
When we run `cargo build` now, we get an ugly _linker_ error.
**链接器**linker是一个程序它将生成的目标文件组合为一个可执行文件。不同的操作系统如 Windows、macOS、Linux规定了不同的可执行文件格式因此也各有自己的链接器抛出不同的错误但这些错误的根本原因还是相同的链接器的默认配置假定程序依赖于C语言的运行时环境但我们的程序并不依赖于它。
## Linker Errors
为了解决这个错误我们需要告诉链接器它不应该包含includeC 语言运行环境。我们可以选择提供特定的**链接器参数**linker argument也可以选择编译为**裸机目标**bare metal target
The linker is a program that combines the generated code into an executable. Since the executable format differs between Linux, Windows, and macOS, each system has its own linker that throws a different error. The fundamental cause of the errors is the same: the default configuration of the linker assumes that our program depends on the C runtime, which it does not.
### 编译为裸机目标
To solve the errors, we need to tell the linker that it should not include the C runtime. We can do this either by passing a certain set of arguments to the linker or by building for a bare metal target.
在默认情况下Rust 尝试适配当前的系统环境,编译可执行程序。举个例子,如果你使用 `x86_64` 平台的 Windows 系统Rust 将尝试编译一个扩展名为 `.exe` 的 Windows 可执行程序,并使用 `x86_64` 指令集。这个环境又被称作为你的**宿主系统**"host" system
### Building for a Bare Metal Target
By default Rust tries to build an executable that is able to run in your current system environment. For example, if you're using Windows on `x86_64`, Rust tries to build a `.exe` Windows executable that uses `x86_64` instructions. This environment is called your "host" system.
To describe different environments, Rust uses a string called [_target triple_]. You can see the target triple for your host system by running `rustc --version --verbose`:
[_target triple_]: https://clang.llvm.org/docs/CrossCompilation.html#target-triple
为了描述不同的环境Rust 使用一个称为**目标三元组**target triple的字符串。要查看当前系统的目标三元组我们可以运行 `rustc --version --verbose`
```
rustc 1.35.0-nightly (474e7a648 2019-04-07)
@@ -251,228 +208,59 @@ release: 1.35.0-nightly
LLVM version: 8.0
```
The above output is from a `x86_64` Linux system. We see that the `host` triple is `x86_64-unknown-linux-gnu`, which includes the CPU architecture (`x86_64`), the vendor (`unknown`), the operating system (`linux`), and the [ABI] (`gnu`).
上面这段输出来自一个 `x86_64` 平台下的 Linux 系统。我们能看到,`host` 字段的值为三元组 `x86_64-unknown-linux-gnu`,它包含了 CPU 架构 `x86_64` 、供应商 `unknown` 、操作系统 `linux` 和[二进制接口](https://en.wikipedia.org/wiki/Application_binary_interface) `gnu`
[ABI]: https://en.wikipedia.org/wiki/Application_binary_interface
Rust 编译器尝试为当前系统的三元组编译,并假定底层有一个类似于 Windows 或 Linux 的操作系统提供C语言运行环境——然而这将导致链接器错误。所以为了避免这个错误我们可以另选一个底层没有操作系统的运行环境。
By compiling for our host triple, the Rust compiler and the linker assume that there is an underlying operating system such as Linux or Windows that use the C runtime by default, which causes the linker errors. So to avoid the linker errors, we can compile for a different environment with no underlying operating system.
An example for such a bare metal environment is the `thumbv7em-none-eabihf` target triple, which describes an [embedded] [ARM] system. The details are not important, all that matters is that the target triple has no underlying operating system, which is indicated by the `none` in the target triple. To be able to compile for this target, we need to add it in rustup:
[embedded]: https://en.wikipedia.org/wiki/Embedded_system
[ARM]: https://en.wikipedia.org/wiki/ARM_architecture
这样的运行环境被称作裸机环境,例如目标三元组 `thumbv7em-none-eabihf` 描述了一个 ARM **嵌入式系统**[embedded system](https://en.wikipedia.org/wiki/Embedded_system))。我们暂时不需要了解它的细节,只需要知道这个环境底层没有操作系统——这是由三元组中的 `none` 描述的。要为这个目标编译,我们需要使用 rustup 添加它:
```
rustup target add thumbv7em-none-eabihf
```
This downloads a copy of the standard (and core) library for the system. Now we can build our freestanding executable for this target:
这行命令将为目标下载一个标准库和 core 库。这之后,我们就能为这个目标构建独立式可执行程序了:
```
cargo build --target thumbv7em-none-eabihf
```
By passing a `--target` argument we [cross compile] our executable for a bare metal target system. Since the target system has no operating system, the linker does not try to link the C runtime and our build succeeds without any linker errors.
我们传递了 `--target` 参数,来为裸机目标系统**交叉编译**[cross compile](https://en.wikipedia.org/wiki/Cross_compiler))我们的程序。我们的目标并不包括操作系统,所以链接器不会试着链接 C 语言运行环境,因此构建过程成功会完成,不会产生链接器错误。
[cross compile]: https://en.wikipedia.org/wiki/Cross_compiler
我们将使用这个方法编写自己的操作系统内核。我们不会编译到 `thumbv7em-none-eabihf`,而是使用描述 `x86_64` 环境的**自定义目标**[custom target](https://doc.rust-lang.org/rustc/targets/custom.html))。在下一篇文章中,我们将详细描述一些相关的细节。
This is the approach that we will use for building our OS kernel. Instead of `thumbv7em-none-eabihf`, we will use a [custom target] that describes a `x86_64` bare metal environment. The details will be explained in the next post.
### 链接器参数
[custom target]: https://doc.rust-lang.org/rustc/targets/custom.html
我们也可以选择不编译到裸机系统,因为传递特定的参数也能解决链接器错误问题。虽然我们不会在后面使用到这个方法,为了教程的完整性,我们也撰写了专门的短文章,来提供这个途径的解决方案。
### Linker Arguments
[链接器参数](./appendix-a-linker-arguments.md)
Instead of compiling for a bare metal system, it is also possible to resolve the linker errors by passing a certain set of arguments to the linker. This isn't the approach that we will use for our kernel, therefore this section is optional and only provided for completeness. Click on _"Linker Arguments"_ below to show the optional content.
## 小结
<details>
一个用 Rust 编写的最小化的独立式可执行程序应该长这样:
<summary>Linker Arguments</summary>
In this section we discuss the linker errors that occur on Linux, Windows, and macOS, and explain how to solve them by passing additional arguments to the linker. Note that the executable format and the linker differ between operating systems, so that a different set of arguments is required for each system.
#### Linux
On Linux the following linker error occurs (shortened):
```
error: linking with `cc` failed: exit code: 1
|
= note: "cc" […]
= note: /usr/lib/gcc/../x86_64-linux-gnu/Scrt1.o: In function `_start':
(.text+0x12): undefined reference to `__libc_csu_fini'
/usr/lib/gcc/../x86_64-linux-gnu/Scrt1.o: In function `_start':
(.text+0x19): undefined reference to `__libc_csu_init'
/usr/lib/gcc/../x86_64-linux-gnu/Scrt1.o: In function `_start':
(.text+0x25): undefined reference to `__libc_start_main'
collect2: error: ld returned 1 exit status
```
The problem is that the linker includes the startup routine of the C runtime by default, which is also called `_start`. It requires some symbols of the C standard library `libc` that we don't include due to the `no_std` attribute, therefore the linker can't resolve these references. To solve this, we can tell the linker that it should not link the C startup routine by passing the `-nostartfiles` flag.
One way to pass linker attributes via cargo is the `cargo rustc` command. The command behaves exactly like `cargo build`, but allows to pass options to `rustc`, the underlying Rust compiler. `rustc` has the `-C link-arg` flag, which passes an argument to the linker. Combined, our new build command looks like this:
```
cargo rustc -- -C link-arg=-nostartfiles
```
Now our crate builds as a freestanding executable on Linux!
We didn't need to specify the name of our entry point function explicitly since the linker looks for a function with the name `_start` by default.
#### Windows
On Windows, a different linker error occurs (shortened):
```
error: linking with `link.exe` failed: exit code: 1561
|
= note: "C:\\Program Files (x86)\\…\\link.exe" […]
= note: LINK : fatal error LNK1561: entry point must be defined
```
The "entry point must be defined" error means that the linker can't find the entry point. On Windows, the default entry point name [depends on the used subsystem][windows-subsystems]. For the `CONSOLE` subsystem the linker looks for a function named `mainCRTStartup` and for the `WINDOWS` subsystem it looks for a function named `WinMainCRTStartup`. To override the default and tell the linker to look for our `_start` function instead, we can pass an `/ENTRY` argument to the linker:
[windows-subsystems]: https://docs.microsoft.com/en-us/cpp/build/reference/entry-entry-point-symbol
```
cargo rustc -- -C link-arg=/ENTRY:_start
```
From the different argument format we clearly see that the Windows linker is a completely different program than the Linux linker.
Now a different linker error occurs:
```
error: linking with `link.exe` failed: exit code: 1221
|
= note: "C:\\Program Files (x86)\\…\\link.exe" […]
= note: LINK : fatal error LNK1221: a subsystem can't be inferred and must be
defined
```
This error occurs because Windows executables can use different [subsystems][windows-subsystems]. For normal programs they are inferred depending on the entry point name: If the entry point is named `main`, the `CONSOLE` subsystem is used, and if the entry point is named `WinMain`, the `WINDOWS` subsystem is used. Since our `_start` function has a different name, we need to specify the subsystem explicitly:
```
cargo rustc -- -C link-args="/ENTRY:_start /SUBSYSTEM:console"
```
We use the `CONSOLE` subsystem here, but the `WINDOWS` subsystem would work too. Instead of passing `-C link-arg` multiple times, we use `-C link-args` which takes a space separated list of arguments.
With this command, our executable should build successfully on Windows.
#### macOS
On macOS, the following linker error occurs (shortened):
```
error: linking with `cc` failed: exit code: 1
|
= note: "cc" […]
= note: ld: entry point (_main) undefined. for architecture x86_64
clang: error: linker command failed with exit code 1 […]
```
This error message tells us that the linker can't find an entry point function with the default name `main` (for some reason all functions are prefixed with a `_` on macOS). To set the entry point to our `_start` function, we pass the `-e` linker argument:
```
cargo rustc -- -C link-args="-e __start"
```
The `-e` flag specifies the name of the entry point function. Since all functions have an additional `_` prefix on macOS, we need to set the entry point to `__start` instead of `_start`.
Now the following linker error occurs:
```
error: linking with `cc` failed: exit code: 1
|
= note: "cc" […]
= note: ld: dynamic main executables must link with libSystem.dylib
for architecture x86_64
clang: error: linker command failed with exit code 1 […]
```
macOS [does not officially support statically linked binaries] and requires programs to link the `libSystem` library by default. To override this and link a static binary, we pass the `-static` flag to the linker:
[does not officially support statically linked binaries]: https://developer.apple.com/library/content/qa/qa1118/_index.html
```
cargo rustc -- -C link-args="-e __start -static"
```
This still not suffices, as a third linker error occurs:
```
error: linking with `cc` failed: exit code: 1
|
= note: "cc" […]
= note: ld: library not found for -lcrt0.o
clang: error: linker command failed with exit code 1 […]
```
This error occurs because programs on macOS link to `crt0` (“C runtime zero”) by default. This is similar to the error we had on Linux and can be also solved by adding the `-nostartfiles` linker argument:
```
cargo rustc -- -C link-args="-e __start -static -nostartfiles"
```
Now our program should build successfully on macOS.
#### Unifying the Build Commands
Right now we have different build commands depending on the host platform, which is not ideal. To avoid this, we can create a file named `.cargo/config` that contains the platform specific arguments:
```toml
# in .cargo/config
[target.'cfg(target_os = "linux")']
rustflags = ["-C", "link-arg=-nostartfiles"]
[target.'cfg(target_os = "windows")']
rustflags = ["-C", "link-args=/ENTRY:_start /SUBSYSTEM:console"]
[target.'cfg(target_os = "macos")']
rustflags = ["-C", "link-args=-e __start -static -nostartfiles"]
```
The `rustflags` key contains arguments that are automatically added to every invocation of `rustc`. For more information on the `.cargo/config` file check out the [official documentation](https://doc.rust-lang.org/cargo/reference/config.html).
Now our program should be buildable on all three platforms with a simple `cargo build`.
#### Should You Do This?
While it's possible to build a freestanding executable for Linux, Windows, and macOS, it's probably not a good idea. The reason is that our executable still expects various things, for example that a stack is initialized when the `_start` function is called. Without the C runtime, some of these requirements might not be fulfilled, which might cause our program to fail, e.g. through a segmentation fault.
If you want to create a minimal binary that runs on top of an existing operating system, including `libc` and setting the `#[start]` attribute as described [here](https://doc.rust-lang.org/1.16.0/book/no-stdlib.html) is probably a better idea.
</details>
## Summary
A minimal freestanding Rust binary looks like this:
`src/main.rs`:
`src/main.rs`
```rust
#![no_std] // don't link the Rust standard library
#![no_main] // disable all Rust-level entry points
#![no_std] // 不链接 Rust 标准库
#![no_main] // 禁用所有 Rust 层级的入口点
use core::panic::PanicInfo;
#[no_mangle] // don't mangle the name of this function
#[no_mangle] // 不重整函数名
pub extern "C" fn _start() -> ! {
// this function is the entry point, since the linker looks for a function
// named `_start` by default
// 因为编译器会寻找一个名为 `_start` 的函数,所以这个函数就是入口点
// 默认命名为 `_start`
loop {}
}
/// This function is called on panic.
/// 这个函数将在 panic 时被调用
#[panic_handler]
fn panic(_info: &PanicInfo) -> ! {
loop {}
}
```
`Cargo.toml`:
`Cargo.toml`
```toml
[package]
@@ -480,36 +268,25 @@ name = "crate_name"
version = "0.1.0"
authors = ["Author Name <author@example.com>"]
# the profile used for `cargo build`
# 使用 `cargo build` 编译时需要的配置
[profile.dev]
panic = "abort" # disable stack unwinding on panic
panic = "abort" # 禁用panic时栈展开
# the profile used for `cargo build --release`
# 使用 `cargo build --release` 编译时需要的配置
[profile.release]
panic = "abort" # disable stack unwinding on panic
panic = "abort" # 禁用 panic 时栈展开
```
To build this binary, we need to compile for a bare metal target such as `thumbv7em-none-eabihf`:
选用任意一个裸机目标来编译。比如对 `thumbv7em-none-eabihf`,我们使用以下命令:
```
```bash
cargo build --target thumbv7em-none-eabihf
```
Alternatively, we can compile it for the host system by passing additional linker arguments:
要注意的是,现在我们的代码只是一个 Rust 编写的独立式可执行程序的一个例子。运行这个二进制程序还需要很多准备,比如在 `_start` 函数之前需要一个已经预加载完毕的栈。所以为了真正运行这样的程序,我们还有很多事情需要做。
```bash
# Linux
cargo rustc -- -C link-arg=-nostartfiles
# Windows
cargo rustc -- -C link-args="/ENTRY:_start /SUBSYSTEM:console"
# macOS
cargo rustc -- -C link-args="-e __start -static -nostartfiles"
```
## 下篇预览
Note that this is just a minimal example of a freestanding Rust binary. This binary expects various things, for example that a stack is initialized when the `_start` function is called. **So it probably for any real use of such a binary, more steps are required**.
## What's next?
The [next post] explains the steps needed for turning our freestanding binary into a minimal operating system kernel. This includes creating a custom target, combining our executable with a bootloader, and learning how to print something to the screen.
下一篇文章要做的事情基于我们这篇文章的成果,它将详细讲述编写一个最小的操作系统内核需要的步骤:如何配置特定的编译目标,如何将可执行程序与引导程序拼接,以及如何把一些特定的字符串打印到屏幕上。
[next post]: @/second-edition/posts/02-minimal-rust-kernel/index.md