Rename second-edition subfolder to `edition-2`
blog/content/edition-2/posts/deprecated/04-unit-testing/index.md
@@ -0,0 +1,390 @@
|
||||
+++
|
||||
title = "Unit Testing"
|
||||
weight = 4
|
||||
path = "unit-testing"
|
||||
date = 2018-04-29
|
||||
|
||||
[extra]
|
||||
warning_short = "Deprecated: "
|
||||
warning = "This post is deprecated in favor of the [_Testing_](/testing) post and will no longer receive updates."
|
||||
+++
|
||||
|
||||
This post explores unit testing in `no_std` executables using Rust's built-in test framework. We will adjust our code so that `cargo test` works and add some basic unit tests to our VGA buffer module.
|
||||
|
||||
<!-- more -->
|
||||
|
||||
This blog is openly developed on [GitHub]. If you have any problems or questions, please open an issue there. You can also leave comments [at the bottom]. The complete source code for this post can be found in the [`post-04`][post branch] branch.
|
||||
|
||||
[GitHub]: https://github.com/phil-opp/blog_os
|
||||
[at the bottom]: #comments
|
||||
[post branch]: https://github.com/phil-opp/blog_os/tree/post-04
|
||||
|
||||
<!-- toc -->
|
||||
|
||||
## Requirements
|
||||
|
||||
In this post we explore how to execute `cargo test` on the host system (as a normal Linux/Windows/macOS executable). This only works if you don't have a `.cargo/config` file that sets a default target. If you followed the [_Minimal Rust Kernel_] post before 2019-04-27, you should be fine. If you followed it after that date, you need to remove the `build.target` key from your `.cargo/config` file and explicitly pass a target argument to `cargo xbuild`.
|
||||
|
||||
[_Minimal Rust Kernel_]: @/edition-2/posts/02-minimal-rust-kernel/index.md
|
||||
|
||||
Alternatively, consider reading the new [_Testing_] post instead. It sets up functionality similar to this post, but instead of running the tests on your host system, it runs them in a realistic environment inside QEMU.
|
||||
|
||||
[_Testing_]: @/edition-2/posts/04-testing/index.md
|
||||
|
||||
## Unit Tests for `no_std` Binaries
|
||||
Rust has a [built-in test framework] that is capable of running unit tests without the need to set anything up. Just create a function that checks some results through assertions and add the `#[test]` attribute to the function header. Then `cargo test` will automatically find and execute all test functions of your crate.
|
||||
|
||||
[built-in test framework]: https://doc.rust-lang.org/book/second-edition/ch11-00-testing.html
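As a minimal sketch of what such a test function looks like on the host (the `next_column` helper is hypothetical and not part of the kernel):

```rust
/// Hypothetical helper: advances a column index and wraps around at `width`.
pub fn next_column(column: usize, width: usize) -> usize {
    (column + 1) % width
}

// `cargo test` discovers this function through the `#[test]` attribute.
// The test counts as passed if it returns without panicking.
#[test]
fn next_column_wraps_at_line_end() {
    assert_eq!(next_column(0, 80), 1);
    assert_eq!(next_column(79, 80), 0);
}
```

In a normal build the `#[test]` function is left out entirely, so it adds nothing to the final binary.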
|
||||
|
||||
Unfortunately it's a bit more complicated for `no_std` applications such as our kernel. If we run `cargo test` (without adding any test yet), we get the following error:
|
||||
|
||||
```
> cargo test
   Compiling blog_os v0.2.0 (file:///…/blog_os)
error[E0152]: duplicate lang item found: `panic_impl`.
  --> src/main.rs:35:1
   |
35 | / fn panic(info: &PanicInfo) -> ! {
36 | |     println!("{}", info);
37 | |     loop {}
38 | | }
   | |_^
   |
   = note: first defined in crate `std`.
```
|
||||
|
||||
The problem is that unit tests are built for the host machine, with the `std` library included. This makes sense because they should be able to run as a normal application on the host operating system. Since the standard library has its own `panic_handler` function, we get the above error. To fix it, we use [conditional compilation] to include our implementation of the panic handler only in non-test environments:
|
||||
|
||||
[conditional compilation]: https://doc.rust-lang.org/reference/conditional-compilation.html
|
||||
|
||||
|
||||
```rust
// in src/main.rs

use core::panic::PanicInfo;

#[cfg(not(test))] // only compile when the test flag is not set
#[panic_handler]
fn panic(info: &PanicInfo) -> ! {
    println!("{}", info);
    loop {}
}
```
|
||||
|
||||
The only change is the added `#[cfg(not(test))]` attribute. The `#[cfg(…)]` attribute ensures that the annotated item is only included if the passed condition is met. The `test` configuration is set when the crate is compiled for unit tests. Through `not(…)` we negate the condition so that the language item is only compiled for non-test builds.
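To make the effect of `cfg(test)` and `cfg(not(test))` more concrete, here is a small host-side sketch. The function name `build_kind` is made up for illustration. Exactly one of the two definitions is compiled, depending on whether the crate is built for tests:

```rust
/// Compiled only for normal builds (for example `cargo build`).
#[cfg(not(test))]
pub fn build_kind() -> &'static str {
    "normal build"
}

/// Compiled only when the crate is built for unit tests (`cargo test`).
#[cfg(test)]
pub fn build_kind() -> &'static str {
    "test build"
}
```

Because the two conditions are mutually exclusive, there is never a duplicate definition, which is the same reason the conditional panic handler no longer clashes with the one from `std`.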
|
||||
|
||||
When we now try `cargo test` again, we get an ugly linker error:
|
||||
|
||||
```
error: linking with `cc` failed: exit code: 1
  |
  = note: "cc" "-Wl,--as-needed" "-Wl,-z,noexecstack" "-m64" "-L" "/…/lib/rustlib/x86_64-unknown-linux-gnu/lib" […]
  = note: /…/blog_os-969bdb90d27730ed.2q644ojj2xqxddld.rcgu.o: In function `_start':
          /…/blog_os/src/main.rs:17: multiple definition of `_start'
          /usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/Scrt1.o:(.text+0x0): first defined here
          /usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/Scrt1.o: In function `_start':
          (.text+0x20): undefined reference to `main'
          collect2: error: ld returned 1 exit status
```
|
||||
|
||||
I shortened the output here because it is extremely verbose. The relevant part is at the bottom, after the second “note:”. We got two distinct errors here, “_multiple definition of `_start`_” and “_undefined reference to `main`_”.
|
||||
|
||||
The reason for the first error is that the test framework injects its own `main` and `_start` functions, which will run the tests when invoked. So we get two functions named `_start` when compiling in test mode, one from the test framework and the one we defined ourselves. To fix this, we need to exclude our `_start` function in that case, which we can do by marking it as `#[cfg(not(test))]`:
|
||||
|
||||
```rust
|
||||
// in src/main.rs
|
||||
|
||||
#[cfg(not(test))]
|
||||
#[no_mangle]
|
||||
pub extern "C" fn _start() -> ! { … }
|
||||
```
|
||||
|
||||
The second problem is that we use the `#![no_main]` attribute for our crate, which suppresses any `main` generation, including the test `main`. To solve this, we use the [`cfg_attr`] attribute to conditionally enable the `no_main` attribute only in non-test mode:
|
||||
|
||||
[`cfg_attr`]: https://chrismorgan.info/blog/rust-cfg_attr.html
|
||||
|
||||
```rust
|
||||
// in src/main.rs
|
||||
|
||||
#![cfg_attr(not(test), no_main)] // instead of `#![no_main]`
|
||||
```
|
||||
|
||||
Now `cargo test` works:
|
||||
|
||||
```
|
||||
> cargo test
|
||||
Compiling blog_os v0.2.0 (file:///…/blog_os)
|
||||
[some warnings]
|
||||
Finished dev [unoptimized + debuginfo] target(s) in 0.98 secs
|
||||
Running target/debug/deps/blog_os-1f08396a9eff0aa7
|
||||
|
||||
running 0 tests
|
||||
|
||||
test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
|
||||
```
|
||||
|
||||
The test framework seems to work as intended. We don't have any tests yet, but we already get a test result summary.
|
||||
|
||||
### Silencing the Warnings
|
||||
We get a few warnings about unused imports, because we no longer compile our `_start` function. To silence such unused code warnings, we can add the following to the top of our `main.rs`:
|
||||
|
||||
```rust
|
||||
#![cfg_attr(test, allow(unused_imports))]
|
||||
```
|
||||
|
||||
Like before, the `cfg_attr` attribute sets the passed attribute if the passed condition holds. Here, we set the `allow(…)` attribute when compiling in test mode. We use the `allow` attribute to disable warnings for the `unused_imports` _lint_.
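`cfg_attr` works with any attribute, not only `no_main` or `allow`. The following sketch applies a `derive` only in test mode. The `Cell` type and the `blank_cell` function are hypothetical and only serve as an illustration:

```rust
// In test mode this line expands to `#[derive(Debug, PartialEq)]`.
// In all other builds the attribute disappears completely.
#[cfg_attr(test, derive(Debug, PartialEq))]
#[derive(Clone, Copy)]
pub struct Cell {
    pub ascii_character: u8,
}

/// Returns a cell that holds a space character.
pub fn blank_cell() -> Cell {
    Cell { ascii_character: b' ' }
}
```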
|
||||
|
||||
Lints are classes of warnings, for example `dead_code` for unused code or `missing_docs` for missing documentation. Lints can be set to four different states:
|
||||
|
||||
- `allow`: no errors, no warnings
|
||||
- `warn`: causes a warning
|
||||
- `deny`: causes a compilation error
|
||||
- `forbid`: like `deny`, but can't be overridden
|
||||
|
||||
Some lints are `allow` by default (such as `missing_docs`), others are `warn` by default (such as `dead_code`), and a few are even `deny` by default. The default can be overridden by the `allow`, `warn`, `deny` and `forbid` attributes. For a list of all lints, see `rustc -W help`. There is also the [clippy] project, which provides many additional lints.
|
||||
|
||||
[clippy]: https://github.com/rust-lang-nursery/rust-clippy
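Lint levels can also be set per item instead of for the whole crate. A brief sketch with made-up function names:

```rust
// `allow` silences the `dead_code` lint for this function only.
#[allow(dead_code)]
fn never_called() {}

// Without the `allow`, the unused variable below would produce a warning.
#[allow(unused_variables)]
pub fn with_allowed_lint() -> u8 {
    let unused = 1;
    2
}

// `deny` turns the lint into a compilation error for this function.
// The code compiles because `value` is actually used.
#[deny(unused_variables)]
pub fn with_denied_lint() -> u8 {
    let value = 3;
    value
}
```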
|
||||
|
||||
### Including the Standard Library
|
||||
Unit tests run on the host machine, so it's possible to use the complete standard library inside them. To link the standard library in test mode, we can make the `#![no_std]` attribute conditional through `cfg_attr` too:
|
||||
|
||||
```diff
|
||||
-#![no_std]
|
||||
+#![cfg_attr(not(test), no_std)]
|
||||
```
|
||||
|
||||
## Testing the VGA Module
|
||||
Now that we have set up the test framework, we can add a first unit test for our `vga_buffer` module:
|
||||
|
||||
```rust
// in src/vga_buffer.rs

#[cfg(test)]
mod test {
    use super::*;

    #[test]
    fn foo() {}
}
```
|
||||
|
||||
We add the test in an inline `test` submodule. This isn't necessary, but it is a common way to separate test code from the rest of the module. By adding the `#[cfg(test)]` attribute, we ensure that the module is only compiled in test mode. Through `use super::*`, we import all items of the parent module (the `vga_buffer` module), so that we can test them easily.
|
||||
|
||||
The `#[test]` attribute on the `foo` function tells the test framework that the function is a unit test. The framework will find it automatically, even if it's private and inside a private module as in our case:
|
||||
|
||||
```
|
||||
> cargo test
|
||||
Compiling blog_os v0.2.0 (file:///…/blog_os)
|
||||
Finished dev [unoptimized + debuginfo] target(s) in 2.99 secs
|
||||
Running target/debug/deps/blog_os-1f08396a9eff0aa7
|
||||
|
||||
running 1 test
|
||||
test vga_buffer::test::foo ... ok
|
||||
|
||||
test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
|
||||
```
|
||||
|
||||
We see that the test was found and executed. It didn't panic, so it counts as passed.
|
||||
|
||||
### Constructing a Writer
|
||||
In order to test the VGA methods, we first need to construct a `Writer` instance. Since we will need such an instance for other tests too, we create a separate function for it:
|
||||
|
||||
```rust
// in src/vga_buffer.rs

#[cfg(test)]
mod test {
    use super::*;

    fn construct_writer() -> Writer {
        use std::boxed::Box;

        let buffer = construct_buffer();
        Writer {
            column_position: 0,
            color_code: ColorCode::new(Color::Blue, Color::Magenta),
            buffer: Box::leak(Box::new(buffer)),
        }
    }

    fn construct_buffer() -> Buffer { … }
}
```
|
||||
|
||||
We set the initial column position to 0 and choose some arbitrary colors for foreground and background color. The difficult part is the buffer construction, which is described in detail below. We then use [`Box::new`] and [`Box::leak`] to transform the created `Buffer` into a `&'static mut Buffer`, because the `buffer` field needs to be of that type.
|
||||
|
||||
[`Box::new`]: https://doc.rust-lang.org/nightly/std/boxed/struct.Box.html#method.new
|
||||
[`Box::leak`]: https://doc.rust-lang.org/nightly/std/boxed/struct.Box.html#method.leak
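The `Box::new` plus `Box::leak` combination can be tried in isolation. In this sketch, `FakeBuffer` is a hypothetical stand-in for the kernel's `Buffer` type:

```rust
/// Hypothetical stand-in for the kernel's `Buffer` type.
pub struct FakeBuffer {
    pub bytes: [u8; 4],
}

/// Moves the buffer to the heap and leaks the allocation. The memory is
/// never freed, so the returned reference can have the `'static` lifetime.
pub fn leak_buffer(buffer: FakeBuffer) -> &'static mut FakeBuffer {
    Box::leak(Box::new(buffer))
}
```

Leaking memory is usually undesirable, but for a short-lived test process it is a simple way to obtain a `&'static mut` reference without `unsafe`.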
|
||||
|
||||
#### Buffer Construction
|
||||
So how do we create a `Buffer` instance? Unfortunately, the naive approach does not work:
|
||||
|
||||
```rust
fn construct_buffer() -> Buffer {
    Buffer {
        chars: [[Volatile::new(empty_char()); BUFFER_WIDTH]; BUFFER_HEIGHT],
    }
}

fn empty_char() -> ScreenChar {
    ScreenChar {
        ascii_character: b' ',
        color_code: ColorCode::new(Color::Green, Color::Brown),
    }
}
```
|
||||
|
||||
When running `cargo test` the following error occurs:
|
||||
|
||||
```
error[E0277]: the trait bound `volatile::Volatile<vga_buffer::ScreenChar>: core::marker::Copy` is not satisfied
   --> src/vga_buffer.rs:186:21
    |
186 |             chars: [[Volatile::new(empty_char); BUFFER_WIDTH]; BUFFER_HEIGHT],
    |                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ the trait `core::marker::Copy` is not implemented for `volatile::Volatile<vga_buffer::ScreenChar>`
    |
    = note: the `Copy` trait is required because the repeated element will be copied
```
|
||||
|
||||
The problem is that Rust's array repeat syntax (`[value; N]`) requires that the repeated element is [`Copy`]. The `ScreenChar` is `Copy`, but the `Volatile` wrapper is not. There is currently no easy way to circumvent this without using [`unsafe`], but fortunately there is the [`array_init`] crate that provides a safe interface for such operations.
|
||||
|
||||
[`Copy`]: https://doc.rust-lang.org/core/marker/trait.Copy.html
|
||||
[`unsafe`]: https://doc.rust-lang.org/book/second-edition/ch19-01-unsafe-rust.html
|
||||
[`array_init`]: https://docs.rs/array-init
|
||||
|
||||
To use that crate, we add the following to our `Cargo.toml`:
|
||||
|
||||
```toml
|
||||
[dev-dependencies]
|
||||
array-init = "0.0.3"
|
||||
```
|
||||
|
||||
Note that we're using the [`dev-dependencies`] table instead of the `dependencies` table, because we only need the crate for `cargo test` and not for a normal build.
|
||||
|
||||
[`dev-dependencies`]: https://doc.rust-lang.org/cargo/reference/specifying-dependencies.html#development-dependencies
|
||||
|
||||
Now we can fix our `construct_buffer` function:
|
||||
|
||||
```rust
fn construct_buffer() -> Buffer {
    use array_init::array_init;

    Buffer {
        chars: array_init(|_| array_init(|_| Volatile::new(empty_char()))),
    }
}
```
|
||||
|
||||
See the [documentation of `array_init`][`array_init`] for more information about using that crate.
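On newer toolchains, the standard library offers a similar building block. The following sketch uses `std::array::from_fn` (stable since Rust 1.63) instead of the `array_init` crate used in this post. The names `NotCopy`, `WIDTH`, `HEIGHT`, and `construct_grid` are hypothetical and chosen for illustration only:

```rust
use std::array;

/// A wrapper that deliberately does not implement `Copy`, like `Volatile`.
pub struct NotCopy(pub u8);

const WIDTH: usize = 4;
const HEIGHT: usize = 2;

/// Builds the nested array by calling the closure once per element,
/// so the element type does not need to be `Copy`.
pub fn construct_grid() -> [[NotCopy; WIDTH]; HEIGHT] {
    array::from_fn(|_row| array::from_fn(|_col| NotCopy(b' ')))
}
```

The structure mirrors the nested `array_init` calls above: the outer closure creates one row, the inner closure creates one element.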
|
||||
|
||||
### Testing `write_byte`
|
||||
Now we're finally able to write a first unit test that tests the `write_byte` method:
|
||||
|
||||
```rust
// in src/vga_buffer.rs

mod test {
    […]

    #[test]
    fn write_byte() {
        let mut writer = construct_writer();
        writer.write_byte(b'X');
        writer.write_byte(b'Y');

        for (i, row) in writer.buffer.chars.iter().enumerate() {
            for (j, screen_char) in row.iter().enumerate() {
                let screen_char = screen_char.read();
                if i == BUFFER_HEIGHT - 1 && j == 0 {
                    assert_eq!(screen_char.ascii_character, b'X');
                    assert_eq!(screen_char.color_code, writer.color_code);
                } else if i == BUFFER_HEIGHT - 1 && j == 1 {
                    assert_eq!(screen_char.ascii_character, b'Y');
                    assert_eq!(screen_char.color_code, writer.color_code);
                } else {
                    assert_eq!(screen_char, empty_char());
                }
            }
        }
    }
}
```
|
||||
|
||||
We construct a `Writer`, write two bytes to it, and then check that the right screen characters were updated. When we run `cargo test`, we see that the test is executed and passes:
|
||||
|
||||
```
|
||||
running 1 test
|
||||
test vga_buffer::test::write_byte ... ok
|
||||
|
||||
test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
|
||||
```
|
||||
|
||||
Try to play around a bit with this function and verify that the test fails if you change something, e.g. if you print a third byte without adjusting the `for` loop.
|
||||
|
||||
(If you're getting a “binary operation `==` cannot be applied to type `vga_buffer::ScreenChar`” error, you also need to derive [`PartialEq`] for `ScreenChar` and `ColorCode`.)
|
||||
|
||||
[`PartialEq`]: https://doc.rust-lang.org/nightly/core/cmp/trait.PartialEq.html
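For reference, a sketch of such derives on simplified stand-ins for the two types follows. The field layout here is an assumption for illustration, and the real definitions in `vga_buffer.rs` may carry additional attributes. Note that `assert_eq!` needs both `PartialEq` for the comparison and `Debug` for printing the values on failure:

```rust
/// Simplified stand-in for the kernel's `ColorCode`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ColorCode(pub u8);

/// Simplified stand-in for the kernel's `ScreenChar`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ScreenChar {
    pub ascii_character: u8,
    pub color_code: ColorCode,
}
```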
|
||||
|
||||
### Testing Strings
|
||||
Let's add a second unit test to test formatted output and newline behavior:
|
||||
|
||||
```rust
// in src/vga_buffer.rs

mod test {
    […]

    #[test]
    fn write_formatted() {
        use core::fmt::Write;

        let mut writer = construct_writer();
        writeln!(&mut writer, "a").unwrap();
        writeln!(&mut writer, "b{}", "c").unwrap();

        for (i, row) in writer.buffer.chars.iter().enumerate() {
            for (j, screen_char) in row.iter().enumerate() {
                let screen_char = screen_char.read();
                if i == BUFFER_HEIGHT - 3 && j == 0 {
                    assert_eq!(screen_char.ascii_character, b'a');
                    assert_eq!(screen_char.color_code, writer.color_code);
                } else if i == BUFFER_HEIGHT - 2 && j == 0 {
                    assert_eq!(screen_char.ascii_character, b'b');
                    assert_eq!(screen_char.color_code, writer.color_code);
                } else if i == BUFFER_HEIGHT - 2 && j == 1 {
                    assert_eq!(screen_char.ascii_character, b'c');
                    assert_eq!(screen_char.color_code, writer.color_code);
                } else if i >= BUFFER_HEIGHT - 2 {
                    assert_eq!(screen_char.ascii_character, b' ');
                    assert_eq!(screen_char.color_code, writer.color_code);
                } else {
                    assert_eq!(screen_char, empty_char());
                }
            }
        }
    }
}
```
|
||||
|
||||
In this test we're using the [`writeln!`] macro to print strings with newlines to the buffer. Most of the for loop is similar to the `write_byte` test and only verifies if the written characters are at the expected place. The new `if i >= BUFFER_HEIGHT - 2` case verifies that the empty lines that are shifted in on a newline have the `writer.color_code`, which is different from the initial color.
|
||||
|
||||
[`writeln!`]: https://doc.rust-lang.org/nightly/core/macro.writeln.html
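The `writeln!` macro works with any type that implements `fmt::Write`, which only requires a `write_str` method. The following host-side sketch uses a hypothetical `LineWriter` that collects lines in memory instead of writing to a VGA buffer:

```rust
use std::fmt;

/// Hypothetical in-memory writer that records completed lines.
#[derive(Default)]
pub struct LineWriter {
    pub lines: Vec<String>,
    pub current: String,
}

impl fmt::Write for LineWriter {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        for c in s.chars() {
            if c == '\n' {
                // A newline finishes the current line and starts a new one.
                let line = std::mem::take(&mut self.current);
                self.lines.push(line);
            } else {
                self.current.push(c);
            }
        }
        Ok(())
    }
}

/// Writes the same two lines as the `write_formatted` test.
pub fn write_two_lines() -> LineWriter {
    use std::fmt::Write;

    let mut writer = LineWriter::default();
    writeln!(&mut writer, "a").unwrap();
    writeln!(&mut writer, "b{}", "c").unwrap();
    writer
}
```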
|
||||
|
||||
### More Tests
|
||||
We only present two basic tests here as an example, but of course many more tests are possible. Examples include a test that changes the writer color in between writes, a test that checks that the top line is correctly shifted off the screen on a newline, or a test that checks that non-ASCII characters are handled correctly.
|
||||
|
||||
## Summary
|
||||
Unit testing is a very useful technique to ensure that certain components have a desired behavior. Even if unit tests cannot show the absence of bugs, they're still a useful tool for finding them and especially for avoiding regressions.
|
||||
|
||||
This post explained how to set up unit testing in a Rust kernel. We now have a functioning test framework and can easily add tests by adding functions with a `#[test]` attribute. To run them, a short `cargo test` suffices. We also added a few basic tests for our VGA buffer as an example of what unit tests could look like.
|
||||
|
||||
We also learned a bit about conditional compilation, Rust's [lint system], how to [initialize arrays with non-Copy types], and the `dev-dependencies` section of the `Cargo.toml`.
|
||||
|
||||
[lint system]: #silencing-the-warnings
|
||||
[initialize arrays with non-Copy types]: #buffer-construction
|
||||
|
||||
## What's next?
|
||||
We now have a working unit testing framework, which gives us the ability to test individual components. However, unit tests have the disadvantage that they run on the host machine and are thus unable to test how components interact with platform-specific parts. For example, we can't test the `println!` macro with a unit test because it wants to write to the VGA text buffer at address `0xb8000`, which only exists in the bare metal environment.
|
||||
|
||||
The next post will close this gap by creating a basic _integration test_ framework, which runs the tests in QEMU and thus has access to platform-specific components. This will allow us to test the full system, for example that our kernel boots correctly or that no deadlock occurs on nested `println!` invocations.
|
||||
@@ -0,0 +1,570 @@
|
||||
+++
|
||||
title = "Integration Tests"
|
||||
weight = 5
|
||||
path = "integration-tests"
|
||||
date = 2018-06-15
|
||||
|
||||
[extra]
|
||||
warning_short = "Deprecated: "
|
||||
warning = "This post is deprecated in favor of the [_Testing_](/testing) post and will no longer receive updates."
|
||||
+++
|
||||
|
||||
To complete the testing picture we implement a basic integration test framework, which allows us to run tests on the target system. The idea is to run tests inside QEMU and report the results back to the host through the serial port.
|
||||
|
||||
<!-- more -->
|
||||
|
||||
This blog is openly developed on [GitHub]. If you have any problems or questions, please open an issue there. You can also leave comments [at the bottom]. The complete source code for this post can be found in the [`post-05`][post branch] branch.
|
||||
|
||||
[GitHub]: https://github.com/phil-opp/blog_os
|
||||
[at the bottom]: #comments
|
||||
[post branch]: https://github.com/phil-opp/blog_os/tree/post-05
|
||||
|
||||
<!-- toc -->
|
||||
|
||||
## Requirements
|
||||
|
||||
This post builds upon the [_Unit Testing_] post, so you need to follow it first. Alternatively, consider reading the new [_Testing_] post instead, which replaces both _Unit Testing_ and this post. The new post implements similar functionality, but integrates it directly into `cargo xtest`, so that both unit and integration tests run in a realistic environment inside QEMU.
|
||||
|
||||
[_Unit Testing_]: @/edition-2/posts/deprecated/04-unit-testing/index.md
|
||||
[_Testing_]: @/edition-2/posts/04-testing/index.md
|
||||
|
||||
## Overview
|
||||
|
||||
In the previous post we added support for unit tests. The goal of unit tests is to test small components in isolation to ensure that each of them works as intended. The tests are run on the host machine and thus shouldn't rely on architecture-specific functionality.
|
||||
|
||||
To test the interaction of the components, both with each other and with the system environment, we can write _integration tests_. Compared to unit tests, integration tests are more complex because they need to run in a realistic environment. What this means depends on the application type. For example, for web server applications it often means setting up a database instance. For an operating system kernel like ours, it means that we run the tests on the target hardware without an underlying operating system.
|
||||
|
||||
Running on the target architecture allows us to test all hardware-specific code such as the VGA buffer or the effects of [page table] modifications. It also allows us to verify that our kernel boots without problems and that no [CPU exception] occurs.
|
||||
|
||||
[page table]: https://en.wikipedia.org/wiki/Page_table
|
||||
[CPU exception]: https://wiki.osdev.org/Exceptions
|
||||
|
||||
In this post we will implement a very basic test framework that runs integration tests inside instances of the [QEMU] virtual machine. It is not as realistic as running them on real hardware, but it is much simpler and should be sufficient as long as we only use standard hardware that is well supported in QEMU.
|
||||
|
||||
[QEMU]: https://www.qemu.org/
|
||||
|
||||
## The Serial Port
|
||||
|
||||
The naive way of doing an integration test would be to add some assertions in the code, launch QEMU, and manually check if a panic occurred or not. This is very cumbersome and not practical if we have hundreds of integration tests. So we want an automated solution that runs all tests and fails if not all of them pass.
|
||||
|
||||
Such an automated test framework needs to know whether a test succeeded or failed. It can't look at the screen output of QEMU, so we need a different way of retrieving the test results on the host system. A simple way to achieve this is by using the [serial port], an old interface standard which is no longer found in modern computers. It is easy to program and QEMU can redirect the bytes sent over serial to the host's standard output or a file.
|
||||
|
||||
[serial port]: https://en.wikipedia.org/wiki/Serial_port
|
||||
|
||||
The chips implementing a serial interface are called [UARTs]. There are [lots of UART models] on x86, but fortunately the only differences between them are some advanced features we don't need. The common UARTs today are all compatible with the [16550 UART], so we will use that model for our testing framework.
|
||||
|
||||
[UARTs]: https://en.wikipedia.org/wiki/Universal_asynchronous_receiver-transmitter
|
||||
[lots of UART models]: https://en.wikipedia.org/wiki/Universal_asynchronous_receiver-transmitter#UART_models
|
||||
[16550 UART]: https://en.wikipedia.org/wiki/16550_UART
|
||||
|
||||
### Port I/O
|
||||
There are two different approaches for communicating between the CPU and peripheral hardware on x86, **memory-mapped I/O** and **port-mapped I/O**. We already used memory-mapped I/O for accessing the [VGA text buffer] through the memory address `0xb8000`. This address is not mapped to RAM, but to some memory on the GPU.
|
||||
|
||||
[VGA text buffer]: @/edition-2/posts/03-vga-text-buffer/index.md
|
||||
|
||||
In contrast, port-mapped I/O uses a separate I/O bus for communication. Each connected peripheral has one or more port numbers. To communicate with such an I/O port there are special CPU instructions called `in` and `out`, which take a port number and a data byte (there are also variations of these instructions that allow sending a `u16` or `u32`).
|
||||
|
||||
The UART uses port-mapped I/O. Fortunately there are already several crates that provide abstractions for I/O ports and even UARTs, so we don't need to invoke the `in` and `out` assembly instructions manually.
|
||||
|
||||
### Implementation
|
||||
|
||||
We will use the [`uart_16550`] crate to initialize the UART and send data over the serial port. To add it as a dependency, we update our `Cargo.toml` and `main.rs`:
|
||||
|
||||
[`uart_16550`]: https://docs.rs/uart_16550
|
||||
|
||||
```toml
|
||||
# in Cargo.toml
|
||||
|
||||
[dependencies]
|
||||
uart_16550 = "0.1.0"
|
||||
```
|
||||
|
||||
The `uart_16550` crate contains a `SerialPort` struct that represents the UART registers, but we still need to construct an instance of it ourselves. For that we create a new `serial` module with the following content:
|
||||
|
||||
```rust
|
||||
// in src/main.rs
|
||||
|
||||
mod serial;
|
||||
```
|
||||
|
||||
```rust
// in src/serial.rs

use uart_16550::SerialPort;
use spin::Mutex;
use lazy_static::lazy_static;

lazy_static! {
    pub static ref SERIAL1: Mutex<SerialPort> = {
        let mut serial_port = SerialPort::new(0x3F8);
        serial_port.init();
        Mutex::new(serial_port)
    };
}
```
|
||||
|
||||
Like with the [VGA text buffer][vga lazy-static], we use `lazy_static` and a spinlock to create a `static`. However, this time we use `lazy_static` to ensure that the `init` method is called before first use. We're using the port address `0x3F8`, which is the standard port number for the first serial interface.
|
||||
|
||||
[vga lazy-static]: @/edition-2/posts/03-vga-text-buffer/index.md#lazy-statics
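The "initialize before first use" pattern can be tried out on the host as well. The following sketch is only an analogy: it uses `std::sync::OnceLock` (stable since Rust 1.70) and `std::sync::Mutex` instead of `lazy_static` and `spin::Mutex`, and `FakeSerialPort` is a hypothetical stand-in for the real `SerialPort`:

```rust
use std::sync::{Mutex, OnceLock};

/// Hypothetical stand-in for `SerialPort` that records whether `init` ran.
pub struct FakeSerialPort {
    pub initialized: bool,
    pub sent: Vec<u8>,
}

impl FakeSerialPort {
    pub fn new() -> Self {
        FakeSerialPort { initialized: false, sent: Vec::new() }
    }

    pub fn init(&mut self) {
        self.initialized = true;
    }

    pub fn send(&mut self, byte: u8) {
        assert!(self.initialized, "port used before init");
        self.sent.push(byte);
    }
}

/// Returns the shared port. The closure runs on the first access only,
/// so `init` is guaranteed to be called before any byte is sent.
pub fn serial1() -> &'static Mutex<FakeSerialPort> {
    static SERIAL1: OnceLock<Mutex<FakeSerialPort>> = OnceLock::new();
    SERIAL1.get_or_init(|| {
        let mut port = FakeSerialPort::new();
        port.init();
        Mutex::new(port)
    })
}
```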
|
||||
|
||||
To make the serial port easily usable, we add `serial_print!` and `serial_println!` macros:
|
||||
|
||||
```rust
#[doc(hidden)]
pub fn _print(args: ::core::fmt::Arguments) {
    use core::fmt::Write;
    SERIAL1.lock().write_fmt(args).expect("Printing to serial failed");
}

/// Prints to the host through the serial interface.
#[macro_export]
macro_rules! serial_print {
    ($($arg:tt)*) => {
        $crate::serial::_print(format_args!($($arg)*));
    };
}

/// Prints to the host through the serial interface, appending a newline.
#[macro_export]
macro_rules! serial_println {
    () => ($crate::serial_print!("\n"));
    ($fmt:expr) => ($crate::serial_print!(concat!($fmt, "\n")));
    ($fmt:expr, $($arg:tt)*) => ($crate::serial_print!(
        concat!($fmt, "\n"), $($arg)*));
}
```
|
||||
|
||||
The `SerialPort` type already implements the [`fmt::Write`] trait, so we don't need to provide an implementation.
|
||||
|
||||
[`fmt::Write`]: https://doc.rust-lang.org/nightly/core/fmt/trait.Write.html
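To see how the macros forward their arguments, the same structure can be reproduced on the host with a `String` as the output target. This is a sketch for experimentation only: the names `_render`, `render!`, `renderln!`, and `hello_host` are made up and do not exist in the kernel:

```rust
use std::fmt::{self, Write};

/// Host-side stand-in for `_print`: formats into a `String`
/// instead of writing to the serial port.
pub fn _render(args: fmt::Arguments) -> String {
    let mut output = String::new();
    output.write_fmt(args).expect("Formatting failed");
    output
}

// Same shape as `serial_print!`: pass everything on to `format_args!`.
macro_rules! render {
    ($($arg:tt)*) => {
        _render(format_args!($($arg)*))
    };
}

// Same shape as `serial_println!`: append a newline to the format string.
macro_rules! renderln {
    () => (render!("\n"));
    ($fmt:expr) => (render!(concat!($fmt, "\n")));
    ($fmt:expr, $($arg:tt)*) => (render!(concat!($fmt, "\n"), $($arg)*));
}

/// Formats the message used later in this post.
pub fn hello_host() -> String {
    renderln!("Hello Host{}", "!")
}
```

Since `String` already implements `fmt::Write`, no manual trait implementation is needed here either.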
|
||||
|
||||
Now we can print to the serial interface in our `main.rs`:
|
||||
|
||||
```rust
// in src/main.rs

mod serial;

#[cfg(not(test))]
#[no_mangle]
pub extern "C" fn _start() -> ! {
    println!("Hello World{}", "!"); // prints to vga buffer
    serial_println!("Hello Host{}", "!");

    loop {}
}
```
|
||||
|
||||
Note that the `serial_println` macro lives directly under the root namespace because we used the `#[macro_export]` attribute, so importing it through `use crate::serial::serial_println` will not work.
|
||||
|
||||
### QEMU Arguments
|
||||
|
||||
To see the serial output in QEMU, we can use the `-serial` argument to redirect the output to stdout:
|
||||
|
||||
```
> qemu-system-x86_64 \
    -drive format=raw,file=target/x86_64-blog_os/debug/bootimage-blog_os.bin \
    -serial mon:stdio
warning: TCG doesn't support requested feature: CPUID.01H:ECX.vmx [bit 5]
Hello Host!
```
|
||||
|
||||
If you chose a different name than `blog_os`, you need to update the paths of course. Note that you can no longer exit QEMU through `Ctrl+c`. As an alternative you can use `Ctrl+a` and then `x`.
|
||||
|
||||
As an alternative to this long command, we can pass the argument to `bootimage run`, with an additional `--` to separate the build arguments (passed to cargo) from the run arguments (passed to QEMU).
|
||||
|
||||
```
|
||||
bootimage run -- -serial mon:stdio
|
||||
```
|
||||
|
||||
Instead of standard output, QEMU supports [many more target devices][QEMU -serial]. For redirecting the output to a file, the argument is:
|
||||
|
||||
[QEMU -serial]: https://qemu.weilnetz.de/doc/5.2/system/invocation.html#hxtool-9
|
||||
|
||||
```
|
||||
-serial file:output-file.txt
|
||||
```
|
||||
|
||||
## Shutting Down QEMU
|
||||
|
||||
Right now we have an endless loop at the end of our `_start` function and need to close QEMU manually. This does not work for automated tests. We could try to kill QEMU automatically from the host, for example after some special output was sent over serial, but this would be a bit hacky and difficult to get right. The cleaner solution would be to implement a way to shut down our OS. Unfortunately this is relatively complex, because it requires implementing support for either the [APM] or [ACPI] power management standard.
|
||||
|
||||
[APM]: https://wiki.osdev.org/APM
|
||||
[ACPI]: https://wiki.osdev.org/ACPI
|
||||
|
||||
Luckily, there is an escape hatch: QEMU supports a special `isa-debug-exit` device, which provides an easy way to exit QEMU from the guest system. To enable it, we add the following argument to our QEMU command:
|
||||
|
||||
```
|
||||
-device isa-debug-exit,iobase=0xf4,iosize=0x04
|
||||
```
|
||||
|
||||
The `iobase` specifies on which port address the device should live (`0xf4` is a [generally unused][list of x86 I/O ports] port on the x86's IO bus) and the `iosize` specifies the port size (`0x04` means four bytes). Now the guest can write a value to the `0xf4` port and QEMU will exit with [exit status] `(passed_value << 1) | 1`.
|
||||
|
||||
[list of x86 I/O ports]: https://wiki.osdev.org/I/O_Ports#The_list
|
||||
[exit status]: https://en.wikipedia.org/wiki/Exit_status
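The exit status formula can be checked with a small helper. The function name `qemu_exit_status` is hypothetical, and the function only restates the formula above:

```rust
/// Exit status that QEMU reports after the guest writes `value`
/// to the `isa-debug-exit` port: `(value << 1) | 1`.
pub fn qemu_exit_status(value: u32) -> u32 {
    (value << 1) | 1
}
```

Because the lowest bit is always set, the resulting status is always odd. This means that exiting through this device can never produce the exit status `0`.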
|
||||
|
||||
To write to the I/O port, we use the [`x86_64`] crate:
|
||||
|
||||
[`x86_64`]: https://docs.rs/x86_64/0.5.2/x86_64/
|
||||
|
||||
```toml
|
||||
# in Cargo.toml
|
||||
|
||||
[dependencies]
|
||||
x86_64 = "0.5.2"
|
||||
```
|
||||
|
||||
```rust
// in src/main.rs

pub unsafe fn exit_qemu() {
    use x86_64::instructions::port::Port;

    let mut port = Port::<u32>::new(0xf4);
    port.write(0);
}
```
|
||||
|
||||
We mark the function as `unsafe` because it relies on the fact that a special QEMU device is attached to the I/O port with address `0xf4`. For the port type we choose `u32` because the `iosize` is 4 bytes. As value we write a zero, which causes QEMU to exit with exit status `(0 << 1) | 1 = 1`.
|
||||
|
||||
Note that we could also use the exit status instead of the serial interface for sending the test results, for example by writing `1` to the port for success and `2` for failure (which QEMU would report as exit statuses `3` and `5`). However, this wouldn't allow us to send panic messages like the serial interface does and would also prevent us from replacing `exit_qemu` with a proper shutdown someday. Therefore we continue to use the serial interface and just always write a `0` to the port.
|
||||
|
||||
We can now test the QEMU shutdown by calling `exit_qemu` from our `_start` function:
|
||||
|
||||
```rust
|
||||
#[cfg(not(test))]
|
||||
#[no_mangle]
|
||||
pub extern "C" fn _start() -> ! {
|
||||
println!("Hello World{}", "!"); // prints to vga buffer
|
||||
serial_println!("Hello Host{}", "!");
|
||||
|
||||
unsafe { exit_qemu(); }
|
||||
|
||||
loop {}
|
||||
}
|
||||
```
|
||||
|
||||
You should see that QEMU immediately closes after booting when executing:
|
||||
|
||||
```
|
||||
bootimage run -- -serial mon:stdio -device isa-debug-exit,iobase=0xf4,iosize=0x04
|
||||
```
|
||||
|
||||
## Hiding QEMU
|
||||
|
||||
We are now able to launch a QEMU instance that writes its output to the serial port and exits automatically when it's done. So we no longer need the VGA buffer output or the graphical window that still pops up. We can disable it by passing the `-display none` parameter to QEMU. The full command looks like this:
|
||||
|
||||
```
|
||||
qemu-system-x86_64 \
|
||||
-drive format=raw,file=target/x86_64-blog_os/debug/bootimage-blog_os.bin \
|
||||
-serial mon:stdio \
|
||||
-device isa-debug-exit,iobase=0xf4,iosize=0x04 \
|
||||
-display none
|
||||
```
|
||||
|
||||
Or, with `bootimage run`:
|
||||
|
||||
```
|
||||
bootimage run -- \
|
||||
-serial mon:stdio \
|
||||
-device isa-debug-exit,iobase=0xf4,iosize=0x04 \
|
||||
-display none
|
||||
```
|
||||
|
||||
Now QEMU runs completely in the background and no window is opened anymore. This is not only less annoying, but also allows our test framework to run in environments without a graphical user interface, such as [Travis CI].
|
||||
|
||||
[Travis CI]: https://travis-ci.com/
|
||||
|
||||
## Test Organization
|
||||
|
||||
Right now we're doing the serial output and the QEMU exit from the `_start` function in our `main.rs` and can no longer run our kernel in a normal way. We could try to fix this by adding an `integration-test` [cargo feature] and using [conditional compilation]:
|
||||
|
||||
[cargo feature]: https://doc.rust-lang.org/cargo/reference/features.html#the-features-section
|
||||
[conditional compilation]: https://doc.rust-lang.org/reference/conditional-compilation.html
|
||||
|
||||
```toml
|
||||
# in Cargo.toml
|
||||
|
||||
[features]
|
||||
integration-test = []
|
||||
```
|
||||
|
||||
```rust
|
||||
// in src/main.rs
|
||||
|
||||
#[cfg(not(feature = "integration-test"))] // new
|
||||
#[cfg(not(test))]
|
||||
#[no_mangle]
|
||||
pub extern "C" fn _start() -> ! {
|
||||
println!("Hello World{}", "!"); // prints to vga buffer
|
||||
|
||||
// normal execution
|
||||
|
||||
loop {}
|
||||
}
|
||||
|
||||
#[cfg(feature = "integration-test")] // new
|
||||
#[cfg(not(test))]
|
||||
#[no_mangle]
|
||||
pub extern "C" fn _start() -> ! {
|
||||
serial_println!("Hello Host{}", "!");
|
||||
|
||||
run_test_1();
|
||||
run_test_2();
|
||||
// run more tests
|
||||
|
||||
unsafe { exit_qemu(); }
|
||||
|
||||
loop {}
|
||||
}
|
||||
```
|
||||
|
||||
However, this approach has a big problem: All tests run in the same kernel instance, which means that they can influence each other. For example, if `run_test_1` misconfigures the system by loading an invalid [page table], it can cause `run_test_2` to fail. This isn't something that we want because it makes it very difficult to find the actual cause of an error.
|
||||
|
||||
[page table]: https://en.wikipedia.org/wiki/Page_table
|
||||
|
||||
Instead, we want our test instances to be as independent as possible. If a test wants to destroy most of the system configuration to ensure that some property still holds in catastrophic situations, it should be able to do so without needing to restore a correct system state afterwards. This means that we need to launch a separate QEMU instance for each test.
|
||||
|
||||
With the above conditional compilation we only have two modes: Run the kernel normally or execute _all_ integration tests. To run each test in isolation with that approach, we would need a separate cargo feature for each test, which would result in very complex conditional compilation attributes and confusing code.
|
||||
|
||||
A better solution is to create an additional executable for each test.
|
||||
|
||||
### Additional Test Executables
|
||||
|
||||
Cargo allows adding [additional executables] to a project by putting them inside `src/bin`. We can use that feature to create a separate executable for each integration test. For example, a `test-something` executable could be added like this:
|
||||
|
||||
[additional executables]: https://doc.rust-lang.org/cargo/guide/project-layout.html
|
||||
|
||||
```rust
|
||||
// src/bin/test-something.rs
|
||||
|
||||
#![cfg_attr(not(test), no_std)]
|
||||
#![cfg_attr(not(test), no_main)]
|
||||
#![cfg_attr(test, allow(unused_imports))]
|
||||
|
||||
use core::panic::PanicInfo;
|
||||
|
||||
#[cfg(not(test))]
|
||||
#[no_mangle]
|
||||
pub extern "C" fn _start() -> ! {
|
||||
// run tests
|
||||
loop {}
|
||||
}
|
||||
|
||||
#[cfg(not(test))]
|
||||
#[panic_handler]
|
||||
fn panic(_info: &PanicInfo) -> ! {
|
||||
loop {}
|
||||
}
|
||||
```
|
||||
|
||||
By providing a new implementation for `_start` we can create a minimal test case that only tests one specific thing and is independent of the rest. For example, if we don't print anything to the VGA buffer, the test still succeeds even if the `vga_buffer` module is broken.
|
||||
|
||||
We can now run this executable in QEMU by passing a `--bin` argument to `bootimage`:
|
||||
|
||||
```
|
||||
bootimage run --bin test-something
|
||||
```
|
||||
|
||||
It should build the `test-something.rs` executable instead of `main.rs` and launch an empty QEMU window (since we don't print anything). So this approach allows us to create completely independent executables without cargo features or conditional compilation, and without cluttering our `main.rs`.
|
||||
|
||||
However, there is a problem: This is a completely separate executable, which means that we can't access any functions from our `main.rs`, including `serial_println` and `exit_qemu`. Duplicating the code would work, but we would also need to copy everything we want to test. This would mean that we no longer test the original function but only a possibly outdated copy.
|
||||
|
||||
Fortunately there is a way to share most of the code between our `main.rs` and the testing binaries: We move most of the code from our `main.rs` to a library that we can include from all executables.
|
||||
|
||||
### Split Off A Library
|
||||
|
||||
Cargo supports hybrid projects that are both a library and a binary. We only need to create a `src/lib.rs` file and split the contents of our `main.rs` in the following way:
|
||||
|
||||
```rust
|
||||
// src/lib.rs
|
||||
|
||||
#![cfg_attr(not(test), no_std)] // don't link the Rust standard library
|
||||
|
||||
// NEW: We need to add `pub` here to make them accessible from the outside
|
||||
pub mod vga_buffer;
|
||||
pub mod serial;
|
||||
|
||||
pub unsafe fn exit_qemu() {
|
||||
use x86_64::instructions::port::Port;
|
||||
|
||||
let mut port = Port::<u32>::new(0xf4);
|
||||
port.write(0);
|
||||
}
|
||||
```
|
||||
|
||||
```rust
|
||||
// src/main.rs
|
||||
|
||||
#![cfg_attr(not(test), no_std)]
|
||||
#![cfg_attr(not(test), no_main)]
|
||||
#![cfg_attr(test, allow(unused_imports))]
|
||||
|
||||
use core::panic::PanicInfo;
|
||||
use blog_os::println;
|
||||
|
||||
/// This function is the entry point, since the linker looks for a function
|
||||
/// named `_start` by default.
|
||||
#[cfg(not(test))]
|
||||
#[no_mangle] // don't mangle the name of this function
|
||||
pub extern "C" fn _start() -> ! {
|
||||
println!("Hello World{}", "!");
|
||||
|
||||
loop {}
|
||||
}
|
||||
|
||||
/// This function is called on panic.
|
||||
#[cfg(not(test))]
|
||||
#[panic_handler]
|
||||
fn panic(info: &PanicInfo) -> ! {
|
||||
println!("{}", info);
|
||||
loop {}
|
||||
}
|
||||
```
|
||||
|
||||
So we move everything except `_start` and `panic` to `lib.rs` and make the `vga_buffer` and `serial` modules public. Everything should work exactly as before, including `bootimage run` and `cargo test`. To run tests only for the library part of our crate and avoid the additional output we can execute `cargo test --lib`.
|
||||
|
||||
### Test Basic Boot
|
||||
|
||||
We are finally able to create our first integration test executable. We start simple and only test that the basic boot sequence works and the `_start` function is called:
|
||||
|
||||
```rust
|
||||
// in src/bin/test-basic-boot.rs
|
||||
|
||||
#![cfg_attr(not(test), no_std)]
|
||||
#![cfg_attr(not(test), no_main)] // disable all Rust-level entry points
|
||||
#![cfg_attr(test, allow(unused_imports))]
|
||||
|
||||
use core::panic::PanicInfo;
|
||||
use blog_os::{exit_qemu, serial_println};
|
||||
|
||||
/// This function is the entry point, since the linker looks for a function
|
||||
/// named `_start` by default.
|
||||
#[cfg(not(test))]
|
||||
#[no_mangle] // don't mangle the name of this function
|
||||
pub extern "C" fn _start() -> ! {
|
||||
serial_println!("ok");
|
||||
|
||||
unsafe { exit_qemu(); }
|
||||
loop {}
|
||||
}
|
||||
|
||||
|
||||
/// This function is called on panic.
|
||||
#[cfg(not(test))]
|
||||
#[panic_handler]
|
||||
fn panic(info: &PanicInfo) -> ! {
|
||||
serial_println!("failed");
|
||||
|
||||
serial_println!("{}", info);
|
||||
|
||||
unsafe { exit_qemu(); }
|
||||
loop {}
|
||||
}
|
||||
```
|
||||
|
||||
We don't do anything special here. We just print `ok` if `_start` is called and `failed` with the panic message when a panic occurs. Let's try it:
|
||||
|
||||
```
|
||||
> bootimage run --bin test-basic-boot -- \
|
||||
-serial mon:stdio -display none \
|
||||
-device isa-debug-exit,iobase=0xf4,iosize=0x04
|
||||
Building kernel
|
||||
Compiling blog_os v0.2.0 (file:///…/blog_os)
|
||||
Finished dev [unoptimized + debuginfo] target(s) in 0.19s
|
||||
Updating registry `https://github.com/rust-lang/crates.io-index`
|
||||
Creating disk image at target/x86_64-blog_os/debug/bootimage-test-basic-boot.bin
|
||||
warning: TCG doesn't support requested feature: CPUID.01H:ECX.vmx [bit 5]
|
||||
ok
|
||||
```
|
||||
|
||||
We got our `ok`, so it worked! Try inserting a `panic!()` before the line that prints `ok`. You should see output like this:
|
||||
|
||||
```
|
||||
failed
|
||||
panicked at 'explicit panic', src/bin/test-basic-boot.rs:19:5
|
||||
```
|
||||
|
||||
### Test Panic
|
||||
|
||||
To test that our panic handler is really invoked on a panic, we create a `test-panic` test:
|
||||
|
||||
```rust
|
||||
// in src/bin/test-panic.rs
|
||||
|
||||
#![cfg_attr(not(test), no_std)]
|
||||
#![cfg_attr(not(test), no_main)]
|
||||
#![cfg_attr(test, allow(unused_imports))]
|
||||
|
||||
use core::panic::PanicInfo;
|
||||
use blog_os::{exit_qemu, serial_println};
|
||||
|
||||
#[cfg(not(test))]
|
||||
#[no_mangle]
|
||||
pub extern "C" fn _start() -> ! {
|
||||
panic!();
|
||||
}
|
||||
|
||||
#[cfg(not(test))]
|
||||
#[panic_handler]
|
||||
fn panic(_info: &PanicInfo) -> ! {
|
||||
serial_println!("ok");
|
||||
|
||||
unsafe { exit_qemu(); }
|
||||
loop {}
|
||||
}
|
||||
```
|
||||
|
||||
This executable is almost identical to `test-basic-boot`. The only difference is that we print `ok` from our panic handler and invoke an explicit `panic!()` in our `_start` function.
|
||||
|
||||
## A Test Runner
|
||||
|
||||
The final step is to create a test runner, a program that executes all integration tests and checks their results. The basic steps that it should do are:
|
||||
|
||||
- Look for integration tests in the current project, maybe by some convention (e.g. executables starting with `test-`).
|
||||
- Run all integration tests and interpret their results.
|
||||
- Use a timeout to ensure that an endless loop does not block the test runner forever.
|
||||
- Report the test results to the user and set a successful or failing exit status.
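
The timeout step in the list above can be sketched on the host with a worker thread and a channel. This is only an illustration under stated assumptions: `run_with_timeout` is a hypothetical helper, not the way `bootimage` implements it, and a real runner would additionally have to kill the QEMU process after the deadline, which this sketch does not do.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Runs `job` on a worker thread and waits at most `timeout` for its result.
/// Returns `None` if the deadline passes first. The worker is not stopped
/// in that case; a real test runner would kill the QEMU process instead.
pub fn run_with_timeout<T, F>(job: F, timeout: Duration) -> Option<T>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (sender, receiver) = mpsc::channel();
    thread::spawn(move || {
        // The receiver may already be gone if the deadline passed; ignore that.
        let _ = sender.send(job());
    });
    receiver.recv_timeout(timeout).ok()
}
```

A job that finishes in time yields `Some(result)`, while a job that loops forever yields `None` once the timeout elapses, so the runner can report it and move on to the next test.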
|
||||
|
||||
Such a test runner is useful to many projects, so we decided to add one to the `bootimage` tool.
|
||||
|
||||
### Bootimage Test
|
||||
|
||||
The test runner of the `bootimage` tool can be invoked via `bootimage test`. It uses the following conventions:
|
||||
|
||||
- All executables starting with `test-` are treated as integration tests.
|
||||
- Tests must print either `ok` or `failed` over the serial port. When printing `failed` they can print additional information such as a panic message (in the next lines).
|
||||
- Tests are run with a timeout of 1 minute. If the test has not completed in time, it is reported as "timed out".
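
To illustrate the first two conventions, here is a minimal host-side sketch. The names `TestOutcome`, `is_integration_test`, and `interpret_serial_output` are hypothetical and chosen only for this example; the actual `bootimage` implementation may look quite different. The sketch also assumes that the first serial line is the result line.

```rust
/// Possible interpretations of a test's serial output (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum TestOutcome {
    Ok,
    /// The test printed `failed`; contains any additional lines it printed.
    Failed(String),
    /// The output followed neither convention.
    Invalid,
}

/// Executables whose name starts with `test-` count as integration tests.
pub fn is_integration_test(bin_name: &str) -> bool {
    bin_name.starts_with("test-")
}

/// Interprets the serial output, assuming the first line is the result.
pub fn interpret_serial_output(output: &str) -> TestOutcome {
    let mut lines = output.lines();
    match lines.next().map(str::trim) {
        Some("ok") => TestOutcome::Ok,
        Some("failed") => {
            // Everything after `failed` is extra information, such as a
            // panic message.
            let details: Vec<&str> = lines.collect();
            TestOutcome::Failed(details.join("\n"))
        }
        _ => TestOutcome::Invalid,
    }
}
```

Timeouts are not covered here because they are detected by the runner itself and not derived from the serial output.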
|
||||
|
||||
The `test-basic-boot` and `test-panic` tests we created above begin with `test-` and follow the `ok`/`failed` conventions, so they should work with `bootimage test`:
|
||||
|
||||
```
|
||||
> bootimage test
|
||||
test-panic
|
||||
Finished dev [unoptimized + debuginfo] target(s) in 0.01s
|
||||
Ok
|
||||
|
||||
test-basic-boot
|
||||
Finished dev [unoptimized + debuginfo] target(s) in 0.01s
|
||||
Ok
|
||||
|
||||
test-something
|
||||
Finished dev [unoptimized + debuginfo] target(s) in 0.01s
|
||||
Timed Out
|
||||
|
||||
The following tests failed:
|
||||
test-something: TimedOut
|
||||
```
|
||||
|
||||
We see that our `test-panic` and `test-basic-boot` succeeded and that the `test-something` test timed out after one minute. We no longer need `test-something`, so we delete it (if you haven't done so already). Now `bootimage test` should execute successfully.
|
||||
|
||||
## Summary
|
||||
|
||||
In this post we learned about the serial port and port-mapped I/O and saw how to configure QEMU to print serial output to the command line. We also learned a trick for exiting QEMU without needing to implement a proper shutdown.
|
||||
|
||||
We then split our crate into a library and binary part in order to create additional executables for integration tests. We added two example tests for testing that the `_start` function is correctly called and that a `panic` invokes our panic handler. Finally, we presented `bootimage test` as a basic test runner for our integration tests.
|
||||
|
||||
We now have a working integration test framework and can finally start to implement functionality in our kernel. We will continue to use the test framework over the next posts to test new components we add.
|
||||
|
||||
## What's next?
|
||||
In the next post, we will explore _CPU exceptions_. These exceptions are thrown by the CPU when something illegal happens, such as a division by zero or an access to an unmapped memory page (a so-called “page fault”). Being able to catch and examine these exceptions is very important for debugging future errors. Exception handling is also very similar to the handling of hardware interrupts, which is required for keyboard support.
|
||||
|
@@ -0,0 +1,721 @@
|
||||
+++
|
||||
title = "Advanced Paging"
|
||||
weight = 10
|
||||
path = "advanced-paging"
|
||||
date = 2019-01-28
|
||||
|
||||
[extra]
|
||||
warning_short = "Deprecated: "
|
||||
warning = "This post is deprecated in favor of the [_Paging Implementation_](/paging-implementation) post and will no longer receive updates. See issue [#545](https://github.com/phil-opp/blog_os/issues/545) for reasons for this deprecation."
|
||||
+++
|
||||
|
||||
This post explains techniques to make the physical page table frames accessible to our kernel. It then uses such a technique to implement a function that translates virtual to physical addresses. It also explains how to create new mappings in the page tables.
|
||||
|
||||
<!-- more -->
|
||||
|
||||
This blog is openly developed on [GitHub]. If you have any problems or questions, please open an issue there. You can also leave comments [at the bottom]. The complete source code for this post can be found [here][post branch].
|
||||
|
||||
[GitHub]: https://github.com/phil-opp/blog_os
|
||||
[at the bottom]: #comments
|
||||
[post branch]: https://github.com/phil-opp/blog_os/tree/5c0fb63f33380fc8596d7166c2ebde03ef3d6726
|
||||
|
||||
## Introduction
|
||||
|
||||
In the [previous post] we learned about the principles of paging and how the 4-level page tables on x86_64 work. We also found out that the bootloader already set up a page table hierarchy for our kernel, which means that our kernel already runs on virtual addresses. This improves safety since illegal memory accesses cause page fault exceptions instead of modifying arbitrary physical memory.
|
||||
|
||||
[previous post]: @/edition-2/posts/08-paging-introduction/index.md
|
||||
|
||||
However, it also causes a problem when we try to access the page tables from our kernel because we can't directly access the physical addresses that are stored in page table entries or the `CR3` register. We experienced that problem already [at the end of the previous post] when we tried to inspect the active page tables.
|
||||
|
||||
[at the end of the previous post]: @/edition-2/posts/08-paging-introduction/index.md#accessing-the-page-tables
|
||||
|
||||
The next section discusses the problem in detail and provides different approaches to a solution. Afterward, we implement a function that traverses the page table hierarchy in order to translate virtual to physical addresses. Finally, we learn how to create new mappings in the page tables and how to find unused memory frames for creating new page tables.
|
||||
|
||||
### Dependency Versions
|
||||
|
||||
This post requires version 0.3.12 of the `bootloader` dependency and version 0.5.0 of the `x86_64` dependency. You can set the dependency versions in your `Cargo.toml`:
|
||||
|
||||
```toml
|
||||
[dependencies]
|
||||
bootloader = "0.3.12"
|
||||
x86_64 = "0.5.0"
|
||||
```
|
||||
|
||||
## Accessing Page Tables
|
||||
|
||||
Accessing the page tables from our kernel is not as easy as it may seem. To understand the problem let's take a look at the example 4-level page table hierarchy of the previous post again:
|
||||
|
||||

|
||||
|
||||
The important thing here is that each page table entry stores the _physical_ address of the next table. This avoids the need to run a translation for these addresses too, which would be bad for performance and could easily cause endless translation loops.
|
||||
|
||||
The problem for us is that we can't directly access physical addresses from our kernel since our kernel also runs on top of virtual addresses. For example when we access address `4 KiB`, we access the _virtual_ address `4 KiB`, not the _physical_ address `4 KiB` where the level 4 page table is stored. When we want to access the physical address `4 KiB`, we can only do so through some virtual address that maps to it.
|
||||
|
||||
So in order to access page table frames, we need to map some virtual pages to them. There are different ways to create these mappings that all allow us to access arbitrary page table frames:
|
||||
|
||||
|
||||
- A simple solution is to **identity map all page tables**:
|
||||
|
||||

|
||||
|
||||
In this example, we see various identity-mapped page table frames. This way the physical addresses of page tables are also valid virtual addresses so that we can easily access the page tables of all levels starting from the CR3 register.
|
||||
|
||||
However, it clutters the virtual address space and makes it more difficult to find contiguous memory regions of larger sizes. For example, imagine that we want to create a virtual memory region of size 1000 KiB in the above graphic, e.g. for [memory-mapping a file]. We can't start the region at `28 KiB` because it would extend to `1028 KiB` and thus collide with the already mapped page at `1004 KiB`. So we have to look further until we find a large enough unmapped area, for example at `1008 KiB`. This is a fragmentation problem similar to the one we saw with [segmentation].
|
||||
|
||||
[memory-mapping a file]: https://en.wikipedia.org/wiki/Memory-mapped_file
|
||||
[segmentation]: @/edition-2/posts/08-paging-introduction/index.md#fragmentation
|
||||
|
||||
Equally, it makes it much more difficult to create new page tables, because we need to find physical frames whose corresponding pages aren't already in use. For example, let's assume that we reserved the _virtual_ 1000 KiB memory region starting at `1008 KiB` for our memory-mapped file. Now we can't use any frame with a _physical_ address between `1008 KiB` and `2008 KiB` anymore, because we can't identity map it.
|
||||
|
||||
- Alternatively, we could **map the page table frames only temporarily** when we need to access them. To be able to create the temporary mappings we only need a single identity-mapped level 1 table:
|
||||
|
||||

|
||||
|
||||
The level 1 table in this graphic controls the first 2 MiB of the virtual address space. This is because it is reachable by starting at the CR3 register and following the 0th entry in the level 4, level 3, and level 2 page tables. The entry with index `8` maps the virtual page at address `32 KiB` to the physical frame at address `32 KiB`, thereby identity mapping the level 1 table itself. The graphic shows this identity-mapping by the horizontal arrow at `32 KiB`.
|
||||
|
||||
By writing to the identity-mapped level 1 table, our kernel can create up to 511 temporary mappings (512 minus the entry required for the identity mapping). In the above example, the kernel mapped the 0th entry of the level 1 table to the frame with address `24 KiB`. This created a temporary mapping of the virtual page at `0 KiB` to the physical frame of the level 2 page table, indicated by the dashed arrow. Now the kernel can access the level 2 page table by writing to the page starting at `0 KiB`.
|
||||
|
||||
The process for accessing an arbitrary page table frame with temporary mappings would be:
|
||||
|
||||
- Search for a free entry in the identity-mapped level 1 table.
|
||||
- Map that entry to the physical frame of the page table that we want to access.
|
||||
- Access the target frame through the virtual page that maps to the entry.
|
||||
- Set the entry back to unused thereby removing the temporary mapping again.
|
||||
|
||||
This approach keeps the virtual address space clean since it reuses the same 512 virtual pages for creating the mappings. The drawback is that it is a bit cumbersome, especially since a new mapping might require modifications of multiple table levels, which means that we would need to repeat the above process multiple times.
|
||||
|
||||
- While both of the above approaches work, there is a third technique called **recursive page tables** that combines their advantages: It keeps all page table frames mapped at all times so that no temporary mappings are needed, and also keeps the mapped pages together to avoid fragmentation of the virtual address space. This is the technique that we will use for our implementation, therefore it is described in detail in the following section.
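
Before moving on, the four-step process for temporary mappings described above can be modeled with a toy data structure. This is a host-side simulation under stated assumptions, not kernel code: `Level1Table` here is a hypothetical type that stores plain frame addresses, covers the first 2 MiB of the virtual address space, and ignores page table flags and TLB flushes.

```rust
const PAGE_SIZE: u64 = 4096;

/// Toy model of the identity-mapped level 1 table (hypothetical type).
/// Entry `i` maps the virtual page at `i * 4 KiB` to the stored frame.
pub struct Level1Table {
    entries: [Option<u64>; 512],
}

impl Level1Table {
    /// Creates a table that identity maps its own frame, which must lie
    /// in the first 2 MiB so that this table controls its page.
    pub fn new(own_frame: u64) -> Self {
        assert!(own_frame % PAGE_SIZE == 0 && own_frame < 512 * PAGE_SIZE);
        let mut entries = [None; 512];
        entries[(own_frame / PAGE_SIZE) as usize] = Some(own_frame);
        Level1Table { entries }
    }

    /// Steps 1 and 2: find a free entry and map it to `frame`.
    /// Returns the virtual page address for step 3, or `None` if full.
    pub fn map_temporarily(&mut self, frame: u64) -> Option<u64> {
        let index = self.entries.iter().position(|entry| entry.is_none())?;
        self.entries[index] = Some(frame);
        Some(index as u64 * PAGE_SIZE)
    }

    /// Step 4: mark the entry as unused again.
    pub fn unmap(&mut self, page: u64) {
        self.entries[(page / PAGE_SIZE) as usize] = None;
    }
}
```

With the table at `32 KiB` as in the graphic, mapping the frame at `24 KiB` returns the page at `0 KiB`, and at most 511 temporary mappings can exist at the same time.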
|
||||
|
||||
### Recursive Page Tables
|
||||
|
||||
The idea behind this approach is to map some entry of the level 4 page table to the level 4 table itself. By doing this, we effectively reserve a part of the virtual address space and map all current and future page table frames to that space.
|
||||
|
||||
Let's go through an example to understand how this all works:
|
||||
|
||||

|
||||
|
||||
The only difference to the [example at the beginning of this post] is the additional entry at index `511` in the level 4 table, which is mapped to physical frame `4 KiB`, the frame of the level 4 table itself.
|
||||
|
||||
[example at the beginning of this post]: #accessing-page-tables
|
||||
|
||||
By letting the CPU follow this entry on a translation, it doesn't reach a level 3 table, but the same level 4 table again. This is similar to a recursive function that calls itself, therefore this table is called a _recursive page table_. The important thing is that the CPU assumes that every entry in the level 4 table points to a level 3 table, so it now treats the level 4 table as a level 3 table. This works because tables of all levels have the exact same layout on x86_64.
|
||||
|
||||
By following the recursive entry one or multiple times before we start the actual translation, we can effectively shorten the number of levels that the CPU traverses. For example, if we follow the recursive entry once and then proceed to the level 3 table, the CPU thinks that the level 3 table is a level 2 table. Going further, it treats the level 2 table as a level 1 table and the level 1 table as the mapped frame. This means that we can now read and write the level 1 page table because the CPU thinks that it is the mapped frame. The graphic below illustrates the 5 translation steps:
|
||||
|
||||

|
||||
|
||||
Similarly, we can follow the recursive entry twice before starting the translation to reduce the number of traversed levels to two:
|
||||
|
||||

|
||||
|
||||
Let's go through it step by step: First, the CPU follows the recursive entry on the level 4 table and thinks that it reaches a level 3 table. Then it follows the recursive entry again and thinks that it reaches a level 2 table. But in reality, it is still on the level 4 table. When the CPU now follows a different entry, it lands on a level 3 table but thinks it is already on a level 1 table. So while the next entry points at a level 2 table, the CPU thinks that it points to the mapped frame, which allows us to read and write the level 2 table.
|
||||
|
||||
Accessing the tables of levels 3 and 4 works in the same way. For accessing the level 3 table, we follow the recursive entry three times, tricking the CPU into thinking it is already on a level 1 table. Then we follow another entry and reach a level 3 table, which the CPU treats as a mapped frame. For accessing the level 4 table itself, we just follow the recursive entry four times until the CPU treats the level 4 table itself as the mapped frame (in blue in the graphic below).
|
||||
|
||||

|
||||
|
||||
It might take some time to wrap your head around the concept, but it works quite well in practice.
|
||||
|
||||
#### Address Calculation
|
||||
|
||||
We saw that we can access tables of all levels by following the recursive entry once or multiple times before the actual translation. Since the indexes into the tables of the four levels are derived directly from the virtual address, we need to construct special virtual addresses for this technique. Remember, the page table indexes are derived from the address in the following way:
|
||||
|
||||

|
||||
|
||||
Let's assume that we want to access the level 1 page table that maps a specific page. As we learned above, this means that we have to follow the recursive entry one time before continuing with the level 4, level 3, and level 2 indexes. To do that we move each block of the address one block to the right and set the original level 4 index to the index of the recursive entry:
|
||||
|
||||

|
||||
|
||||
For accessing the level 2 table of that page, we move each index block two blocks to the right and set both the blocks of the original level 4 index and the original level 3 index to the index of the recursive entry:
|
||||
|
||||

|
||||
|
||||
Accessing the level 3 table works by moving each block three blocks to the right and using the recursive index for the original level 4, level 3, and level 2 address blocks:
|
||||
|
||||

|
||||
|
||||
Finally, we can access the level 4 table by moving each block four blocks to the right and using the recursive index for all address blocks except for the offset:
|
||||
|
||||

|
||||
|
||||
We can now calculate virtual addresses for the page tables of all four levels. We can even calculate an address that points exactly to a specific page table entry by multiplying its index by 8, the size of a page table entry.
|
||||
|
||||
The table below summarizes the address structure for accessing the different kinds of frames:
|
||||
|
||||
Virtual Address for | Address Structure ([octal])
|
||||
------------------- | -------------------------------
|
||||
Page | `0o_SSSSSS_AAA_BBB_CCC_DDD_EEEE`
|
||||
Level 1 Table Entry | `0o_SSSSSS_RRR_AAA_BBB_CCC_DDDD`
|
||||
Level 2 Table Entry | `0o_SSSSSS_RRR_RRR_AAA_BBB_CCCC`
|
||||
Level 3 Table Entry | `0o_SSSSSS_RRR_RRR_RRR_AAA_BBBB`
|
||||
Level 4 Table Entry | `0o_SSSSSS_RRR_RRR_RRR_RRR_AAAA`
|
||||
|
||||
[octal]: https://en.wikipedia.org/wiki/Octal
|
||||
|
||||
Here, `AAA` is the level 4 index, `BBB` the level 3 index, `CCC` the level 2 index, `DDD` the level 1 index of the mapped frame, and `EEEE` the offset into it. `RRR` is the index of the recursive entry. When an index (three digits) is transformed to an offset (four digits), it is done by multiplying it by 8 (the size of a page table entry). With this offset, the resulting address directly points to the respective page table entry.
|
||||
|
||||
`SSSSSS` are sign extension bits, which means that they are all copies of bit 47. This is a special requirement for valid addresses on the x86_64 architecture. We explained it in the [previous post][sign extension].
|
||||
|
||||
[sign extension]: @/edition-2/posts/08-paging-introduction/index.md#paging-on-x86-64
|
||||
|
||||
We use [octal] numbers for representing the addresses since each octal character represents three bits, which allows us to clearly separate the 9-bit indexes of the different page table levels. This isn't possible with the hexadecimal system where each character represents four bits.
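
The address structures in the table can also be computed generically. The sketch below is an illustration with hypothetical helper names (`sign_extend`, `table_addr`, `entry_addr`). It assumes plain `u64` addresses and takes the recursive index `r` as a parameter instead of hardcoding it.

```rust
/// Copies bit 47 into bits 48..64 to form a valid (canonical) address.
pub fn sign_extend(addr: u64) -> u64 {
    (((addr << 16) as i64) >> 16) as u64
}

/// Virtual address of the level `level` page table (1 to 4) that is used
/// when translating `addr`, given the recursive level 4 index `r`.
pub fn table_addr(addr: u64, level: u32, r: u64) -> u64 {
    assert!((1..=4).contains(&level));
    // Keep only the four 9-bit index blocks (drop offset and sign bits).
    let indexes = (addr >> 12) & 0o777_777_777_777;
    // Move each block `level` blocks to the right.
    let mut shifted = indexes >> (9 * level);
    // Fill the vacated blocks on the left with the recursive index.
    for block in 0..level {
        shifted |= r << (9 * (3 - block));
    }
    sign_extend(shifted << 12)
}

/// Virtual address of the entry for `addr` inside that table.
pub fn entry_addr(addr: u64, level: u32, r: u64) -> u64 {
    let index = (addr >> (12 + 9 * (level - 1))) & 0o777;
    // Each page table entry is 8 bytes large.
    table_addr(addr, level, r) + index * 8
}
```

With `r = 0o777`, `table_addr(addr, 4, r)` evaluates to `0xffff_ffff_ffff_f000` for any `addr`, since all four index blocks are replaced by the recursive index.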
|
||||
|
||||
## Implementation
|
||||
|
||||
After all this theory we can finally start our implementation. Conveniently, the bootloader not only created page tables for our kernel, but it also created a recursive mapping in the last entry of the level 4 table. The bootloader did this because otherwise there would be a [chicken or egg problem]: We need to access the level 4 table to create a recursive mapping, but we can't access it without some kind of mapping.
|
||||
|
||||
[chicken or egg problem]: https://en.wikipedia.org/wiki/Chicken_or_the_egg
|
||||
|
||||
We already used this recursive mapping [at the end of the previous post] to access the level 4 table. We did this through the hardcoded address `0xffff_ffff_ffff_f000`. When we convert this address to [octal] and compare it with the above table, we can see that it exactly follows the structure of a level 4 table entry with `RRR` = `0o777`, `AAAA` = 0, and the sign extension bits set to `1` each:
|
||||
|
||||
```
|
||||
structure: 0o_SSSSSS_RRR_RRR_RRR_RRR_AAAA
|
||||
address: 0o_177777_777_777_777_777_0000
|
||||
```
|
||||
|
||||
With our knowledge about recursive page tables we can now create virtual addresses to access all active page tables. This allows us to create a translation function in software.
|
||||
|
||||
### Translating Addresses
|
||||
|
||||
As a first step, let's create a function that translates a virtual address to a physical address by walking the page table hierarchy:
|
||||
|
||||
```rust
|
||||
// in src/lib.rs
|
||||
|
||||
pub mod memory;
|
||||
```
|
||||
|
||||
```rust
// in src/memory.rs

use x86_64::PhysAddr;
use x86_64::structures::paging::PageTable;

/// Returns the physical address for the given virtual address, or `None` if the
/// virtual address is not mapped.
pub fn translate_addr(addr: usize) -> Option<PhysAddr> {
    // introduce variables for the recursive index and the sign extension bits
    // TODO: Don't hardcode these values
    let r = 0o777; // recursive index
    let sign = 0o177777 << 48; // sign extension

    // retrieve the page table indices of the address that we want to translate
    let l4_idx = (addr >> 39) & 0o777; // level 4 index
    let l3_idx = (addr >> 30) & 0o777; // level 3 index
    let l2_idx = (addr >> 21) & 0o777; // level 2 index
    let l1_idx = (addr >> 12) & 0o777; // level 1 index
    let page_offset = addr & 0o7777;

    // calculate the table addresses
    let level_4_table_addr =
        sign | (r << 39) | (r << 30) | (r << 21) | (r << 12);
    let level_3_table_addr =
        sign | (r << 39) | (r << 30) | (r << 21) | (l4_idx << 12);
    let level_2_table_addr =
        sign | (r << 39) | (r << 30) | (l4_idx << 21) | (l3_idx << 12);
    let level_1_table_addr =
        sign | (r << 39) | (l4_idx << 30) | (l3_idx << 21) | (l2_idx << 12);

    // check that level 4 entry is mapped
    let level_4_table = unsafe { &*(level_4_table_addr as *const PageTable) };
    if level_4_table[l4_idx].addr().is_null() {
        return None;
    }

    // check that level 3 entry is mapped
    let level_3_table = unsafe { &*(level_3_table_addr as *const PageTable) };
    if level_3_table[l3_idx].addr().is_null() {
        return None;
    }

    // check that level 2 entry is mapped
    let level_2_table = unsafe { &*(level_2_table_addr as *const PageTable) };
    if level_2_table[l2_idx].addr().is_null() {
        return None;
    }

    // check that level 1 entry is mapped and retrieve physical address from it
    let level_1_table = unsafe { &*(level_1_table_addr as *const PageTable) };
    let phys_addr = level_1_table[l1_idx].addr();
    if phys_addr.is_null() {
        return None;
    }

    Some(phys_addr + page_offset)
}
```
|
||||
|
||||
First, we introduce variables for the recursive index (511 = `0o777`) and the sign extension bits (which are 1 each). Then we calculate the page table indices and the page offset from the address through bitwise operations as specified in the graphic:
|
||||
|
||||

|
||||
|
||||
In the next step we calculate the virtual addresses of the four page tables as described in the [address calculation] section. We transform each of these addresses to [`PageTable`] references later in the function. These transformations are `unsafe` operations since the compiler can't know that these addresses are valid.
|
||||
|
||||
[address calculation]: #address-calculation
|
||||
[`PageTable`]: https://docs.rs/x86_64/0.5.2/x86_64/structures/paging/page_table/struct.PageTable.html
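The table address calculation can also be tried out in isolation. The following sketch assumes the recursive index `0o777` used by the bootloader in this post. The constant and function names are made up for the example.

```rust
// Assumption: the recursive mapping lives in entry 511 of the level 4 table.
const RECURSIVE_INDEX: u64 = 0o777;
// Because bit 47 is set for index 511, all sign extension bits are 1.
const SIGN_EXTENSION: u64 = 0o177777 << 48;

/// Returns the virtual addresses of the level 4, 3, 2, and 1 tables
/// (in that order) that are responsible for translating `addr`.
fn page_table_addrs(addr: u64) -> [u64; 4] {
    let r = RECURSIVE_INDEX;
    let l4 = (addr >> 39) & 0o777;
    let l3 = (addr >> 30) & 0o777;
    let l2 = (addr >> 21) & 0o777;

    [
        SIGN_EXTENSION | (r << 39) | (r << 30) | (r << 21) | (r << 12),
        SIGN_EXTENSION | (r << 39) | (r << 30) | (r << 21) | (l4 << 12),
        SIGN_EXTENSION | (r << 39) | (r << 30) | (l4 << 21) | (l3 << 12),
        SIGN_EXTENSION | (r << 39) | (l4 << 30) | (l3 << 21) | (l2 << 12),
    ]
}
```

Each level replaces one more recursive index with a real index of the address, so the level 1 table address loops through the level 4 table only once.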
|
||||
|
||||
After the address calculation, we use the indexing operator to look at the entry in the level 4 table. If that entry is null, there is no level 3 table for this level 4 entry, which means that the `addr` is not mapped to any physical memory, so we return `None`. If the entry is not null, we know that a level 3 table exists. We then do the same cast and entry-checking as with the level 4 table.
|
||||
|
||||
After we have checked the three higher level page tables, we can finally read the entry of the level 1 table that tells us the physical frame that the address is mapped to. As the last step, we add the page offset to that address and return it.
|
||||
|
||||
If we knew that the address is mapped, we could directly access the level 1 table without looking at the higher level tables first. But since we don't know this, we have to check whether the level 1 table exists first, otherwise our function would cause a page fault for unmapped addresses.
|
||||
|
||||
#### Try it out
|
||||
|
||||
We can use our new translation function to translate some virtual addresses in our `_start` function:
|
||||
|
||||
```rust
// in src/main.rs

#[cfg(not(test))]
#[no_mangle]
pub extern "C" fn _start() -> ! {
    […] // initialize GDT, IDT, PICS

    use blog_os::memory::translate_addr;

    let addresses = [
        // the identity-mapped vga buffer page
        0xb8000,
        // some code page
        0x20010a,
        // some stack page
        0x57ac_001f_fe48,
    ];

    for &address in &addresses {
        println!("{:?} -> {:?}", address, translate_addr(address));
    }

    println!("It did not crash!");
    blog_os::hlt_loop();
}
```
|
||||
|
||||
When we run it, we see the following output:
|
||||
|
||||

|
||||
|
||||
As expected, the identity-mapped address `0xb8000` translates to the same physical address. The code page and the stack page translate to some arbitrary physical addresses, which depend on how the bootloader created the initial mapping for our kernel.
|
||||
|
||||
#### The `RecursivePageTable` Type
|
||||
|
||||
The `x86_64` crate provides a [`RecursivePageTable`] type that implements safe abstractions for various page table operations. The type implements the [`MapperAllSizes`] trait, which already contains a `translate_addr` function that we can use instead of hand-rolling our own. To create a new `RecursivePageTable`, we create a `memory::init` function:
|
||||
|
||||
[`RecursivePageTable`]: https://docs.rs/x86_64/0.5.2/x86_64/structures/paging/struct.RecursivePageTable.html
|
||||
[`MapperAllSizes`]: https://docs.rs/x86_64/0.5.2/x86_64/structures/paging/mapper/trait.MapperAllSizes.html
|
||||
|
||||
```rust
// in src/memory.rs

use x86_64::structures::paging::{Mapper, Page, PageTable, RecursivePageTable};
use x86_64::{VirtAddr, PhysAddr};

/// Creates a RecursivePageTable instance from the level 4 address.
///
/// This function is unsafe because it can break memory safety if an invalid
/// address is passed.
pub unsafe fn init(level_4_table_addr: usize) -> RecursivePageTable<'static> {
    let level_4_table_ptr = level_4_table_addr as *mut PageTable;
    let level_4_table = &mut *level_4_table_ptr;
    RecursivePageTable::new(level_4_table).unwrap()
}
```
|
||||
|
||||
The `RecursivePageTable` type encapsulates the unsafety of the page table walk completely so that we no longer need `unsafe` to implement our own `translate_addr` function. The `init` function needs to be unsafe because the caller has to guarantee that the passed `level_4_table_addr` is valid.
|
||||
|
||||
We can now use the `MapperAllSizes::translate_addr` function in our `_start` function:
|
||||
|
||||
```rust
// in src/main.rs

#[cfg(not(test))]
#[no_mangle]
pub extern "C" fn _start() -> ! {
    […] // initialize GDT, IDT, PICS

    use blog_os::memory;
    use x86_64::{
        structures::paging::MapperAllSizes,
        VirtAddr,
    };

    const LEVEL_4_TABLE_ADDR: usize = 0o_177777_777_777_777_777_0000;
    let recursive_page_table = unsafe { memory::init(LEVEL_4_TABLE_ADDR) };

    let addresses = […]; // as before
    for &address in &addresses {
        let virt_addr = VirtAddr::new(address);
        let phys_addr = recursive_page_table.translate_addr(virt_addr);
        println!("{:?} -> {:?}", virt_addr, phys_addr);
    }

    println!("It did not crash!");
    blog_os::hlt_loop();
}
```
|
||||
|
||||
Instead of passing addresses around as plain integers, we now use the [`VirtAddr`] and [`PhysAddr`] wrapper types to differentiate the two kinds of addresses. In order to be able to call the `translate_addr` method, we need to import the `MapperAllSizes` trait.
|
||||
|
||||
[`VirtAddr`]: https://docs.rs/x86_64/0.5.2/x86_64/struct.VirtAddr.html
|
||||
[`PhysAddr`]: https://docs.rs/x86_64/0.5.2/x86_64/struct.PhysAddr.html
|
||||
|
||||
By using the `RecursivePageTable` type, we now have a safe abstraction and clear ownership semantics. This ensures that we can't accidentally modify the page table concurrently, because an exclusive borrow of the `RecursivePageTable` is needed in order to modify it.
|
||||
|
||||
When we run it, we see the same result as with our handcrafted translation function.
|
||||
|
||||
#### Making Unsafe Functions Safer
|
||||
|
||||
Our `memory::init` function is an [unsafe function], which means that an `unsafe` block is required for calling it because the caller has to guarantee that certain requirements are met. In our case, the requirement is that the passed address is mapped to the physical frame of the level 4 page table.
|
||||
|
||||
[unsafe function]: https://doc.rust-lang.org/book/ch19-01-unsafe-rust.html#calling-an-unsafe-function-or-method
|
||||
|
||||
The second property of unsafe functions is that their complete body is treated as an `unsafe` block, which means that it can perform all kinds of unsafe operations without additional unsafe blocks. This is the reason that we didn't need an `unsafe` block for dereferencing the raw `level_4_table_ptr`:
|
||||
|
||||
```rust
pub unsafe fn init(level_4_table_addr: usize) -> RecursivePageTable<'static> {
    let level_4_table_ptr = level_4_table_addr as *mut PageTable;
    let level_4_table = &mut *level_4_table_ptr; // <- this operation is unsafe
    RecursivePageTable::new(level_4_table).unwrap()
}
```
|
||||
|
||||
The problem with this is that we don't immediately see which parts are unsafe. For example, we don't know whether the `RecursivePageTable::new` function is unsafe or not without looking at [its definition][RecursivePageTable::new]. This makes it very easy to accidentally do something unsafe without noticing.
|
||||
|
||||
[RecursivePageTable::new]: https://docs.rs/x86_64/0.5.2/x86_64/structures/paging/struct.RecursivePageTable.html#method.new
|
||||
|
||||
To avoid this problem, we can add a safe inner function:
|
||||
|
||||
```rust
// in src/memory.rs

pub unsafe fn init(level_4_table_addr: usize) -> RecursivePageTable<'static> {
    /// Rust currently treats the whole body of unsafe functions as an unsafe
    /// block, which makes it difficult to see which operations are unsafe. To
    /// limit the scope of unsafe we use a safe inner function.
    fn init_inner(level_4_table_addr: usize) -> RecursivePageTable<'static> {
        let level_4_table_ptr = level_4_table_addr as *mut PageTable;
        let level_4_table = unsafe { &mut *level_4_table_ptr };
        RecursivePageTable::new(level_4_table).unwrap()
    }

    init_inner(level_4_table_addr)
}
```
|
||||
|
||||
Now an `unsafe` block is required again for dereferencing the `level_4_table_ptr` and we immediately see that this is the only unsafe operation in the function. There is currently an open [RFC][unsafe-fn-rfc] to change this unfortunate property of unsafe functions that would allow us to avoid the above boilerplate.
|
||||
|
||||
[unsafe-fn-rfc]: https://github.com/rust-lang/rfcs/pull/2585
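The inner function pattern is independent of page tables. Here is a minimal sketch of the same idea that runs on a normal host system. The function `read_u32_at` is a hypothetical example and not part of our kernel.

```rust
/// Reads a `u32` from the given address.
///
/// # Safety
///
/// The caller must guarantee that `addr` is the address of a valid, aligned,
/// and initialized `u32`.
pub unsafe fn read_u32_at(addr: usize) -> u32 {
    // The inner function is safe, so the compiler requires an explicit
    // `unsafe` block around every unsafe operation inside it.
    fn read_inner(addr: usize) -> u32 {
        let ptr = addr as *const u32; // creating a raw pointer is safe
        unsafe { ptr.read() } // reading through it is the only unsafe step
    }

    read_inner(addr)
}
```

The outer function keeps its `unsafe` marker, so callers still have to promise that the address is valid. Only the body becomes easier to audit.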
|
||||
|
||||
### Creating a new Mapping
|
||||
After reading the page tables and creating a translation function, the next step is to create a new mapping in the page table hierarchy.
|
||||
|
||||
The difficulty of creating a new mapping depends on the virtual page that we want to map. In the easiest case, the level 1 page table for the page already exists and we just need to write a single entry. In the most difficult case, the page is in a memory region for which no level 3 table exists yet, so that we need to create new level 3, level 2 and level 1 page tables first.
|
||||
|
||||
Let's start with the simple case and assume that we don't need to create new page tables. The bootloader loads itself in the first megabyte of the virtual address space, so we know that a valid level 1 table exists for this region. We can choose any unused page in this memory region for our example mapping, for example, the page at address `0x1000`. As the target frame we use `0xb8000`, the frame of the VGA text buffer. This way we can easily test whether our mapping worked.
|
||||
|
||||
We implement it in a new `create_example_mapping` function in our `memory` module:
|
||||
|
||||
```rust
// in src/memory.rs

use x86_64::structures::paging::{FrameAllocator, PhysFrame, Size4KiB};

pub fn create_example_mapping(
    recursive_page_table: &mut RecursivePageTable,
    frame_allocator: &mut impl FrameAllocator<Size4KiB>,
) {
    use x86_64::structures::paging::PageTableFlags as Flags;

    let page: Page = Page::containing_address(VirtAddr::new(0x1000));
    let frame = PhysFrame::containing_address(PhysAddr::new(0xb8000));
    let flags = Flags::PRESENT | Flags::WRITABLE;

    let map_to_result = unsafe {
        recursive_page_table.map_to(page, frame, flags, frame_allocator)
    };
    map_to_result.expect("map_to failed").flush();
}
```
|
||||
|
||||
The function takes a mutable reference to the `RecursivePageTable` because it needs to modify it, and a `FrameAllocator` that is explained below. It then uses the [`map_to`] function of the [`Mapper`] trait to map the page at address `0x1000` to the physical frame at address `0xb8000`. The `map_to` function is unsafe because it's possible to break memory safety with invalid arguments.
|
||||
|
||||
[`map_to`]: https://docs.rs/x86_64/0.5.2/x86_64/structures/paging/mapper/trait.Mapper.html#tymethod.map_to
|
||||
[`Mapper`]: https://docs.rs/x86_64/0.5.2/x86_64/structures/paging/mapper/trait.Mapper.html
|
||||
|
||||
Apart from the `page` and `frame` arguments, the [`map_to`] function takes two more arguments. The third argument is a set of flags for the page table entry. We set the `PRESENT` flag because it is required for all valid entries and the `WRITABLE` flag to make the mapped page writable.
|
||||
|
||||
The fourth argument needs to be some structure that implements the [`FrameAllocator`] trait. The `map_to` method needs this argument because it might need unused frames for creating new page tables. The `Size4KiB` argument in the trait implementation is needed because the [`Page`] and [`PhysFrame`] types are [generic] over the [`PageSize`] trait to work with both standard 4KiB pages and huge 2MiB/1GiB pages.
|
||||
|
||||
[`FrameAllocator`]: https://docs.rs/x86_64/0.5.2/x86_64/structures/paging/trait.FrameAllocator.html
|
||||
[`Page`]: https://docs.rs/x86_64/0.5.2/x86_64/structures/paging/page/struct.Page.html
|
||||
[`PhysFrame`]: https://docs.rs/x86_64/0.5.2/x86_64/structures/paging/frame/struct.PhysFrame.html
|
||||
[generic]: https://doc.rust-lang.org/book/ch10-00-generics.html
|
||||
[`PageSize`]: https://docs.rs/x86_64/0.5.2/x86_64/structures/paging/page/trait.PageSize.html
|
||||
|
||||
The `map_to` function can fail, so it returns a [`Result`]. Since this is just some example code that does not need to be robust, we just use [`expect`] to panic when an error occurs. On success, the function returns a [`MapperFlush`] type that provides an easy way to flush the newly mapped page from the translation lookaside buffer (TLB) with its [`flush`] method. Like `Result`, the type uses the [`#[must_use]`] attribute to emit a warning when we accidentally forget to use it.
|
||||
|
||||
[`Result`]: https://doc.rust-lang.org/core/result/enum.Result.html
|
||||
[`expect`]: https://doc.rust-lang.org/core/result/enum.Result.html#method.expect
|
||||
[`MapperFlush`]: https://docs.rs/x86_64/0.5.2/x86_64/structures/paging/mapper/struct.MapperFlush.html
|
||||
[`flush`]: https://docs.rs/x86_64/0.5.2/x86_64/structures/paging/mapper/struct.MapperFlush.html#method.flush
|
||||
[`#[must_use]`]: https://doc.rust-lang.org/std/result/#results-must-be-used
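To illustrate how a `#[must_use]` return type nudges callers towards flushing, here is a simplified host-side model. `PendingFlush` and `map_page_model` are hypothetical stand-ins invented for this sketch. They are not the `x86_64` crate's types and do not touch the TLB.

```rust
/// Hypothetical token that represents a page whose TLB entry still needs
/// to be flushed. Dropping it without calling a method triggers a warning.
#[must_use = "the page table was changed, so the page must be flushed or explicitly ignored"]
pub struct PendingFlush {
    page_addr: u64,
}

impl PendingFlush {
    /// Consumes the token. A real kernel would invalidate the TLB entry here.
    /// This model just returns the page address so the effect is observable.
    pub fn flush(self) -> u64 {
        self.page_addr
    }

    /// Consumes the token without flushing.
    pub fn ignore(self) {}
}

/// Models a fallible mapping operation that hands out a flush token.
pub fn map_page_model(page_addr: u64) -> Result<PendingFlush, &'static str> {
    if page_addr % 4096 != 0 {
        return Err("page address is not 4 KiB aligned");
    }
    Ok(PendingFlush { page_addr })
}
```

With this model, `map_page_model(0x1000).expect("map failed").flush()` mirrors the shape of the call chain in `create_example_mapping`.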
|
||||
|
||||
Since we know that no new page tables are required for the address `0x1000`, a frame allocator that always returns `None` suffices. We create such an `EmptyFrameAllocator` for testing our mapping function:
|
||||
|
||||
```rust
// in src/memory.rs

/// A FrameAllocator that always returns `None`.
pub struct EmptyFrameAllocator;

impl FrameAllocator<Size4KiB> for EmptyFrameAllocator {
    fn allocate_frame(&mut self) -> Option<PhysFrame> {
        None
    }
}
```
|
||||
|
||||
(If you're getting a 'method `allocate_frame` is not a member of trait `FrameAllocator`' error, you need to update `x86_64` to version 0.4.0.)
|
||||
|
||||
We can now test the new mapping function in our `main.rs`:
|
||||
|
||||
```rust
// in src/main.rs

#[cfg(not(test))]
#[no_mangle]
pub extern "C" fn _start() -> ! {
    […] // initialize GDT, IDT, PICS

    // `self` is imported too because we call `memory::init` below
    use blog_os::memory::{self, create_example_mapping, EmptyFrameAllocator};

    const LEVEL_4_TABLE_ADDR: usize = 0o_177777_777_777_777_777_0000;
    let mut recursive_page_table = unsafe { memory::init(LEVEL_4_TABLE_ADDR) };

    create_example_mapping(&mut recursive_page_table, &mut EmptyFrameAllocator);
    unsafe { (0x1900 as *mut u64).write_volatile(0xf021_f077_f065_f04e)};

    println!("It did not crash!");
    blog_os::hlt_loop();
}
```
|
||||
|
||||
We first create the mapping for the page at `0x1000` by calling our `create_example_mapping` function with a mutable reference to the `RecursivePageTable` instance. This maps the page `0x1000` to the VGA text buffer, so we should see any write to it on the screen.
|
||||
|
||||
Then we write the value `0xf021_f077_f065_f04e` to this page, which represents the string _"New!"_ on a white background. We don't write directly to the beginning of the page at `0x1000` since the top line is immediately shifted off the screen by the next `println`. Instead, we write to offset `0x900`, which is about in the middle of the screen. As we learned [in the _“VGA Text Mode”_ post], writes to the VGA buffer should be volatile, so we use the [`write_volatile`] method.
|
||||
|
||||
[in the _“VGA Text Mode”_ post]: @/edition-2/posts/03-vga-text-buffer/index.md#volatile
|
||||
[`write_volatile`]: https://doc.rust-lang.org/std/primitive.pointer.html#method.write_volatile
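The magic value and the offset can be decoded with a few lines of host-side Rust. The helper names below are invented for this sketch, and it assumes the standard 80×25 text mode with two bytes (character, attribute) per cell that we used in the VGA post.

```rust
/// Splits a 64-bit value written to the VGA text buffer into four
/// (character, attribute) cells, in the order they appear on screen.
fn decode_vga_cells(value: u64) -> [(char, u8); 4] {
    // x86 is little-endian, so the least significant byte is stored first.
    let bytes = value.to_le_bytes();
    let mut cells = [(' ', 0u8); 4];
    for (i, cell) in cells.iter_mut().enumerate() {
        *cell = (bytes[2 * i] as char, bytes[2 * i + 1]);
    }
    cells
}

/// Converts a byte offset into the text buffer to a (row, column) pair,
/// assuming 80 columns and 2 bytes per cell.
fn offset_to_position(offset: usize) -> (usize, usize) {
    let cell = offset / 2;
    (cell / 80, cell % 80)
}
```

Every cell carries the attribute byte `0xf0` (background `0xf`, foreground `0x0`), and offset `0x900` lands in row 14 of the 25 rows, column 32.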
|
||||
|
||||
When we run it in QEMU, we see the following output:
|
||||
|
||||

|
||||
|
||||
The _"New!"_ on the screen is caused by our write to `0x1900`, which means that we successfully created a new mapping in the page tables.
|
||||
|
||||
This only worked because there was already a level 1 table for mapping page `0x1000`. When we try to map a page for which no level 1 table exists yet, the `map_to` function fails because it tries to allocate frames from the `EmptyFrameAllocator` for creating new page tables. We can see that happen when we try to map page `0xdeadbeaf000` instead of `0x1000`:
|
||||
|
||||
```rust
// in src/memory.rs

pub fn create_example_mapping(…) {
    […]
    let page: Page = Page::containing_address(VirtAddr::new(0xdeadbeaf000));
    […]
}

// in src/main.rs

#[no_mangle]
pub extern "C" fn _start() -> ! {
    […]
    unsafe { (0xdeadbeaf900 as *mut u64).write_volatile(0xf021_f077_f065_f04e)};
    […]
}
```
|
||||
|
||||
When we run it, a panic with the following error message occurs:
|
||||
|
||||
```
panicked at 'map_to failed: FrameAllocationFailed', /…/result.rs:999:5
```
|
||||
|
||||
To map pages that don't have a level 1 page table yet we need to create a proper `FrameAllocator`. But how do we know which frames are unused and how much physical memory is available?
|
||||
|
||||
### Boot Information
|
||||
|
||||
The amount of physical memory and the memory regions reserved by devices like the VGA hardware vary between different machines. Only the BIOS or UEFI firmware knows exactly which memory regions can be used by the operating system and which regions are reserved. Both firmware standards provide functions to retrieve the memory map, but they can only be called very early in the boot process. For this reason, the bootloader already queries this and other information from the firmware.
|
||||
|
||||
To communicate this information to our kernel, the bootloader passes a reference to a boot information structure as an argument when calling our `_start` function. Right now we don't have this argument declared in our function, so it is ignored. Let's add it:
|
||||
|
||||
```rust
// in src/main.rs

use bootloader::bootinfo::BootInfo;

#[cfg(not(test))]
#[no_mangle]
pub extern "C" fn _start(boot_info: &'static BootInfo) -> ! { // new argument
    […]
}
```
|
||||
|
||||
The [`BootInfo`] struct is still in an early stage, so expect some breakage when updating to future [semver-incompatible] bootloader versions. It currently has the three fields `p4_table_addr`, `memory_map`, and `package`:
|
||||
|
||||
[`BootInfo`]: https://docs.rs/bootloader/0.3.11/bootloader/bootinfo/struct.BootInfo.html
|
||||
[semver-incompatible]: https://doc.rust-lang.org/stable/cargo/reference/specifying-dependencies.html#caret-requirements
|
||||
|
||||
- The `p4_table_addr` field contains the recursive virtual address of the level 4 page table. By using this field we can avoid hardcoding the address `0o_177777_777_777_777_777_0000`.
|
||||
- The `memory_map` field is most interesting to us since it contains a list of all memory regions and their type (i.e. unused, reserved, or other).
|
||||
- The `package` field is an in-progress feature to bundle additional data with the bootloader. The implementation is not finished, so we can ignore this field for now.
|
||||
|
||||
Before we use the `memory_map` field to create a proper `FrameAllocator`, we want to ensure that we can't use a `boot_info` argument of the wrong type.
|
||||
|
||||
#### The `entry_point` Macro
|
||||
|
||||
Since our `_start` function is called externally from the bootloader, no checking of our function signature occurs. This means that we could let it take arbitrary arguments without any compilation errors, but it would fail or cause undefined behavior at runtime.
|
||||
|
||||
To make sure that the entry point function always has the correct signature that the bootloader expects, the `bootloader` crate provides an [`entry_point`] macro that provides a type-checked way to define a Rust function as the entry point. Let's rewrite our entry point function to use this macro:
|
||||
|
||||
[`entry_point`]: https://docs.rs/bootloader/0.3.12/bootloader/macro.entry_point.html
|
||||
|
||||
```rust
// in src/main.rs

use bootloader::{bootinfo::BootInfo, entry_point};

entry_point!(kernel_main);

#[cfg(not(test))]
fn kernel_main(boot_info: &'static BootInfo) -> ! {
    […] // initialize GDT, IDT, PICS

    let mut recursive_page_table = unsafe {
        memory::init(boot_info.p4_table_addr as usize)
    };

    […] // create and test example mapping

    println!("It did not crash!");
    blog_os::hlt_loop();
}
```
|
||||
|
||||
We no longer need to use `extern "C"` or `no_mangle` for our entry point, as the macro defines the real lower level `_start` entry point for us. The `kernel_main` function is now a completely normal Rust function, so we can choose an arbitrary name for it. The important thing is that it is type-checked so that a compilation error occurs when we now try to modify the function signature in any way, for example adding an argument or changing the argument type.
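To get an intuition for how a macro can enforce a signature, consider the following simplified model. It is only a sketch: `Info`, `checked_entry!`, and `call_entry` are hypothetical names, the wrapper returns a `u64` instead of diverging so that it can run on a host system, and it does not export a `_start` symbol like the real macro does.

```rust
/// Hypothetical stand-in for the boot information structure.
pub struct Info {
    pub p4_table_addr: u64,
}

/// Simplified model of a type-checked entry point macro. It generates a
/// wrapper that can only be compiled if the given function has exactly
/// the expected signature.
macro_rules! checked_entry {
    ($path:path) => {
        pub fn call_entry(info: &'static Info) -> u64 {
            // This assignment is the type check: it fails to compile if
            // `$path` takes other arguments or returns a different type.
            let f: fn(&'static Info) -> u64 = $path;
            f(info)
        }
    };
}

fn my_entry(info: &'static Info) -> u64 {
    info.p4_table_addr
}

checked_entry!(my_entry);
```

Changing `my_entry` to take, say, a `&'static u64` would turn the mismatch into a compile error at the `checked_entry!` invocation instead of undefined behavior at runtime.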
|
||||
|
||||
Note that we now pass `boot_info.p4_table_addr` instead of a hardcoded address to our `memory::init`. Thus our code continues to work even if a future version of the bootloader chooses a different entry of the level 4 page table for the recursive mapping.
|
||||
|
||||
### Allocating Frames
|
||||
|
||||
Now that we have access to the memory map through the boot information we can create a proper frame allocator on top. We start with a generic skeleton:
|
||||
|
||||
```rust
// in src/memory.rs

pub struct BootInfoFrameAllocator<I> where I: Iterator<Item = PhysFrame> {
    frames: I,
}

impl<I> FrameAllocator<Size4KiB> for BootInfoFrameAllocator<I>
    where I: Iterator<Item = PhysFrame>
{
    fn allocate_frame(&mut self) -> Option<PhysFrame> {
        self.frames.next()
    }
}
```
|
||||
|
||||
The `frames` field can be initialized with an arbitrary [`Iterator`] of frames. This allows us to just delegate `allocate_frame` calls to the [`Iterator::next`] method.
|
||||
|
||||
[`Iterator`]: https://doc.rust-lang.org/core/iter/trait.Iterator.html
|
||||
[`Iterator::next`]: https://doc.rust-lang.org/core/iter/trait.Iterator.html#tymethod.next
|
||||
|
||||
The initialization of the `BootInfoFrameAllocator` happens in a new `init_frame_allocator` function:
|
||||
|
||||
```rust
// in src/memory.rs

use bootloader::bootinfo::{MemoryMap, MemoryRegionType};

/// Create a FrameAllocator from the passed memory map
pub fn init_frame_allocator(
    memory_map: &'static MemoryMap,
) -> BootInfoFrameAllocator<impl Iterator<Item = PhysFrame>> {
    // get usable regions from memory map
    let regions = memory_map
        .iter()
        .filter(|r| r.region_type == MemoryRegionType::Usable);
    // map each region to its address range
    let addr_ranges = regions.map(|r| r.range.start_addr()..r.range.end_addr());
    // transform to an iterator of frame start addresses
    let frame_addresses = addr_ranges.flat_map(|r| r.into_iter().step_by(4096));
    // create `PhysFrame` types from the start addresses
    let frames = frame_addresses.map(|addr| {
        PhysFrame::containing_address(PhysAddr::new(addr))
    });

    BootInfoFrameAllocator { frames }
}
```
|
||||
|
||||
This function uses iterator combinator methods to transform the initial `MemoryMap` into an iterator of usable physical frames:
|
||||
|
||||
- First, we call the `iter` method to convert the memory map to an iterator of [`MemoryRegion`]s. Then we use the [`filter`] method to skip any reserved or otherwise unavailable regions. The bootloader updates the memory map for all the mappings it creates, so frames that are used by our kernel (code, data or stack) or to store the boot information are already marked as `InUse` or similar. Thus we can be sure that `Usable` frames are not used somewhere else.
|
||||
- In the second step, we use the [`map`] combinator and Rust's [range syntax] to transform our iterator of memory regions to an iterator of address ranges.
|
||||
- The third step is the most complicated: We convert each range to an iterator through the `into_iter` method and then choose every 4096th address using [`step_by`]. Since 4096 bytes (= 4 KiB) is the page size, we get the start address of each frame. The bootloader page aligns all usable memory areas so that we don't need any alignment or rounding code here. By using [`flat_map`] instead of `map`, we get an `Iterator<Item = u64>` instead of an `Iterator<Item = Iterator<Item = u64>>`.
|
||||
- In the final step, we convert the start addresses to `PhysFrame` types to construct the desired `Iterator<Item = PhysFrame>`. We then use this iterator to create and return a new `BootInfoFrameAllocator`.
|
||||
|
||||
[`MemoryRegion`]: https://docs.rs/bootloader/0.3.12/bootloader/bootinfo/struct.MemoryRegion.html
|
||||
[`filter`]: https://doc.rust-lang.org/core/iter/trait.Iterator.html#method.filter
|
||||
[`map`]: https://doc.rust-lang.org/core/iter/trait.Iterator.html#method.map
|
||||
[range syntax]: https://doc.rust-lang.org/core/ops/struct.Range.html
|
||||
[`step_by`]: https://doc.rust-lang.org/core/iter/trait.Iterator.html#method.step_by
|
||||
[`flat_map`]: https://doc.rust-lang.org/core/iter/trait.Iterator.html#method.flat_map
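The iterator pipeline described above can be tried out on a host system with a simplified memory map. The `Region` struct and `usable_frame_addrs` function are hypothetical stand-ins for the bootloader's `MemoryRegion` type and our `init_frame_allocator` function. They yield plain `u64` start addresses instead of `PhysFrame`s.

```rust
/// Hypothetical, simplified memory region: the half-open address range
/// `start..end` plus a flag that says whether the region is usable.
#[derive(Clone, Copy)]
struct Region {
    start: u64,
    end: u64,
    usable: bool,
}

/// Returns the start addresses of all 4 KiB frames in usable regions.
/// Assumes that all region boundaries are page aligned.
fn usable_frame_addrs(regions: &[Region]) -> impl Iterator<Item = u64> + '_ {
    regions
        .iter()
        // step 1: skip reserved or otherwise unavailable regions
        .filter(|r| r.usable)
        // step 2: turn each region into an address range
        .map(|r| r.start..r.end)
        // step 3: pick every 4096th address and flatten the nested iterators
        .flat_map(|range| range.step_by(4096))
}
```

For a map with a usable region `0x0..0x3000`, a reserved region `0x3000..0x5000`, and a usable region `0x10000..0x12000`, the iterator yields five frame start addresses and skips the reserved range entirely.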
|
||||
|
||||
We can now modify our `kernel_main` function to pass a `BootInfoFrameAllocator` instance instead of an `EmptyFrameAllocator`:
|
||||
|
||||
```rust
// in src/main.rs

#[cfg(not(test))]
fn kernel_main(boot_info: &'static BootInfo) -> ! {
    […] // initialize GDT, IDT, PICS

    use blog_os::memory;

    let mut recursive_page_table = unsafe {
        memory::init(boot_info.p4_table_addr as usize)
    };
    // new
    let mut frame_allocator = memory::init_frame_allocator(&boot_info.memory_map);

    memory::create_example_mapping(&mut recursive_page_table, &mut frame_allocator);
    unsafe { (0xdeadbeaf900 as *mut u64).write_volatile(0xf021_f077_f065_f04e)};

    println!("It did not crash!");
    blog_os::hlt_loop();
}
```
|
||||
|
||||
Now the mapping succeeds and we see the black-on-white _"New!"_ on the screen again. Behind the scenes, the `map_to` method creates the missing page tables in the following way:
|
||||
|
||||
- Allocate an unused frame from the passed `frame_allocator`.
|
||||
- Map the entry of the higher level table to that frame. Now the frame is accessible through the recursive page table.
|
||||
- Zero the frame to create a new, empty page table.
|
||||
- Continue with the next table level.
|
||||
|
||||
While our `create_example_mapping` function is just some example code, we are now able to create new mappings for arbitrary pages. This will be essential for allocating memory or implementing multithreading in future posts.
|
||||
|
||||
## Summary
|
||||
|
||||
In this post we learned how a recursive level 4 table entry can be used to map all page table frames to calculable virtual addresses. We used this technique to implement an address translation function and to create a new mapping in the page tables.
|
||||
|
||||
We saw that the creation of new mappings requires unused frames for creating new page tables. Such a frame allocator can be implemented on top of the boot information structure that the bootloader passes to our kernel.
|
||||
|
||||
## What's next?
|
||||
|
||||
The next post will create a heap memory region for our kernel, which will allow us to [allocate memory] and use various [collection types].
|
||||
|
||||
[allocate memory]: https://doc.rust-lang.org/alloc/boxed/struct.Box.html
|
||||
[collection types]: https://doc.rust-lang.org/alloc/collections/index.html
|
||||
|
6
blog/content/edition-2/posts/deprecated/_index.md
Normal file
@@ -0,0 +1,6 @@
|
||||
+++
|
||||
title = "Deprecated Posts"
|
||||
sort_by = "weight"
|
||||
insert_anchor_links = "left"
|
||||
render = false
|
||||
+++
|
||||