+++
title = "Minimal Kernel"
weight = 1
path = "minimal-kernel"
date = 0000-01-01
draft = true
[extra]
chapter = "Bare Bones"
icon = '''
'''
+++
The first step in creating our own operating system kernel is to create a [bare metal] Rust executable that does not depend on an underlying operating system.
For that we need to disable most of Rust's standard library and adjust various compilation settings.
The result is a minimal operating system kernel that forms the base for the following posts of this series.
[bare metal]: https://en.wikipedia.org/wiki/Bare_machine
This blog is openly developed on [GitHub].
If you have any problems or questions, please open an issue there.
You can also leave comments [at the bottom].
The complete source code for this post can be found in the [`post-01`][post branch] branch.
[GitHub]: https://github.com/phil-opp/blog_os
[at the bottom]: #comments
[post branch]: https://github.com/phil-opp/blog_os/tree/post-01
## Introduction
To write an operating system kernel, we need code that does not depend on any operating system features.
This means that we can't use threads, files, heap memory, the network, random numbers, standard output, or any other features requiring OS abstractions or specific hardware.
This makes sense, since we're trying to write our own OS and our own drivers.
While this means that we can't use most of the [Rust standard library], there are still a lot of Rust features that we _can_ use.
For example, we can use [iterators], [closures], [pattern matching], [option] and [result], [string formatting], and of course the [ownership system].
These features make it possible to write a kernel in a very expressive, high-level way without worrying about [undefined behavior] or [memory safety].
[option]: https://doc.rust-lang.org/core/option/
[result]: https://doc.rust-lang.org/core/result/
[Rust standard library]: https://doc.rust-lang.org/std/
[iterators]: https://doc.rust-lang.org/book/ch13-02-iterators.html
[closures]: https://doc.rust-lang.org/book/ch13-01-closures.html
[pattern matching]: https://doc.rust-lang.org/book/ch06-00-enums.html
[string formatting]: https://doc.rust-lang.org/core/macro.write.html
[ownership system]: https://doc.rust-lang.org/book/ch04-00-understanding-ownership.html
[undefined behavior]: https://www.nayuki.io/page/undefined-behavior-in-c-and-cplusplus-programs
[memory safety]: https://tonyarcieri.com/it-s-time-for-a-memory-safety-intervention
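To get a feel for how far the `core` library alone takes us, here is a small sketch that combines several of these features. Note that `StackBuf` and `describe` are hypothetical names invented for this illustration and are not part of the kernel we build in this series. The code only uses items from `core`, so it should behave the same in a `no_std` crate:

```rust
use core::fmt::{self, Write};

/// A fixed-capacity text buffer that lives entirely on the stack.
/// (`StackBuf` is a hypothetical helper for this sketch.)
struct StackBuf {
    bytes: [u8; 64],
    len: usize,
}

impl StackBuf {
    fn new() -> Self {
        StackBuf { bytes: [0; 64], len: 0 }
    }

    fn as_str(&self) -> &str {
        // Only whole `&str` values are ever copied in, so the content is valid UTF-8.
        core::str::from_utf8(&self.bytes[..self.len]).unwrap_or("")
    }
}

impl Write for StackBuf {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        let end = self.len.checked_add(s.len()).ok_or(fmt::Error)?;
        // Reject writes that do not fit instead of truncating them.
        let dst = self.bytes.get_mut(self.len..end).ok_or(fmt::Error)?;
        dst.copy_from_slice(s.as_bytes());
        self.len = end;
        Ok(())
    }
}

/// Sums the even values with an iterator and a closure, then formats the
/// result without any heap allocation.
fn describe(values: &[u32]) -> Result<StackBuf, fmt::Error> {
    let even_sum: u32 = values.iter().filter(|v| **v % 2 == 0).sum();
    let mut buf = StackBuf::new();
    match values.first() {
        Some(first) => write!(buf, "first={} even_sum={}", first, even_sum)?,
        None => write!(buf, "empty")?,
    }
    Ok(buf)
}
```

Calling `describe(&[3, 4, 6])` fills the buffer with `first=3 even_sum=10`. Iterators, closures, pattern matching, `Option`, `Result`, and string formatting all appear here, and none of them needs an operating system.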
In order to create a minimal OS kernel in Rust, we start by creating an executable that can be run without an underlying operating system.
Such an executable is often called a “freestanding” or “bare-metal” executable.
We then make this executable compatible with the early-boot environment of the `x86_64` architecture so that we can boot it as an operating system kernel.
## Disabling the Standard Library
By default, all Rust crates link the [standard library], which depends on the operating system for features such as threads, files, or networking.
It also depends on the C standard library `libc`, which closely interacts with OS services.
Since our plan is to write an operating system, we cannot use any OS-dependent libraries.
So we have to disable the automatic inclusion of the standard library, which we can do through the [`no_std` attribute].
[standard library]: https://doc.rust-lang.org/std/
[`no_std` attribute]: https://doc.rust-lang.org/1.30.0/book/first-edition/using-rust-without-the-standard-library.html
We start by creating a new cargo application project.
The easiest way to do this is through the command line:
```
cargo new blog_os --bin --edition 2018
```
I named the project `blog_os`, but of course you can choose your own name.
The `--bin` flag specifies that we want to create an executable binary (in contrast to a library) and the `--edition 2018` flag specifies that we want to use the [2018 edition] of Rust for our crate.
When we run the command, cargo creates the following directory structure for us:
[2018 edition]: https://doc.rust-lang.org/nightly/edition-guide/rust-2018/index.html
```
blog_os
├── Cargo.toml
└── src
    └── main.rs
```
The `Cargo.toml` contains the crate configuration, for example the crate name, the author, the [semantic version] number, and dependencies.
The `src/main.rs` file contains the root module of our crate and our `main` function.
You can compile your crate through `cargo build` and then run the compiled `blog_os` binary in the `target/debug` subfolder.
[semantic version]: https://semver.org/
### The `no_std` Attribute
Right now our crate implicitly links the standard library.
Let's try to disable this by adding the [`no_std` attribute]:
```rust
// main.rs
#![no_std]
fn main() {
    println!("Hello, world!");
}
```
When we try to build it now (by running `cargo build`), the following error occurs:
```
error: cannot find macro `println!` in this scope
 --> src/main.rs:4:5
  |
4 |     println!("Hello, world!");
  |     ^^^^^^^
```
The reason for this error is that the [`println` macro] is part of the standard library, which we no longer include.
So we can no longer print things.
This makes sense, since `println` writes to [standard output], which is a special file descriptor provided by the operating system.
[`println` macro]: https://doc.rust-lang.org/std/macro.println.html
[standard output]: https://en.wikipedia.org/wiki/Standard_streams#Standard_output_.28stdout.29
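To make this dependency more visible, the following sketch shows roughly what printing boils down to on a hosted system: formatting a message and writing it to a handle that the operating system provides. The names `greet` and `greet_stdout` are made up for this illustration, and the real `println` implementation differs in its details (for example, it panics instead of returning an error):

```rust
use std::io::{self, Write};

/// Writes a greeting to any byte sink that implements `io::Write`.
fn greet<W: Write>(out: &mut W) -> io::Result<()> {
    writeln!(out, "Hello, world!")
}

/// Writes the greeting to standard output, a handle that only exists
/// because the operating system provides it to our process.
fn greet_stdout() -> io::Result<()> {
    let stdout = io::stdout();
    let mut handle = stdout.lock();
    greet(&mut handle)
}
```

Everything in `std::io` is unavailable to a `no_std` crate, which is why our kernel will need its own way to output text.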
So let's remove the printing and try again with an empty main function:
```rust
// main.rs
#![no_std]
fn main() {}
```
```
> cargo build
error: `#[panic_handler]` function required, but not found
error: language item required, but not found: `eh_personality`
```
Now the compiler is missing a `#[panic_handler]` function and a _language item_.
### Panic Implementation
The `panic_handler` attribute defines the function that the compiler should invoke when a [panic] occurs.
The standard library provides its own panic handler function, but in a `no_std` environment we need to define it ourselves:
[panic]: https://doc.rust-lang.org/stable/book/ch09-01-unrecoverable-errors-with-panic.html
```rust
// in main.rs
use core::panic::PanicInfo;
/// This function is called on panic.
#[panic_handler]
fn panic(_info: &PanicInfo) -> ! {
    loop {}
}
```
The [`PanicInfo` parameter][PanicInfo] contains the file and line where the panic happened and the optional panic message.
The function should never return, so it is marked as a [diverging function] by returning the [“never” type] `!`.
There is not much we can do in this function for now, so we just loop indefinitely.
[PanicInfo]: https://doc.rust-lang.org/nightly/core/panic/struct.PanicInfo.html
[diverging function]: https://doc.rust-lang.org/1.30.0/book/first-edition/functions.html#diverging-functions
[“never” type]: https://doc.rust-lang.org/nightly/std/primitive.never.html
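Diverging functions are not specific to panic handlers. As a small sketch (the names `fail` and `first_byte` are hypothetical and only serve this illustration), a function returning `!` can be used wherever a value of any other type is expected, because the compiler knows that the call never produces a value:

```rust
/// Diverging helper: it never returns to its caller.
fn fail(msg: &str) -> ! {
    panic!("{}", msg)
}

/// Returns the first byte of `data`, or diverges if there is none.
fn first_byte(data: &[u8]) -> u8 {
    match data.first() {
        Some(byte) => *byte,
        // `fail` has type `!`, which coerces to the `u8` expected here.
        None => fail("empty input"),
    }
}
```

In our kernel, a call such as `fail("...")` would end up in the panic handler defined above, which loops forever.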
After defining a panic handler, only the `eh_personality` language item error remains:
```
> cargo build
error: language item required, but not found: `eh_personality`
```
### The `eh_personality` Language Item
Language items are special functions and types that are required internally by the compiler.
For example, the [`Copy`] trait is a language item that tells the compiler which types have [_copy semantics_][`Copy`].
When we look at the [implementation][copy code], we see it has the special `#[lang = "copy"]` attribute that defines it as a language item.
[`Copy`]: https://doc.rust-lang.org/nightly/core/marker/trait.Copy.html
[copy code]: https://github.com/rust-lang/rust/blob/485397e49a02a3b7ff77c17e4a3f16c653925cb3/src/libcore/marker.rs#L296-L299
While providing custom implementations of language items is possible, it should only be done as a last resort.
The reason is that language items are highly unstable implementation details and not even type checked (so the compiler doesn't even check if a function has the right argument types).
Fortunately, there is a more stable way to fix the above language item error.
The [`eh_personality` language item] marks a function that is used for implementing [stack unwinding].
By default, Rust uses unwinding to run the destructors of all live stack variables in case of a [panic].
This ensures that all used memory is freed and allows the parent thread to catch the panic and continue execution.
Unwinding, however, is a complicated process and requires some OS-specific libraries (e.g. [libunwind] on Linux or [structured exception handling] on Windows), so we don't want to use it for our operating system.
[`eh_personality` language item]: https://github.com/rust-lang/rust/blob/edb368491551a77d77a48446d4ee88b35490c565/src/libpanic_unwind/gcc.rs#L11-L45
[stack unwinding]: https://www.bogotobogo.com/cplusplus/stackunwinding.php
[libunwind]: https://www.nongnu.org/libunwind/
[structured exception handling]: https://docs.microsoft.com/de-de/windows/win32/debug/structured-exception-handling
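To see what unwinding buys us on a hosted system, consider the following sketch. It uses the standard library, so it is not something we can run in our kernel, and the names `Guard`, `faulty`, and `run_and_observe` are made up for this illustration. It assumes the default `unwind` panic strategy:

```rust
use std::panic;
use std::sync::atomic::{AtomicBool, Ordering};

/// Set to `true` once the guard's destructor has run.
static DROPPED: AtomicBool = AtomicBool::new(false);

/// Hypothetical guard type whose only job is to record that it was dropped.
struct Guard;

impl Drop for Guard {
    fn drop(&mut self) {
        DROPPED.store(true, Ordering::SeqCst);
    }
}

/// Panics while a `Guard` is still alive on the stack.
fn faulty() {
    let _guard = Guard;
    panic!("something went wrong");
}

/// Runs `faulty` and reports whether the panic was caught and
/// whether the destructor ran during unwinding.
fn run_and_observe() -> (bool, bool) {
    let outcome = panic::catch_unwind(faulty);
    (outcome.is_err(), DROPPED.load(Ordering::SeqCst))
}
```

With unwinding enabled, `run_and_observe` should report that the panic was caught and that the destructor of `_guard` ran. With the `abort` strategy, the process would terminate at the `panic!` instead.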
#### Disabling Unwinding
There are other use cases as well for which unwinding is undesirable, so Rust provides an option to [abort on panic] instead.
This disables the generation of unwinding symbol information and thus considerably reduces binary size.
There are multiple ways to disable unwinding; the easiest is to add the following lines to our `Cargo.toml`:
```toml
[profile.dev]
panic = "abort"
[profile.release]
panic = "abort"
```
This sets the panic strategy to `abort` for both the `dev` profile (used for `cargo build`) and the `release` profile (used for `cargo build --release`).
Now the `eh_personality` language item should no longer be required.
[abort on panic]: https://github.com/rust-lang/rust/pull/32900
We have now fixed both of the above errors.
However, if we try to compile it now, another error occurs:
```
> cargo build
error: requires `start` lang_item
```
Our program is missing the `start` language item, which defines the entry point.
### The `start` Language Item
One might think that the `main` function is the first function called when a program is run.
However, most languages have a [runtime system], which is responsible for things such as garbage collection (e.g. in Java) or software threads (e.g. goroutines in Go).
This runtime needs to be called before `main`, since it needs to initialize itself.
[runtime system]: https://en.wikipedia.org/wiki/Runtime_system
In a typical Rust binary that links the standard library, execution starts in a C runtime library called [`crt0`] (“C runtime zero”), which sets up the environment for a C application.
This includes creating a [call stack] and placing the command line arguments in the right CPU registers.
The C runtime then invokes the [entry point of the Rust runtime][rt::lang_start], which is marked by the `start` language item.
Rust only has a very minimal runtime, which takes care of some small things such as setting up stack overflow guards or printing a backtrace on panic.
The runtime then finally calls the `main` function.
[`crt0`]: https://en.wikipedia.org/wiki/Crt0
[call stack]: https://en.wikipedia.org/wiki/Call_stack
[rt::lang_start]: https://github.com/rust-lang/rust/blob/0d97f7a96877a96015d70ece41ad08bb7af12377/library/std/src/rt.rs#L59-L70
Our freestanding executable does not have access to the Rust runtime and `crt0`, so we need to define our own entry point.
Implementing the `start` language item wouldn't help, since it would still require `crt0`.
Instead, we need to overwrite the `crt0` entry point directly.
#### Overwriting the Entry Point
To tell the Rust compiler that we don't want to use the normal entry point chain, we add the `#![no_main]` attribute.
```rust
#![no_std]
#![no_main]
use core::panic::PanicInfo;
/// This function is called on panic.
#[panic_handler]
fn panic(_info: &PanicInfo) -> ! {
    loop {}
}
```
You might notice that we removed the `main` function.
The reason is that a `main` doesn't make sense without an underlying runtime that calls it.
Instead, we are now overwriting the operating system entry point with our own `_start` function:
```rust
#[no_mangle]
pub extern "C" fn _start() -> ! {
    loop {}
}
```
By using the `#[no_mangle]` attribute we disable the [name mangling] to ensure that the Rust compiler really outputs a function with the name `_start`.
Without the attribute, the compiler would generate some cryptic `_ZN3blog_os4_start7hb173fedf945531caE` symbol to give every function a unique name.
The reason for naming the function `_start` is that this is the default entry point name for most systems.
We mark the function as `extern "C"` to tell the compiler that it should use the [C calling convention] for this function (instead of the unspecified Rust calling convention).
The `!` return type means that the function is diverging, i.e. not allowed to ever return.
This is required because the entry point is not called by any function, but invoked directly by the operating system or bootloader.
So instead of returning, the entry point should e.g. invoke the [`exit` system call] of the operating system.
In our case, shutting down the machine could be a reasonable action, since there's nothing left to do if a freestanding binary returns.
For now, we fulfill the requirement by looping endlessly.
[name mangling]: https://en.wikipedia.org/wiki/Name_mangling
[C calling convention]: https://en.wikipedia.org/wiki/Calling_convention
[`exit` system call]: https://en.wikipedia.org/wiki/Exit_(system_call)
When we run `cargo build` now, we get an ugly _linker_ error.
## Linker Errors
The [linker] is a program that combines the generated code into an executable.
Since the executable format differs between Linux, Windows, and macOS, each system has its own linker that throws a different error.
The fundamental cause of the errors is the same: the default configuration of the linker assumes that our program depends on the C runtime, which it does not.
To solve the errors, we need to tell the linker that we want to build for a bare-metal target, where no underlying operating system or C runtime exists.
As an alternative, it is also possible to disable the linking of the C runtime by passing a certain set of arguments to the linker.
### Linker Arguments
Linkers are very complex programs with a lot of configuration options.
Each of the major operating systems (Linux, Windows, macOS) has its own linker implementation with different options, but all of them provide a way to disable the linking of the C runtime.
By using these options, it is possible to create a freestanding executable that still runs on top of an existing operating system.
_This is not what we want for our kernel, so this section is only provided for completeness.
Feel free to skip this section if you like._
In the subsections below, we explain the required linker arguments for each operating system.
It's worth noting that creating a freestanding executable this way is probably not a good idea.
The reason is that our executable still expects various things, for example that a stack is initialized when the `_start` function is called.
Without the C runtime, some of these requirements might not be fulfilled, which might cause our program to fail, e.g. by causing a segmentation fault.
If you want to create a minimal binary that runs on top of an existing operating system, including `libc` and setting the `#[start]` attribute as described [here](https://doc.rust-lang.org/1.16.0/book/no-stdlib.html) is probably a better idea.
#### Linux
On Linux the following linker error occurs (shortened):
```
error: linking with `cc` failed: exit code: 1
  |
  = note: "cc" […]
  = note: /usr/lib/gcc/../x86_64-linux-gnu/Scrt1.o: In function `_start':
          (.text+0x12): undefined reference to `__libc_csu_fini'
          /usr/lib/gcc/../x86_64-linux-gnu/Scrt1.o: In function `_start':
          (.text+0x19): undefined reference to `__libc_csu_init'
          /usr/lib/gcc/../x86_64-linux-gnu/Scrt1.o: In function `_start':
          (.text+0x25): undefined reference to `__libc_start_main'
          collect2: error: ld returned 1 exit status
```
The problem is that the linker includes the startup routine of the C runtime by default, which is also called `_start`.
It requires some symbols of the C standard library `libc` that we don't include due to the `no_std` attribute, so the linker can't resolve these references.
To solve this, we can tell the linker that it should not link the C startup routine by passing the `-nostartfiles` flag.
One way to pass linker attributes via cargo is the `cargo rustc` command.
The command behaves exactly like `cargo build`, but allows passing options to `rustc`, the underlying Rust compiler.
`rustc` has the `-C link-arg` flag, which passes an argument to the linker.
Combined, our new build command looks like this:
```
cargo rustc -- -C link-arg=-nostartfiles
```
Now our crate builds as a freestanding executable on Linux!
We didn't need to specify the name of our entry point function explicitly since the linker looks for a function with the name `_start` by default.
#### Windows
On Windows, the following linker error occurs (shortened):
```
error: linking with `link.exe` failed: exit code: 1561
  |
  = note: "C:\\Program Files (x86)\\…\\link.exe" […]
  = note: LINK : fatal error LNK1561: entry point must be defined
```
The "entry point must be defined" error means that the linker can't find the entry point.
On Windows, the default entry point name [depends on the used subsystem][windows-subsystems].
For the `CONSOLE` subsystem the linker looks for a function named `mainCRTStartup` and for the `WINDOWS` subsystem it looks for a function named `WinMainCRTStartup`.
To override the default and tell the linker to look for our `_start` function instead, we can pass an `/ENTRY` argument to the linker:
[windows-subsystems]: https://docs.microsoft.com/en-us/cpp/build/reference/entry-entry-point-symbol
```
cargo rustc -- -C link-arg=/ENTRY:_start
```
From the different argument format we clearly see that the Windows linker is a completely different program than the Linux linker.
Now a different linker error occurs:
```
error: linking with `link.exe` failed: exit code: 1221
  |
  = note: "C:\\Program Files (x86)\\…\\link.exe" […]
  = note: LINK : fatal error LNK1221: a subsystem can't be inferred and must be
          defined
```
This error occurs because Windows executables can use different [subsystems][windows-subsystems].
For normal programs they are inferred depending on the entry point name: If the entry point is named `main`, the `CONSOLE` subsystem is used, and if the entry point is named `WinMain`, the `WINDOWS` subsystem is used.
Since our `_start` function has a different name, we need to specify the subsystem explicitly:
```
cargo rustc -- -C link-args="/ENTRY:_start /SUBSYSTEM:console"
```
We use the `CONSOLE` subsystem here, but the `WINDOWS` subsystem would work too.
Instead of passing `-C link-arg` multiple times, we use `-C link-args`, which takes a space-separated list of arguments.
With this command, our executable should build successfully on Windows.
#### macOS
On macOS, the following linker error occurs (shortened):
```
error: linking with `cc` failed: exit code: 1
  |
  = note: "cc" […]
  = note: ld: entry point (_main) undefined.
          for architecture x86_64
          clang: error: linker command failed with exit code 1 […]
```
This error message tells us that the linker can't find an entry point function with the default name `main` (on macOS, all function names are prefixed with a `_` at the symbol level).
To set the entry point to our `_start` function, we pass the `-e` linker argument:
```
cargo rustc -- -C link-args="-e __start"
```
The `-e` flag specifies the name of the entry point function.
Since all functions have an additional `_` prefix on macOS, we need to set the entry point to `__start` instead of `_start`.
Now the following linker error occurs:
```
error: linking with `cc` failed: exit code: 1
  |
  = note: "cc" […]
  = note: ld: dynamic main executables must link with libSystem.dylib
          for architecture x86_64
          clang: error: linker command failed with exit code 1 […]
```
macOS [does not officially support statically linked binaries] and requires programs to link the `libSystem` library by default.
To override this and link a static binary, we pass the `-static` flag to the linker:
[does not officially support statically linked binaries]: https://developer.apple.com/library/archive/qa/qa1118/_index.html
```
cargo rustc -- -C link-args="-e __start -static"
```
This still does not suffice, as a third linker error occurs:
```
error: linking with `cc` failed: exit code: 1
  |
  = note: "cc" […]
  = note: ld: library not found for -lcrt0.o
          clang: error: linker command failed with exit code 1 […]
```
This error occurs because programs on macOS link to `crt0` (“C runtime zero”) by default.
This is similar to the error we had on Linux and can also be solved by adding the `-nostartfiles` linker argument:
```
cargo rustc -- -C link-args="-e __start -static -nostartfiles"
```
Now our program should build successfully on macOS.
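Instead of typing these flags on every build, the per-system arguments could also be emitted from a [build script](https://doc.rust-lang.org/cargo/reference/build-scripts.html). The following is only a sketch under a few assumptions: the helper names `link_args_for` and `emit_link_args` are made up for this illustration, the Windows arguments assume the MSVC linker shown above, and `emit_link_args` would need to be called from the `main` function of a `build.rs` file next to `Cargo.toml`:

```rust
use std::env;

/// Linker arguments from this section, keyed by the target operating system.
/// (`link_args_for` is a hypothetical helper for this sketch.)
fn link_args_for(target_os: &str) -> &'static [&'static str] {
    match target_os {
        "linux" => &["-nostartfiles"],
        // Assumes the MSVC toolchain (`link.exe`), as in the error output above.
        "windows" => &["/ENTRY:_start", "/SUBSYSTEM:console"],
        "macos" => &["-e", "__start", "-static", "-nostartfiles"],
        // Unknown systems: pass nothing and let the linker report what is missing.
        _ => &[],
    }
}

/// Meant to be called from `main` in `build.rs`.
fn emit_link_args() {
    // Cargo sets `CARGO_CFG_TARGET_OS` when it runs a build script.
    let target_os = env::var("CARGO_CFG_TARGET_OS").unwrap_or_default();
    for arg in link_args_for(&target_os) {
        // Each `cargo:rustc-link-arg` line forwards one argument to the linker.
        println!("cargo:rustc-link-arg={}", arg);
    }
}
```

With such a build script in place, a plain `cargo build` should pass the same arguments that we supplied manually through `cargo rustc` above. As mentioned before, this only produces a freestanding executable on top of an existing operating system, which is not what we want for our kernel.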