mirror of
https://github.com/phil-opp/blog_os.git
synced 2025-12-16 14:27:49 +00:00
fix: check writing of 10
also update the note about recent use-after-free vulnerabilities
This commit is contained in:
@@ -8,7 +8,7 @@ date = 2019-06-26
|
||||
chapter = "Memory Management"
|
||||
+++
|
||||
|
||||
This post adds support for heap allocation to our kernel. First, it gives an introduction to dynamic memory and shows how the borrow checker prevents common allocation errors. It then implements the basic allocation interface of Rust, creates a heap memory region, and sets up an allocator crate. At the end of this post all the allocation and collection types of the built-in `alloc` crate will be available to our kernel.
|
||||
This post adds support for heap allocation to our kernel. First, it gives an introduction to dynamic memory and shows how the borrow checker prevents common allocation errors. It then implements the basic allocation interface of Rust, creates a heap memory region, and sets up an allocator crate. At the end of this post, all the allocation and collection types of the built-in `alloc` crate will be available to our kernel.
|
||||
|
||||
<!-- more -->
|
||||
|
||||
@@ -32,15 +32,15 @@ Local variables are stored on the [call stack], which is a [stack data structure
|
||||
[call stack]: https://en.wikipedia.org/wiki/Call_stack
|
||||
[stack data structure]: https://en.wikipedia.org/wiki/Stack_(abstract_data_type)
|
||||
|
||||

|
||||

|
||||
|
||||
The above example shows the call stack after an `outer` function called an `inner` function. We see that the call stack contains the local variables of `outer` first. On the `inner` call, the parameter `1` and the return address for the function were pushed. Then control was transferred to `inner`, which pushed its local variables.
|
||||
The above example shows the call stack after the `outer` function called the `inner` function. We see that the call stack contains the local variables of `outer` first. On the `inner` call, the parameter `1` and the return address for the function were pushed. Then control was transferred to `inner`, which pushed its local variables.
|
||||
|
||||
After the `inner` function returns, its part of the call stack is popped again and only the local variables of `outer` remain:
|
||||
|
||||

|
||||

|
||||
|
||||
We see that the local variables of `inner` only live until the function returns. The Rust compiler enforces these lifetimes and throws an error when we use a value too long, for example when we try to return a reference to a local variable:
|
||||
We see that the local variables of `inner` only live until the function returns. The Rust compiler enforces these lifetimes and throws an error when we use a value for too long, for example when we try to return a reference to a local variable:
|
||||
|
||||
```rust
|
||||
fn inner(i: usize) -> &'static u32 {
|
||||
@@ -59,16 +59,16 @@ While returning a reference makes no sense in this example, there are cases wher
|
||||
|
||||
Static variables are stored at a fixed memory location separate from the stack. This memory location is assigned at compile time by the linker and encoded in the executable. Statics live for the complete runtime of the program, so they have the `'static` lifetime and can always be referenced from local variables:
|
||||
|
||||
![The same outer/inner example with the difference that inner has a `static Z: [u32; 3] = [1,2,3];` and returns a `&Z[i]` reference](call-stack-static.svg)
|
||||
![The same outer/inner example, except that inner has a `static Z: [u32; 3] = [1,2,3];` and returns a `&Z[i]` reference](call-stack-static.svg)
|
||||
|
||||
When the `inner` function returns in the above example, it's part of the call stack is destroyed. The static variables live in a separate memory range that is never destroyed, so the `&Z[1]` reference is still valid after the return.
|
||||
When the `inner` function returns in the above example, its part of the call stack is destroyed. The static variables live in a separate memory range that is never destroyed, so the `&Z[1]` reference is still valid after the return.
|
||||
|
||||
Apart from the `'static` lifetime, static variables also have the useful property that their location is known at compile time, so that no reference is needed for accessing it. We utilized that property for our `println` macro: By using a [static `Writer`] internally there is no `&mut Writer` reference needed to invoke the macro, which is very useful in [exception handlers] where we don't have access to any additional variables.
|
||||
Apart from the `'static` lifetime, static variables also have the useful property that their location is known at compile time, so that no reference is needed for accessing them. We utilized that property for our `println` macro: By using a [static `Writer`] internally, there is no `&mut Writer` reference needed to invoke the macro, which is very useful in [exception handlers], where we don't have access to any additional variables.
|
||||
|
||||
[static `Writer`]: @/edition-2/posts/03-vga-text-buffer/index.md#a-global-interface
|
||||
[exception handlers]: @/edition-2/posts/05-cpu-exceptions/index.md#implementation
|
||||
|
||||
However, this property of static variables brings a crucial drawback: They are read-only by default. Rust enforces this because a [data race] would occur if e.g. two threads modify a static variable at the same time. The only way to modify a static variable is to encapsulate it in a [`Mutex`] type, which ensures that only a single `&mut` reference exists at any point in time. We already used a `Mutex` for our [static VGA buffer `Writer`][vga mutex].
|
||||
However, this property of static variables brings a crucial drawback: they are read-only by default. Rust enforces this because a [data race] would occur if, e.g., two threads modified a static variable at the same time. The only way to modify a static variable is to encapsulate it in a [`Mutex`] type, which ensures that only a single `&mut` reference exists at any point in time. We already used a `Mutex` for our [static VGA buffer `Writer`][vga mutex].
|
||||
|
||||
[data race]: https://doc.rust-lang.org/nomicon/races.html
|
||||
[`Mutex`]: https://docs.rs/spin/0.5.2/spin/struct.Mutex.html
|
||||
@@ -89,9 +89,9 @@ To circumvent these drawbacks, programming languages often support a third memor
|
||||
|
||||
Let's go through an example:
|
||||
|
||||
![The inner function calls `allocate(size_of([u32; 3]))`, writes `z.write([1,2,3]);`, and returns `(z as *mut u32).offset(i)`. The outer function does `deallocate(y, size_of(u32))` on the returned value `y`.](call-stack-heap.svg)
|
||||
![The inner function calls `allocate(size_of([u32; 3]))`, writes `z.write([1,2,3]);`, and returns `(z as *mut u32).offset(i)`. On the returned value `y`, the outer function performs `deallocate(y, size_of(u32))`.](call-stack-heap.svg)
|
||||
|
||||
Here the `inner` function uses heap memory instead of static variables for storing `z`. It first allocates a memory block of the required size, which returns a `*mut u32` [raw pointer]. It then uses the [`ptr::write`] method to write the array `[1,2,3]` to it. In the last step, it uses the [`offset`] function to calculate a pointer to the `i`th element and then returns it. (Note that we omitted some required casts and unsafe blocks in this example function for brevity.)
|
||||
Here the `inner` function uses heap memory instead of static variables for storing `z`. It first allocates a memory block of the required size, which returns a `*mut u32` [raw pointer]. It then uses the [`ptr::write`] method to write the array `[1,2,3]` to it. In the last step, it uses the [`offset`] function to calculate a pointer to the `i`-th element and then returns it. (Note that we omitted some required casts and unsafe blocks in this example function for brevity.)
|
||||
|
||||
[raw pointer]: https://doc.rust-lang.org/book/ch19-01-unsafe-rust.html#dereferencing-a-raw-pointer
|
||||
[`ptr::write`]: https://doc.rust-lang.org/core/ptr/fn.write.html
|
||||
@@ -99,26 +99,26 @@ Here the `inner` function uses heap memory instead of static variables for stori
|
||||
|
||||
The allocated memory lives until it is explicitly freed through a call to `deallocate`. Thus, the returned pointer is still valid even after `inner` returned and its part of the call stack was destroyed. The advantage of using heap memory compared to static memory is that the memory can be reused after it is freed, which we do through the `deallocate` call in `outer`. After that call, the situation looks like this:
|
||||
|
||||
![The call stack contains the local variables of outer, the heap contains z[0] and z[2], but no longer z[1].](call-stack-heap-freed.svg)
|
||||
![The call stack contains the local variables of `outer`, the heap contains `z[0]` and `z[2]`, but no longer `z[1]`.](call-stack-heap-freed.svg)
|
||||
|
||||
We see that the `z[1]` slot is free again and can be reused for the next `allocate` call. However, we also see that `z[0]` and `z[2]` are never freed because we never deallocate them. Such a bug is called a _memory leak_ and often the cause of excessive memory consumption of programs (just imagine what happens when we call `inner` repeatedly in a loop). This might seem bad, but there are much more dangerous types of bugs that can happen with dynamic allocation.
|
||||
We see that the `z[1]` slot is free again and can be reused for the next `allocate` call. However, we also see that `z[0]` and `z[2]` are never freed because we never deallocate them. Such a bug is called a _memory leak_ and is often the cause of excessive memory consumption of programs (just imagine what happens when we call `inner` repeatedly in a loop). This might seem bad, but there are much more dangerous types of bugs that can happen with dynamic allocation.
|
||||
|
||||
### Common Errors
|
||||
|
||||
Apart from memory leaks, which are unfortunate but don't make the program vulnerable to attackers, there are two common types of bugs with more severe consequences:
|
||||
|
||||
- When we accidentally continue to use a variable after calling `deallocate` on it, we have a so-called **use-after-free** vulnerability. Such a bug causes undefined behavior and can often be exploited by attackers to execute arbitrary code.
|
||||
- When we accidentally free a variable twice, we have a **double-free** vulnerability. This is problematic because it might free a different allocation that was allocated in the same spot after the first `deallocate` call. Thus, it can lead to an use-after-free vulnerability again.
|
||||
- When we accidentally free a variable twice, we have a **double-free** vulnerability. This is problematic because it might free a different allocation that was allocated in the same spot after the first `deallocate` call. Thus, it can lead to a use-after-free vulnerability again.
|
||||
|
||||
These types of vulnerabilities are commonly known, so one might expect that people learned how to avoid them by now. But no, such vulnerabilities are still regularly found, for example this recent [use-after-free vulnerability in Linux][linux vulnerability] that allowed arbitrary code execution. This shows that even the best programmers are not always able to correctly handle dynamic memory in complex projects.
|
||||
These types of vulnerabilities are commonly known, so one might expect that people have learned how to avoid them by now. But no, such vulnerabilities are still regularly found, for example this [use-after-free vulnerability in Linux][linux vulnerability] (2019), that allowed arbitrary code execution. A web search like `use-after-free linux {current year}` will probably always yield results. This shows that even the best programmers are not always able to correctly handle dynamic memory in complex projects.
|
||||
|
||||
[linux vulnerability]: https://securityboulevard.com/2019/02/linux-use-after-free-vulnerability-found-in-linux-2-6-through-4-20-11/
|
||||
|
||||
To avoid these issues, many languages such as Java or Python manage dynamic memory automatically using a technique called [_garbage collection_]. The idea is that the programmer never invokes `deallocate` manually. Instead, the program is regularly paused and scanned for unused heap variables, which are then automatically deallocated. Thus, the above vulnerabilities can never occur. The drawbacks are the performance overhead of the regular scan and the probably long pause times.
|
||||
To avoid these issues, many languages, such as Java or Python, manage dynamic memory automatically using a technique called [_garbage collection_]. The idea is that the programmer never invokes `deallocate` manually. Instead, the program is regularly paused and scanned for unused heap variables, which are then automatically deallocated. Thus, the above vulnerabilities can never occur. The drawbacks are the performance overhead of the regular scan and the probably long pause times.
|
||||
|
||||
[_garbage collection_]: https://en.wikipedia.org/wiki/Garbage_collection_(computer_science)
|
||||
|
||||
Rust takes a different approach to the problem: It uses a concept called [_ownership_] that is able to check the correctness of dynamic memory operations at compile time. Thus no garbage collection is needed to avoid the mentioned vulnerabilities, which means that there is no performance overhead. Another advantage of this approach is that the programmer still has fine-grained control over the use of dynamic memory, just like with C or C++.
|
||||
Rust takes a different approach to the problem: It uses a concept called [_ownership_] that is able to check the correctness of dynamic memory operations at compile time. Thus, no garbage collection is needed to avoid the mentioned vulnerabilities, which means that there is no performance overhead. Another advantage of this approach is that the programmer still has fine-grained control over the use of dynamic memory, just like with C or C++.
|
||||
|
||||
[_ownership_]: https://doc.rust-lang.org/book/ch04-01-what-is-ownership.html
|
||||
|
||||
@@ -170,9 +170,9 @@ error[E0597]: `z[_]` does not live long enough
|
||||
| - `z[_]` dropped here while still borrowed
|
||||
```
|
||||
|
||||
The terminology can be a bit confusing at first. Taking a reference to a value is called _borrowing_ the value since it's similar to a borrow in real life: You have temporary access to an object but need to return it sometime and you must not destroy it. By checking that all borrows end before an object is destroyed, the Rust compiler can guarantee that no use-after-free situation can occur.
|
||||
The terminology can be a bit confusing at first. Taking a reference to a value is called _borrowing_ the value since it's similar to a borrow in real life: You have temporary access to an object but need to return it sometime, and you must not destroy it. By checking that all borrows end before an object is destroyed, the Rust compiler can guarantee that no use-after-free situation can occur.
|
||||
|
||||
Rust's ownership system goes even further and does not only prevent use-after-free bugs, but provides complete [_memory safety_] like garbage collected languages like Java or Python do. Additionally, it guarantees [_thread safety_] and is thus even safer than those languages in multi-threaded code. And most importantly, all these checks happen at compile time, so there is no runtime overhead compared to hand written memory management in C.
|
||||
Rust's ownership system goes even further, preventing not only use-after-free bugs but also providing complete [_memory safety_], as garbage collected languages like Java or Python do. Additionally, it guarantees [_thread safety_] and is thus even safer than those languages in multi-threaded code. And most importantly, all these checks happen at compile time, so there is no runtime overhead compared to hand-written memory management in C.
|
||||
|
||||
[_memory safety_]: https://en.wikipedia.org/wiki/Memory_safety
|
||||
[_thread safety_]: https://en.wikipedia.org/wiki/Thread_safety
|
||||
@@ -181,16 +181,16 @@ Rust's ownership system goes even further and does not only prevent use-after-fr
|
||||
|
||||
We now know the basics of dynamic memory allocation in Rust, but when should we use it? We've come really far with our kernel without dynamic memory allocation, so why do we need it now?
|
||||
|
||||
First, dynamic memory allocation always comes with a bit of performance overhead, since we need to find a free slot on the heap for every allocation. For this reason local variables are generally preferable, especially in performance sensitive kernel code. However, there are cases where dynamic memory allocation is the best choice.
|
||||
First, dynamic memory allocation always comes with a bit of performance overhead since we need to find a free slot on the heap for every allocation. For this reason, local variables are generally preferable, especially in performance-sensitive kernel code. However, there are cases where dynamic memory allocation is the best choice.
|
||||
|
||||
As a basic rule, dynamic memory is required for variables that have a dynamic lifetime or a variable size. The most important type with a dynamic lifetime is [**`Rc`**], which counts the references to its wrapped value and deallocates it after all references went out of scope. Examples for types with a variable size are [**`Vec`**], [**`String`**], and other [collection types] that dynamically grow when more elements are added. These types work by allocating a larger amount of memory when they become full, copying all elements over, and then deallocating the old allocation.
|
||||
As a basic rule, dynamic memory is required for variables that have a dynamic lifetime or a variable size. The most important type with a dynamic lifetime is [**`Rc`**], which counts the references to its wrapped value and deallocates it after all references have gone out of scope. Examples for types with a variable size are [**`Vec`**], [**`String`**], and other [collection types] that dynamically grow when more elements are added. These types work by allocating a larger amount of memory when they become full, copying all elements over, and then deallocating the old allocation.
|
||||
|
||||
[**`Rc`**]: https://doc.rust-lang.org/alloc/rc/index.html
|
||||
[**`Vec`**]: https://doc.rust-lang.org/alloc/vec/index.html
|
||||
[**`String`**]: https://doc.rust-lang.org/alloc/string/index.html
|
||||
[collection types]: https://doc.rust-lang.org/alloc/collections/index.html
|
||||
|
||||
For our kernel we will mostly need the collection types, for example for storing a list of active tasks when implementing multitasking in future posts.
|
||||
For our kernel, we will mostly need the collection types, for example, to store a list of active tasks when implementing multitasking in future posts.
|
||||
|
||||
## The Allocator Interface
|
||||
|
||||
@@ -207,7 +207,7 @@ extern crate alloc;
|
||||
|
||||
Contrary to normal dependencies, we don't need to modify the `Cargo.toml`. The reason is that the `alloc` crate ships with the Rust compiler as part of the standard library, so the compiler already knows about the crate. By adding this `extern crate` statement, we specify that the compiler should try to include it. (Historically, all dependencies needed an `extern crate` statement, which is now optional).
|
||||
|
||||
Since we are compiling for a custom target, we can't use the precompiled version of `alloc` that is shipped with the Rust installation. Instead, we have to tell cargo to recompile the crate from source. We can do that, by adding it to the `unstable.build-std` array in our `.cargo/config.toml` file:
|
||||
Since we are compiling for a custom target, we can't use the precompiled version of `alloc` that is shipped with the Rust installation. Instead, we have to tell cargo to recompile the crate from source. We can do that by adding it to the `unstable.build-std` array in our `.cargo/config.toml` file:
|
||||
|
||||
```toml
|
||||
# in .cargo/config.toml
|
||||
@@ -218,7 +218,7 @@ build-std = ["core", "compiler_builtins", "alloc"]
|
||||
|
||||
Now the compiler will recompile and include the `alloc` crate in our kernel.
|
||||
|
||||
The reason that the `alloc` crate is disabled by default in `#[no_std]` crates is that it has additional requirements. We can see these requirements as errors when we try to compile our project now:
|
||||
The reason that the `alloc` crate is disabled by default in `#[no_std]` crates is that it has additional requirements. When we try to compile our project now, we will see these requirements as errors:
|
||||
|
||||
```
|
||||
error: no global memory allocator found but one is required; link to std or add
|
||||
@@ -227,7 +227,7 @@ error: no global memory allocator found but one is required; link to std or add
|
||||
error: `#[alloc_error_handler]` function required, but not found
|
||||
```
|
||||
|
||||
The first error occurs because the `alloc` crate requires an heap allocator, which is an object that provides the `allocate` and `deallocate` functions. In Rust, heap allocators are described by the [`GlobalAlloc`] trait, which is mentioned in the error message. To set the heap allocator for the crate, the `#[global_allocator]` attribute must be applied to a `static` variable that implements the `GlobalAlloc` trait.
|
||||
The first error occurs because the `alloc` crate requires a heap allocator, which is an object that provides the `allocate` and `deallocate` functions. In Rust, heap allocators are described by the [`GlobalAlloc`] trait, which is mentioned in the error message. To set the heap allocator for the crate, the `#[global_allocator]` attribute must be applied to a `static` variable that implements the `GlobalAlloc` trait.
|
||||
|
||||
The second error occurs because calls to `allocate` can fail, most commonly when there is no more memory available. Our program must be able to react to this case, which is what the `#[alloc_error_handler]` function is for.
|
||||
|
||||
@@ -257,8 +257,8 @@ pub unsafe trait GlobalAlloc {
|
||||
```
|
||||
|
||||
It defines the two required methods [`alloc`] and [`dealloc`], which correspond to the `allocate` and `deallocate` functions we used in our examples:
|
||||
- The [`alloc`] method takes a [`Layout`] instance as argument, which describes the desired size and alignment that the allocated memory should have. It returns a [raw pointer] to the first byte of the allocated memory block. Instead of an explicit error value, the `alloc` method returns a null pointer to signal an allocation error. This is a bit non-idiomatic, but it has the advantage that wrapping existing system allocators is easy, since they use the same convention.
|
||||
- The [`dealloc`] method is the counterpart and responsible for freeing a memory block again. It receives two arguments, the pointer returned by `alloc` and the `Layout` that was used for the allocation.
|
||||
- The [`alloc`] method takes a [`Layout`] instance as an argument, which describes the desired size and alignment that the allocated memory should have. It returns a [raw pointer] to the first byte of the allocated memory block. Instead of an explicit error value, the `alloc` method returns a null pointer to signal an allocation error. This is a bit non-idiomatic, but it has the advantage that wrapping existing system allocators is easy since they use the same convention.
|
||||
- The [`dealloc`] method is the counterpart and is responsible for freeing a memory block again. It receives two arguments: the pointer returned by `alloc` and the `Layout` that was used for the allocation.
|
||||
|
||||
[`alloc`]: https://doc.rust-lang.org/alloc/alloc/trait.GlobalAlloc.html#tymethod.alloc
|
||||
[`dealloc`]: https://doc.rust-lang.org/alloc/alloc/trait.GlobalAlloc.html#tymethod.dealloc
|
||||
@@ -277,11 +277,11 @@ The trait additionally defines the two methods [`alloc_zeroed`] and [`realloc`]
|
||||
One thing to notice is that both the trait itself and all trait methods are declared as `unsafe`:
|
||||
|
||||
- The reason for declaring the trait as `unsafe` is that the programmer must guarantee that the trait implementation for an allocator type is correct. For example, the `alloc` method must never return a memory block that is already used somewhere else because this would cause undefined behavior.
|
||||
- Similarly, the reason that the methods are `unsafe` is that the caller must ensure various invariants when calling the methods, for example that the `Layout` passed to `alloc` specifies a non-zero size. This is not really relevant in practice since the methods are normally called directly by the compiler, which ensures that the requirements are met.
|
||||
- Similarly, the reason that the methods are `unsafe` is that the caller must ensure various invariants when calling the methods, for example, that the `Layout` passed to `alloc` specifies a non-zero size. This is not really relevant in practice since the methods are normally called directly by the compiler, which ensures that the requirements are met.
|
||||
|
||||
### A `DummyAllocator`
|
||||
|
||||
Now that we know what an allocator type should provide, we can create a simple dummy allocator. For that we create a new `allocator` module:
|
||||
Now that we know what an allocator type should provide, we can create a simple dummy allocator. For that, we create a new `allocator` module:
|
||||
|
||||
```rust
|
||||
// in src/lib.rs
|
||||
@@ -289,7 +289,7 @@ Now that we know what an allocator type should provide, we can create a simple d
|
||||
pub mod allocator;
|
||||
```
|
||||
|
||||
Our dummy allocator does the absolute minimum to implement the trait and always return an error when `alloc` is called. It looks like this:
|
||||
Our dummy allocator does the absolute minimum to implement the trait and always returns an error when `alloc` is called. It looks like this:
|
||||
|
||||
```rust
|
||||
// in src/allocator.rs
|
||||
@@ -310,9 +310,9 @@ unsafe impl GlobalAlloc for Dummy {
|
||||
}
|
||||
```
|
||||
|
||||
The struct does not need any fields, so we create it as a [zero sized type]. As mentioned above, we always return the null pointer from `alloc`, which corresponds to an allocation error. Since the allocator never returns any memory, a call to `dealloc` should never occur. For this reason we simply panic in the `dealloc` method. The `alloc_zeroed` and `realloc` methods have default implementations, so we don't need to provide implementations for them.
|
||||
The struct does not need any fields, so we create it as a [zero-sized type]. As mentioned above, we always return the null pointer from `alloc`, which corresponds to an allocation error. Since the allocator never returns any memory, a call to `dealloc` should never occur. For this reason, we simply panic in the `dealloc` method. The `alloc_zeroed` and `realloc` methods have default implementations, so we don't need to provide implementations for them.
|
||||
|
||||
[zero sized type]: https://doc.rust-lang.org/nomicon/exotic-sizes.html#zero-sized-types-zsts
|
||||
[zero-sized type]: https://doc.rust-lang.org/nomicon/exotic-sizes.html#zero-sized-types-zsts
|
||||
|
||||
We now have a simple allocator, but we still have to tell the Rust compiler that it should use this allocator. This is where the `#[global_allocator]` attribute comes in.
|
||||
|
||||
@@ -327,7 +327,7 @@ The `#[global_allocator]` attribute tells the Rust compiler which allocator inst
|
||||
static ALLOCATOR: Dummy = Dummy;
|
||||
```
|
||||
|
||||
Since the `Dummy` allocator is a [zero sized type], we don't need to specify any fields in the initialization expression.
|
||||
Since the `Dummy` allocator is a [zero-sized type], we don't need to specify any fields in the initialization expression.
|
||||
|
||||
When we now try to compile it, the first error should be gone. Let's fix the remaining second error:
|
||||
|
||||
@@ -354,7 +354,7 @@ fn alloc_error_handler(layout: alloc::alloc::Layout) -> ! {
|
||||
|
||||
The `alloc_error_handler` function is still unstable, so we need a feature gate to enable it. The function receives a single argument: the `Layout` instance that was passed to `alloc` when the allocation failure occurred. There's nothing we can do to resolve the failure, so we just panic with a message that contains the `Layout` instance.
|
||||
|
||||
With this function, the compilation errors should be fixed. Now we can use the allocation and collection types of `alloc`, for example we can use a [`Box`] to allocate a value on the heap:
|
||||
With this function, the compilation errors should be fixed. Now we can use the allocation and collection types of `alloc`. For example, we can use a [`Box`] to allocate a value on the heap:
|
||||
|
||||
[`Box`]: https://doc.rust-lang.org/alloc/boxed/struct.Box.html
|
||||
|
||||
@@ -378,13 +378,13 @@ fn kernel_main(boot_info: &'static BootInfo) -> ! {
|
||||
|
||||
```
|
||||
|
||||
Note that we need to specify the `extern crate alloc` statement in our `main.rs` too. This is required because the `lib.rs` and `main.rs` part are treated as separate crates. However, we don't need to create another `#[global_allocator]` static because the global allocator applies to all crates in the project. In fact, specifying an additional allocator in another crate would be an error.
|
||||
Note that we need to specify the `extern crate alloc` statement in our `main.rs` too. This is required because the `lib.rs` and `main.rs` parts are treated as separate crates. However, we don't need to create another `#[global_allocator]` static because the global allocator applies to all crates in the project. In fact, specifying an additional allocator in another crate would be an error.
|
||||
|
||||
When we run the above code, we see that our `alloc_error_handler` function is called:
|
||||
|
||||

|
||||
|
||||
The error handler is called because the `Box::new` function implicitly calls the `alloc` function of the global allocator. Our dummy allocator always returns a null pointer, so every allocation fails. To fix this we need to create an allocator that actually returns usable memory.
|
||||
The error handler is called because the `Box::new` function implicitly calls the `alloc` function of the global allocator. Our dummy allocator always returns a null pointer, so every allocation fails. To fix this, we need to create an allocator that actually returns usable memory.
|
||||
|
||||
## Creating a Kernel Heap
|
||||
|
||||
@@ -401,7 +401,7 @@ pub const HEAP_START: usize = 0x_4444_4444_0000;
|
||||
pub const HEAP_SIZE: usize = 100 * 1024; // 100 KiB
|
||||
```
|
||||
|
||||
We set the heap size to 100 KiB for now. If we need more space in the future, we can simply increase it.
|
||||
We set the heap size to 100 KiB for now. If we need more space in the future, we can simply increase it.
|
||||
|
||||
If we tried to use this heap region now, a page fault would occur since the virtual memory region is not mapped to physical memory yet. To resolve this, we create an `init_heap` function that maps the heap pages using the [`Mapper` API] that we introduced in the [_"Paging Implementation"_] post:
|
||||
|
||||
@@ -444,7 +444,7 @@ pub fn init_heap(
|
||||
}
|
||||
```
|
||||
|
||||
The function takes mutable references to a [`Mapper`] and a [`FrameAllocator`] instance, both limited to 4KiB pages by using [`Size4KiB`] as generic parameter. The return value of the function is a [`Result`] with the unit type `()` as success variant and a [`MapToError`] as error variant, which is the error type returned by the [`Mapper::map_to`] method. Reusing the error type makes sense here because the `map_to` method is the main source of errors in this function.
|
||||
The function takes mutable references to a [`Mapper`] and a [`FrameAllocator`] instance, both limited to 4 KiB pages by using [`Size4KiB`] as the generic parameter. The return value of the function is a [`Result`] with the unit type `()` as the success variant and a [`MapToError`] as the error variant, which is the error type returned by the [`Mapper::map_to`] method. Reusing the error type makes sense here because the `map_to` method is the main source of errors in this function.
|
||||
|
||||
[`Mapper`]:https://docs.rs/x86_64/0.14.2/x86_64/structures/paging/mapper/trait.Mapper.html
|
||||
[`FrameAllocator`]: https://docs.rs/x86_64/0.14.2/x86_64/structures/paging/trait.FrameAllocator.html
|
||||
@@ -457,13 +457,13 @@ The implementation can be broken down into two parts:
|
||||
|
||||
- **Creating the page range:**: To create a range of the pages that we want to map, we convert the `HEAP_START` pointer to a [`VirtAddr`] type. Then we calculate the heap end address from it by adding the `HEAP_SIZE`. We want an inclusive bound (the address of the last byte of the heap), so we subtract 1. Next, we convert the addresses into [`Page`] types using the [`containing_address`] function. Finally, we create a page range from the start and end pages using the [`Page::range_inclusive`] function.
|
||||
|
||||
- **Mapping the pages:** The second step is to map all pages of the page range we just created. For that we iterate over the pages in that range using a `for` loop. For each page, we do the following:
|
||||
- **Mapping the pages:** The second step is to map all pages of the page range we just created. For that, we iterate over these pages using a `for` loop. For each page, we do the following:
|
||||
|
||||
- We allocate a physical frame that the page should be mapped to using the [`FrameAllocator::allocate_frame`] method. This method returns [`None`] when there are no more frames left. We deal with that case by mapping it to a [`MapToError::FrameAllocationFailed`] error through the [`Option::ok_or`] method and then apply the [question mark operator] to return early in the case of an error.
|
||||
- We allocate a physical frame that the page should be mapped to using the [`FrameAllocator::allocate_frame`] method. This method returns [`None`] when there are no more frames left. We deal with that case by mapping it to a [`MapToError::FrameAllocationFailed`] error through the [`Option::ok_or`] method and then applying the [question mark operator] to return early in the case of an error.
|
||||
|
||||
- We set the required `PRESENT` flag and the `WRITABLE` flag for the page. With these flags both read and write accesses are allowed, which makes sense for heap memory.
|
||||
- We set the required `PRESENT` flag and the `WRITABLE` flag for the page. With these flags, both read and write accesses are allowed, which makes sense for heap memory.
|
||||
|
||||
- We use the [`Mapper::map_to`] method for creating the mapping in the active page table. The method can fail, therefore we use the [question mark operator] again to forward the error to the caller. On success, the method returns a [`MapperFlush`] instance that we can use to update the [_translation lookaside buffer_] using the [`flush`] method.
|
||||
- We use the [`Mapper::map_to`] method for creating the mapping in the active page table. The method can fail, so we use the [question mark operator] again to forward the error to the caller. On success, the method returns a [`MapperFlush`] instance that we can use to update the [_translation lookaside buffer_] using the [`flush`] method.
|
||||
|
||||
[`VirtAddr`]: https://docs.rs/x86_64/0.14.2/x86_64/addr/struct.VirtAddr.html
|
||||
[`Page`]: https://docs.rs/x86_64/0.14.2/x86_64/structures/paging/page/struct.Page.html
|
||||
@@ -509,7 +509,7 @@ fn kernel_main(boot_info: &'static BootInfo) -> ! {
|
||||
}
|
||||
```
|
||||
|
||||
We show the full function for context here. The only new lines are the `blog_os::allocator` import and the call to `allocator::init_heap` function. In case the `init_heap` function returns an error, we panic using the [`Result::expect`] method since there is currently no sensible way for us to handle this error.
|
||||
We show the full function for context here. The only new lines are the `blog_os::allocator` import and the call to the `allocator::init_heap` function. In case the `init_heap` function returns an error, we panic using the [`Result::expect`] method since there is currently no sensible way for us to handle this error.
|
||||
|
||||
[`Result::expect`]: https://doc.rust-lang.org/core/result/enum.Result.html#method.expect
|
||||
|
||||
@@ -519,7 +519,7 @@ We now have a mapped heap memory region that is ready to be used. The `Box::new`
|
||||
|
||||
Since implementing an allocator is somewhat complex, we start by using an external allocator crate. We will learn how to implement our own allocator in the next post.
|
||||
|
||||
A simple allocator crate for `no_std` applications is the [`linked_list_allocator`] crate. It's name comes from the fact that it uses a linked list data structure to keep track of deallocated memory regions. See the next post for a more detailed explanation of this approach.
|
||||
A simple allocator crate for `no_std` applications is the [`linked_list_allocator`] crate. Its name comes from the fact that it uses a linked list data structure to keep track of deallocated memory regions. See the next post for a more detailed explanation of this approach.
|
||||
|
||||
To use the crate, we first need to add a dependency on it in our `Cargo.toml`:
|
||||
|
||||
@@ -543,7 +543,7 @@ use linked_list_allocator::LockedHeap;
|
||||
static ALLOCATOR: LockedHeap = LockedHeap::empty();
|
||||
```
|
||||
|
||||
The struct is named `LockedHeap` because it uses the [`spinning_top::Spinlock`] type for synchronization. This is required because multiple threads could access the `ALLOCATOR` static at the same time. As always when using a spinlock or a mutex, we need to be careful to not accidentally cause a deadlock. This means that we shouldn't perform any allocations in interrupt handlers, since they can run at an arbitrary time and might interrupt an in-progress allocation.
|
||||
The struct is named `LockedHeap` because it uses the [`spinning_top::Spinlock`] type for synchronization. This is required because multiple threads could access the `ALLOCATOR` static at the same time. As always, when using a spinlock or a mutex, we need to be careful to not accidentally cause a deadlock. This means that we shouldn't perform any allocations in interrupt handlers, since they can run at an arbitrary time and might interrupt an in-progress allocation.
|
||||
|
||||
[`spinning_top::Spinlock`]: https://docs.rs/spinning_top/0.1.0/spinning_top/type.Spinlock.html
|
||||
|
||||
@@ -569,7 +569,7 @@ pub fn init_heap(
|
||||
}
|
||||
```
|
||||
|
||||
We use the [`lock`] method on the inner spinlock of the `LockedHeap` type to get an exclusive reference to the wrapped [`Heap`] instance, on which we then call the [`init`] method with the heap bounds as arguments. It is important that we initialize the heap _after_ mapping the heap pages, since the [`init`] function already tries to write to the heap memory.
|
||||
We use the [`lock`] method on the inner spinlock of the `LockedHeap` type to get an exclusive reference to the wrapped [`Heap`] instance, on which we then call the [`init`] method with the heap bounds as arguments. Because the [`init`] function already tries to write to the heap memory, we must initialize the heap only _after_ mapping the heap pages.
|
||||
|
||||
[`lock`]: https://docs.rs/lock_api/0.3.3/lock_api/struct.Mutex.html#method.lock
|
||||
[`Heap`]: https://docs.rs/linked_list_allocator/0.9.0/linked_list_allocator/struct.Heap.html
|
||||
@@ -609,7 +609,7 @@ fn kernel_main(boot_info: &'static BootInfo) -> ! {
|
||||
}
|
||||
```
|
||||
|
||||
This code example shows some uses of the [`Box`], [`Vec`], and [`Rc`] types. For the `Box` and `Vec` types we print the underlying heap pointers using the [`{:p}` formatting specifier]. For showcasing `Rc`, we create a reference counted heap value and use the [`Rc::strong_count`] function to print the current reference count, before and after dropping an instance (using [`core::mem::drop`]).
|
||||
This code example shows some uses of the [`Box`], [`Vec`], and [`Rc`] types. For the `Box` and `Vec` types, we print the underlying heap pointers using the [`{:p}` formatting specifier]. To showcase `Rc`, we create a reference-counted heap value and use the [`Rc::strong_count`] function to print the current reference count before and after dropping an instance (using [`core::mem::drop`]).
|
||||
|
||||
[`Vec`]: https://doc.rust-lang.org/alloc/vec/
|
||||
[`Rc`]: https://doc.rust-lang.org/alloc/rc/
|
||||
@@ -628,11 +628,11 @@ reference count is 1 now
|
||||
|
||||
As expected, we see that the `Box` and `Vec` values live on the heap, as indicated by the pointer starting with the `0x_4444_4444_*` prefix. The reference counted value also behaves as expected, with the reference count being 2 after the `clone` call, and 1 again after one of the instances was dropped.
|
||||
|
||||
The reason that the vector starts at offset `0x800` is not that the boxed value is `0x800` bytes large, but the [reallocations] that occur when the vector needs to increase its capacity. For example, when the vector's capacity is 32 and we try to add the next element, the vector allocates a new backing array with capacity 64 behind the scenes and copies all elements over. Then it frees the old allocation.
|
||||
The reason that the vector starts at offset `0x800` is not that the boxed value is `0x800` bytes large, but the [reallocations] that occur when the vector needs to increase its capacity. For example, when the vector's capacity is 32 and we try to add the next element, the vector allocates a new backing array with a capacity of 64 behind the scenes and copies all elements over. Then it frees the old allocation.
|
||||
|
||||
[reallocations]: https://doc.rust-lang.org/alloc/vec/struct.Vec.html#capacity-and-reallocation
|
||||
|
||||
Of course there are many more allocation and collection types in the `alloc` crate that we can now all use in our kernel, including:
|
||||
Of course, there are many more allocation and collection types in the `alloc` crate that we can now all use in our kernel, including:
|
||||
|
||||
- the thread-safe reference counted pointer [`Arc`]
|
||||
- the owned string type [`String`] and the [`format!`] macro
|
||||
@@ -682,7 +682,7 @@ fn panic(info: &PanicInfo) -> ! {
|
||||
}
|
||||
```
|
||||
|
||||
We reuse the `test_runner` and `test_panic_handler` functions from our `lib.rs`. Since we want to test allocations, we enable the `alloc` crate through the `extern crate alloc` statement. For more information about the test boilerplate check out the [_Testing_] post.
|
||||
We reuse the `test_runner` and `test_panic_handler` functions from our `lib.rs`. Since we want to test allocations, we enable the `alloc` crate through the `extern crate alloc` statement. For more information about the test boilerplate, check out the [_Testing_] post.
|
||||
|
||||
[_Testing_]: @/edition-2/posts/04-testing/index.md
|
||||
|
||||
@@ -712,7 +712,7 @@ fn main(boot_info: &'static BootInfo) -> ! {
|
||||
|
||||
It is very similar to the `kernel_main` function in our `main.rs`, with the differences that we don't invoke `println`, don't include any example allocations, and call `test_main` unconditionally.
|
||||
|
||||
Now we're ready to add a few test cases. First, we add a test that performs some simple allocations using [`Box`] and checks the allocated values, to ensure that basic allocations work:
|
||||
Now we're ready to add a few test cases. First, we add a test that performs some simple allocations using [`Box`] and checks the allocated values to ensure that basic allocations work:
|
||||
|
||||
```rust
|
||||
// in tests/heap_allocation.rs
|
||||
@@ -780,16 +780,16 @@ large_vec... [ok]
|
||||
many_boxes... [ok]
|
||||
```
|
||||
|
||||
All three tests succeeded! You can also invoke `cargo test` (without `--test` argument) to run all unit and integration tests.
|
||||
All three tests succeeded! You can also invoke `cargo test` (without the `--test` argument) to run all unit and integration tests.
|
||||
|
||||
## Summary
|
||||
|
||||
This post gave an introduction to dynamic memory and explained why and where it is needed. We saw how Rust's borrow checker prevents common vulnerabilities and learned how Rust's allocation API works.
|
||||
|
||||
After creating a minimal implementation of Rust's allocator interface using a dummy allocator, we created a proper heap memory region for our kernel. For that we defined a virtual address range for the heap and then mapped all pages of that range to physical frames using the `Mapper` and `FrameAllocator` from the previous post.
|
||||
After creating a minimal implementation of Rust's allocator interface using a dummy allocator, we created a proper heap memory region for our kernel. For that, we defined a virtual address range for the heap and then mapped all pages of that range to physical frames using the `Mapper` and `FrameAllocator` from the previous post.
|
||||
|
||||
Finally, we added a dependency on the `linked_list_allocator` crate to add a proper allocator to our kernel. With this allocator, we were able to use `Box`, `Vec`, and other allocation and collection types from the `alloc` crate.
|
||||
|
||||
## What's next?
|
||||
|
||||
While we already added heap allocation support in this post, we left most of the work to the `linked_list_allocator` crate. The next post will show in detail how an allocator can be implemented from scratch. It will present multiple possible allocator designs, shows how to implement simple versions of them, and explain their advantages and drawbacks.
|
||||
While we already added heap allocation support in this post, we left most of the work to the `linked_list_allocator` crate. The next post will show in detail how an allocator can be implemented from scratch. It will present multiple possible allocator designs, show how to implement simple versions of them, and explain their advantages and drawbacks.
|
||||
|
||||
Reference in New Issue
Block a user