mirror of
https://github.com/phil-opp/blog_os.git
synced 2025-12-16 22:37:49 +00:00
Markdown links require a blank line before them
@@ -6,10 +6,12 @@ order = 2
|
||||
+++
|
||||
|
||||
The [GNU Binutils] are a collection of various binary tools such as `ld`, `as`, `objdump`, or `readelf`. These tools are platform-specific, so you need to compile them again if your host system and target system are different. In our case, we need `ld` and `objdump` for the x86_64 architecture.
|
||||
|
||||
[GNU Binutils]: https://www.gnu.org/software/binutils/
|
||||
|
||||
## Building Setup
|
||||
First, you need to download a current binutils version from [here][download] \(the latest one is near the bottom). After extracting, you should have a folder named `binutils-2.X` where `X` is for example `25.1`. Now you can create and switch to a new folder for building (recommended):
|
||||
|
||||
[download]: ftp://sourceware.org/pub/binutils/snapshots
|
||||
|
||||
```bash
|
||||
|
||||
@@ -184,6 +184,7 @@ Idx Name Size VMA LMA File off Algn
|
||||
CONTENTS, ALLOC, LOAD, READONLY, CODE
|
||||
```
|
||||
_Note_: The `ld` and `objdump` commands are platform-specific. If you're _not_ working on an x86_64 architecture, you will need to [cross compile binutils]. Then use `x86_64‑elf‑ld` and `x86_64‑elf‑objdump` instead of `ld` and `objdump`.
|
||||
|
||||
[cross compile binutils]: ./extra/cross-compile-binutils.md
|
||||
|
||||
## Creating the ISO
|
||||
|
||||
@@ -246,6 +246,7 @@ Bit(s) | Name | Meaning
|
||||
When we switch to long mode, paging will be activated automatically. The CPU will then try to read the instruction at the following address, but this address is now a virtual address. So we need to do _identity mapping_, i.e. map a physical address to the same virtual address.
|
||||
|
||||
The `huge page` bit is now very useful to us. It creates a 2MiB (when used in P2) or even a 1GiB page (when used in P3). So we could map the first _gigabytes_ of the kernel with only one P4 and one P3 table by using 1GiB pages. Unfortunately, 1GiB pages are a relatively new feature; for example, Intel introduced them in 2010 with the [Westmere architecture]. Therefore we will use 2MiB pages instead to make our kernel compatible with older computers, too.
|
||||
|
||||
[Westmere architecture]: https://en.wikipedia.org/wiki/Westmere_(microarchitecture)#Technology
|
||||
|
||||
To identity map the first gigabyte of our kernel with 512 2MiB pages, we need one P4, one P3, and one P2 table. Of course we will replace them with finer-grained tables later. But now that we're stuck with assembly, we choose the easiest way.
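The arithmetic behind that table count can be checked with a short sketch. The constant and function names below are hypothetical and not part of the kernel; the only facts assumed are that a page table has 512 entries and that a P2 huge page covers 2MiB.

```rust
/// Number of entries in each x86_64 page table.
const ENTRIES_PER_TABLE: u64 = 512;
/// Size of a huge page mapped by a single P2 entry (2MiB).
const HUGE_PAGE_SIZE: u64 = 2 * 1024 * 1024;

/// Physical start address that the `index`-th P2 entry maps under identity
/// mapping (hypothetical helper, for illustration only).
fn p2_entry_start(index: u64) -> u64 {
    index * HUGE_PAGE_SIZE
}

/// Total number of bytes covered by one fully populated P2 table.
fn p2_table_coverage() -> u64 {
    ENTRIES_PER_TABLE * HUGE_PAGE_SIZE
}
```

One full P2 table therefore covers exactly 1GiB, which is why a single P4, P3, and P2 table suffice here.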
|
||||
|
||||
@@ -189,6 +189,7 @@ If the byte is the [newline] byte `\n`, the writer does not print anything. Inst
|
||||
When printing a byte, the writer checks if the current line is full. In that case, `new_line` must be called first to wrap the line. Then it writes a new `ScreenChar` to the buffer at the current position. Finally, the current column position is advanced.
|
||||
|
||||
The `buffer()` auxiliary method converts the raw pointer in the `buffer` field into a safe mutable buffer reference. The unsafe block is needed because the [as_mut()] method of `Unique` is unsafe. But our `buffer()` method itself isn't marked as unsafe, so it must not introduce any unsafety (e.g. cause segfaults). To guarantee that, it's very important that the `buffer` field always points to a valid `Buffer`. It's like a contract that we must uphold every time we create a `Writer`. To ensure that it's not possible to create an invalid `Writer` from outside of the module, the struct must have at least one private field, and public creation functions are not allowed either.
|
||||
|
||||
[as_mut()]: https://doc.rust-lang.org/nightly/core/ptr/struct.Unique.html#method.as_mut
|
||||
|
||||
### Cannot Move out of Borrowed Content
|
||||
@@ -646,17 +647,21 @@ Now that you know the very basics of OS development in Rust, you should also che
|
||||
|
||||
- [Rust Bare-Bones Kernel]: A basic kernel with roughly the same functionality as ours. Writes output to the serial port instead of the VGA buffer and maps the kernel to the [higher half] \(instead of our identity mapping).
|
||||
_Note_: You need to [cross compile binutils] to build it (or you create some symbolic links[^fn-symlink] if you're on x86_64).
|
||||
|
||||
[Rust Bare-Bones Kernel]: https://github.com/thepowersgang/rust-barebones-kernel
|
||||
[higher half]: http://wiki.osdev.org/Higher_Half_Kernel
|
||||
[cross compile binutils]: ./posts/cross-compile-binutils/index.md
|
||||
|
||||
- [RustOS]: More advanced kernel that supports allocation, keyboard inputs, and threads. It also has a scheduler and a basic network driver.
|
||||
|
||||
[RustOS]: https://github.com/RustOS-Fork-Holding-Ground/RustOS
|
||||
|
||||
- ["Tifflin" Experimental Kernel]: Big kernel project by thepowersgang that is actively developed and has over 650 commits. It has a separate userspace and supports multiple file systems; even a GUI is included. Needs a cross compiler.
|
||||
|
||||
["Tifflin" Experimental Kernel]: https://github.com/thepowersgang/rust_os
|
||||
|
||||
- [Redox]: Probably the most complete Rust OS today. It has an active community and over 1000 GitHub stars. It offers file systems, networking, an audio player, a picture viewer, and much more. Just take a look at the [screenshots][redox screenshots].
|
||||
|
||||
[Redox]: https://github.com/redox-os/redox
|
||||
[redox screenshots]: https://github.com/redox-os/redox#what-it-looks-like
|
||||
|
||||
|
||||
@@ -214,6 +214,7 @@ There are various ways to write such a frame allocator:
|
||||
We could create some kind of linked list from the free frames. For example, each frame could begin with a pointer to the next free frame. Since the frames are free, this would not overwrite any data. Our allocator would just save the head of the list and could easily allocate and deallocate frames by updating pointers. Unfortunately, this approach has a problem: it requires reading and writing these free frames. So we would need to map all physical frames to some virtual address, at least temporarily. Another disadvantage is that we need to create this linked list at startup. That implies that we need to set over one million pointers at startup if the machine has 4GiB of RAM.
|
||||
|
||||
Another approach is to create some kind of data structure such as a [bitmap or a stack] to manage free frames. We could place it in the already identity mapped area right behind the kernel or multiboot structure. That way we would not need to (temporarily) map each free frame. But it has the same problem of slow initial creation and filling. In fact, we will use this approach in a future post to manage frames that are freed again. But for the initial management of free frames, we use a different method.
|
||||
|
||||
[bitmap or a stack]: http://wiki.osdev.org/Page_Frame_Allocation#Physical_Memory_Allocators
|
||||
|
||||
In the following, we will use Multiboot's memory map directly. The idea is to maintain a simple counter that starts at frame 0 and is continually increased. If the current frame is available (part of an available area in the memory map) and not used by the kernel or the multiboot structure (we know their start and end addresses), we know that it's free and return it. Otherwise, we increase the counter to the next possibly free frame. That way, we don't need to create a data structure when booting and the physical frames can remain unmapped. The only problem is that we cannot reasonably free frames again, but we will solve that problem in a future post (by adding an intermediate frame stack that saves freed frames).
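The counter idea can be sketched as follows. This is a simplified, hosted illustration and not the allocator from the post: the names `CounterAllocator` and `FrameRange` are hypothetical, it uses `Vec` (which a kernel without a heap cannot do), and it advances the counter one frame at a time instead of jumping to the next possibly free frame.

```rust
/// A physical frame, identified by its frame number.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Frame(u64);

/// An inclusive range of frame numbers (hypothetical helper type).
#[derive(Debug, Clone, Copy)]
struct FrameRange {
    start: u64,
    end: u64,
}

impl FrameRange {
    fn contains(&self, frame: u64) -> bool {
        self.start <= frame && frame <= self.end
    }
}

/// Hands out frames by walking a counter upwards; frames are never freed.
struct CounterAllocator {
    /// The next frame number to examine.
    next: u64,
    /// Areas the memory map reports as available.
    available: Vec<FrameRange>,
    /// Areas occupied by the kernel and the multiboot structure.
    reserved: Vec<FrameRange>,
}

impl CounterAllocator {
    fn allocate(&mut self) -> Option<Frame> {
        // The highest frame number that could ever be handed out.
        let last = self.available.iter().map(|area| area.end).max()?;
        while self.next <= last {
            let candidate = self.next;
            self.next += 1;
            let usable = self.available.iter().any(|a| a.contains(candidate));
            let taken = self.reserved.iter().any(|r| r.contains(candidate));
            if usable && !taken {
                return Some(Frame(candidate));
            }
        }
        // The counter has passed the end of all available areas.
        None
    }
}
```

Because the counter only moves forward, the sketch needs no bookkeeping per frame, which mirrors the trade-off described above: allocation is trivial, but a freed frame cannot be handed out again.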
|
||||
|
||||
@@ -29,6 +29,7 @@ Also, we will use the [information about kernel sections] to map the various sec
|
||||
|
||||
## Preparation
|
||||
There are many things that can go wrong when we switch to a new table. Therefore it's a good idea to [set up a debugger][set up gdb]. You should not need it when you follow this post, but it's good to know how to debug a problem when it occurs[^fn-debug-notes].
|
||||
|
||||
[set up gdb]: ./extra/set-up-gdb/index.md
|
||||
|
||||
We also update the `Page` and `Frame` types to make our lives easier. The `Page` struct gets some derived traits:
|
||||
@@ -42,9 +43,11 @@ pub struct Page {
|
||||
}
|
||||
```
|
||||
By making it [Copy][Copy trait], we can still use it after passing it to functions such as `map_to`. We also make `Page::containing_address` public (if it isn't already).
|
||||
|
||||
[Copy trait]: https://doc.rust-lang.org/nightly/core/marker/trait.Copy.html
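As a minimal illustration of what `Copy` buys us, consider the sketch below. The derive list and the `page_number` function are assumptions made for this example; they stand in for the real `Page` struct and for functions such as `map_to` that take a page by value.

```rust
/// Stand-in for the `Page` type; the derive list is an assumption.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Page {
    number: usize,
}

/// Hypothetical function that takes a `Page` by value.
fn page_number(page: Page) -> usize {
    page.number
}

/// Passes `page` by value and then uses it again. Without `Copy`, the second
/// use would be rejected because `page` would have been moved.
fn use_twice(page: Page) -> (usize, usize) {
    let first = page_number(page);
    let second = page.number; // still valid: `page` was copied, not moved
    (first, second)
}
```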
|
||||
|
||||
The `Frame` type gets a `clone` method too, but it does not implement the [Clone trait]:
|
||||
|
||||
[Clone trait]: https://doc.rust-lang.org/nightly/core/clone/trait.Clone.html
|
||||
|
||||
```rust
|
||||
@@ -275,6 +278,7 @@ pub fn map_table_frame(&mut self,
|
||||
}
|
||||
```
|
||||
This function interprets the given frame as a page table frame and returns a `Table` reference. We return a table of level 1 because it [forbids calling the `next_table` methods][some clever solution]. Calling `next_table` must not be possible since it's not a page of the recursive mapping. To be able to return a `Table<Level1>`, we need to make the `Level1` enum in `memory/paging/table.rs` public.
|
||||
|
||||
[some clever solution]: ./posts/06-page-tables/index.md#some-clever-solution
|
||||
|
||||
|
||||
@@ -321,6 +325,7 @@ impl InactivePageTable {
|
||||
}
|
||||
```
|
||||
We added two new arguments, `active_table` and `temporary_page`. We need an [inner scope] to ensure that the `table` variable is dropped before we try to unmap the temporary page again. This is required since the `table` variable exclusively borrows `temporary_page` as long as it's alive.
|
||||
|
||||
[inner scope]: http://rustbyexample.com/variable_bindings/scope.html
|
||||
|
||||
Now we are able to create valid inactive page tables, which are zeroed and recursively mapped. But we still can't modify them. To resolve this problem, we need to look at recursive mapping again.
|
||||
@@ -360,6 +365,7 @@ pub fn with<F>(&mut self,
|
||||
}
|
||||
```
|
||||
It overwrites the 511th P4 entry and points it to the inactive table frame. Then it flushes the [translation lookaside buffer (TLB)][TLB], which still contains some old translations. We need to flush all pages that are part of the recursive mapping, so the easiest way is to flush the TLB completely.
|
||||
|
||||
[TLB]: http://wiki.osdev.org/TLB
|
||||
|
||||
|
||||
@@ -440,6 +446,7 @@ impl ActivePageTable {
|
||||
}
|
||||
```
|
||||
The [Deref] and [DerefMut] implementations allow us to use the `ActivePageTable` exactly as before; for example, we can still call `map_to` on it (because of [deref coercions]). But the closure called in the `with` function can no longer invoke `with` again. The reason is that we changed the type of the generic `F` parameter a bit: instead of an `ActivePageTable`, the closure just gets a `Mapper` as argument.
|
||||
|
||||
[Deref]: https://doc.rust-lang.org/nightly/core/ops/trait.Deref.html
|
||||
[DerefMut]: https://doc.rust-lang.org/nightly/core/ops/trait.DerefMut.html
|
||||
[deref coercions]: https://doc.rust-lang.org/nightly/book/deref-coercions.html
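The effect of this pattern can be shown with a stripped-down sketch. The types below are simplified stand-ins, not the real ones: `Mapper` only records page numbers, `with` takes no table or temporary page arguments, and `std` is used instead of `core` so the example runs on a host.

```rust
use std::ops::{Deref, DerefMut};

/// Simplified stand-in for the `Mapper` type.
struct Mapper {
    mapped_pages: Vec<usize>,
}

impl Mapper {
    fn map_to(&mut self, page: usize) {
        self.mapped_pages.push(page);
    }
}

/// Simplified stand-in for `ActivePageTable`, wrapping a `Mapper`.
struct ActivePageTable {
    mapper: Mapper,
}

impl Deref for ActivePageTable {
    type Target = Mapper;

    fn deref(&self) -> &Mapper {
        &self.mapper
    }
}

impl DerefMut for ActivePageTable {
    fn deref_mut(&mut self) -> &mut Mapper {
        &mut self.mapper
    }
}

impl ActivePageTable {
    /// The closure only receives a `Mapper`, so it cannot call `with` again.
    fn with<F>(&mut self, f: F)
    where
        F: FnOnce(&mut Mapper),
    {
        f(&mut self.mapper);
    }
}
```

With these definitions, `table.map_to(1)` compiles through `DerefMut`, while a nested `with` call inside the closure does not, because `Mapper` has no `with` method.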
|
||||
@@ -544,6 +551,7 @@ pub fn remap_the_kernel<A>(allocator: &mut A, boot_info: &BootInformation)
|
||||
First, we create a temporary page at page number `0xcafebabe`. We could use `0xdeadbeaf` or `0x123456789` as well, as long as the page is unused. The `active_table` and the `new_table` are created using their constructor functions.
|
||||
|
||||
Then we use the `with` function to temporarily change the recursive mapping and execute the closure as if the `new_table` were active. This allows us to map the sections in the new table without changing the active mapping. To get the kernel sections, we use the [Multiboot information structure].
|
||||
|
||||
[Multiboot information structure]: ./posts/05-allocating-frames/index.md#the-multiboot-information-structure
|
||||
|
||||
Let's resolve the above `TODO` by identity mapping the sections:
|
||||
@@ -630,6 +638,7 @@ SECTIONS {
|
||||
}
|
||||
```
|
||||
The `.` is the “current location counter” and represents the current virtual address. At the beginning of the `SECTIONS` tag we set it to `1M`, so our kernel starts at 1MiB. We use the [ALIGN][linker align] function to align the current location counter to the next `4K` boundary (`4K` is the page size). Thus the end of the `.text` section – and the beginning of the next section – are page aligned.
|
||||
|
||||
[linker align]: http://www.math.utah.edu/docs/info/ld_3.html#SEC12
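The rounding that `ALIGN` performs can be expressed in a few lines of Rust. This is only an illustration of the arithmetic, not how the linker implements it; `align_up` and `PAGE_SIZE` are names chosen for this sketch.

```rust
/// The page size used throughout the post (4KiB).
const PAGE_SIZE: u64 = 4096;

/// Rounds `addr` up to the next multiple of `align`.
/// `align` must be a power of two; an already aligned address is unchanged.
fn align_up(addr: u64, align: u64) -> u64 {
    debug_assert!(align.is_power_of_two());
    (addr + align - 1) & !(align - 1)
}
```

For example, a section that ends at `0x100001` pushes the next section to `0x101000`, while a section that ends exactly at `0x100000` leaves the location counter untouched.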
|
||||
|
||||
To put all sections on their own page, we add the `ALIGN` statement to all of them:
|
||||
@@ -772,6 +781,7 @@ pub fn switch(&mut self, new_table: InactivePageTable) -> InactivePageTable {
|
||||
}
|
||||
```
|
||||
This function activates the given inactive table and returns the previous active table as an `InactivePageTable`. We don't need to flush the TLB here, as the CPU does it automatically when the P4 table is switched. In fact, the `tlb::flush_all` function, which we used above, does nothing more than [reloading the CR3 register].
|
||||
|
||||
[reloading the CR3 register]: https://github.com/gz/rust-x86/blob/master/src/shared/tlb.rs#L19
|
||||
|
||||
Now we are finally able to switch to the new table. We do it by adding the following lines to our `remap_the_kernel` function:
|
||||
@@ -793,6 +803,7 @@ Let's cross our fingers and run it…
|
||||
|
||||
### Debugging
|
||||
A QEMU boot loop indicates that some CPU exception occurred. We can see all thrown CPU exceptions by starting QEMU with `-d int` (as described [here][qemu debugging]):
|
||||
|
||||
[qemu debugging]: ./posts/03-set-up-rust/index.md#debugging
|
||||
|
||||
```bash
|
||||
@@ -806,18 +817,22 @@ check_exception old: 0xffffffff new 0xe
|
||||
These lines are the important ones. We can read a lot of useful information from them:
|
||||
|
||||
- `v=0e`: An exception with number `0xe` occurred, which is a page fault according to the [OSDev Wiki][osdev exception overview].
|
||||
|
||||
[osdev exception overview]: http://wiki.osdev.org/Exceptions
|
||||
[page fault]: http://wiki.osdev.org/Exceptions#Page_Fault
|
||||
|
||||
- `e=0002`: The CPU set an [error code][page fault error code], which tells us why the exception occurred. The `0x2` bit tells us that it was caused by a write operation. And since the `0x1` bit is not set, the target page was not present.
|
||||
|
||||
[page fault error code]: http://wiki.osdev.org/Exceptions#Error_code
|
||||
|
||||
- `IP=0008:000000000010ab97` or `pc=000000000010ab97`: The program counter register tells us that the exception occurred when the CPU tried to execute the instruction at `0x10ab97`. We can disassemble this address to see the corresponding function. The `0008:` prefix in `IP` indicates the code [GDT segment].
|
||||
|
||||
[GDT segment]: ./posts/02-entering-longmode/index.md#loading-the-gdt
|
||||
|
||||
- `SP=0010:00000000001182d0`: The stack pointer was `0x1182d0` (the `0010:` prefix indicates the data [GDT segment]). This tells us whether the stack overflowed.
|
||||
|
||||
- `CR2=00000000000b8f00`: Finally the most useful register. It tells us which virtual address caused the page fault. In our case it's `0xb8f00`, which is part of the [VGA text buffer].
|
||||
|
||||
[VGA text buffer]: ./posts/04-printing-to-screen/index.md#the-vga-text-buffer
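The reading of the error code above can be reproduced with a small decoder. It only covers the two bits discussed in the list (`0x1` for present and `0x2` for write); the struct and function names are made up for this sketch.

```rust
/// Decoded view of the two lowest page-fault error code bits.
#[derive(Debug, PartialEq, Eq)]
struct PageFaultCause {
    /// Bit `0x1`: set if the target page was present.
    page_present: bool,
    /// Bit `0x2`: set if the fault was caused by a write operation.
    caused_by_write: bool,
}

/// Extracts the present and write bits from a page-fault error code.
fn decode_page_fault(error_code: u64) -> PageFaultCause {
    PageFaultCause {
        page_present: (error_code & 0x1) != 0,
        caused_by_write: (error_code & 0x2) != 0,
    }
}
```

For `e=0002`, the decoder reports a write to a page that was not present, matching the interpretation given in the list.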
|
||||
|
||||
So let's find out which function caused the exception:
|
||||
@@ -836,6 +851,7 @@ We disassemble our kernel and search for `10ab97`. The `-B100` option prints the
|
||||
10ab97: 66 89 14 48 mov %dx,(%rax,%rcx,2)
|
||||
```
|
||||
The reason for the cryptic function name is Rust's [name mangling]. But we can identify the `vga_buffer::Writer::write_byte` function nonetheless.
|
||||
|
||||
[name mangling]: https://en.wikipedia.org/wiki/Name_mangling
|
||||
|
||||
So the reason for the page fault is that the `write_byte` function tried to write to the VGA text buffer at `0xb8f00`. Of course this provokes a page fault: We forgot to identity map the VGA buffer in the new page table.
|
||||
@@ -970,6 +986,7 @@ This time the responsible function is `control_regs::cr3_write()` itself. From t
|
||||
|
||||
### The NXE Bit
|
||||
The reason is that the `NO_EXECUTE` bit must only be used when the `NXE` bit in the [Extended Feature Enable Register] \(EFER) is set. That register is similar to Rust's feature gating and can be used to enable all sorts of advanced CPU features. Since the `NXE` bit is off by default, we caused a page fault when we added the `NO_EXECUTE` bit to the page table.
|
||||
|
||||
[Extended Feature Enable Register]: https://en.wikipedia.org/wiki/Control_register#EFER
|
||||
|
||||
So we need to enable the `NXE` bit. For that we use the [x86_64 crate] again:
|
||||
@@ -1013,6 +1030,7 @@ If we haven't forgotten to set the `WRITABLE` flag somewhere, it should still wo
|
||||
The final step is to create a guard page for our kernel stack.
|
||||
|
||||
The decision to place the kernel stack right above the page tables was already useful to detect a silent stack overflow in the [previous post][silent stack overflow]. Now we profit from it again. Let's look at our assembly `.bss` section again to understand why:
|
||||
|
||||
[silent stack overflow]: ./posts/06-page-tables/index.md#translate
|
||||
|
||||
```nasm
|
||||
@@ -1073,6 +1091,7 @@ Unfortunately stack probes require compiler support. They already work on Window
|
||||
|
||||
## What's next?
|
||||
Now that we have a (mostly) safe kernel stack and a working page table module, we can add a virtual memory allocator. The [next post] will explore Rust's allocator API and create a very basic allocator. At the end of that post, we will be able to use Rust's allocation and collection types such as [Box], [Vec], or even [BTreeMap].
|
||||
|
||||
[next post]: ./posts/08-kernel-heap/index.md
|
||||
[Box]: https://doc.rust-lang.org/nightly/alloc/boxed/struct.Box.html
|
||||
[Vec]: https://doc.rust-lang.org/nightly/collections/vec/struct.Vec.html
|
||||
|
||||