Mirror of https://github.com/phil-opp/blog_os.git (synced 2025-12-16 14:27:49 +00:00)
Use misspell to fix some typos
@@ -443,7 +443,7 @@ We don't use any multimedia registers explicitly, but the Rust compiler might au

-This example shows a program that is using the first three multimedia registers (`mm0` to `mm2`). At some point, an exception occurs and control is transfered to the exception handler. The exception handler uses `mm1` for its own data and thus overwrites the previous value. When the exception is resolved, the CPU continues the interrupted program again. However, the program is now corrupt since it relies on the original `mm1` value.
+This example shows a program that is using the first three multimedia registers (`mm0` to `mm2`). At some point, an exception occurs and control is transferred to the exception handler. The exception handler uses `mm1` for its own data and thus overwrites the previous value. When the exception is resolved, the CPU continues the interrupted program again. However, the program is now corrupt since it relies on the original `mm1` value.

 ### Saving and Restoring Multimedia Registers
 In order to fix this problem, we need to backup all caller-saved multimedia registers before we call the exception handler. The problem is that the set of multimedia registers varies between CPUs. There are different standards:
@@ -514,7 +514,7 @@ A minimal target specification that describes the `x86_64-unknown-linux-gnu` tar
 }
 ```

-The `llvm-target` field specifies the target triple that is passed to LLVM. We want to derive a 64-bit Linux target, so we choose `x86_64-unknown-linux-gnu`. The `data-layout` field is also passed to LLVM and specifies how data should be laid out in memory. It consists of various specifications seperated by a `-` character. For example, the `e` means little endian and `S128` specifies that the stack should be 128 bits (= 16 byte) aligned. The format is described in detail in the [LLVM documentation][data layout] but there shouldn't be a reason to change this string.
+The `llvm-target` field specifies the target triple that is passed to LLVM. We want to derive a 64-bit Linux target, so we choose `x86_64-unknown-linux-gnu`. The `data-layout` field is also passed to LLVM and specifies how data should be laid out in memory. It consists of various specifications separated by a `-` character. For example, the `e` means little endian and `S128` specifies that the stack should be 128 bits (= 16 byte) aligned. The format is described in detail in the [LLVM documentation][data layout] but there shouldn't be a reason to change this string.

 The other fields are used for conditional compilation. This allows crate authors to use `cfg` variables to write special code for depending on the OS or the architecture. There isn't any up-to-date documentation about these fields but the [corresponding source code][target specification] is quite readable.
@@ -171,7 +171,7 @@ check_long_mode:
 mov al, "2"
 jmp error
 ```
-Like many low-level things, CPUID is a bit strange. Instead of taking a parameter, the `cpuid` instruction implicitely uses the `eax` register as argument. To test if long mode is available, we need to call `cpuid` with `0x80000001` in `eax`. This loads some information to the `ecx` and `edx` registers. Long mode is supported if the 29th bit in `edx` is set. [Wikipedia][cpuid long mode] has detailed information.
+Like many low-level things, CPUID is a bit strange. Instead of taking a parameter, the `cpuid` instruction implicitly uses the `eax` register as argument. To test if long mode is available, we need to call `cpuid` with `0x80000001` in `eax`. This loads some information to the `ecx` and `edx` registers. Long mode is supported if the 29th bit in `edx` is set. [Wikipedia][cpuid long mode] has detailed information.

 [cpuid long mode]: https://en.wikipedia.org/wiki/CPUID#EAX.3D80000001h:_Extended_Processor_Info_and_Feature_Bits
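The bit test described in the hunk above can be sketched in Rust. This is only an illustration and not code from the post, which performs the check in assembly. The names `LONG_MODE_BIT` and `long_mode_supported` are hypothetical, and the `edx` value is passed in as a plain integer so that the sketch runs on any host.

```rust
/// Mask for bit 29 of `edx` (counting from bit 0).
/// The constant name is an assumption made for this sketch.
const LONG_MODE_BIT: u32 = 1 << 29;

/// Returns `true` if the long mode bit is set in `edx`.
///
/// `edx` is assumed to hold the register contents after `cpuid`
/// was executed with `0x80000001` in `eax`.
fn long_mode_supported(edx: u32) -> bool {
    (edx & LONG_MODE_BIT) != 0
}
```

Taking the register value as a parameter keeps the check separate from the `cpuid` call itself, so it can be exercised without special hardware.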
@@ -117,7 +117,7 @@ The `llvm-target` field specifies the target triple that is passed to LLVM. [Tar
 [Target triples]: http://llvm.org/docs/LangRef.html#target-triple
 [ABI]: https://en.wikipedia.org/wiki/Application_binary_interface

-The `data-layout` field is also passed to LLVM and specifies how data should be laid out in memory. It consists of various specifications seperated by a `-` character. For example, the `e` means little endian and `S128` specifies that the stack should be 128 bits (= 16 byte) aligned. The format is described in detail in the [LLVM documentation][data layout] but there shouldn't be a reason to change this string.
+The `data-layout` field is also passed to LLVM and specifies how data should be laid out in memory. It consists of various specifications separated by a `-` character. For example, the `e` means little endian and `S128` specifies that the stack should be 128 bits (= 16 byte) aligned. The format is described in detail in the [LLVM documentation][data layout] but there shouldn't be a reason to change this string.

 The `linker-flavor` field was recently introduced in [#40018] with the intention to add support for the LLVM linker [LLD], which is platform independent. In the future, this might allow easy cross compilation without the need to install a gcc cross compiler for linking.
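To make the `-`-separated structure of the `data-layout` string concrete, here is a small sketch that picks out the two specifications mentioned above. The helper names are hypothetical, and the layout string used in the usage note is an assumed example that this hunk does not show. Only the `e` and `S<bits>` specifications are interpreted.

```rust
/// Returns `true` if the layout string contains the `e` (little endian) spec.
fn is_little_endian(layout: &str) -> bool {
    layout.split('-').any(|spec| spec == "e")
}

/// Returns the stack alignment in bits if an `S<bits>` spec is present.
fn stack_alignment_bits(layout: &str) -> Option<u32> {
    layout
        .split('-')
        // Keep only the part after the leading `S`, if there is one.
        .find_map(|spec| spec.strip_prefix('S'))
        // A non-numeric remainder yields `None`.
        .and_then(|bits| bits.parse().ok())
}
```

For an assumed string such as `"e-m:e-i64:64-f80:128-n8:16:32:64-S128"`, `is_little_endian` returns `true` and `stack_alignment_bits` returns `Some(128)`, which corresponds to the 16 byte alignment mentioned in the text.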
@@ -189,7 +189,7 @@ By using such SIMD standards, programs can often speed up significantly. Good co
 [auto-vectorization]: https://en.wikipedia.org/wiki/Automatic_vectorization

-However, the large SIMD registers lead to problems in OS kernels. The reason is that the kernel has to backup all registers that it uses on each hardware interrupt (we will look into this in the [“Handling Exceptions”] post). So if the kernel uses SIMD registers, it has to backup a lot more data, which noticably decreases performance. To avoid this performance loss, we disable the `sse` and `mmx` features (the `avx` feature is disabled by default).
+However, the large SIMD registers lead to problems in OS kernels. The reason is that the kernel has to backup all registers that it uses on each hardware interrupt (we will look into this in the [“Handling Exceptions”] post). So if the kernel uses SIMD registers, it has to backup a lot more data, which noticeably decreases performance. To avoid this performance loss, we disable the `sse` and `mmx` features (the `avx` feature is disabled by default).

 As noted above, floating point operations on `x86_64` use SSE registers, so floats are no longer usable without SSE. Unfortunately, the Rust core library already uses floats (e.g., it implements traits for `f32` and `f64`), so we need an alternative way to implement float operations. The `soft-float` feature solves this problem by emulating all floating point operations through software functions based on normal integers.
@@ -348,7 +348,7 @@ target/x86_64-blog_os/debug/libblog_os.a(core-92335f822fa6c9a6.0.o):
 [crates.io]: https://crates.io

 #### --gc-sections
-The new errors are linker errors about various missing functions such as `__floatundisf` or `__muloti4`. These functions are part of LLVM's [`compiler-rt` builtins] and are normally linked by the standard library. For `no_std` crates like ours, one has to link the `compiler-rt` library manually. Unfortunatly, this library is implemented in C and the build process is a bit cumbersome. Alternatively, there is the [compiler-builtins] crate that tries to port the library to Rust, but it isn't complete yet.
+The new errors are linker errors about various missing functions such as `__floatundisf` or `__muloti4`. These functions are part of LLVM's [`compiler-rt` builtins] and are normally linked by the standard library. For `no_std` crates like ours, one has to link the `compiler-rt` library manually. Unfortunately, this library is implemented in C and the build process is a bit cumbersome. Alternatively, there is the [compiler-builtins] crate that tries to port the library to Rust, but it isn't complete yet.

 [`compiler-rt` builtins]: https://compiler-rt.llvm.org/
 [compiler-builtins]: https://github.com/rust-lang-nursery/compiler-builtins
@@ -365,7 +365,7 @@ Now we can do a `make run` again and it compiles without errors again. However,
 ```
 GRUB error: no multiboot header found.
 ```
-What happened? Well, the linker removed unused sections. And since we don't use the Multiboot section anywhere, `ld` removes it, too. So we need to tell the linker explicitely that it should keep this section. The `KEEP` command does exactly that, so we add it to the linker script (`linker.ld`):
+What happened? Well, the linker removed unused sections. And since we don't use the Multiboot section anywhere, `ld` removes it, too. So we need to tell the linker explicitly that it should keep this section. The `KEEP` command does exactly that, so we add it to the linker script (`linker.ld`):

 ```
 .boot :
@@ -804,7 +804,7 @@ Let's cross our fingers and run it…
 … and it fails with a boot loop.

 ### Debugging
-A QEMU boot loop indicates that some CPU exception occured. We can see all thrown CPU exception by starting QEMU with `-d int` (as described [here][qemu debugging]):
+A QEMU boot loop indicates that some CPU exception occurred. We can see all thrown CPU exception by starting QEMU with `-d int` (as described [here][qemu debugging]):

 [qemu debugging]: ./first-edition/posts/03-set-up-rust/index.md#debugging
@@ -839,7 +839,7 @@ So let's find out which function caused the exception:
 ```
 objdump -d build/kernel-x86_64.bin | grep -B100 "10ab97"
 ```
-We disassemble our kernel and search for `10ab97`. The `-B100` option prints the 100 preceeding lines too. The output tells us the responsible function:
+We disassemble our kernel and search for `10ab97`. The `-B100` option prints the 100 preceding lines too. The output tells us the responsible function:

 ```
 ...
@@ -228,7 +228,7 @@ impl<'a> Alloc for &'a LockedBumpAllocator {
 }
 ```

-However, there is a more interesting solution for our bump allocator that avoids locking alltogether. The idea is to exploit that we only need to update a single `usize` field byusing an `AtomicUsize` type. This type uses special synchronized hardware instructions to ensure data race freedom without requiring locks.
+However, there is a more interesting solution for our bump allocator that avoids locking altogether. The idea is to exploit that we only need to update a single `usize` field byusing an `AtomicUsize` type. This type uses special synchronized hardware instructions to ensure data race freedom without requiring locks.

 #### A lock-free Bump Allocator
 A lock-free implementation looks like this:
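The hunk above is cut off before the post's own code. Purely as an independent illustration of the idea (this is not the implementation from the post), a bump allocator that updates a single `AtomicUsize` could be sketched as follows. The type `LockFreeBump`, its fields, and the helper `align_up` are hypothetical names, addresses are handled as plain `usize` values, and `std` is used so the sketch runs on a host.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical lock-free bump allocator over the range `heap_start..heap_end`.
pub struct LockFreeBump {
    heap_end: usize,
    /// Address of the next free byte; the only field that is ever updated.
    next: AtomicUsize,
}

impl LockFreeBump {
    pub const fn new(heap_start: usize, heap_end: usize) -> Self {
        Self { heap_end, next: AtomicUsize::new(heap_start) }
    }

    /// Reserves `size` bytes aligned to `align` (assumed to be a power of two)
    /// and returns the start address, or `None` if the heap is exhausted.
    pub fn alloc(&self, size: usize, align: usize) -> Option<usize> {
        let mut current = self.next.load(Ordering::Relaxed);
        loop {
            let start = align_up(current, align)?;
            let end = start.checked_add(size)?;
            if end > self.heap_end {
                return None;
            }
            // Publish the new `next` value only if no other thread changed it
            // in the meantime; otherwise retry with the value that was observed.
            match self.next.compare_exchange_weak(
                current,
                end,
                Ordering::Relaxed,
                Ordering::Relaxed,
            ) {
                Ok(_) => return Some(start),
                Err(observed) => current = observed,
            }
        }
    }
}

/// Rounds `addr` up to the next multiple of `align` (a power of two).
/// Returns `None` on overflow or if `align` is zero.
fn align_up(addr: usize, align: usize) -> Option<usize> {
    let mask = align.checked_sub(1)?;
    Some(addr.checked_add(mask)? & !mask)
}
```

The compare-and-exchange loop is what replaces the lock: a thread that loses the race simply recomputes its allocation from the updated `next` value instead of waiting.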
@@ -413,7 +413,7 @@ pub fn init(boot_info: &BootInformation) {
 We've just moved the code to a new function. However, we've sneaked some improvements in:

-- An additional `.filter(|s| s.is_allocated())` in the calculation of `kernel_start` and `kernel_end`. This ignores all sections that aren't loaded to memory (such as debug sections). Thus, the kernel end address is no longer artifically increased by such sections.
+- An additional `.filter(|s| s.is_allocated())` in the calculation of `kernel_start` and `kernel_end`. This ignores all sections that aren't loaded to memory (such as debug sections). Thus, the kernel end address is no longer artificially increased by such sections.
 - We use the `start_address()` and `end_address()` methods of `boot_info` instead of calculating the adresses manually.
 - We use the alternate `{:#x}` form when printing kernel/multiboot addresses. Before, we used `0x{:x}`, which leads to the same result. For a complete list of these “alternate” formatting forms, check out the [std::fmt documentation].
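The effect of the filter and of the alternate formatting form can be tried out with a small stand-in. The `Section` struct and both function names below are hypothetical and only mimic the idea; they are not the types used in the post.

```rust
/// Hypothetical stand-in for an ELF section entry.
struct Section {
    start: usize,
    size: usize,
    /// Whether the section is actually loaded to memory.
    allocated: bool,
}

/// Computes `(kernel_start, kernel_end)` over the allocated sections only.
/// Returns `None` if no section is allocated.
fn kernel_bounds(sections: &[Section]) -> Option<(usize, usize)> {
    // Sections that are not loaded (for example debug sections) are skipped.
    let loaded = || sections.iter().filter(|s| s.allocated);
    let start = loaded().map(|s| s.start).min()?;
    let end = loaded().map(|s| s.start + s.size).max()?;
    Some((start, end))
}

/// Formats the bounds with the alternate `{:#x}` form, which adds the `0x` prefix.
fn describe(bounds: (usize, usize)) -> String {
    format!("kernel start: {:#x}, kernel end: {:#x}", bounds.0, bounds.1)
}
```

Without the filter, a non-allocated section with a high address would push the computed end address up even though nothing is loaded there.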
@@ -168,7 +168,7 @@ For exception and interrupt handlers, however, pushing a return address would no
 1. **Aligning the stack pointer**: An interrupt can occur at any instructions, so the stack pointer can have any value, too. However, some CPU instructions (e.g. some SSE instructions) require that the stack pointer is aligned on a 16 byte boundary, therefore the CPU performs such an alignment right after the interrupt.
 2. **Switching stacks** (in some cases): A stack switch occurs when the CPU privilege level changes, for example when a CPU exception occurs in an user mode program. It is also possible to configure stack switches for specific interrupts using the so-called _Interrupt Stack Table_ (described in the next post).
-3. **Pushing the old stack pointer**: The CPU pushes the values of the stack pointer (`rsp`) and the stack segment (`ss`) registers at the time when the interrupt occured (before the alignment). This makes it possible to restore the original stack pointer when returning from an interrupt handler.
+3. **Pushing the old stack pointer**: The CPU pushes the values of the stack pointer (`rsp`) and the stack segment (`ss`) registers at the time when the interrupt occurred (before the alignment). This makes it possible to restore the original stack pointer when returning from an interrupt handler.
 4. **Pushing and updating the `RFLAGS` register**: The [`RFLAGS`] register contains various control and status bits. On interrupt entry, the CPU changes some bits and pushes the old value.
 5. **Pushing the instruction pointer**: Before jumping to the interrupt handler function, the CPU pushes the instruction pointer (`rip`) and the code segment (`cs`). This is comparable to the return address push of a normal function call.
 6. **Pushing an error code** (for some exceptions): For some specific exceptions such as page faults, the CPU pushes an error code, which describes the cause of the exception.
@@ -414,7 +414,7 @@ It works! The CPU successfully invokes our breakpoint handler, which prints the
 > **Aside**: If it doesn't work and a boot loop occurs, this might be caused by a kernel stack overflow. Try increasing the stack size to at least 16kB (4096 * 4 bytes) in the `boot.asm` file.

-We see that the exception stack frame tells us the instruction and stack pointers at the time when the exception occured. This information is very useful when debugging unexpected exceptions. For example, we can look at the corresponding assembly line using `objdump`:
+We see that the exception stack frame tells us the instruction and stack pointers at the time when the exception occurred. This information is very useful when debugging unexpected exceptions. For example, we can look at the corresponding assembly line using `objdump`:

 ```
 > objdump -d build/kernel-x86_64.bin | grep -B5 "1140a6:"
@@ -22,7 +22,7 @@ By using such SIMD standards, programs can often speed up significantly. Good co
 [auto-vectorization]: https://en.wikipedia.org/wiki/Automatic_vectorization

-However, the large SIMD registers lead to problems in OS kernels. The reason is that the kernel has to backup all registers that it uses to memory on each hardware interrupt, because they need to have their original values when the interrupted program continues. So if the kernel uses SIMD registers, it has to backup a lot more data (512–1600 bytes), which noticably decreases performance. To avoid this performance loss, we want to disable the `sse` and `mmx` features (the `avx` feature is disabled by default).
+However, the large SIMD registers lead to problems in OS kernels. The reason is that the kernel has to backup all registers that it uses to memory on each hardware interrupt, because they need to have their original values when the interrupted program continues. So if the kernel uses SIMD registers, it has to backup a lot more data (512–1600 bytes), which noticeably decreases performance. To avoid this performance loss, we want to disable the `sse` and `mmx` features (the `avx` feature is disabled by default).

 We can do that through the the `features` field in our target specification. To disable the `mmx` and `sse` features we add them prefixed with a minus:
@@ -379,7 +379,7 @@ This error message tells us that the linker can't find an entry point function w
 cargo rustc -- -C link-args="-e __start"
 ```

-The `-e` flag specifies the name of the entry point function. Since all functions have an additonal `_` prefix on macOS, we need to set the entry point to `__start` instead of `_start`.
+The `-e` flag specifies the name of the entry point function. Since all functions have an additional `_` prefix on macOS, we need to set the entry point to `__start` instead of `_start`.

 Now the following linker error occurs:
@@ -435,7 +435,7 @@ rustflags = ["-C", "link-args=/ENTRY:_start /SUBSYSTEM:console"]
 rustflags = ["-C", "link-args=-e __start -static -nostartfiles"]
 ```

-The `rustflags` key contains arguments that are automatically added to every invocation of `rustc`. For more information on the `.cargo/config` file check out the [offical documentation](https://doc.rust-lang.org/cargo/reference/config.html).
+The `rustflags` key contains arguments that are automatically added to every invocation of `rustc`. For more information on the `.cargo/config` file check out the [official documentation](https://doc.rust-lang.org/cargo/reference/config.html).

 Now our program should be buildable on all three platforms with a simple `cargo build`.
@@ -177,7 +177,7 @@ For exception and interrupt handlers, however, pushing a return address would no
 1. **Aligning the stack pointer**: An interrupt can occur at any instructions, so the stack pointer can have any value, too. However, some CPU instructions (e.g. some SSE instructions) require that the stack pointer is aligned on a 16 byte boundary, therefore the CPU performs such an alignment right after the interrupt.
 2. **Switching stacks** (in some cases): A stack switch occurs when the CPU privilege level changes, for example when a CPU exception occurs in an user mode program. It is also possible to configure stack switches for specific interrupts using the so-called _Interrupt Stack Table_ (described in the next post).
-3. **Pushing the old stack pointer**: The CPU pushes the values of the stack pointer (`rsp`) and the stack segment (`ss`) registers at the time when the interrupt occured (before the alignment). This makes it possible to restore the original stack pointer when returning from an interrupt handler.
+3. **Pushing the old stack pointer**: The CPU pushes the values of the stack pointer (`rsp`) and the stack segment (`ss`) registers at the time when the interrupt occurred (before the alignment). This makes it possible to restore the original stack pointer when returning from an interrupt handler.
 4. **Pushing and updating the `RFLAGS` register**: The [`RFLAGS`] register contains various control and status bits. On interrupt entry, the CPU changes some bits and pushes the old value.
 5. **Pushing the instruction pointer**: Before jumping to the interrupt handler function, the CPU pushes the instruction pointer (`rip`) and the code segment (`cs`). This is comparable to the return address push of a normal function call.
 6. **Pushing an error code** (for some exceptions): For some specific exceptions such as page faults, the CPU pushes an error code, which describes the cause of the exception.
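The push sequence in steps 3 to 5 determines how the saved values end up in memory. The following sketch models that layout on a host. The struct and function names are hypothetical, every pushed value is assumed to occupy one 64-bit slot, and the optional error code and the alignment step are left out.

```rust
/// Saved values in ascending address order, starting at the lowest address.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(C)]
struct InterruptFrame {
    instruction_pointer: u64,
    code_segment: u64,
    cpu_flags: u64,
    stack_pointer: u64,
    stack_segment: u64,
}

/// Simulates the pushes on a downward-growing stack and returns the
/// resulting words in ascending address order.
fn pushed_words(ss: u64, rsp: u64, rflags: u64, cs: u64, rip: u64) -> [u64; 5] {
    // Push order: ss, rsp, rflags, cs, rip.
    let mut words = [ss, rsp, rflags, cs, rip];
    // The value pushed last lies at the lowest address, so the order flips.
    words.reverse();
    words
}

/// Interprets the five words as a frame, lowest address first.
fn frame_from_words(words: [u64; 5]) -> InterruptFrame {
    InterruptFrame {
        instruction_pointer: words[0],
        code_segment: words[1],
        cpu_flags: words[2],
        stack_pointer: words[3],
        stack_segment: words[4],
    }
}
```

Because the stack grows downwards, the instruction pointer that was pushed last is the first field of the struct, while the stack segment that was pushed first is the last one.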
@@ -414,7 +414,7 @@ When we run it in QEMU now (using `cargo xrun`), we see the following:
 It works! The CPU successfully invokes our breakpoint handler, which prints the message, and then returns back to the `_start` function, where the `It did not crash!` message is printed.

-We see that the interrupt stack frame tells us the instruction and stack pointers at the time when the exception occured. This information is very useful when debugging unexpected exceptions.
+We see that the interrupt stack frame tells us the instruction and stack pointers at the time when the exception occurred. This information is very useful when debugging unexpected exceptions.

 ### Adding a Test
@@ -41,7 +41,7 @@ Unlike exceptions, hardware interrupts occur _asynchronously_. This means that t
 ## The 8259 PIC

-The [Intel 8259] is a programmable interrupt controller (PIC) introduced in 1976. It has long been replaced by the newer [APIC], but its interface is still supported on current systems for backwards compatibiliy reasons. The 8259 PIC is significantly easier to set up than the APIC, so we will use it to introduce ourselves to interrupts before we switch to the APIC in a later post.
+The [Intel 8259] is a programmable interrupt controller (PIC) introduced in 1976. It has long been replaced by the newer [APIC], but its interface is still supported on current systems for backwards compatibility reasons. The 8259 PIC is significantly easier to set up than the APIC, so we will use it to introduce ourselves to interrupts before we switch to the APIC in a later post.

 [APIC]: https://en.wikipedia.org/wiki/Intel_APIC_Architecture
@@ -240,7 +240,7 @@ We need to be careful to use the correct interrupt vector number, otherwise we c
 When we now execute `cargo xrun` we see dots periodically appearing on the screen:

-
+

 ### Configuring the Timer
@@ -251,7 +251,7 @@ The hardware timer that we use is called the _Progammable Interval Timer_ or PIT
 ## Deadlocks

-We now have a form of concurrency in our kernel: The timer interrupts occur asynchronously, so they can interrupt our `_start` function at any time. Fortunately Rust's ownership system prevents many types of concurrency related bugs at compile time. One notable exception are deadlocks. Deadlocks occur if a thread tries to aquire a lock that will never become free. Thus the thread hangs indefinitely.
+We now have a form of concurrency in our kernel: The timer interrupts occur asynchronously, so they can interrupt our `_start` function at any time. Fortunately Rust's ownership system prevents many types of concurrency related bugs at compile time. One notable exception are deadlocks. Deadlocks occur if a thread tries to acquire a lock that will never become free. Thus the thread hangs indefinitely.

 We can already provoke a deadlock in our kernel. Remember, our `println` macro calls the `vga_buffer::_print` function, which [locks a global `WRITER`][vga spinlock] using a spinlock:
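The hunk ends before the post's code. As a host-side illustration of why the second acquisition can never succeed, the sketch below uses `std::sync::Mutex` as a stand-in for the kernel's spinlock, which is an assumption of this example. It calls `try_lock` instead of `lock` so that the program reports the situation rather than hanging. The function name is hypothetical.

```rust
use std::sync::{Mutex, TryLockError};

/// Holds `lock` and then tries to take it a second time, the way an
/// interrupt handler would if it ran while the interrupted code held the lock.
/// Returns `true` if the second attempt could not acquire the lock.
fn second_acquire_would_block(lock: &Mutex<u32>) -> bool {
    // First acquisition: stands in for the interrupted code.
    let _guard = lock.lock().unwrap();
    // Second acquisition: stands in for the handler. With a blocking
    // `lock()` call this would wait forever, since the guard above is
    // only released once this function returns.
    matches!(lock.try_lock(), Err(TryLockError::WouldBlock))
}
```

In the kernel there is no second thread that could release the lock while the handler spins, which is what turns this situation into a deadlock.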
@@ -571,7 +571,7 @@ extern "x86-interrupt" fn keyboard_interrupt_handler(
 }
 ```

-As we see from the graphic [above](#the-8259-pic), the keyboard uses line 1 of the primary PIC. This means that it arrives at the CPU as interrupt 33 (1 + offset 32). We add this index as a new `Keyboard` variant to the `InterruptIndex` enum. We don't need to specify the value explicitely, since it defaults to the previous value plus one, which is also 33. In the interrupt handler, we print a `k` and send the end of interrupt signal to the interrupt controller.
+As we see from the graphic [above](#the-8259-pic), the keyboard uses line 1 of the primary PIC. This means that it arrives at the CPU as interrupt 33 (1 + offset 32). We add this index as a new `Keyboard` variant to the `InterruptIndex` enum. We don't need to specify the value explicitly, since it defaults to the previous value plus one, which is also 33. In the interrupt handler, we print a `k` and send the end of interrupt signal to the interrupt controller.

 We now see that a `k` appears on the screen when we press a key. However, this only works for the first key we press, even if we continue to press keys no more `k`s appear on the screen. This is because the keyboard controller won't send another interrupt until we have read the so-called _scancode_ of the pressed key.
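The implicit discriminant described above can be checked with a minimal enum. Only the `InterruptIndex` and `Keyboard` names come from the text; the constant `PIC_1_OFFSET`, the `Timer` variant, and the `as_u8` helper are assumptions of this sketch.

```rust
/// Offset of the primary PIC's interrupt vectors (32, as stated in the text).
/// The constant name is an assumption made for this sketch.
pub const PIC_1_OFFSET: u8 = 32;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
pub enum InterruptIndex {
    /// Assumed to be the preceding variant, on line 0 of the primary PIC.
    Timer = PIC_1_OFFSET,
    /// No explicit value: defaults to the previous value plus one.
    Keyboard,
}

impl InterruptIndex {
    /// Returns the interrupt vector number of this variant.
    pub fn as_u8(self) -> u8 {
        self as u8
    }
}
```

With these assumptions, `InterruptIndex::Keyboard.as_u8()` evaluates to 33 without the value being written out anywhere.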
@@ -45,7 +45,7 @@ In this post we will implement a very basic test framework that runs integration
 ## The Serial Port

-The naive way of doing an integration test would be to add some assertions in the code, launch QEMU, and manually check if a panic occured or not. This is very cumbersome and not practical if we have hundreds of integration tests. So we want an automated solution that runs all tests and fails if not all of them pass.
+The naive way of doing an integration test would be to add some assertions in the code, launch QEMU, and manually check if a panic occurred or not. This is very cumbersome and not practical if we have hundreds of integration tests. So we want an automated solution that runs all tests and fails if not all of them pass.

 Such an automated test framework needs to know whether a test succeeded or failed. It can't look at the screen output of QEMU, so we need a different way of retrieving the test results on the host system. A simple way to achieve this is by using the [serial port], an old interface standard which is no longer found in modern computers. It is easy to program and QEMU can redirect the bytes sent over serial to the host's standard output or a file.
@@ -187,7 +187,7 @@ Instead of standard output, QEMU supports [many more target devices][QEMU -seria
 ## Shutting Down QEMU

-Right now we have an endless loop at the end of our `_start` function and need to close QEMU manually. This does not work for automated tests. We could try to kill QEMU automatically from the host, for example after some special output was sent over serial, but this would be a bit hacky and difficult to get right. The cleaner solution would be to implement a way to shutdown our OS. Unfortunatly this is relatively complex, because it requires implementing support for either the [APM] or [ACPI] power management standard.
+Right now we have an endless loop at the end of our `_start` function and need to close QEMU manually. This does not work for automated tests. We could try to kill QEMU automatically from the host, for example after some special output was sent over serial, but this would be a bit hacky and difficult to get right. The cleaner solution would be to implement a way to shutdown our OS. Unfortunately this is relatively complex, because it requires implementing support for either the [APM] or [ACPI] power management standard.

 [APM]: https://wiki.osdev.org/APM
 [ACPI]: https://wiki.osdev.org/ACPI