Update “Returning from Exceptions” to use println

Philipp Oppermann
2016-10-31 01:04:53 +01:00
parent 25fca59248
commit 3bfa5a8178

@@ -42,15 +42,13 @@ Let's start by defining a handler function for the breakpoint exception:
extern "C" fn breakpoint_handler(stack_frame: *const ExceptionStackFrame) -> ! extern "C" fn breakpoint_handler(stack_frame: *const ExceptionStackFrame) -> !
{ {
unsafe { let stack_frame = unsafe { &*stack_frame };
print_error(format_args!("EXCEPTION: BREAKPOINT at {:#x}\n{:#?}", println!("\nEXCEPTION: BREAKPOINT at {:#x}\n{:#?}",
(*stack_frame).instruction_pointer, stack_frame.instruction_pointer, stack_frame);
*stack_frame));
}
loop {} loop {}
} }
``` ```
We print a red error message using `print_error` and also output the instruction pointer and the rest of the stack frame. Note that this function does _not_ return yet, since our `handler!` macro still requires a diverging function. We print an error message and also output the instruction pointer and the rest of the stack frame. Note that this function does _not_ return yet, since our `handler!` macro still requires a diverging function.
We need to register our new handler function in the interrupt descriptor table (IDT): We need to register our new handler function in the interrupt descriptor table (IDT):
@@ -98,7 +96,7 @@ pub extern "C" fn rust_main(...) {
When we execute `make run`, we see the following: When we execute `make run`, we see the following:
![QEMU showing `EXCEPTION: BREAKPOINT at 0x1100aa` and a dump of the exception stack frame](images/qemu-breakpoint-handler.png) ![QEMU showing `EXCEPTION: BREAKPOINT at 0x110970` and a dump of the exception stack frame](images/qemu-breakpoint-handler.png)
It works! Now we “just” need to return from the breakpoint handler somehow so that we see the `It did not crash` message again. It works! Now we “just” need to return from the breakpoint handler somehow so that we see the `It did not crash` message again.
@@ -131,18 +129,18 @@ The situation is a bit different for the breakpoint exception, since it needs no
Let's check this for our breakpoint handler. Remember, the handler printed the following message (see the image above): Let's check this for our breakpoint handler. Remember, the handler printed the following message (see the image above):
``` ```
EXCEPTION: BREAKPOINT at 0x1100aa EXCEPTION: BREAKPOINT at 0x110970
``` ```
So let's disassemble the instruction at `0x1100aa` and its predecessor: So let's disassemble the instruction at `0x110970` and its predecessor:
{{< highlight shell "hl_lines=3" >}} {{< highlight shell "hl_lines=3" >}}
> objdump -d build/kernel-x86_64.bin | grep -B1 "1100aa:" > objdump -d build/kernel-x86_64.bin | grep -B1 "110970:"
1100a9: cc int3 11096f: cc int3
1100aa: 48 89 c6 mov %rax,%rsi 110970: 48 c7 01 2a 00 00 00 movq $0x2a,(%rcx)
{{< / highlight >}} {{< / highlight >}}
We see that `0x1100aa` indeed points to the next instruction after `int3`. So we can simply jump to the stored instruction pointer when we want to return from the breakpoint exception. We see that `0x110970` indeed points to the next instruction after `int3`. So we can simply jump to the stored instruction pointer when we want to return from the breakpoint exception.
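As a quick sanity check of the addresses above, the following sketch computes where execution resumes after an `int3`. It assumes the one-byte `0xcc` encoding shown in the disassembly. `INT3_LEN` and `resume_address` are hypothetical names used only for illustration and are not part of the kernel.

```rust
/// Length in bytes of the `int3` instruction (opcode `0xcc`), as shown in the
/// objdump output above. Hypothetical constant, for illustration only.
const INT3_LEN: u64 = 1;

/// Returns the instruction pointer that the CPU stores in the exception stack
/// frame when a breakpoint exception is raised by an `int3` at `int3_addr`.
/// The breakpoint exception is a trap, so the stored value points to the
/// instruction that follows `int3`. Hypothetical helper, not kernel code.
fn resume_address(int3_addr: u64) -> u64 {
    int3_addr + INT3_LEN
}
```

With the addresses from the disassembly, `resume_address(0x11096f)` gives `0x110970`, which matches the value printed by the handler.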
### Implementation ### Implementation
Let's update our `handler!` macro to support non-diverging exception handlers: Let's update our `handler!` macro to support non-diverging exception handlers:
@@ -192,7 +190,7 @@ extern "C" fn invalid_opcode_handler(
extern "C" fn breakpoint_handler( extern "C" fn breakpoint_handler(
- stack_frame: *const ExceptionStackFrame) -> ! { - stack_frame: *const ExceptionStackFrame) -> ! {
+ stack_frame: *const ExceptionStackFrame) { + stack_frame: *const ExceptionStackFrame) {
unsafe { print_error(...) } println!(...);
- loop {} - loop {}
} }
``` ```
@@ -201,9 +199,9 @@ Note that we also removed the `loop {}` at the end of our `breakpoint_handler` s
### Testing ### Testing
Let's try our new `iretq` logic: Let's try our new `iretq` logic:
![QEMU output with `EXCEPTION BREAKPOINT` and `EXCEPTION PAG FAULT` but no `It did not crash`](images/qemu-breakpoint-return-page-fault.png) ![QEMU output with `EXCEPTION BREAKPOINT` and `EXCEPTION PAGE FAULT` but no `It did not crash`](images/qemu-breakpoint-return-page-fault.png)
Instead of the expected _“It did not crash”_ message after the breakpoint exception, we get a page fault. The strange thing is that our kernel tried to access address `0x0`, which should never happen. So it seems like we messed up something important. Instead of the expected _“It did not crash”_ message after the breakpoint exception, we get a page fault. The strange thing is that our kernel tried to access address `0x1`, which should never happen. So it seems like we messed up something important.
### Debugging ### Debugging
Let's debug it using GDB. For that we execute `make debug` in one terminal (which starts QEMU with the `-s -S` flags) and then `make gdb` (which starts and connects GDB) in a second terminal. For more information about GDB debugging, check out our [Set Up GDB] guide. Let's debug it using GDB. For that we execute `make debug` in one terminal (which starts QEMU with the `-s -S` flags) and then `make gdb` (which starts and connects GDB) in a second terminal. For more information about GDB debugging, check out our [Set Up GDB] guide.
@@ -230,7 +228,7 @@ Breakpoint 1, blog_os::rust_main (multiboot_information_address=1539136)
``` ```
It worked! So our kernel successfully returned from the `int3` instruction, which means that the `iretq` itself works. It worked! So our kernel successfully returned from the `int3` instruction, which means that the `iretq` itself works.
However, when we `continue` the execution again, we get the page fault. So the exception occurs somewhere in the `println` logic. This means that it occurs in code generated by the compiler (and not e.g. in inline assembly). But the compiler should never access `0x0`, so how is this happening? However, when we `continue` the execution again, we get the page fault. So the exception occurs somewhere in the `println` logic. This means that it occurs in code generated by the compiler (and not e.g. in inline assembly). But the compiler should never access `0x1`, so how is this happening?
The answer is that we've used the wrong _calling convention_ for our exception handlers. Thus, we violate some compiler invariants so that the code that works fine without intermediate exceptions starts to violate memory safety when it's executed after a breakpoint exception. The answer is that we've used the wrong _calling convention_ for our exception handlers. Thus, we violate some compiler invariants so that the code that works fine without intermediate exceptions starts to violate memory safety when it's executed after a breakpoint exception.
@@ -280,7 +278,7 @@ So here is what happens:
- Our `breakpoint_handler` prints to the screen and assumes that it can overwrite `rax` freely (since it's a scratch register). Somehow the value `0` ends up in `rax`. - Our `breakpoint_handler` prints to the screen and assumes that it can overwrite `rax` freely (since it's a scratch register). Somehow the value `0` ends up in `rax`.
- We return from the breakpoint exception using `iretq`. - We return from the breakpoint exception using `iretq`.
- `rust_main` continues and accesses the memory address in `rax`. - `rust_main` continues and accesses the memory address in `rax`.
- The CPU tries to access address `0x0`, which causes a page fault. - The CPU tries to access address `0x1`, which causes a page fault.
So our exception handler erroneously assumes that the scratch registers were saved by the caller. But the caller (`rust_main`) couldn't save any registers since it didn't know that an exception occurs. So nobody saves `rax` and the other scratch registers, which leads to the page fault. So our exception handler erroneously assumes that the scratch registers were saved by the caller. But the caller (`rust_main`) couldn't save any registers since it didn't know that an exception occurs. So nobody saves `rax` and the other scratch registers, which leads to the page fault.
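The sequence above can be modeled in ordinary user-space Rust. This is only a toy sketch under stated assumptions: `ScratchRegs` holds three of the scratch registers rather than the full set, and `c_style_handler`, `handle_without_saving`, and `handle_with_saving` are hypothetical names that do not exist in the kernel. It illustrates why the interrupted code sees a changed `rax`, and how saving the registers around the handler call would avoid that.

```rust
/// Toy model of a few x86_64 scratch registers (deliberately incomplete).
#[derive(Clone, Copy, Debug, PartialEq)]
struct ScratchRegs {
    rax: u64,
    rcx: u64,
    rdx: u64,
}

/// Stand-in for a handler compiled with the C calling convention: it treats
/// the scratch registers as free to overwrite. The values written here are
/// arbitrary placeholders, not what the real handler leaves behind.
fn c_style_handler(regs: &mut ScratchRegs) {
    regs.rax = 0x1;
    regs.rcx = 0;
}

/// Calls the handler directly. Nothing preserves the interrupted code's
/// registers, so the caller observes the clobbered values afterwards.
fn handle_without_saving(regs: &mut ScratchRegs) {
    c_style_handler(regs);
}

/// Saves the scratch registers before the handler runs and restores them
/// afterwards, so the interrupted code continues with its original values.
fn handle_with_saving(regs: &mut ScratchRegs) {
    let saved = *regs;
    c_style_handler(regs);
    *regs = saved;
}
```

If `rust_main` held a valid address in `rax` before the exception, `handle_without_saving` leaves an unrelated value there, which mirrors the page fault described above. `handle_with_saving` returns the registers unchanged.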
@@ -878,7 +876,7 @@ extern "C" fn page_fault_handler(stack_frame: *const ExceptionStackFrame,
error_code: u64) error_code: u64)
{ {
use x86::controlregs; use x86::controlregs;
unsafe { print_error(...); } println!(...);
// new // new
unsafe { unsafe {