Prevent deadlock by basing DummyWaker directly on RawWaker

Don't use Arc for DummyWaker. It causes a drop in the `add_scancode` function, which can easily lead to a deadlock because the function is called directly from the interrupt handler.
This commit is contained in:
Philipp Oppermann
2020-03-23 11:21:24 +01:00
parent 89f5350ac4
commit c26d36ebce

View File

@@ -890,53 +890,68 @@ The struct contains a single `task_queue` field of type [`VecDeque`], which is b
#### Dummy Waker
In order to call the `poll` method, we need to create a [`Context`] type, which wraps a [`Waker`] type. To start simple, we will first create a dummy waker that does nothing. The simplest way to do this is by implementing the unstable [`Wake`] trait for an empty `DummyWaker` struct:
In order to call the `poll` method, we need to create a [`Context`] type, which wraps a [`Waker`] type. To start simple, we will first create a dummy waker that does nothing. For this, we create a [`RawWaker`] instance, which defines the implementation of the different `Waker` methods, and then use the [`Waker::from_raw`] function to turn it into a `Waker`:
[`Wake`]: https://doc.rust-lang.org/nightly/alloc/task/trait.Wake.html
[`RawWaker`]: https://doc.rust-lang.org/stable/core/task/struct.RawWaker.html
[`Waker::from_raw`]: https://doc.rust-lang.org/stable/core/task/struct.Waker.html#method.from_raw
```rust
// in src/task/simple_executor.rs
use alloc::{sync::Arc, task::Wake};
use core::task::{Waker, RawWaker};
struct DummyWaker;
fn dummy_raw_waker() -> RawWaker {
todo!();
}
impl Wake for DummyWaker {
fn wake(self: Arc<Self>) {
// do nothing
}
fn dummy_waker() -> Waker {
unsafe { Waker::from_raw(dummy_raw_waker()) }
}
```
The trait is still unstable, so we have to add **`#![feature(wake_trait)]`** to the top of our `lib.rs` to use it. The `wake` method of the trait is normally responsible for waking the corresponding task in the executor. However, our `SimpleExecutor` will not differentiate between ready and waiting tasks, so we don't need to do anything on `wake` calls.
The `from_raw` function is unsafe because undefined behavior can occur if the programmer does not uphelp the documented requirements of `RawWaker`. Before we look at the implementation of the `dummy_raw_waker` function, we first try to understand how the `RawWaker` type works.
Since wakers are normally shared between the executor and the asynchronous tasks, the `wake` method requires that the `Self` instance is wrapped in the [`Arc`] type, which implements reference-counted ownership. The basic idea is that the value is heap-allocated and the number of active references to it are counted. If the number of active references reaches zero, the value is no longer needed and can be deallocated.
##### `RawWaker`
The [`RawWaker`] type requires the programmer to explicitly define a [_virtual method table_] (_vtable_) that specifies the functions that should be called when the `RawWaker` is cloned, woken, or dropped. The layout of this vtable is defined by the [`RawWakerVTable`] type. Each function receives a `*const ()` argument that is basically a _type-erased_ `&self` pointer to some struct, e.g. allocated on the heap. The reason for using a `*const ()` pointer instead of a proper reference is that the `RawWaker` type should be non-generic. The pointer value that is passed to the functions is the `data` pointer given to [`RawWaker::new`].
[_virtual method table_]: https://en.wikipedia.org/wiki/Virtual_method_table
[`RawWakerVTable`]: https://doc.rust-lang.org/stable/core/task/struct.RawWakerVTable.html
[`RawWaker::new`]: https://doc.rust-lang.org/stable/core/task/struct.RawWaker.html#method.new
Typically, the `RawWaker` is created for some heap allocated struct that is wrapped into the [`Box`] or [`Arc`] type. For such types, methods like [`Box::into_raw`] can be used to convert the `Box<T>` to a `*const T` pointer. This pointer can then be casted to an anonymous `*const ()` pointer and passed to `RawWaker::new`. Since each vtable function receives the same `*const ()` as argument, the functions can sately cast the pointer back to a `Box<T>` or a `&T` to operate on it. As you can imagine, this process is highly dangerous and can easily lead to undefined behavior on mistakes. For this reason, manually creating a `RawWaker` is not recommended unless necessary.
[`Box`]: https://doc.rust-lang.org/stable/alloc/boxed/struct.Box.html
[`Arc`]: https://doc.rust-lang.org/stable/alloc/sync/struct.Arc.html
[`Box::into_raw`]: https://doc.rust-lang.org/stable/alloc/boxed/struct.Box.html#method.into_raw
To make our `DummyWaker` usable with the [`Context`] type, we need a method to convert it to the [`Waker`] defined in the core library:
##### A Dummy `RawWaker`
While manually creating a `RawWaker` is not recommended, there is currently no other way to create a dummy `Waker` that does nothing. Fortunately, the fact that we want to do nothing makes it relatively safe to implement the `dummy_raw_waker` function:
```rust
// in src/task/simple_executor.rs
use core::task::Waker;
use core::task::RawWakerVTable;
impl DummyWaker {
fn to_waker(self) -> Waker {
Waker::from(Arc::new(self))
fn dummy_raw_waker() -> RawWaker {
fn no_op(_: *const ()) {}
fn clone(_: *const ()) -> RawWaker {
dummy_raw_waker()
}
let vtable = &RawWakerVTable::new(clone, no_op, no_op, no_op);
RawWaker::new(0 as *const (), vtable)
}
```
The method first makes the `self` instance reference-counted by wrapping it in an [`Arc`]. Then it uses the [`Waker::from`] method to create the `Waker`. This method is available for all reference counted types that implement the [`Wake`] trait.
First, we define two inner functions named `no_op` and `clone`. The `no_op` function takes a `*const ()` pointer and does nothing. The `clone` function also takes a `*const ()` pointer and returns a new `RawWaker` by calling `dummy_raw_waker` again. We use these two functions to create a minimal `RawWakerVTable`: The `clone` function is used for the cloning operations and the `no_op` function is used for all other operations. Since the `RawWaker` does nothing, it does not matter that we return a new `RawWaker` from `clone` instead of cloning it.
[`Waker::from`]: TODO
Now we have a way to create a `Waker` instance, we can use it to implement a `run` method on our executor.
After creating the `vtable`, we use the [`RawWaker::new`] function to create the `RawWaker`. The passed `*const ()` does not matter since none of the vtable function uses it. For this reason, we simply pass a null pointer.
#### A `run` Method
The most simple `run` method is to repeatedly poll all queued tasks in a loop until all are done. This is not very efficient since it does not utilize the notifications of the `Waker` type, but it is an easy way to get things running:
Now we have a way to create a `Waker` instance, we can use it to implement a `run` method on our executor. The most simple `run` method is to repeatedly poll all queued tasks in a loop until all are done. This is not very efficient since it does not utilize the notifications of the `Waker` type, but it is an easy way to get things running:
```rust
// in src/task/simple_executor.rs
@@ -1366,78 +1381,57 @@ fn kernel_main(boot_info: &'static BootInfo) -> ! {
}
```
When we execute `cargo xrun` now, we see that keyboard input works again, but only for a short time:
![QEMU printing output for keypresses "H", "e", and "l", then it hangs](keyboard-deadlock.gif)
After pressing a few keys, the complete execution hangs. Not even the dots by the timer interrupt are printed anymore. Such bugs are typically caused by a [_deadlock_], which is a state where we endlessly wait on some lock. To find out where the program hangs, the best approach is to connect a debugger and print the backtrace. Expand the section below for the exact debugging steps:
[_deadlock_]: https://en.wikipedia.org/wiki/Deadlock
<details style="margin-bottom: 1rem;">
<summary><strong>Debugging Steps</strong></summary>
- Make sure `gdb` or `gdb-multiarch` is installed on your system.
- Pass the `-s` flag to QEMU when running your kernel. You can do this through the command `cargo xrun -- -s`.
- Run `gdb` with the file name of your kernel as argument:
```
gdb target/x86_64-blog_os/debug/blog_os
```
- From the `gdb` console, connect to the QEMU instance by executing `target ext :1234`.
- Print the backtrace by executing `backtrace` or `bt`.
The backtrace in this case looks like this:
```
#0 AtomicBool::load (self=0x22d250 <blog_os::allocator::ALLOCATOR>, …)
at libcore/sync/atomic.rs:404
#1 spin::Mutex<T>::obtain_lock (self=0x22d250 <blog_os::allocator::ALLOCATOR>)
at spin-0.5.2/src/mutex.rs:134
#2 spin::Mutex<T>::lock (self=0x22d250 <blog_os::allocator::ALLOCATOR>)
at spin-0.5.2/src/mutex.rs:158
#3 blog_os::allocator::Locked<A>::lock (…)
at src/allocator.rs:73
#4 Locked::dealloc (…) at src/allocator/fixed_size_block.rs:83
#5 __rg_dealloc (…) at src/allocator.rs:19
#6 alloc::alloc::dealloc (…) at liballoc/alloc.rs:103
#7 alloc::alloc::Global::dealloc (…) at liballoc/alloc.rs:174
#8 alloc::sync::Arc<T>::drop_slow (…) at liballoc/sync.rs:743
#9 alloc::sync::Arc<T>::drop (…) at liballoc/sync.rs:1249
#10 core::ptr::drop_in_place () at libcore/ptr/mod.rs:174
#11 blog_os::task::simple_executor::DummyWaker::wake (…)
at src/task/simple_executor.rs:37
#12 alloc::task::raw_waker::wake (waker=0x4444444400d0)
at liballoc/task.rs:69
#13 core::task::wake::Waker::wake (self=...) at libcore/task/wake.rs:241
#14 AtomicWaker::wake (self=0x22d210 <blog_os::task::keyboard::WAKER>)
at futures-core-0.3.4/src/task/__internal/atomic_waker.rs:355
#15 blog_os::task::keyboard::add_scancode (scancode=31) at src/task/keyboard.rs:24
#16 blog_os::interrupts::keyboard_interrupt_handler (…) at src/interrupts.rs:87
```
Note that I shortened the output a bit to make it more readable.
</details>
From the backtrace, we can deduce that the deadlock was caused by the following order of operations:
```
keyboard_interrupt_handler -> add_scancode -> AtomicWaker::wake
-> GlobalAlloc::dealloc -> allocator::Locked::lock
```
TODO
#### Fixing the Deadlock
When we execute `cargo xrun` now, we see that keyboard input works again:
TODO image
If you keep an eye on the CPU utilization of your computer, you will see that the `QEMU` process now keeps one CPU completely busy. This happens because our `SimpleExecutor` polls tasks over and over again in a loop. So even if we don't press any keys on the keyboard, the executor repeatedly calls `poll` on our `print_keypresses` task, even though the task cannot make any progress and will return `Poll::Pending` each time.
To fix this, we need to create an executor that properly utilizes the `Waker` notifications. This way, the executor is notified when the next keyboard interrupt occurs, so it does not need to keep polling the `print_keypresses` task over and over again.
### Executor with Waker Support
#### The `Wake` Trait
The simplest way to do this is by implementing the unstable [`Wake`] trait for an empty `DummyWaker` struct:
[`Wake`]: https://doc.rust-lang.org/nightly/alloc/task/trait.Wake.html
```rust
// in src/task/simple_executor.rs
use alloc::{sync::Arc, task::Wake};
struct DummyWaker;
impl Wake for DummyWaker {
fn wake(self: Arc<Self>) {
// do nothing
}
}
```
The trait is still unstable, so we have to add **`#![feature(wake_trait)]`** to the top of our `lib.rs` to use it. The `wake` method of the trait is normally responsible for waking the corresponding task in the executor. However, our `SimpleExecutor` will not differentiate between ready and waiting tasks, so we don't need to do anything on `wake` calls.
Since wakers are normally shared between the executor and the asynchronous tasks, the `wake` method requires that the `Self` instance is wrapped in the [`Arc`] type, which implements reference-counted ownership. The basic idea is that the value is heap-allocated and the number of active references to it are counted. If the number of active references reaches zero, the value is no longer needed and can be deallocated.
[`Arc`]: https://doc.rust-lang.org/stable/alloc/sync/struct.Arc.html
To make our `DummyWaker` usable with the [`Context`] type, we need a method to convert it to the [`Waker`] defined in the core library:
```rust
// in src/task/simple_executor.rs
use core::task::Waker;
impl DummyWaker {
fn to_waker(self) -> Waker {
Waker::from(Arc::new(self))
}
}
```
The method first makes the `self` instance reference-counted by wrapping it in an [`Arc`]. Then it uses the [`Waker::from`] method to create the `Waker`. This method is available for all reference counted types that implement the [`Wake`] trait.
[`Waker::from`]: TODO