Bare Metal Rust: Low-level CPU I/O ports
Want to build your own kernel in Rust? See the Bare Metal Rust page for more resources and more posts in this series.
Rust is a really fun language: It allows me to work on low-level kernel code, but it also allows me to wrap my code up in clean, high-level APIs. If this sounds interesting, you should really check out Philipp Oppermann’s blog posts about writing a basic x86_64 operating system kernel in Rust. He walks you through booting the kernel, entering long mode, getting Rust running, and printing text to the screen.
Once you get a basic kernel running, you’ll probably want to start working on basic I/O, which requires interrupts. At this point, you’ll find that pretty much every tutorial dives right into the “in” and “out” instructions. For example, if you look at the OSDev.org introduction to interrupts, the very first code you’ll see is (comments added):
mov al,20h ; Move interrupt acknowledgment code into al.
out 20h,al ; Write al to PIC on port 0x20.
Here, we’re talking to the PIC (“Programmable Interrupt Controller”), and we’re telling it that we’ve finished handling a processor interrupt. To do this, we need to write an 8-bit status code to the I/O port at address 0x20.
Traditionally, we would wrap this up in an outb (“out byte”) function, which might look something like this in Rust:
// The asm! macro requires a nightly build of Rust, and
// we need to opt-in explicitly.
#![feature(asm)]
unsafe fn outb(value: u8, port: u16) {
    asm!("outb %al, %dx" ::
         "{dx}"(port), "{al}"(value) ::
         "volatile");
}
This writes an 8-bit value to the specified port. It uses the unstable Rust extension asm!, which allows us to use GCC/LLVM-style inline assembly. We’d invoke it like this:
// outb is an unsafe fn, so the call needs an unsafe block.
unsafe { outb(0x20, 0x20); }
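The asm! syntax above is the older, unstable form. Recent stable Rust ships a different inline assembly macro, core::arch::asm!, so a hedged sketch of outb (and a matching inb) for that macro might look like the following. It assumes an x86 or x86_64 target and Intel operand order; the options(...) list is an assumption about what is appropriate here, not something taken from the original code.

```rust
use core::arch::asm;

/// Write the 8-bit `value` to the I/O port at address `port`.
///
/// Unsafe because writing to an arbitrary port can reconfigure hardware.
pub unsafe fn outb(value: u8, port: u16) {
    // Intel syntax: `out dx, al` sends the byte in `al` to the port in `dx`.
    unsafe {
        asm!("out dx, al", in("dx") port, in("al") value,
             options(nostack, preserves_flags));
    }
}

/// Read an 8-bit value from the I/O port at address `port`.
pub unsafe fn inb(port: u16) -> u8 {
    let value: u8;
    // Intel syntax: `in al, dx` loads a byte from the port in `dx` into `al`.
    unsafe {
        asm!("in al, dx", out("al") value, in("dx") port,
             options(nostack, preserves_flags));
    }
    value
}
```

These instructions are privileged: in an ordinary user-space process they fault, so the sketch is meant to be compiled into a kernel rather than called from a desktop program.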
But let’s see if we can wrap a higher-level API around an I/O port.
What we want to wrap
First, let’s look at the complete family of functions we want to wrap:
unsafe fn inb(port: u16) -> u8 { ... }
unsafe fn outb(value: u8, port: u16) { ... }
unsafe fn inw(port: u16) -> u16 { ... }
unsafe fn outw(value: u16, port: u16) { ... }
unsafe fn inl(port: u16) -> u32 { ... }
unsafe fn outl(value: u32, port: u16) { ... }
The implementations of these functions are all similar to outb above. The functions whose names begin with “in” all read from port, and the functions beginning with “out” write to it. We use the “b” functions for 8-bit-wide ports, the “w” functions for 16-bit-wide ports, and the “l” functions for 32-bit ones.
API design & safety
We could imagine an API which looks something like:
// Create a port at location 0x3F8, the data line
// associated with the serial port COM1.
let mut port = Port::new(0x3F8);
// Write `out_val` to our port.
port.write(out_val);
// Receive `in_val` from our port.
let in_val = port.read();
But of course, this is Rust, so we need to ask some questions about safety. We call a Rust API “unsafe” if we can use it to cause undefined behavior. Could we do that with this API? Well, it depends on the port address: Port::new(0x3F8) is used to read and write data on COM1, so it’s harmless, but Port::new(0x20) allows us to reprogram the interrupt controller, which can corrupt the stack. So we should mark Port::new as an unsafe API, and call it as:
let mut port = unsafe { Port::new(0x3F8) };
But we probably don’t need to make read or write unsafe, because we can leave it up to the unsafe code which calls Port::new to decide who can safely have access to the port. Once some unsafe code somewhere has taken responsibility for the port, it’s their problem.
So here’s the first draft of our API:
pub struct Port {
    port: u16,
}

impl Port {
    pub unsafe fn new(port: u16) -> Port {
        Port { port: port }
    }

    pub fn read(&mut self) -> u8 {
        unsafe { inb(self.port) }
    }

    pub fn write(&mut self, value: u8) {
        unsafe { outb(value, self.port) }
    }
}
This works, but only for u8. We can do better!
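To experiment with this first draft on an ordinary desktop, without touching real hardware, one option is to swap inb and outb for stand-ins backed by an array. The sketch below does that; SIM_PORTS and demo are hypothetical names introduced only for illustration, and the simulated ports simply remember the last byte written, which real devices generally do not.

```rust
use std::sync::Mutex;

// Hypothetical stand-in for the I/O port space: one byte per port address.
static SIM_PORTS: Mutex<[u8; 65536]> = Mutex::new([0u8; 65536]);

// Simulated `inb`: returns the byte stored at `port` instead of running `in`.
unsafe fn inb(port: u16) -> u8 {
    SIM_PORTS.lock().unwrap()[port as usize]
}

// Simulated `outb`: stores `value` at `port` instead of running `out`.
unsafe fn outb(value: u8, port: u16) {
    SIM_PORTS.lock().unwrap()[port as usize] = value;
}

pub struct Port {
    port: u16,
}

impl Port {
    /// Unsafe: the caller vouches that handing out this port address is OK.
    pub unsafe fn new(port: u16) -> Port {
        Port { port }
    }

    pub fn read(&mut self) -> u8 {
        unsafe { inb(self.port) }
    }

    pub fn write(&mut self, value: u8) {
        unsafe { outb(value, self.port) }
    }
}

// Write a byte to the simulated COM1 data port and read it back.
fn demo() -> u8 {
    let mut port = unsafe { Port::new(0x3F8) };
    port.write(b'A');
    port.read()
}
```

Only the call to Port::new needs an unsafe block; read and write are callable from safe code, which is the division of responsibility described above.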
Trying to add a type parameter
We could try to add a type parameter:
pub struct Port<T> {
    port: u16,
}
But when we compile this, we get a very helpful error:
src/lib.rs:1:17: 1:18 error: parameter `T` is never used [E0392]
src/lib.rs:1 pub struct Port<T> {
^
src/lib.rs:1:17: 1:18 help: run `rustc --explain E0392` to see
a detailed explanation
src/lib.rs:1:17: 1:18 help: consider removing `T` or using a
marker such as `core::marker::PhantomData`
The Rust documentation explains PhantomData in detail. It’s basically a zero-sized placeholder type which we can use as follows:
use core::marker::PhantomData;

pub struct Port<T> {
    port: u16,
    phantom: PhantomData<T>,
}
We can now update new, adding the type parameter <T> everywhere we need it:
impl<T> Port<T> {
    pub unsafe fn new(port: u16) -> Port<T> {
        Port { port: port, phantom: PhantomData }
    }

    // ...
}
We initialize the field phantom to the value PhantomData, which is the only possible value of type PhantomData<T>. This probably seems really weird to everybody except Haskell and ML programmers, who’ve seen types with only one value before.
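A quick way to convince ourselves that the phantom field costs nothing is to compare sizes. The sketch below uses std rather than core so that it runs as a normal program; port_size and address are hypothetical helpers added for illustration.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

pub struct Port<T> {
    port: u16,
    // Zero-sized marker: records `T` in the type without storing a `T`.
    phantom: PhantomData<T>,
}

impl<T> Port<T> {
    pub unsafe fn new(port: u16) -> Port<T> {
        Port { port, phantom: PhantomData }
    }

    /// The port address this value wraps.
    pub fn address(&self) -> u16 {
        self.port
    }
}

// Size in bytes of `Port<T>` for a given `T`.
fn port_size<T>() -> usize {
    size_of::<Port<T>>()
}
```

Whatever T is, Port<T> should occupy the same two bytes as the u16 it wraps, since the marker itself has size zero.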
Problem 1: We don’t want anyone creating Port<f64>
As written above, Port<T> could be used with any type T, including something weird like f64. But this won’t work, because our CPU’s “in” and “out” instructions only work with the types u8, u16 and u32. So we need to restrict the possible values of T, which we can do by defining a new trait:
pub trait InOut {}
impl InOut for u8 {}
impl InOut for u16 {}
impl InOut for u32 {}
We can then add a type constraint to T, giving us T: InOut in several places:
pub struct Port<T: InOut> {
    port: u16,
    phantom: PhantomData<T>,
}

impl<T: InOut> Port<T> {
    // ...
}
Problem 2: What do we put in read and write?
But now, if we try to define the read function, we have no way to decide whether to call inb, inw or inl:
impl<T: InOut> Port<T> {
    // ...

    pub fn read(&mut self) -> T {
        unsafe { /* inb, inw or inl??? */ }
    }

    // ...
}
If T were u8, we could use inb like before. But u16 would require inw, and u32 would require inl. We can sort this out by adding some functions to our InOut trait:
pub trait InOut {
    unsafe fn port_in(port: u16) -> Self;
    unsafe fn port_out(port: u16, value: Self);
}
We then need to define these methods for each instance of the trait:
impl InOut for u8 {
    unsafe fn port_in(port: u16) -> u8 { inb(port) }
    unsafe fn port_out(port: u16, value: u8) { outb(value, port); }
}

impl InOut for u16 {
    unsafe fn port_in(port: u16) -> u16 { inw(port) }
    unsafe fn port_out(port: u16, value: u16) { outw(value, port); }
}

impl InOut for u32 {
    unsafe fn port_in(port: u16) -> u32 { inl(port) }
    unsafe fn port_out(port: u16, value: u32) { outl(value, port); }
}
This finally gives us:
impl<T: InOut> Port<T> {
    pub unsafe fn new(port: u16) -> Port<T> {
        Port { port: port, phantom: PhantomData }
    }

    pub fn read(&mut self) -> T {
        unsafe { T::port_in(self.port) }
    }

    pub fn write(&mut self, value: T) {
        unsafe { T::port_out(self.port, value); }
    }
}
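To see the trait-based dispatch pick the right width for each T, here is a hedged, desktop-runnable sketch of the whole design. The real in/out instructions are replaced by a simulated port space, and SIM_PORTS, sim_read, sim_write and demo are hypothetical names that are not part of the library. The simulation stores multi-byte values little-endian across adjacent addresses, which is a simplification chosen for the example.

```rust
use std::marker::PhantomData;
use std::sync::Mutex;

// Hypothetical stand-in for the I/O port space: one byte per port address.
static SIM_PORTS: Mutex<[u8; 65536]> = Mutex::new([0u8; 65536]);

// Copy `buf.len()` bytes starting at `port` out of the simulated space.
// (Panics if the range runs past the end; the demo stays well inside.)
fn sim_read(port: u16, buf: &mut [u8]) {
    let space = SIM_PORTS.lock().unwrap();
    let start = port as usize;
    buf.copy_from_slice(&space[start..start + buf.len()]);
}

// Copy `bytes` into the simulated space starting at `port`.
fn sim_write(port: u16, bytes: &[u8]) {
    let mut space = SIM_PORTS.lock().unwrap();
    let start = port as usize;
    space[start..start + bytes.len()].copy_from_slice(bytes);
}

pub trait InOut {
    unsafe fn port_in(port: u16) -> Self;
    unsafe fn port_out(port: u16, value: Self);
}

impl InOut for u8 {
    unsafe fn port_in(port: u16) -> u8 {
        let mut b = [0u8; 1];
        sim_read(port, &mut b);
        b[0]
    }
    unsafe fn port_out(port: u16, value: u8) { sim_write(port, &[value]); }
}

impl InOut for u16 {
    unsafe fn port_in(port: u16) -> u16 {
        let mut b = [0u8; 2];
        sim_read(port, &mut b);
        u16::from_le_bytes(b)
    }
    unsafe fn port_out(port: u16, value: u16) { sim_write(port, &value.to_le_bytes()); }
}

impl InOut for u32 {
    unsafe fn port_in(port: u16) -> u32 {
        let mut b = [0u8; 4];
        sim_read(port, &mut b);
        u32::from_le_bytes(b)
    }
    unsafe fn port_out(port: u16, value: u32) { sim_write(port, &value.to_le_bytes()); }
}

pub struct Port<T: InOut> {
    port: u16,
    phantom: PhantomData<T>,
}

impl<T: InOut> Port<T> {
    pub unsafe fn new(port: u16) -> Port<T> {
        Port { port, phantom: PhantomData }
    }
    pub fn read(&mut self) -> T { unsafe { T::port_in(self.port) } }
    pub fn write(&mut self, value: T) { unsafe { T::port_out(self.port, value); } }
}

// One port of each width; the type annotation alone selects the implementation.
fn demo() -> (u8, u16, u32) {
    let mut byte_port: Port<u8> = unsafe { Port::new(0x60) };
    let mut word_port: Port<u16> = unsafe { Port::new(0x1F0) };
    let mut long_port: Port<u32> = unsafe { Port::new(0xCF8) };
    byte_port.write(0x1C);
    word_port.write(0xBEEF);
    long_port.write(0x8000_0000);
    (byte_port.read(), word_port.read(), long_port.read())
}
```

Trying to declare a Port<f64> in this sketch is rejected at compile time, because f64 does not implement InOut.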
An experimental enhancement: Creating Ports at compile time
Using a Mutex from the Rust spin crate, we might try to create a global port object protected by a lock:
extern crate spin;
use spin::Mutex;
/// Port used to access a PS/2 keyboard.
static KEYBOARD: Mutex<Port<u8>> = Mutex::new(unsafe {
    Port::new(0x60)
});
If we do this, Rust tells us:
src/lib.rs:24:5: 24:20 error: function calls in statics are limited to
constant functions, struct and enum constructors [E0015]
src/lib.rs:24 Port::new(0x60)
^~~~~~~~~~~~~~~
It would be nice if we could run Port::new at compile time. Happily, if we use the nightly build of Rust, we can try out the new const_fn feature:
#![feature(const_fn)]
impl<T: InOut> Port<T> {
    pub const unsafe fn new(port: u16) -> Port<T> {
        // ...
    }
}
See that const in the function declaration? That means we can now safely write:
static KEYBOARD: Mutex<Port<u8>> = Mutex::new(unsafe {
    Port::new(0x60)
});
…and get our Mutex-protected port! We can access this as:
let scancode = KEYBOARD.lock().read();
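On recent stable Rust, const fn no longer needs a feature gate, so the same pattern can be tried outside a kernel. The sketch below is a desktop approximation under stated assumptions: it uses std::sync::Mutex in place of spin::Mutex (so lock() returns a Result and needs unwrap()), and it replaces the real keyboard port with SIM_KEYBOARD_DATA, a hypothetical simulated register preloaded with an arbitrary byte. read_scancode is likewise a name made up for the example.

```rust
use std::marker::PhantomData;
use std::sync::atomic::{AtomicU8, Ordering};
use std::sync::Mutex;

// Hypothetical simulated keyboard data register holding an arbitrary byte.
static SIM_KEYBOARD_DATA: AtomicU8 = AtomicU8::new(0x1E);

pub trait InOut {
    unsafe fn port_in(port: u16) -> Self;
    unsafe fn port_out(port: u16, value: Self);
}

impl InOut for u8 {
    // The simulation ignores the address and always uses the one register.
    unsafe fn port_in(_port: u16) -> u8 {
        SIM_KEYBOARD_DATA.load(Ordering::SeqCst)
    }
    unsafe fn port_out(_port: u16, value: u8) {
        SIM_KEYBOARD_DATA.store(value, Ordering::SeqCst);
    }
}

pub struct Port<T: InOut> {
    port: u16,
    phantom: PhantomData<T>,
}

impl<T: InOut> Port<T> {
    // `const` lets this constructor run in a `static` initializer.
    pub const unsafe fn new(port: u16) -> Port<T> {
        Port { port, phantom: PhantomData }
    }
    pub fn read(&mut self) -> T { unsafe { T::port_in(self.port) } }
    pub fn write(&mut self, value: T) { unsafe { T::port_out(self.port, value); } }
}

/// Port used to access a (simulated) PS/2 keyboard.
static KEYBOARD: Mutex<Port<u8>> = Mutex::new(unsafe { Port::new(0x60) });

// Lock the global port and read one byte from it.
fn read_scancode() -> u8 {
    KEYBOARD.lock().unwrap().read()
}
```

In a real no_std kernel the spin crate's Mutex remains the natural choice, since std::sync::Mutex relies on operating system support.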
Available as a crate! (and where to next?)
This library is available as a crate, which you can also download from crates.io. If you follow Philipp Oppermann’s instructions for using cargo to build a kernel, you can include this by adding cpuio to your Cargo.toml file:
[dependencies]
cpuio = "*"
And using this library as a building block, we can tackle:
- Setting up interrupts (needed for I/O).
- Reading keyboard input.
- Writing to the serial port (useful for debugging kernels with QEMU).
If people are interested, I’m happy to dive into these subjects. And if you have any questions, I’ll be following the discussion at Reddit.